Discussion:
mount / unmount and mountcheckdirs()
Andriy Gapon
2016-05-15 13:37:05 UTC
Permalink
I am curious about the purpose of mountcheckdirs() called when mounting and
unmounting a filesystem.

The function is described as such:
/*
* Scan all active processes and prisons to see if any of them have a current
* or root directory of `olddp'. If so, replace them with the new mount point.
*/
and it seems to be used to "lift" processes and jails to a root of a new
filesystem when it is mounted and to "lower" them onto a covered vnode (if any)
when a filesystem is unmounted.

What's the purpose of those actions?
It's strange that the machinations are done at all, but it is stranger that they
are applied only to processes and jails at exactly a covered vnode and a root
vnode. Anything below in a filesystem's tree is left alone. Is there anything
so very special about being at exactly those points?

IMO, the machinations can have unexpected security consequences.

A little bit of history.
mountcheckdirs() was first added in r22521 (circa 1997) as checkdirs with a
rather non-specific commit message. Initially it was used only when a
filesystem was mounted.
Then, in r73241 (circa 2002) the function was added to dounmount():
The checkdirs() function is called at mount time to find any process
fd_cdir or fd_rdir pointers referencing the covered mountpoint
vnode. It transfers these to point at the root of the new filesystem.
However, this process was not reversed at unmount time, so processes
with a cwd/root at a mount point would unexpectedly lose their
cwd/root following a mount-unmount cycle at that mountpoint.
...
Dounmount() now undoes the actions
taken by checkdirs() at mount time; any process cdir/rdir pointers
that reference the root vnode of the unmounted filesystem are
transferred to the now-uncovered vnode.
--
Andriy Gapon
Mateusz Guzik
2016-05-15 16:53:32 UTC
Permalink
Post by Andriy Gapon
I am curious about the purpose of mountcheckdirs() called when mounting and
unmounting a filesystem.
/*
* Scan all active processes and prisons to see if any of them have a current
* or root directory of `olddp'. If so, replace them with the new mount point.
*/
and it seems to be used to "lift" processes and jails to a root of a new
filesystem when it is mounted and to "lower" them onto a covered vnode (if any)
when a filesystem is unmounted.
What's the purpose of those actions?
It's strange that the machinations are done at all, but it is stranger that they
are applied only to processes and jails at exactly a covered vnode and a root
vnode. Anything below in a filesystem's tree is left alone. Is there anything
so very special about being at exactly those points?
IMO, the machinations can have unexpected security consequences.
I don't know why this was implemented. It is also being done in NetBSD.
It is not done in Solaris nor Linux.

Replacement is buggy in at least 2 ways:
1. the process vs jail vnode replacement leaves a time window where
these 2 don't match, which screws up with the look up
2. on fork we can have a 'struct filedesc' object copied but not
assigned to the new process yet, so it ends up with the old vnode

And indeed, interested parties still have access to old vnodes by means
of having a file descriptor.

That said, this likely needs to be simply changed to /deny/ mount
operations which would alter jail roots.
--
Mateusz Guzik <mjguzik gmail.com>
Andriy Gapon
2017-09-14 12:58:13 UTC
Permalink
Post by Mateusz Guzik
Post by Andriy Gapon
I am curious about the purpose of mountcheckdirs() called when mounting and
unmounting a filesystem.
/*
* Scan all active processes and prisons to see if any of them have a current
* or root directory of `olddp'. If so, replace them with the new mount point.
*/
and it seems to be used to "lift" processes and jails to a root of a new
filesystem when it is mounted and to "lower" them onto a covered vnode (if any)
when a filesystem is unmounted.
What's the purpose of those actions?
It's strange that the machinations are done at all, but it is stranger that they
are applied only to processes and jails at exactly a covered vnode and a root
vnode. Anything below in a filesystem's tree is left alone. Is there anything
so very special about being at exactly those points?
IMO, the machinations can have unexpected security consequences.
I don't know why this was implemented. It is also being done in NetBSD.
It is not done in Solaris nor Linux.
I've done some testing on illumos and while it indeed does not do any "lifting",
it does not permit to mount over a directory which is a current directory for
any process. I haven't tested with zones, but suspect that it would also be
prohibited.
Post by Mateusz Guzik
1. the process vs jail vnode replacement leaves a time window where
these 2 don't match, which screws up with the look up
2. on fork we can have a 'struct filedesc' object copied but not
assigned to the new process yet, so it ends up with the old vnode
And indeed, interested parties still have access to old vnodes by means
of having a file descriptor.
Yeah.
Post by Mateusz Guzik
That said, this likely needs to be simply changed to /deny/ mount
operations which would alter jail roots.
That's probably what we should do.
Or rather, Someone Should Do It.
--
Andriy Gapon
Edward Tomasz Napierała
2016-05-16 07:03:14 UTC
Permalink
Post by Andriy Gapon
I am curious about the purpose of mountcheckdirs() called when mounting and
unmounting a filesystem.
[..]

Whatever you do, please make sure you don't break autofs, and reroot,
esp. firmware(9) loading after reroot. I'll happily test patches, just
mail them to me. Thanks :-)
Andriy Gapon
2016-05-16 07:43:08 UTC
Permalink
Post by Edward Tomasz Napierała
Post by Andriy Gapon
I am curious about the purpose of mountcheckdirs() called when mounting and
unmounting a filesystem.
[..]
Whatever you do, please make sure you don't break autofs, and reroot,
esp. firmware(9) loading after reroot. I'll happily test patches, just
mail them to me. Thanks :-)
Well, the only patch I had in mind (besides
https://svnweb.freebsd.org/changeset/base/299913) is completely removing
mountcheckdirs(). But now that you mentioned autofs and reroot I am not sure
that it could be that simple...
--
Andriy Gapon
Kirk McKusick
2016-05-22 06:40:49 UTC
Permalink
Subject: mount / unmount and mountcheckdirs()
Date: Sun, 15 May 2016 16:37:05 +0300
I am curious about the purpose of mountcheckdirs() called when mounting and
unmounting a filesystem.
/*
* Scan all active processes and prisons to see if any of them have a current
* or root directory of `olddp'. If so, replace them with the new mount point.
*/
and it seems to be used to "lift" processes and jails to a root of a new
filesystem when it is mounted and to "lower" them onto a covered vnode (if any)
when a filesystem is unmounted.
What's the purpose of those actions?
It's strange that the machinations are done at all, but it is stranger that they
are applied only to processes and jails at exactly a covered vnode and a root
vnode. Anything below in a filesystem's tree is left alone. Is there anything
so very special about being at exactly those points?
IMO, the machinations can have unexpected security consequences.
A little bit of history.
mountcheckdirs() was first added in r22521 (circa 1997) as checkdirs with a
rather non-specific commit message. Initially it was used only when a
filesystem was mounted.
The checkdirs() function is called at mount time to find any process
fd_cdir or fd_rdir pointers referencing the covered mountpoint
vnode. It transfers these to point at the root of the new filesystem.
However, this process was not reversed at unmount time, so processes
with a cwd/root at a mount point would unexpectedly lose their
cwd/root following a mount-unmount cycle at that mountpoint.
...
Dounmount() now undoes the actions
taken by checkdirs() at mount time; any process cdir/rdir pointers
that reference the root vnode of the unmounted filesystem are
transferred to the now-uncovered vnode.
--
Andriy Gapon
I added the checkdirs functionality in the mount direction only
(I actually did it in 4.4BSD-Lite and it got swept in with commit
22521). The reason is that when a directory that is not empty is
mounted on, the expectation is that the entries in that directory
should no longer be present; rather they should be replaced by the
entries in the newly mounted directory. Thus all processes sitting
in the mounted on directory should see the newly mounted directory
as if they had come to it using a lookup after the mount had been
done. If a process had proceeded through the mounted on directory
into one of its other entries, then they are left alone until such
time as they chdir back into the mount point directory through ".."
at which time they will be passed up to the mounted directory using
the same mechanism that would put them there if they traversed into
the mount point from above it in the tree. I believe this is the
correct behavior, is not a security threat, and should be left alone.

I was not aware that the functionality had been added at unmount
time, and I do not believe that it should have been done. Normally
an unmount will not succeed if any vnodes are busy (for example, if
any directory in the filesystem is a current directory). The only
way that it can succeed in such a case is if a forcible unmount is
done. The forcible unmount will effectively do a revoke(2) on all
current directory vnodes in the unmounted filesystem. Further attempts
to access them will fail with "." not found errors. The only way to
get a valid current directory is to chdir to an absolute pathname.
Gratuitously fixing this if you happen to be in the former root of
the filesystem is wrong. And as you note can lead to unintensionally
giving an escape path from a prison. So I concur with your removing
this added functionality.

Kirk McKusick
that it can succeed if any
Andriy Gapon
2017-09-14 12:45:07 UTC
Permalink
Post by Kirk McKusick
I added the checkdirs functionality in the mount direction only
(I actually did it in 4.4BSD-Lite and it got swept in with commit
22521). The reason is that when a directory that is not empty is
mounted on, the expectation is that the entries in that directory
should no longer be present; rather they should be replaced by the
entries in the newly mounted directory. Thus all processes sitting
in the mounted on directory should see the newly mounted directory
as if they had come to it using a lookup after the mount had been
done. If a process had proceeded through the mounted on directory
into one of its other entries, then they are left alone until such
time as they chdir back into the mount point directory through ".."
at which time they will be passed up to the mounted directory using
the same mechanism that would put them there if they traversed into
the mount point from above it in the tree. I believe this is the
correct behavior, is not a security threat, and should be left alone.
I almost dropped a ball on this issue, but I am now picking it up again.
At the moment I am moving forward with the dounmount change as it seems to be
non-contentious and rather simple to do and test.

Regarding the mount part, I am not sure that I completely agree with you.
Even if mountcheckdirs() does not cause any problems in the mount path, I still
fail to see its usefulness. Specifically, I still do not see any significant
difference between the covered directory and any directory below it. So, if we
leave the lower directories alone, while bother with the covered directory...

The covered directory:
- absolute paths work correctly
- relative paths with enough ".." (one) can access the actual namespace
- other relative paths operate on the shadowed sub-tree of the original filesystem


The lower directories:
- absolute paths work correctly
- relative paths with enough ".." (> 1) can access the actual namespace
- other relative paths operate on the shadowed sub-tree of the original filesystem

The only difference I can think of is that the root of the mounted filesystem
cannot be reached with just ".."-s from the covered directory. But is this
difference of any significance?

Mateusz also raised some interesting points.

On the other hand, it seems that illumos and probably Solaris has interesting
parallels to the FreeBSD behavior. It does not allow to mount over a directory
that is a current directory for any process ("Device busy"), but does not object
against processes in directories below the mount point.

So, probably it's just I who misses something about that scenario :-)
Post by Kirk McKusick
I was not aware that the functionality had been added at unmount
time, and I do not believe that it should have been done. Normally
an unmount will not succeed if any vnodes are busy (for example, if
any directory in the filesystem is a current directory). The only
way that it can succeed in such a case is if a forcible unmount is
done. The forcible unmount will effectively do a revoke(2) on all
current directory vnodes in the unmounted filesystem. Further attempts
to access them will fail with "." not found errors. The only way to
get a valid current directory is to chdir to an absolute pathname.
Gratuitously fixing this if you happen to be in the former root of
the filesystem is wrong. And as you note can lead to unintensionally
giving an escape path from a prison. So I concur with your removing
this added functionality.
--
Andriy Gapon
Kirk McKusick
2017-09-15 03:14:36 UTC
Permalink
Subject: Re: mount / unmount and mountcheckdirs()
Date: Thu, 14 Sep 2017 15:45:07 +0300
Post by Kirk McKusick
I added the checkdirs functionality in the mount direction only
(I actually did it in 4.4BSD-Lite and it got swept in with commit
22521). The reason is that when a directory that is not empty is
mounted on, the expectation is that the entries in that directory
should no longer be present; rather they should be replaced by the
entries in the newly mounted directory. Thus all processes sitting
in the mounted on directory should see the newly mounted directory
as if they had come to it using a lookup after the mount had been
done. If a process had proceeded through the mounted on directory
into one of its other entries, then they are left alone until such
time as they chdir back into the mount point directory through ".."
at which time they will be passed up to the mounted directory using
the same mechanism that would put them there if they traversed into
the mount point from above it in the tree. I believe this is the
correct behavior, is not a security threat, and should be left alone.
I almost dropped a ball on this issue, but I am now picking it up again.
At the moment I am moving forward with the dounmount change as it seems to be
non-contentious and rather simple to do and test.
Regarding the mount part, I am not sure that I completely agree with you.
Even if mountcheckdirs() does not cause any problems in the mount path, I still
fail to see its usefulness. Specifically, I still do not see any significant
difference between the covered directory and any directory below it. So, if we
leave the lower directories alone, while bother with the covered directory...
- absolute paths work correctly
- relative paths with enough ".." (one) can access the actual namespace
- other relative paths operate on the shadowed sub-tree of the original
filesystem
- absolute paths work correctly
- relative paths with enough ".." (> 1) can access the actual namespace
- other relative paths operate on the shadowed sub-tree of the original
filesystem
The only difference I can think of is that the root of the mounted filesystem
cannot be reached with just ".."-s from the covered directory. But is this
difference of any significance?
Mateusz also raised some interesting points.
On the other hand, it seems that illumos and probably Solaris has
interesting parallels to the FreeBSD behavior. It does not allow
to mount over a directory that is a current directory for any process
("Device busy"), but does not object against processes in directories
below the mount point.
So, probably it's just I who misses something about that scenario :-)
Post by Kirk McKusick
I was not aware that the functionality had been added at unmount
time, and I do not believe that it should have been done. Normally
an unmount will not succeed if any vnodes are busy (for example, if
any directory in the filesystem is a current directory). The only
way that it can succeed in such a case is if a forcible unmount is
done. The forcible unmount will effectively do a revoke(2) on all
current directory vnodes in the unmounted filesystem. Further attempts
to access them will fail with "." not found errors. The only way to
get a valid current directory is to chdir to an absolute pathname.
Gratuitously fixing this if you happen to be in the former root of
the filesystem is wrong. And as you note can lead to unintensionally
giving an escape path from a prison. So I concur with your removing
this added functionality.
--
Andriy Gapon
I had to dig back through some *really* old emails to find out what
triggered the addition of mountcheckdirs(). The problem that it was
specifically solving was that as part of the startup script a minimal
root directory was replaced by the real root directory. The shell
running the startup script needed to be moved to the new mounted-on
root so that the rest of the script would not fail.

That disaster of a hack has been replaced with the much more functional
code that deals with setting up the root and the devfs filesystem on
/dev. So the need for which it was designed no longer exists. But I
still believe that it is the correct thing to do. For example, if you
are using automount code and chdir into your home directory triggering
an auto-mount, you should just be in your home directory after the
mount rather than having to do cd ../$USER to get there.

~Kirk
Konstantin Belousov
2017-09-15 09:20:01 UTC
Permalink
Post by Kirk McKusick
Subject: Re: mount / unmount and mountcheckdirs()
Date: Thu, 14 Sep 2017 15:45:07 +0300
Post by Kirk McKusick
I added the checkdirs functionality in the mount direction only
(I actually did it in 4.4BSD-Lite and it got swept in with commit
22521). The reason is that when a directory that is not empty is
mounted on, the expectation is that the entries in that directory
should no longer be present; rather they should be replaced by the
entries in the newly mounted directory. Thus all processes sitting
in the mounted on directory should see the newly mounted directory
as if they had come to it using a lookup after the mount had been
done. If a process had proceeded through the mounted on directory
into one of its other entries, then they are left alone until such
time as they chdir back into the mount point directory through ".."
at which time they will be passed up to the mounted directory using
the same mechanism that would put them there if they traversed into
the mount point from above it in the tree. I believe this is the
correct behavior, is not a security threat, and should be left alone.
I almost dropped a ball on this issue, but I am now picking it up again.
At the moment I am moving forward with the dounmount change as it seems to be
non-contentious and rather simple to do and test.
Regarding the mount part, I am not sure that I completely agree with you.
Even if mountcheckdirs() does not cause any problems in the mount path, I still
fail to see its usefulness. Specifically, I still do not see any significant
difference between the covered directory and any directory below it. So, if we
leave the lower directories alone, while bother with the covered directory...
- absolute paths work correctly
- relative paths with enough ".." (one) can access the actual namespace
- other relative paths operate on the shadowed sub-tree of the original filesystem
- absolute paths work correctly
- relative paths with enough ".." (> 1) can access the actual namespace
- other relative paths operate on the shadowed sub-tree of the original filesystem
The only difference I can think of is that the root of the mounted filesystem
cannot be reached with just ".."-s from the covered directory. But is this
difference of any significance?
Mateusz also raised some interesting points.
On the other hand, it seems that illumos and probably Solaris has
interesting parallels to the FreeBSD behavior. It does not allow
to mount over a directory that is a current directory for any process
("Device busy"), but does not object against processes in directories
below the mount point.
So, probably it's just I who misses something about that scenario :-)
Post by Kirk McKusick
I was not aware that the functionality had been added at unmount
time, and I do not believe that it should have been done. Normally
an unmount will not succeed if any vnodes are busy (for example, if
any directory in the filesystem is a current directory). The only
way that it can succeed in such a case is if a forcible unmount is
done. The forcible unmount will effectively do a revoke(2) on all
current directory vnodes in the unmounted filesystem. Further attempts
to access them will fail with "." not found errors. The only way to
get a valid current directory is to chdir to an absolute pathname.
Gratuitously fixing this if you happen to be in the former root of
the filesystem is wrong. And as you note can lead to unintensionally
giving an escape path from a prison. So I concur with your removing
this added functionality.
--
Andriy Gapon
I had to dig back through some *really* old emails to find out what
triggered the addition of mountcheckdirs(). The problem that it was
specifically solving was that as part of the startup script a minimal
root directory was replaced by the real root directory. The shell
running the startup script needed to be moved to the new mounted-on
root so that the rest of the script would not fail.
If the mountcheckdirs() code not going away, please add your spelunking
results as a comment before the function. This theme is recurring, and
it would be highly beneficial to not loose the non-trivial reasoning
behind the code existence.
Post by Kirk McKusick
That disaster of a hack has been replaced with the much more functional
code that deals with setting up the root and the devfs filesystem on
/dev. So the need for which it was designed no longer exists. But I
still believe that it is the correct thing to do. For example, if you
are using automount code and chdir into your home directory triggering
an auto-mount, you should just be in your home directory after the
mount rather than having to do cd ../$USER to get there.
I believe that the current autofs does not allow a process to get into
this situation at all.

In fact, the behavior implemented by mountcheckdirs() is surprising as
well. For instance, I did expected that the system would operate as if
mountcheckdirs() does not exist, and it caused me some head-scratching
when I see it first time.
Edward Napierala
2017-09-15 10:08:31 UTC
Permalink
Post by Andriy Gapon
Post by Kirk McKusick
Subject: Re: mount / unmount and mountcheckdirs()
Date: Thu, 14 Sep 2017 15:45:07 +0300
Post by Kirk McKusick
I added the checkdirs functionality in the mount direction only
(I actually did it in 4.4BSD-Lite and it got swept in with commit
22521). The reason is that when a directory that is not empty is
mounted on, the expectation is that the entries in that directory
should no longer be present; rather they should be replaced by the
entries in the newly mounted directory. Thus all processes sitting
in the mounted on directory should see the newly mounted directory
as if they had come to it using a lookup after the mount had been
done. If a process had proceeded through the mounted on directory
into one of its other entries, then they are left alone until such
time as they chdir back into the mount point directory through ".."
at which time they will be passed up to the mounted directory using
the same mechanism that would put them there if they traversed into
the mount point from above it in the tree. I believe this is the
correct behavior, is not a security threat, and should be left alone.
I almost dropped a ball on this issue, but I am now picking it up
again.
Post by Kirk McKusick
At the moment I am moving forward with the dounmount change as it
seems to be
Post by Kirk McKusick
non-contentious and rather simple to do and test.
Regarding the mount part, I am not sure that I completely agree with
you.
Post by Kirk McKusick
Even if mountcheckdirs() does not cause any problems in the mount
path, I still
Post by Kirk McKusick
fail to see its usefulness. Specifically, I still do not see any
significant
Post by Kirk McKusick
difference between the covered directory and any directory below it.
So, if we
Post by Kirk McKusick
leave the lower directories alone, while bother with the covered
directory...
Post by Kirk McKusick
- absolute paths work correctly
- relative paths with enough ".." (one) can access the actual namespace
- other relative paths operate on the shadowed sub-tree of the original
filesystem
- absolute paths work correctly
- relative paths with enough ".." (> 1) can access the actual namespace
- other relative paths operate on the shadowed sub-tree of the original
filesystem
The only difference I can think of is that the root of the mounted
filesystem
Post by Kirk McKusick
cannot be reached with just ".."-s from the covered directory. But is
this
Post by Kirk McKusick
difference of any significance?
Mateusz also raised some interesting points.
On the other hand, it seems that illumos and probably Solaris has
interesting parallels to the FreeBSD behavior. It does not allow
to mount over a directory that is a current directory for any process
("Device busy"), but does not object against processes in directories
below the mount point.
So, probably it's just I who misses something about that scenario :-)
Post by Kirk McKusick
I was not aware that the functionality had been added at unmount
time, and I do not believe that it should have been done. Normally
an unmount will not succeed if any vnodes are busy (for example, if
any directory in the filesystem is a current directory). The only
way that it can succeed in such a case is if a forcible unmount is
done. The forcible unmount will effectively do a revoke(2) on all
current directory vnodes in the unmounted filesystem. Further attempts
to access them will fail with "." not found errors. The only way to
get a valid current directory is to chdir to an absolute pathname.
Gratuitously fixing this if you happen to be in the former root of
the filesystem is wrong. And as you note can lead to unintensionally
giving an escape path from a prison. So I concur with your removing
this added functionality.
--
Andriy Gapon
I had to dig back through some *really* old emails to find out what
triggered the addition of mountcheckdirs(). The problem that it was
specifically solving was that as part of the startup script a minimal
root directory was replaced by the real root directory. The shell
running the startup script needed to be moved to the new mounted-on
root so that the rest of the script would not fail.
If the mountcheckdirs() code not going away, please add your spelunking
results as a comment before the function. This theme is recurring, and
it would be highly beneficial to not loose the non-trivial reasoning
behind the code existence.
Post by Kirk McKusick
That disaster of a hack has been replaced with the much more functional
code that deals with setting up the root and the devfs filesystem on
/dev. So the need for which it was designed no longer exists. But I
still believe that it is the correct thing to do. For example, if you
are using automount code and chdir into your home directory triggering
an auto-mount, you should just be in your home directory after the
mount rather than having to do cd ../$USER to get there.
I believe that the current autofs does not allow a process to get into
this situation at all.
It does. For example:

[***@v2:~]% cd /media/md0
[***@v2:/media/md0]% mount
/dev/ada0s1a on / (ufs, local, noatime, journaled soft-updates)
devfs on /dev (devfs, local, multilabel)
map -hosts on /net (autofs)
map -media on /media (autofs)
[***@v2:/media/md0]% ls
[***@v2:/media/md0]% mount
/dev/ada0s1a on / (ufs, local, noatime, journaled soft-updates)
devfs on /dev (devfs, local, multilabel)
map -hosts on /net (autofs)
map -media on /media (autofs)
/dev/md0 on /media/md0 (ufs, local, noatime, nosuid, automounted)

Getting rid of mountcheckdirs() in the unmount path should be fine, I think.
Konstantin Belousov
2017-09-15 10:30:37 UTC
Permalink
Post by Edward Napierala
Post by Konstantin Belousov
I believe that the current autofs does not allow a process to get into
this situation at all.
/dev/ada0s1a on / (ufs, local, noatime, journaled soft-updates)
devfs on /dev (devfs, local, multilabel)
map -hosts on /net (autofs)
map -media on /media (autofs)
/dev/ada0s1a on / (ufs, local, noatime, journaled soft-updates)
devfs on /dev (devfs, local, multilabel)
map -hosts on /net (autofs)
map -media on /media (autofs)
/dev/md0 on /media/md0 (ufs, local, noatime, nosuid, automounted)
Getting rid of mountcheckdirs() in the unmount path should be fine, I think.
How the example proves that mountcheckdirs() can be removed ? How can
we see which directory content was printed by ls, the covered or mounted ?
Edward Napierala
2017-09-15 10:49:17 UTC
Permalink
Post by Edward Napierala
Post by Edward Napierala
Post by Konstantin Belousov
I believe that the current autofs does not allow a process to get into
this situation at all.
/dev/ada0s1a on / (ufs, local, noatime, journaled soft-updates)
devfs on /dev (devfs, local, multilabel)
map -hosts on /net (autofs)
map -media on /media (autofs)
/dev/ada0s1a on / (ufs, local, noatime, journaled soft-updates)
devfs on /dev (devfs, local, multilabel)
map -hosts on /net (autofs)
map -media on /media (autofs)
/dev/md0 on /media/md0 (ufs, local, noatime, nosuid, automounted)
Getting rid of mountcheckdirs() in the unmount path should be fine, I
think.
How the example proves that mountcheckdirs() can be removed ?
In the unmount path? It doesn't; I just don't think autofs would be
affected,
since having a current working directory in a mountpoint prevents the
unmount
(unless it's forced).

In the mount path? See below.
Post by Edward Napierala
How can
we see which directory content was printed by ls, the covered or mounted ?
Well, it would be pretty useless if you got the covered directory. But ok,
here's
a better example, one that actually shows that:

[***@v2:/media/md0]% mount
/dev/ada0s1a on / (ufs, local, noatime, journaled soft-updates)
devfs on /dev (devfs, local, multilabel)
map -hosts on /net (autofs)
map -media on /media (autofs)
[***@v2:/media/md0]% ls -al
total 9
drwxr-xr-x 3 root wheel 512 Sep 10 10:16 .
drwxr-xr-x 3 root wheel 512 Sep 8 11:42 ..
drwxrwxr-x 2 root operator 512 Sep 10 10:16 .snap
[***@v2:/media/md0]% mount
/dev/ada0s1a on / (ufs, local, noatime, journaled soft-updates)
devfs on /dev (devfs, local, multilabel)
map -hosts on /net (autofs)
map -media on /media (autofs)
/dev/md0 on /media/md0 (ufs, local, noatime, nosuid, automounted)

Notice the .snap/ directory; it's an empty UFS filesystem.
Konstantin Belousov
2017-09-15 11:00:33 UTC
Permalink
Post by Edward Napierala
Post by Edward Napierala
Post by Edward Napierala
Post by Konstantin Belousov
I believe that the current autofs does not allow a process to get into
this situation at all.
/dev/ada0s1a on / (ufs, local, noatime, journaled soft-updates)
devfs on /dev (devfs, local, multilabel)
map -hosts on /net (autofs)
map -media on /media (autofs)
/dev/ada0s1a on / (ufs, local, noatime, journaled soft-updates)
devfs on /dev (devfs, local, multilabel)
map -hosts on /net (autofs)
map -media on /media (autofs)
/dev/md0 on /media/md0 (ufs, local, noatime, nosuid, automounted)
Getting rid of mountcheckdirs() in the unmount path should be fine, I
think.
How the example proves that mountcheckdirs() can be removed ?
In the unmount path? It doesn't; I just don't think autofs would be
affected,
since having a current working directory in a mountpoint prevents the
unmount
(unless it's forced).
In the mount path? See below.
Post by Edward Napierala
How can
we see which directory content was printed by ls, the covered or mounted ?
Well, it would be pretty useless if you got the covered directory.
This is exactly what is being discussed.
Post by Edward Napierala
But ok, here's
/dev/ada0s1a on / (ufs, local, noatime, journaled soft-updates)
devfs on /dev (devfs, local, multilabel)
map -hosts on /net (autofs)
map -media on /media (autofs)
total 9
drwxr-xr-x 3 root wheel 512 Sep 10 10:16 .
drwxr-xr-x 3 root wheel 512 Sep 8 11:42 ..
drwxrwxr-x 2 root operator 512 Sep 10 10:16 .snap
/dev/ada0s1a on / (ufs, local, noatime, journaled soft-updates)
devfs on /dev (devfs, local, multilabel)
map -hosts on /net (autofs)
map -media on /media (autofs)
/dev/md0 on /media/md0 (ufs, local, noatime, nosuid, automounted)
Notice the .snap/ directory; it's an empty UFS filesystem.
And the .snap is from the /dev/md0 volume, right ?
If yes, then this demonstration indeed proves that autofs behaves as
intended on mount.
Edward Napierala
2017-09-15 11:31:05 UTC
Permalink
Post by Konstantin Belousov
Post by Edward Napierala
Post by Edward Napierala
Post by Edward Napierala
Post by Konstantin Belousov
I believe that the current autofs does not allow a process to get
into
Post by Edward Napierala
Post by Edward Napierala
Post by Edward Napierala
Post by Konstantin Belousov
this situation at all.
/dev/ada0s1a on / (ufs, local, noatime, journaled soft-updates)
devfs on /dev (devfs, local, multilabel)
map -hosts on /net (autofs)
map -media on /media (autofs)
/dev/ada0s1a on / (ufs, local, noatime, journaled soft-updates)
devfs on /dev (devfs, local, multilabel)
map -hosts on /net (autofs)
map -media on /media (autofs)
/dev/md0 on /media/md0 (ufs, local, noatime, nosuid, automounted)
Getting rid of mountcheckdirs() in the unmount path should be fine, I
think.
How the example proves that mountcheckdirs() can be removed ?
In the unmount path? It doesn't; I just don't think autofs would be
affected,
since having a current working directory in a mountpoint prevents the
unmount
(unless it's forced).
In the mount path? See below.
Post by Edward Napierala
How can
we see which directory content was printed by ls, the covered or
mounted ?
Post by Edward Napierala
Well, it would be pretty useless if you got the covered directory.
This is exactly what is being discussed.
Post by Edward Napierala
But ok, here's
/dev/ada0s1a on / (ufs, local, noatime, journaled soft-updates)
devfs on /dev (devfs, local, multilabel)
map -hosts on /net (autofs)
map -media on /media (autofs)
total 9
drwxr-xr-x 3 root wheel 512 Sep 10 10:16 .
drwxr-xr-x 3 root wheel 512 Sep 8 11:42 ..
drwxrwxr-x 2 root operator 512 Sep 10 10:16 .snap
/dev/ada0s1a on / (ufs, local, noatime, journaled soft-updates)
devfs on /dev (devfs, local, multilabel)
map -hosts on /net (autofs)
map -media on /media (autofs)
/dev/md0 on /media/md0 (ufs, local, noatime, nosuid, automounted)
Notice the .snap/ directory; it's an empty UFS filesystem.
And the .snap is from the /dev/md0 volume, right ?
Yes.
Post by Konstantin Belousov
If yes, then this demonstration indeed proves that autofs behaves as
intended on mount.
Exactly. And, from what I understand, that would break if mountcheckdirs()
got removed from the mount path.
Konstantin Belousov
2017-09-15 11:45:33 UTC
Permalink
Post by Edward Napierala
Post by Konstantin Belousov
Post by Edward Napierala
Post by Edward Napierala
Post by Edward Napierala
Post by Konstantin Belousov
I believe that the current autofs does not allow a process to get
into
Post by Edward Napierala
Post by Edward Napierala
Post by Edward Napierala
Post by Konstantin Belousov
this situation at all.
/dev/ada0s1a on / (ufs, local, noatime, journaled soft-updates)
devfs on /dev (devfs, local, multilabel)
map -hosts on /net (autofs)
map -media on /media (autofs)
/dev/ada0s1a on / (ufs, local, noatime, journaled soft-updates)
devfs on /dev (devfs, local, multilabel)
map -hosts on /net (autofs)
map -media on /media (autofs)
/dev/md0 on /media/md0 (ufs, local, noatime, nosuid, automounted)
Getting rid of mountcheckdirs() in the unmount path should be fine, I
think.
How the example proves that mountcheckdirs() can be removed ?
In the unmount path? It doesn't; I just don't think autofs would be
affected,
since having a current working directory in a mountpoint prevents the
unmount
(unless it's forced).
In the mount path? See below.
Post by Edward Napierala
How can
we see which directory content was printed by ls, the covered or
mounted ?
Post by Edward Napierala
Well, it would be pretty useless if you got the covered directory.
This is exactly what is being discussed.
Post by Edward Napierala
But ok, here's
/dev/ada0s1a on / (ufs, local, noatime, journaled soft-updates)
devfs on /dev (devfs, local, multilabel)
map -hosts on /net (autofs)
map -media on /media (autofs)
total 9
drwxr-xr-x 3 root wheel 512 Sep 10 10:16 .
drwxr-xr-x 3 root wheel 512 Sep 8 11:42 ..
drwxrwxr-x 2 root operator 512 Sep 10 10:16 .snap
/dev/ada0s1a on / (ufs, local, noatime, journaled soft-updates)
devfs on /dev (devfs, local, multilabel)
map -hosts on /net (autofs)
map -media on /media (autofs)
/dev/md0 on /media/md0 (ufs, local, noatime, nosuid, automounted)
Notice the .snap/ directory; it's an empty UFS filesystem.
And the .snap is from the /dev/md0 volume, right ?
Yes.
Post by Konstantin Belousov
If yes, then this demonstration indeed proves that autofs behaves as
intended on mount.
Exactly. And, from what I understand, that would break if mountcheckdirs()
got removed from the mount path.
Even for autofs ? Don't autofs postpone lookup() completion until the
mount occurs ?
Edward Napierala
2017-09-15 16:06:10 UTC
Permalink
Post by Edward Napierala
Post by Edward Napierala
Post by Konstantin Belousov
Post by Edward Napierala
Post by Edward Napierala
2017-09-15 10:20 GMT+01:00 Konstantin Belousov <
Post by Konstantin Belousov
I believe that the current autofs does not allow a process to
get
Post by Edward Napierala
Post by Konstantin Belousov
into
Post by Edward Napierala
Post by Edward Napierala
Post by Konstantin Belousov
this situation at all.
/dev/ada0s1a on / (ufs, local, noatime, journaled soft-updates)
devfs on /dev (devfs, local, multilabel)
map -hosts on /net (autofs)
map -media on /media (autofs)
/dev/ada0s1a on / (ufs, local, noatime, journaled soft-updates)
devfs on /dev (devfs, local, multilabel)
map -hosts on /net (autofs)
map -media on /media (autofs)
/dev/md0 on /media/md0 (ufs, local, noatime, nosuid, automounted)
Getting rid of mountcheckdirs() in the unmount path should be
fine, I
Post by Edward Napierala
Post by Konstantin Belousov
Post by Edward Napierala
Post by Edward Napierala
think.
How the example proves that mountcheckdirs() can be removed ?
In the unmount path? It doesn't; I just don't think autofs would be
affected,
since having a current working directory in a mountpoint prevents the
unmount
(unless it's forced).
In the mount path? See below.
Post by Edward Napierala
How can
we see which directory content was printed by ls, the covered or
mounted ?
Post by Edward Napierala
Well, it would be pretty useless if you got the covered directory.
This is exactly what is being discussed.
Post by Edward Napierala
But ok, here's
/dev/ada0s1a on / (ufs, local, noatime, journaled soft-updates)
devfs on /dev (devfs, local, multilabel)
map -hosts on /net (autofs)
map -media on /media (autofs)
total 9
drwxr-xr-x 3 root wheel 512 Sep 10 10:16 .
drwxr-xr-x 3 root wheel 512 Sep 8 11:42 ..
drwxrwxr-x 2 root operator 512 Sep 10 10:16 .snap
/dev/ada0s1a on / (ufs, local, noatime, journaled soft-updates)
devfs on /dev (devfs, local, multilabel)
map -hosts on /net (autofs)
map -media on /media (autofs)
/dev/md0 on /media/md0 (ufs, local, noatime, nosuid, automounted)
Notice the .snap/ directory; it's an empty UFS filesystem.
And the .snap is from the /dev/md0 volume, right ?
Yes.
Post by Konstantin Belousov
If yes, then this demonstration indeed proves that autofs behaves as
intended on mount.
Exactly. And, from what I understand, that would break if
mountcheckdirs()
Post by Edward Napierala
got removed from the mount path.
Even for autofs ? Don't autofs postpone lookup() completion until the
mount occurs ?
It does, and that case might work - I'm not sure, but it might. But what
happens afterwards, after the syscall that triggered the mount completes?
You would still have a shell with cwd in that dir, now covered by another
filesystem. Autofs doesn't do anything to "lift" the process to the newly
mounted filesystem root.
Konstantin Belousov
2017-09-15 16:27:52 UTC
Permalink
Post by Edward Napierala
Post by Edward Napierala
Post by Edward Napierala
Post by Konstantin Belousov
Post by Edward Napierala
Post by Edward Napierala
2017-09-15 10:20 GMT+01:00 Konstantin Belousov <
Post by Konstantin Belousov
I believe that the current autofs does not allow a process to
get
Post by Edward Napierala
Post by Konstantin Belousov
into
Post by Edward Napierala
Post by Edward Napierala
Post by Konstantin Belousov
this situation at all.
/dev/ada0s1a on / (ufs, local, noatime, journaled soft-updates)
devfs on /dev (devfs, local, multilabel)
map -hosts on /net (autofs)
map -media on /media (autofs)
/dev/ada0s1a on / (ufs, local, noatime, journaled soft-updates)
devfs on /dev (devfs, local, multilabel)
map -hosts on /net (autofs)
map -media on /media (autofs)
/dev/md0 on /media/md0 (ufs, local, noatime, nosuid, automounted)
Getting rid of mountcheckdirs() in the unmount path should be
fine, I
Post by Edward Napierala
Post by Konstantin Belousov
Post by Edward Napierala
Post by Edward Napierala
think.
How the example proves that mountcheckdirs() can be removed ?
In the unmount path? It doesn't; I just don't think autofs would be
affected,
since having a current working directory in a mountpoint prevents the
unmount
(unless it's forced).
In the mount path? See below.
Post by Edward Napierala
How can
we see which directory content was printed by ls, the covered or
mounted ?
Post by Edward Napierala
Well, it would be pretty useless if you got the covered directory.
This is exactly what is being discussed.
Post by Edward Napierala
But ok, here's
/dev/ada0s1a on / (ufs, local, noatime, journaled soft-updates)
devfs on /dev (devfs, local, multilabel)
map -hosts on /net (autofs)
map -media on /media (autofs)
total 9
drwxr-xr-x 3 root wheel 512 Sep 10 10:16 .
drwxr-xr-x 3 root wheel 512 Sep 8 11:42 ..
drwxrwxr-x 2 root operator 512 Sep 10 10:16 .snap
/dev/ada0s1a on / (ufs, local, noatime, journaled soft-updates)
devfs on /dev (devfs, local, multilabel)
map -hosts on /net (autofs)
map -media on /media (autofs)
/dev/md0 on /media/md0 (ufs, local, noatime, nosuid, automounted)
Notice the .snap/ directory; it's an empty UFS filesystem.
And the .snap is from the /dev/md0 volume, right ?
Yes.
Post by Konstantin Belousov
If yes, then this demonstration indeed proves that autofs behaves as
intended on mount.
Exactly. And, from what I understand, that would break if
mountcheckdirs()
Post by Edward Napierala
got removed from the mount path.
Even for autofs ? Don't autofs postpone lookup() completion until the
mount occurs ?
It does, and that case might work - I'm not sure, but it might. But what
happens afterwards, after the syscall that triggered the mount completes?
You would still have a shell with cwd in that dir, now covered by another
filesystem. Autofs doesn't do anything to "lift" the process to the newly
mounted filesystem root.
The change of the current directory should have already triggered the action,
because it was preceeded by lookup. At least this is my understanding of
the flow.
Andriy Gapon
2017-09-15 16:40:22 UTC
Permalink
Post by Konstantin Belousov
Post by Edward Napierala
It does, and that case might work - I'm not sure, but it might. But what
happens afterwards, after the syscall that triggered the mount completes?
You would still have a shell with cwd in that dir, now covered by another
filesystem. Autofs doesn't do anything to "lift" the process to the newly
mounted filesystem root.
The change of the current directory should have already triggered the action,
because it was preceeded by lookup. At least this is my understanding of
the flow.
From the earlier example I can only conclude that autofs does NOT do a mount on
lookup in a sense of *vpp being a root of the mounted filesystem. My
interpretation is the mounting is done only if there is a VOP_LOOKUP (or
VOP_READDIR) with a mount point as a dvp.
--
Andriy Gapon
Andriy Gapon
2017-09-15 11:56:55 UTC
Permalink
Post by Edward Napierala
/dev/ada0s1a on / (ufs, local, noatime, journaled soft-updates)
devfs on /dev (devfs, local, multilabel)
map -hosts on /net (autofs)
map -media on /media (autofs)
Interesting, I thought that /dev/md0 would get mounted as soon as /media/md0 is
looked up. But maybe that would be sub-optimal for some common scenarios...
FWIW, ZFS snapshots get auto-mounted under .zfs as soon as there is a lookup.
Post by Edward Napierala
/dev/ada0s1a on / (ufs, local, noatime, journaled soft-updates)
devfs on /dev (devfs, local, multilabel)
map -hosts on /net (autofs)
map -media on /media (autofs)
/dev/md0 on /media/md0 (ufs, local, noatime, nosuid, automounted)
--
Andriy Gapon
Edward Napierala
2017-09-15 16:02:03 UTC
Permalink
Post by Andriy Gapon
Post by Edward Napierala
/dev/ada0s1a on / (ufs, local, noatime, journaled soft-updates)
devfs on /dev (devfs, local, multilabel)
map -hosts on /net (autofs)
map -media on /media (autofs)
Interesting, I thought that /dev/md0 would get mounted as soon as /media/md0 is
looked up. But maybe that would be sub-optimal for some common scenarios...
FWIW, ZFS snapshots get auto-mounted under .zfs as soon as there is a lookup.
You really don't want it to work that way. If you have a directory with
thousands
of mountpoints inside, you don't want to mount all of them every time you
do "ls"
in that directory. Same problem with mounting on "cd". That's why autofs
triggers
in three cases: in VOP_READDIR(9), VOP_GETATTR(9), and VOP_LOOKUP(9)
with the vnode as a parent dir.
Andriy Gapon
2017-09-15 16:36:04 UTC
Permalink
Post by Andriy Gapon
Post by Edward Napierala
/dev/ada0s1a on / (ufs, local, noatime, journaled soft-updates)
devfs on /dev (devfs, local, multilabel)
map -hosts on /net (autofs)
map -media on /media (autofs)
Interesting, I thought that /dev/md0 would get mounted as soon as /media/md0 is
looked up.  But maybe that would be sub-optimal for some common scenarios...
FWIW, ZFS snapshots get auto-mounted under .zfs as soon as there is a lookup.
You really don't want it to work that way.  If you have a directory with thousands
of mountpoints inside, you don't want to mount all of them every time you do "ls"
in that directory.
Well, "ls" wouldn't trigger the mounting as it's just VOP_READDIR in the parent
vnode, "ls -l" (or any other options that require file attributes) is a
different story :-)
Post by Andriy Gapon
Same problem with mounting on "cd".  That's why autofs triggers
in three cases: in VOP_READDIR(9), VOP_GETATTR(9), and VOP_LOOKUP(9)
with the vnode as a parent dir.
Well, "cd" is also a different story, you can't cd to hundreds of directories at
once. Of course, the filesystem doesn't really know if it's "cd" or something
else that triggered VOP_LOOKUP. Although VOP_ACCESS could be used as a hint, it
is a rather weak hint.

In any case, I prefer the mount-on-lookup approach as it is "more seamless".
--
Andriy Gapon
Andriy Gapon
2017-09-15 11:58:23 UTC
Permalink
Post by Kirk McKusick
I had to dig back through some *really* old emails to find out what
triggered the addition of mountcheckdirs(). The problem that it was
specifically solving was that as part of the startup script a minimal
root directory was replaced by the real root directory. The shell
running the startup script needed to be moved to the new mounted-on
root so that the rest of the script would not fail.
That disaster of a hack has been replaced with the much more functional
code that deals with setting up the root and the devfs filesystem on
/dev. So the need for which it was designed no longer exists. But I
still believe that it is the correct thing to do. For example, if you
are using automount code and chdir into your home directory triggering
an auto-mount, you should just be in your home directory after the
mount rather than having to do cd ../$USER to get there.
Kirk,
thank you very much both for the historical reference and for the auto-mount
example.
--
Andriy Gapon
Rodney W. Grimes
2017-09-15 15:19:15 UTC
Permalink
Post by Edward Napierala
Post by Andriy Gapon
Post by Kirk McKusick
Subject: Re: mount / unmount and mountcheckdirs()
Date: Thu, 14 Sep 2017 15:45:07 +0300
Post by Kirk McKusick
I added the checkdirs functionality in the mount direction only
(I actually did it in 4.4BSD-Lite and it got swept in with commit
22521). The reason is that when a directory that is not empty is
mounted on, the expectation is that the entries in that directory
should no longer be present; rather they should be replaced by the
entries in the newly mounted directory. Thus all processes sitting
in the mounted on directory should see the newly mounted directory
as if they had come to it using a lookup after the mount had been
done. If a process had proceeded through the mounted on directory
into one of its other entries, then they are left alone until such
time as they chdir back into the mount point directory through ".."
at which time they will be passed up to the mounted directory using
the same mechanism that would put them there if they traversed into
the mount point from above it in the tree. I believe this is the
correct behavior, is not a security threat, and should be left alone.
I almost dropped a ball on this issue, but I am now picking it up
again.
Post by Kirk McKusick
At the moment I am moving forward with the dounmount change as it
seems to be
Post by Kirk McKusick
non-contentious and rather simple to do and test.
Regarding the mount part, I am not sure that I completely agree with
you.
Post by Kirk McKusick
Even if mountcheckdirs() does not cause any problems in the mount
path, I still
Post by Kirk McKusick
fail to see its usefulness. Specifically, I still do not see any
significant
Post by Kirk McKusick
difference between the covered directory and any directory below it.
So, if we
Post by Kirk McKusick
leave the lower directories alone, while bother with the covered
directory...
Post by Kirk McKusick
- absolute paths work correctly
- relative paths with enough ".." (one) can access the actual namespace
- other relative paths operate on the shadowed sub-tree of the original
filesystem
- absolute paths work correctly
- relative paths with enough ".." (> 1) can access the actual namespace
- other relative paths operate on the shadowed sub-tree of the original
filesystem
The only difference I can think of is that the root of the mounted
filesystem
Post by Kirk McKusick
cannot be reached with just ".."-s from the covered directory. But is
this
Post by Kirk McKusick
difference of any significance?
Mateusz also raised some interesting points.
On the other hand, it seems that illumos and probably Solaris has
interesting parallels to the FreeBSD behavior. It does not allow
to mount over a directory that is a current directory for any process
("Device busy"), but does not object against processes in directories
below the mount point.
So, probably it's just I who misses something about that scenario :-)
Post by Kirk McKusick
I was not aware that the functionality had been added at unmount
time, and I do not believe that it should have been done. Normally
an unmount will not succeed if any vnodes are busy (for example, if
any directory in the filesystem is a current directory). The only
way that it can succeed in such a case is if a forcible unmount is
done. The forcible unmount will effectively do a revoke(2) on all
current directory vnodes in the unmounted filesystem. Further attempts
to access them will fail with "." not found errors. The only way to
get a valid current directory is to chdir to an absolute pathname.
Gratuitously fixing this if you happen to be in the former root of
the filesystem is wrong. And as you note can lead to unintensionally
giving an escape path from a prison. So I concur with your removing
this added functionality.
--
Andriy Gapon
I had to dig back through some *really* old emails to find out what
triggered the addition of mountcheckdirs(). The problem that it was
specifically solving was that as part of the startup script a minimal
root directory was replaced by the real root directory. The shell
running the startup script needed to be moved to the new mounted-on
root so that the rest of the script would not fail.
If the mountcheckdirs() code not going away, please add your spelunking
results as a comment before the function. This theme is recurring, and
it would be highly beneficial to not loose the non-trivial reasoning
behind the code existence.
Post by Kirk McKusick
That disaster of a hack has been replaced with the much more functional
code that deals with setting up the root and the devfs filesystem on
/dev. So the need for which it was designed no longer exists. But I
still believe that it is the correct thing to do. For example, if you
are using automount code and chdir into your home directory triggering
an auto-mount, you should just be in your home directory after the
mount rather than having to do cd ../$USER to get there.
I believe that the current autofs does not allow a process to get into
this situation at all.
This process should be blocked before the cd is completed and
resumed once the automouter has done its job. Even per the
autofs man page that is what should be happening:

DESCRIPTION
The autofs driver is the kernel component of the automounter
infrastructure. Its job is to pass mount requests to the automountd(8)
daemon, and pause the processes trying to access the automounted
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
filesystem until the mount is completed. It is mounted by the
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
automount(8).
Post by Edward Napierala
/dev/ada0s1a on / (ufs, local, noatime, journaled soft-updates)
devfs on /dev (devfs, local, multilabel)
map -hosts on /net (autofs)
map -media on /media (autofs)
This looks like a fail of the autofs code to block the
process in cd before it changes the cwd.
Post by Edward Napierala
/dev/ada0s1a on / (ufs, local, noatime, journaled soft-updates)
devfs on /dev (devfs, local, multilabel)
map -hosts on /net (autofs)
map -media on /media (autofs)
/dev/md0 on /media/md0 (ufs, local, noatime, nosuid, automounted)
Getting rid of mountcheckdirs() in the unmount path should be fine, I think.
--
Rod Grimes ***@freebsd.org
Rodney W. Grimes
2017-09-15 15:10:25 UTC
Permalink
Post by Konstantin Belousov
Post by Kirk McKusick
Subject: Re: mount / unmount and mountcheckdirs()
Date: Thu, 14 Sep 2017 15:45:07 +0300
Post by Kirk McKusick
I added the checkdirs functionality in the mount direction only
(I actually did it in 4.4BSD-Lite and it got swept in with commit
22521). The reason is that when a directory that is not empty is
mounted on, the expectation is that the entries in that directory
should no longer be present; rather they should be replaced by the
entries in the newly mounted directory. Thus all processes sitting
in the mounted on directory should see the newly mounted directory
as if they had come to it using a lookup after the mount had been
done. If a process had proceeded through the mounted on directory
into one of its other entries, then they are left alone until such
time as they chdir back into the mount point directory through ".."
at which time they will be passed up to the mounted directory using
the same mechanism that would put them there if they traversed into
the mount point from above it in the tree. I believe this is the
correct behavior, is not a security threat, and should be left alone.
I almost dropped a ball on this issue, but I am now picking it up again.
At the moment I am moving forward with the dounmount change as it seems to be
non-contentious and rather simple to do and test.
Regarding the mount part, I am not sure that I completely agree with you.
Even if mountcheckdirs() does not cause any problems in the mount path, I still
fail to see its usefulness. Specifically, I still do not see any significant
difference between the covered directory and any directory below it. So, if we
leave the lower directories alone, while bother with the covered directory...
- absolute paths work correctly
- relative paths with enough ".." (one) can access the actual namespace
- other relative paths operate on the shadowed sub-tree of the original
filesystem
- absolute paths work correctly
- relative paths with enough ".." (> 1) can access the actual namespace
- other relative paths operate on the shadowed sub-tree of the original
filesystem
The only difference I can think of is that the root of the mounted filesystem
cannot be reached with just ".."-s from the covered directory. But is this
difference of any significance?
Mateusz also raised some interesting points.
On the other hand, it seems that illumos and probably Solaris has
interesting parallels to the FreeBSD behavior. It does not allow
to mount over a directory that is a current directory for any process
("Device busy"), but does not object against processes in directories
below the mount point.
So, probably it's just I who misses something about that scenario :-)
Post by Kirk McKusick
I was not aware that the functionality had been added at unmount
time, and I do not believe that it should have been done. Normally
an unmount will not succeed if any vnodes are busy (for example, if
any directory in the filesystem is a current directory). The only
way that it can succeed in such a case is if a forcible unmount is
done. The forcible unmount will effectively do a revoke(2) on all
current directory vnodes in the unmounted filesystem. Further attempts
to access them will fail with "." not found errors. The only way to
get a valid current directory is to chdir to an absolute pathname.
Gratuitously fixing this if you happen to be in the former root of
the filesystem is wrong. And as you note can lead to unintensionally
giving an escape path from a prison. So I concur with your removing
this added functionality.
--
Andriy Gapon
I had to dig back through some *really* old emails to find out what
triggered the addition of mountcheckdirs(). The problem that it was
specifically solving was that as part of the startup script a minimal
root directory was replaced by the real root directory. The shell
running the startup script needed to be moved to the new mounted-on
root so that the rest of the script would not fail.
If the mountcheckdirs() code not going away, please add your spelunking
results as a comment before the function. This theme is recurring, and
it would be highly beneficial to not loose the non-trivial reasoning
behind the code existence.
Post by Kirk McKusick
That disaster of a hack has been replaced with the much more functional
code that deals with setting up the root and the devfs filesystem on
/dev. So the need for which it was designed no longer exists. But I
still believe that it is the correct thing to do. For example, if you
are using automount code and chdir into your home directory triggering
an auto-mount, you should just be in your home directory after the
mount rather than having to do cd ../$USER to get there.
I believe that the current autofs does not allow a process to get into
this situation at all.
I believe the ancient automountd did not allow a process to
get into this situation either, the process got blocked before
the completion of the cd that triggers the automount, and
continued after the (auto)mount completed.
Post by Konstantin Belousov
In fact, the behavior implemented by mountcheckdirs() is surprising as
well. For instance, I did expected that the system would operate as if
mountcheckdirs() does not exist, and it caused me some head-scratching
when I see it first time.
_______________________________________________
https://lists.freebsd.org/mailman/listinfo/freebsd-arch
--
Rod Grimes ***@freebsd.org
Loading...