Bug 745781

Summary: Unable to use indirect mounts
Product: [Fedora] Fedora
Reporter: Daniel Berrangé <berrange>
Component: autofs
Assignee: Ian Kent <ikent>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified
Version: 16
CC: ajayr, dhowells, hobbes1069, ikent, lpoetter, marcus.moeller, sandro
Fixed In Version: autofs-5.0.6-3.fc16
Doc Type: Bug Fix
Last Closed: 2011-11-10 17:35:46 UTC
Attachments:
  autofs debug log of failure
  Patch - fix fix map source check in file lookup
  Patch - add disable move mount configure option
  Proposed F16 update

Description Daniel Berrangé 2011-10-13 12:12:36 UTC
Created attachment 527962
autofs debug log of failure

Description of problem:
My /etc/auto.master contains:


  /net    -hosts

Attempting to activate an indirect mount fails though:

# ls /net/marrow.gsslab.fab.redhat.com/
ls: cannot access /net/marrow.gsslab.fab.redhat.com/: No such file or directory

The host is accessible, and can be successfully mounted manually.

Autofs syslog says:

Oct 13 13:02:39 dhcp-188 automount[1200]: move_mount: failed to move mount from /tmp/autoRGerI0 to /net/marrow.gsslab.fab.redhat.com: No such file or directory


And /net seems unhappy

# ls -al /net
ls: cannot access /net/marrow.gsslab.fab.redhat.com: No such file or directory
total 4
drwxr-xr-x.  3 root root    0 Oct 13 13:06 .
dr-xr-xr-x. 23 root root 4096 Oct 13 13:05 ..
d??????????  ? ?    ?       ?            ? marrow.gsslab.fab.redhat.com


I will attach the full logs shown after setting /etc/sysconfig/autofs to 'debug'


SELinux is in permissive mode.

Version-Release number of selected component (if applicable):

autofs-5.0.6-2.fc16.x86_64
kernel-3.1.0-0.rc9.git0.0.fc16.x86_64

How reproducible:
Seems to be consistent across reboots

Steps to Reproduce:
1. Set up /net as an indirect mount
2. Attempt to mount some servers

Comment 1 Ian Kent 2011-10-13 14:15:38 UTC
> # ls /net/marrow.gsslab.fab.redhat.com/
> ls: cannot access /net/marrow.gsslab.fab.redhat.com/: No such file or directory

Does the same thing happen if you leave out the trailing slash?

Comment 2 Ian Kent 2011-10-13 14:17:27 UTC
> I will attach the full logs shown after setting /etc/sysconfig/autofs
> to 'debug'

Don't forget to check that syslog is actually logging daemon.*
messages.

Comment 3 Daniel Berrangé 2011-10-13 15:14:12 UTC
> Does the same thing happen if you leave out the trailing slash?

Yes, no difference

> Don't forget to check that syslog is actually logging daemon.*
> messages.

Setting daemon.* didn't increase the amount of log information, over what I've already attached to this ticket, so I presume that attachment contains everything.

Comment 4 Ian Kent 2011-10-13 16:10:26 UTC
(In reply to comment #3)
> > Does the same thing happen if you leave out the trailing slash?
> 
> Yes, no difference
> 
> > Don't forget to check that syslog is actually logging daemon.*
> > messages.
> 
> Setting daemon.* didn't increase the amount of log information, over what I've
> already attached to this ticket, so I presume that attachment contains
> everything.

OK, thanks, I'll update my kernel source and try and duplicate
it.

What is the machine that exports these running?
Are these supposed to be NFSv4 or v3 mounts?

Comment 5 Daniel Berrangé 2011-10-13 16:34:46 UTC
The server is F14. If I mount it manually I end up with the following:

marrow.gsslab.fab.redhat.com:/var/lib/libvirt/images/ /tmp/f nfs rw,relatime,vers=3,rsize=524288,wsize=524288,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.33.8.114,mountvers=3,mountport=35386,mountproto=udp,local_lock=none,addr=10.33.8.114 0 0

Comment 6 Ian Kent 2011-10-14 01:33:11 UTC
(In reply to comment #5)
> The server is F14. If I mount it manually I end up with the following:
> 
> marrow.gsslab.fab.redhat.com:/var/lib/libvirt/images/ /tmp/f nfs
> rw,relatime,vers=3,rsize=524288,wsize=524288,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.33.8.114,mountvers=3,mountport=35386,mountproto=udp,local_lock=none,addr=10.33.8.114
> 0 0

This is strange, but it's moving the mount (from the construction
area) that fails in autofs, not the mount itself, so being able
to mount it isn't surprising.

I still need to get the source of this kernel and have a look but
I think it has the recent changes that resulted from a spirited
debate upstream. I tested a kernel with those changes and I thought
this case was covered but maybe not. What's worse is that I hadn't
yet returned to finish testing and I'm pretty sure those subsequent
tests include examples of the case above.

So, let me have a look at the source and run some more tests
against it and get back.

Ian

Comment 7 Ian Kent 2011-10-14 05:04:28 UTC
(In reply to comment #6)
> 
> I still need to get the source of this kernel and have a look but
> I think it has the recent changes that resulted from a spirited
> debate upstream. I tested a kernel with those changes and I thought
> this case was covered but maybe not. What's worse is that I hadn't
> yet returned to finish testing and I'm pretty sure those subsequent
> tests include examples of the case above.
> 
> So, let me have a look at the source and run some more tests
> against it and get back.

So far I've tested with a 3.1.0-rc8 that includes the vfs-automount
changes that would have been included in 3.1.0-rc9 on an F14 install.

The additional test I did was the autofs Connectathon test.
It uses a wide range of valid and invalid mount map syntaxes
including map entries similar to what was being mounted here.

I also exported "/" and a path to another file system and tried
to simulate the symptom here mounting each and then moving the
root mount. That's not exactly what is used here but is quite
close.

I haven't seen a problem yet.

Next thing to do is test against F16.

Ian

Comment 8 Daniel Berrangé 2011-10-14 08:40:36 UTC
> The server is F14. If I mount it manually I end up with the following:

I should clarify because this is slightly ambiguous. The *NFS* server is F14.  The client where I run autofs is F16.

Comment 9 Ian Kent 2011-10-14 13:08:50 UTC
(In reply to comment #7)
> 
> So far I've tested with a 3.1.0-rc8 that includes the vfs-automount
> changes that would have been included in 3.1.0-rc9 on an F14 install.
> 
> The additional test I did was the autofs Connectathon test.
> It uses a wide range of valid and invalid mount map syntaxes
> including map entries similar to what was being mounted here.
> 
> I also exported "/" and a path to another file system and tried
> to simulate the symptom here mounting each and then moving the
> root mount. That's not exactly what is used here but is quite
> close.
> 
> I haven't seen a problem yet.
> 
> Next thing to do is test against F16.

And with F16 I see the fail.

This isn't dependent on autofs and doesn't appear related to
the kernel changes that went into rc8. Every "mount --move"
I try fails at the mount(2) call and returns -EINVAL.

I can't see why yet.

Ian

Comment 10 Ian Kent 2011-10-24 14:48:34 UTC
(In reply to comment #9)
> 
> And with F16 I see the fail.
> 
> This isn't dependent on autofs and doesn't appear related to
> the kernel changes that went into rc8. Every "mount --move"
> I try fails at the mount(2) call and returns -EINVAL.
> 
> I can't see why yet.

The root file system in f16 is marked as shared which means that
move mount is not permitted anywhere within any filesystems, unless
a filesystem is explicitly marked as not shared, since that will be
propagated to subordinate mounts.

I have no idea why or where this is being done, I just can't find
where it happens.

I could re-write the code which uses move mount, since we have the
new vfs-automount in kernel but that would introduce a restriction
on what kernel version can be expected to work reliably under
pressure.

At the moment I'm stumped as to how to find out why and where this
happens, any suggestions of who we should consult?
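
One way to see which mounts are marked shared is to read the optional fields in /proc/self/mountinfo, where a "shared:N" tag marks a shared mount and "master:N" marks a slave. The following is a minimal Python sketch of that check, not anything taken from autofs; the function name propagation() and the sample lines are made up for illustration.

```python
# Illustrative sketch only: classify the propagation type of a mount from
# one line of /proc/self/mountinfo. The sample lines below are hypothetical.

def propagation(mountinfo_line):
    """Return 'shared', 'slave', 'shared+slave', 'unbindable' or 'private'."""
    fields = mountinfo_line.split()
    # Fields 0-5 are fixed; optional fields follow until a lone "-".
    separator = fields.index("-", 6)
    tags = {field.split(":", 1)[0] for field in fields[6:separator]}
    if "unbindable" in tags:
        return "unbindable"
    kinds = []
    if "shared" in tags:
        kinds.append("shared")
    if "master" in tags:
        kinds.append("slave")
    # A mount with no propagation tags is private.
    return "+".join(kinds) or "private"


SAMPLE_LINES = [
    "22 1 253:0 / / rw,relatime shared:1 - ext4 /dev/sda1 rw",
    "40 22 0:35 / /tmp/vroot rw,relatime - tmpfs none rw",
]

for line in SAMPLE_LINES:
    mount_point = line.split()[4]
    print(mount_point, propagation(line))
```

On a live system the same function could be applied to each line read from /proc/self/mountinfo. Newer util-linux releases can report the same information with "findmnt -o TARGET,PROPAGATION".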

Comment 11 Daniel Berrangé 2011-10-24 15:01:19 UTC
There was an RFE against SystemD to make the root filesystem shared, instead of private. The reason for this is so that mounts automatically propagate into application sandboxes.

In my testing it appears you can move from a private filesystem into a shared filesystem, but not in the other direction. So one trick would be to mount a tmpfs directory private, have autofs use that initially, and then move to the real location on the shared root FS.

Here's what I did to test this idea

# mount --make-shared /
# mount -t tmpfs none /tmp/vroot
# mount --make-private /tmp/vroot


... setup our two mount points

# mkdir /tmp/vroot/a
# mkdir /tmp/e


... mount on the private FS originally

# mount /dev/loop0  /tmp/vroot/a

...move from private to shared fs:

# mount --move /tmp/vroot/a /tmp/e

...see move from shared to private fail as described earlier

# mount --move /tmp/e /tmp/vroot/a
mount: wrong fs type, bad option, bad superblock on /tmp/e,
       missing codepage or helper program, or other error
       In some cases useful info is found in syslog - try
       dmesg | tail  or so
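
The result of this test is consistent with the kernel's shared-subtree rule that a mount whose parent mount is shared cannot be moved. As a rough sketch, assuming that rule and using made-up mountinfo snapshots modelled on the commands above (the helper names parse_mounts() and parent_is_shared() are hypothetical), the check could be written in Python as:

```python
# Illustrative sketch only: predict whether "mount --move" of a mount point
# would be refused because its parent mount is shared. The snapshots are
# hypothetical /proc/self/mountinfo contents modelled on the test above.

def parse_mounts(mountinfo_text):
    """Map mount ID -> (parent ID, mount point, is_shared)."""
    mounts = {}
    for line in mountinfo_text.splitlines():
        fields = line.split()
        if len(fields) < 7:
            continue  # skip blank or malformed lines
        separator = fields.index("-", 6)
        shared = any(f.startswith("shared:") for f in fields[6:separator])
        mounts[fields[0]] = (fields[1], fields[4], shared)
    return mounts


def parent_is_shared(mountinfo_text, mount_point):
    """True if the mount at mount_point sits under a shared parent mount."""
    mounts = parse_mounts(mountinfo_text)
    matches = [m for m in mounts.values() if m[1] == mount_point]
    if not matches:
        raise ValueError("no mount at " + mount_point)
    parent_id = matches[-1][0]  # assume the last listed mount is the top one
    parent = mounts.get(parent_id)
    return parent is not None and parent[2]


BEFORE_MOVE = """\
22 1 253:0 / / rw,relatime shared:1 - ext4 /dev/sda1 rw
40 22 0:35 / /tmp/vroot rw,relatime - tmpfs none rw
41 40 7:0 / /tmp/vroot/a rw,relatime - ext4 /dev/loop0 rw
"""

AFTER_MOVE = """\
22 1 253:0 / / rw,relatime shared:1 - ext4 /dev/sda1 rw
40 22 0:35 / /tmp/vroot rw,relatime - tmpfs none rw
41 22 7:0 / /tmp/e rw,relatime shared:2 - ext4 /dev/loop0 rw
"""

print(parent_is_shared(BEFORE_MOVE, "/tmp/vroot/a"))  # parent is the private tmpfs
print(parent_is_shared(AFTER_MOVE, "/tmp/e"))         # parent is the shared root
```

A True result corresponds to the case where the move is refused, as with the second "mount --move" above. A False result corresponds to the first move, which starts under the private tmpfs.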

Comment 12 Sandro Mathys 2011-10-26 06:58:51 UTC
*** Bug 748888 has been marked as a duplicate of this bug. ***

Comment 13 Ian Kent 2011-10-26 08:16:57 UTC
(In reply to comment #11)
> There was an RFE against SystemD to make the root filesystem shared, instead of
> private. The reason for this is so that mounts automatically propagate into
> application sandboxes.

I'm having difficulty understanding how it is justified to
break the mount move system call system wide without any
regard to the effect it will have on existing applications.

Comment 14 Marcus Moeller 2011-10-26 10:55:55 UTC
@lpoetter, shouldn't this be fixed in systemd?

Comment 15 Daniel Berrangé 2011-10-26 11:10:36 UTC
Even if systemd did not set the filesystem mount mode to 'shared', any system administrator or application can come along at any time and run

  mount --make-rshared /

which will result in autofs ceasing to work. Previously the 'sandbox' application (and others) would include an init script that did just this, and other apps would directly call mount() to make / shared.

So IMHO you can't really call this a bug in systemd, or require systemd to change / back to private. Both private or shared are perfectly valid modes for the / filesystem on any modern Linux OS, and autofs needs to be made robust for operation under whichever is configured.

Comment 16 Ian Kent 2011-10-26 12:06:38 UTC
(In reply to comment #15)
> 
> So IMHO you can't really call this a bug in systemd, or require systemd to
> change / back to private. Both private or shared are perfectly valid modes for
> the / filesystem on any modern Linux OS, and autofs needs to be made robust for
> operation under whichever is configured.

Maybe, I agree that we can't call it a "bug", but I don't agree
that this is sensible at all. This type of usage wasn't the
intent behind the pnode implementation AFAIR.

It is worth making autofs tolerant of it though and I'm still
thinking about how I should do that.

Ian

Comment 17 Marcus Moeller 2011-10-27 14:03:55 UTC
mount --make-rshared /

did not really fix the problem.

Comment 18 Ajay Ramaswamy 2011-10-30 03:10:43 UTC
This also breaks delayed mounts in fstab

I had this line in f15
UUID=110a6adb-db6f-4ddf-b7d3-6d055676ab1c /mnt/playlist           xfs     noauto,comment=systemd.automount        1 2


This works fine in F15 but not in F16

Comment 19 Ian Kent 2011-10-31 02:25:45 UTC
(In reply to comment #17)
> mount --make-rshared /
> 
> did not really fix the problem.

That's right, I think if it isn't private move mount is
forbidden.

Comment 20 Ian Kent 2011-10-31 02:27:02 UTC
(In reply to comment #18)
> This also breaks delayed mounts in fstab
> 
> I had this line in f15
> UUID=110a6adb-db6f-4ddf-b7d3-6d055676ab1c /mnt/playlist           xfs    
> noauto,comment=systemd.automount        1 2
> 
> 
> This works fine in F15 but not in F16

What does this have to do with this bug?

Comment 21 Ajay Ramaswamy 2011-11-01 02:06:37 UTC
(In reply to comment #20)
> (In reply to comment #18)
> > This also breaks delayed mounts in fstab
> > 
> > I had this line in f15
> > UUID=110a6adb-db6f-4ddf-b7d3-6d055676ab1c /mnt/playlist           xfs    
> > noauto,comment=systemd.automount        1 2
> > 
> > 
> > This works fine in F15 but not in F16
> 
> What does this have to do with this bug?

as per the docs here
https://fedoraproject.org/wiki/User:Johannbg/QA/Systemd/Systemd.mount

systemd will use autofs to mount such fstab entries, so the autofs failure meant the system did not boot until I changed the fstab.

Thanks

Comment 22 Ian Kent 2011-11-01 06:01:47 UTC
(In reply to comment #21)
> (In reply to comment #20)
> > (In reply to comment #18)
> > > This also breaks delayed mounts in fstab
> > > 
> > > I had this line in f15
> > > UUID=110a6adb-db6f-4ddf-b7d3-6d055676ab1c /mnt/playlist           xfs    
> > > noauto,comment=systemd.automount        1 2
> > > 
> > > 
> > > This works fine in F15 but not in F16
> > 
> > What does this have to do with this bug?
> 
> as per the docs here
> https://fedoraproject.org/wiki/User:Johannbg/QA/Systemd/Systemd.mount
> 
> systemd will use autofs to mount such fstab entries so the autofs failure means
> the system did not boot until I changed the fstab.

I still don't know what this has to do with the move mount option
restriction to mount(2) which is the problem being discussed here.

Comment 23 Ian Kent 2011-11-03 06:04:53 UTC
I've been thinking about this and I feel that the move mount isn't
really needed.

Certainly with current kernels that include the new vfs-automount
it shouldn't be needed. But also there have been some recent bug
fixes for possibly related issues that I wasn't aware of at the
time I added it. Also, move mount was only ever needed for a small,
not widely used subset of configurations as well.

So I'm adding a configure option to disable the use of move mount
and, while it won't be the default, it will be set in the spec
file in the autofs distribution tar and in the spec file for Fedora.

Ian

Comment 24 Ian Kent 2011-11-03 06:19:12 UTC
Created attachment 531489
Patch - fix fix map source check in file lookup

Comment 25 Ian Kent 2011-11-03 06:20:02 UTC
Created attachment 531490
Patch - add disable move mount configure option

Comment 26 Ian Kent 2011-11-03 06:26:07 UTC
These two patches appear to resolve the problem with autofs.

Note that I'm not saying anything about autofs in the kernel
(actually called autofs4) since this bug is about the user
space automount daemon which is included in the autofs
package. Also, systemd doesn't use user space autofs at all
so there may be some misunderstanding by the look of some
of the comments above.

For those that do understand this distinction please test
the scratch build found here, which includes the two patches
above:

https://koji.fedoraproject.org/koji/taskinfo?taskID=3482097

Comment 27 Marcus Moeller 2011-11-03 08:56:11 UTC
The 'move_mount: failed to move mount' error is gone, but the shares are still not accessible.

This is the only message that is logged:

rpc_get_exports_proto

The same result as if I do an:

mount --make-rprivate /

Comment 28 Ian Kent 2011-11-03 10:49:06 UTC
(In reply to comment #27)
> The 'move_mount: failed to move mount' error is gone, but the shares are still
> not accessible.
> 
> This is the only message that is logged:
> 
> rpc_get_exports_proto
> 
> The same result as if I do an:
> 
> mount --make-rprivate /

You'll need to provide more information because I don't see
a problem with getting the exports list from a server here.

What does the exports list from the server look like?
What OS is the server running?
What NFS version, v3 or v4?
 
Provide a full debug log, set LOGGING="debug" in the
autofs configuration and ensure that daemon debug messages
are being logged to syslog, which can be done by sending
daemon.* to a log file in rsyslog.conf.

Comment 29 Lennart Poettering 2011-11-03 13:00:04 UTC
(In reply to comment #15)
> Even if systemd did not set the filesystem mount mode to 'shared', any system
> administrator or application can come along at any time and run

BTW, we have discussed the default propagation mode problem with a couple of folks and always came to the same conclusion: we want the propagation mode to be a mount option like any other, so that it would be applied atomically to all mounts as they are created and it can be listed in /etc/fstab. Not sure when we'll get this from the kernel folks, but given this perspective we decided not to do this at all in systemd.

And yupp, systemd does not use the autofs package, so the bug definitely has no relation to systemd.

Comment 30 Ian Kent 2011-11-03 14:26:06 UTC
(In reply to comment #29)
> 
> And yupp, systemd does not use the autofs package, so the bug definitely has no
> relation to systemd.

I'm struggling to call this a bug.

It's a choice that's been made to provide certain functionality that
has side effects and it happens to affect autofs.

Changing autofs isn't really such a big deal, especially when running
against kernel versions where we will be likely to find systemd running
(famous last words), and that's what I'm doing.

;)

Comment 31 Ajay Ramaswamy 2011-11-06 12:50:26 UTC
(In reply to comment #22)
> (In reply to comment #21)
> > (In reply to comment #20)
> > > (In reply to comment #18)
> > > > This also breaks delayed mounts in fstab
> > > > 
> > > > I had this line in f15
> > > > UUID=110a6adb-db6f-4ddf-b7d3-6d055676ab1c /mnt/playlist           xfs    
> > > > noauto,comment=systemd.automount        1 2
> > > > 
> > > > 
> > > > This works fine in F15 but not in F16
> > > 
> > > What does this have to do with this bug?
> > 
> > as per the docs here
> > https://fedoraproject.org/wiki/User:Johannbg/QA/Systemd/Systemd.mount
> > 
> > systemd will use autofs to mount such fstab entries so the autofs failure means
> > the system did not boot until I changed the fstab.
> 
> I still don't know what this has to do with the move mount option
> restriction to mount(2) which is the problem being discussed here.


You are correct. I reinstalled (not upgrade) F-16 Gold and the fstab works fine

Comment 32 Ajay Ramaswamy 2011-11-06 12:51:46 UTC
(In reply to comment #26)
> These two patches appear to resolve the problem with autofs.
> 
> Note that I'm not saying anything about autofs in the kernel
> (actually called autofs4) since this bug is about the user
> space automount daemon which is included in the autofs
> package. Also, systemd doesn't use user space autofs at all
> so there may be some misunderstanding by the look of some
> of the comments above.
> 
> For those that do understand this distinction please test
> the scratch build found here, which includes the two patches
> above:
> 
> https://koji.fedoraproject.org/koji/taskinfo?taskID=3482097

This works fine on a fresh install of F-16 here.

Comment 33 Marcus Moeller 2011-11-07 09:19:50 UTC
re-checked. Seems to work here, too. Thanks for the great work Ian.

Comment 34 Fedora Update System 2011-11-08 05:51:04 UTC
autofs-5.0.6-3.fc16 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/autofs-5.0.6-3.fc16

Comment 35 Ian Kent 2011-11-08 06:40:39 UTC
(In reply to comment #34)
> autofs-5.0.6-3.fc16 has been submitted as an update for Fedora 16.
> https://admin.fedoraproject.org/updates/autofs-5.0.6-3.fc16

Not sure when this will show up in testing due to being in the
release phase but when it does please give it a try.

Comment 36 Ian Kent 2011-11-08 06:42:07 UTC
Created attachment 532208
Proposed F16 update

This package includes a number of fixes. It is worth building and
testing it locally (while waiting for the build system build to
reach the testing repo), as it is what I'm proposing as an update
for F16.

Comment 37 Ian Kent 2011-11-09 02:01:41 UTC
*** Bug 751766 has been marked as a duplicate of this bug. ***

Comment 38 Fedora Update System 2011-11-10 17:35:46 UTC
autofs-5.0.6-3.fc16 has been pushed to the Fedora 16 stable repository.  If problems still persist, please make note of it in this bug report.