+++ This bug was initially created as a clone of Bug #537969 +++

When converting diskless machines (PXE boot, NFSv4 root) from FC11 to FC12 I saw many problems.

Firstly, boot arguments are discussed here: http://fedoraproject.org/wiki/Dracut/Options#NFS

But arguments such as this do not work:

root=dhcp root-path=nfs4:server:/root/path

Nor this:

root=/dev/nfs nfsroot=server:/root/path

However, this almost works, but gives similar issues to those described in bug #537969:

root=nfs4:server:/root/path

With that, the messages/errors below are given:

---snip ---
dracut: Mounted root filesystem server:/root/path
dracut: Switching root
mount: only root can do that
---snip ---

So the root path is mounted, and the system proceeds to boot, but because the root filesystem is not read/write at this point nothing works. I don't get this... it would appear that mount thinks it's not being run as the root superuser, which sounds very odd.
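For anyone scripting against these arguments, here is a minimal sketch of how the one form that almost works, root=nfs4:server:/root/path, splits into its parts. parse_nfs_root is a hypothetical helper written for illustration, not dracut's own parser, and it assumes exactly three colon-separated fields with no trailing mount options.

```shell
# parse_nfs_root ARG
# Hypothetical helper (not part of dracut): split an argument of the
# form root=FSTYPE:SERVER:PATH and print "FSTYPE SERVER PATH".
# Assumes exactly three colon-separated fields and no mount options.
parse_nfs_root() {
    arg=${1#root=}        # drop the leading "root="
    fstype=${arg%%:*}     # text before the first colon, e.g. nfs4
    rest=${arg#*:}        # everything after the first colon
    server=${rest%%:*}    # text before the next colon
    path=${rest#*:}       # remainder is the exported path
    echo "$fstype $server $path"
}
```

With the argument from this report, parse_nfs_root 'root=nfs4:server:/root/path' prints "nfs4 server /root/path".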
A while back I spent several hours trying to track down the cause of this error. It turns out that it's because the mount binary has the setuid bit set. Since idmapd isn't running, the mount binary is owned by nfsnobody, so mount is setuid nfsnobody. When root runs mount, the effective uid switches to nfsnobody, which makes mount give up.
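To check whether a given mount binary is in the state described above, something like the following may help. It is only a sketch: suid_status is a hypothetical helper name, and it assumes GNU stat (for the -c option) is available on the machine where it is run.

```shell
# suid_status FILE
# Hypothetical helper: report whether FILE has the setuid bit set and
# which numeric uid owns it. Assumes GNU coreutils stat (-c option).
suid_status() {
    owner_uid=$(stat -c '%u' "$1") || return 1
    if [ -u "$1" ]; then      # -u is true when the setuid bit is set
        echo "setuid owner-uid=$owner_uid"
    else
        echo "no-setuid owner-uid=$owner_uid"
    fi
}
```

Running suid_status /bin/mount on the booted client should show a setuid binary owned by uid 0 on a healthy system. A setuid binary owned by any other uid would match the failure described in this comment.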
Anyway, either the init scripts need to be drastically changed, or Dracut needs to leave idmapd running through the pivot. I tried making idmapd start immediately after pivot, but I ran into lots of problems and eventually gave up. I think this needs to be worked on by someone who knows Dracut better than I do.
Is there anything I can do to help progress a fix here? Am willing to try various things in the init scripts, etc. if that helps. This is becoming urgent as until fixed one cannot update the kernel on machines with an NFS root. The NFS root machines I have are still on an FC11 kernel as initrd created images work, but Dracut ones do not.
More updates on this.

1. Andrew McNabb suggested this problem is due to 'mount' being SUID. I removed SUID from both of the files below on the initramfs image:
   /sbin/mount.nfs
   /bin/mount
   This had no effect.

2. I then looked at the /sbin/nfsroot script and put '-x' on the first line to see what it was doing. It definitely executes rpc.idmapd prior to mounting root.

3. I also saw that the nfsroot script looks for an optional 'nfsrw=rw' argument, so I put that in the kernel args to boot. I placed a 'mount' command right at the end of the /sbin/nfsroot script and confirmed that the root was indeed being mounted RW prior to switch_root. This didn't help.

4. What I find odd in all this is that the NFS server is not showing any mount requests in the logs. What is in there is:
   rpc.idmapd[1732]: nss_getpwnam: name 'nobody' does not map into domain 'my.domain.name'

5. I then put '-x' on the init script itself. I can see the correct arguments are being passed to switch_root, and can see that the 'mount: only root can do that' message comes out after switch_root has been exec'd. Given that at this point /sysroot is mounted RW as required, what's the big deal?

6. Noticing #5, I thought perhaps it's the 'mount' on the NFS root, instead of the one on the ramdisk, that's causing the issue, so I removed SUID on that. This has the effect of removing the 'mount: only root can do that' message, and init starts. There are a couple of errors on chgrp but then the system hangs. The last message is 'Starting HAL daemon [OK]'.

Is any of this helping?
Robin, thanks for adding your experiences about the problem. As you noted in item 6, the problems are post-pivoting, so the mount-SUID problem is in the NFS root, not the initramfs. I hadn't tried making root non-SUID, but I'm not shocked that it still hangs, since rpc.idmapd probably still isn't starting early enough. This really is a frustrating problem.
The problem is that /etc/passwd (in the initramfs) has no entry for root.
Created attachment 406658 [details] Patch so /etc/passwd in initramfs gets an entry for root.
Ian, does that actually work? It doesn't seem to me like it would solve the problem, but I haven't actually tried it. Have you tested it?
Absolutely. You can boot with rdbreak on the command line and you get a shell prompt just before the switch root. Do an ls -l /sysroot and you see that for the mounted root file system, practically everything is owned by MAX_INT (nfsnobody). The groups are all OK because the full /etc/group gets copied across.

When you think about it, idmapd tries to map the user name "root" from across the wire to a uid locally, but it doesn't know what the uid is because there is no entry in /etc/passwd. The result is that the mount executable on the new root is SUID nfsnobody.

Installations using LDAP or something similar might not need the patch (but it would do no harm). This would require that "root" and other "system" uids be in LDAP, which I am not sure is standard practice.

[I'm not sure what standard practice is with nfs4 and LDAP. I'm used to users' uids coming from NIS or YP or LDAP, but having "system" uids still locally defined in /etc/passwd, which causes problems with nfs4 when the system uids get out of sync. Whilst OT for this particular bug, it is an issue which stops nfs4 being a drop-in replacement for nfs3.]
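The condition described here can be tested directly. The following is a rough sketch, not something taken from the patch: has_root_entry is a hypothetical helper that succeeds only if a passwd-format file maps the name "root" to uid 0, which is the mapping idmapd needs for the ownership on /sysroot to come out right.

```shell
# has_root_entry FILE
# Hypothetical helper: exit 0 if the passwd-format FILE contains a line
# whose name field is "root" and whose uid field is 0, else exit 1.
has_root_entry() {
    awk -F: '$1 == "root" && $3 == "0" { found = 1 }
             END { exit !found }' "$1"
}
```

From the rdbreak shell this could be run as has_root_entry /etc/passwd && echo ok, assuming awk is present in the initramfs, which I have not verified. It can equally be run against an /etc/passwd extracted from an unpacked image.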
The errors we've reported, such as "mount: only root can do that" occur post-pivot, so it's very surprising that the initramfs /etc/passwd (pre-pivot) would be related. To make sure I'm understanding you correctly, do you mean that you added root to /etc/passwd in the initramfs, and this made the "mount: only root can do that" error go away and the system boot correctly?
You understand me correctly! Pre-pivot, the mount executable used is the one in the initramfs and it runs as root. Post-pivot, the mount which runs is on the root partition, which (without the patch) is SUID nfsnobody. Now maybe there is a different solution which involves using chroot to run rpc.idmapd, restarting it, or giving it a kick to read the new /etc/passwd early in rc.sysinit, but the patch as given works, at least enough to get the readonly-root rwtab and statetab magic done and get to a login prompt with at least most services running. The "mount: only root can do that" messages are definitely gone! I'll check when I get a chance whether there is any legacy effect whereby rpc.idmapd only recognizes uids in the initramfs version of /etc/passwd.
Reading the previous comments eg #2, I should perhaps clarify. rpc.idmapd IS started early enough and it IS left running through the pivot (all with dracut-004-4). The sunrpc vfs is moved to the new root prior to the pivot.
(In reply to comment #7)
> Created an attachment (id=406658) [details]
> Patch so /etc/passwd in initramfs gets an entry for root.

this is already in dracut git...
http://dracut.git.sourceforge.net/git/gitweb.cgi?p=dracut/dracut;a=commitdiff;h=93bc3d440c937486a1885050818431746b171cea

you might want to test dracut-005
http://download.fedoraproject.org/pub/fedora/linux/development/rawhide/i386/os/Packages/dracut-005-1.fc14.noarch.rpm
http://download.fedoraproject.org/pub/fedora/linux/development/rawhide/i386/os/Packages/dracut-network-005-1.fc14.noarch.rpm
http://download.fedoraproject.org/pub/fedora/linux/development/rawhide/i386/os/Packages/dracut-generic-005-1.fc14.noarch.rpm
http://download.fedoraproject.org/pub/fedora/linux/development/rawhide/i386/os/Packages/dracut-tools-005-1.fc14.noarch.rpm
It's true... I can confirm that if one places a line for 'root' in the initramfs /etc/passwd, incredibly, the system boots correctly! Does anyone know when this is likely to be rolled into an FC12 release?
yes, I should release an update for F12... maybe even tomorrow
dracut-005-2.fc12 has been submitted as an update for Fedora 12. http://admin.fedoraproject.org/updates/dracut-005-2.fc12
*** Bug 564294 has been marked as a duplicate of this bug. ***
However celebration should be muted! I have discovered that bind mounts don't work, which breaks the rwtab and statetab stuff if you are using readonly-root. I think this is really a separate bug and probably a kernel one. I have reported it as https://bugzilla.kernel.org/show_bug.cgi?id=15789
dracut-005-2.fc12 has been pushed to the Fedora 12 testing repository. If problems still persist, please make note of it in this bug report. If you want to test the update, you can install it with su -c 'yum --enablerepo=updates-testing update dracut'. You can provide feedback for this update here: http://admin.fedoraproject.org/updates/dracut-005-2.fc12
For nfs4 root, note the kernel bug https://bugzilla.kernel.org/show_bug.cgi?id=15854 It seems that nfs4 root won't work ATM.
dracut-005-2.fc12 has been pushed to the Fedora 12 stable repository. If problems still persist, please make note of it in this bug report.
I just upgraded an NFS FC12 box to dracut-005-2.fc12.noarch to try this. The date on the binary is:

# ls -la `which dracut`
-rwxr-xr-x 1 root root 11073 2010-04-15 23:47 /sbin/dracut

I removed and re-installed an FC12 kernel on the diskless machine to regenerate the initrd image.

Using the NFS boot args (TFTPd config) "root=nfs4:server:/path/to/root" does not work. On boot one gets an error pertaining to rpc.statd not running when attempting to remount root read/write.

However, using args "root=nfs4:server:/path/to/root nfsrw=rw" seems to work, but...

While the machine appears OK, it probably isn't, because the message log has many errors such as:

May 15 22:04:02 iiwi rpc.statd[1305]: creat(/var/lib/nfs/statd/sm/hsem.my.domain) failed: Permission denied
May 15 22:04:02 iiwi rpc.statd[1305]: STAT_FAIL to iiwi.my.domain for SM_MON of 10.1.0.6
May 15 22:04:02 iiwi kernel: lockd: cannot monitor hsem

In this example, 'hsem' is the serving host (10.1.0.6) and 'iiwi' is the diskless, remote-booting NFS root host.
Does anyone know if this bug persists in FC13?
This message is a reminder that Fedora 12 is nearing its end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 12. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '12'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 12's end of life. Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 12 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug to the applicable version. If you are unable to change the version, please add a comment here and someone will do it for you. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Fedora 12 changed to end-of-life (EOL) status on 2010-12-02. Fedora 12 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. Thank you for reporting this bug and we are sorry it could not be fixed.