Bug 537969 - NFSv4 readonly root cannot boot without rpc.idmapd running
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: dracut
Version: 12
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Harald Hoyer
QA Contact: Fedora Extras Quality Assurance
Duplicates: 537217
Reported: 2009-11-17 00:59 UTC by Andrew McNabb
Modified: 2012-07-26 15:50 UTC

Fixed In Version: 004-4.fc12
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 570946 (view as bug list)
Environment:
Last Closed: 2010-01-28 00:52:50 UTC

Attachments
additional patch needed for dracut-004 (760 bytes, patch)
2010-02-10 16:27 UTC, Harald Hoyer

Description Andrew McNabb 2009-11-17 00:59:38 UTC
I have tried booting from an NFSv4 root, and I have not been able to get dracut to set the NFSv4 domain correctly.  When it mounts the root filesystem, it gives the error:

rpc.idmapd: Could not find user "nobody"

I have tried logging in with a dracut shell.  Running "hostname" showed the correct FQDN--if rpc.idmapd were using this to set the domain, everything would be working.  When I tried running "hostname -s" and "hostname -f", I got the error "Resolver Error 0 (no error)".  I mention this because if rpc.idmapd is trying to do the equivalent of "hostname -f", this could be the cause of the problem.

Is there any other information I could provide that would be helpful in diagnosing this problem?  For example, is there a command to show what rpc.idmapd thinks the NFSv4 domain is?  Thanks.
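
One way to see which domain rpc.idmapd is likely to pick is to reproduce the lookup by hand. The sketch below is an approximation, not the libnfsidmap implementation: it assumes the domain comes from the Domain key in idmapd.conf and otherwise falls back to the FQDN with its first label stripped. The function name guess_nfs4_domain is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: approximate the NFSv4 domain rpc.idmapd would use.
# Assumption: "Domain = ..." in idmapd.conf wins; otherwise the domain is
# the FQDN minus its first label.
guess_nfs4_domain() {
    conf=$1   # path to idmapd.conf
    fqdn=$2   # fully qualified host name, e.g. the output of "hostname"

    domain=
    if [ -r "$conf" ]; then
        # Take the first uncommented Domain line and strip whitespace.
        domain=$(sed -n 's/^[[:space:]]*Domain[[:space:]]*=[[:space:]]*\([^[:space:]]*\).*$/\1/p' "$conf" | head -n 1)
    fi

    if [ -z "$domain" ]; then
        case $fqdn in
            *.*) domain=${fqdn#*.} ;;   # strip the host label
            *)   domain= ;;             # no dot: nothing to derive
        esac
    fi

    printf '%s\n' "$domain"
}
```

From a dracut shell this could be called as guess_nfs4_domain /etc/idmapd.conf "$(hostname)". An empty result would suggest that neither the config file nor the host name provides a domain.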

Comment 1 Andrew McNabb 2009-11-17 17:42:51 UTC
It appears that building a debug initramfs (with "dracut -a debug /boot/initramfs-debug-2.6.31.5-127.fc12.x86_64.img 2.6.31.5-127.fc12.x86_64") gets rid of the 'rpc.idmapd: Could not find user "nobody"' error.  Perhaps this is because the debug image contains libnss, so there aren't any DNS errors.  If this is correct, I think it makes sense to include the libnss libraries in the dracut-network package.

It also seems that rpc.idmapd is stopping somewhere after mounting, because the system is getting permission errors on boot (when rpc.idmapd isn't running, files are owned by nfsnobody instead of root).  I'll keep on looking into this part of the problem.

Comment 2 Andrew McNabb 2009-11-17 17:58:49 UTC
Hmm.  When I just rebuilt my initramfs, it copied over the system's idmapd.conf, which helps a lot for setting the NFSv4 domain.  The earlier initramfs did not contain the system's idmapd.conf.

Also, I found why rpc.idmapd is being killed.  This happens in pre-pivot/70nfsroot-cleanup.sh.  Unfortunately, rpc.idmapd really can't be killed this early because it breaks all of the init scripts, many of which don't work if all files are owned by nfsnobody instead of root.

Comment 3 Andrew McNabb 2009-11-24 23:50:01 UTC
From David Dillow:
> It sounds like you are on the right track -- though I suspect F12 will
> need to be changed to support NFS root properly. We discussed trying to
> get rpc.idmapd to survive the pivot, but the final decision was that the
> distro should start it earlier in the boot process.
> 
> One way to work around the problem would be to patch dracut to try $INIT
> before doing its search if a module has already set it. Then a module
> could set INIT=/linuxrc and you could start rpc.idmapd there before
> exec'ing the real init.
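
The /linuxrc wrapper described above could look roughly like the sketch below. It is only an illustration: the function name start_idmapd_then_init, the IDMAPD and REAL_INIT variables, and the default path /sbin/init are assumptions, and dracut would still need the $INIT change mentioned above before such a wrapper would be used.

```shell
#!/bin/sh
# Hypothetical /linuxrc body: start rpc.idmapd, then hand over to the real init.
# IDMAPD and REAL_INIT are illustrative variables, not dracut settings.
: "${IDMAPD:=rpc.idmapd}"
: "${REAL_INIT:=/sbin/init}"

start_idmapd_then_init() {
    # Start the ID mapper first so files on the NFSv4 root map to real owners.
    # rpc.idmapd daemonizes by default, so this returns once it has forked.
    "$IDMAPD" || echo "warning: could not start $IDMAPD" >&2

    # Replace this process with the real init, keeping the same PID.
    exec "$REAL_INIT" "$@"
}

# In an actual /linuxrc the last line would be:
#   start_idmapd_then_init "$@"
```

Using exec matters here: the real init has to run as PID 1, so the wrapper must replace itself rather than spawn a child.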

Comment 4 Harald Hoyer 2009-11-26 09:29:13 UTC
*** Bug 537217 has been marked as a duplicate of this bug. ***

Comment 5 Andrew McNabb 2009-11-27 01:28:34 UTC
Since bug #537217 has been marked as a duplicate, the following issue needs to be added to this:

The documentation at https://fedoraproject.org/wiki/Dracut and
 https://fedoraproject.org/wiki/Dracut/Options#NFS needs to be updated to add information about the "dracut-network" package.  This documentation also needs to be updated to mention that the idmapd.conf file gets copied into the initramfs image.

Unfortunately, the rest of this bug report is specifically about NFSv4.  Since this particular issue applies to NFS in general (including NFSv3), I hope this isn't too confusing.

Comment 6 Harald Hoyer 2009-11-27 09:26:54 UTC
(In reply to comment #5)
> Since bug #537217 has been marked as a duplicate, the following issue needs to
> be added to this:
> 
> The documentation at https://fedoraproject.org/wiki/Dracut and
>  https://fedoraproject.org/wiki/Dracut/Options#NFS needs to be updated to add
> information about the "dracut-network" package.  This documentation also needs
> to be updated to mention that the idmapd.conf file gets copied into the
> initramfs image.

done

Comment 7 Fedora Update System 2009-11-27 15:12:25 UTC
dracut-003-1.fc12 has been submitted as an update for Fedora 12.
http://admin.fedoraproject.org/updates/dracut-003-1.fc12

Comment 8 Fedora Update System 2009-12-01 04:39:48 UTC
dracut-003-1.fc12 has been pushed to the Fedora 12 testing repository.  If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with su -c 'yum --enablerepo=updates-testing update dracut'.  You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F12/FEDORA-2009-12432

Comment 9 Andrew McNabb 2009-12-01 19:34:15 UTC
The new documentation looks great.  Thanks!

Comment 10 Andrew McNabb 2009-12-01 19:44:00 UTC
I noticed the following comment in /usr/share/dracut/modules.d/95nfs/nfsroot:

# XXX really needed? Do we need non-root users before we start it in
# XXX the real root image?
[ -z "$(pidof rpc.idmapd)" ] && rpc.idmapd

I've spent a decent amount of time over the last few weeks trying to track down the problem with NFSv4 not booting.  It's much more difficult than I originally imagined.  With NFS as a read-only root, if rpc.idmapd isn't running, then _everything_ is owned by nfsnobody.  This means that basically _nothing_ works.  For example, /bin/mount has the suid-bit set, but it's owned by nfsnobody; mount notices that it's being run with the effective uid of nfsnobody, so it reports: "mount: only root can do that" and refuses to mount anything.

I tried making rpc.idmapd start just after pivoting, but that doesn't work because rpc.idmapd needs to write to some files in /var, but nothing in /var is writable until rwtab is processed, which mounts tmpfs filesystems.  This is a chicken-and-egg situation: rpc.idmapd can't start unless mount works, and mount doesn't work unless rpc.idmapd is running.
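
The ownership symptom described in this comment can be checked from a shell by looking at the numeric owner of a file that should belong to root. The sketch below assumes an unmapped owner appears as UID 65534 or 4294967294 (that is, -2 as an unsigned 32-bit value); the exact value depends on the distribution and architecture. The function names is_unmapped_uid and report_owner are hypothetical, and stat may not be present in every initramfs.

```shell
#!/bin/sh
# Hypothetical check: does this numeric UID look like an unmapped NFSv4 owner?
# Assumption: unmapped owners show up as 65534 or 4294967294 ((uid_t)-2).
is_unmapped_uid() {
    case $1 in
        65534|4294967294) return 0 ;;
        *)                return 1 ;;
    esac
}

# Report whether a file's owner looks unmapped (stat -c is GNU coreutils syntax).
report_owner() {
    uid=$(stat -c %u "$1") || return 2
    if is_unmapped_uid "$uid"; then
        echo "$1: owner UID $uid looks unmapped; is rpc.idmapd running?"
    else
        echo "$1: owner UID $uid"
    fi
}
```

For example, report_owner /bin/mount run just after the pivot would show whether the suid binary is still seen as owned by the unmapped user.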

Comment 11 Fedora Update System 2010-01-26 10:47:44 UTC
dracut-004-4.fc12 has been submitted as an update for Fedora 12.
http://admin.fedoraproject.org/updates/dracut-004-4.fc12

Comment 12 Fedora Update System 2010-01-27 01:05:18 UTC
dracut-004-4.fc12 has been pushed to the Fedora 12 testing repository.  If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with su -c 'yum --enablerepo=updates-testing update dracut'.  You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F12/FEDORA-2010-1088

Comment 13 Fedora Update System 2010-01-28 00:50:34 UTC
dracut-004-4.fc12 has been pushed to the Fedora 12 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 14 Harald Hoyer 2010-02-10 16:27:36 UTC
Created attachment 390042 [details]
additional patch needed for dracut-004

Comment 15 Eray Ozkural 2011-08-18 14:32:57 UTC
This bug most certainly persists in dracut-013. 

When I try to start rpc.idmapd manually in dracut debug shell, it dies with the same error message. I strace'd and saw that it borked on nsswitch access. It can't find the nss lib.

Comment 16 Eray Ozkural 2011-08-18 22:31:41 UTC
OK, scratch the previous comment: it applies to systems that are not Red Hat-based, because the code that copies the NSS libraries doesn't work on every distro. That part (and some others, AFAICT) could be made more distro-agnostic. I solved it on my system by installing all of `find /lib -iname 'libnss*'` and also copying the full passwd and group files (because the part that copies only some users doesn't work on Debian-based distros).

I'm trying to get a diskless installation tool to work on my system, that's why :) I liked dracut a lot, by the way; it's easily customizable. If only it had been written in Python instead of sh, it would be perfect!

Cheers,

Eray
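
The workaround in this comment can be sketched as a small copy step. This is an illustration only: the function name copy_nss_files and its arguments are hypothetical, it is not how dracut's own modules install files, and it only looks under lib (a 64-bit system may keep these libraries in lib64 instead).

```shell
#!/bin/sh
# Hypothetical helper: copy every libnss* library plus passwd and group from
# a source root into an initramfs staging directory, preserving paths.
copy_nss_files() {
    src=$1    # source root, e.g. "/"
    dest=$2   # staging directory for the initramfs contents

    # Libraries: same match as `find /lib -iname 'libnss*'` in the comment.
    find "$src/lib" -iname 'libnss*' \( -type f -o -type l \) |
    while IFS= read -r lib; do
        rel=${lib#"$src"/}
        mkdir -p "$dest/$(dirname "$rel")"
        cp -a "$lib" "$dest/$rel"
    done

    # Full passwd and group files, so that "nobody" can be resolved.
    mkdir -p "$dest/etc"
    cp "$src/etc/passwd" "$src/etc/group" "$dest/etc/"
}
```

Copying the complete passwd and group files is the blunt option; it avoids depending on any per-distro logic for selecting individual users.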

Comment 17 Dennis Schridde 2012-07-21 17:02:27 UTC
I do not see this issue fixed. I have /lib64/libnss*.so, an /etc/passwd that defines the nobody user, and an /etc/idmapd.conf that sets the correct Domain, all in my initramfs, and I still get:

During kernel/dracut startup:
rpc.idmapd: Could not find user "nobody"

When Gentoo's OpenRC tries to mount something (probably the rootfs):
mount: only root can do that (effective UID is 4294967294)

Later I get a lot of these, for varying files:
mkdir `/lib64/rc/init.d/starting': Read-only file system

Comment 18 Harald Hoyer 2012-07-24 09:44:05 UTC
(In reply to comment #17)
> I do not see this issue fixed. I have /lib64/libnss*.so, an /etc/passwd that
> defines the nobody user, and an /etc/idmapd.conf that sets the correct Domain,
> all in my initramfs, and I still get:
> 
> During kernel/dracut startup:
> rpc.idmapd: Could not find user "nobody"
> 
> When Gentoo's OpenRC tries to mount something (probably the rootfs):
> mount: only root can do that (effective UID is 4294967294)
> 
> Later I get a lot of these, for varying files:
> mkdir `/lib64/rc/init.d/starting': Read-only file system

Then Gentoo's OpenRC has to start rpc.idmapd early, before trying to remount things.

Comment 19 Dennis Schridde 2012-07-26 14:04:46 UTC
(In reply to comment #10)
> I tried making rpc.idmapd start just after pivoting, but that doesn't work
> because rpc.idmapd needs to write to some files in /var, but nothing in /var
> is writable until rwtab is processed, which mounts tmpfs filesystems.  This
> is a chicken-and-egg situation: rpc.idmapd can't start unless mount works,
> and mount doesn't work unless rpc.idmapd is running.

Has this issue been fixed yet?

Otherwise I do not see how to fix this:

(In reply to comment #18)
> Then Gentoo's OpenRC has to start rpc.idmapd early, before trying to remount
> things.

And as the init system itself needs to keep track of things, starting idmapd probably needs to be the very first thing, even before it sets up its own directories?

Comment 20 Harald Hoyer 2012-07-26 15:50:12 UTC
Maybe rpc.idmapd should be patched to make use of /run instead of /var.

