Bug 692702
Summary: | rpc.idmapd sometimes fails to start due to sunrpc | ||||||
---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise Linux 6 | Reporter: | Andrew Schultz <ajschult784> | ||||
Component: | nfs-utils | Assignee: | Steve Dickson <steved> | ||||
Status: | CLOSED ERRATA | QA Contact: | yanfu,wang <yanwang> | ||||
Severity: | unspecified | Docs Contact: | |||||
Priority: | unspecified | ||||||
Version: | 6.0 | CC: | dcbw, syeghiay | ||||
Target Milestone: | rc | Keywords: | OtherQA | ||||
Target Release: | --- | ||||||
Hardware: | x86_64 | ||||||
OS: | Linux | ||||||
Whiteboard: | |||||||
Fixed In Version: | nfs-utils-1.2.3-8.el6 | Doc Type: | Bug Fix | ||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | Environment: | ||||||
Last Closed: | 2011-12-06 18:53:33 UTC | Type: | --- | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Attachments: |
|
Description
Andrew Schultz
2011-03-31 21:17:13 UTC
(In reply to comment #0) > Description of problem: > the rpcidmapd script sometimes fails to start rpc.idmapd because > /var/lib/nfs/rpc_pipefs is not mounted, even though it explicitly loads sunrpc. > The reason is that sunrpc is not finished loading even though modprobe returns > control to the script. I just added additional bits to sleep and recheck for > rpc_pipefs when it can't find rpc_pipefs the first time. This allows the > script to successfully load rpc.idmapd. > > Version-Release number of selected component (if applicable): > nfs-utils-1.2.2-7.el6 > > Steps to Reproduce: > 1. configure a box as an NFS client (and not a server) > 2. boot > > Actual results: > 1. see messages in boot.log about missing RPC MTAB Could you please post what these messages are? boot.log would contain: Starting RPC idmapd: FATAL: Module sunrpc already in kernel. FATAL: Error running install command for sunrpc Error: RPC MTAB does not exist. the more explicit complaints about sunrpc sound worse, but don't cause real problems. Further investigation reveals that something earlier during bootup triggers loading sunrpc (actually... sunrpc is in initramfs). It apparently just takes its time finishing the loading process, especially on the first bootup. Hey, (In reply to comment #3) > boot.log would contain: > > Starting RPC idmapd: FATAL: Module sunrpc already in kernel. > FATAL: Error running install command for sunrpc > Error: RPC MTAB does not exist. The fact the script does not find /var/lib/nfs/rpc_pipefs in /proc/mounts and then the 'modprobe sunrpc' fails because the module already exists basically means whomever is loading sunrpc is not doing it correctly. Does 'grep sunrpc /etc/modprobe.d/dist.conf' show two rules on sunrpc? andrew@george#grep sunrpc dist.conf install sunrpc /sbin/modprobe --first-time --ignore-install sunrpc && { /bin/mount -t rpc_pipefs sunrpc /var/lib/nfs/rpc_pipefs > /dev/null 2>&1 || :; } remove sunrpc { /bin/umount /var/lib/nfs/rpc_pipefs > /dev/null 2>&1 || :; } ; /sbin/modprobe -r --ignore-remove sunrpc alias rpc_pipefs sunrpc alias rpc_svc_gss_pipefs sunrpc Since RHEL 6.1 External Beta has begun, and this bug remains unresolved, it has been rejected as it is not proposed as exception or blocker. Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux. I'm unable to reproduce this locally... so when this happens can you verify that 1) sunrpc is loaded (lsmod | grep sunrpc) 2) /var/lib/nfs/rpc_pipefs is not mounted (grep sunrpc /proc/mounts) I would add those to commands to the nfs initscript. It's only a problem for a moment (which is why "sleep 1" fixes it). The various conditions for this to happen really only happen during first (or second) bootup, but I can induce them synthetically. 1. sunrpc will be loaded on its own during bootup (before rpcidmapd). At the top of the NetworkManager startup script, I added a script that checks for sunrpc and "sleep 1" until it is loaded. It only needs to sleep once. 2. There is a finite amount of time after sunrpc starts to load before it shows up in lsmod. trying to "modprobe sunrpc" during this time results in the first two error messages in comment 3. 3. There is a finite amount of time after sunrpc shows up in lsmod that rpc_pipefs will show up in /proc/mounts. This is what causes rpcidmapd to fail. The amounts of time for #2 and #3 are longer when the system doing more disk access (apparently during first boot). I'll attach a script that I used to test this. Created attachment 490288 [details]
script to demonstrate conditions #2 and #3
I ran this after booting into single user mode and starting a few of the services that would be loaded before rpcidmapd (network, portreserve, rpcbind).
The output would look like:
RPC: Registered udp transport module.
RPC: Registered tcp transport module.
RPC: Registered tcp NFSv4.1 backchannel transport module.
0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2
[messages about unregistering]
"0" indicates sunrpc doesn't show up in lsmod and "1" indicates that it does, but that rpc_pipefs doesn't show up in /proc/mounts.
(In reply to comment #8) > It's only a problem for a moment (which is why "sleep 1" fixes it). The > various conditions for this to happen really only happen during first (or > second) bootup, but I can induce them synthetically. > > 1. sunrpc will be loaded on its own during bootup (before rpcidmapd). Who and where is it getting loaded? Because when its loaded the rpc_pipefs should also happen... immediately after the insmod. (In reply to comment #10) > > 1. sunrpc will be loaded on its own during bootup (before rpcidmapd). > Who and where is it getting loaded? Because when its loaded the > rpc_pipefs should also happen... immediately after the insmod. ... for some definition of "immediately". See #2 and #3 from comment 8. It gets loaded during the NetworkManager script. I sprinkled checks within the script and the daemon itself (/usr/sbin/NetworkManager) seems to be what triggers sunrpc. How/why/when/where that happens... I don't know. I'm not going to trace through the NetworkManager source code to track it down. Seems like a bit of a silly exercise anyway. What's important here is that the rpcidmapd makes invalid assumptions behavior about module loading. (In reply to comment #11) > (In reply to comment #10) > > > 1. sunrpc will be loaded on its own during bootup (before rpcidmapd). > > Who and where is it getting loaded? Because when its loaded the > > rpc_pipefs should also happen... immediately after the insmod. > > ... for some definition of "immediately". See #2 and #3 from comment 8. > > It gets loaded during the NetworkManager script. I sprinkled checks within the > script and the daemon itself (/usr/sbin/NetworkManager) seems to be what > triggers sunrpc. Cool... thanks for this effort.. > How/why/when/where that happens... I don't know. I'm not > going to trace through the NetworkManager source code to track it down. . Of course not... Its just I'm not seeing this problem in my world so I'm just trying to figure what is going on... > > What's important here is that the rpcidmapd makes invalid assumptions behavior > about module loading. Not really since the /var/lib/nfs/rpc_pipefs is user to kernel interface the daemons, like rpc.idmapd, use. So it wrong to load the sunrpc module without enabling its interface (i.e. mounting /var/lib/nfs/rpc_pipefs) BTW, what version of NetworkManager are you running? Again, thank you for your time... > So it wrong to load the sunrpc module without enabling its interface (i.e. > mounting /var/lib/nfs/rpc_pipefs) /var/lib/nfs/rpc_pipefs is mounted even when sunrpc loading is triggered by NetworkManager. It's just not immediate. > BTW, what version of NetworkManager are you running? 0.8.1-5.el6_0.1.x86_64 I try to test using comment #9: # cat bangfile #!/bin/bash dd if=/dev/zero of=test bs=1M count=7000 # cat test.sh #!/bin/sh # creates a 7GB file and reads it in a grossly inefficient way ./bangfile & sleep 5 modprobe sunrpc & while [ 1 ]; do lsmod | grep sunrpc > /dev/null rv=$? if [ $rv -eq 0 ]; then grep pipefs /proc/mounts > /dev/null rv=$? if [ $rv -eq 0 ]; then echo 2 break else echo -n "1 " fi else echo -n "0 " fi done umount /var/lib/nfs/rpc_pipefs rmmod sunrpc killall bangfile Then boot into single user mode and run test.sh script, and got below test result against RHEL6.1 and RHEL6.2 package. reproduced on RHEL6.1: [root@hp-dl385g7-01 /]# ./test.sh 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 RPC: Registered udp transport module. RPC: Registered tcp transport module. 1 RPC: Registered tcp NFSv4.1 backchannel transport module. 1 1 1 1 1 1 1 1 1 1 1 2 RPC: Unregistered udp transport module. RPC: Unregistered tcp transport module. RPC: Unregistered tcp NFSv4.1 backchannel transport module. ./test.sh: line 29: 1679 Terminated ./bangfile ^C verified on rhel6.2: [root@tyan-gt24-05 /]# ./test.sh 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 RPC: Registered udp transport module. RPC: Registered tcp transport module. RPC: Registered tcp NFSv4.1 backchannel transport module. 1 1 1 1 1 2 RPC: Unregistered udp transport module. RPC: Unregistered tcp transport module. RPC: Unregistered tcp NFSv4.1 backchannel transport module. ./tesh.sh: line 28: 1213 Terminated ./bangfile So I'm not sure what's the meaning of the script shown to me. I check rpcidmapd.init of nfs-utils-1.2.3-8.el6 src rpm and could found below specification: # Make sure the rpc_pipefs filesystem is available /bin/mount -t rpc_pipefs sunrpc /var/lib/nfs/rpc_pipefs > /dev/null 2>&1 Verified sunrpc loading is ok on RHEL6.2 alpha distro: # lsmod | grep sunrpc sunrpc 197959 31 nfsd,nfs,lockd,nfs_acl,rpcsec_gss_krb5,auth_rpcgss # grep sunrpc /proc/mounts sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs rw,relatime 0 0 So is the test result is ok for you? > So I'm not sure what's the meaning of the script shown to me. The script simply demonstrates that the condition described in comment 0 is possible. Fixing the bug won't prevent the condition from occurring or alter the attached script's behavior, but would simply change the init script to handle the condition. In terms of reproducing the bug, I only hit the bug on fresh installs. Conceivably, the bug could be reproduced with nfs-utils-1.2.3-7 by replacing the while loop in the attached script with /etc/init.d/rpcidmapd start but I think my own attempts to encounter the bug that way were unsuccessful. (In reply to comment #17) > > So I'm not sure what's the meaning of the script shown to me. > > The script simply demonstrates that the condition described in comment 0 is > possible. Fixing the bug won't prevent the condition from occurring or alter > the attached script's behavior, but would simply change the init script to > handle the condition. > > In terms of reproducing the bug, I only hit the bug on fresh installs. > Conceivably, the bug could be reproduced with nfs-utils-1.2.3-7 by replacing > the while loop in the attached script with > > /etc/init.d/rpcidmapd start > > but I think my own attempts to encounter the bug that way were unsuccessful. I can't reproduce it also, so do code review and the below result is ok, if any problem pls let me know. Checking rpcidmapd.init of nfs-utils-1.2.3-8.el6 src rpm and could found below specification: # Make sure the rpc_pipefs filesystem is available /bin/mount -t rpc_pipefs sunrpc /var/lib/nfs/rpc_pipefs > /dev/null 2>&1 Verified sunrpc loading is ok after install RHEL6.2 alpha distro: # lsmod | grep sunrpc sunrpc 197959 31 nfsd,nfs,lockd,nfs_acl,rpcsec_gss_krb5,auth_rpcgss # grep sunrpc /proc/mounts sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs rw,relatime 0 0 Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHSA-2011-1534.html |