Description of problem:
Possible race condition. When running service nfs restart then running exportfs immediately following (in a script for example) I'm noticing inconsistencies in what is actually exported and what is reported by showmount vs. exportfs. This may be a kernel issue rather than an nfs-utils problem. RHEL 5 is also affected and I can clone the bug to RHEL 5 if anyone thinks this warrants further investigation.

Version-Release number of selected component (if applicable):
RHEL 4.5 - 0415 tree

How reproducible:
Frequently

Steps to Reproduce:
1. run the test case below:

/etc/init.d/nfs restart && usleep 600000 && exportfs -ua && exportfs -iv -o no_root_squash,rw 127.0.0.1:/foo && exportfs -v && showmount -e 127.0.0.1

Actual results:
Sleeping for about .6 seconds reproduces the issue about 30% of the time on my test system. Lowering the usleep time below 500000 (.5 seconds) reproduces the problem almost every time as does removing the usleep command entirely. I'm not 100% certain but this may have something to do with the size of the -o option as just specifying rw for example typically works as expected every time.

Failure output:
[root@test177 bz218777]# /etc/init.d/nfs restart && usleep 600000 && exportfs -ua && exportfs -iv -o no_root_squash,rw 127.0.0.1:/foo && exportfs -v && showmount -e 127.0.0.1
Shutting down NFS mountd: [ OK ]
Shutting down NFS daemon: [ OK ]
Shutting down NFS quotas: [ OK ]
Shutting down NFS services: [ OK ]
Starting NFS services: [ OK ]
Starting NFS quotas: [ OK ]
Starting NFS daemon: [ OK ]
Starting NFS mountd: [ OK ]
exporting localhost.localdomain:/foo
/foo localhost.localdomain(rw,wdelay,no_root_squash)
Export list for 127.0.0.1:
/tmp *.test.redhat.com

log entry from mountd:
Apr 17 15:38:41 test177 mountd[7493]: export request from 127.0.0.1 failed.
Expected results:
[root@test177 bz218777]# /etc/init.d/nfs restart && usleep 600000 && exportfs -ua && exportfs -iv -o no_root_squash,rw 127.0.0.1:/foo && exportfs -v && showmount -e 127.0.0.1
Shutting down NFS mountd: [ OK ]
Shutting down NFS daemon: [ OK ]
Shutting down NFS quotas: [ OK ]
Shutting down NFS services: [ OK ]
Starting NFS services: [ OK ]
Starting NFS quotas: [ OK ]
Starting NFS daemon: [ OK ]
Starting NFS mountd: [ OK ]
exporting localhost.localdomain:/foo
/foo localhost.localdomain(rw,wdelay,no_root_squash)
Export list for 127.0.0.1:
/foo localhost.localdomain

Additional info:
Can you post what the contents of /etc/exports were during this reproducer? I'm guessing that's where this came from:

Export list for 127.0.0.1:
/tmp *.test.redhat.com

...but I'd like confirmation.
Mike clarified that that line comes from /etc/exports... I'm also able to reproduce this on a F7-test install, though the window seems to be slightly tighter. I'll probably focus on this there first...
This, however, does not reproduce the issue. So it seems like it's not simply a race between different exportfs invocations:

# exportfs -ua && sleep 1 && exportfs -r && exportfs -ua && exportfs -iv -o no_root_squash,rw 127.0.0.1:/foo && exportfs -v && showmount -e 127.0.0.1

...with that, I consistently get:

exporting 127.0.0.1:/foo
/foo 127.0.0.1(rw,wdelay,no_root_squash,no_subtree_check)
Export list for 127.0.0.1:
/foo 127.0.0.1

...the interesting bit is that I notice that the exportfs in the startup script is done long before mountd actually starts. So this may be a race somehow between exportfs and mountd's startup.
I can, however, reproduce it like this:

# exportfs -ua ; pkill rpc.mountd ; sleep 1 ; exportfs -rv && rpc.mountd && exportfs -uva && exportfs -iv -o no_root_squash,rw 127.0.0.1:/foo && exportfs -v && showmount -e 127.0.0.1
exporting *:/export
exporting 127.0.0.1:/foo
/foo 127.0.0.1(rw,wdelay,no_root_squash,no_subtree_check)
Export list for 127.0.0.1:
/export *

...the other important thing to note is that this is persistent. The discrepancy between 'exportfs -v' and 'showmount -e' lasts indefinitely. exportfs -v seems to just read /var/lib/nfs/etab, so that file seems to be getting updated properly. mountd's view of it isn't, however...
I think I understand what's happening, but I'm not sure if this is easily fixable. I believe this is tied up with ext3 filesystem timestamp granularity. I think the issue is:

1) the first "exportfs -r" is run. It modifies /var/lib/nfs/etab and changes its mtime.
2) mountd starts, reads in this file and then timestamps its internal idea of what the cached contents of the file are.
3) the second and third exportfs commands (exportfs -ua and exportfs -iv) are run, but all of this occurs within the same second. The timestamp on the file does not change, and mountd doesn't realize it's been updated.

It's possible to work around this issue by just doing:

# touch /var/lib/nfs/etab

After this, mountd reports the correct export table. I'm thinking this isn't easily fixable without surgery to change how mountd decides whether to update its cache.
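To make the sequence above concrete, here is a minimal Python sketch of the suspected behaviour. It is not mountd's actual code: the EtabCache class, the write_etab helper and the fixed timestamps are all hypothetical, and os.utime is used only to imitate a filesystem with 1-second mtime granularity (and, at the end, the touch workaround).

```python
import os
import tempfile


class EtabCache:
    """Hypothetical stand-in for mountd's cached view of the etab."""

    def __init__(self, path):
        self.path = path
        self.last_mtime = None
        self.contents = None

    def get(self):
        # Only whole seconds are compared, as on a filesystem that
        # stores mtime with 1-second granularity.
        mtime = int(os.stat(self.path).st_mtime)
        if mtime != self.last_mtime:
            with open(self.path) as f:
                self.contents = f.read()
            self.last_mtime = mtime
        return self.contents


def write_etab(path, text, mtime_sec):
    """Rewrite the file and force its mtime to a chosen whole second."""
    with open(path, "w") as f:
        f.write(text)
    os.utime(path, (mtime_sec, mtime_sec))


def demo():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "etab")
        write_etab(path, "/export *\n", 1000)       # step 1: exportfs -r
        cache = EtabCache(path)
        first = cache.get()                         # step 2: mountd reads it
        write_etab(path, "/foo 127.0.0.1\n", 1000)  # step 3: same second
        stale = cache.get()                         # change goes unnoticed
        os.utime(path, (1001, 1001))                # like touch, a second later
        fresh = cache.get()
        return first, stale, fresh
```

In this model demo() returns the old contents twice and the new contents only after the mtime moves to a later second, which matches what showmount -e reports before and after the touch.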
Yeah, I'm thinking this is pretty much a WONTFIX sort of problem (even upstream). To change this seems like it would be pretty invasive, and for minimal benefit. The only time it's ever an issue is when exportfs is racing with a starting mountd. We should, however, probably revisit this again when ext3 (or ext4) introduces microsecond timestamps for files. At that point, fixing this would be feasible.
Sleeping for 1 second at the end of the start() function in the nfs init script, while ugly, would work around the problem just fine for me.
I'm against the idea of an unconditional delay there. Most people don't need it, and it slows down booting. My suggestion would be to just make sure that you're delaying a second before you do the exportfs when setting up your testing.
Since this problem exists in fedora, I'm going to change this BZ to be a fedora BZ. We can clone it later for RHEL if need be.
Created attachment 153430
patch -- make auth_reload take sub-second timestamps into account

This nfs-utils patch should make auth_reload take sub-second timestamps into account. Unfortunately, this has no real effect on any current version of fedora or RHEL, since neither ships a filesystem that supplies timestamp granularity < 1 sec. Still, this might be reasonable for upstream for when those filesystems become more prevalent (or for people who want to use a different fs type for /var/lib/nfs). I'll plan to post this there for consideration.
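As a rough illustration of why sub-second timestamps help only when the filesystem actually stores them, here is a small Python sketch. It is not the attached patch; the function names are hypothetical and the timestamps are plain nanosecond integers of the kind Python exposes as os.stat(path).st_mtime_ns.

```python
NSEC_PER_SEC = 1_000_000_000


def changed_whole_seconds(old_ns, new_ns):
    """Compare only the seconds part of two nanosecond mtimes."""
    return old_ns // NSEC_PER_SEC != new_ns // NSEC_PER_SEC


def changed_subsecond(old_ns, new_ns):
    """Compare the full nanosecond mtimes."""
    return old_ns != new_ns


def truncate_to_second(ns):
    """Model a filesystem that discards the sub-second part."""
    return ns - ns % NSEC_PER_SEC


# Two rewrites of the etab, 0.5 s apart, inside the same second.
t_read = 1000 * NSEC_PER_SEC + 100_000_000
t_rewrite = 1000 * NSEC_PER_SEC + 600_000_000
```

With these values the whole-second check misses the rewrite and the sub-second check catches it. If both timestamps are first passed through truncate_to_second, as a 1-second-granularity filesystem would effectively do, neither check sees a change.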
Created attachment 153433
patch -- respun patch, fix NULL pointer check

This patch is a respun version of the other patch with a fix for the NULL pointer check in auth_reload. I've sent this to the upstream nfs mailing list. Awaiting response at this point. It's probably not worth including this in RHEL, though this should "future proof" nfs-utils for a later fedora/rhel release (with ext4 or some other filesystem). Alternately, if we backport sub-second timestamps to ext3 for RHEL5, we might want to pull this in.
Testing with an xfs filesystem mounted on /var/lib/nfs shows that this patch works when the underlying filesystem has sub-second timestamps.
Created attachment 153538
patch -- make mountd hold open the etab to force an inode number change

Actually there may be some hope for filesystems w/ 1sec granularity yet. There are a couple of patches being bandied around upstream now. This is the latest one I've posted. It simply makes mountd keep an open fd for the etab. Because of the way exportfs works, when the etab is rewritten this will force it to get a new inode number. We can then just compare the inode numbers to see if they've changed. This is pretty simple, but it does depend on exportfs (or other tools) not editing the etab in place...
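The open-fd idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patch: InodeWatcher and rewrite_via_rename are hypothetical names, and the sketch assumes the writer replaces the file by writing a temporary file and renaming it over the old one, on a POSIX system.

```python
import os
import tempfile


def rewrite_via_rename(path, text):
    """Replace the file by writing a temp file and renaming it over path."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        f.write(text)
    os.replace(tmp, path)


class InodeWatcher:
    """Keeps the watched file open so its inode number cannot be reused."""

    def __init__(self, path):
        self.path = path
        self.fd = os.open(path, os.O_RDONLY)
        self.ino = os.fstat(self.fd).st_ino

    def changed(self):
        new_fd = os.open(self.path, os.O_RDONLY)
        new_ino = os.fstat(new_fd).st_ino
        if new_ino == self.ino:
            os.close(new_fd)
            return False
        # Hold the new file open before letting go of the old one.
        os.close(self.fd)
        self.fd, self.ino = new_fd, new_ino
        return True

    def close(self):
        os.close(self.fd)


def demo():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "etab")
        rewrite_via_rename(path, "/export *\n")
        watcher = InodeWatcher(path)
        results = [watcher.changed()]       # nothing rewritten yet
        rewrite_via_rename(path, "/foo 127.0.0.1\n")
        results.append(watcher.changed())   # replaced file, new inode
        with open(path, "a") as f:          # in-place edit keeps the inode
            f.write("/bar *\n")
        results.append(watcher.changed())
        watcher.close()
        return results
```

The last step of demo() shows the dependency mentioned above: an in-place edit leaves the inode number unchanged, so this check alone would not notice it.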
Created attachment 154024
patch -- return static counter value instead of inode number

Since other functions can call into auth_reload and the inode number can be reused, get_exportlist might miss reloading at times. This adds a static counter to both functions to make sure that get_exportlist won't miss reloading.
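A simplified model of the counter idea, in Python rather than C, may help show why it closes the gap. This is a sketch of the concept only and not the patch: the function names are borrowed from the description above, but their signatures, the etab_changed flag and the module-level variables (standing in for C statics) are assumptions made for illustration.

```python
# Module-level state standing in for C static variables (assumption).
_reload_counter = 0    # bumped every time the etab is actually reloaded
_last_seen = 0         # counter value get_exportlist last built from
_export_list = None
_etab = []


def auth_reload(etab_changed, new_etab=None):
    """Reload the etab if it changed; return the reload counter."""
    global _reload_counter, _etab
    if etab_changed:
        _etab = list(new_etab)
        _reload_counter += 1
    return _reload_counter


def get_exportlist(etab_changed=False, new_etab=None):
    """Rebuild the export list whenever the reload counter has moved."""
    global _last_seen, _export_list
    counter = auth_reload(etab_changed, new_etab)
    if _export_list is None or counter != _last_seen:
        _export_list = sorted(_etab)
        _last_seen = counter
    return _export_list
```

In this model, if some other caller triggers auth_reload first and so "uses up" the change, a later get_exportlist() still sees a counter value it has not built from and rebuilds its list, regardless of what inode number the file ended up with.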
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
Fixed in nfs-utils-1.0.9-18.el5
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2007-0651.html