236823 – exportfs gives inconsistent results when run immediately after nfs service is restarted

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 5
Classification:	Red Hat
Component:	nfs-utils
Sub Component:
Version:	5.0
Hardware:	All
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Target Release:	---
Assignee:	Steve Dickson
QA Contact:	Ben Levenson
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	239795

Reported:	2007-04-17 20:15 UTC by Mike Gahagan
Modified:	2007-11-30 22:07 UTC (History)
CC List:	1 user (show)
Fixed In Version:	RHBA-2007-0651
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2007-11-07 17:15:25 UTC
Target Upstream Version:
Embargoed:
Dependent Products:

Attachments
patch -- make auth_reload take sub-second timestamps into account (3.88 KB, patch) 2007-04-25 17:20 UTC, Jeff Layton	no flags
patch -- respun patch, fix NULL pointer check (3.89 KB, patch) 2007-04-25 18:13 UTC, Jeff Layton	no flags
patch -- make mountd hold open the etab to force a inode number change (2.87 KB, patch) 2007-04-26 18:35 UTC, Jeff Layton	no flags
patch -- return static counter value instead of inode number (2.94 KB, patch) 2007-05-03 12:38 UTC, Jeff Layton	no flags

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2007:0651	0	normal	SHIPPED_LIVE	nfs-utils bug fix and enhancement update	2007-10-30 16:19:49 UTC

Description Mike Gahagan 2007-04-17 20:15:34 UTC

Description of problem:

Possible race condition.

When running service nfs restart then running exportfs immediately following (in
a script for example) I'm noticing inconsistencies in what is actually exported
and what is reported by showmount vs. exportfs. This may be a kernel issue
rather than an nfs-utils problem. RHEL 5 is also affected and I can clone the
bug to RHEL 5 if anyone thinks this warrants further investigation.

Version-Release number of selected component (if applicable):
RHEL 4.5 - 0415 tree

How reproducible:
Frequently

Steps to Reproduce:
1. Run the test case below:

/etc/init.d/nfs restart && usleep 600000 && exportfs -ua && exportfs -iv -o
no_root_squash,rw 127.0.0.1:/foo && exportfs -v && showmount -e 127.0.0.1

  
Actual results:

Sleeping for about .6 seconds reproduces the issue about 30% of the time on my
test system. Lowering the usleep time below 500000 (.5 seconds) reproduces the
problem almost every time as does removing the usleep command entirely. I'm not
100% certain but this may have something to do with the size of the -o option as
just specifying rw for example typically works as expected every time.

  Failure output:

 [root@test177 bz218777]# /etc/init.d/nfs restart && usleep 600000 && exportfs
-ua && exportfs -iv -o no_root_squash,rw 127.0.0.1:/foo && exportfs -v &&
showmount -e 127.0.0.1
Shutting down NFS mountd:                                  [  OK  ]
Shutting down NFS daemon:                                  [  OK  ]
Shutting down NFS quotas:                                  [  OK  ]
Shutting down NFS services:                                [  OK  ]
Starting NFS services:                                     [  OK  ]
Starting NFS quotas:                                       [  OK  ]
Starting NFS daemon:                                       [  OK  ]
Starting NFS mountd:                                       [  OK  ]
exporting localhost.localdomain:/foo
/foo            localhost.localdomain(rw,wdelay,no_root_squash)
Export list for 127.0.0.1:
/tmp *.test.redhat.com

log entry from mountd:
Apr 17 15:38:41 test177 mountd[7493]: export request from 127.0.0.1 failed.

Expected results:

[root@test177 bz218777]# /etc/init.d/nfs restart && usleep 600000 && exportfs
-ua && exportfs -iv -o no_root_squash,rw 127.0.0.1:/foo && exportfs -v &&
showmount -e 127.0.0.1
Shutting down NFS mountd:                                  [  OK  ]
Shutting down NFS daemon:                                  [  OK  ]
Shutting down NFS quotas:                                  [  OK  ]
Shutting down NFS services:                                [  OK  ]
Starting NFS services:                                     [  OK  ]
Starting NFS quotas:                                       [  OK  ]
Starting NFS daemon:                                       [  OK  ]
Starting NFS mountd:                                       [  OK  ]
exporting localhost.localdomain:/foo
/foo            localhost.localdomain(rw,wdelay,no_root_squash)
Export list for 127.0.0.1:
/foo localhost.localdomain


Additional info:

Comment 1 Jeff Layton 2007-04-24 14:34:09 UTC

Can you post what the contents of /etc/exports were during this reproducer? I'm
guessing that's where this came from:

Export list for 127.0.0.1:
/tmp *.test.redhat.com

...but I'd like confirmation.

Comment 2 Jeff Layton 2007-04-24 14:49:25 UTC

Mike clarified that that line comes from /etc/exports...

I'm also able to reproduce this on a F7-test install, though the window seems to
be slightly tighter. I'll probably focus on this there first...

Comment 3 Jeff Layton 2007-04-24 15:07:30 UTC

This, however, does not reproduce the issue. So it seems like it's not simply a
race between different exportfs invocations:

# exportfs -ua && sleep 1 && exportfs -r && exportfs -ua && exportfs -iv -o
no_root_squash,rw 127.0.0.1:/foo && exportfs -v && showmount -e 127.0.0.1

...with that, I consistently get:

exporting 127.0.0.1:/foo
/foo            127.0.0.1(rw,wdelay,no_root_squash,no_subtree_check)
Export list for 127.0.0.1:
/foo 127.0.0.1

...the interesting bit is that I notice that the exportfs in the startup script
is done long before mountd actually starts. So this may be a race somehow
between exportfs and mountd's startup.

Comment 4 Jeff Layton 2007-04-24 16:28:40 UTC

I can, however, reproduce it like this:

# exportfs -ua ; pkill rpc.mountd ; sleep 1 ; exportfs -rv && rpc.mountd &&
exportfs -uva && exportfs -iv -o no_root_squash,rw 127.0.0.1:/foo && exportfs -v
&& showmount -e 127.0.0.1
exporting *:/export
exporting 127.0.0.1:/foo
/foo            127.0.0.1(rw,wdelay,no_root_squash,no_subtree_check)
Export list for 127.0.0.1:
/export *

...the other important thing to note is that this is persistent. The discrepancy
between 'exportfs -v' and 'showmount -e' lasts indefinitely.

exportfs -v seems to just read /var/lib/nfs/etab so that seems to be getting
updated properly. mountd isn't, however...

Comment 5 Jeff Layton 2007-04-24 19:37:28 UTC

I think I understand what's happening, but I'm not sure if this is easily
fixable. I believe this is tied up with ext3 filesystem timestamp granularity. I
think the issue is:

1) the first "exportfs -r" is run. It modifies /var/lib/nfs/etab and changes its
mtime.

2) mountd starts, reads in this file and then timestamps its internal idea of
what the cached contents of the file are

3) the second and third exportfs invocations (exportfs -ua and exportfs -iv) are
run, but all of this occurs within the same second. The timestamp on the file
does not change, and mountd doesn't realize it's been updated.

It's possible to work around this issue by just doing:

# touch /var/lib/nfs/etab

After this, mountd reports the correct export table. I'm thinking this isn't
easily fixable without surgery to change how mountd decides whether to update
its cache.
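
A minimal bash sketch of the failure mode described above, not the actual
mountd code. It assumes the cache is refreshed only when the file's
whole-second mtime differs from the value recorded at the last load. The names
reload_if_changed, cache_mtime and cache_body are made up for illustration, and
the mtime is pinned with GNU touch so that the "same second" case is
deterministic:

```shell
# Sketch only: a cache that reloads when the whole-second mtime changes.
# cache_mtime and cache_body are illustrative names, not nfs-utils symbols.
cache_mtime=""
cache_body=""

reload_if_changed() {
    local file=$1 mtime
    mtime=$(stat -c %Y "$file")       # mtime in whole seconds (GNU stat)
    if [ "$mtime" = "$cache_mtime" ]; then
        return 1                      # looks unchanged: keep the cached copy
    fi
    cache_mtime=$mtime
    cache_body=$(cat "$file")
    return 0
}

etab=$(mktemp)
echo "/export *" > "$etab"
touch -d @1177440000 "$etab"          # pin the mtime to a fixed second
reload_if_changed "$etab"             # first load, as at daemon startup

echo "/foo 127.0.0.1" > "$etab"       # rewrite the table...
touch -d @1177440000 "$etab"          # ...within the "same second"
reload_if_changed "$etab" || echo "stale cache: $cache_body"

touch -d @1177440001 "$etab"          # like the touch workaround: mtime moves on
reload_if_changed "$etab" && echo "reloaded: $cache_body"
rm -f "$etab"
```

The second call reports the stale "/export *" entry, and only the later touch
makes the new "/foo 127.0.0.1" entry visible.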

Comment 6 Jeff Layton 2007-04-25 12:39:53 UTC

Yeah, I'm thinking this is pretty much a WONTFIX sort of problem (even
upstream). To change this seems like it would be pretty invasive, and for
minimal benefit. The only time it's ever an issue is when exportfs is racing
with a starting mountd.

We should, however, probably revisit this again when ext3 (or ext4) introduces
microsecond timestamps for files. At that point, fixing this would be feasible.

Comment 7 Mike Gahagan 2007-04-25 14:37:00 UTC

Sleeping for 1 second at the end of the start() function in the nfs init script,
while ugly, would work around the problem just fine for me.

Comment 8 Jeff Layton 2007-04-25 14:45:40 UTC

I'm against the idea of an unconditional delay there. Most people don't need it,
and that slows down booting. My suggestion would be to just make sure that
you're delaying a second before you do the exportfs when setting up your testing.

Comment 9 Jeff Layton 2007-04-25 17:15:47 UTC

Since this problem exists in fedora, I'm going to change this BZ to be a fedora
BZ. We can clone it later for RHEL if need be.

Comment 10 Jeff Layton 2007-04-25 17:20:05 UTC

Created attachment 153430 [details]
patch -- make auth_reload take sub-second timestamps into account

This nfs-utils patch should make auth_reload take sub-second timestamps into
account. Unfortunately, this has no real effect on any current version of
fedora or RHEL, since neither ships a filesystem that supplies timestamp
granularity < 1 sec.

Still, this might be reasonable for upstream for when those filesystems become
more prevalent (or for people who want to use a different fs type for
/var/lib/nfs). I'll plan to post this there for consideration.
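
The idea can be illustrated with a small bash sketch. This is not the patch
itself. It assumes GNU date, stat and touch, and mtime_ns and cached_stamp are
illustrative names. Two updates are placed inside the same second and then
compared once by whole seconds and once with the nanosecond part included:

```shell
# Sketch only: compare the full mtime (seconds plus nanoseconds) instead of
# whole seconds.
mtime_ns() {
    date -r "$1" +%s.%N               # GNU date: file mtime with nanoseconds
}

etab=$(mktemp)
touch -d @1177521605.250 "$etab"      # two updates inside the same second
cached_stamp=$(mtime_ns "$etab")
touch -d @1177521605.750 "$etab"

if [ "$(stat -c %Y "$etab")" = "${cached_stamp%.*}" ]; then
    echo "whole seconds: looks unchanged"
fi
if [ "$(mtime_ns "$etab")" != "$cached_stamp" ]; then
    echo "with nanoseconds: change detected"
fi
rm -f "$etab"
```

On a filesystem that stores sub-second timestamps the second check should
fire. On one with 1-second granularity the fractional part is dropped and both
comparisons see the same value, which is the limitation noted above.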

Comment 11 Jeff Layton 2007-04-25 18:13:24 UTC

Created attachment 153433 [details]
patch -- respun patch, fix NULL pointer check

This patch is a respun version of the other patch with a fix for the NULL
pointer check in auth_reload. I've sent this to the upstream nfs mailing list.
Awaiting response at this point.

It's probably not worth including this in RHEL, though this should "future
proof" nfs-utils for a later fedora/rhel release (with ext4 or some other
filesystem).

Alternately, if we backport sub-second timestamps to ext3 for RHEL5, we might
want to pull this in.

Comment 12 Jeff Layton 2007-04-25 20:09:41 UTC

Testing with a xfs filesystem mounted on /var/lib/nfs shows that this patch
works when the underlying filesystem has sub-second timestamps.

Comment 13 Jeff Layton 2007-04-26 18:35:30 UTC

Created attachment 153538 [details]
patch -- make mountd hold open the etab to force a inode number change

Actually there may be some hope for filesystems w/ 1sec granularity yet. There
are a couple of patches being bandied around upstream now. This is the latest
one I've posted. It simply makes mountd keep an open fd for the etab. Because
of the way exportfs works, when the etab is rewritten this will force it to get
a new inode number. We can then just compare the inode numbers to see if
they've changed.

This is pretty simple, but it does depend on exportfs (or other tools) not
editing the etab in place...
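
A rough bash sketch of the inode comparison, under the assumption stated above
that the writer builds a new file and renames it over the old one. The file
names, the fd number and cached_ino are illustrative only:

```shell
# Sketch only: the reader keeps the table open and compares inode numbers;
# the writer replaces the file via rename instead of editing it in place.
dir=$(mktemp -d)
etab=$dir/etab
echo "/export *" > "$etab"

exec 3< "$etab"                       # reader holds the old file open
cached_ino=$(stat -c %i "$etab")

echo "/foo 127.0.0.1" > "$etab.tmp"   # writer builds a new file...
mv "$etab.tmp" "$etab"                # ...and renames it over the old one

if [ "$(stat -c %i "$etab")" != "$cached_ino" ]; then
    echo "inode changed: reload"
    exec 3< "$etab"                   # release the old file, hold the new one
    cached_ino=$(stat -c %i "$etab")
fi
exec 3<&-
rm -rf "$dir"
```

Holding the descriptor open matters because the old inode cannot be freed, and
so cannot be reused by a later rewrite, while the reader still has it open.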

Comment 14 Jeff Layton 2007-05-03 12:38:20 UTC

Created attachment 154024 [details]
patch -- return static counter value instead of inode number

Since other functions can call into auth_reload and the inode number can be
reused, get_exportlist might miss reloading at times. This adds a static
counter to both functions to make sure that get_exportlist won't miss
reloading.
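
The counter idea might look roughly like this in bash. This is one reading of
the description above, not the patch itself. auth_reload_sketch and
get_exportlist_sketch are stand-ins for the real functions, and the variable
names are invented:

```shell
# Sketch only: the reload function bumps a counter on every real reload, and
# the list builder rebuilds whenever the counter differs from the one it saw.
reload_count=0
cached_ino=""
list_seen=0
export_list=""

auth_reload_sketch() {
    local ino
    ino=$(stat -c %i "$1")
    if [ "$ino" != "$cached_ino" ]; then
        cached_ino=$ino
        reload_count=$((reload_count + 1))
    fi
}

get_exportlist_sketch() {
    auth_reload_sketch "$1"
    if [ "$reload_count" != "$list_seen" ]; then
        export_list=$(cat "$1")       # rebuild the cached export list
        list_seen=$reload_count
    fi
}

dir=$(mktemp -d)
etab=$dir/etab
echo "/export *" > "$etab"
get_exportlist_sketch "$etab"

echo "/foo 127.0.0.1" > "$etab.tmp" && mv "$etab.tmp" "$etab"
auth_reload_sketch "$etab"            # another caller notices the change first
get_exportlist_sketch "$etab"         # inode matches now, but the counter differs
echo "$export_list"
rm -rf "$dir"
```

Even though the intervening call has already consumed the inode change, the
list builder still sees a counter it has not seen before and rebuilds, so the
final line prints the "/foo 127.0.0.1" entry.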

Comment 16 RHEL Program Management 2007-05-11 12:43:52 UTC

This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 19 Steve Dickson 2007-05-12 12:24:44 UTC

Fixed in nfs-utils-1.0.9-18.el5

Comment 23 errata-xmlrpc 2007-11-07 17:15:25 UTC

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2007-0651.html
