Bug 113636
| Field | Value |
|---|---|
| Summary | NFS: bad interaction between rsync and NFS |
| Product | Red Hat Enterprise Linux 3 |
| Component | kernel |
| Version | 3.0 |
| Hardware | All |
| OS | Linux |
| Status | CLOSED ERRATA |
| Severity | medium |
| Priority | medium |
| Reporter | Ben Woodard <woodard> |
| Assignee | Steve Dickson <steved> |
| CC | garlick, petrides, riel, tao |
| Doc Type | Bug Fix |
| Last Closed | 2004-09-02 04:31:02 UTC |
Description
Ben Woodard
2004-01-16 00:29:42 UTC
Here is some ethereal output for the NFS interchange before and after the remove/rsync. 20.1 is the client, 20.198 is the server.

Successful ls -l:

```
74.176978 192.168.20.1 -> 192.168.20.198 NFSACL V3 GETACL Call XID 0x2e2f18ee
74.177017 192.168.20.198 -> 192.168.20.1 NFSACL V3 GETACL Reply XID 0x2e2f18ee
74.177284 192.168.20.1 -> 192.168.20.198 NFSACL V3 GETACL Call XID 0x2f2f18ee
74.177302 192.168.20.198 -> 192.168.20.1 NFSACL V3 GETACL Reply XID 0x2f2f18ee
74.177532 192.168.20.1 -> 192.168.20.198 NFS V3 GETATTR Call XID 0x302f18ee
74.177551 192.168.20.198 -> 192.168.20.1 NFS V3 GETATTR Reply XID 0x302f18ee
74.177782 192.168.20.1 -> 192.168.20.198 NFS V3 GETATTR Call XID 0x312f18ee
74.177799 192.168.20.198 -> 192.168.20.1 NFS V3 GETATTR Reply XID 0x312f18ee
74.178018 192.168.20.1 -> 192.168.20.198 NFSACL V3 GETACL Call XID 0x322f18ee
74.178070 192.168.20.198 -> 192.168.20.1 NFSACL V3 GETACL Reply XID 0x322f18ee
74.178265 192.168.20.1 -> 192.168.20.198 NFSACL V3 GETACL Call XID 0x332f18ee
74.178281 192.168.20.198 -> 192.168.20.1 NFSACL V3 GETACL Reply XID 0x332f18ee
74.178513 192.168.20.1 -> 192.168.20.198 NFS V3 GETATTR Call XID 0x342f18ee
74.178530 192.168.20.198 -> 192.168.20.1 NFS V3 GETATTR Reply XID 0x342f18ee
74.178765 192.168.20.1 -> 192.168.20.198 NFSACL V3 GETACL Call XID 0x352f18ee
74.178781 192.168.20.198 -> 192.168.20.1 NFSACL V3 GETACL Reply XID 0x352f18ee
74.179013 192.168.20.1 -> 192.168.20.198 NFSACL V3 GETACL Call XID 0x362f18ee
74.179071 192.168.20.198 -> 192.168.20.1 NFSACL V3 GETACL Reply XID 0x362f18ee
74.179262 192.168.20.1 -> 192.168.20.198 NFS V3 GETATTR Call XID 0x372f18ee
74.179314 192.168.20.198 -> 192.168.20.1 NFS V3 GETATTR Reply XID 0x372f18ee
74.179511 192.168.20.1 -> 192.168.20.198 NFSACL V3 GETACL Call XID 0x382f18ee
74.179528 192.168.20.198 -> 192.168.20.1 NFSACL V3 GETACL Reply XID 0x382f18ee
74.179765 192.168.20.1 -> 192.168.20.198 NFSACL V3 GETACL Call XID 0x392f18ee
74.179780 192.168.20.198 -> 192.168.20.1 NFSACL V3 GETACL Reply XID 0x392f18ee
```

Stale ls -l:

```
91.032865 192.168.20.1 -> 192.168.20.198 NFS V3 GETATTR Call XID 0x3a2f18ee
91.033120 192.168.20.198 -> 192.168.20.1 NFS V3 GETATTR Reply XID 0x3a2f18ee
91.033341 192.168.20.1 -> 192.168.20.198 NFS V3 GETATTR Call XID 0x3b2f18ee
91.033410 192.168.20.198 -> 192.168.20.1 NFS V3 GETATTR Reply XID 0x3b2f18ee
91.033611 192.168.20.1 -> 192.168.20.198 NFSACL V3 GETACL Call XID 0x3c2f18ee
91.033662 192.168.20.198 -> 192.168.20.1 NFSACL V3 GETACL Reply XID 0x3c2f18ee
91.033855 192.168.20.1 -> 192.168.20.198 NFSACL V3 GETACL Call XID 0x3d2f18ee
91.033877 192.168.20.198 -> 192.168.20.1 NFSACL V3 GETACL Reply XID 0x3d2f18ee
91.034104 192.168.20.1 -> 192.168.20.198 NFS V3 GETATTR Call XID 0x3e2f18ee
91.034125 192.168.20.198 -> 192.168.20.1 NFS V3 GETATTR Reply XID 0x3e2f18ee
91.034358 192.168.20.1 -> 192.168.20.198 NFS V3 GETATTR Call XID 0x3f2f18ee
91.034387 192.168.20.198 -> 192.168.20.1 NFS V3 GETATTR Reply XID 0x3f2f18ee
91.034590 192.168.20.1 -> 192.168.20.198 NFS V3 GETATTR Call XID 0x402f18ee
91.034611 192.168.20.198 -> 192.168.20.1 NFS V3 GETATTR Reply XID 0x402f18ee
91.034845 192.168.20.1 -> 192.168.20.198 NFS V3 GETATTR Call XID 0x412f18ee
91.034865 192.168.20.198 -> 192.168.20.1 NFS V3 GETATTR Reply XID 0x412f18ee
91.035150 192.168.20.1 -> 192.168.20.198 NFS V3 GETATTR Call XID 0x422f18ee
91.035192 192.168.20.198 -> 192.168.20.1 NFS V3 GETATTR Reply XID 0x422f18ee
91.035339 192.168.20.1 -> 192.168.20.198 NFS V3 GETATTR Call XID 0x432f18ee
91.035396 192.168.20.198 -> 192.168.20.1 NFS V3 GETATTR Reply XID 0x432f18ee
91.035594 192.168.20.1 -> 192.168.20.198 NFS V3 GETATTR Call XID 0x442f18ee
91.035614 192.168.20.198 -> 192.168.20.1 NFS V3 GETATTR Reply XID 0x442f18ee
91.035859 192.168.20.1 -> 192.168.20.198 NFS V3 GETATTR Call XID 0x452f18ee
91.035914 192.168.20.198 -> 192.168.20.1 NFS V3 GETATTR Reply XID 0x452f18ee
91.036096 192.168.20.1 -> 192.168.20.198 NFS V3 GETATTR Call XID 0x462f18ee
91.036132 192.168.20.198 -> 192.168.20.1 NFS V3 GETATTR Reply XID 0x462f18ee
91.036347 192.168.20.1 -> 192.168.20.198 NFS V3 GETATTR Call XID 0x472f18ee
91.036369 192.168.20.198 -> 192.168.20.1 NFS V3 GETATTR Reply XID 0x472f18ee
```

I reread the notes they sent me. It doesn't involve netapps anymore.

Ben, could you make the binary files (i.e. ethereal -w /tmp/data) of these traces available?

From: Makia Minich <makia>
To: Jim Garlick <garlick>, bwoodard
Cc: braby1
Subject: Re: update on Bugzilla #113636
Date: Tue, 03 Feb 2004 16:11:00 -0800

Jim / Ben,

OK, here's the result of my testing. First, to get a heads up, I attempted to re-create what was on the bugzilla report with toadi and toad8:

```
# toadi /root > mkdir /tftpboot/makia
# toadi /root > cd /tftpboot/makia
# toadi /tftpboot/makia > rsync -av /var/dist/storage .
building file list ... done
storage/
storage/ftpaccess.ocf
storage/ftpusers.ocf
storage/verifiertable.ocf
wrote 44093 bytes  read 68 bytes  88322.00 bytes/sec
total size is 43854  speedup is 0.99
# toadi /tftpboot/makia >

# toad8 /root > mount etoadi:/tftpboot/makia /mnt
# toad8 /root > ls -l /mnt/storage/
total 52
-r--r--r-- 1 root root   374 Jun  3  2003 ftpaccess.ocf
-r--r--r-- 1 root root   359 Jun  3  2003 ftpusers.ocf
-r--r--r-- 1 root root 43121 Jun  3  2003 verifiertable.ocf
# toad8 /root >

# toadi /tftpboot/makia > rm -rf storage/*
# toadi /tftpboot/makia > rsync -av /var/dist/storage .
building file list ... done
storage/
storage/ftpaccess.ocf
storage/ftpusers.ocf
storage/verifiertable.ocf
wrote 44093 bytes  read 68 bytes  88322.00 bytes/sec
total size is 43854  speedup is 0.99
# toadi /tftpboot/makia >

# toad8 /root > ls -l /mnt/storage/ ; date
ls: /mnt/storage/verifiertable.ocf: Stale NFS file handle
ls: /mnt/storage/ftpaccess.ocf: Stale NFS file handle
ls: /mnt/storage/ftpusers.ocf: Stale NFS file handle
total 0
Tue Feb 3 15:48:08 PST 2004
# toad8 /root > ls -l /mnt/storage/ ; date
ls: /mnt/storage/verifiertable.ocf: Stale NFS file handle
ls: /mnt/storage/ftpaccess.ocf: Stale NFS file handle
ls: /mnt/storage/ftpusers.ocf: Stale NFS file handle
total 0
Tue Feb 3 15:49:10 PST 2004
# toad8 /root >
```

This backs up Jim's claim that it persists indefinitely (at least longer than 60 seconds; I tried many times after that, but to no avail).

For my second test, I used toadi running the 2.4.21-p4smp-27chaos kernel and toad8 running the 2.4.21-9.EL kernel:

```
# toad8 /root > mount etoadi:/tftpboot/makia /mnt
# toad8 /root > ls -l /mnt/storage
total 52
-r--r--r-- 1 root root   374 Jun  3  2003 ftpaccess.ocf
-r--r--r-- 1 root root   359 Jun  3  2003 ftpusers.ocf
-r--r--r-- 1 root root 43121 Jun  3  2003 verifiertable.ocf
# toad8 /root >

# toadi /tftpboot/makia > rm -rf storage/*
# toadi /tftpboot/makia > rsync -av /var/dist/storage .
building file list ... done
storage/
storage/ftpaccess.ocf
storage/ftpusers.ocf
storage/verifiertable.ocf
wrote 44093 bytes  read 68 bytes  88322.00 bytes/sec
total size is 43854  speedup is 0.99
# toadi /tftpboot/makia >

# toad8 /root > ls -l /mnt/storage ; date
ls: /mnt/storage/verifiertable.ocf: Stale NFS file handle
ls: /mnt/storage/ftpaccess.ocf: Stale NFS file handle
ls: /mnt/storage/ftpusers.ocf: Stale NFS file handle
total 0
Tue Feb 3 15:58:25 PST 2004
# toad8 /root > ls -l /mnt/storage ; date
ls: /mnt/storage/verifiertable.ocf: Stale NFS file handle
ls: /mnt/storage/ftpaccess.ocf: Stale NFS file handle
ls: /mnt/storage/ftpusers.ocf: Stale NFS file handle
total 0
Tue Feb 3 16:00:02 PST 2004
# toad8 /root >
```

Looks like the client-side NFS isn't the problem (not that it was in question). So, for the next test, I will use toad8 (client) and toadj (server), both running the 2.4.21-9.EL kernels:

```
# toadj /root > mkdir /tftpboot/makia
# toadj /root > cd /tftpboot/makia/
# toadj /tftpboot/makia > rsync -av /var/dist/storage .
building file list ... done
storage/
storage/ftpaccess.ocf
storage/ftpusers.ocf
storage/verifiertable.ocf
wrote 44093 bytes  read 68 bytes  88322.00 bytes/sec
total size is 43854  speedup is 0.99
# toadj /tftpboot/makia >

# toad8 /root > mount etoadj:/tftpboot/makia /mnt
# toad8 /root > ls -l /mnt/storage
total 52
-r--r--r-- 1 root root   374 Jun  3  2003 ftpaccess.ocf
-r--r--r-- 1 root root   359 Jun  3  2003 ftpusers.ocf
-r--r--r-- 1 root root 43121 Jun  3  2003 verifiertable.ocf
# toad8 /root >

# toadj /tftpboot/makia > rm -rf storage/*
# toadj /tftpboot/makia > rsync -av /var/dist/storage .
building file list ... done
storage/
storage/ftpaccess.ocf
storage/ftpusers.ocf
storage/verifiertable.ocf
wrote 44093 bytes  read 68 bytes  88322.00 bytes/sec
total size is 43854  speedup is 0.99
# toadj /tftpboot/makia >

# toad8 /root > ls -l /mnt/storage ; date
ls: /mnt/storage/verifiertable.ocf: Stale NFS file handle
ls: /mnt/storage/ftpaccess.ocf: Stale NFS file handle
ls: /mnt/storage/ftpusers.ocf: Stale NFS file handle
total 0
Tue Feb 3 16:07:12 PST 2004
# toad8 /root > ls -l /mnt/storage ; date
ls: /mnt/storage/verifiertable.ocf: Stale NFS file handle
ls: /mnt/storage/ftpaccess.ocf: Stale NFS file handle
ls: /mnt/storage/ftpusers.ocf: Stale NFS file handle
total 0
Tue Feb 3 16:08:26 PST 2004
# toad8 /root >
```

Unless someone can see something that I missed, it looks like the problem is still occurring.

From: Makia Minich <makia>
To: Ben Woodard <bwoodard>
Cc: Jim Garlick <garlick>, Ryan Braby <braby1>
Subject: Re: update on Bugzilla #113636
Date: Wed, 04 Feb 2004 11:19:11 -0800

OK, here's the testing with ethereal going. Attached are the ethereal outputs (the first line in each file is the ethereal command that I used):

```
# toadj /root > mkdir /tftpboot/makia
# toadj /root > cd /tftpboot/makia/
# toadj /tftpboot/makia > rsync -av /var/dist/storage .
building file list ... done
storage/
storage/ftpaccess.ocf
storage/ftpusers.ocf
storage/verifiertable.ocf
wrote 44093 bytes  read 68 bytes  88322.00 bytes/sec
total size is 43854  speedup is 0.99
# toadj /tftpboot/makia >

# toad8 /root > mount etoadj:/tftpboot/makia /mnt
# toad8 /root > ls /mnt/storage
ftpaccess.ocf  ftpusers.ocf  verifiertable.ocf
# toad8 /root > ls -l /mnt/storage
total 52
-r--r--r-- 1 root root   374 Jun  3  2003 ftpaccess.ocf
-r--r--r-- 1 root root   359 Jun  3  2003 ftpusers.ocf
-r--r--r-- 1 root root 43121 Jun  3  2003 verifiertable.ocf
# toad8 /root >
```

<SEE SECTIONS TITLED "MOUNT" AND "LS">

```
# toadj /tftpboot/makia > rm -rf storage/*
# toadj /tftpboot/makia > rsync -av /var/dist/storage .
building file list ... done
storage/
storage/ftpaccess.ocf
storage/ftpusers.ocf
storage/verifiertable.ocf
wrote 44093 bytes  read 68 bytes  88322.00 bytes/sec
total size is 43854  speedup is 0.99
# toadj /tftpboot/makia >

# toad8 /root > ls -l /mnt/storage ; date
ls: /mnt/storage/verifiertable.ocf: Stale NFS file handle
ls: /mnt/storage/ftpaccess.ocf: Stale NFS file handle
ls: /mnt/storage/ftpusers.ocf: Stale NFS file handle
total 0
Wed Feb 4 11:13:23 PST 2004
# toad8 /root >
```

<SEE SECTIONS TITLED "LS AFTER RSYNC" AND "LS AGAIN AFTER >60secs">

Let me know if there's anything else you need. As for updating bugzilla reports, I don't know how to do that...

Created attachment 97473 [details]
NFS packet trace
Created attachment 97474 [details]
2nd NFS packet trace
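The "Stale NFS file handle" errors in the transcripts above are stat(2)/lstat(2) calls failing with ESTALE on the client. A minimal client-side probe, an illustrative sketch of my own rather than anything from the original reports, assuming the /mnt/storage mount used in the transcripts:

```c
/* Hedged sketch: poll stat(2) on one of the affected files from the
 * NFS client and report errno, so the transition to ESTALE after the
 * server-side rm + rsync can be timed.  The path is assumed from the
 * transcripts above. */
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    const char *path = "/mnt/storage/ftpaccess.ocf";
    struct stat st;

    for (;;) {
        if (stat(path, &st) == 0)
            printf("ok: ino=%lu\n", (unsigned long)st.st_ino);
        else
            printf("stat failed: %s (errno %d)\n",
                   strerror(errno), errno);  /* ESTALE is 116 on Linux */
        sleep(5);
    }
    return 0;
}
```

Left running across the server-side rm + rsync, the probe should flip from printing the cached inode number to ESTALE and, if the bug is present, stay there indefinitely, matching the behaviour reported above.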
I've also managed to recreate this problem, and I think my trace shows the problem fairly well. I took this trace from the NFS server in my configuration, and performed the following operations:

1) populated the test NFS mount directory with three files from the NFS client
2) issued an ls -l from the client (which properly showed the files in the directory)
3) removed the files from the NFS server and recreated them with rsync
4) issued an ls -l from the client (which returned stale NFS file handles)
5) removed the files from the NFS server
6) issued an ls -l from the client (which showed zero files present)
7) recreated the files on the server with rsync
8) issued an ls -l from the client (which showed the files present as expected)

The trace indicates that at first we are able to successfully grab the attributes of all the files (as expected), but after we remove and repopulate the directory, whatever code determines the validity of the directory entry gets confused, as it thinks the contents of the directory have not changed. Consequently, when we perform the second ls -l in step 4, we wind up issuing GETATTR requests with the same file IDs in them, but are returned ERR_STALE because "those files" no longer exist. In the subsequent steps we remove the files on the server again, and then force the NFS client to look at the directory entry, which it then realizes has changed; consequently another READDIRPLUS command is issued and we correctly see zero files present. A subsequent rsync of the directory on the NFS server causes another READDIRPLUS packet to be issued from the client on the final ls -l, and the result is that the files are displayed properly.

As an example, frames 100 and 249 illustrate this progression. We start at frame 100 by issuing GETATTR requests on file id 7127438, and are successful. Following the file removal on the server and repopulation, we see in frame 249 another GETATTR request with the same file id. It no longer exists on the server, but the client doesn't know that because it hasn't invalidated its dentry cache entry for this directory. So the server correctly responds with ERR_STALE. Later we see the READDIRPLUS commands issued when we force the client to recognize that the directory contents have changed, and we start using a new set of file ids, which return successful results on their GETATTR requests.

It would seem to me that the appropriate thing to do here is: when we encounter an ERR_STALE on a given file handle in a directory, we should issue another READDIRPLUS command to refresh the directory contents, and retry the GETATTR request (see the sketch after the attachment below). There may well be unintended consequences of this that I am unaware of, but it seems like the thing to do for this particular problem.

Created attachment 98075 [details]
tcpdump capture file to go with my comments above
the tcpdump file with the test I described above.
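To make the proposed recovery concrete, here is a minimal sketch of the "refresh the directory and retry on ESTALE" idea. It is illustrative only and is not the code from the eventual fix (the patch actually deployed is attachment 100363 below): d_drop() and invalidate_inode_pages() are real 2.4 kernel calls, while nfs_do_getattr(), nfs_relookup(), and the exact cache-zapping step are hypothetical or approximate.

```c
/*
 * Hedged sketch of the recovery proposed above, NOT the actual patch:
 * on an ESTALE reply to GETATTR, unhash the stale dentry and dump the
 * parent directory's cached pages so the next access forces a fresh
 * READDIRPLUS, then redo the lookup by name once.
 */
static int nfs_getattr_refresh(struct dentry *dentry,
                               struct nfs_fattr *fattr)
{
        struct inode *dir = dentry->d_parent->d_inode;
        int status = nfs_do_getattr(dentry->d_inode, fattr); /* hypothetical wrapper */

        if (status != -ESTALE)
                return status;

        d_drop(dentry);               /* unhash the stale dentry */
        invalidate_inode_pages(dir);  /* drop cached readdir pages so the
                                         next access goes back to the server
                                         with a fresh READDIRPLUS */

        /*
         * Retrying GETATTR on the old handle would only return ESTALE
         * again; the retry has to be a fresh lookup by name so that we
         * pick up the file's new handle.  nfs_relookup() is hypothetical.
         */
        return nfs_relookup(dir, dentry, fattr);
}
```

The commenter's caveat about unintended consequences is fair: the retry must be bounded (one refresh per ESTALE, as above), or a genuinely deleted file would loop forever, and zapping the directory on every stray ESTALE costs extra READDIRPLUS round trips on busy mounts.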
I should also note that I repeated the above test, but instead of removing the files from the directory, I added a new file to the directory in step 5. The results were the same, which further suggests to me that the NFS client is not correctly invalidating entries in the dentry cache.

I have what might be a similar bug. Kernel 2.4.21-9.0.1.ELsmp on both NFS server and client (1 CPU w/HT on each, 512MB RAM on the server, 1GB on the client). All standard NFS options, other than "no_root_squash" being added. The mounted directory is ext3, again all defaults, other than that I have tried both with and without ACLs on (same results each way). I have a repeatable failure. Do these on the NFS client:

- cd /path/to/nfs/dir
- sudo bash (i.e., keep the shell in the NFS-mounted dir)
- cd /path/to/non-nfs/dir
- start up a daemon that does heavy non-NFS disk work on the client (I start up a program that uses MySQL to do a 5-way join across tables with millions of rows, returning about a million rows in the result set; this takes 30 secs to complete and will obviously put a lot of pressure on the block buffer cache and the local disk)
- when the heavy workload is done, exit from the sudo shell
- ls

Every time, ls says ".: stale NFS handle". If I then do "cd /path/to/nfs/dir" again, everything is back to normal. I have no idea why doing heavy local disk I/O would make NFS do this. I was getting intermittent NFS problems (stale handles, "ld" reporting files not there when they clearly were, etc.), and was suspecting hardware, but this is so repeatable that flaky memory and/or network doesn't seem likely. I have tried both with and without ACLs on the server (I was using ACLs, but thought maybe they were confusing NFS, so I tried without them), same result. Rebooted both machines several times, same result. Does this sound like the same problem? Or totally unrelated? Should I open up a separate bug report for it? And what can I do to help get more information on this? It is very frustrating, but when I searched bugzilla I found this bug, which sounds quite similar in that it forces NFS to give stale handle errors.

More from wms: I was able to isolate my problem down. Still not sure if it is related to this one or not, but I have found a *very* repeatable, *very* simple test case. I have submitted that as a separate bug, complete with the ethereal binary dump of the traffic, as bug #118816.

We are seeing the same problem. RHEL 3 NFS server, current with patches as of 5/17/04. RHEL 3 client, newly installed (not yet updated). Here's the scenario:

- Update files under /home/someuser, local to the NFS server.
- rsync the files from /home/someuser to /shared/mountpoint/someuser, NFS-mounted on the NFS server from itself.
- ls /shared/mountpoint/someuser on the client box; get a "Stale NFS file handle" error only on the file(s) directly affected by the rsync. All other files are displayed properly.

The problem also occurs on the client if I rsync files to the local, exported filesystem (/shared/export/mountpoint/someuser) instead of the NFS mount on the NFS server. During testing, I also got the NFS server itself to exhibit the problem through the local mount (the server is a client to itself). I have to unmount and mount the NFS filesystem to eliminate the error, which obviously isn't acceptable in production. Repeating the process using cp instead of rsync does not result in the error (a likely reason is sketched after the patch comment below). The problem does not occur with an RHEL 2.1 client.

Created attachment 100363 [details]
Current patch
This is the patch that we are currently running here at LLNL. It seems to work
but we are still evaluating it. We did find a bad interaction with the NFS>16
groups patch.
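One plausible explanation for the cp-versus-rsync difference noted above (my inference, consistent with the ERR_STALE trace analysis earlier): rsync writes the transferred data to a temporary file and rename()s it over the destination, so the replacement file gets a new inode and the client's cached file handle points at nothing, whereas cp rewrites the existing inode in place and the old handle stays valid. A quick check, runnable on the server's exported tree; the file argument is whatever path you are replacing:

```c
/* Hedged demo: print a file's inode number so it can be compared
 * before and after an rsync (new inode expected) versus a cp (same
 * inode expected).  Run on the NFS server against the exported tree. */
#include <stdio.h>
#include <sys/stat.h>

int main(int argc, char **argv)
{
        struct stat st;

        if (argc != 2) {
                fprintf(stderr, "usage: %s <file>\n", argv[0]);
                return 1;
        }
        if (stat(argv[1], &st) != 0) {
                perror(argv[1]);
                return 1;
        }
        printf("%s: inode %lu\n", argv[1], (unsigned long)st.st_ino);
        return 0;
}
```

Plain "ls -i" shows the same thing without any code. If the inode number changes across the rsync but not across the cp, that matches the trace analysis: the client keeps issuing GETATTR against the old file id, and the server has nothing left to answer with but ERR_STALE.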
A fix for this problem has just been committed to the RHEL3 U3 patch pool this evening (in kernel version 2.4.21-15.17.EL).

An erratum has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2004-433.html