Description of problem:

I have 3 machines:
- NFS server (Thecus N8800, but the same happens with a RHEL-4 server)
- F15 x86_64 machine with kernel 2.6.40.4-5.fc15.x86_64
- F16 PPC64 machine with kernel-3.1.0-0.rc6.git0.3.fc16.ppc64

On the x86_64 machine:

#> mount -o rw,rsize=4096,wsize=8192 nfs-server:/data/tmp /mnt/foo/
#> echo 1 > /mnt/foo/xxx; ls -l /mnt/foo/xxx
-rw-rw-rw- 1 karsten karsten 2 29. Sep 13:09 xxx

#> mount -o rw,rsize=4096,wsize=16384 nfs-server:/data/tmp /mnt/foo/
#> echo 1 > /mnt/foo/xxx; ls -l /mnt/foo/xxx
-rw-rw-rw- 1 karsten karsten 2 29. Sep 13:10 xxx

#> mount -o rw,rsize=4096,wsize=32738 nfs-server:/data/tmp /mnt/foo/
#> echo 1 > /mnt/foo/xxx; ls -l /mnt/foo/xxx
-rw-rw-rw- 1 karsten karsten 2 29. Sep 13:11 xxx

#> mount -o rw,rsize=4096,wsize=65536 nfs-server:/data/tmp /mnt/foo/
#> echo 1 > /mnt/foo/xxx; ls -l /mnt/foo/xxx
-rw-rw-rw- 1 karsten karsten 2 29. Sep 13:11 xxx

The same on the PPC64 machine:

#> mount -o rw,rsize=4096,wsize=8192 nfs-server:/data/tmp /mnt/foo/
#> echo 1 > /mnt/foo/xxx; ls -l /mnt/foo/xxx
-rw-rw-rw- 1 karsten karsten 8192 29. Sep 2011 xxx

#> mount -o rw,rsize=4096,wsize=16384 nfs-server:/data/tmp /mnt/foo/
#> echo 1 > /mnt/foo/xxx; ls -l /mnt/foo/xxx
-rw-rw-rw- 1 karsten karsten 16384 29. Sep 13:10 xxx

#> mount -o rw,rsize=4096,wsize=32768 nfs-server:/data/tmp /mnt/foo/
#> echo 1 > /mnt/foo/xxx; ls -l /mnt/foo/xxx
-rw-rw-rw- 1 karsten karsten 32768 29. Sep 2011 xxx

#> mount -o rw,rsize=4096,wsize=65535 nfs-server:/data/tmp /mnt/foo/
#> echo 1 > /mnt/foo/xxx; ls -l /mnt/foo/xxx
-rw-rw-rw- 1 karsten karsten 32768 29. Sep 2011 xxx

#> mount -o rw,rsize=4096,wsize=65536 nfs-server:/data/tmp /mnt/foo/
#> echo 1 > /mnt/foo/xxx; ls -l /mnt/foo/xxx
-rw-rw-rw- 1 karsten karsten 2 29. Sep 2011 xxx

#> mount | grep foo
nfs-server:/data/tmp/ on /mnt/foo type nfs (rw,relatime,vers=3,rsize=4096,wsize=32768,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=xx.xx.xx.xx,mountvers=3,mountport=903,mountproto=udp,local_lock=none,addr=xx.xx.xx.xx)

#> hexdump xxx
0000000 320a 0000 0000 0000 0000 0000 0000 0000
0000010 0000 0000 0000 0000 0000 0000 0000 0000
*
0008000

It's interesting that as long as wsize is greater than 65535, everything seems to be ok.
Steve and Jay, any ideas?
IIRC, ppc64 has 64k pages, right? It would be interesting to see if this commit helps:

commit f13c3620a4d1123dbf032f93f350b856ef292ced
Author: Trond Myklebust <Trond.Myklebust>
Date:   Mon Sep 12 11:47:53 2011 -0400

    NFS: Fix a typo in nfs_flush_multi

    Fix a typo which causes an Oops in the RPC layer, when using wsize < 4k.

    Signed-off-by: Trond Myklebust <Trond.Myklebust>
    Tested-by: Sricharan R <r.sricharan>

...currently the code does a wsize write even when there's less data to be written. nfs_flush_multi is only used when the wsize is less than the PAGE_CACHE_SIZE.
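To make the suspected mechanism concrete, here is a rough Python model of it. This is only a sketch of the behaviour described above, not the kernel code: the function names (split_page_write, resulting_file_size) and the buggy flag are made up for illustration, and the model assumes the split path is taken only when wsize is smaller than the page size.

```python
def split_page_write(nbytes, wsize, buggy=False):
    """Split nbytes of dirty data in one page into (offset, length) writes.

    Hypothetical model: with buggy=True every write is sent with a full
    wsize length, even when less data remains.
    """
    chunks = []
    offset = 0
    remaining = nbytes
    while remaining > 0:
        length = min(remaining, wsize)      # data actually available
        sent = wsize if buggy else length   # length put on the wire
        chunks.append((offset, sent))
        offset += length
        remaining -= length
    return chunks


def resulting_file_size(nbytes, wsize, page_size, buggy=False):
    """File size on the server after writing nbytes to a new file."""
    if wsize >= page_size:
        # Assumption: the split path is not used, a single write is sent.
        return nbytes
    chunks = split_page_write(nbytes, wsize, buggy)
    return max(off + length for off, length in chunks)


if __name__ == "__main__":
    # "echo 1" writes 2 bytes; compare 4k pages with 64k pages.
    for page_size in (4096, 65536):
        for wsize in (8192, 16384, 32768, 65536):
            size = resulting_file_size(2, wsize, page_size, buggy=True)
            print(page_size, wsize, size)
```

With 64k pages and the buggy flag set, the model gives 8192, 16384 and 32768 bytes for the three smaller wsize values and 2 bytes for wsize=65536, which matches the ls output on the PPC64 machine. With 4k pages every case gives 2 bytes, matching the x86_64 machine.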
(In reply to comment #2)
> IIRC, ppc64 has 64k pages, right? It would be interesting to see if this commit
> helps:

We recently switched Fedora to 64K pages, yes. I don't remember exactly which NVR that happened in.

> commit f13c3620a4d1123dbf032f93f350b856ef292ced
> Author: Trond Myklebust <Trond.Myklebust>
> Date: Mon Sep 12 11:47:53 2011 -0400
>
> NFS: Fix a typo in nfs_flush_multi

OK. That is in rc7 I believe, so it should be easy enough to build and test.

(Aside, I suck for calling you Jay. My apologies Jeff.)
------- Comment From baude.com 2011-10-27 12:09 EDT------- reverse mirror bug
I've verified that this is fixed in 3.1.0-0.rc8.git0.0.fc16.kh.ppc64