Bug 1046011

Summary: NFS mount over RDMA cannot read more than 812 bytes
Product: [Fedora] Fedora
Reporter: Markus Stockhausen <mst>
Component: kernel
Assignee: Jeff Layton <jlayton>
Status: CLOSED RAWHIDE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified
Version: rawhide
CC: bfields, chuck.lever, gansalmon, itamar, jlayton, jonathan, kernel-maint, madhu.chinakonda, nfs-maint, steved
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2014-04-24 14:29:43 UTC
Type: Bug
Bug Blocks: 1048477

Description Markus Stockhausen 2013-12-23 08:00:06 UTC
Description of problem:

When mounting NFS shares using the RDMA protocol, the client cannot read blocks that are larger than 812 bytes.


Version-Release number of selected component (if applicable):
Kernel 3.11.10-200.fc19.x86_64

How reproducible:
100%

Steps to Reproduce:
1) Create NFS RDMA server (as explained in the kernel doc)
# modprobe svcrdma
# service nfs start
# echo rdma 20049 > /proc/fs/nfsd/portlist

2) Create some export (entry in /etc/exports)
/ 10.0.0.0/8(fsid=0,rw,async,no_subtree_check,no_root_squash,insecure)
# exportfs -a

3) Configure NFS RDMA client
# modprobe xprtrdma

4) mount filesystem on client
# mount -o rdma,port=20049,nfsvers=4 10.10.30.1:/ /mnt

5) check for rdma mount on client
# mount
...
10.10.30.1:/ on /mnt type nfs4 (rw,relatime,vers=4.0,rsize=262144,wsize=262144,namlen=255,hard,proto=rdma,port=20049,timeo=600,retrans=2,sec=sys,clientaddr=10.10.30.3,local_lock=none,addr=10.10.30.1)
...

6) create file on nfs server with 812 bytes
# dd if=/dev/urandom of=bla.txt bs=812 count=1

7) read file on client (works fine)
# cat bla.txt > /dev/null

8) create file on nfs server with 813 bytes
# dd if=/dev/urandom of=bla.txt bs=813 count=1

9) read file on client (fails; the message is the German-locale text for "Input/output error")
# cat bla.txt > /dev/null
cat: bla.txt: Eingabe-/Ausgabefehler


Actual results:
file not readable

Expected results:
no error should occur

Additional info:
A similar problem is discussed on the NFS mailing list: http://www.spinics.net/lists/linux-nfs/msg40253.html

Comment 1 Markus Stockhausen 2013-12-23 08:03:01 UTC
I'm using Mellanox ConnectX InfiniBand cards with the mlx4_ib driver. Fedora 19 is fully patched on both nodes.

Comment 2 Markus Stockhausen 2013-12-23 17:05:03 UTC
If I read the call stack correctly in net/sunrpc/xprtrdma/rpc_rdma.c:

1) rpcrdma_reply_handler() calls rpcrdma_inline_fixup()

2) rpcrdma_inline_fixup() executes rqst->rq_rcv_buf.page_len = 0;

Activating RPC debugging gives the following log:
RPC: rpcrdma_inline_fixup: srcp 0xffff8805eb1c4094 len 60 hdrlen 60

Kernel commit http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=a11a2bf4de5679fa0b63474c7d39bea2dac7d061 changes how net/sunrpc/xdr.c handles the page_len value: the datalen is now reduced to page_len even if page_len is zero.
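
To illustrate the suspected interaction, here is a minimal user-space sketch. It is not the kernel code: the struct and both function names (model_rcv_buf, model_inline_fixup, model_clamp_datalen) are hypothetical stand-ins that model only the two observations above, namely that the fixup leaves page_len at zero and that the data length is then reduced to page_len.

```c
/*
 * Simplified model, NOT the kernel implementation.
 * All names here are hypothetical and exist only for illustration.
 */
struct model_rcv_buf {
    unsigned int page_len;  /* bytes the client believes sit in the page list */
};

/* Models step 2 above: the fixup sets page_len to zero. */
static void model_inline_fixup(struct model_rcv_buf *buf)
{
    buf->page_len = 0;
}

/* Models the described clamp: the result never exceeds page_len. */
static unsigned int model_clamp_datalen(const struct model_rcv_buf *buf,
                                        unsigned int datalen)
{
    return datalen > buf->page_len ? buf->page_len : datalen;
}
```

In this model, a buffer with page_len 813 yields 813 usable bytes for an 813-byte read, but once model_inline_fixup() has run, the same request yields 0 bytes. That would be consistent with the I/O error seen on the client, assuming the model matches what the kernel actually does.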

Comment 3 Markus Stockhausen 2013-12-23 17:43:14 UTC
I cannot confirm it, but this might be an issue for RHEL 7 Beta too.

Comment 4 Jeff Layton 2014-01-20 17:34:58 UTC
Looks like Chuck has a patch for this. I'll plan to test it out soon.

Comment 5 Jeff Layton 2014-01-21 20:28:19 UTC
...and Chuck's patch works. Looks like he ended up fixing the same spot that Markus noticed. See:

    http://article.gmane.org/gmane.linux.nfs/60953

I imagine this'll get merged for 3.14 and marked for stable.

Comment 6 Jeff Layton 2014-04-24 14:29:43 UTC
Oof, didn't make 3.14, but did get merged for v3.15. I'll close with a resolution of RAWHIDE.