Bug 765497 (GLUSTER-3765) - [glusterfs-3.2.5qa4] - dbench fails with 'Reply submission failed' error
Summary: [glusterfs-3.2.5qa4] - dbench fails with 'Reply submission failed' error
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: GLUSTER-3765
Product: GlusterFS
Classification: Community
Component: nfs
Version: pre-release
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Krishna Srinivas
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2011-10-31 08:21 UTC by M S Vishwanath Bhat
Modified: 2016-06-01 01:55 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description M S Vishwanath Bhat 2011-10-31 08:21:17 UTC
Created a distribute volume with the rdma transport type and mounted it on an NFS client with the nolock option. When I run dbench on the NFS client, it fails during the warmup stage itself with the following error.

  20       908    13.16 MB/sec  warmup  33 sec  latency 7594.004 ms
  20       908    12.78 MB/sec  warmup  34 sec  latency 8594.046 ms
  20       908    12.41 MB/sec  warmup  35 sec  latency 9594.089 ms
[892] open ./clients/client3/~dmtmp/PWRPNT/PPTC241.TMP failed for handle 10058 (File exists)
(893) ERROR: handle 10058 was not found
Child failed with status 1

The NFS mount uses the following options.

10.1.10.21:hosdu /mnt nfs rw,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,hard,nolock,proto=tcp,port=38467,timeo=600,retrans=2,sec=sys,mountaddr=10.1.10.21,mountvers=3,mountport=38465,mountproto=tcp,local_lock=all,addr=10.1.10.21 0 0
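
For reference, the mount entry above can be split into its individual options with a short script. This is a minimal sketch and not part of the original report: the function name parse_mount_entry is hypothetical, and it assumes the six-field /proc/mounts layout shown above.

```python
def parse_mount_entry(line):
    """Split one /proc/mounts-style line into its fields and an options dict."""
    # Assumed layout: device, mount point, fs type, options, dump, fsck pass.
    device, mount_point, fs_type, options, _dump, _fsck_pass = line.split()
    parsed = {}
    for opt in options.split(","):
        key, sep, value = opt.partition("=")
        # Flags such as "rw" or "nolock" carry no value; store True for them.
        parsed[key] = value if sep else True
    return {
        "device": device,
        "mount_point": mount_point,
        "fs_type": fs_type,
        "options": parsed,
    }


# The mount entry quoted in this report, split only for readability.
MOUNT_LINE = (
    "10.1.10.21:hosdu /mnt nfs "
    "rw,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,hard,nolock,"
    "proto=tcp,port=38467,timeo=600,retrans=2,sec=sys,mountaddr=10.1.10.21,"
    "mountvers=3,mountport=38465,mountproto=tcp,local_lock=all,"
    "addr=10.1.10.21 0 0"
)

entry = parse_mount_entry(MOUNT_LINE)
for name in ("vers", "proto", "port", "mountport", "nolock"):
    print(name, "=", entry["options"][name])
```

Printing the selected keys makes it easy to confirm that the client is using NFSv3 over TCP with nolock set, as described in the report.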


The NFS logs on one of the servers had the following errors.

[2011-10-28 08:43:39.241236] I [client-handshake.c:913:client_setvolume_cbk] 0-hosdu-client-1: Connected to 10.1.10.24:24009, attached to remote volume '/tmp/brick'.
[2011-10-30 20:57:42.758309] E [rpcsvc.c:1710:nfs_rpcsvc_submit_generic] 0-nfsrpc: Failed to submit message
[2011-10-30 20:57:42.758421] E [nfs3.c:522:nfs3svc_submit_reply] 0-nfs-nfsv3: Reply submission failed
[2011-10-30 20:57:42.758469] E [rpcsvc.c:1710:nfs_rpcsvc_submit_generic] 0-nfsrpc: Failed to submit message
[2011-10-30 20:57:42.758517] E [nfs3.c:522:nfs3svc_submit_reply] 0-nfs-nfsv3: Reply submission failed
[2011-10-30 20:57:42.759380] E [rpcsvc.c:1710:nfs_rpcsvc_submit_generic] 0-nfsrpc: Failed to submit message
[2011-10-30 20:57:42.759406] E [nfs3.c:522:nfs3svc_submit_reply] 0-nfs-nfsv3: Reply submission failed
[2011-10-30 20:57:42.759778] E [rpcsvc.c:1710:nfs_rpcsvc_submit_generic] 0-nfsrpc: Failed to submit message
[2011-10-30 20:57:42.759819] E [nfs3.c:522:nfs3svc_submit_reply] 0-nfs-nfsv3: Reply submission failed
[2011-10-30 20:57:42.760214] E [rpcsvc.c:1710:nfs_rpcsvc_submit_generic] 0-nfsrpc: Failed to submit message
[2011-10-30 20:57:42.760239] E [nfs3.c:522:nfs3svc_submit_reply] 0-nfs-nfsv3: Reply submission failed
[2011-10-30 20:57:42.778048] E [rpcsvc.c:1710:nfs_rpcsvc_submit_generic] 0-nfsrpc: Failed to submit message
[2011-10-30 20:57:42.778075] E [nfs3.c:522:nfs3svc_submit_reply] 0-nfs-nfsv3: Reply submission failed
[2011-10-30 20:57:42.788795] E [rpcsvc.c:1710:nfs_rpcsvc_submit_generic] 0-nfsrpc: Failed to submit message
[2011-10-30 20:57:42.790366] E [nfs3.c:522:nfs3svc_submit_reply] 0-nfs-nfsv3: Reply submission failed
[2011-10-30 20:57:42.820686] E [rpcsvc.c:1710:nfs_rpcsvc_submit_generic] 0-nfsrpc: Failed to submit message
[2011-10-30 20:57:42.820712] E [nfs3.c:522:nfs3svc_submit_reply] 0-nfs-nfsv3: Reply submission failed
[2011-10-30 20:57:52.559668] E [client3_1-fops.c:1722:client3_1_create_cbk] 0-hosdu-client-1: remote operation failed: File exists
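
To see at a glance which functions are logging errors and how often, the log lines above can be tallied with a small script. This is a sketch only: the regular expression is inferred from the lines quoted in this report rather than from any GlusterFS documentation, and the names LOG_LINE and count_errors are hypothetical.

```python
import re
from collections import Counter

# Pattern inferred from the log lines in this report:
# [timestamp] LEVEL [file:line:function] domain: message
LOG_LINE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]\s+(?P<level>[A-Z])\s+"
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<function>[^\]]+)\]\s+"
    r"(?P<domain>\S+):\s+(?P<message>.*)$"
)


def count_errors(lines):
    """Count error-level ("E") entries, keyed by (function, message)."""
    counts = Counter()
    for raw in lines:
        match = LOG_LINE.match(raw.strip())
        if match and match.group("level") == "E":
            counts[(match.group("function"), match.group("message"))] += 1
    return counts


# A few lines copied from the log excerpt above.
sample_log = [
    "[2011-10-28 08:43:39.241236] I [client-handshake.c:913:client_setvolume_cbk] "
    "0-hosdu-client-1: Connected to 10.1.10.24:24009, attached to remote volume '/tmp/brick'.",
    "[2011-10-30 20:57:42.758309] E [rpcsvc.c:1710:nfs_rpcsvc_submit_generic] "
    "0-nfsrpc: Failed to submit message",
    "[2011-10-30 20:57:42.758421] E [nfs3.c:522:nfs3svc_submit_reply] "
    "0-nfs-nfsv3: Reply submission failed",
    "[2011-10-30 20:57:52.559668] E [client3_1-fops.c:1722:client3_1_create_cbk] "
    "0-hosdu-client-1: remote operation failed: File exists",
]

error_counts = count_errors(sample_log)
for (function, message), count in error_counts.most_common():
    print(count, function, "-", message)
```

On a full log file, passing the open file object to count_errors instead of sample_log would work the same way, since it only needs an iterable of lines.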

Comment 1 Krishna Srinivas 2011-11-01 11:02:26 UTC
Vishwa, is the problem reproducible every time?

Comment 2 M S Vishwanath Bhat 2011-11-01 13:49:31 UTC
I was able to hit it twice in as many tries.

Comment 3 Saurabh 2011-11-02 04:37:30 UTC
FYI, I also found the same issue for dist-rep and stripe volumes. I have already discussed this with Krishna.

Comment 4 Anand Avati 2011-11-04 09:37:20 UTC
CHANGE: http://review.gluster.com/671 (In the rpc implementation of nfs suppose the transmission buffer list) merged in release-3.2 by Vijay Bellur (vijay)

Comment 5 Saurabh 2011-11-08 03:23:16 UTC
 Operation      Count    AvgLat    MaxLat
 ----------------------------------------
 NTCreateX      12572    53.847  4327.962
 Close           9460     0.292  1978.602
 Rename           533   210.111  2715.497
 Unlink          2334    64.667  2625.165
 Qpathinfo      11143    45.807  4422.968
 Qfileinfo       2424     0.117    80.244
 Qfsinfo         2098    44.202  2623.089
 Sfileinfo       1060     5.621  1535.059
 Find            4320    52.457  3797.398
 WriteX          8348   148.227  3959.857
 ReadX          19551     0.057   121.344
 LockX             40     0.009     0.031
 UnlockX           40     0.002     0.004
 Flush            951     0.002     0.023

Throughput 1.64406 MB/sec (sync open) (sync dirs)  10 clients  10 procs  max_latency=4445.089 ms

real	6m6.619s
user	0m0.488s
sys	0m3.179s



dbench works for me on 3.2.5qa6 with a distribute-replicate volume.
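
When comparing runs like the one above, it can help to pull the per-operation table into a structure and pick out the slowest operations. The following is a rough sketch, assuming the four-column layout (Operation, Count, AvgLat, MaxLat) shown in this comment; the function name parse_dbench_ops is hypothetical and only a few rows are copied into the sample.

```python
def parse_dbench_ops(text):
    """Return {operation: (count, avg_latency, max_latency)} from a dbench table."""
    ops = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 4:
            continue  # separator lines, blank lines, throughput summary
        name, count, avg_lat, max_lat = parts
        try:
            ops[name] = (int(count), float(avg_lat), float(max_lat))
        except ValueError:
            continue  # header row: "Operation Count AvgLat MaxLat"
    return ops


# A subset of the rows reported in this comment.
SAMPLE_TABLE = """\
 Operation      Count    AvgLat    MaxLat
 ----------------------------------------
 NTCreateX      12572    53.847  4327.962
 Rename           533   210.111  2715.497
 Qpathinfo      11143    45.807  4422.968
 WriteX          8348   148.227  3959.857
 ReadX          19551     0.057   121.344
"""

ops = parse_dbench_ops(SAMPLE_TABLE)
slowest_avg = max(ops, key=lambda name: ops[name][1])
slowest_max = max(ops, key=lambda name: ops[name][2])
print("highest average latency:", slowest_avg, ops[slowest_avg][1])
print("highest maximum latency:", slowest_max, ops[slowest_max][2])
```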

