Bug 765497 - (GLUSTER-3765) [glusterfs-3.2.5qa4] - dbench fails with 'Reply submission failed' error
Status: CLOSED CURRENTRELEASE
Product: GlusterFS
Classification: Community
Component: nfs
Version: pre-release
Hardware: x86_64 Linux
Priority: medium
Severity: medium
Assigned To: Krishna Srinivas
Reported: 2011-10-31 04:21 EDT by M S Vishwanath Bhat
Modified: 2016-05-31 21:55 EDT
Doc Type: Bug Fix

Description M S Vishwanath Bhat 2011-10-31 04:21:17 EDT
I created a distribute volume with the rdma transport type and mounted it on an NFS client with the nolock option. When I run dbench on the NFS client, it fails during the warmup stage itself with the following error.

  20       908    13.16 MB/sec  warmup  33 sec  latency 7594.004 ms
  20       908    12.78 MB/sec  warmup  34 sec  latency 8594.046 ms
  20       908    12.41 MB/sec  warmup  35 sec  latency 9594.089 ms
[892] open ./clients/client3/~dmtmp/PWRPNT/PPTC241.TMP failed for handle 10058 (File exists)
(893) ERROR: handle 10058 was not found
Child failed with status 1

The NFS volume is mounted with the following options.

10.1.10.21:hosdu /mnt nfs rw,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,hard,nolock,proto=tcp,port=38467,timeo=600,retrans=2,sec=sys,mountaddr=10.1.10.21,mountvers=3,mountport=38465,mountproto=tcp,local_lock=all,addr=10.1.10.21 0 0
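To compare mount settings between clients when reproducing this, the entry above can be split into fields with a few lines of Python. This is only a sketch: `parse_mount_line` is a hypothetical helper name, not part of GlusterFS or any NFS tool, and it assumes the usual six whitespace-separated fields.

```python
def parse_mount_line(line):
    """Split a six-field mount entry; the option string becomes a dict."""
    device, mountpoint, fstype, options, dump, passno = line.split()
    opts = {}
    for item in options.split(","):
        key, sep, value = item.partition("=")
        # Flags without a value (rw, hard, nolock) are stored as True.
        opts[key] = value if sep else True
    return {
        "device": device,
        "mountpoint": mountpoint,
        "fstype": fstype,
        "options": opts,
    }


# The mount entry quoted in this report.
entry = parse_mount_line(
    "10.1.10.21:hosdu /mnt nfs rw,relatime,vers=3,rsize=65536,wsize=65536,"
    "namlen=255,hard,nolock,proto=tcp,port=38467,timeo=600,retrans=2,sec=sys,"
    "mountaddr=10.1.10.21,mountvers=3,mountport=38465,mountproto=tcp,"
    "local_lock=all,addr=10.1.10.21 0 0"
)
```

For the entry in this report, `entry["options"]["proto"]` is `"tcp"` and `nolock` is present as a flag.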


The NFS logs on one of the servers had the following errors.

[2011-10-28 08:43:39.241236] I [client-handshake.c:913:client_setvolume_cbk] 0-hosdu-client-1: Connected to 10.1.10.24:24009, attached to remote volume '/tmp/brick'.
[2011-10-30 20:57:42.758309] E [rpcsvc.c:1710:nfs_rpcsvc_submit_generic] 0-nfsrpc: Failed to submit message
[2011-10-30 20:57:42.758421] E [nfs3.c:522:nfs3svc_submit_reply] 0-nfs-nfsv3: Reply submission failed
[2011-10-30 20:57:42.758469] E [rpcsvc.c:1710:nfs_rpcsvc_submit_generic] 0-nfsrpc: Failed to submit message
[2011-10-30 20:57:42.758517] E [nfs3.c:522:nfs3svc_submit_reply] 0-nfs-nfsv3: Reply submission failed
[2011-10-30 20:57:42.759380] E [rpcsvc.c:1710:nfs_rpcsvc_submit_generic] 0-nfsrpc: Failed to submit message
[2011-10-30 20:57:42.759406] E [nfs3.c:522:nfs3svc_submit_reply] 0-nfs-nfsv3: Reply submission failed
[2011-10-30 20:57:42.759778] E [rpcsvc.c:1710:nfs_rpcsvc_submit_generic] 0-nfsrpc: Failed to submit message
[2011-10-30 20:57:42.759819] E [nfs3.c:522:nfs3svc_submit_reply] 0-nfs-nfsv3: Reply submission failed
[2011-10-30 20:57:42.760214] E [rpcsvc.c:1710:nfs_rpcsvc_submit_generic] 0-nfsrpc: Failed to submit message
[2011-10-30 20:57:42.760239] E [nfs3.c:522:nfs3svc_submit_reply] 0-nfs-nfsv3: Reply submission failed
[2011-10-30 20:57:42.778048] E [rpcsvc.c:1710:nfs_rpcsvc_submit_generic] 0-nfsrpc: Failed to submit message
[2011-10-30 20:57:42.778075] E [nfs3.c:522:nfs3svc_submit_reply] 0-nfs-nfsv3: Reply submission failed
[2011-10-30 20:57:42.788795] E [rpcsvc.c:1710:nfs_rpcsvc_submit_generic] 0-nfsrpc: Failed to submit message
[2011-10-30 20:57:42.790366] E [nfs3.c:522:nfs3svc_submit_reply] 0-nfs-nfsv3: Reply submission failed
[2011-10-30 20:57:42.820686] E [rpcsvc.c:1710:nfs_rpcsvc_submit_generic] 0-nfsrpc: Failed to submit message
[2011-10-30 20:57:42.820712] E [nfs3.c:522:nfs3svc_submit_reply] 0-nfs-nfsv3: Reply submission failed
[2011-10-30 20:57:52.559668] E [client3_1-fops.c:1722:client3_1_create_cbk] 0-hosdu-client-1: remote operation failed: File exists
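With a log this repetitive, a quick tally by emitting function shows which errors dominate. The sketch below assumes every line follows the layout seen in the excerpt above (`[timestamp] LEVEL [file:line:function] name: message`); the regular expression and the `count_errors` helper are illustrative and not part of GlusterFS.

```python
import re
from collections import Counter

# Layout assumed from the excerpt above:
# [timestamp] LEVEL [file:line:function] name: message
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+(?P<level>[A-Z])\s+"
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\]\s+"
    r"(?P<name>\S+):\s+(?P<msg>.*)$"
)


def count_errors(lines):
    """Count (function, message) pairs for lines logged at level E."""
    counts = Counter()
    for raw in lines:
        match = LOG_LINE.match(raw.strip())
        if match and match.group("level") == "E":
            counts[(match.group("func"), match.group("msg"))] += 1
    return counts


# Four lines copied from the log in this report.
sample = [
    "[2011-10-28 08:43:39.241236] I [client-handshake.c:913:client_setvolume_cbk] "
    "0-hosdu-client-1: Connected to 10.1.10.24:24009, attached to remote volume '/tmp/brick'.",
    "[2011-10-30 20:57:42.758309] E [rpcsvc.c:1710:nfs_rpcsvc_submit_generic] "
    "0-nfsrpc: Failed to submit message",
    "[2011-10-30 20:57:42.758421] E [nfs3.c:522:nfs3svc_submit_reply] "
    "0-nfs-nfsv3: Reply submission failed",
    "[2011-10-30 20:57:52.559668] E [client3_1-fops.c:1722:client3_1_create_cbk] "
    "0-hosdu-client-1: remote operation failed: File exists",
]
counts = count_errors(sample)
```

To run it against a full log, pass an open file object instead of `sample`, for example `count_errors(open(path))`, where `path` is wherever the NFS server log lives on your system.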
Comment 1 Krishna Srinivas 2011-11-01 07:02:26 EDT
Vishwa, is the problem reproducible every time?
Comment 2 M S Vishwanath Bhat 2011-11-01 09:49:31 EDT
I was able to hit it twice in as many tries.
Comment 3 Saurabh 2011-11-02 00:37:30 EDT
FYI, I also found the same issue with dist-rep and stripe volumes. I have already discussed this with Krishna.
Comment 4 Anand Avati 2011-11-04 05:37:20 EDT
CHANGE: http://review.gluster.com/671 (In the rpc implementation of nfs suppose the transmission buffer list) merged in release-3.2 by Vijay Bellur (vijay@gluster.com)
Comment 5 Saurabh 2011-11-07 22:23:16 EST
 Operation      Count    AvgLat    MaxLat
 ----------------------------------------
 NTCreateX      12572    53.847  4327.962
 Close           9460     0.292  1978.602
 Rename           533   210.111  2715.497
 Unlink          2334    64.667  2625.165
 Qpathinfo      11143    45.807  4422.968
 Qfileinfo       2424     0.117    80.244
 Qfsinfo         2098    44.202  2623.089
 Sfileinfo       1060     5.621  1535.059
 Find            4320    52.457  3797.398
 WriteX          8348   148.227  3959.857
 ReadX          19551     0.057   121.344
 LockX             40     0.009     0.031
 UnlockX           40     0.002     0.004
 Flush            951     0.002     0.023

Throughput 1.64406 MB/sec (sync open) (sync dirs)  10 clients  10 procs  max_latency=4445.089 ms

real	6m6.619s
user	0m0.488s
sys	0m3.179s



dbench works for me on 3.2.5qa6 with a distribute-replicate volume.
