Bug 765481 (GLUSTER-3749)

Summary: [glusterfs-3.2.5qa2] dbench fails with 'bad fd' error on a volume with tcp,rdma transport type
Product: [Community] GlusterFS
Reporter: M S Vishwanath Bhat <vbhat>
Component: rdma
Assignee: Raghavendra G <rgowdapp>
Status: CLOSED CURRENTRELEASE
QA Contact:
Severity: medium
Docs Contact:
Priority: low
Version: pre-release
CC: amarts, gluster-bugs, vijay
Target Milestone: ---
Keywords: Triaged
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: glusterfs-3.4.0
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 849126 854627
Environment:
Last Closed: 2013-07-24 13:13:21 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Bug Depends On:
Bug Blocks: 849126, 854627, 858450
Attachments:
glusterfs client log (flags: none)

Description M S Vishwanath Bhat 2011-10-21 08:25:28 EDT
Created a 2-way distribute volume with transport type tcp,rdma, mounted it via tcp, and started dbench. dbench failed with 'File descriptor in bad state' errors:

[1289] read failed on handle 10137 (File descriptor in bad state)
[1323] read failed on handle 10139 (File descriptor in bad state)
[1311] read failed on handle 10138 (File descriptor in bad state)
[1312] read failed on handle 10138 (File descriptor in bad state)
 100      1443     0.14 MB/sec  execute 243 sec  latency 144553.751 ms
[1290] read failed on handle 10137 (File descriptor in bad state)
[1312] read failed on handle 10138 (File descriptor in bad state)
[1311] read failed on handle 10138 (File descriptor in bad state)
[1289] read failed on handle 10137 (File descriptor in bad state)
[1323] read failed on handle 10139 (File descriptor in bad state)
[1313] read failed on handle 10138 (File descriptor in bad state)
[1352] read failed on handle 10139 (File descriptor in bad state)
[1309] read failed on handle 10138 (File descriptor in bad state)
[1349] read failed on handle 10139 (File descriptor in bad state)
[1313] read failed on handle 10138 (File descriptor in bad state)
[1312] read failed on handle 10138 (File descriptor in bad state)
[1301] read failed on handle 10138 (File descriptor in bad state)
[1314] read failed on handle 10138 (File descriptor in bad state)
[1324] read failed on handle 10139 (File descriptor in bad state)
[1323] read failed on handle 10139 (File descriptor in bad state)
Child failed with status 1
[1290] read failed on handle 10137 (File descriptor in bad state)
[root@client4 mnt]# [1290] read failed on handle 10137 (File descriptor in bad state)
[1324] read failed on handle 10139 (File descriptor in bad state)
[1312] read failed on handle 10138 (File descriptor in bad state)
[1313] read failed on handle 10138 (File descriptor in bad state)
[1291] read failed on handle 10137 (File descriptor in bad state)
[1313] read failed on handle 10138 (File descriptor in bad state)
[1312] read failed on handle 10138 (File descriptor in bad state)
[1290] read failed on handle 10137 (File descriptor in bad state)
[1324] read failed on handle 10139 (File descriptor in bad state)
[1314] read failed on handle 10138 (File descriptor in bad state)
[1353] read failed on handle 10139 (File descriptor in bad state)
[1310] read failed on handle 10138 (File descriptor in bad state)
[1350] read failed on handle 10139 (File descriptor in bad state)
[1314] read failed on handle 10138 (File descriptor in bad state)
[1313] read failed on handle 10138 (File descriptor in bad state)
[1302] read failed on handle 10138 (File descriptor in bad state)
[1315] read failed on handle 10138 (File descriptor in bad state)
[1331] read failed on handle 10139 (File descriptor in bad state)
[1325] read failed on handle 10139 (File descriptor in bad state)
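
To see how the failures are spread across handles, the dbench output can be tallied with a short script. This is only a sketch for triage, not part of dbench or GlusterFS; the function name `failures_per_handle` and the regular expression are assumptions based on the line format shown above.

```python
import re
from collections import Counter

# Matches lines such as:
#   [1289] read failed on handle 10137 (File descriptor in bad state)
# re.findall scans the whole text, so a shell prompt printed in front of
# a failure line does not prevent the match.
FAIL = re.compile(r"\[(\d+)\] read failed on handle (\d+) \(([^)]+)\)")


def failures_per_handle(text):
    """Count 'read failed' lines per dbench handle number."""
    return Counter(handle for _tag, handle, _err in FAIL.findall(text))


# A few lines copied from the output above, used as sample input.
sample = """\
[1289] read failed on handle 10137 (File descriptor in bad state)
[1323] read failed on handle 10139 (File descriptor in bad state)
[1311] read failed on handle 10138 (File descriptor in bad state)
Child failed with status 1
[root@client4 mnt]# [1290] read failed on handle 10137 (File descriptor in bad state)
"""

for handle, count in sorted(failures_per_handle(sample).items()):
    print(handle, count)
```

In the output pasted above, only handles 10137, 10138 and 10139 appear.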


I see a lot of these errors in the client log:


[2011-10-21 00:20:02.968436] W [client3_1-fops.c:3692:client3_1_flush] 0-hosdu-client-1: failed to send the fop: File descriptor in bad state
[2011-10-21 00:20:02.968551] W [client3_1-fops.c:3662:client3_1_flush] 0-hosdu-client-1: (7869409): failed to get fd ctx. EBADFD
[2011-10-21 00:20:02.968571] W [client3_1-fops.c:3692:client3_1_flush] 0-hosdu-client-1: failed to send the fop: File descriptor in bad state
[2011-10-21 00:20:02.968960] W [client3_1-fops.c:3662:client3_1_flush] 0-hosdu-client-1: (7606383): failed to get fd ctx. EBADFD
[2011-10-21 00:20:02.968983] W [client3_1-fops.c:3692:client3_1_flush] 0-hosdu-client-1: failed to send the fop: File descriptor in bad state
[2011-10-21 00:20:02.969165] W [client3_1-fops.c:3662:client3_1_flush] 0-hosdu-client-1: (7867713): failed to get fd ctx. EBADFD
[2011-10-21 00:20:02.969185] W [client3_1-fops.c:3692:client3_1_flush] 0-hosdu-client-1: failed to send the fop: File descriptor in bad state
[2011-10-21 00:20:02.969552] W [client3_1-fops.c:3662:client3_1_flush] 0-hosdu-client-1: (7606485): failed to get fd ctx. EBADFD
[2011-10-21 00:20:02.969574] W [client3_1-fops.c:3692:client3_1_flush] 0-hosdu-client-1: failed to send the fop: File descriptor in bad state
[2011-10-21 00:20:02.969806] W [client3_1-fops.c:3662:client3_1_flush] 0-hosdu-client-1: (7606523): failed to get fd ctx. EBADFD
[2011-10-21 00:20:02.969827] W [client3_1-fops.c:3692:client3_1_flush] 0-hosdu-client-1: failed to send the fop: File descriptor in bad state
[2011-10-21 00:20:02.969934] W [client3_1-fops.c:3662:client3_1_flush] 0-hosdu-client-1: (7869569): failed to get fd ctx. EBADFD
[2011-10-21 00:20:02.969954] W [client3_1-fops.c:3692:client3_1_flush] 0-hosdu-client-1: failed to send the fop: File descriptor in bad state
[2011-10-21 00:20:02.970083] W [client3_1-fops.c:3662:client3_1_flush] 0-hosdu-client-1: (7606553): failed to get fd ctx. EBADFD
[2011-10-21 00:20:02.970103] W [client3_1-fops.c:3692:client3_1_flush] 0-hosdu-client-1: failed to send the fop: File descriptor in bad state
[2011-10-21 00:20:02.970323] W [client3_1-fops.c:3662:client3_1_flush] 0-hosdu-client-1: (7606625): failed to get fd ctx. EBADFD
[2011-10-21 00:20:02.970343] W [client3_1-fops.c:3692:client3_1_flush] 0-hosdu-client-1: failed to send the fop: File descriptor in bad state
[2011-10-21 00:20:02.970756] W [client3_1-fops.c:3662:client3_1_flush] 0-hosdu-client-1: (7606677): failed to get fd ctx. EBADFD
[2011-10-21 00:20:02.970778] W [client3_1-fops.c:3692:client3_1_flush] 0-hosdu-client-1: failed to send the fop: File descriptor in bad state
[2011-10-21 00:20:02.971102] W [client3_1-fops.c:3662:client3_1_flush] 0-hosdu-client-1: (7607013): failed to get fd ctx. EBADFD
[2011-10-21 00:20:02.971122] W [client3_1-fops.c:3692:client3_1_flush] 0-hosdu-client-1: failed to send the fop: File descriptor in bad state
[2011-10-21 00:20:02.973831] W [client3_1-fops.c:3662:client3_1_flush] 0-hosdu-client-1: (7867053): failed to get fd ctx. EBADFD
[2011-10-21 00:20:02.973855] W [client3_1-fops.c:3692:client3_1_flush] 0-hosdu-client-1: failed to send the fop: File descriptor in bad state
[2011-10-21 00:20:02.973978] I [client3_1-fops.c:2364:client_fdctx_destroy] 0-hosdu-client-1: sending release on fd
[2011-10-21 00:20:02.974206] W [client3_1-fops.c:3662:client3_1_flush] 0-hosdu-client-1: (7866561): failed to get fd ctx. EBADFD
[2011-10-21 00:20:02.974231] W [client3_1-fops.c:3692:client3_1_flush] 0-hosdu-client-1: failed to send the fop: File descriptor in bad state
[2011-10-21 00:20:02.974350] W [client3_1-fops.c:3662:client3_1_flush] 0-hosdu-client-1: (7867271): failed to get fd ctx. EBADFD
[2011-10-21 00:20:02.974370] W [client3_1-fops.c:3692:client3_1_flush] 0-hosdu-client-1: failed to send the fop: File descriptor in bad state
[2011-10-21 00:20:02.974704] I [client3_1-fops.c:2364:client_fdctx_destroy] 0-hosdu-client-1: sending release on fd
[2011-10-21 00:20:02.975006] W [client3_1-fops.c:3662:client3_1_flush] 0-hosdu-client-1: (7866809): failed to get fd ctx. EBADFD
[2011-10-21 00:20:02.975034] W [client3_1-fops.c:3692:client3_1_flush] 0-hosdu-client-1: failed to send the fop: File descriptor in bad state
[2011-10-21 00:20:02.975090] I [client3_1-fops.c:2364:client_fdctx_destroy] 0-hosdu-client-1: sending release on fd
[2011-10-21 00:20:02.975158] W [client3_1-fops.c:3662:client3_1_flush] 0-hosdu-client-1: (7869623): failed to get fd ctx. EBADFD
[2011-10-21 00:20:02.975183] W [client3_1-fops.c:3692:client3_1_flush] 0-hosdu-client-1: failed to send the fop: File descriptor in bad state
[2011-10-21 00:20:02.975683] I [client3_1-fops.c:2364:client_fdctx_destroy] 0-hosdu-client-1: sending release on fd
[2011-10-21 00:20:02.976080] I [client3_1-fops.c:2364:client_fdctx_destroy] 0-hosdu-client-1: sending release on fd
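
Because the log repeats the same few messages many times, grouping them can make the pattern easier to read. The following is a minimal sketch that assumes every line follows the layout seen above (`[timestamp] LEVEL [file:line:function] translator: message`); the names `LOG_LINE` and `summarize` are made up for this illustration and are not GlusterFS tooling.

```python
import re
from collections import Counter

# Assumed layout, taken from the excerpt above:
#   [timestamp] LEVEL [file:line:function] translator: message
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) "
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\] "
    r"(?P<xlator>\S+): (?P<msg>.*)$"
)


def summarize(lines):
    """Count log entries grouped by (level, function, message)."""
    counts = Counter()
    for raw in lines:
        m = LOG_LINE.match(raw.strip())
        if m is None:
            continue  # skip blank lines and anything in another format
        # Drop a leading numeric id such as "(7869409): " so that
        # otherwise identical warnings fall into the same group.
        msg = re.sub(r"^\(\d+\): ", "", m.group("msg"))
        counts[(m.group("level"), m.group("func"), msg)] += 1
    return counts


# Four lines copied from the client log above, used as sample input.
sample = [
    "[2011-10-21 00:20:02.968436] W [client3_1-fops.c:3692:client3_1_flush] 0-hosdu-client-1: failed to send the fop: File descriptor in bad state",
    "[2011-10-21 00:20:02.968551] W [client3_1-fops.c:3662:client3_1_flush] 0-hosdu-client-1: (7869409): failed to get fd ctx. EBADFD",
    "[2011-10-21 00:20:02.968571] W [client3_1-fops.c:3692:client3_1_flush] 0-hosdu-client-1: failed to send the fop: File descriptor in bad state",
    "[2011-10-21 00:20:02.973978] I [client3_1-fops.c:2364:client_fdctx_destroy] 0-hosdu-client-1: sending release on fd",
]

for (level, func, msg), count in summarize(sample).most_common():
    print(count, level, func, msg)
```

To run it against the attached client log, pass the open file object to `summarize` instead of `sample`.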

I have attached the client log.
Comment 1 Amar Tumballi 2011-10-28 05:51:02 EDT
This too is linked to the 'ping timeout' issue (described in bug 765486). Reducing the number of dbench threads from 50 to 20 made the errors go away.
Comment 2 Amar Tumballi 2012-09-04 05:30:18 EDT
Need to check whether this still happens with the latest codebase.
Comment 3 Amar Tumballi 2012-10-11 06:06:05 EDT
Need one more check; it has been more than a year since this...