Bug 803201

Summary: [fa5b0347193f8d1a4b917a2edb338423cb175e66] Dbench exits with EBADF during graph change
Product: [Community] GlusterFS
Reporter: Anush Shetty <ashetty>
Component: fuse
Assignee: Raghavendra G <rgowdapp>
Status: CLOSED WORKSFORME
Severity: unspecified
Priority: unspecified
Version: mainline
CC: gluster-bugs, shwetha.h.panduranga
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2012-04-03 07:40:06 UTC
Mount Type: fuse

Description Anush Shetty 2012-03-14 08:03:40 UTC
Description of problem: On a 2-replica volume, triggering a graph change with gluster volume set while dbench was running on the FUSE client caused dbench to exit with EBADF.


Version-Release number of selected component (if applicable): upstream


How reproducible: Consistently


Steps to Reproduce:
1. dbench -s 10 -D /mnt/gluster
2. gluster volume set <VOLNAME> performance.write-behind off; sleep 1; gluster volume set <VOLNAME> performance.write-behind on
  
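The toggle in step 2 can be scripted so that it runs repeatedly while dbench is active. The following is a minimal sketch, not the reporter's actual test harness: the helper names `toggle_commands` and `run_toggles` are hypothetical, and the volume name `test2` is only an assumption inferred from the `test2-replicate-0` prefix in the client log. It defaults to a dry run that prints the commands instead of executing them.

```python
import subprocess
import time


def toggle_commands(volume, option="performance.write-behind", rounds=1):
    """Return the `gluster volume set` argv lists for `rounds` off/on toggles."""
    cmds = []
    for _ in range(rounds):
        for value in ("off", "on"):
            cmds.append(["gluster", "volume", "set", volume, option, value])
    return cmds


def run_toggles(volume, rounds=1, pause=1.0, dry_run=True):
    """Print the toggle commands (dry run) or execute them, pausing after each."""
    for argv in toggle_commands(volume, rounds=rounds):
        if dry_run:
            print(" ".join(argv))
        else:
            # Each `volume set` triggers a graph change on the client.
            subprocess.run(argv, check=True)
            time.sleep(pause)


if __name__ == "__main__":
    # "test2" is an assumed volume name; replace it with the real one.
    run_toggles("test2", rounds=1, dry_run=True)
```

Start dbench on the mount first (step 1), then call `run_toggles(..., dry_run=False)` from a second shell on a node where the gluster CLI is available.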
Actual results:

  10       281     2.57 MB/sec  warmup  41 sec  latency 824.678 ms
  10       285     2.52 MB/sec  warmup  42 sec  latency 972.568 ms
  10       291     2.51 MB/sec  warmup  43 sec  latency 610.378 ms
  10       297     2.49 MB/sec  warmup  44 sec  latency 499.366 ms
  10       304     2.48 MB/sec  warmup  45 sec  latency 484.016 ms
  10       309     2.46 MB/sec  warmup  46 sec  latency 709.612 ms
[319] write failed on handle 9973 (Bad file descriptor)
Child failed with status 1
dbench version 4.00 - Copyright Andrew Tridgell 1999-2004


Expected results:

Dbench process should continue without errors


Additional info:

Client log-

[2012-03-14 13:21:10.038989] W [fuse-bridge.c:3561:fuse_migrate_fd] 0-glusterfs-fuse: name-less lookup of gfid (7aa0ce78-3ff5-4acf-8d50-bef919c29ec8) failed (Input/output error)
[2012-03-14 13:21:10.095234] D [afr-common.c:129:afr_lookup_xattr_req_prepare] 2-test2-replicate-0: : failed to get the gfid from dict
[2012-03-14 13:21:10.095782] D [afr-self-heal-common.c:148:afr_sh_print_pending_matrix] 2-test2-replicate-0: pending_matrix: [ 0 0 ]
[2012-03-14 13:21:10.095814] D [afr-self-heal-common.c:148:afr_sh_print_pending_matrix] 2-test2-replicate-0: pending_matrix: [ 0 0 ]
[2012-03-14 13:21:10.095835] D [afr-self-heal-common.c:753:afr_mark_sources] 2-test2-replicate-0: Number of sources: 0
[2012-03-14 13:21:10.095855] D [afr-self-heal-data.c:799:afr_lookup_select_read_child_by_txn_type] 2-test2-replicate-0: returning read_child: 1


[2012-03-14 13:21:10.126123] D [afr-common.c:129:afr_lookup_xattr_req_prepare] 2-test2-replicate-0: <gfid:00000000-0000-0000-0000-000000000000>: failed to get the gfid from dict
[2012-03-14 13:21:10.126390] D [afr-common.c:129:afr_lookup_xattr_req_prepare] 2-test2-replicate-0: /clients: failed to get the gfid from dict
[2012-03-14 13:21:10.126745] D [afr-lk-common.c:1427:afr_nonblocking_inodelk] 2-test2-replicate-0: attempting data lock range 0 0 by 8415233a7f7f0000
[2012-03-14 13:21:10.126973] D [afr-common.c:129:afr_lookup_xattr_req_prepare] 2-test2-replicate-0: /clients: failed to get the gfid from dict
[2012-03-14 13:21:10.127170] D [afr-lk-common.c:1427:afr_nonblocking_inodelk] 2-test2-replicate-0: attempting data lock range 0 0 by 7c87233a7f7f0000
[2012-03-14 13:21:10.127382] D [afr-common.c:129:afr_lookup_xattr_req_prepare] 2-test2-replicate-0: /clients: failed to get the gfid from dict
[2012-03-14 13:21:10.127611] D [afr-lk-common.c:1427:afr_nonblocking_inodelk] 2-test2-replicate-0: attempting data lock range 196608 63488 by 9073243a7f7f0000
[2012-03-14 13:21:10.127718] W [fuse-resolve.c:346:fuse_resolve_fd] 0-fuse-resolve: migration of fd (0x7f7f300282a8) did not complete, failing fop with EBADF
[2012-03-14 13:21:10.127986] D [afr-lk-common.c:1427:afr_nonblocking_inodelk] 2-test2-replicate-0: attempting data lock range 131072 65536 by c85e233a7f7f0000
[2012-03-14 13:21:10.128183] D [afr-common.c:129:afr_lookup_xattr_req_prepare] 2-test2-replicate-0: /clients: failed to get the gfid from dict
[2012-03-14 13:21:10.128501] W [fuse-resolve.c:346:fuse_resolve_fd] 0-fuse-resolve: migration of fd (0x7f7f300282a8) did not complete, failing fop with EBADF

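At debug log level the two warnings that matter are easy to miss among the afr messages. As a rough aid for triage, the sketch below filters a client log for the fuse-resolve "migration of fd ... did not complete" warnings and returns the timestamp and fd pointer of each. The function name `failed_fd_migrations` and the regular expressions are assumptions based only on the log lines quoted above, not on any documented GlusterFS log format.

```python
import re

# Pattern for one log line, modelled on the lines quoted in this report:
# [timestamp] LEVEL [file.c:line:function] xlator-name: message
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) "
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\] "
    r"(?P<xlator>\S+): (?P<msg>.*)$"
)

# Message emitted by fuse_resolve_fd when an fd was not migrated to the new graph.
MIGRATION_FAILED = re.compile(r"migration of fd \((0x[0-9a-fA-F]+)\) did not complete")


def failed_fd_migrations(lines):
    """Return (timestamp, fd_pointer) for each failed fd-migration warning."""
    hits = []
    for raw in lines:
        m = LOG_LINE.match(raw.strip())
        if not m or m.group("level") != "W":
            continue
        fd = MIGRATION_FAILED.search(m.group("msg"))
        if fd:
            hits.append((m.group("ts"), fd.group(1)))
    return hits


if __name__ == "__main__":
    sample = [
        "[2012-03-14 13:21:10.127718] W [fuse-resolve.c:346:fuse_resolve_fd] "
        "0-fuse-resolve: migration of fd (0x7f7f300282a8) did not complete, "
        "failing fop with EBADF",
        "[2012-03-14 13:21:10.128183] D [afr-common.c:129:afr_lookup_xattr_req_prepare] "
        "2-test2-replicate-0: /clients: failed to get the gfid from dict",
    ]
    for ts, fd in failed_fd_migrations(sample):
        print(ts, fd)
```

In the log above both warnings name the same fd pointer (0x7f7f300282a8), which suggests that a single fd failed to migrate and that later operations on it were rejected with EBADF, the error dbench reports as "Bad file descriptor".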
Comment 1 Amar Tumballi 2012-03-25 07:14:18 UTC
Check if it's already fixed.

Comment 2 Anush Shetty 2012-03-26 04:58:52 UTC
This issue still exists in the mainline.

Comment 3 Raghavendra G 2012-04-03 03:27:50 UTC
*** Bug 808054 has been marked as a duplicate of this bug. ***

Comment 4 Raghavendra G 2012-04-03 03:30:50 UTC
*** Bug 802233 has been marked as a duplicate of this bug. ***

Comment 5 Raghavendra G 2012-04-03 03:32:42 UTC
*** Bug 803328 has been marked as a duplicate of this bug. ***

Comment 6 Raghavendra G 2012-04-03 06:52:04 UTC
The three bugs marked above as duplicates of this bug actually occur on a distribute volume. Please ignore those comments.

Comment 7 Raghavendra G 2012-04-03 07:40:06 UTC
On commit e5b5bb4de46a2a37c8ff392c45, dbench runs fine in parallel with a
sequence of volume set commands. Please confirm, and update the status of this
bug if the issue is still seen.