Bug 764477 (GLUSTER-2745)

Summary: failure to detect split brain
Product: [Community] GlusterFS Reporter: kakogianni
Component: replicateAssignee: Pranith Kumar K <pkarampu>
Status: CLOSED CURRENTRELEASE QA Contact:
Severity: medium Docs Contact:
Priority: medium    
Version: 3.1.3CC: gluster-bugs, jdarcy
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:

Description kakogianni 2011-04-13 11:42:14 UTC
Reproduce:
In a replicate setup with two nodes
1. first node goes down
2. create a new file
3. first node comes up. 
4. second node goes down before self-heal is completed
5. edit the new file
6. second node comes back
7. access file

glusterfs does not prevent accessing the file. Self-heal messages are loggged with each access:
[2011-04-13 11:26:24.596470] I [afr-self-heal-common.c:1526:afr_self_heal_completion_cbk] replicate-0: background  data self-heal completed on /split

The file's contents are different in each subvolume and it may happen that two clients view different contents

Comment 1 kakogianni 2011-04-13 12:51:20 UTC
In addition if a directory ends up in an inconsistent state, the following messages appear:

[2011-04-13 12:28:52.260565] I [afr-dir-read.c:171:afr_examine_dir_readdir_cbk] replicate-0:  entry self-heal triggered. path: /foo, reason: checksums of directory differ, forced merge option set
[2011-04-13 12:28:52.261138] I [afr-self-heal-common.c:1526:afr_self_heal_completion_cbk] replicate-2: background  entry self-heal completed on /foo
[2011-04-13 12:28:52.261243] E [afr-common.c:110:afr_set_split_brain] replicate-1: invalid argument: inode

No "possible split-brain" message appears

Comment 2 Pranith Kumar K 2011-04-13 13:45:38 UTC
(In reply to comment #1)
> In addition if a directory ends up in an inconsistent state, the following
> messages appear:
> 
> [2011-04-13 12:28:52.260565] I [afr-dir-read.c:171:afr_examine_dir_readdir_cbk]
> replicate-0:  entry self-heal triggered. path: /foo, reason: checksums of
> directory differ, forced merge option set
> [2011-04-13 12:28:52.261138] I
> [afr-self-heal-common.c:1526:afr_self_heal_completion_cbk] replicate-2:
> background  entry self-heal completed on /foo
> [2011-04-13 12:28:52.261243] E [afr-common.c:110:afr_set_split_brain]
> replicate-1: invalid argument: inode
> 
> No "possible split-brain" message appears

Could you paste the outputs of the following command for the file on both the replicas.
'sudo getfattr -d -m "trusted.*" pranith'

Comment 3 Pranith Kumar K 2011-04-13 13:57:01 UTC
(In reply to comment #2)
> (In reply to comment #1)
> > In addition if a directory ends up in an inconsistent state, the following
> > messages appear:
> > 
> > [2011-04-13 12:28:52.260565] I [afr-dir-read.c:171:afr_examine_dir_readdir_cbk]
> > replicate-0:  entry self-heal triggered. path: /foo, reason: checksums of
> > directory differ, forced merge option set
> > [2011-04-13 12:28:52.261138] I
> > [afr-self-heal-common.c:1526:afr_self_heal_completion_cbk] replicate-2:
> > background  entry self-heal completed on /foo
> > [2011-04-13 12:28:52.261243] E [afr-common.c:110:afr_set_split_brain]
> > replicate-1: invalid argument: inode
> > 
> > No "possible split-brain" message appears
> 
> Could you paste the outputs of the following command for the file on both the
> replicas.
> 'sudo getfattr -d -m "trusted.*" pranith'

please substitute the filename in-place of "pranith" in the command above.
'sudo getfattr -d -m "trusted.*" filename'

Pranith

Comment 4 kakogianni 2011-04-14 06:09:56 UTC
The outputs for the text file are:
# file: mnt/brick1/storage/test1
trusted.afr.client-2=0sAAAAAwAAAAAAAAAA
trusted.afr.client-3=0sAAAAAAAAAAAAAAAA
trusted.gfid=0sEqcf/8hzR5Si3GV3jc0geg==

# file: mnt/brick1/storage/test1
trusted.afr.client-2=0sAAAAAAAAAAAAAAAA
trusted.afr.client-3=0sAAAAAQAAAAAAAAAA
trusted.gfid=0sUB1aiV9KSBW+lh25fwKOWA==

The outputs for the directory:
# file: mnt/brick1/storage/testdir
trusted.afr.client-2=0sAAAAAAAAAAEAAAAB
trusted.afr.client-3=0sAAAAAAAAAAAAAAAA
trusted.gfid=0sAGjIRPFlSwOuGBwgojM+mA==
trusted.glusterfs.dht=0sAAAAAQAAAAAAAAAAVVVVVA==

# file: mnt/brick1/storage/testdir
trusted.afr.client-2=0sAAAAAAAAAAAAAAAA
trusted.afr.client-3=0sAAAAAAAAAAEAAAAB
trusted.gfid=0smcO8OOF/TZKtY3uc/yGJJw==
trusted.glusterfs.dht=0sAAAAAQAAAAAAAAAAVVVVVA==

Comment 5 Pranith Kumar K 2011-04-15 09:48:27 UTC
It is the issue with gfid rename, when one of the nodes is down. Fixed as part of 2500
Excerpt from the bug 764232:
Gfid is equivalent to inode in the glusterfs cluster.
This problem happens because most of the editors create a temporary file (which
has different gfid) and rename the file to the original file on save+exit. This
results in a scenario where same filepath on the replicas end-up with different
gfids on each of the replicas when one of the nodes in replica is offline.

Pranith

*** This bug has been marked as a duplicate of bug 2500 ***

Comment 6 kakogianni 2011-04-18 14:10:10 UTC
Thank you for your quick reply on this. However, we believe that this is in fact a different bug. In our case, the file is never edited with a text editor or renamed. It's contents are simply modified using echo. Below is the trace log printed when the file is accessed:

[2011-04-18 16:48:33.863852] D [afr-common.c:561:afr_lookup_collect_xattr] 0-replicate-1: entry self-heal is pending for /.
[2011-04-18 16:48:33.863895] D [afr-common.c:561:afr_lookup_collect_xattr] 0-replicate-1: entry self-heal is pending for /.
[2011-04-18 16:48:33.863908] I [afr-common.c:716:afr_lookup_done] 0-replicate-1: background  entry self-heal triggered. path: /
[2011-04-18 16:48:33.864188] D [afr-self-heal-entry.c:2258:afr_sh_post_nonblocking_entry_cbk] 0-replicate-1: Non Blocking entrylks done. Proceeding to FOP
[2011-04-18 16:48:33.864648] D [afr-self-heal-entry.c:2099:afr_sh_entry_sync_prepare] 0-replicate-1: no active sources for / found. merging all entries as a conservative decision
[2011-04-18 16:48:33.882015] D [afr-lk-common.c:415:transaction_lk_op] 0-replicate-1: lk op is for a self heal
[2011-04-18 16:48:33.882309] I [afr-self-heal-common.c:1527:afr_self_heal_completion_cbk] 0-replicate-1: background  entry self-heal completed on /
[2011-04-18 16:48:33.882805] D [client3_1-fops.c:1937:client3_1_lookup_cbk] 0-client-2: gfid changed for /test
[2011-04-18 16:48:33.882851] D [afr-common.c:567:afr_lookup_collect_xattr] 0-replicate-1: data self-heal is pending for /test.
[2011-04-18 16:48:33.882867] I [afr-common.c:716:afr_lookup_done] 0-replicate-1: background  data self-heal triggered. path: /test
[2011-04-18 16:48:33.883185] D [afr-lk-common.c:410:transaction_lk_op] 0-replicate-1: lk op is for a transaction
[2011-04-18 16:48:33.883212] D [client3_1-fops.c:3283:client3_1_flush] 0-client-2: (52409692): failed to get fd ctx. EBADFD
[2011-04-18 16:48:33.883224] D [afr-self-heal-data.c:107:afr_sh_data_flush_cbk] 0-replicate-1: flush or setattr failed on /test on subvolume client-2: File descriptor in bad state
[2011-04-18 16:48:33.883482] D [afr-self-heal-data.c:107:afr_sh_data_flush_cbk] 0-replicate-1: flush or setattr failed on /test on subvolume client-2: No such file or directory
[2011-04-18 16:48:33.883508] D [client-lk.c:441:delete_granted_locks_fd] 0-client-3: Number of locks cleared=0
[2011-04-18 16:48:33.883543] I [afr-self-heal-common.c:1527:afr_self_heal_completion_cbk] 0-replicate-1: background  data self-heal completed on /test
[2011-04-18 16:48:33.883979] D [client3_1-fops.c:3411:client3_1_fstat] 0-client-2: (52409692): failed to get fd ctx. EBADFD
[2011-04-18 16:48:33.884917] D [afr-lk-common.c:410:transaction_lk_op] 0-replicate-1: lk op is for a transaction
[2011-04-18 16:48:33.885156] D [afr-transaction.c:973:afr_post_nonblocking_inodelk_cbk] 0-replicate-1: Non blocking inodelks failed. Proceeding to blocking
[2011-04-18 16:48:33.885384] D [afr-lk-common.c:987:afr_lock_blocking] 0-replicate-1: we're done locking
[2011-04-18 16:48:33.885405] D [afr-transaction.c:953:afr_post_blocking_inodelk_cbk] 0-replicate-1: Blocking inodelks done. Proceeding to FOP
[2011-04-18 16:48:33.885420] D [client3_1-fops.c:3283:client3_1_flush] 0-client-2: (52409692): failed to get fd ctx. EBADFD
[2011-04-18 16:48:33.885671] D [client3_1-fops.c:641:client3_1_flush_cbk] 0-client-3: Attempting to delete locks of owner=4090126825326001386
[2011-04-18 16:48:33.885695] D [client-lk.c:407:delete_granted_locks_owner] 0-client-3: Number of locks cleared=0
[2011-04-18 16:48:33.885712] D [afr-lk-common.c:410:transaction_lk_op] 0-replicate-1: lk op is for a transaction
[2011-04-18 16:48:33.885964] D [client-lk.c:441:delete_granted_locks_fd] 0-client-3: Number of locks cleared=0
[2011-04-18 16:49:16.428630] D [afr-common.c:567:afr_lookup_collect_xattr] 0-replicate-1: data self-heal is pending for /test.
[2011-04-18 16:49:16.428694] D [client3_1-fops.c:1937:client3_1_lookup_cbk] 0-client-2: gfid changed for /test
[2011-04-18 16:49:16.428708] I [afr-common.c:716:afr_lookup_done] 0-replicate-1: background  data self-heal triggered. path: /test
[2011-04-18 16:49:16.429031] D [afr-lk-common.c:410:transaction_lk_op] 0-replicate-1: lk op is for a transaction
[2011-04-18 16:49:16.429058] D [client3_1-fops.c:3283:client3_1_flush] 0-client-2: (52409692): failed to get fd ctx. EBADFD
[2011-04-18 16:49:16.429071] D [afr-self-heal-data.c:107:afr_sh_data_flush_cbk] 0-replicate-1: flush or setattr failed on /test on subvolume client-2: File descriptor in bad state
[2011-04-18 16:49:16.429324] D [afr-self-heal-data.c:107:afr_sh_data_flush_cbk] 0-replicate-1: flush or setattr failed on /test on subvolume client-2: No such file or directory
[2011-04-18 16:49:16.429351] D [client-lk.c:441:delete_granted_locks_fd] 0-client-3: Number of locks cleared=0
[2011-04-18 16:49:16.429392] I [afr-self-heal-common.c:1527:afr_self_heal_completion_cbk] 0-replicate-1: background  data self-heal completed on /test
[2011-04-18 16:49:16.429858] D [client3_1-fops.c:3411:client3_1_fstat] 0-client-2: (52409692): failed to get fd ctx. EBADFD
[2011-04-18 16:49:16.430191] D [client3_1-fops.c:3411:client3_1_fstat] 0-client-2: (52409692): failed to get fd ctx. EBADFD
[2011-04-18 16:49:16.431072] D [afr-lk-common.c:410:transaction_lk_op] 0-replicate-1: lk op is for a transaction
[2011-04-18 16:49:16.431348] D [afr-transaction.c:973:afr_post_nonblocking_inodelk_cbk] 0-replicate-1: Non blocking inodelks failed. Proceeding to blocking
[2011-04-18 16:49:16.431599] D [afr-lk-common.c:987:afr_lock_blocking] 0-replicate-1: we're done locking
[2011-04-18 16:49:16.431617] D [afr-transaction.c:953:afr_post_blocking_inodelk_cbk] 0-replicate-1: Blocking inodelks done. Proceeding to FOP
[2011-04-18 16:49:16.431628] D [client3_1-fops.c:3283:client3_1_flush] 0-client-2: (52409692): failed to get fd ctx. EBADFD
[2011-04-18 16:49:16.431893] D [client3_1-fops.c:641:client3_1_flush_cbk] 0-client-3: Attempting to delete locks of owner=4090126825326001386
[2011-04-18 16:49:16.431919] D [client-lk.c:407:delete_granted_locks_owner] 0-client-3: Number of locks cleared=0
[2011-04-18 16:49:16.431936] D [afr-lk-common.c:410:transaction_lk_op] 0-replicate-1: lk op is for a transaction
[2011-04-18 16:49:16.432168] D [client-lk.c:441:delete_granted_locks_fd] 0-client-3: Number of locks cleared=0
[2011-04-18 16:49:16.428630] D [afr-common.c:567:afr_lookup_collect_xattr] 0-replicate-1: data self-heal is pending for /test.
[2011-04-18 16:49:16.428694] D [client3_1-fops.c:1937:client3_1_lookup_cbk] 0-client-2: gfid changed for /test
[2011-04-18 16:49:16.428708] I [afr-common.c:716:afr_lookup_done] 0-replicate-1: background  data self-heal triggered. path: /test
[2011-04-18 16:49:16.429031] D [afr-lk-common.c:410:transaction_lk_op] 0-replicate-1: lk op is for a transaction
[2011-04-18 16:49:16.429058] D [client3_1-fops.c:3283:client3_1_flush] 0-client-2: (52409692): failed to get fd ctx. EBADFD
[2011-04-18 16:49:16.429071] D [afr-self-heal-data.c:107:afr_sh_data_flush_cbk] 0-replicate-1: flush or setattr failed on /test on subvolume client-2: File descriptor in bad state
[2011-04-18 16:49:16.429324] D [afr-self-heal-data.c:107:afr_sh_data_flush_cbk] 0-replicate-1: flush or setattr failed on /test on subvolume client-2: No such file or directory
[2011-04-18 16:49:16.429351] D [client-lk.c:441:delete_granted_locks_fd] 0-client-3: Number of locks cleared=0
[2011-04-18 16:49:16.429392] I [afr-self-heal-common.c:1527:afr_self_heal_completion_cbk] 0-replicate-1: background  data self-heal completed on /test
[2011-04-18 16:49:16.429858] D [client3_1-fops.c:3411:client3_1_fstat] 0-client-2: (52409692): failed to get fd ctx. EBADFD
[2011-04-18 16:49:16.430191] D [client3_1-fops.c:3411:client3_1_fstat] 0-client-2: (52409692): failed to get fd ctx. EBADFD
[2011-04-18 16:49:16.431072] D [afr-lk-common.c:410:transaction_lk_op] 0-replicate-1: lk op is for a transaction
[2011-04-18 16:49:16.431348] D [afr-transaction.c:973:afr_post_nonblocking_inodelk_cbk] 0-replicate-1: Non blocking inodelks failed. Proceeding to blocking
[2011-04-18 16:49:16.431599] D [afr-lk-common.c:987:afr_lock_blocking] 0-replicate-1: we're done locking
[2011-04-18 16:49:16.431617] D [afr-transaction.c:953:afr_post_blocking_inodelk_cbk] 0-replicate-1: Blocking inodelks done. Proceeding to FOP
[2011-04-18 16:49:16.431628] D [client3_1-fops.c:3283:client3_1_flush] 0-client-2: (52409692): failed to get fd ctx. EBADFD
[2011-04-18 16:49:16.431893] D [client3_1-fops.c:641:client3_1_flush_cbk] 0-client-3: Attempting to delete locks of owner=4090126825326001386
[2011-04-18 16:49:16.431919] D [client-lk.c:407:delete_granted_locks_owner] 0-client-3: Number of locks cleared=0
[2011-04-18 16:49:16.431936] D [afr-lk-common.c:410:transaction_lk_op] 0-replicate-1: lk op is for a transaction
[2011-04-18 16:49:16.432168] D [client-lk.c:441:delete_granted_locks_fd] 0-client-3: Number of locks cleared=0
[2011-04-18 16:49:40.160199] D [afr-common.c:567:afr_lookup_collect_xattr] 0-replicate-1: data self-heal is pending for /test.
[2011-04-18 16:49:40.160244] D [client3_1-fops.c:1937:client3_1_lookup_cbk] 0-client-2: gfid changed for /test
[2011-04-18 16:49:40.160258] I [afr-common.c:716:afr_lookup_done] 0-replicate-1: background  data self-heal triggered. path: /test
[2011-04-18 16:49:40.160585] D [afr-lk-common.c:410:transaction_lk_op] 0-replicate-1: lk op is for a transaction
[2011-04-18 16:49:40.160605] D [client3_1-fops.c:3283:client3_1_flush] 0-client-2: (52409692): failed to get fd ctx. EBADFD
[2011-04-18 16:49:40.160618] D [afr-self-heal-data.c:107:afr_sh_data_flush_cbk] 0-replicate-1: flush or setattr failed on /test on subvolume client-2: File descriptor in bad state
[2011-04-18 16:49:40.160902] D [afr-self-heal-data.c:107:afr_sh_data_flush_cbk] 0-replicate-1: flush or setattr failed on /test on subvolume client-2: No such file or directory
[2011-04-18 16:49:40.160929] D [client-lk.c:441:delete_granted_locks_fd] 0-client-3: Number of locks cleared=0
[2011-04-18 16:49:40.160966] I [afr-self-heal-common.c:1527:afr_self_heal_completion_cbk] 0-replicate-1: background  data self-heal completed on /test
[2011-04-18 16:49:40.161403] D [client3_1-fops.c:3411:client3_1_fstat] 0-client-2: (52409692): failed to get fd ctx. EBADFD
[2011-04-18 16:49:40.161670] D [client3_1-fops.c:3411:client3_1_fstat] 0-client-2: (52409692): failed to get fd ctx. EBADFD
[2011-04-18 16:49:40.162460] D [afr-lk-common.c:410:transaction_lk_op] 0-replicate-1: lk op is for a transaction
[2011-04-18 16:49:40.162683] D [afr-transaction.c:973:afr_post_nonblocking_inodelk_cbk] 0-replicate-1: Non blocking inodelks failed. Proceeding to blocking
[2011-04-18 16:49:40.162933] D [afr-lk-common.c:987:afr_lock_blocking] 0-replicate-1: we're done locking
[2011-04-18 16:49:40.162957] D [afr-transaction.c:953:afr_post_blocking_inodelk_cbk] 0-replicate-1: Blocking inodelks done. Proceeding to FOP
[2011-04-18 16:49:40.162975] D [client3_1-fops.c:3283:client3_1_flush] 0-client-2: (52409692): failed to get fd ctx. EBADFD
[2011-04-18 16:49:40.163204] D [client3_1-fops.c:641:client3_1_flush_cbk] 0-client-3: Attempting to delete locks of owner=4090126825326001386
[2011-04-18 16:49:40.163230] D [client-lk.c:407:delete_granted_locks_owner] 0-client-3: Number of locks cleared=0
[2011-04-18 16:49:40.163248] D [afr-lk-common.c:410:transaction_lk_op] 0-replicate-1: lk op is for a transaction
[2011-04-18 16:49:40.163456] D [client-lk.c:441:delete_granted_locks_fd] 0-client-3: Number of locks cleared=0

The patch posted as a fix to bug 764232 is about entry self-heal which never happens in our case.

Furthermore, our in our senario the file is not supposed to self-heal since there is a split-brain. The problem is that gluster fails to report the split-brain and disallow access to the file. 

We tested this with versions 3.1.2 and 3.1.3 and got the same behavior.

Comment 7 kakogianni 2011-04-18 14:14:11 UTC
I should also mention that restarting after a client is restarted the split-brain is detected.

Comment 8 Pranith Kumar K 2011-04-19 07:06:57 UTC
(In reply to comment #6)
> Thank you for your quick reply on this. However, we believe that this is in
> fact a different bug. In our case, the file is never edited with a text editor
> or renamed. It's contents are simply modified using echo. Below is the trace
> log printed when the file is accessed:
> 
> [2011-04-18 16:48:33.863852] D [afr-common.c:561:afr_lookup_collect_xattr]
> 0-replicate-1: entry self-heal is pending for /.
> [2011-04-18 16:48:33.863895] D [afr-common.c:561:afr_lookup_collect_xattr]
> 0-replicate-1: entry self-heal is pending for /.
> [2011-04-18 16:48:33.863908] I [afr-common.c:716:afr_lookup_done]
> 0-replicate-1: background  entry self-heal triggered. path: /
> [2011-04-18 16:48:33.864188] D
> [afr-self-heal-entry.c:2258:afr_sh_post_nonblocking_entry_cbk] 0-replicate-1:
> Non Blocking entrylks done. Proceeding to FOP
> [2011-04-18 16:48:33.864648] D
> [afr-self-heal-entry.c:2099:afr_sh_entry_sync_prepare] 0-replicate-1: no active
> sources for / found. merging all entries as a conservative decision
> [2011-04-18 16:48:33.882015] D [afr-lk-common.c:415:transaction_lk_op]
> 0-replicate-1: lk op is for a self heal
> [2011-04-18 16:48:33.882309] I
> [afr-self-heal-common.c:1527:afr_self_heal_completion_cbk] 0-replicate-1:
> background  entry self-heal completed on /
> [2011-04-18 16:48:33.882805] D [client3_1-fops.c:1937:client3_1_lookup_cbk]
> 0-client-2: gfid changed for /test
> [2011-04-18 16:48:33.882851] D [afr-common.c:567:afr_lookup_collect_xattr]
> 0-replicate-1: data self-heal is pending for /test.
> [2011-04-18 16:48:33.882867] I [afr-common.c:716:afr_lookup_done]
> 0-replicate-1: background  data self-heal triggered. path: /test
> [2011-04-18 16:48:33.883185] D [afr-lk-common.c:410:transaction_lk_op]
> 0-replicate-1: lk op is for a transaction
> [2011-04-18 16:48:33.883212] D [client3_1-fops.c:3283:client3_1_flush]
> 0-client-2: (52409692): failed to get fd ctx. EBADFD
> [2011-04-18 16:48:33.883224] D [afr-self-heal-data.c:107:afr_sh_data_flush_cbk]
> 0-replicate-1: flush or setattr failed on /test on subvolume client-2: File
> descriptor in bad state
> [2011-04-18 16:48:33.883482] D [afr-self-heal-data.c:107:afr_sh_data_flush_cbk]
> 0-replicate-1: flush or setattr failed on /test on subvolume client-2: No such
> file or directory
> [2011-04-18 16:48:33.883508] D [client-lk.c:441:delete_granted_locks_fd]
> 0-client-3: Number of locks cleared=0
> [2011-04-18 16:48:33.883543] I
> [afr-self-heal-common.c:1527:afr_self_heal_completion_cbk] 0-replicate-1:
> background  data self-heal completed on /test
> [2011-04-18 16:48:33.883979] D [client3_1-fops.c:3411:client3_1_fstat]
> 0-client-2: (52409692): failed to get fd ctx. EBADFD
> [2011-04-18 16:48:33.884917] D [afr-lk-common.c:410:transaction_lk_op]
> 0-replicate-1: lk op is for a transaction
> [2011-04-18 16:48:33.885156] D
> [afr-transaction.c:973:afr_post_nonblocking_inodelk_cbk] 0-replicate-1: Non
> blocking inodelks failed. Proceeding to blocking
> [2011-04-18 16:48:33.885384] D [afr-lk-common.c:987:afr_lock_blocking]
> 0-replicate-1: we're done locking
> [2011-04-18 16:48:33.885405] D
> [afr-transaction.c:953:afr_post_blocking_inodelk_cbk] 0-replicate-1: Blocking
> inodelks done. Proceeding to FOP
> [2011-04-18 16:48:33.885420] D [client3_1-fops.c:3283:client3_1_flush]
> 0-client-2: (52409692): failed to get fd ctx. EBADFD
> [2011-04-18 16:48:33.885671] D [client3_1-fops.c:641:client3_1_flush_cbk]
> 0-client-3: Attempting to delete locks of owner=4090126825326001386
> [2011-04-18 16:48:33.885695] D [client-lk.c:407:delete_granted_locks_owner]
> 0-client-3: Number of locks cleared=0
> [2011-04-18 16:48:33.885712] D [afr-lk-common.c:410:transaction_lk_op]
> 0-replicate-1: lk op is for a transaction
> [2011-04-18 16:48:33.885964] D [client-lk.c:441:delete_granted_locks_fd]
> 0-client-3: Number of locks cleared=0
> [2011-04-18 16:49:16.428630] D [afr-common.c:567:afr_lookup_collect_xattr]
> 0-replicate-1: data self-heal is pending for /test.
> [2011-04-18 16:49:16.428694] D [client3_1-fops.c:1937:client3_1_lookup_cbk]
> 0-client-2: gfid changed for /test
> [2011-04-18 16:49:16.428708] I [afr-common.c:716:afr_lookup_done]
> 0-replicate-1: background  data self-heal triggered. path: /test
> [2011-04-18 16:49:16.429031] D [afr-lk-common.c:410:transaction_lk_op]
> 0-replicate-1: lk op is for a transaction
> [2011-04-18 16:49:16.429058] D [client3_1-fops.c:3283:client3_1_flush]
> 0-client-2: (52409692): failed to get fd ctx. EBADFD
> [2011-04-18 16:49:16.429071] D [afr-self-heal-data.c:107:afr_sh_data_flush_cbk]
> 0-replicate-1: flush or setattr failed on /test on subvolume client-2: File
> descriptor in bad state
> [2011-04-18 16:49:16.429324] D [afr-self-heal-data.c:107:afr_sh_data_flush_cbk]
> 0-replicate-1: flush or setattr failed on /test on subvolume client-2: No such
> file or directory
> [2011-04-18 16:49:16.429351] D [client-lk.c:441:delete_granted_locks_fd]
> 0-client-3: Number of locks cleared=0
> [2011-04-18 16:49:16.429392] I
> [afr-self-heal-common.c:1527:afr_self_heal_completion_cbk] 0-replicate-1:
> background  data self-heal completed on /test
> [2011-04-18 16:49:16.429858] D [client3_1-fops.c:3411:client3_1_fstat]
> 0-client-2: (52409692): failed to get fd ctx. EBADFD
> [2011-04-18 16:49:16.430191] D [client3_1-fops.c:3411:client3_1_fstat]
> 0-client-2: (52409692): failed to get fd ctx. EBADFD
> [2011-04-18 16:49:16.431072] D [afr-lk-common.c:410:transaction_lk_op]
> 0-replicate-1: lk op is for a transaction
> [2011-04-18 16:49:16.431348] D
> [afr-transaction.c:973:afr_post_nonblocking_inodelk_cbk] 0-replicate-1: Non
> blocking inodelks failed. Proceeding to blocking
> [2011-04-18 16:49:16.431599] D [afr-lk-common.c:987:afr_lock_blocking]
> 0-replicate-1: we're done locking
> [2011-04-18 16:49:16.431617] D
> [afr-transaction.c:953:afr_post_blocking_inodelk_cbk] 0-replicate-1: Blocking
> inodelks done. Proceeding to FOP
> [2011-04-18 16:49:16.431628] D [client3_1-fops.c:3283:client3_1_flush]
> 0-client-2: (52409692): failed to get fd ctx. EBADFD
> [2011-04-18 16:49:16.431893] D [client3_1-fops.c:641:client3_1_flush_cbk]
> 0-client-3: Attempting to delete locks of owner=4090126825326001386
> [2011-04-18 16:49:16.431919] D [client-lk.c:407:delete_granted_locks_owner]
> 0-client-3: Number of locks cleared=0
> [2011-04-18 16:49:16.431936] D [afr-lk-common.c:410:transaction_lk_op]
> 0-replicate-1: lk op is for a transaction
> [2011-04-18 16:49:16.432168] D [client-lk.c:441:delete_granted_locks_fd]
> 0-client-3: Number of locks cleared=0
> [2011-04-18 16:49:16.428630] D [afr-common.c:567:afr_lookup_collect_xattr]
> 0-replicate-1: data self-heal is pending for /test.
> [2011-04-18 16:49:16.428694] D [client3_1-fops.c:1937:client3_1_lookup_cbk]
> 0-client-2: gfid changed for /test
> [2011-04-18 16:49:16.428708] I [afr-common.c:716:afr_lookup_done]
> 0-replicate-1: background  data self-heal triggered. path: /test
> [2011-04-18 16:49:16.429031] D [afr-lk-common.c:410:transaction_lk_op]
> 0-replicate-1: lk op is for a transaction
> [2011-04-18 16:49:16.429058] D [client3_1-fops.c:3283:client3_1_flush]
> 0-client-2: (52409692): failed to get fd ctx. EBADFD
> [2011-04-18 16:49:16.429071] D [afr-self-heal-data.c:107:afr_sh_data_flush_cbk]
> 0-replicate-1: flush or setattr failed on /test on subvolume client-2: File
> descriptor in bad state
> [2011-04-18 16:49:16.429324] D [afr-self-heal-data.c:107:afr_sh_data_flush_cbk]
> 0-replicate-1: flush or setattr failed on /test on subvolume client-2: No such
> file or directory
> [2011-04-18 16:49:16.429351] D [client-lk.c:441:delete_granted_locks_fd]
> 0-client-3: Number of locks cleared=0
> [2011-04-18 16:49:16.429392] I
> [afr-self-heal-common.c:1527:afr_self_heal_completion_cbk] 0-replicate-1:
> background  data self-heal completed on /test
> [2011-04-18 16:49:16.429858] D [client3_1-fops.c:3411:client3_1_fstat]
> 0-client-2: (52409692): failed to get fd ctx. EBADFD
> [2011-04-18 16:49:16.430191] D [client3_1-fops.c:3411:client3_1_fstat]
> 0-client-2: (52409692): failed to get fd ctx. EBADFD
> [2011-04-18 16:49:16.431072] D [afr-lk-common.c:410:transaction_lk_op]
> 0-replicate-1: lk op is for a transaction
> [2011-04-18 16:49:16.431348] D
> [afr-transaction.c:973:afr_post_nonblocking_inodelk_cbk] 0-replicate-1: Non
> blocking inodelks failed. Proceeding to blocking
> [2011-04-18 16:49:16.431599] D [afr-lk-common.c:987:afr_lock_blocking]
> 0-replicate-1: we're done locking
> [2011-04-18 16:49:16.431617] D
> [afr-transaction.c:953:afr_post_blocking_inodelk_cbk] 0-replicate-1: Blocking
> inodelks done. Proceeding to FOP
> [2011-04-18 16:49:16.431628] D [client3_1-fops.c:3283:client3_1_flush]
> 0-client-2: (52409692): failed to get fd ctx. EBADFD
> [2011-04-18 16:49:16.431893] D [client3_1-fops.c:641:client3_1_flush_cbk]
> 0-client-3: Attempting to delete locks of owner=4090126825326001386
> [2011-04-18 16:49:16.431919] D [client-lk.c:407:delete_granted_locks_owner]
> 0-client-3: Number of locks cleared=0
> [2011-04-18 16:49:16.431936] D [afr-lk-common.c:410:transaction_lk_op]
> 0-replicate-1: lk op is for a transaction
> [2011-04-18 16:49:16.432168] D [client-lk.c:441:delete_granted_locks_fd]
> 0-client-3: Number of locks cleared=0
> [2011-04-18 16:49:40.160199] D [afr-common.c:567:afr_lookup_collect_xattr]
> 0-replicate-1: data self-heal is pending for /test.
> [2011-04-18 16:49:40.160244] D [client3_1-fops.c:1937:client3_1_lookup_cbk]
> 0-client-2: gfid changed for /test
> [2011-04-18 16:49:40.160258] I [afr-common.c:716:afr_lookup_done]
> 0-replicate-1: background  data self-heal triggered. path: /test
> [2011-04-18 16:49:40.160585] D [afr-lk-common.c:410:transaction_lk_op]
> 0-replicate-1: lk op is for a transaction
> [2011-04-18 16:49:40.160605] D [client3_1-fops.c:3283:client3_1_flush]
> 0-client-2: (52409692): failed to get fd ctx. EBADFD
> [2011-04-18 16:49:40.160618] D [afr-self-heal-data.c:107:afr_sh_data_flush_cbk]
> 0-replicate-1: flush or setattr failed on /test on subvolume client-2: File
> descriptor in bad state
> [2011-04-18 16:49:40.160902] D [afr-self-heal-data.c:107:afr_sh_data_flush_cbk]
> 0-replicate-1: flush or setattr failed on /test on subvolume client-2: No such
> file or directory
> [2011-04-18 16:49:40.160929] D [client-lk.c:441:delete_granted_locks_fd]
> 0-client-3: Number of locks cleared=0
> [2011-04-18 16:49:40.160966] I
> [afr-self-heal-common.c:1527:afr_self_heal_completion_cbk] 0-replicate-1:
> background  data self-heal completed on /test
> [2011-04-18 16:49:40.161403] D [client3_1-fops.c:3411:client3_1_fstat]
> 0-client-2: (52409692): failed to get fd ctx. EBADFD
> [2011-04-18 16:49:40.161670] D [client3_1-fops.c:3411:client3_1_fstat]
> 0-client-2: (52409692): failed to get fd ctx. EBADFD
> [2011-04-18 16:49:40.162460] D [afr-lk-common.c:410:transaction_lk_op]
> 0-replicate-1: lk op is for a transaction
> [2011-04-18 16:49:40.162683] D
> [afr-transaction.c:973:afr_post_nonblocking_inodelk_cbk] 0-replicate-1: Non
> blocking inodelks failed. Proceeding to blocking
> [2011-04-18 16:49:40.162933] D [afr-lk-common.c:987:afr_lock_blocking]
> 0-replicate-1: we're done locking
> [2011-04-18 16:49:40.162957] D
> [afr-transaction.c:953:afr_post_blocking_inodelk_cbk] 0-replicate-1: Blocking
> inodelks done. Proceeding to FOP
> [2011-04-18 16:49:40.162975] D [client3_1-fops.c:3283:client3_1_flush]
> 0-client-2: (52409692): failed to get fd ctx. EBADFD
> [2011-04-18 16:49:40.163204] D [client3_1-fops.c:641:client3_1_flush_cbk]
> 0-client-3: Attempting to delete locks of owner=4090126825326001386
> [2011-04-18 16:49:40.163230] D [client-lk.c:407:delete_granted_locks_owner]
> 0-client-3: Number of locks cleared=0
> [2011-04-18 16:49:40.163248] D [afr-lk-common.c:410:transaction_lk_op]
> 0-replicate-1: lk op is for a transaction
> [2011-04-18 16:49:40.163456] D [client-lk.c:441:delete_granted_locks_fd]
> 0-client-3: Number of locks cleared=0
> 
> The patch posted as a fix to bug 764232 is about entry self-heal which never
> happens in our case.
> 
> Furthermore, our in our senario the file is not supposed to self-heal since
> there is a split-brain. The problem is that gluster fails to report the
> split-brain and disallow access to the file. 
> 
> We tested this with versions 3.1.2 and 3.1.3 and got the same behavior.

According to the logs: this seems like what happened,
1) glusterfs process refered by "client-3" in the logs went down
2) A file is created on the mount point with some data, so it is written to the brick refered by "client-2" in the logs.
3) client-3 is brought up. self heal started from client-3 to client-2
4) Before the self-heal is completed "client-2" went down.
5) According to the log below some operation is done which deleted the previous file and a new file with the same name and different gfid is created. 
> [2011-04-18 16:48:33.882805] D [client3_1-fops.c:1937:client3_1_lookup_cbk]
> 0-client-2: gfid changed for /test
5) client-2 is brought up, this tries to reopen the file but finds that this file is already deleted so it will forget about the oldfile with the same filename.
6) So all the file operations on this file are failing with EBADFD

Looks like you are able to reproduce the bug consistently.

Could you please list the exact commands you executed for each of these steps.
So that I can reproduce this in-house. Missing even one command may not result in hitting this issue.

Pranith

Comment 9 kakogianni 2011-04-19 13:36:41 UTC
Using this volume file:
+------------------------------------------------------------------------------+
  1: volume client-0
  2:    type protocol/client
  3:    option remote-host localhost
  4:    option remote-port 10000
  5:    option remote-subvolume brick0
  6:    option transport-type tcp
  7: end-volume
  8:
  9: volume client-1
 10:    type protocol/client
 11:    option remote-host localhost
 12:    option remote-port 10001
 13:    option remote-subvolume brick1
 14:    option transport-type tcp
 15: end-volume
 16:
 17: volume replicate-0
 18:    type cluster/replicate
 19:    option read-subvolume client-0
 20:    subvolumes  client-0 client-1
 21: end-volume
 22:
 23: volume dht
 24:    type cluster/distribute
 25:    subvolumes  replicate-0
 26: end-volume
 27:
 28: volume write-behind
 29:    type performance/write-behind
 30:    subvolumes dht
 31: end-volume
 32:
 33: volume read-ahead
 34:    type performance/read-ahead
 35:    subvolumes write-behind
 36: end-volume
 37:
 38: volume io-cache
 39:    type performance/io-cache
 40:    subvolumes read-ahead
 41: end-volume
 42:
 43: volume quick-read
 44:    type performance/quick-read
 45:    subvolumes io-cache
 46: end-volume
 47:
 48: volume io-stats
 49:    type debug/io-stats
 50:    subvolumes quick-read
 51: end-volume
 52:

+------------------------------------------------------------------------------+

kill brick1 server
# echo write0 > /storage/test
start brick1 server and kill brick0 server
# echo write1 > /storage/test
start brick0 server
# cat /storage/test
write0 

The trace log resulting from "cat" is:
[2011-04-19 16:17:52.130435] D [afr-common.c:561:afr_lookup_collect_xattr] 0-replicate-0: entry self-heal is pending for /.
[2011-04-19 16:17:52.130485] D [afr-common.c:561:afr_lookup_collect_xattr] 0-replicate-0: entry self-heal is pending for /.
[2011-04-19 16:17:52.130513] I [afr-common.c:716:afr_lookup_done] 0-replicate-0: background  entry self-heal triggered. path: /
[2011-04-19 16:17:52.130817] D [afr-self-heal-entry.c:2235:afr_sh_post_nonblocking_entry_cbk] 0-replicate-0: Non Blocking entrylks done. Proceeding to FOP
[2011-04-19 16:17:52.131108] D [afr-self-heal-entry.c:2076:afr_sh_entry_sync_prepare] 0-replicate-0: no active sources for / found. merging all entries as a conservative decision
[2011-04-19 16:17:52.133554] D [afr-lk-common.c:415:transaction_lk_op] 0-replicate-0: lk op is for a self heal
[2011-04-19 16:17:52.133792] I [afr-self-heal-common.c:1527:afr_self_heal_completion_cbk] 0-replicate-0: background  entry self-heal completed on /
[2011-04-19 16:17:52.134572] D [client3_1-fops.c:1937:client3_1_lookup_cbk] 0-client-0: gfid changed for /test
[2011-04-19 16:17:52.134606] D [afr-common.c:567:afr_lookup_collect_xattr] 0-replicate-0: data self-heal is pending for /test.
[2011-04-19 16:17:52.134620] I [afr-common.c:716:afr_lookup_done] 0-replicate-0: background  data self-heal triggered. path: /test
[2011-04-19 16:17:52.134916] D [afr-lk-common.c:410:transaction_lk_op] 0-replicate-0: lk op is for a transaction
[2011-04-19 16:17:52.134936] D [client3_1-fops.c:3283:client3_1_flush] 0-client-0: (9177739): failed to get fd ctx. EBADFD
[2011-04-19 16:17:52.134948] D [afr-self-heal-data.c:107:afr_sh_data_flush_cbk] 0-replicate-0: flush or setattr failed on /test on subvolume client-0: File descriptor in bad state
[2011-04-19 16:17:52.135165] D [afr-self-heal-data.c:107:afr_sh_data_flush_cbk] 0-replicate-0: flush or setattr failed on /test on subvolume client-0: No such file or directory
[2011-04-19 16:17:52.135184] D [client-lk.c:441:delete_granted_locks_fd] 0-client-1: Number of locks cleared=0
[2011-04-19 16:17:52.135210] I [afr-self-heal-common.c:1527:afr_self_heal_completion_cbk] 0-replicate-0: background  data self-heal completed on /test
[2011-04-19 16:17:52.135735] D [client3_1-fops.c:3411:client3_1_fstat] 0-client-0: (9177739): failed to get fd ctx. EBADFD
[2011-04-19 16:17:52.136580] D [afr-lk-common.c:410:transaction_lk_op] 0-replicate-0: lk op is for a transaction
[2011-04-19 16:17:52.136785] D [afr-transaction.c:973:afr_post_nonblocking_inodelk_cbk] 0-replicate-0: Non blocking inodelks failed. Proceeding to blocking
[2011-04-19 16:17:52.136977] D [afr-lk-common.c:987:afr_lock_blocking] 0-replicate-0: we're done locking
[2011-04-19 16:17:52.137001] D [afr-transaction.c:953:afr_post_blocking_inodelk_cbk] 0-replicate-0: Blocking inodelks done. Proceeding to FOP
[2011-04-19 16:17:52.137020] D [client3_1-fops.c:3283:client3_1_flush] 0-client-0: (9177739): failed to get fd ctx. EBADFD
[2011-04-19 16:17:52.137224] D [client3_1-fops.c:641:client3_1_flush_cbk] 0-client-1: Attempting to delete locks of owner=1143670214834557623
[2011-04-19 16:17:52.137243] D [client-lk.c:407:delete_granted_locks_owner] 0-client-1: Number of locks cleared=0
[2011-04-19 16:17:52.137256] D [afr-lk-common.c:410:transaction_lk_op] 0-replicate-0: lk op is for a transaction
[2011-04-19 16:17:52.137443] D [client-lk.c:441:delete_granted_locks_fd] 0-client-1: Number of locks cleared=0

/Eirini

Comment 10 Anand Avati 2011-07-17 11:45:21 UTC
PATCH: http://patches.gluster.com/patch/7906 in master (cluster/afr: make expunge/impunge re-usable)

Comment 11 Anand Avati 2011-07-17 11:45:28 UTC
PATCH: http://patches.gluster.com/patch/7907 in master (cluster/afr: Detect conflict/gfid self-heals)

Comment 12 Anand Avati 2011-07-17 11:45:34 UTC
PATCH: http://patches.gluster.com/patch/7908 in master (cluster/afr: Fix conflict files and gfid self-heal)

Comment 13 Anand Avati 2011-07-17 11:45:39 UTC
PATCH: http://patches.gluster.com/patch/7909 in master (storage/posix: Succeed lookup even if there is no gfid on the file)

Comment 14 Anand Avati 2011-07-17 11:45:46 UTC
PATCH: http://patches.gluster.com/patch/7910 in master (storage/posix: Remove the interim fix that handles the gfid race)