+++ This bug was initially created as a clone of Bug #1357000 +++

Description of problem:
=========================
there is a case where rename of a file leads to data loss.

Steps to Reproduce:
===================
1. create a 1x(2+1) volume with bricks as say db1, db2 and ab1
2. now mount the vol by fuse
3. create a directory say dir1
4. now bring down the first data brick (db1)
5. create a file say f1 under dir1 with some contents
6. note down the getfattr details from both db2 and ab1
7. now bring down db2 and bring up db1
8. trigger a heal
9. now rename f1 to f2
10. now bring up db2 and trigger a heal
11. from mount do a cat of f2

We get EIO

[root@dhcp42-93 db1_Down]# cat renamdatafile
cat: renamdatafile: Input/output error

client logs:
[2016-07-15 12:25:40.299090] W [MSGID: 108008] [afr-read-txn.c:244:afr_read_txn] 0-arbit-replicate-0: Unreadable subvolume -1 found with event generation 7 for gfid 091d29dd-f4e1-49da-8353-1686e59818de. (Possible split-brain)
[2016-07-15 12:25:40.301196] E [MSGID: 108008] [afr-read-txn.c:89:afr_read_txn_refresh_done] 0-arbit-replicate-0: Failing FGETXATTR on gfid 091d29dd-f4e1-49da-8353-1686e59818de: split-brain observed. [Input/output error]
[2016-07-15 12:25:40.302017] W [MSGID: 108027] [afr-common.c:2245:afr_discover_done] 0-arbit-replicate-0: no read subvols for (null)
[2016-07-15 12:25:40.305693] W [fuse-bridge.c:2228:fuse_readv_cbk] 0-glusterfs-fuse: 796: READ => -1 gfid=091d29dd-f4e1-49da-8353-1686e59818de fd=0x7fcbf801579c (Input/output error)
[2016-07-15 12:25:40.303768] W [MSGID: 108008] [afr-read-txn.c:244:afr_read_txn] 0-arbit-replicate-0: Unreadable subvolume -1 found with event generation 7 for gfid 091d29dd-f4e1-49da-8353-1686e59818de. (Possible split-brain)
[2016-07-15 12:25:40.305666] E [MSGID: 108008] [afr-read-txn.c:89:afr_read_txn_refresh_done] 0-arbit-replicate-0: Failing READ on gfid 091d29dd-f4e1-49da-8353-1686e59818de: split-brain observed. [Input/output error]

db1 getfattr:
[root@dhcp43-157 ~]# getfattr -d -m . -e hex /bricks/brick2/arbit/db1_Down/renamdatafile
getfattr: Removing leading '/' from absolute path names
# file: bricks/brick2/arbit/db1_Down/renamdatafile
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.arbit-client-0=0x000000030000000000000000
trusted.afr.arbit-client-1=0x000000010000000000000000
trusted.afr.dirty=0x000000000000000000000000
trusted.bit-rot.version=0x02000000000000005788cb3000085f14
trusted.gfid=0x091d29ddf4e149da83531686e59818de

db2:
[root@dhcp43-153 ~]# getfattr -d -m . -e hex /bricks/brick1/arbit/db1_Down/renamdatafile
getfattr: Removing leading '/' from absolute path names
# file: bricks/brick1/arbit/db1_Down/renamdatafile
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.gfid=0x091d29ddf4e149da83531686e59818de

ab1:
[root@dhcp43-157 ~]# getfattr -d -m . -e hex /bricks/brick0/arbit/db1_Down/renamdatafile
getfattr: Removing leading '/' from absolute path names
# file: bricks/brick0/arbit/db1_Down/renamdatafile
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.gfid=0x091d29ddf4e149da83531686e59818de

Volume Name: arbit
Type: Replicate
Volume ID: 0069b5a7-bfdf-4f59-86ec-851f500ed902
Status: Started
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: 10.70.43.129:/bricks/brick0/arbit
Brick2: 10.70.43.153:/bricks/brick1/arbit
Brick3: 10.70.43.129:/bricks/brick2/arbit (arbiter)
Options Reconfigured:
nfs.disable: on
performance.readdir-ahead: on
transport.address-family: inet
[root@dhcp43-157 ~]#

Expected results:

Additional info:

--- Additional comment from Vijay Bellur on 2016-07-27 00:20:31 EDT ---

REVIEW: http://review.gluster.org/15017 (afr: some coverity fixes) posted (#1) for review on release-3.7 by Ravishankar N (ravishankar)

--- Additional comment from Ravishankar N on 2016-07-27 00:22:42 EDT ---

Ignore comment #1, that patch is for a different bug.

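
For anyone reading the trusted.afr.* values in the getfattr output above: as far as I understand the AFR changelog format, each value is 12 bytes holding three big-endian 32-bit counters (data, metadata, entry pending operations). The helper below is only a rough sketch under that assumption; the function name decode_afr_changelog is made up for illustration and is not part of any gluster tool.

```python
import struct


def decode_afr_changelog(hex_value):
    """Split a trusted.afr.* hex value into its three pending counters.

    Assumes the 12-byte layout of three big-endian 32-bit integers
    (data, metadata, entry). Hypothetical helper, for illustration only.
    """
    text = hex_value[2:] if hex_value.lower().startswith("0x") else hex_value
    raw = bytes.fromhex(text)
    if len(raw) != 12:
        raise ValueError("expected 12 bytes, got %d" % len(raw))
    data, metadata, entry = struct.unpack(">III", raw)
    return {"data": data, "metadata": metadata, "entry": entry}


if __name__ == "__main__":
    # Values copied from the db1 getfattr output in this report.
    for name, value in [
        ("trusted.afr.arbit-client-0", "0x000000030000000000000000"),
        ("trusted.afr.arbit-client-1", "0x000000010000000000000000"),
        ("trusted.afr.dirty", "0x000000000000000000000000"),
    ]:
        print(name, decode_afr_changelog(value))
```

Under this reading, both trusted.afr.arbit-client-* values on that brick carry non-zero data counters (3 and 1), while the metadata and entry counters are zero.
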
REVIEW: http://review.gluster.org/15226 (afr/posix: anoninode logic for entyr-self-heal) posted (#1) for review on master by Ravishankar N (ravishankar)
REVIEW: http://review.gluster.org/15226 (afr, posix: anoninode logic for entry selfheal) posted (#2) for review on master by Ravishankar N (ravishankar)
REVIEW: http://review.gluster.org/15226 (afr, posix: anoninode logic for entry selfheal) posted (#3) for review on master by Ravishankar N (ravishankar)
REVIEW: http://review.gluster.org/15226 (afr, posix: anoninode logic for entry selfheal) posted (#4) for review on master by Ravishankar N (ravishankar)
REVIEW: http://review.gluster.org/15226 (afr, posix: anoninode logic for entry selfheal) posted (#5) for review on master by Ravishankar N (ravishankar)
REVIEW: http://review.gluster.org/15226 (afr, posix: anoninode logic for entry selfheal) posted (#6) for review on master by Ravishankar N (ravishankar)
REVIEW: http://review.gluster.org/15226 (afr, posix: anoninode logic for entry selfheal) posted (#7) for review on master by Ravishankar N (ravishankar)
REVIEW: http://review.gluster.org/15226 (afr, posix: anoninode logic for entry selfheal) posted (#8) for review on master by Ravishankar N (ravishankar)
REVIEW: http://review.gluster.org/15226 (afr, posix: anoninode logic for entry selfheal) posted (#9) for review on master by Ravishankar N (ravishankar)
REVIEW: http://review.gluster.org/15226 (afr, posix: anoninode logic for entry selfheal) posted (#10) for review on master by Ravishankar N (ravishankar)
REVIEW: http://review.gluster.org/15226 (afr, posix: anoninode logic for entry selfheal) posted (#11) for review on master by Ravishankar N (ravishankar)
This update is done in bulk based on the state of the patch and the time since last activity. If the issue is still seen, please reopen the bug.