Description of problem:
=========================
there is a case where rename of a file leads to data loss.

Steps to Reproduce:
===================
1. create a 1x(2+1) volume with bricks as say db1,db2 and ab1
2. now mount the vol by fuse
3. create a directory say dir1
4. Now bring down the first data brick(db1)
5. create a file say f1 under dir1 with some contents
6. note down the getfattr details from both db2 and ab1
7. now bring down db2 and bring up db1
8. trigger a heal
9. now rename f1 to f2
10. now bring up db2 and trigger a heal
11. from mount do a cat of f2

We get EIO

[root@dhcp42-93 db1_Down]# cat renamdatafile
cat: renamdatafile: Input/output error

client logs:
[2016-07-15 12:25:40.299090] W [MSGID: 108008] [afr-read-txn.c:244:afr_read_txn] 0-arbit-replicate-0: Unreadable subvolume -1 found with event generation 7 for gfid 091d29dd-f4e1-49da-8353-1686e59818de. (Possible split-brain)
[2016-07-15 12:25:40.301196] E [MSGID: 108008] [afr-read-txn.c:89:afr_read_txn_refresh_done] 0-arbit-replicate-0: Failing FGETXATTR on gfid 091d29dd-f4e1-49da-8353-1686e59818de: split-brain observed. [Input/output error]
[2016-07-15 12:25:40.302017] W [MSGID: 108027] [afr-common.c:2245:afr_discover_done] 0-arbit-replicate-0: no read subvols for (null)
[2016-07-15 12:25:40.305693] W [fuse-bridge.c:2228:fuse_readv_cbk] 0-glusterfs-fuse: 796: READ => -1 gfid=091d29dd-f4e1-49da-8353-1686e59818de fd=0x7fcbf801579c (Input/output error)
[2016-07-15 12:25:40.303768] W [MSGID: 108008] [afr-read-txn.c:244:afr_read_txn] 0-arbit-replicate-0: Unreadable subvolume -1 found with event generation 7 for gfid 091d29dd-f4e1-49da-8353-1686e59818de. (Possible split-brain)
[2016-07-15 12:25:40.305666] E [MSGID: 108008] [afr-read-txn.c:89:afr_read_txn_refresh_done] 0-arbit-replicate-0: Failing READ on gfid 091d29dd-f4e1-49da-8353-1686e59818de: split-brain observed. [Input/output error]

db1 getfattr:
root@dhcp43-157 ~]# getfattr -d -m . -e hex /bricks/brick2/arbit/db1_Down/renamdatafile
getfattr: Removing leading '/' from absolute path names
# file: bricks/brick2/arbit/db1_Down/renamdatafile
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.arbit-client-0=0x000000030000000000000000
trusted.afr.arbit-client-1=0x000000010000000000000000
trusted.afr.dirty=0x000000000000000000000000
trusted.bit-rot.version=0x02000000000000005788cb3000085f14
trusted.gfid=0x091d29ddf4e149da83531686e59818de

db2:
[root@dhcp43-153 ~]# getfattr -d -m . -e hex /bricks/brick1/arbit/db1_Down/renamdatafile
getfattr: Removing leading '/' from absolute path names
# file: bricks/brick1/arbit/db1_Down/renamdatafile
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.gfid=0x091d29ddf4e149da83531686e59818de

ab1:
[root@dhcp43-157 ~]# getfattr -d -m . -e hex /bricks/brick0/arbit/db1_Down/renamdatafile
getfattr: Removing leading '/' from absolute path names
# file: bricks/brick0/arbit/db1_Down/renamdatafile
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.gfid=0x091d29ddf4e149da83531686e59818de

Volume Name: arbit
Type: Replicate
Volume ID: 0069b5a7-bfdf-4f59-86ec-851f500ed902
Status: Started
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: 10.70.43.129:/bricks/brick0/arbit
Brick2: 10.70.43.153:/bricks/brick1/arbit
Brick3: 10.70.43.129:/bricks/brick2/arbit (arbiter)
Options Reconfigured:
nfs.disable: on
performance.readdir-ahead: on
transport.address-family: inet
[root@dhcp43-157 ~]#

Expected results:

Additional info:
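Note on reading the getfattr output in the description: as far as I understand the AFR changelog format, each trusted.afr.* value is 12 bytes holding three big-endian 32-bit counters of pending data, metadata and entry operations, in that order. The sketch below decodes the values reported for db1 under that assumption. The function name decode_afr_pending is hypothetical and is not part of any GlusterFS tool.

```python
import struct


def decode_afr_pending(hex_value):
    """Split an AFR changelog xattr into its three pending counters.

    Assumes the layout is three big-endian unsigned 32-bit integers:
    data, metadata, entry (12 bytes in total).
    """
    text = hex_value[2:] if hex_value.lower().startswith("0x") else hex_value
    raw = bytes.fromhex(text)
    if len(raw) != 12:
        raise ValueError("expected 12 bytes, got %d" % len(raw))
    data, metadata, entry = struct.unpack(">III", raw)
    return {"data": data, "metadata": metadata, "entry": entry}


# Values copied from the db1 getfattr output in this report.
db1_xattrs = {
    "trusted.afr.arbit-client-0": "0x000000030000000000000000",
    "trusted.afr.arbit-client-1": "0x000000010000000000000000",
    "trusted.afr.dirty": "0x000000000000000000000000",
}

for name, value in db1_xattrs.items():
    print(name, decode_afr_pending(value))
```

Under this reading, the copy on db1 records pending data operations against both arbit-client-0 and arbit-client-1, while the copies on db2 and ab1 carry no trusted.afr.* xattrs at all.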
REVIEW: http://review.gluster.org/15017 (afr: some coverity fixes) posted (#1) for review on release-3.7 by Ravishankar N (ravishankar)
Ignore comment #1, that patch is for a different bug.
This bug is getting closed because GlusterFS-3.7 has reached its end-of-life. Note: This bug is being closed using a script. No verification has been performed to check if it still exists on newer releases of GlusterFS. If this bug still exists in newer GlusterFS releases, please reopen this bug against the newer release.