Description of problem:
=======================
[DATA LOSS] - DHT - rename from multiple mounts ends in data loss

Version-Release number of selected component (if applicable):
=============================================================
3.6.0.28-1.el6rhs.x86_64

How reproducible:
=================
Always

Steps to Reproduce:
===================
1. Created a 4x2 volume on snapshot{09..12} - volume name: multi1

Volume Name: multi1
Type: Distributed-Replicate
Volume ID: 630a2173-3d1f-4ddd-8529-b1c14a6d6a64
Status: Started
Snap Volume: no
Number of Bricks: 4 x 2 = 8
Transport-type: tcp
Bricks:
Brick1: snapshot09.lab.eng.blr.redhat.com:/brick5/multi10
Brick2: snapshot10.lab.eng.blr.redhat.com:/brick5/multi11
Brick3: snapshot11.lab.eng.blr.redhat.com:/brick5/multi12
Brick4: snapshot12.lab.eng.blr.redhat.com:/brick5/multi13
Brick5: snapshot09.lab.eng.blr.redhat.com:/brick5/multi14
Brick6: snapshot10.lab.eng.blr.redhat.com:/brick5/multi15
Brick7: snapshot11.lab.eng.blr.redhat.com:/brick5/multi16
Brick8: snapshot12.lab.eng.blr.redhat.com:/brick5/multi17
Options Reconfigured:
performance.readdir-ahead: on
auto-delete: disable
snap-max-soft-limit: 90
snap-max-hard-limit: 256

2. Bricks on snapshot09 and snapshot10 went down.
3. Created directory fresh4 after that, so it has a complete layout.
4. Inside fresh4, created files a, b, c and d.
5. Started renaming from two mounts (one NFS and one FUSE) as below:

while true; do cd /mnt/multi/fresh4/; mv -f a b; mv -f c d; mv -f b a; mv -f d c; cd /; done

Actual results:
===============
[root@snapshot11 fresh4]# ls a b c d -li
ls: cannot access c: No such file or directory
ls: cannot access d: No such file or directory
10880713884975601753 -rw-r--r-- 1 root root 41943040 Sep 17 09:57 a
10880713884975601753 -rw-r--r-- 1 root root 41943040 Sep 17 09:57 b

-> Verified on the backend: the file is not present there.

Expected results:
=================
No data loss: renames from multiple mounts should leave all four files accessible, with their data files present on the backend bricks.
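For reference, a minimal sketch of the setup and reproduction implied by the steps above. The exact gluster/mount command lines and the mount point /mnt/multi are assumptions; the volume name, hostnames, brick paths, file size and the rename loop itself are taken from the report.

# Sketch only - reconstructed from the steps above; mount points and exact CLI usage are assumed.
# 1. Create and start the 4x2 distributed-replicate volume (bricks as listed above).
gluster volume create multi1 replica 2 \
    snapshot09.lab.eng.blr.redhat.com:/brick5/multi10 snapshot10.lab.eng.blr.redhat.com:/brick5/multi11 \
    snapshot11.lab.eng.blr.redhat.com:/brick5/multi12 snapshot12.lab.eng.blr.redhat.com:/brick5/multi13 \
    snapshot09.lab.eng.blr.redhat.com:/brick5/multi14 snapshot10.lab.eng.blr.redhat.com:/brick5/multi15 \
    snapshot11.lab.eng.blr.redhat.com:/brick5/multi16 snapshot12.lab.eng.blr.redhat.com:/brick5/multi17
gluster volume start multi1

# 2. Mount the volume on two clients: one FUSE mount and one NFS (v3) mount.
mount -t glusterfs snapshot11.lab.eng.blr.redhat.com:/multi1 /mnt/multi      # client 1 (FUSE)
mount -t nfs -o vers=3 snapshot12.lab.eng.blr.redhat.com:/multi1 /mnt/multi  # client 2 (NFS)

# 3. With the bricks on snapshot09/10 down, create the directory and four 40 MB files from one mount.
mkdir /mnt/multi/fresh4
cd /mnt/multi/fresh4 && for f in a b c d; do dd if=/dev/zero of=$f bs=1M count=40; done

# 4. Run the rename loop from BOTH clients at the same time.
while true; do cd /mnt/multi/fresh4/; mv -f a b; mv -f c d; mv -f b a; mv -f d c; cd /; done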
sosreport @ http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1142650/
Reproduced the issue with TRACE enabled.

1. Created a two-brick setup.
2. Created two files [make sure they hash to different bricks]. In my case they are tile and zile.
3. Ran "while true; do mv -f tile zile; mv -f zile tile; done" from both NFS and FUSE mounts.

Logs: [captured the last unlink/rename operations from both mount points]

1. From mnt.log [UNLINK was the last operation on zile]:

[2014-09-17 10:09:04.412720] T [fuse-bridge.c:435:fuse_entry_cbk] 0-glusterfs-fuse: 99191: LOOKUP() /zile => 11886673162260437651
[2014-09-17 10:09:04.412830] T [fuse-bridge.c:1570:fuse_unlink_resume] 0-glusterfs-fuse: 99192: UNLINK /zile
[2014-09-17 10:09:04.418667] T [fuse-bridge.c:1290:fuse_unlink_cbk] 0-glusterfs-fuse: 99192: UNLINK() /zile => 0

2. From nfs.log [rename tile -> zile was the last operation on tile/zile]:

[2014-09-17 10:09:04.397658] T [nfs-fops.c:1293:nfs_fop_rename] 0-nfs: Rename: /tile -> /zile
[2014-09-17 10:09:04.397696] I [dht-rename.c:1345:dht_rename] 0-test1-dht: renaming /tile (hash=test1-client-0/cache=test1-client-0) => /zile (hash=test1-client-1/cache=<nul>)
[2014-09-17 10:09:04.399864] T [MSGID: 0] [dht-rename.c:1051:dht_rename_create_links] 0-test1-dht: linkfile /tile @ test1-client-1 => test1-client-0
[2014-09-17 10:09:04.405688] T [MSGID: 0] [dht-rename.c:921:dht_rename_linkto_cbk] 0-test1-dht: link /tile => /zile (test1-client-0)
[2014-09-17 10:09:04.405983] T [MSGID: 0] [dht-rename.c:839:dht_do_rename] 0-test1-dht: renaming /tile => /zile (test1-client-1)
[2014-09-17 10:09:04.407583] T [MSGID: 0] [dht-rename.c:740:dht_rename_cbk] 0-test1-dht: deleting old src datafile /tile @ test1-client-0

Observations:
The unlink of zile from the FUSE mount ("412720") and the deletion of tile from the NFS mount ("407583", plus some delay as it is not yet deleted) are very close, and they are the last operations captured in the logs. This matches what Shyam pointed out earlier:

1. The NFS mount tries to do the rename tile -> zile while the FUSE mount is attempting zile -> tile.
2. During "tile -> zile", tile got unlinked from the NFS mount, but a lookup happened at around the same time from the FUSE mount.
3. In the process of "zile -> tile" on the FUSE mount, FUSE sent "unlink zile", and we lose the file.

Regards, Susant
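For anyone re-running this trace, a minimal sketch of the two-brick reproducer described above. The volume name test1 comes from the log lines; the host/brick paths, mount points and the use of diagnostics.client-log-level to get TRACE output are assumptions beyond what the comment states.

# Sketch only - two-brick distribute volume so the two files hash to different bricks.
gluster volume create test1 host1:/bricks/b0 host2:/bricks/b1
gluster volume start test1
gluster volume set test1 diagnostics.client-log-level TRACE   # capture the FOP sequence at TRACE

# Create the two files from one mount, then run the same loop from both mounts concurrently.
touch /mnt/fuse/tile /mnt/fuse/zile
( cd /mnt/fuse && while true; do mv -f tile zile; mv -f zile tile; done ) &   # FUSE client
( cd /mnt/nfs  && while true; do mv -f tile zile; mv -f zile tile; done ) &   # NFS client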
Bug 1166570, which this bug depends on, is fixed in RHEL 7.2. Since we shipped rhgs-3.1.3 on RHEL 7.2, this bug should be fixed in 3.1.3.

From bug 1166570:
Status: ASSIGNED → MODIFIED
Fixed In Version: coreutils-8.22-13.el7
Verified this bug on glusterfs build 3.7.9-12.el7rhgs.x86_64. Here are the steps that were performed:

1. Created a distributed-replicate volume and started it.
2. Mounted the volume over NFS and FUSE on two different clients.
3. Created two files, file1 and file2.
4. Simultaneously from the NFS and FUSE mounts, continuously renamed the two files:
   "while true; do mv -f file1 file2; mv -f file2 file1; done"

The issue is fixed and no data loss was seen. Hence, moving the state of the bug to Verified.
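As a cross-check after stopping the rename loops, something along these lines confirms nothing was lost; the mount points (/mnt/fuse, /mnt/nfs) and brick paths here are placeholders, not the actual paths used in the verification run.

# Sketch only - paths are placeholders.
# The surviving file should be visible and identical from both mounts...
ls -li /mnt/fuse/file* /mnt/nfs/file*
md5sum /mnt/fuse/file* /mnt/nfs/file*

# ...and its data file (or a DHT linkto file pointing to it) should still exist on the backend bricks.
ls -li /bricks/*/file*
getfattr -d -m . -e hex /bricks/*/file*   # trusted.glusterfs.dht.linkto indicates a link file, not data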
Thanks Prasad. Closing this BZ as per comment#7.