Description of problem:
=======================
In a 1 x 3 replicate volume, a file was in data split-brain. A hard link to the file was created, and the creation succeeded. The split-brain was then resolved by removing the file, its hard link, and the .glusterfs entry for the file on two bricks {brick1, brick2}, keeping brick3 as the good copy.

1) "ls -l" from the mount point does not list the hard-link file even after the split-brain is resolved.
2) The hard-link file is not self-healed.
3) stat on the file reports "Links: 1" instead of the actual number of hard links.

Version-Release number of selected component (if applicable):
=============================================================
glusterfs 3.4.0.35.1u2rhs built on Oct 21 2013 14:00:58

How reproducible:
=================
Often

Steps to Reproduce:
===================
1. Create a 1 x 3 replicate volume. Set self-heal-daemon to off. Start the volume.
2. Create 2 fuse mounts.
3. From one mount point create a file:
   "dd if=/dev/urandom of=./test_file bs=1M count=1"
4. Bring down brick1 and brick2.
5. From mount1 write data to "test_file":
   "dd if=/dev/urandom of=./test_file bs=2M count=1"
6. Bring back brick1. Bring down brick3.
7. From mount2 write data to "test_file":
   "dd if=/dev/urandom of=./test_file bs=3M count=1"
8. Bring back brick2. Bring down brick1.
9. From mount2 write data to "test_file":
   "dd if=/dev/urandom of=./test_file bs=4M count=1"
10. Bring back brick1 and brick3.
    Note: at this point the file is in split-brain.
11. From mount2, create a hard link to the split-brain file {successful}.
12. Rename "test_file" to "test_file1" {successful}.
13. Create a symbolic link to "test_file1" {successful}.
14. Resolve the split-brain on brick1 and brick2, i.e. remove test_file1, the hard-link file, and the .glusterfs entry on brick1 and brick2. Retain the brick3 copy.

Actual results:
===============
1. "ls -l" from the mount point does not report the hard links.
2. "test_file1" is self-healed to brick1 and brick2.
3. Hard links are not self-healed.
4. stat shows "Links: 1":

root@rhs-client14 [Nov-13-2013- 6:03:06] >stat test_file1
  File: `test_file1'
  Size: 2097152   Blocks: 4096   IO Block: 131072   regular file
Device: 1eh/30d   Inode: 13299062108449068165   Links: 1
Access: (0666/-rw-rw-rw-)  Uid: ( 501/ qa_func)   Gid: ( 503/qa_system)
Access: 2013-11-12 10:37:53.000000000 +0000
Modify: 2013-11-12 12:06:25.281715000 +0000
Change: 2013-11-12 12:40:40.562173454 +0000

Expected results:
=================
Hard links should be self-healed.
SOS Reports: http://rhsqe-repo.lab.eng.blr.redhat.com/bugs_necessary_info/1029778/

root@rhs-client11 [Nov-13-2013- 8:03:35] >gluster v info

Volume Name: vol_rep
Type: Replicate
Volume ID: d75d19c8-fb2f-475e-915c-d24d4dede1e3
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: rhs-client11:/rhs/bricks/b1
Brick2: rhs-client12:/rhs/bricks/b1-rep1
Brick3: rhs-client13:/rhs/bricks/b1-rep2
Options Reconfigured:
nfs.disable: on
cluster.self-heal-daemon: off
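For reference, a sketch of the commands that produce the setup in step 1 and the volume configuration shown above (hostnames and brick paths taken from the report; this is a non-runnable CLI fragment, not executable without a gluster cluster):

```shell
# Create the 1 x 3 replicate volume from the three bricks in the report.
gluster volume create vol_rep replica 3 \
    rhs-client11:/rhs/bricks/b1 \
    rhs-client12:/rhs/bricks/b1-rep1 \
    rhs-client13:/rhs/bricks/b1-rep2

# Disable the self-heal daemon as in step 1, then start the volume.
gluster volume set vol_rep cluster.self-heal-daemon off
gluster volume start vol_rep
```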
Shwetha,

I followed the steps and saw the expected behavior instead of the bug. How was the resolution of the split-brain performed in your setup? After the file, its hard link, and the gfid-link are removed, we need to access both the file and its hard link from the mount point to make sure both are healed. 'ls -l' may not show the file until this is done, because at that point AFR does not know that brick3 is the source: there are no extended attributes to say so.

You can see this in the following output. Initially "find . | xargs stat" shows only the symlink and no files at all, but once both 'test_fil1' and 'h_test_file' are accessed, both files are re-created, and a further 'ls -l' shows all the files as expected.

root@pranithk-laptop - /mnt/r2
15:04:21 :) ⚡ find . | xargs stat
  File: ‘.’
  Size: 42        Blocks: 1       IO Block: 131072  directory
Device: 24h/36d   Inode: 1        Links: 3
Access: (0755/drwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Context: system_u:object_r:fusefs_t:s0
Access: 2013-12-27 15:06:17.378897044 +0530
Modify: 2013-12-27 15:06:11.103890634 +0530
Change: 2013-12-27 15:06:11.103890634 +0530
 Birth: -
  File: ‘./s_test_file1’ -> ‘test_fil1’
  Size: 9         Blocks: 0       IO Block: 131072  symbolic link
Device: 24h/36d   Inode: 9866554501010886634  Links: 1
Access: (0777/lrwxrwxrwx)  Uid: (    0/    root)   Gid: (    0/    root)
Context: system_u:object_r:fusefs_t:s0
Access: 2013-12-27 15:06:17.378897044 +0530
Modify: 2013-12-27 15:04:21.226778369 +0530
Change: 2013-12-27 15:04:21.226778369 +0530
 Birth: -

root@pranithk-laptop - /mnt/r2
15:07:33 :) ⚡ ls -l test_fil1
-rw-r--r--. 2 root root 2097152 Dec 27 14:58 test_fil1

root@pranithk-laptop - /mnt/r2
15:08:08 :) ⚡ ls -l h_test_file
-rw-r--r--. 2 root root 2097152 Dec 27 14:58 h_test_file

root@pranithk-laptop - /mnt/r2
15:08:14 :) ⚡ ls -l
total 4096
-rw-r--r--. 2 root root 2097152 Dec 27 14:58 h_test_file
lrwxrwxrwx. 1 root root       9 Dec 27 15:04 s_test_file1 -> test_fil1
-rw-r--r--. 2 root root 2097152 Dec 27 14:58 test_fil1

Pranith.
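The "Links: 2" in the ls output above is the ordinary hard-link count: once both names are healed, they share one inode again. A plain-filesystem sketch of that count (a hypothetical temporary directory, not a gluster mount; assumes GNU coreutils `stat`):

```shell
# Work in a throwaway directory so nothing in the real tree is touched.
tmp=$(mktemp -d)
cd "$tmp"

echo data > test_fil1
ln test_fil1 h_test_file        # hard link: same inode, two names

links=$(stat -c %h test_fil1)   # %h prints the hard-link count
echo "$links"                   # prints 2
```

On the gluster mount, the same `stat` is what the reporter used; "Links: 1" before the heal is the bug, and 2 is the value expected once both names have been looked up and re-created.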