Description of problem:
=======================
stat and ls on a file that is in metadata split-brain state succeed and return different output on different mount points. From the mount point it is also possible to perform metadata operations (setattr, chmod, chown) and entry operations such as ln, which changes the "number of links" of the file.

Version-Release number of selected component (if applicable):
==============================================================
glusterfs 3.4.0.35.1u2rhs built on Oct 21 2013 14:00:58

How reproducible:
==================
Often

Steps to Reproduce:
====================
1. Create a 1 x 3 replicate volume. Set "nfs.disable" to "on". Set "self-heal-daemon" to "off". Start the volume.
2. Create 3 fuse mount points. Create a file "test_file" from a mount point.
3. Bring down brick1 and brick2.
4. From mount1 execute: "chmod 777 test_file"
5. Bring down brick3. Bring back brick2.
6. From mount2 execute: "chown <uid>:<gid> test_file"
7. Bring down brick2. Bring back brick1.
8. From mount3 execute: "setfattr -n user.name -v qa_func test_file" and "ln ./test_file ./test_file1"
9. Set the "self-heal-daemon" volume option to "on".
10. Bring back brick2 and brick3.
11. From all 3 mount points execute stat and ls on the file.
(A scripted sketch of these steps is included after the "Additional info" section below.)

Actual results:
===============
1. The hard link is self-healed to brick2 and brick3.
2. Each mount point shows a different stat structure for the file "test_file".

Mount1:-
=======
root@rhs-client14 [Nov-14-2013- 9:18:33] >stat ./test_file1
  File: `./test_file1'
  Size: 1048576        Blocks: 2048       IO Block: 131072 regular file
Device: 1dh/29d        Inode: 12459298682473767608  Links: 2
Access: (0666/-rw-rw-rw-)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2013-11-14 09:00:32.245544796 +0000
Modify: 2013-11-14 09:00:32.457537026 +0000
Change: 2013-11-14 09:18:45.149018768 +0000

root@rhs-client14 [Nov-14-2013- 9:18:48] >ls -l ./test_file
-rw-rw-rw- 2 root root 1048576 Nov 14 09:00 ./test_file

Mount2:-
========
root@rhs-client14 [Nov-14-2013- 9:18:33] >stat ./test_file1
  File: `./test_file1'
  Size: 1048576        Blocks: 2048       IO Block: 131072 regular file
Device: 1fh/31d        Inode: 12459298682473767608  Links: 2
Access: (0644/-rw-r--r--)  Uid: (  503/ qa_perf)   Gid: (  501/  qa_all)
Access: 2013-11-14 09:00:32.246642715 +0000
Modify: 2013-11-14 09:00:32.457634464 +0000
Change: 2013-11-14 09:18:27.111595369 +0000

root@rhs-client14 [Nov-14-2013- 9:18:48] >ls -l ./test_file
-rw-r--r-- 2 qa_perf qa_all 1048576 Nov 14 09:00 ./test_file

Mount3:-
============
root@rhs-client14 [Nov-14-2013- 9:18:33] >stat ./test_file1
  File: `./test_file1'
  Size: 1048576        Blocks: 2048       IO Block: 131072 regular file
Device: 1eh/30d        Inode: 12459298682473767608  Links: 2
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2013-11-14 09:10:12.000000000 +0000
Modify: 2013-11-14 09:10:12.000000000 +0000
Change: 2013-11-14 09:18:27.111610160 +0000

root@rhs-client14 [Nov-14-2013- 9:18:48] >ls -l ./test_file
-rw-r--r-- 2 root root 1048576 Nov 14 09:10 ./test_file

3. Log messages show that "metadata-self-heal" completed successfully:
[2013-11-14 09:22:41.742579] I [afr-self-heal-common.c:2840:afr_log_self_heal_completion_status] 0-vol_rep-replicate-0: metadata self heal is successfully completed, backgroung data self heal is successfully completed, from vol_rep-client-0 with 1048576 1048576 1048576 sizes - Pending matrix: [ [ 0 0 0 ] [ 0 0 0 ] [ 0 0 0 ] ] on /test_file
[2013-11-14 09:23:08.943001] I [afr-self-heal-common.c:2840:afr_log_self_heal_completion_status] 0-vol_rep-replicate-0: metadata self heal is successfully completed, backgroung data self heal is successfully completed, from vol_rep-client-0 with 1048576 1048576 1048576 sizes - Pending matrix: [ [ 0 0 0 ] [ 0 0 0 ] [ 0 0 0 ] ] on /test_file1

4. "gluster volume heal <volume_name> info split-brain" does not report one of the hard links:

root@rhs-client11 [Nov-14-2013- 9:45:45] >gluster v heal vol_rep info split-brain
Gathering list of split brain entries on volume vol_rep has been successful

Brick rhs-client11:/rhs/bricks/b1
Number of entries: 5
at                    path on brick
-----------------------------------
2013-11-14 09:11:38   /test_file1
2013-11-14 09:21:14   /test_file1
2013-11-14 09:31:14   /test_file1
2013-11-14 09:41:14   /test_file1
2013-11-14 09:45:39   /test_file1

Brick rhs-client12:/rhs/bricks/b1-rep1
Number of entries: 5
at                    path on brick
-----------------------------------
2013-11-14 09:11:38   /test_file
2013-11-14 09:21:12   /test_file1
2013-11-14 09:31:13   /test_file1
2013-11-14 09:41:13   /test_file1
2013-11-14 09:45:39   /test_file1

Brick rhs-client13:/rhs/bricks/b1-rep2
Number of entries: 3
at                    path on brick
-----------------------------------
2013-11-14 09:21:11   /test_file1
2013-11-14 09:31:11   /test_file1
2013-11-14 09:41:11   /test_file1

Expected results:
=================
1. stat / ls should report an I/O error.
2. Log messages should say "metadata-self-heal failed".
3. All hard links should be reported in "heal info split-brain". (This is necessary to resolve all the files in split-brain state.)

Additional info:
==================
root@rhs-client11 [Nov-14-2013- 9:45:48] >gluster v info

Volume Name: vol_rep
Type: Replicate
Volume ID: 9cfbba56-f108-4ad2-94e7-930a45bb5d4c
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: rhs-client11:/rhs/bricks/b1
Brick2: rhs-client12:/rhs/bricks/b1-rep1
Brick3: rhs-client13:/rhs/bricks/b1-rep2
Options Reconfigured:
nfs.disable: on
cluster.self-heal-daemon: on
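Scripted sketch of the reproduction steps above. It assumes the brick hosts and paths from the "gluster v info" output, illustrative fuse mount points /mnt/m1, /mnt/m2 and /mnt/m3, and hypothetical helpers brick_down/brick_up that kill and restart the glusterfsd process of a single brick (the report does not state how the bricks were brought down):

# --- steps 1-2: volume setup, mounts, test file ---
gluster volume create vol_rep replica 3 \
    rhs-client11:/rhs/bricks/b1 \
    rhs-client12:/rhs/bricks/b1-rep1 \
    rhs-client13:/rhs/bricks/b1-rep2
gluster volume set vol_rep nfs.disable on
gluster volume set vol_rep cluster.self-heal-daemon off
gluster volume start vol_rep

mount -t glusterfs rhs-client11:/vol_rep /mnt/m1     # mount points are illustrative
mount -t glusterfs rhs-client11:/vol_rep /mnt/m2
mount -t glusterfs rhs-client11:/vol_rep /mnt/m3
dd if=/dev/zero of=/mnt/m1/test_file bs=1M count=1   # 1048576 bytes, as in the stat output

# --- steps 3-4: only brick3 up while chmod is done from mount1 ---
brick_down rhs-client11 /rhs/bricks/b1               # hypothetical helper: kill glusterfsd for this brick
brick_down rhs-client12 /rhs/bricks/b1-rep1
chmod 777 /mnt/m1/test_file

# --- steps 5-6: only brick2 up while chown is done from mount2 ---
brick_down rhs-client13 /rhs/bricks/b1-rep2
brick_up   rhs-client12 /rhs/bricks/b1-rep1          # hypothetical helper: restart the brick process
chown qa_perf:qa_all /mnt/m2/test_file               # uid/gid 503/501 as seen in the Mount2 stat

# --- steps 7-8: only brick1 up while setfattr and ln are done from mount3 ---
brick_down rhs-client12 /rhs/bricks/b1-rep1
brick_up   rhs-client11 /rhs/bricks/b1
setfattr -n user.name -v qa_func /mnt/m3/test_file
ln /mnt/m3/test_file /mnt/m3/test_file1

# --- steps 9-11: enable heal daemon, bring all bricks back, compare views ---
gluster volume set vol_rep cluster.self-heal-daemon on
brick_up rhs-client12 /rhs/bricks/b1-rep1
brick_up rhs-client13 /rhs/bricks/b1-rep2
for m in /mnt/m1 /mnt/m2 /mnt/m3; do stat "$m/test_file1"; ls -l "$m/test_file"; done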
When a file is in metadata split-brain, running cat on the file from multiple mounts at the same time sometimes returns the file contents and sometimes reports an Input/Output error. Whether "cat" should succeed on files that are in metadata split-brain is yet to be defined.
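While that behavior is being defined, the metadata split-brain state of the copies can be confirmed directly on the bricks by inspecting the AFR changelog xattrs. A minimal sketch, assuming the brick paths from the volume info above (the example xattr values are illustrative, not taken from this report):

# Run on each storage node, against that node's brick path:
getfattr -d -m . -e hex /rhs/bricks/b1/test_file          # on rhs-client11
getfattr -d -m . -e hex /rhs/bricks/b1-rep1/test_file     # on rhs-client12
getfattr -d -m . -e hex /rhs/bricks/b1-rep2/test_file     # on rhs-client13

# Illustrative output fragment:
# trusted.afr.vol_rep-client-0=0x000000000000000100000000
# trusted.afr.vol_rep-client-1=0x000000000000000100000000
# The value encodes the data/metadata/entry pending counters; a non-zero
# middle (metadata) counter on two bricks blaming each other indicates
# metadata split-brain for that file, independent of what cat or stat
# return through the mount.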
Thank you for submitting this issue for consideration in Red Hat Gluster Storage. The release against which you requested this review is now End of Life. Please see https://access.redhat.com/support/policy/updates/rhs/. If you can reproduce this bug against a currently maintained version of Red Hat Gluster Storage, please feel free to file a new report against the current release.