| Summary: | Self-heal fails (split brain) on existing data preloaded in brick | | |
|---|---|---|---|
| Product: | [Community] GlusterFS | Reporter: | Louis Zuckerman <glusterbugs> |
| Component: | replicate | Assignee: | Pranith Kumar K <pkarampu> |
| Status: | CLOSED DUPLICATE | Severity: | medium |
| Priority: | medium | Version: | mainline |
| Hardware: | x86_64 | OS: | Linux |
| Doc Type: | Bug Fix | CC: | gluster-bugs |
In all released versions so far, you can create a replicated volume with existing data in one brick, trigger a repair, and GlusterFS will self-heal the preloaded data onto the other (blank) replica brick. In the latest git master this no longer works, and the client reports split brain instead. The procedure below works on release 3.2.3 and earlier, but not on git master (as of Friday, September 23).
To reproduce:
root.100.242# mkdir /var/tmp/test{0..1}
root.100.242# echo healme > /var/tmp/test0/healme
root.100.242# gluster volume create test replica 2 10.168.100.242:/var/tmp/test{0..1}
root.100.242# gluster volume start test
root.100.242# mount -t glusterfs localhost:test /mnt/test/
root.100.242# stat /mnt/test/healme
stat: cannot stat `/mnt/test/healme': Input/output error
root.100.242# ls -lA /mnt/test
ls: cannot access /mnt/test/healme: Input/output error
total 0
?????????? ? ? ? ? ? healme
root.100.242# ls -lA /var/tmp/test*
/var/tmp/test0:
total 8
-rw-r--r-- 1 root root 7 Sep 29 00:25 healme
/var/tmp/test1:
total 4
-rw-r--r-- 1 root root 0 Sep 29 00:25 healme
root.100.242# getfattr -m . -d -e hex /var/tmp/test*/healme
# file: var/tmp/test0/healme
trusted.gfid=0x00000000000000000000000000000000
# file: var/tmp/test1/healme
trusted.gfid=0x00000000000000000000000000000000
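Both bricks report an all-zero `trusted.gfid` here, whereas the working run later in this report shows a normal non-zero value. The following is a minimal sketch for flagging such files from `getfattr` output. The helper names `is_null_gfid` and `null_gfid_files` are hypothetical and not part of GlusterFS, and the sketch makes no claim that a zero gfid is the root cause of the failure.

```shell
#!/bin/sh
# Hypothetical helpers (not part of GlusterFS).

# Succeed when a trusted.gfid value, as printed by `getfattr -e hex`,
# consists only of zero digits.
is_null_gfid() {
    hex=${1#0x}                   # drop the leading 0x
    [ -n "$hex" ] || return 1     # an empty value is not a gfid at all
    case $hex in
        *[!0]*) return 1 ;;       # any non-zero digit: a real gfid
        *)      return 0 ;;       # all zeros
    esac
}

# Read `getfattr -d -e hex` output on stdin and print every file
# whose trusted.gfid is all zeros.
null_gfid_files() {
    current=
    while IFS= read -r line; do
        case $line in
            '# file: '*)
                current=${line#'# file: '} ;;
            trusted.gfid=*)
                if is_null_gfid "${line#trusted.gfid=}"; then
                    printf '%s\n' "$current"
                fi ;;
        esac
    done
}
```

It could be used as `getfattr -m . -d -e hex /var/tmp/test*/healme 2>/dev/null | null_gfid_files`, which would list the brick paths whose gfid is still zero.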
root.100.242# cat /var/log/glusterfs/mnt-test-.log
[2011-09-29 00:25:36.961071] I [glusterfsd.c:1569:main] 0-/usr/sbin/glusterfs: Started Running /usr/sbin/glusterfs version 3git
[2011-09-29 00:25:37.10589] I [client.c:1937:notify] 0-test-client-0: parent translators are ready, attempting connect on transport
[2011-09-29 00:25:37.11036] I [client.c:1937:notify] 0-test-client-1: parent translators are ready, attempting connect on transport
Given volfile:
+------------------------------------------------------------------------------+
1: volume test-client-0
2: type protocol/client
3: option remote-host 10.168.100.242
4: option remote-subvolume /var/tmp/test0
5: option transport-type tcp
6: end-volume
7:
8: volume test-client-1
9: type protocol/client
10: option remote-host 10.168.100.242
11: option remote-subvolume /var/tmp/test1
12: option transport-type tcp
13: end-volume
14:
15: volume test-replicate-0
16: type cluster/replicate
17: subvolumes test-client-0 test-client-1
18: end-volume
19:
20: volume test-write-behind
21: type performance/write-behind
22: subvolumes test-replicate-0
23: end-volume
24:
25: volume test-read-ahead
26: type performance/read-ahead
27: subvolumes test-write-behind
28: end-volume
29:
30: volume test-io-cache
31: type performance/io-cache
32: subvolumes test-read-ahead
33: end-volume
34:
35: volume test-quick-read
36: type performance/quick-read
37: subvolumes test-io-cache
38: end-volume
39:
40: volume test-stat-prefetch
41: type performance/stat-prefetch
42: subvolumes test-quick-read
43: end-volume
44:
45: volume test
46: type debug/io-stats
47: option latency-measurement off
48: option count-fop-hits off
49: subvolumes test-stat-prefetch
50: end-volume
+------------------------------------------------------------------------------+
[2011-09-29 00:25:37.78291] I [rpc-clnt.c:1591:rpc_clnt_reconfig] 0-test-client-0: changing port to 24009 (from 0)
[2011-09-29 00:25:37.78518] I [rpc-clnt.c:1591:rpc_clnt_reconfig] 0-test-client-1: changing port to 24010 (from 0)
[2011-09-29 00:25:40.989321] I [client-handshake.c:1085:select_server_supported_programs] 0-test-client-0: Using Program GlusterFS 3git, Num (1298437), Version (310)
[2011-09-29 00:25:40.989678] I [client-handshake.c:1085:select_server_supported_programs] 0-test-client-1: Using Program GlusterFS 3git, Num (1298437), Version (310)
[2011-09-29 00:25:40.989956] I [client-handshake.c:917:client_setvolume_cbk] 0-test-client-0: Connected to 10.168.100.242:24009, attached to remote volume '/var/tmp/test0'.
[2011-09-29 00:25:40.989987] I [afr-common.c:3455:afr_notify] 0-test-replicate-0: Subvolume 'test-client-0' came back up; going online.
[2011-09-29 00:25:40.990024] I [client-handshake.c:917:client_setvolume_cbk] 0-test-client-1: Connected to 10.168.100.242:24010, attached to remote volume '/var/tmp/test1'.
[2011-09-29 00:25:40.990043] I [afr-common.c:3459:afr_notify] 0-test-replicate-0: subvol 1 came up, start crawl
[2011-09-29 00:25:40.990060] I [afr-common.c:3554:afr_notify] 0-test-replicate-0: All subvolumes came up, start crawl
[2011-09-29 00:25:40.999254] I [fuse-bridge.c:3340:fuse_graph_setup] 0-fuse: switched to graph 0
[2011-09-29 00:25:40.999478] I [fuse-bridge.c:2924:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.13 kernel 7.16
[2011-09-29 00:25:41.75] I [afr-common.c:1757:afr_set_root_inode_on_first_lookup] 0-test-replicate-0: added root inode
[2011-09-29 00:25:44.974981] I [afr-common.c:1082:afr_detect_self_heal_by_iatt] 0-test-replicate-0: size differs for /healme
[2011-09-29 00:25:44.975019] I [afr-common.c:1233:afr_launch_self_heal] 0-test-replicate-0: background data gfid self-heal triggered. path: /healme, reason: lookup detected pending operations
[2011-09-29 00:25:44.976261] I [afr-self-heal-common.c:967:afr_sh_missing_entries_done] 0-test-replicate-0: split brain found, aborting selfheal of /healme
[2011-09-29 00:25:44.976293] E [afr-self-heal-common.c:2009:afr_self_heal_completion_cbk] 0-test-replicate-0: background data gfid self-heal failed on /healme
[2011-09-29 00:25:44.976352] W [fuse-bridge.c:184:fuse_entry_cbk] 0-glusterfs-fuse: 4: LOOKUP() /healme => -1 (Input/output error)
[2011-09-29 00:25:53.679354] I [afr-common.c:1082:afr_detect_self_heal_by_iatt] 0-test-replicate-0: size differs for /healme
[2011-09-29 00:25:53.679393] I [afr-common.c:1233:afr_launch_self_heal] 0-test-replicate-0: background data gfid self-heal triggered. path: /healme, reason: lookup detected pending operations
[2011-09-29 00:25:53.680630] I [afr-self-heal-common.c:967:afr_sh_missing_entries_done] 0-test-replicate-0: split brain found, aborting selfheal of /healme
[2011-09-29 00:25:53.680663] E [afr-self-heal-common.c:2009:afr_self_heal_completion_cbk] 0-test-replicate-0: background data gfid self-heal failed on /healme
[2011-09-29 00:25:53.680689] W [fuse-bridge.c:184:fuse_entry_cbk] 0-glusterfs-fuse: 6: LOOKUP() /healme => -1 (Input/output error)
[2011-09-29 00:25:57.975394] I [afr-common.c:1082:afr_detect_self_heal_by_iatt] 0-test-replicate-0: size differs for /healme
[2011-09-29 00:25:57.975430] I [afr-common.c:1233:afr_launch_self_heal] 0-test-replicate-0: background data gfid self-heal triggered. path: /healme, reason: lookup detected pending operations
[2011-09-29 00:25:57.976869] I [afr-self-heal-common.c:967:afr_sh_missing_entries_done] 0-test-replicate-0: split brain found, aborting selfheal of /healme
[2011-09-29 00:25:57.976903] E [afr-self-heal-common.c:2009:afr_self_heal_completion_cbk] 0-test-replicate-0: background data gfid self-heal failed on /healme
[2011-09-29 00:25:57.976929] W [fuse-bridge.c:184:fuse_entry_cbk] 0-glusterfs-fuse: 8: LOOKUP() /healme => -1 (Input/output error)
[2011-09-29 00:26:02.351169] I [afr-common.c:1082:afr_detect_self_heal_by_iatt] 0-test-replicate-0: size differs for /healme
[2011-09-29 00:26:02.351206] I [afr-common.c:1233:afr_launch_self_heal] 0-test-replicate-0: background data gfid self-heal triggered. path: /healme, reason: lookup detected pending operations
[2011-09-29 00:26:02.352374] I [afr-self-heal-common.c:967:afr_sh_missing_entries_done] 0-test-replicate-0: split brain found, aborting selfheal of /healme
[2011-09-29 00:26:02.352406] E [afr-self-heal-common.c:2009:afr_self_heal_completion_cbk] 0-test-replicate-0: background data gfid self-heal failed on /healme
[2011-09-29 00:26:02.352435] W [fuse-bridge.c:184:fuse_entry_cbk] 0-glusterfs-fuse: 15: LOOKUP() /healme => -1 (Input/output error)
[2011-09-29 00:26:05.590499] I [afr-common.c:1082:afr_detect_self_heal_by_iatt] 0-test-replicate-0: size differs for /healme
[2011-09-29 00:26:05.590536] I [afr-common.c:1233:afr_launch_self_heal] 0-test-replicate-0: background data gfid self-heal triggered. path: /healme, reason: lookup detected pending operations
[2011-09-29 00:26:05.591704] I [afr-self-heal-common.c:967:afr_sh_missing_entries_done] 0-test-replicate-0: split brain found, aborting selfheal of /healme
[2011-09-29 00:26:05.591737] E [afr-self-heal-common.c:2009:afr_self_heal_completion_cbk] 0-test-replicate-0: background data gfid self-heal failed on /healme
[2011-09-29 00:26:05.591766] W [fuse-bridge.c:184:fuse_entry_cbk] 0-glusterfs-fuse: 26: LOOKUP() /healme => -1 (Input/output error)
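The client log repeats the same five-line pattern on every lookup, so it can help to reduce it to one line per affected path. Below is a small sketch that counts the `split brain found, aborting selfheal of` messages per path. The function name `split_brain_paths` is hypothetical, and the match relies only on the message text as it appears in the log above, which may differ in other GlusterFS versions.

```shell
#!/bin/sh
# Hypothetical helper (not part of GlusterFS).
# Print "<count> <path>" for every path whose self-heal was aborted
# because of split brain. Reads the named log files, or stdin if none.
split_brain_paths() {
    sed -n 's/.*split brain found, aborting selfheal of \(.*\)$/\1/p' "$@" |
        sort |
        uniq -c |
        awk '{ n = $1; sub(/^ *[0-9]+ /, ""); print n, $0 }'
}
```

For example, `split_brain_paths /var/log/glusterfs/mnt-test-.log` run against the excerpt above would print `5 /healme`, one abort for each failed `LOOKUP()`.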
Hi, I think I already fixed this as part of bug 3557. The patch is under review and is not on master yet.
pranith @ ~/workspace/gerrit-repo/build 06:43:11 :) $ sudo gluster volume create vol replica 2 `hostname`:/tmp/{1,2}
Multiple bricks of a replicate volume are present on the same server. This setup is not optimal.
Do you still want to continue creating the volume? (y/n) y
Creation of volume vol has been successful. Please start the volume to access data.
pranith @ ~/workspace/gerrit-repo/build 06:44:10 :( $ sudo su -
root@pranith-Dell-System-Vostro-3450:~# echo healme > /tmp/1/healme
root@pranith-Dell-System-Vostro-3450:~# gluster volume start vol
Starting volume vol has been successful
root@pranith-Dell-System-Vostro-3450:~# sudo mount -t glusterfs `hostname`:/vol /mnt/
root@pranith-Dell-System-Vostro-3450:~# cd !$
cd /mnt/
root@pranith-Dell-System-Vostro-3450:/mnt# ls -l
total 8
-rw-r--r-- 1 root root 7 2011-09-29 06:44 healme
root@pranith-Dell-System-Vostro-3450:/mnt# cat healme
healme
root@pranith-Dell-System-Vostro-3450:/mnt# getfattr -d -m . -e hex /tmp/{1,2}/healme
getfattr: Removing leading '/' from absolute path names
# file: tmp/1/healme
trusted.afr.vol-client-0=0x000000000000000000000000
trusted.afr.vol-client-1=0x000000000000000000000000
trusted.gfid=0x1b0b27b920ff49368ce8047fa5d5fcee
# file: tmp/2/healme
trusted.afr.vol-client-0=0x000000000000000000000000
trusted.afr.vol-client-1=0x000000000000000000000000
trusted.gfid=0x1b0b27b920ff49368ce8047fa5d5fcee
Pranith
*** This bug has been marked as a duplicate of bug 3557 ***
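In the healed state shown in this reply, every `trusted.afr.*` value is all zeros. As a reading aid, here is a sketch that decodes one such value into decimal counters. It assumes the commonly described layout of the replicate changelog attribute as three 32-bit big-endian counters (data, metadata, entry); that layout is not stated in this report, and the function name `decode_afr_pending` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not part of GlusterFS).
# Decode one trusted.afr.* value ("0x" + 24 hex digits) into three
# decimal counters. ASSUMPTION: the value is three 32-bit big-endian
# counters in the order data, metadata, entry.
decode_afr_pending() {
    hex=${1#0x}
    case $hex in
        *[!0-9a-fA-F]*)
            echo "not a hex value: $1" >&2
            return 1 ;;
    esac
    if [ "${#hex}" -ne 24 ]; then
        echo "unexpected length: $1" >&2
        return 1
    fi
    data=$(printf '%s' "$hex" | cut -c1-8)
    meta=$(printf '%s' "$hex" | cut -c9-16)
    entry=$(printf '%s' "$hex" | cut -c17-24)
    # printf %d accepts C-style hex constants such as 0x0000000a.
    printf 'data=%d metadata=%d entry=%d\n' "0x$data" "0x$meta" "0x$entry"
}
```

Under that assumption, `decode_afr_pending 0x000000000000000000000000` prints `data=0 metadata=0 entry=0`, meaning no operations are recorded as pending against that brick.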