Bug 825559 - [glusterfs-3.3.0q43]: Cannot heal split-brain
Summary: [glusterfs-3.3.0q43]: Cannot heal split-brain
Keywords:
Status: CLOSED EOL
Alias: None
Product: GlusterFS
Classification: Community
Component: replicate
Version: pre-release
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Assignee: Divya
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2012-05-27 19:38 UTC by Joe Julian
Modified: 2015-10-22 15:40 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2015-10-22 15:40:20 UTC
Regression: ---
Mount Type: fuse
Documentation: DP
CRM:
Verified Versions:
Embargoed:


Attachments
Client log (18.34 KB, text/x-log)
2012-05-27 19:38 UTC, Joe Julian


Links
System ID Private Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 193843 0 None None None 2012-08-22 13:33:39 UTC

Description Joe Julian 2012-05-27 19:38:34 UTC
Created attachment 587108
Client log

Description of problem:
A file on a replica 3 volume was in split-brain. I deleted the two bad copies directly from their bricks, then ran stat on the file through the client mount. The client log reported the split-brain condition again, and the bad copies reappeared on the two servers from which I had removed them.

Version-Release number of selected component (if applicable):
3.3.0qa34

How reproducible:
Always

Steps to Reproduce:
1. Have a file in split-brain condition on a replica 3 setup
2. Delete the two incorrect copies from their bricks
3. stat the file through the fuse client mount
  
Actual results:
The "bad" copies of the file were returned to existence

Expected results:
"good" file should have been healed to the other two servers

Additional info:
gluster volume info mysql1
 
Volume Name: mysql1
Type: Distributed-Replicate
Volume ID: 713e6220-7eb2-4d12-9cd7-fe800b9b741a
Status: Started
Number of Bricks: 4 x 3 = 12
Transport-type: tcp
Bricks:
Brick1: ewcs2:/var/spool/glusterfs/a_mysql1
Brick2: ewcs4:/var/spool/glusterfs/a_mysql1
Brick3: ewcs7:/var/spool/glusterfs/a_mysql1
Brick4: ewcs2:/var/spool/glusterfs/b_mysql1
Brick5: ewcs4:/var/spool/glusterfs/b_mysql1
Brick6: ewcs7:/var/spool/glusterfs/b_mysql1
Brick7: ewcs2:/var/spool/glusterfs/c_mysql1
Brick8: ewcs4:/var/spool/glusterfs/c_mysql1
Brick9: ewcs7:/var/spool/glusterfs/c_mysql1
Brick10: ewcs2:/var/spool/glusterfs/d_mysql1
Brick11: ewcs4:/var/spool/glusterfs/d_mysql1
Brick12: ewcs7:/var/spool/glusterfs/d_mysql1
Options Reconfigured:
performance.io-cache: off
performance.write-behind: off
performance.flush-behind: off
performance.read-ahead: off
performance.quick-read: off
performance.io-thread-count: 1
nfs.disable: on

### INITIAL STATE
[root@ewcs2 glusterfs]# stat ?_mysql1/mariadb/mysql/tables_priv.MYI
  File: `a_mysql1/mariadb/mysql/tables_priv.MYI'
  Size: 4096            Blocks: 8          IO Block: 4096   regular file
Device: fd15h/64789d    Inode: 131692      Links: 2
Access: (0660/-rw-rw----)  Uid: (  101/ UNKNOWN)   Gid: (  104/ UNKNOWN)
Access: 2012-05-27 02:42:44.614663982 -0700
Modify: 2012-05-12 09:54:21.593590726 -0700
Change: 2012-05-27 12:09:28.253590969 -0700
  File: `b_mysql1/mariadb/mysql/tables_priv.MYI'
  Size: 0               Blocks: 0          IO Block: 4096   regular empty file
Device: fd16h/64790d    Inode: 131279      Links: 2
Access: (1000/---------T)  Uid: (  101/ UNKNOWN)   Gid: (  104/ UNKNOWN)
Access: 2011-10-19 07:58:05.222373479 -0700
Modify: 2011-10-19 07:58:05.222373479 -0700
Change: 2012-05-27 02:42:44.580645011 -0700
[root@ewcs4 glusterfs]# stat ?_mysql1/mariadb/mysql/tables_priv.MYI
  File: `a_mysql1/mariadb/mysql/tables_priv.MYI'
  Size: 481067008       Blocks: 7056       IO Block: 4096   regular file
Device: fd1fh/64799d    Inode: 1387        Links: 2
Access: (0660/-rw-rw----)  Uid: (  101/memcached)   Gid: (  104/   mysql)
Access: 2012-05-07 08:00:11.015362130 -0700
Modify: 2012-05-27 10:42:11.256736693 -0700
Change: 2012-05-27 12:09:28.680561940 -0700
  File: `b_mysql1/mariadb/mysql/tables_priv.MYI'
  Size: 0               Blocks: 0          IO Block: 4096   regular empty file
Device: fd20h/64800d    Inode: 157         Links: 2
Access: (1000/---------T)  Uid: (  101/memcached)   Gid: (  104/   mysql)
Access: 2011-10-19 07:58:05.229457618 -0700
Modify: 2011-10-19 07:58:05.229457618 -0700
Change: 2012-05-27 02:42:44.579439785 -0700
[root@ewcs7 glusterfs]# stat ?_mysql1/mariadb/mysql/tables_priv.MYI
  File: `a_mysql1/mariadb/mysql/tables_priv.MYI'
  Size: 4096            Blocks: 16         IO Block: 4096   regular file
Device: fd0eh/64782d    Inode: 1061        Links: 2
Access: (0660/-rw-rw----)  Uid: (  101/ UNKNOWN)   Gid: (  104/ UNKNOWN)
Access: 2012-03-29 07:02:57.347548848 -0700
Modify: 2012-05-12 09:54:21.610919731 -0700
Change: 2012-05-27 12:09:28.254849964 -0700
  File: `b_mysql1/mariadb/mysql/tables_priv.MYI'
  Size: 0               Blocks: 0          IO Block: 4096   regular empty file
Device: fd0fh/64783d    Inode: 1109        Links: 2
Access: (1000/---------T)  Uid: (  101/ UNKNOWN)   Gid: (  104/ UNKNOWN)
Access: 2011-10-19 07:58:05.223467302 -0700
Modify: 2011-10-19 07:58:05.223467302 -0700
Change: 2012-05-27 02:42:44.582881366 -0700

### Trigger self-heal check
stat /mnt/gluster/mysql1/mariadb/mysql/tables_priv.MYI
  File: `/mnt/gluster/mysql1/mariadb/mysql/tables_priv.MYI'
  Size: 4096      	Blocks: 16         IO Block: 131072 regular file
Device: 16h/22d	Inode: 9904536171659694209  Links: 1
Access: (0660/-rw-rw----)  Uid: (  101/   mysql)   Gid: (  104/   mysql)
Access: 2012-03-29 07:02:57.347548848 -0700
Modify: 2012-05-12 09:54:21.610919731 -0700
Change: 2012-05-27 12:09:28.254849964 -0700

### Client Log
[2012-05-27 12:15:51.595646] W [afr-self-heal-data.c:826:afr_lookup_select_read_child_by_txn_type] 0-mysql1-replicate-0: /mariadb/mysql/tables_priv.MYI: Possible split-brain
[2012-05-27 12:15:51.595753] W [afr-common.c:1227:afr_detect_self_heal_by_lookup_status] 0-mysql1-replicate-0: split brain detected during lookup of /mariadb/mysql/tables_priv.MYI.
[2012-05-27 12:15:51.595778] I [afr-common.c:1189:afr_detect_self_heal_by_iatt] 0-mysql1-replicate-0: size differs for /mariadb/mysql/tables_priv.MYI 
[2012-05-27 12:15:51.595849] I [afr-common.c:1341:afr_launch_self_heal] 0-mysql1-replicate-0: background  data missing-entry gfid self-heal triggered. path: /mariadb/mysql/tables_priv.MYI, reason: lookup detected pending operations
[2012-05-27 12:15:51.597998] I [afr-self-heal-common.c:1318:afr_sh_missing_entries_lookup_done] 0-mysql1-replicate-0: No sources for dir of /mariadb/mysql/tables_priv.MYI, in missing entry self-heal, continuing with the rest of the self-heals
[2012-05-27 12:15:51.601431] E [afr-self-heal-data.c:769:afr_sh_data_fxattrop_fstat_done] 0-mysql1-replicate-0: Unable to self-heal contents of '/mariadb/mysql/tables_priv.MYI' (possible split-brain). Please delete the file from all but the preferred subvolume.
[2012-05-27 12:15:51.602002] E [afr-self-heal-common.c:2156:afr_self_heal_completion_cbk] 0-mysql1-replicate-0: background  data missing-entry gfid self-heal failed on /mariadb/mysql/tables_priv.MYI

### Remove offending copies (and their sticky pointers)
[root@ewcs2 glusterfs]# rm ?_mysql1/mariadb/mysql/tables_priv.MYI
rm: remove regular file `a_mysql1/mariadb/mysql/tables_priv.MYI'? y
rm: remove regular empty file `b_mysql1/mariadb/mysql/tables_priv.MYI'? y
[root@ewcs7 glusterfs]# rm ?_mysql1/mariadb/mysql/tables_priv.MYI
rm: remove regular file `a_mysql1/mariadb/mysql/tables_priv.MYI'? y
rm: remove regular empty file `b_mysql1/mariadb/mysql/tables_priv.MYI'? y

### Trigger self-heal
[root@mysql1 glusterfs]# stat /mnt/gluster/mysql1/mariadb/mysql/tables_priv.MYI
  File: `/mnt/gluster/mysql1/mariadb/mysql/tables_priv.MYI'
  Size: 481067008 	Blocks: 7056       IO Block: 131072 regular file
Device: 16h/22d	Inode: 9904536171659694209  Links: 1
Access: (0660/-rw-rw----)  Uid: (  101/   mysql)   Gid: (  104/   mysql)
Access: 2012-05-07 08:00:11.015362130 -0700
Modify: 2012-05-27 10:42:11.256736693 -0700
Change: 2012-05-27 12:15:51.594837600 -0700

### Results
[root@ewcs2 glusterfs]# stat ?_mysql1/mariadb/mysql/tables_priv.MYI
  File: `a_mysql1/mariadb/mysql/tables_priv.MYI'
  Size: 4096            Blocks: 8          IO Block: 4096   regular file
Device: fd15h/64789d    Inode: 131692      Links: 2
Access: (0660/-rw-rw----)  Uid: (  101/ UNKNOWN)   Gid: (  104/ UNKNOWN)
Access: 2012-05-27 02:42:44.614663982 -0700
Modify: 2012-05-12 09:54:21.593590726 -0700
Change: 2012-05-27 12:18:32.113589824 -0700
[root@ewcs4 glusterfs]# stat ?_mysql1/mariadb/mysql/tables_priv.MYI
  File: `a_mysql1/mariadb/mysql/tables_priv.MYI'
  Size: 481067008       Blocks: 7056       IO Block: 4096   regular file
Device: fd1fh/64799d    Inode: 1387        Links: 2
Access: (0660/-rw-rw----)  Uid: (  101/memcached)   Gid: (  104/   mysql)
Access: 2012-05-07 08:00:11.015362130 -0700
Modify: 2012-05-27 10:42:11.256736693 -0700
Change: 2012-05-27 12:18:32.113699390 -0700
  File: `b_mysql1/mariadb/mysql/tables_priv.MYI'
  Size: 0               Blocks: 0          IO Block: 4096   regular empty file
Device: fd20h/64800d    Inode: 157         Links: 2
Access: (1000/---------T)  Uid: (  101/memcached)   Gid: (  104/   mysql)
Access: 2011-10-19 07:58:05.229457618 -0700
Modify: 2011-10-19 07:58:05.229457618 -0700
Change: 2012-05-27 02:42:44.579439785 -0700
[root@ewcs7 glusterfs]# stat ?_mysql1/mariadb/mysql/tables_priv.MYI
  File: `a_mysql1/mariadb/mysql/tables_priv.MYI'
  Size: 4096            Blocks: 16         IO Block: 4096   regular file
Device: fd0eh/64782d    Inode: 1061        Links: 2
Access: (0660/-rw-rw----)  Uid: (  101/ UNKNOWN)   Gid: (  104/ UNKNOWN)
Access: 2012-03-29 07:02:57.347548848 -0700
Modify: 2012-05-12 09:54:21.610919731 -0700
Change: 2012-05-27 12:18:32.114921148 -0700

### New client log entries
[2012-05-27 12:18:32.111760] I [afr-common.c:1215:afr_detect_self_heal_by_lookup_status] 0-mysql1-replicate-0: entries are missing in lookup of /mariadb/mysql/tables_priv.MYI.
[2012-05-27 12:18:32.111873] I [afr-common.c:1341:afr_launch_self_heal] 0-mysql1-replicate-0: background  meta-data data entry missing-entry gfid self-heal triggered. path: /mariadb/mysql/tables_priv.MYI, reason: lookup detected pending operations
[2012-05-27 12:18:32.113830] E [afr-self-heal-common.c:1087:afr_sh_common_lookup_resp_handler] 0-mysql1-replicate-0: path /mariadb/mysql/tables_priv.MYI on subvolume mysql1-client-0 => -1 (No such file or directory)
[2012-05-27 12:18:32.113941] E [afr-self-heal-common.c:1087:afr_sh_common_lookup_resp_handler] 0-mysql1-replicate-0: path /mariadb/mysql/tables_priv.MYI on subvolume mysql1-client-2 => -1 (No such file or directory)
[2012-05-27 12:18:32.120354] E [afr-self-heal-data.c:769:afr_sh_data_fxattrop_fstat_done] 0-mysql1-replicate-0: Unable to self-heal contents of '/mariadb/mysql/tables_priv.MYI' (possible split-brain). Please delete the file from all but the preferred subvolume.
[2012-05-27 12:18:32.121019] E [afr-self-heal-common.c:2156:afr_self_heal_completion_cbk] 0-mysql1-replicate-0: background  meta-data data entry missing-entry gfid self-heal failed on /mariadb/mysql/tables_priv.MYI
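
Each brick copy in the stat output above reports Links: 2, so a second hard link to the same inode exists somewhere on the brick filesystem. A minimal sketch for listing every name that shares the inode, assuming GNU find and that it is run on the brick server; list_hard_links is a hypothetical helper name, not a gluster command:

```shell
# Sketch: list every path under a brick that is a hard link to the given file.
# list_hard_links is a hypothetical helper name, not part of glusterfs.
list_hard_links() {
    brick_root=$1
    file=$2
    # -xdev keeps the search on the brick's own filesystem;
    # -samefile matches every path that shares the inode of "$file".
    find "$brick_root" -xdev -samefile "$file"
}
```

Run as, for example, list_hard_links a_mysql1 a_mysql1/mariadb/mysql/tables_priv.MYI from the brick parent directory. Any path printed besides the file itself is another name for the same data and survives an rm of the named path.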

Comment 1 Joe Julian 2012-05-28 06:46:27 UTC
Must be tired. I've been saying 34 half the time when I mean 43. This is qa43.

Comment 2 Pranith Kumar K 2012-05-28 07:22:27 UTC
Joe,
    In the 3.3 versions of glusterfs, a file is stored both at its 'path' and inside the gfid backend (the .glusterfs directory on the brick). If you delete only the file at 'path', a link self-heal is performed and the deleted file comes back with all of its split-brain xattrs. Do the following steps to delete the file completely:

1) Get the gfid of the file using getfattr -d -m trusted.gfid -e hex <file-path>. Example output:
trusted.gfid=0x42e989d0c2694d6fad4204b3071a878a

2) On each brick whose copy you want to remove, delete the file <brick-path>/.glusterfs/42/e9/42e989d0-c269-4d6f-ad42-04b3071a878a along with the <file-path>.
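
The mapping from the gfid in step 1 to the backend path in step 2 can be sketched as a small shell helper. This is only a sketch, assuming the layout shown in step 2 (first two hex digits, next two hex digits, then the gfid in dashed 8-4-4-4-12 form, all under .glusterfs); gfid_backend_path is a hypothetical name, and the function only prints a path, it does not delete anything:

```shell
# Sketch: print the .glusterfs backend path for a gfid on a given brick.
# gfid_backend_path is a hypothetical helper name, not part of glusterfs.
gfid_backend_path() {
    brick=$1
    # Strip the 0x prefix printed by getfattr -e hex and lowercase the digits.
    hex=$(printf '%s' "${2#0x}" | tr 'A-F' 'a-f')
    case $hex in
        *[!0-9a-f]*) echo "not a hex gfid: $2" >&2; return 1 ;;
    esac
    if [ "${#hex}" -ne 32 ]; then
        echo "expected 32 hex digits, got ${#hex}" >&2
        return 1
    fi
    # Dashed form: 8-4-4-4-12 hex digits.
    uuid=$(printf '%s' "$hex" |
        sed 's/^\(.\{8\}\)\(.\{4\}\)\(.\{4\}\)\(.\{4\}\)\(.\{12\}\)$/\1-\2-\3-\4-\5/')
    d1=$(printf '%s' "$hex" | cut -c1-2)
    d2=$(printf '%s' "$hex" | cut -c3-4)
    printf '%s/.glusterfs/%s/%s/%s\n' "$brick" "$d1" "$d2" "$uuid"
}
```

With the example gfid from step 1 and one of the brick paths from the volume info used purely as an illustration, gfid_backend_path /var/spool/glusterfs/a_mysql1 0x42e989d0c2694d6fad4204b3071a878a prints /var/spool/glusterfs/a_mysql1/.glusterfs/42/e9/42e989d0-c269-4d6f-ad42-04b3071a878a. Check the printed path by hand before removing anything.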

Let me know if you still face problems.

Thanks
Pranith.

Comment 3 Joe Julian 2012-05-28 08:44:46 UTC
If this is documented, go ahead and close this. If not, please move it over to Gluster Documentation.

Comment 6 Kaleb KEITHLEY 2015-10-22 15:40:20 UTC
The "pre-release" version is ambiguous and is about to be removed as a choice.

If you believe this is still a bug, please change the status back to NEW and choose the appropriate, applicable version for it.

