Bug 1474736

Summary: Split-brain observed on a file during remove-brick operation
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Prasad Desala <tdesala>
Component: replicate Assignee: Karthik U S <ksubrahm>
Status: CLOSED DUPLICATE QA Contact: Nag Pavan Chilakam <nchilaka>
Severity: high Docs Contact:
Priority: unspecified    
Version: rhgs-3.3 CC: ravishankar, rhs-bugs, storage-qa-internal
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-08-24 09:18:02 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Prasad Desala 2017-07-25 09:55:43 UTC
Description of problem:
========================
Split-brain observed on a file during remove-brick operation

Version-Release number of selected component (if applicable):
3.8.4-35.el7rhgs.x86_64

How reproducible:
Reporting the first occurrence.

Steps to Reproduce:
===================
1) Create a 4x2 volume and start it.
2) FUSE mount it on multiple clients.
3) From a client, fill the mount point with data until the back-end bricks are 95% full.
4) Scale up the volume to 6x2 by adding bricks.
5) Set the min-free-disk value to 30% using the command below:
gluster v set <vol-name> cluster.min-free-disk 30%
6) Remove 2 bricks and wait till the migration completes.
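
The steps above could be scripted roughly as follows. This is a sketch, not the reporter's actual setup: the volume name, host names, brick paths, and the choice of which replica pair to remove are all hypothetical, and steps 2 and 3 (mounting and filling the volume) are left out. The function only builds the gluster command lines; it does not run them.

```python
import shlex


def repro_commands(vol, hosts, brick_root="/bricks"):
    """Build gluster CLI command lines (as argv lists) for steps 1, 4, 5 and 6.

    `vol`, `hosts` and `brick_root` are hypothetical placeholders.
    """
    def pair(i):
        # One replica pair: the same brick index on two different hosts.
        return [f"{hosts[0]}:{brick_root}/b{i}", f"{hosts[1]}:{brick_root}/b{i}"]

    initial = [b for i in range(4) for b in pair(i)]   # 4x2 layout
    added = [b for i in range(4, 6) for b in pair(i)]  # grows it to 6x2
    removed = pair(0)                                  # assumed: remove one replica pair

    g = ["gluster", "volume"]
    return [
        g + ["create", vol, "replica", "2"] + initial,
        g + ["start", vol],
        g + ["add-brick", vol] + added,
        g + ["set", vol, "cluster.min-free-disk", "30%"],
        g + ["remove-brick", vol] + removed + ["start"],
        g + ["remove-brick", vol] + removed + ["status"],
    ]


if __name__ == "__main__":
    for argv in repro_commands("distrep", ["server1", "server2"]):
        print(shlex.join(argv))
```

The last command would be repeated until the migration is reported as completed.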

Rebalance logs:
===============
[2017-07-25 08:34:52.456723] E [MSGID: 108008] [afr-read-txn.c:90:afr_read_txn_refresh_done] 0-distrep-replicate-0: Failing GETXATTR on gfid 2a899336-e575-45fc-be5b-8fe9633877ad: split-brain observed. [Input/output error]
[2017-07-25 08:34:52.457596] W [MSGID: 109023] [dht-rebalance.c:1689:dht_migrate_file] 0-distrep-dht: Migrate file failed:/dd_320: failed to get xattr from distrep-replicate-0 (Input/output error)
[2017-07-25 08:34:52.463645] E [MSGID: 101046] [afr-inode-write.c:1719:afr_fsetxattr] 0-distrep-replicate-4: setxattr dict is null
[2017-07-25 08:34:52.463748] W [MSGID: 109023] [dht-rebalance.c:810:__dht_rebalance_create_dst_file] 0-distrep-dht: /dd_320: failed to set xattr on distrep-replicate-4 (Cannot allocate memory)
[2017-07-25 08:34:52.509122] W [MSGID: 0] [dht-rebalance.c:983:__dht_check_free_space] 0-distrep-dht: Write will cross min-free-disk for file - /dd_320 on subvol - distrep-replicate-4. Looking for new subvol
[2017-07-25 08:34:52.509364] I [MSGID: 0] [dht-rebalance.c:1042:__dht_check_free_space] 0-distrep-dht: new target found - distrep-replicate-5 for file - /dd_320
[2017-07-25 08:34:52.625699] I [dht-rebalance.c:4805:gf_defrag_get_estimates_based_on_size] 0-glusterfs: TIME: (size) total_processed=964690013 tmp_cnt = 3327172608,rate_processed=30146562.906250, elapsed = 32.000000
[2017-07-25 08:34:52.625877] I [dht-rebalance.c:4953:gf_defrag_status_get] 0-glusterfs: TIME: Estimated total time to complete (size)= 110 seconds, seconds left = 78
[2017-07-25 08:34:52.626025] I [MSGID: 109028] [dht-rebalance.c:5033:gf_defrag_status_get] 0-glusterfs: Rebalance is in progress. Time taken is 32.00 secs
[2017-07-25 08:34:52.626079] I [MSGID: 109028] [dht-rebalance.c:5037:gf_defrag_status_get] 0-glusterfs: Files migrated: 46, size: 964689920, lookups: 162, failures: 0, skipped: 0
[2017-07-25 08:34:52.826438] I [MSGID: 0] [dht-rebalance.c:1747:dht_migrate_file] 0-distrep-dht: destination for file - /dd_299 is changed to - distrep-replicate-5
[2017-07-25 08:34:52.876652] E [MSGID: 101046] [afr-inode-write.c:1719:afr_fsetxattr] 0-distrep-replicate-5: setxattr dict is null
[2017-07-25 08:34:52.876790] W [MSGID: 109023] [dht-rebalance.c:810:__dht_rebalance_create_dst_file] 0-distrep-dht: /dd_320: failed to set xattr on distrep-replicate-5 (Cannot allocate memory)
[2017-07-25 08:34:52.895299] I [MSGID: 109022] [dht-rebalance.c:2211:dht_migrate_file] 0-distrep-dht: completed migration of /dd_286 from subvolume distrep-replicate-0 to distrep-replicate-5
[2017-07-25 08:34:53.025934] I [MSGID: 0] [dht-rebalance.c:1747:dht_migrate_file] 0-distrep-dht: destination for file - /dd_320 is changed to - distrep-replicate-5
[2017-07-25 08:34:53.030821] E [MSGID: 108008] [afr-transaction.c:2616:afr_write_txn_refresh_done] 0-distrep-replicate-0: Failing SETXATTR on gfid 2a899336-e575-45fc-be5b-8fe9633877ad: split-brain observed. [Input/output error]
[2017-07-25 08:34:53.031088] E [MSGID: 109023] [dht-rebalance.c:1255:__dht_rebalance_open_src_file] 0-distrep-dht: failed to set xattr on /dd_320 in distrep-replicate-0 (Input/output error)
[2017-07-25 08:34:53.031155] E [MSGID: 109023] [dht-rebalance.c:1766:dht_migrate_file] 0-distrep-dht: Migrate file failed: failed to open /dd_320 on distrep-replicate-0
[2017-07-25 08:34:53.064356] I [dht-rebalance.c:1579:dht_migrate_file] 0-distrep-dht: /dd_324: attempting to move from distrep-replicate-2 to distrep-replicate-5
[2017-07-25 08:34:53.295269] E [MSGID: 109023] [dht-rebalance.c:2744:gf_defrag_migrate_single_file] 0-distrep-dht: migrate-data failed for /dd_320 [Input/output error]
[2017-07-25 08:34:54.453912] I [MSGID: 109022] [dht-rebalance.c:2211:dht_migrate_file] 0-distrep-dht: completed migration of /dd_324 from subvolume distrep-replicate-2 to distrep-replicate-5
[2017-07-25 08:34:54.515583] I [MSGID: 109022] [dht-rebalance.c:2211:dht_migrate_file] 0-distrep-dht: completed migration of /dd_297 from subvolume distrep-replicate-2 to distrep-replicate-5
[2017-07-25 08:34:54.672775] I [MSGID: 109022] [dht-rebalance.c:2211:dht_migrate_file] 0-distrep-dht: completed migration of /dd_299 from subvolume distrep-replicate-2 to distrep-replicate-5
[2017-07-25 08:34:54.686976] I [MSGID: 109028] [dht-rebalance.c:5033:gf_defrag_status_get] 0-distrep-dht: Rebalance is completed. Time taken is 34.00 secs
[2017-07-25 08:34:54.687079] I [MSGID: 109028] [dht-rebalance.c:5037:gf_defrag_status_get] 0-distrep-dht: Files migrated: 50, size: 1046614016, lookups: 162, failures: 1, skipped: 0
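
For longer rebalance logs, the failed migrations and the final counters can be pulled out mechanically. The following is a minimal sketch that assumes the message wording shown in the excerpt above; the helper names are made up for illustration, and the sample lines are copied from this log.

```python
import re

# Patterns are based only on the message text seen in the excerpt above.
MIGRATE_FAIL = re.compile(r"migrate-data failed for (?P<path>\S+) \[(?P<err>[^\]]+)\]")
SUMMARY = re.compile(
    r"Files migrated: (?P<migrated>\d+), size: (?P<size>\d+), "
    r"lookups: (?P<lookups>\d+), failures: (?P<failures>\d+), skipped: (?P<skipped>\d+)"
)


def failed_migrations(lines):
    """Return (path, error) for every 'migrate-data failed' line."""
    return [(m["path"], m["err"]) for m in map(MIGRATE_FAIL.search, lines) if m]


def last_summary(lines):
    """Return the counters from the last 'Files migrated' line, or None."""
    summary = None
    for line in lines:
        m = SUMMARY.search(line)
        if m:
            summary = {key: int(value) for key, value in m.groupdict().items()}
    return summary


SAMPLE = '''
[2017-07-25 08:34:52.626079] I [MSGID: 109028] [dht-rebalance.c:5037:gf_defrag_status_get] 0-glusterfs: Files migrated: 46, size: 964689920, lookups: 162, failures: 0, skipped: 0
[2017-07-25 08:34:53.295269] E [MSGID: 109023] [dht-rebalance.c:2744:gf_defrag_migrate_single_file] 0-distrep-dht: migrate-data failed for /dd_320 [Input/output error]
[2017-07-25 08:34:54.687079] I [MSGID: 109028] [dht-rebalance.c:5037:gf_defrag_status_get] 0-distrep-dht: Files migrated: 50, size: 1046614016, lookups: 162, failures: 1, skipped: 0
'''.strip().splitlines()

if __name__ == "__main__":
    print(failed_migrations(SAMPLE))
    print(last_summary(SAMPLE))
```

On the sample lines, the single failed file (/dd_320) matches the "failures: 1" counter in the final status line.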

FUSE logs:
==========
[2017-07-25 07:03:47.088858] E [MSGID: 108008] [afr-transaction.c:2616:afr_write_txn_refresh_done] 0-distrep-replicate-3: Failing WRITE on gfid 0078de69-a259-4dfd-b598-df3b22ace51a: split-brain observed. [Input/output error]
The message "E [MSGID: 108008] [afr-transaction.c:2616:afr_write_txn_refresh_done] 0-distrep-replicate-3: Failing WRITE on gfid 0078de69-a259-4dfd-b598-df3b22ace51a: split-brain observed. [Input/output error]" repeated 6 times between [2017-07-25 07:03:47.088858] and [2017-07-25 07:03:47.137491]
[2017-07-25 07:03:47.142314] E [MSGID: 108008] [afr-read-txn.c:90:afr_read_txn_refresh_done] 0-distrep-replicate-3: Failing FGETXATTR on gfid 0078de69-a259-4dfd-b598-df3b22ace51a: split-brain observed. [Input/output error]
The message "E [MSGID: 108008] [afr-read-txn.c:90:afr_read_txn_refresh_done] 0-distrep-replicate-3: Failing FGETXATTR on gfid 0078de69-a259-4dfd-b598-df3b22ace51a: split-brain observed. [Input/output error]" repeated 6 times between [2017-07-25 07:03:47.142314] and [2017-07-25 07:03:47.150487]
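
To see which files and subvolumes are affected, the MSGID 108008 messages can be grouped by subvolume, gfid, and failing operation. This is a sketch with a hypothetical helper name. It assumes that a "repeated N times" line stands for N occurrences in addition to the line logged before it; the excerpt alone does not settle that, so the totals are estimates.

```python
import re
from collections import Counter

SPLIT_BRAIN = re.compile(
    r"\[MSGID: 108008\] \[[^\]]+\] (?P<subvol>\S+): "
    r"Failing (?P<fop>\w+) on gfid (?P<gfid>[0-9a-f-]{36}): split-brain observed"
)
REPEATED = re.compile(r"repeated (\d+) times between")


def split_brain_counts(lines):
    """Count split-brain failures per (subvolume, gfid, operation)."""
    counts = Counter()
    for line in lines:
        m = SPLIT_BRAIN.search(line)
        if not m:
            continue
        repeat = REPEATED.search(line)
        # Assumption: a summary line adds N occurrences; a plain line adds one.
        counts[(m["subvol"], m["gfid"], m["fop"])] += int(repeat.group(1)) if repeat else 1
    return counts


SAMPLE = '''
[2017-07-25 07:03:47.088858] E [MSGID: 108008] [afr-transaction.c:2616:afr_write_txn_refresh_done] 0-distrep-replicate-3: Failing WRITE on gfid 0078de69-a259-4dfd-b598-df3b22ace51a: split-brain observed. [Input/output error]
The message "E [MSGID: 108008] [afr-transaction.c:2616:afr_write_txn_refresh_done] 0-distrep-replicate-3: Failing WRITE on gfid 0078de69-a259-4dfd-b598-df3b22ace51a: split-brain observed. [Input/output error]" repeated 6 times between [2017-07-25 07:03:47.088858] and [2017-07-25 07:03:47.137491]
[2017-07-25 07:03:47.142314] E [MSGID: 108008] [afr-read-txn.c:90:afr_read_txn_refresh_done] 0-distrep-replicate-3: Failing FGETXATTR on gfid 0078de69-a259-4dfd-b598-df3b22ace51a: split-brain observed. [Input/output error]
'''.strip().splitlines()

if __name__ == "__main__":
    for (subvol, gfid, fop), n in split_brain_counts(SAMPLE).items():
        print(subvol, gfid, fop, n)
```

Run over both log files, a grouping like this makes it easy to notice that the FUSE log reports a gfid on distrep-replicate-3, while the rebalance log reports a different gfid on distrep-replicate-0.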

Actual results:
===============
Split-brain was observed on a file during the remove-brick operation, and the operation failed to migrate that file.

Expected results:
=================
No files should be in split-brain, and the remove-brick operation should migrate all files without any failures.

Comment 4 Karthik U S 2017-07-26 05:16:18 UTC
This bug is not related to remove-brick. The file dd_320 was already in split-brain before the migration started. From the logs I can see that the first replica set where we found the split-brain was 100% full (not 95% as mentioned in the description). Some writes failed on the alternate replica sets, which caused the split-brain. The writes failed because the disk was full.
There is already a bug [1] for this, and we have not yet decided on a solution.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1459166
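
As background for the explanation above: AFR records pending operations in trusted.afr.* changelog extended attributes on each brick, stored as three 32-bit big-endian counters (data, metadata, entry). When each brick of a replica pair holds a non-zero data counter blaming the other, the file is in data split-brain. The sketch below decodes such values; the hex values are invented for illustration and were not taken from this bug, and reading "alternate" as "each brick of the pair missed some writes" is an interpretation of the comment rather than something the logs here confirm.

```python
import struct


def decode_afr_changelog(hex_value):
    """Decode a 12-byte AFR changelog xattr value given as a hex string."""
    digits = hex_value[2:] if hex_value.startswith("0x") else hex_value
    raw = bytes.fromhex(digits)
    if len(raw) != 12:
        raise ValueError("expected 12 bytes (data, metadata, entry counters)")
    data, metadata, entry = struct.unpack(">III", raw)
    return {"data": data, "metadata": metadata, "entry": entry}


def is_data_split_brain(brick0_blames_brick1, brick1_blames_brick0):
    """True when both bricks record pending data operations against each other."""
    return (
        decode_afr_changelog(brick0_blames_brick1)["data"] > 0
        and decode_afr_changelog(brick1_blames_brick0)["data"] > 0
    )


if __name__ == "__main__":
    # Hypothetical values: each brick missed some writes the other one accepted.
    on_brick0 = "0x000000020000000000000000"
    on_brick1 = "0x000000010000000000000000"
    print(decode_afr_changelog(on_brick0))
    print(is_data_split_brain(on_brick0, on_brick1))
```

If only one brick blames the other, self-heal has an unambiguous source and the file is not in split-brain.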

Comment 7 Karthik U S 2017-08-24 09:18:02 UTC
Closing this as it's a duplicate of bz #1459166

*** This bug has been marked as a duplicate of bug 1459166 ***