Bug 1275751

Summary: Data Tiering:File create terminates with "Input/output error" as split brain is observed
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Nag Pavan Chilakam <nchilaka>
Component: tier
Assignee: Bug Updates Notification Mailing List <rhs-bugs>
Status: CLOSED ERRATA
QA Contact: Sweta Anandpara <sanandpa>
Severity: urgent
Docs Contact:
Priority: urgent
Version: rhgs-3.1
CC: aspandey, asrivast, dlambrig, ravishankar, rhs-bugs, rkavunga, sankarshan, spalai, storage-qa-internal
Target Milestone: ---
Keywords: ZStream
Target Release: RHGS 3.1.2
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: glusterfs-3.7.5-13
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1285230 1286028 1319634 (view as bug list)
Environment:
Last Closed: 2016-03-01 05:45:57 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1260783, 1260923, 1285230, 1286028, 1286029, 1290363, 1291557, 1358823
Attachments: Server and client logs (flags: none)

Description Nag Pavan Chilakam 2015-10-27 16:07:41 UTC
Description of problem:
=========================
When a tier is attached while a file of about 500 MB is being created with dd, the file create terminates with the following errors:

dd: error writing ‘file2’: Input/output error
dd: closing output file ‘file2’: Input/output error

The fuse mount log of the tiered volume (mnt-tin.log) shows the following errors:

[root@rhel7-autofuseclient glusterfs]# cat mnt-tin.log|grep " E "
[2015-10-27 16:56:02.136132] E [MSGID: 108006] [afr-common.c:3881:afr_notify] 0-tin-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up.
[2015-10-27 16:56:02.136370] E [MSGID: 108006] [afr-common.c:3881:afr_notify] 0-tin-replicate-1: All subvolumes are down. Going offline until atleast one of them comes back up.
[2015-10-27 16:57:05.123426] E [MSGID: 109040] [dht-helper.c:1020:dht_migration_complete_check_task] 2-tin-cold-dht: (null): failed to lookup the file on tin-cold-dht [Stale file handle]
[2015-10-27 16:57:05.124695] E [MSGID: 108008] [afr-transaction.c:1975:afr_transaction] 2-tin-replicate-1: Failing WRITE on gfid 81fbd264-4e2d-45b7-9554-cfd030764f36: split-brain observed. [Input/output error]
[root@rhel7-autofuseclient glusterfs]# less mnt-tin.log 



How reproducible:
===================
Not consistent, but seen fairly often.

Steps to Reproduce:
====================
1. Create a dist-rep or EC volume, start it, and enable quota.
2. Mount the volume and use dd to create about 10 files of at least 700 MB each, e.g. "for i in {1..10};do dd if=/dev/urandom of=file$i bs=1024 count=700000;echo $?;done"
3. Wait for the first file create to complete; while the create of file2 is in progress, attach a dist-rep tier and enable CTR immediately (a consolidated sketch of these steps follows).
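
A minimal end-to-end sketch of the steps above, assuming two nodes named node1/node2, a volume named tin, placeholder brick paths and a client mount at /mnt/tin (the gluster commands mirror the ones captured in the LOGS section below):

# create and start a 2x2 dist-rep cold volume, then enable quota
gluster v create tin rep 2 node1:/rhs/brick1/tin node2:/rhs/brick1/tin node1:/rhs/brick2/tin node2:/rhs/brick2/tin
gluster v start tin
gluster v quota tin enable

# fuse mount the volume on the client and start the dd loop in the background
mount -t glusterfs node1:/tin /mnt/tin
cd /mnt/tin
for i in {1..10};do dd if=/dev/urandom of=file$i bs=1024 count=700000;echo $?;done &

# once file1 completes and file2 is being written, attach the hot tier and enable CTR
# (--mode=script suppresses the attach-tier confirmation prompt)
gluster --mode=script v attach-tier tin rep 2 node1:/rhs/brick7/tin_hot node2:/rhs/brick7/tin_hot node1:/rhs/brick6/tin_hot node2:/rhs/brick6/tin_hot
gluster v set tin features.ctr-enabled on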




Actual results:
==============
The following error is seen on the fuse mount:
[root@rhel7-autofuseclient tin]# for i in {1..10};do dd if=/dev/urandom of=file$i bs=1024 count=700000;echo $?;done
700000+0 records in
700000+0 records out
716800000 bytes (717 MB) copied, 408.87 s, 1.8 MB/s
0
dd: error writing ‘file2’: Input/output error
dd: closing output file ‘file2’: Input/output error
1
700000+0 records in
700000+0 records out
716800000 bytes (717 MB) copied, 345.21 s, 2.1 MB/s
0
^C301502+0 records in
301501+0 records out
308737024 bytes (309 MB) copied, 142.061 s, 2.2 MB/s


LOGS:
====


[root@zod ~]#  gluster v create tin rep 2 zod:/rhs/brick1/tin yarrow:/rhs/brick1/tin zod:/rhs/brick2/tin yarrow:/rhs/brick2/tin
volume create: tin: success: please start the volume to access data
[root@zod ~]# gluster v start bell
volume start: bell: failed: Volume bell already started
[root@zod ~]# gluster v start tin
volume start: tin: success
[root@zod ~]# gluster v quota tin enable
volume quota : success
[root@zod ~]# gluster v attach-tier tin rep 2 zod:/rhs/brick7/tin_hot yarrow:/rhs/brick7/tin_hot zod:/rhs/brick6/tin_hot yarrow:/rhs/brick6/tin_hot; 
Attach tier is recommended only for testing purposes in this release. Do you want to continue? (y/n) y
volume attach-tier: success
volume rebalance: tin: success: Rebalance on tin has been started successfully. Use rebalance status command to check status of the rebalance process.
ID: 3161efbb-36b7-4b78-a314-7e8aa5d320fa

[root@zod ~]# gluster v set tin features.ctr-enabled on
volume set: success
[root@zod ~]# gluster v status tin
Status of volume: tin
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Hot Bricks:
Brick yarrow:/rhs/brick6/tin_hot            49183     0          Y       24462
Brick zod:/rhs/brick6/tin_hot               49183     0          Y       26303
Brick yarrow:/rhs/brick7/tin_hot            49182     0          Y       24438
Brick zod:/rhs/brick7/tin_hot               49182     0          Y       26285
Cold Bricks:
Brick zod:/rhs/brick1/tin                   49180     0          Y       25646
Brick yarrow:/rhs/brick1/tin                49180     0          Y       23911
Brick zod:/rhs/brick2/tin                   49181     0          Y       25664
Brick yarrow:/rhs/brick2/tin                49181     0          Y       23935
NFS Server on localhost                     2049      0          Y       26322
Self-heal Daemon on localhost               N/A       N/A        Y       26336
Quota Daemon on localhost                   N/A       N/A        Y       26352
NFS Server on yarrow                        2049      0          Y       24506
Self-heal Daemon on yarrow                  N/A       N/A        Y       24524
Quota Daemon on yarrow                      N/A       N/A        Y       24532
 
Task Status of Volume tin
------------------------------------------------------------------------------
Task                 : Tier migration      
ID                   : 3161efbb-36b7-4b78-a314-7e8aa5d320fa
Status               : in progress         
 
[root@zod ~]# gluster v info tin
 
Volume Name: tin
Type: Tier
Volume ID: c56894ad-d33e-4124-8600-da830891e0c1
Status: Started
Number of Bricks: 8
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick1: yarrow:/rhs/brick6/tin_hot
Brick2: zod:/rhs/brick6/tin_hot
Brick3: yarrow:/rhs/brick7/tin_hot
Brick4: zod:/rhs/brick7/tin_hot
Cold Tier:
Cold Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick5: zod:/rhs/brick1/tin
Brick6: yarrow:/rhs/brick1/tin
Brick7: zod:/rhs/brick2/tin
Brick8: yarrow:/rhs/brick2/tin
Options Reconfigured:
features.ctr-enabled: on
features.quota-deem-statfs: on
features.inode-quota: on
features.quota: on
performance.readdir-ahead: on
[root@zod ~]#

Comment 3 Mohammed Rafi KC 2015-11-10 09:36:13 UTC
Tried to reproduce with the latest code base, but was unable to reproduce this issue. 

Setup details:

two nodes
cold tier = 2x2 dist-rep
hot tier = 2x2 dist-rep
client running on one of the servers

Steps tried:

created a volume with the cold tier configuration
started the volume
enabled quota
fuse mounted the volume
started creating 10 files of 1 GB each in a loop using dd
after the first file was written, while the second file was being written, attached the hot tier
continued to run the I/O
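
A rough scripted approximation of the attach timing, run from a second shell on the client alongside the dd loop (node names, brick paths and the /mnt/tin mount point are placeholders, not from the original setup):

until [ -e /mnt/tin/file2 ]; do sleep 1; done   # file2 appears once its write begins
# attach the hot tier mid-write; --mode=script suppresses the confirmation prompt
gluster --mode=script v attach-tier tin rep 2 node1:/rhs/brick6/tin_hot node2:/rhs/brick6/tin_hot node1:/rhs/brick7/tin_hot node2:/rhs/brick7/tin_hot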

Result: no split-brain observed.


I repeated this 10 times.

Note: I tried this on a development VM, which might be why it did not reproduce. I will keep trying to reproduce this issue.

Comment 4 Mohammed Rafi KC 2015-12-01 09:48:19 UTC
Upstream patch: http://review.gluster.org/#/c/12745/

Comment 9 Vivek Agarwal 2015-12-24 11:55:51 UTC
*** Bug 1286028 has been marked as a duplicate of this bug. ***

Comment 10 Sweta Anandpara 2015-12-30 05:32:09 UTC
Tested and verified this on the build glusterfs-3.7.5-13.el7rhgs.x86_64

Had a 2 * (4+2) disperse volume as cold tier and 2*2 dist-rep volume as hot tier, with quota enabled on the volume.

Created 10 files of 700 MB each using dd, and repeatedly ran attach-tier and detach-tier while file creation was in progress. File creation and movement between the hot and cold tiers continued as expected, and no I/O error was seen. The cluster remained in a healthy state.
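
A rough sketch of that attach/detach cycle, run while the dd loop keeps writing on the client (volume name, node names and brick paths are placeholders; the detach-tier subcommands reflect the 3.7-era CLI and should be checked against 'gluster volume help' on the build under test):

for n in {1..5}; do
    # --mode=script suppresses the interactive confirmation prompts
    gluster --mode=script v attach-tier tin rep 2 node1:/rhs/brick6/tin_hot node2:/rhs/brick6/tin_hot node1:/rhs/brick7/tin_hot node2:/rhs/brick7/tin_hot
    sleep 120     # let migration between tiers run while dd keeps writing
    gluster --mode=script v detach-tier tin start
    sleep 120     # allow the detach rebalance to drain the hot tier
    gluster --mode=script v detach-tier tin commit   # in practice, check 'detach-tier tin status' first
done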

Moving this bug to verified in 3.1.2. Detailed logs are attached.

Comment 11 Sweta Anandpara 2015-12-30 05:32:36 UTC
Created attachment 1110442 [details]
Server and client logs

Comment 13 errata-xmlrpc 2016-03-01 05:45:57 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0193.html