Bug 833727

Summary: appending data to a file on fuse mount fails with EIO
Product: [Community] GlusterFS Reporter: Shwetha Panduranga <shwetha.h.panduranga>
Component: replicateAssignee: Pranith Kumar K <pkarampu>
Status: CLOSED CURRENTRELEASE QA Contact:
Severity: urgent Docs Contact:
Priority: unspecified    
Version: 3.3-betaCC: aavati, gluster-bugs, jdarcy
Target Milestone: ---Keywords: Reopened
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: glusterfs-3.4.0 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 832305
: 840394 (view as bug list) Environment:
Last Closed: 2013-07-24 17:51:51 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 832305    
Bug Blocks: 840394, 840820    

Description Shwetha Panduranga 2012-06-20 08:06:56 UTC
+++ This bug was initially created as a clone of Bug #832305 +++

Description of problem:
------------------------
appending data continuously to a file using "echo" command on a fuse mount  reports EIO after sometime. 

[06/15/12 - 01:47:31 root@ARF-Client1 ~]# ls -lh /mnt/gfsc1/output_file.20120615010422 
-rw-r--r--. 1 root root 14M Jun 15 01:27 /mnt/gfsc1/output_file.20120615010422


Version-Release number of selected component (if applicable):
------------------------------------------------------------
3.3.0qa45

How reproducible:
-----------------
Often

append_data.sh:-
---------------
#!/bin/bash

path=$1
start_time=$(date +%s)
append=`date '+%Y%m%d%H%M%S'`

for i in {1..100000};do
        echo -e "Date: `date`\n" >> $path/output_file.$append
        echo -e "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n" >> $path/output_file.$append
	echo -e "Calender:\n" >> $path/output_file.$append
	echo -e "`cal`\n" >> $path/output_file.$append
	echo -e "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n" >> $path/output_file.$append
done
finish_time=$(date +%s)
echo "Time duration: $((finish_time - start_time)) secs." >> $path/output_file.$append

Steps to Reproduce:
-------------------
1.Create a replicate volume (1x3)

2.Set "write-behind on" and "eager-lock on" on the volume

3.Start the volume

4.Create a fuse mount.

5.execute the script "append_data <mount_point>" . mount_point as argument to the script. 

6.After sometime the script fails with Input/Output Error

Actual results:
---------------
[2012-06-15 01:27:51.461317] W [afr-self-heal-data.c:831:afr_lookup_select_read_child_by_txn_type] 0-vol-replicate-0: /output_file.20120615010422: Possible split-brain
[2012-06-15 01:27:51.461524] W [afr-common.c:1226:afr_detect_self_heal_by_lookup_status] 0-vol-replicate-0: split brain detected during lookup of /output_file.20120615010422.
[2012-06-15 01:27:51.461736] W [afr-open.c:213:afr_open] 0-vol-replicate-0: failed to open as split brain seen, returning EIO
[2012-06-15 01:27:51.461972] W [fuse-bridge.c:713:fuse_fd_cbk] 0-glusterfs-fuse: 2565915: OPEN() /output_file.20120615010422 => -1 (Input/output error)
[2012-06-15 01:27:51.462492] W [afr-open.c:213:afr_open] 0-vol-replicate-0: failed to open as split brain seen, returning EIO
[2012-06-15 01:27:51.462560] W [fuse-bridge.c:713:fuse_fd_cbk] 0-glusterfs-fuse: 2565916: OPEN() /output_file.20120615010422 => -1 (Input/output error)
[2012-06-15 01:27:51.465513] W [afr-open.c:213:afr_open] 0-vol-replicate-0: failed to open as split brain seen, returning EIO
[2012-06-15 01:27:51.465558] W [fuse-bridge.c:713:fuse_fd_cbk] 0-glusterfs-fuse: 2565917: OPEN() /output_file.20120615010422 => -1 (Input/output error)
[2012-06-15 01:27:51.466011] W [afr-open.c:213:afr_open] 0-vol-replicate-0: failed to open as split brain seen, returning EIO
[2012-06-15 01:27:51.466056] W [fuse-bridge.c:713:fuse_fd_cbk] 0-glusterfs-fuse: 2565918: OPEN() /output_file.20120615010422 => -1 (Input/output error)
[2012-06-15 01:27:51.468327] W [afr-open.c:213:afr_open] 0-vol-replicate-0: failed to open as split brain seen, returning EIO

Output of stat on file after the script failure:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[06/15/12 - 01:40:28 root@ARF-Client1 ~]# stat /mnt/gfsc1/output_file.20120615010422
  File: `/mnt/gfsc1/output_file.20120615010422'
  Size: 13812273  	Blocks: 32776      IO Block: 131072 regular file
Device: 15h/21d	Inode: 9581628177895579223  Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2012-06-15 01:04:27.966005057 -0400
Modify: 2012-06-15 01:27:51.459004464 -0400
Change: 2012-06-15 01:27:51.460121389 -0400


Fuse Mount log when executed stat:-
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[2012-06-15 01:47:31.337301] W [afr-common.c:1226:afr_detect_self_heal_by_lookup_status] 0-vol-replicate-0: split brain detected during lookup of /output_file.20120615010422.
[2012-06-15 01:47:31.337387] I [afr-common.c:1340:afr_launch_self_heal] 0-vol-replicate-0: background  data gfid self-heal triggered. path: /output_file.20120615010422, reason: lookup detected pending operations
[2012-06-15 01:47:31.339249] I [afr-self-heal-common.c:1189:afr_sh_missing_entry_call_impunge_recreate] 0-vol-replicate-0: no missing files - /output_file.20120615010422. proceeding to metadata check
[2012-06-15 01:47:31.340112] I [afr-self-heal-common.c:994:afr_sh_missing_entries_done] 0-vol-replicate-0: split brain found, aborting selfheal of /output_file.20120615010422
[2012-06-15 01:47:31.340185] E [afr-self-heal-common.c:2156:afr_self_heal_completion_cbk] 0-vol-replicate-0: background  data gfid self-heal failed on /output_file.20120615010422


Xattrs of file after split-brain reported:-
------------------------------------------
brick3:
~~~~~~~
[06/15/12 - 01:41:18 root@AFR-Server3 ~]# getfattr -d -m. -ehex /export_b1/dir1/output_file.20120615010422
getfattr: Removing leading '/' from absolute path names
# file: export_b1/dir1/output_file.20120615010422
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.vol-client-0=0x000000000000000000000000
trusted.afr.vol-client-1=0x000000000000000000000000
trusted.afr.vol-client-2=0x000000000000000000000000
trusted.gfid=0x8438a813ad7646c684f8c80fe956de57

brick2:
~~~~~~~~
[06/15/12 - 01:41:13 root@AFR-Server2 ~]# getfattr -d -m. -ehex /export_b1/dir1/output_file.20120615010422
getfattr: Removing leading '/' from absolute path names
# file: export_b1/dir1/output_file.20120615010422
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.vol-client-0=0x000000000000000000000000
trusted.afr.vol-client-1=0x000000000000000000000000
trusted.afr.vol-client-2=0x000000000000000000000000
trusted.gfid=0x8438a813ad7646c684f8c80fe956de57

brick1:
~~~~~~~~
[06/15/12 - 01:41:03 root@AFR-Server1 ~]# getfattr -d -m. -ehex /export_b1/dir1/output_file.20120615010422
getfattr: Removing leading '/' from absolute path names
# file: export_b1/dir1/output_file.20120615010422
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.vol-client-0=0x000000000000000000000000
trusted.afr.vol-client-1=0x000000000000000000000000
trusted.afr.vol-client-2=0x000000000000000000000000
trusted.gfid=0x8438a813ad7646c684f8c80fe956de57

--- Additional comment from shwetha.h.panduranga on 2012-06-15 02:16:48 EDT ---

Created attachment 592039 [details]
glusterfs logs, history of commands executed on storage_node1 and fuse mount log

--- Additional comment from shwetha.h.panduranga on 2012-06-15 02:31:19 EDT ---

md5sum on brick1:-
-----------------

[06/15/12 - 02:28:59 root@AFR-Server1 ~]# arequal-checksum -p /export_b1/dir1/output_file.20120615010422

Entry counts
Regular files   : 1
Directories     : 0
Symbolic links  : 0
Other           : 0
Total           : 1

Metadata checksums
Regular files   : 486e85
Directories     : 3e9
Symbolic links  : 3e9
Other           : 3e9

Checksums
Regular files   : b05e703fbeac795f934ec0aa043387c9
Directories     : 0
Symbolic links  : 0
Other           : 0
Total           : 2310b095ba9ffe96

md5sum on brick2:-
------------------
[06/15/12 - 02:29:04 root@AFR-Server2 ~]# arequal-checksum -p /export_b1/dir1/output_file.20120615010422 

Entry counts
Regular files   : 1
Directories     : 0
Symbolic links  : 0
Other           : 0
Total           : 1

Metadata checksums
Regular files   : 486e85
Directories     : 3e9
Symbolic links  : 3e9
Other           : 3e9

Checksums
Regular files   : b05e703fbeac795f934ec0aa043387c9
Directories     : 0
Symbolic links  : 0
Other           : 0
Total           : 2310b095ba9ffe96

md5sum on brick3:-
------------------
[06/15/12 - 02:30:48 root@AFR-Server3 ~]# arequal-checksum -p /export_b1/dir1/output_file.20120615010422 

Entry counts
Regular files   : 1
Directories     : 0
Symbolic links  : 0
Other           : 0
Total           : 1

Metadata checksums
Regular files   : 486e85
Directories     : 3e9
Symbolic links  : 3e9
Other           : 3e9

Checksums
Regular files   : b05e703fbeac795f934ec0aa043387c9
Directories     : 0
Symbolic links  : 0
Other           : 0
Total           : 2310b095ba9ffe96

Comment 1 Shwetha Panduranga 2012-06-22 06:52:58 UTC
There are 2 parts to the bug. Case1) For some reason split-brain flag is set, we haven't figured out why that happened. Case2) Once the flag is set, there is no way to reset the flag. This fixes the second part. 

This bug is tracking Case1

Bug 832305 tracks Case2 and Case2 is already resolved.

Comment 2 Anand Avati 2012-06-22 17:31:17 UTC
The patch which was committed has removed the setting into inode context altogether. We can close this bug as it is not applicable anymore.

Comment 3 Pranith Kumar K 2012-06-26 05:43:51 UTC
There is still one place where the decision to set split-brain flag is taken outside inodelk. Re-opening the bug.

Comment 4 Vijay Bellur 2012-07-26 17:15:02 UTC
CHANGE: http://review.gluster.com/3626 (cluster/afr: Modified split-brain handling) merged in master by Anand Avati (avati)