Bug 761937 (GLUSTER-205)

Summary: [ glusterfs 2.0.6rc4 ] - Hard disk failure not handled correctly
Product: [Community] GlusterFS
Reporter: Gururaj K <guru>
Component: replicate
Assignee: Vikas Gorur <vikas>
Status: CLOSED CURRENTRELEASE
QA Contact:
Severity: low
Docs Contact:
Priority: low
Version: 2.0.5
CC: amarts, gluster-bugs, vijay
Target Milestone: ---
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 762118

Description Amar Tumballi 2009-08-11 19:21:06 UTC
Just looking at the symptoms, the problem here seems to be that although the underlying disk got re-mounted (read-only), glusterfsd didn't die; hence all the 'replicate' clients assumed both of their subvolumes were in good condition.

A more detailed review is needed of the error conditions in which one of the subvolumes is not marked as 'down'.

Comment 1 Gururaj K 2009-08-11 22:16:45 UTC
* 4 server distribute-replicate setup
* Write behind on client side, IO threads on server side

* Error observed when one of the servers, brick8 (the second subvolume of the replicate set), encountered a hardware error and /jbod (on which the backend resided) was remounted read-only

[root@brick8 ~]# dmesg | tail
..
sd 1:2:0:0: SCSI error: return code = 0x00040000
end_request: I/O error, dev sdb, sector 4630780962
EXT3-fs error (device sdb1): ext3_get_inode_loc: unable to read inode block - inode=289424022, block=578847616

[root@brick8 ~]# touch /jbod/abcd
touch: cannot touch `/jbod/abcd': Read-only file system

* Test tools failed with various messages:

-------------------------------------------
iozone:
..
Error reading block at 4346937344
read: File descriptor in bad state
-------------------------------------------
exnihilate.sh:
..
split: _91.10042: Input/output error
-------------------------------------------
rsync:
..
rsync: mkstemp "/mnt/pavan/rc4/client03/rsync/usr/include/python2.4/.sysmodule.h.RO0w3Q" failed: Input/output error (5)
rsync: mkstemp "/mnt/pavan/rc4/client03/rsync/usr/include/sys/.sysmacros.h.gH5HMG" failed: Input/output error (5)
-------------------------------------------
dbench:

(Not the exact output):
"No such file or directory"

Comment 2 Vikas Gorur 2009-08-12 05:02:08 UTC
Something else I observed in this situation:

A file was present only on the subvolume that was read-only. When we did a 'stat' on that file, it didn't get created on the other subvolume by self-heal.

Stepping through afr_lookup_cbk in gdb, I found that the open_fd_count returned was 1, even though I'm 99% sure the file hadn't been opened by any other process. Checking the backend glusterfsd's fds in /proc also did not show the file as open. Because of the non-zero open_fd_count, self-heal wasn't happening.

Comment 3 Gururaj K 2009-08-14 06:25:11 UTC
(In reply to comment #2)
> Something else I observed in this situation:
> 
> A file was present only on the subvolume that was read-only. When we did a
> 'stat' on that file, it didn't get created on the other subvolume by self-heal.
> 
> Stepping through afr_lookup_cbk in gdb, it was found that the open_fd_count
> returned was 1, even though I'm 99% sure that the file hadn't been opened by
> any other process. Checking the backend glusterfsd's fd's in /proc also did not
> show the file as open. Due to the non-zero open_fd_count, self-heal wasn't
> happening.

I have reported a separate bug to track the above issue (and another one that is related):

http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=214

Comment 4 Anand Avati 2009-11-30 07:53:44 UTC
PATCH: http://patches.gluster.com/patch/2426 in master (cluster/afr: Refactored lookup_cbk and introduce precedence of errors.)