Bug 761937 (GLUSTER-205) - [ glusterfs 2.0.6rc4 ] - Hard disk failure not handled correctly
Summary: [ glusterfs 2.0.6rc4 ] - Hard disk failure not handled correctly
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: GLUSTER-205
Product: GlusterFS
Classification: Community
Component: replicate
Version: 2.0.5
Hardware: All
OS: Linux
Priority: low
Severity: low
Target Milestone: ---
Assignee: Vikas Gorur
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: GLUSTER-386
 
Reported: 2009-08-11 22:16 UTC by Gururaj K
Modified: 2009-12-01 11:09 UTC (History)
3 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Description Amar Tumballi 2009-08-11 19:21:06 UTC
Just going by the symptoms, the problem seems to be that although the underlying disk got remounted (read-only), glusterfsd didn't die, and hence all the 'replicate' clients thought both of their subvolumes were in good condition.

A more detailed review is needed of the error conditions in which one of the subvolumes is not marked 'down'.
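
As a rough illustration of the kind of backend write-probe that would catch this condition, here is a minimal sketch. This is hypothetical code, not GlusterFS source; the probe path and all names are invented, and it assumes a read-only remount surfaces as EROFS/EIO on a write attempt.

/* Hypothetical sketch: probe the exported directory for writability so
 * that an EROFS/EIO backend gets reported and replicate clients can
 * mark that subvolume down. Not GlusterFS source; names are invented. */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Returns 0 if the export is writable, -1 if it should be marked down. */
static int probe_export_writable(const char *export_dir)
{
        char probe[4096];
        int  fd;

        snprintf(probe, sizeof(probe), "%s/.health-probe", export_dir);

        fd = open(probe, O_CREAT | O_WRONLY | O_TRUNC, 0600);
        if (fd < 0) {
                if (errno == EROFS || errno == EIO) {
                        fprintf(stderr, "export %s not writable (%s): "
                                "mark subvolume down\n",
                                export_dir, strerror(errno));
                        return -1;
                }
                return 0; /* e.g. EACCES: not a disk failure */
        }

        close(fd);
        unlink(probe);
        return 0;
}

int main(void)
{
        /* In a server process this would run on a timer; probe once here. */
        return probe_export_writable("/jbod") ? 1 : 0;
}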

Comment 1 Gururaj K 2009-08-11 22:16:45 UTC
* 4-server distribute-replicate setup
* write-behind on the client side, io-threads on the server side

* The error was observed when one of the servers, brick8 (the second subvolume of a replicate pair), encountered a hardware error and /jbod (on which the backend resided) got remounted read-only:

[root@brick8 ~]# dmesg | tail
..
sd 1:2:0:0: SCSI error: return code = 0x00040000
end_request: I/O error, dev sdb, sector 4630780962
EXT3-fs error (device sdb1): ext3_get_inode_loc: unable to read inode block - inode=289424022, block=578847616

[root@brick8 ~]# touch /jbod/abcd
touch: cannot touch `/jbod/abcd': Read-only file system

* Test tools failed with various messages:

-------------------------------------------
iozone:
..
Error reading block at 4346937344
read: File descriptor in bad state
-------------------------------------------
exnihilate.sh:
..
split: _91.10042: Input/output error
-------------------------------------------
rsync:
..
rsync: mkstemp "/mnt/pavan/rc4/client03/rsync/usr/include/python2.4/.sysmodule.h.RO0w3Q" failed: Input/output error (5)
rsync: mkstemp "/mnt/pavan/rc4/client03/rsync/usr/include/sys/.sysmacros.h.gH5HMG" failed: Input/output error (5)
-------------------------------------------
dbench:

(Not the exact output):
"No such file or directory"

Comment 2 Vikas Gorur 2009-08-12 05:02:08 UTC
Something else I observed in this situation:

A file was present only on the subvolume that was read-only. When we did a 'stat' on that file, it didn't get created on the other subvolume by self-heal.

Stepping through afr_lookup_cbk in gdb showed that the open_fd_count returned was 1, even though I'm 99% sure the file hadn't been opened by any other process. Checking the backend glusterfsd's fds in /proc also did not show the file as open. Because of the non-zero open_fd_count, self-heal wasn't happening.
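
A minimal sketch of the gating described above (a simplification for illustration, not the actual afr_lookup_cbk; the struct and field names are invented):

/* Simplified illustration: self-heal is deferred while any subvolume
 * reports open fds on the file, so a spurious non-zero count blocks
 * the heal entirely. Not GlusterFS source; names are invented. */
struct afr_lookup_state {
        unsigned int open_fd_count;   /* aggregated from subvolume replies */
        int          needs_self_heal; /* set when copies are missing/differ */
};

static int should_trigger_self_heal(const struct afr_lookup_state *st)
{
        if (!st->needs_self_heal)
                return 0;

        /* The file appears to be open somewhere, so healing is postponed
         * to avoid racing with in-flight writes. A stale count (as seen
         * above) therefore blocks healing indefinitely. */
        if (st->open_fd_count > 0)
                return 0;

        return 1;
}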

Comment 3 Gururaj K 2009-08-14 06:25:11 UTC
(In reply to comment #2)
> Something else I observed in this situation:
> 
> A file was present only on the subvolume that was read-only. When we did a
> 'stat' on that file, it didn't get created on the other subvolume by self-heal.
> 
> Stepping through afr_lookup_cbk in gdb showed that the open_fd_count
> returned was 1, even though I'm 99% sure the file hadn't been opened by
> any other process. Checking the backend glusterfsd's fds in /proc also
> did not show the file as open. Because of the non-zero open_fd_count,
> self-heal wasn't happening.

I have reported a separate bug to track the above issue (and another one that is related):

http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=214

Comment 4 Anand Avati 2009-11-30 07:53:44 UTC
PATCH: http://patches.gluster.com/patch/2426 in master (cluster/afr: Refactored lookup_cbk and introduce precedence of errors.)
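
The patch subject mentions a "precedence of errors": when the replicas' lookup replies disagree, the more meaningful errno should win rather than whichever callback happened to run last. A rough sketch of that idea follows; the ranking is illustrative only and is not taken from patch 2426.

/* Illustrative only: fold each reply's errno into the error chosen so
 * far, keeping the higher-precedence one. The ranking is invented for
 * this sketch. */
#include <errno.h>

static int errno_precedence(int err)
{
        switch (err) {
        case ESTALE:   return 3; /* inode replaced/gone: strongest signal */
        case ENOENT:   return 2; /* file genuinely absent on a replica   */
        case ENOTCONN: return 1; /* subvolume down: weakest, transient   */
        default:       return 0;
        }
}

static int pick_lookup_error(int chosen, int incoming)
{
        if (errno_precedence(incoming) > errno_precedence(chosen))
                return incoming;
        return chosen;
}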

