Bug 824286

Summary: Self-Heal of files without GFID should return I/O Error when some of the bricks are down
Product: [Community] GlusterFS
Reporter: Shwetha Panduranga <shwetha.h.panduranga>
Component: replicate
Assignee: vsomyaju
Status: CLOSED WORKSFORME
QA Contact:
Severity: medium
Docs Contact:
Priority: unspecified
Version: 3.3-beta
CC: gluster-bugs, nsathyan
Target Milestone: ---
Keywords: Reopened
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2012-08-02 06:53:24 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Shwetha Panduranga 2012-05-23 07:37:45 UTC
Description of problem:
----------------------
When some of the bricks in a replicate volume are down, a lookup on files without a GFID should return EIO (Input/output error).
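A "file without a GFID" here means one created directly on the brick back end, which has no trusted.gfid extended attribute until AFR assigns one during lookup/self-heal. A quick way to check this on the brick server (the brick path and the file name f1 are taken from this report) is something like:

# dump all extended attributes of the back-end file in hex;
# a file already known to gluster shows a trusted.gfid entry,
# a freshly hand-created back-end file does not
getfattr -d -m . -e hex /export_sdb/brick0/f1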

Version-Release number of selected component (if applicable):
---------------------------------------------------------------
3.3.0qa42. 

This is a regression. The same test case passed on 3.3.0qa41.

How reproducible:
----------------
Often

Steps to Reproduce (a shell sketch of these steps follows the list):
----------------------
1. Create a file on brick1 directly from the back end, before creating the volume
2. Create a pure replicate volume with 3 bricks (brick1, brick2, brick3)
3. Set "self-heal-daemon" off for the volume
4. Start the volume
5. Bring down brick2
6. Create a FUSE mount
7. Execute "find . | xargs stat" on the FUSE mount.
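
A minimal shell sketch of the steps above, assuming the volume name (dstore), brick paths, host addresses and mount point shown in the volume info/status output below; newer gluster releases may additionally require "force" on volume create when two bricks of a replica set sit on the same host:

# Step 1: on 192.168.2.35, create a file directly on the brick1 back end
#         before the volume exists (it will have no GFID)
mkdir -p /export_sdb/brick0
touch /export_sdb/brick0/f1

# Step 2: create a pure 3-way replicate volume
gluster volume create dstore replica 3 \
    192.168.2.35:/export_sdb/brick0 \
    192.168.2.36:/export_sdb/brick0 \
    192.168.2.35:/export_sdc/brick0

# Step 3: disable the self-heal daemon
gluster volume set dstore cluster.self-heal-daemon off

# Step 4: start the volume
gluster volume start dstore

# Step 5: bring down brick2 by killing its brick process on 192.168.2.36
#         (find the PID with "gluster volume status dstore")
kill <pid-of-brick2>

# Step 6: create a FUSE mount on the client (which server to mount from
#         is an arbitrary choice here)
mount -t glusterfs 192.168.2.35:/dstore /mnt/gfsc1

# Step 7: force a lookup of every entry from the mount point
cd /mnt/gfsc1 && find . | xargs stat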
  
Actual results:
---------------
[05/23/12 - 18:19:22 root@APP-CLIENT1 ~]# cd /mnt/gfsc1 ; find . | xargs stat
  File: `.'
  Size: 23        	Blocks: 0          IO Block: 131072 directory
Device: 15h/21d	Inode: 1           Links: 3
Access: (0755/drwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2012-05-23 18:19:26.206943871 +0530
Modify: 2012-05-23 18:17:39.706931389 +0530
Change: 2012-05-23 18:17:39.706931389 +0530

Expected results:
-------------------
[05/23/12 - 18:20:24 root@APP-CLIENT1 gfsc1]# find . | xargs stat
  File: `.'
  Size: 32        	Blocks: 0          IO Block: 131072 directory
Device: 15h/21d	Inode: 1           Links: 3
Access: (0755/drwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2012-05-23 18:19:26.206943871 +0530
Modify: 2012-05-23 18:17:39.708936680 +0530
Change: 2012-05-23 18:17:39.708936680 +0530
stat: cannot stat `./f1': Input/output error

Additional info:
----------------
The same test case works fine when "brick2" and "brick3" are brought down in step 5 instead.


[05/23/12 - 18:19:03 root@APP-SERVER1 ~]# gluster v info
 
Volume Name: dstore
Type: Replicate
Volume ID: 11b8a379-f4fa-4f51-9651-d5613690abe8
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 192.168.2.35:/export_sdb/brick0
Brick2: 192.168.2.36:/export_sdb/brick0
Brick3: 192.168.2.35:/export_sdc/brick0
Options Reconfigured:
diagnostics.brick-log-level: DEBUG
diagnostics.client-log-level: DEBUG
cluster.self-heal-daemon: off

[05/23/12 - 18:20:06 root@APP-SERVER1 ~]# gluster v status
Status of volume: dstore
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick 192.168.2.35:/export_sdb/brick0			24009	Y	27899
Brick 192.168.2.36:/export_sdb/brick0			24009	N	27546
Brick 192.168.2.35:/export_sdc/brick0			24010	Y	27904
NFS Server on localhost					38467	Y	27969
NFS Server on 192.168.2.36				38467	Y	27582

Comment 1 Shwetha Panduranga 2012-05-25 10:38:55 UTC
Verified on 3.3.0qa43. The test works fine now.

Comment 2 Shwetha Panduranga 2012-06-01 06:18:32 UTC
The test failed once again on 3.3.0qa45.