Bug 762908 (GLUSTER-1176) - Two replica failures prevent self-heal even when one node recovers
Summary: Two replica failures prevent self-heal even when one node recovers
Keywords:
Status: CLOSED WORKSFORME
Alias: GLUSTER-1176
Product: GlusterFS
Classification: Community
Component: replicate
Version: mainline
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Pranith Kumar K
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2010-07-20 08:42 UTC by Shehjar Tikoo
Modified: 2011-08-16 10:46 UTC
CC List: 5 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:
Regression: RTP
Mount Type: nfs
Documentation: DP
CRM:
Verified Versions:



Description Shehjar Tikoo 2010-07-20 06:46:29 UTC
(In reply to comment #0)
> In a three replica setup, if one replica goes down, IO continues as normal to
> the two remaining replicas, and self-heals when this replica comes back up.
> 
> When two replicas go down, IO continues to the one remaining replica. When one
> replica comes back up, self-heal allows IO on the source replica to continue in
> parallel to self-heal on the destination replica.


The correct behaviour is to pause IO on all replicas and allow only self-heal to proceed on the replica that came back up. Right now, parallel self-heal and IO on the source replica result in data corruption on the recovering replica.

Comment 1 Shehjar Tikoo 2010-07-20 08:42:53 UTC
In a three-replica setup, if one replica goes down, IO continues as normal to the two remaining replicas, and the downed replica self-heals when it comes back up.

When two replicas go down, IO continues to the one remaining replica. When one replica comes back up, self-heal allows IO on the source replica to continue in parallel with self-heal on the destination replica.

Comment 2 Raghavendra Bhat 2011-04-07 05:25:28 UTC
I did a similar kind of test: created a 3-replica volume, started it, and mounted it over NFS. I started a dd of a 2 GB file and brought down 2 bricks. After around 500 MB had been copied, I brought the 2 bricks back up. Once the file was entirely written, I checked the md5sum of the file on all the bricks; it was the same on all the bricks and on the mount point. It seems to be working fine.

Comment 3 Shehjar Tikoo 2011-04-07 06:06:08 UTC
The problem shows up when only one of the downed nodes is brought back up, not both. Leave the third replica down and let the IO complete, then check the md5sum. Make sure you're using an input other than /dev/zero for dd.
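
A minimal sketch of that repro, for anyone re-running it. The volume name, hostnames and brick paths are placeholders, and "taking a brick down" here means killing its glusterfsd process on that host (the exact mechanism may differ by version):

    # Create and start a 3-replica volume (names and paths are placeholders)
    gluster volume create testvol replica 3 host1:/bricks/b1 host2:/bricks/b2 host3:/bricks/b3
    gluster volume start testvol

    # NFS-mount it on a client and start writing non-zero data
    mount -t nfs -o vers=3 host1:/testvol /mnt/testvol
    dd if=/dev/urandom of=/mnt/testvol/bigfile bs=1M count=2048 &

    # While dd is running: take two bricks down (kill glusterfsd on host2 and host3),
    # then bring back only ONE of them and leave the third replica down.

    # Once dd finishes, compare checksums on the mount and on the backend bricks
    md5sum /mnt/testvol/bigfile        # on the client
    md5sum /bricks/b1/bigfile          # on host1 (the replica that stayed up)
    md5sum /bricks/b2/bigfile          # on host2 (the recovered replica)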

Comment 4 Amar Tumballi 2011-04-25 09:33:13 UTC
Please update the status of this bug, as it has been more than 6 months since it was filed (bug id < 2000).

Please resolve it with the proper resolution if it is no longer valid. If it is still valid but not critical, move it to 'enhancement' severity.

Comment 5 Shehjar Tikoo 2011-04-26 03:11:03 UTC
Still valid.

Comment 6 Vijaykumar 2011-08-16 07:46:44 UTC
I created a 3-replica volume, started it, and mounted it as an NFS mount point. I started a dd of 2 GB with if=/dev/urandom at the mount point, then brought down 2 bricks. I was monitoring the du of the file on all the bricks and at the mount. After the two bricks went down, the file size kept increasing on the one brick that was up and stayed constant on the other two. When I brought up one of the bricks that were down, IO stopped at the mount point until self-healing was complete; after that, the file size on both bricks and at the mount point started increasing simultaneously. I have performed this test some 7 to 8 times, and it consistently worked fine.
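
The kind of monitoring described above can be done with something like the following sketch (paths are the same placeholders as in the earlier repro sketch; run each loop on the respective host, in its own terminal):

    # Watch the file grow on the client mount and on each backend brick while dd runs
    while true; do du -sh /mnt/testvol/bigfile; sleep 5; done    # on the client
    while true; do du -sh /bricks/b1/bigfile; sleep 5; done      # on host1 (stayed up)
    while true; do du -sh /bricks/b2/bigfile; sleep 5; done      # on host2 (brought back up)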

