Bug 456575

Summary:	Mirror corruption after one of three legs fails simultaneously on more than 1 mirror
Product:	Red Hat Enterprise Linux 5	Reporter:	Jonathan Earl Brassow <jbrassow>
Component:	cmirror	Assignee:	Jonathan Earl Brassow <jbrassow>
Status:	CLOSED ERRATA	QA Contact:	Cluster QE <mspqa-list>
Severity:	high	Docs Contact:
Priority:	high
Version:	5.2	CC:	agk, antillon.maurizio, ccaulfie, cmarthal, dwysocha, edamato, heinzm, jbrassow, mbroz, mgahagan, prockai, syeghiay
Target Milestone:	rc
Target Release:	---
Hardware:	All
OS:	Linux
Whiteboard:
Fixed In Version:	cmirror-1.1.39-9.el5	Doc Type:	Bug Fix
Doc Text:	Data corruption could occur when using 3 or more mirrors. With this update, the underlying cluster code has been modified to address this issue, and the data corruption no longer occurs.	Story Points:	---
Last Closed:	2011-01-13 22:48:56 UTC	Type:	---
Bug Depends On:	359341
Bug Blocks:	483701, 525215, 533192

Comment 1 Jonathan Earl Brassow 2008-10-01 13:46:57 UTC

Mirror corruption issues were found in the cluster logging code and fixed in 5.3.  During the investigation, other issues were identified, so there is still a problem in the kernel.  It does not need fixing until device-mapper mirror failures are handled differently (which is planned for the future).  Currently, when a mirror device fails, it is removed.  Later releases will only remove the failed device if the failure is persistent.

Description of what will cause the failure:
In drivers/md/dm-raid1.c, after a leg fails and a write returns, '__bio_mark_nosync' is used to mark the region out-of-sync.  This state is stored in a region structure that remains in the region hash.  Because the region never goes on the clean_regions list, it is not removed from the region hash until the mirror is destroyed.  Right now, this is not a problem: when a device fails, the mirror is destroyed and a new mirror is created without the failed device.  In the future, when we wish to handle transient failures, we would simply suspend and resume to restart recovery.  In that case, some machines in the cluster would write only to the primary for regions that are still cached as not-in-sync because of the earlier '__bio_mark_nosync'.  The fix is to simply clear out the region hash when a mirror is suspended.
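
The stale-cache behavior described above can be illustrated with a small userspace model.  This is only a sketch under stated assumptions, not the dm-raid1 code: the types and the functions rh_mark_nosync, rh_primary_only and rh_clear_on_suspend are hypothetical names made up for this example, and the real region hash has more states, locking and recovery logic.  It shows a region cached as not-in-sync staying in the hash until something empties it, and a suspend-time clear removing it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define NR_BUCKETS 8

/* Simplified region states; the real region hash tracks more than two. */
enum region_state { REGION_CLEAN, REGION_NOSYNC };

struct region {
    unsigned long key;          /* region number */
    enum region_state state;
    struct region *next;        /* hash chain */
};

struct region_hash {
    struct region *buckets[NR_BUCKETS];
};

/* Look up a cached region; returns NULL if it is not in the hash. */
static struct region *rh_find(struct region_hash *rh, unsigned long key)
{
    struct region *r;

    for (r = rh->buckets[key % NR_BUCKETS]; r; r = r->next)
        if (r->key == key)
            return r;
    return NULL;
}

/*
 * Hypothetical stand-in for what '__bio_mark_nosync' causes: the region
 * is cached as out-of-sync and nothing ever moves it to a clean list.
 * Returns 0 on success, -1 if allocation fails.
 */
static int rh_mark_nosync(struct region_hash *rh, unsigned long key)
{
    struct region *r = rh_find(rh, key);

    if (!r) {
        r = malloc(sizeof(*r));
        if (!r)
            return -1;
        r->key = key;
        r->next = rh->buckets[key % NR_BUCKETS];
        rh->buckets[key % NR_BUCKETS] = r;
    }
    r->state = REGION_NOSYNC;
    return 0;
}

/* True if this node would write only to the primary leg for the region. */
static bool rh_primary_only(struct region_hash *rh, unsigned long key)
{
    struct region *r = rh_find(rh, key);

    return r && r->state == REGION_NOSYNC;
}

/* Sketch of the proposed fix: drop every cached region on suspend. */
static void rh_clear_on_suspend(struct region_hash *rh)
{
    for (size_t i = 0; i < NR_BUCKETS; i++) {
        struct region *r = rh->buckets[i];

        while (r) {
            struct region *next = r->next;

            free(r);
            r = next;
        }
        rh->buckets[i] = NULL;
    }
}
```

In this model, calling rh_mark_nosync() on a zero-initialized struct region_hash makes rh_primary_only() return true for that region indefinitely; only rh_clear_on_suspend() makes it return false again, which mirrors the suspend/resume scenario in the description.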

Comment 2 RHEL Program Management 2009-01-27 20:44:08 UTC

This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 3 RHEL Program Management 2009-02-16 15:25:43 UTC

Updating PM score.

Comment 4 Jonathan Earl Brassow 2009-04-21 19:36:57 UTC

Handling of device-mapper mirror failures has not changed, and therefore, no change is required for kernel code at this time.  Pushing out.  (See comment #1 for more detail.)

Comment 6 Jonathan Earl Brassow 2009-10-14 15:27:28 UTC

I'll try to figure this out.  These bugs take a long time to decipher, so it will have to be 'conditional nack - capacity' vs. devel_ack.  Note the configuration when setting severity/priority scores.

Comment 7 Jonathan Earl Brassow 2010-01-26 21:16:47 UTC

Please verify whether this bug still exists with the latest RHEL 5.5 kernel and userspace packages.  Many things have changed that would have a direct impact on this bug:
1) kernel handles write failures differently now
2) userspace cleans up LVs on an individual basis now vs on a VG scale

Comment 9 RHEL Program Management 2010-08-25 16:09:54 UTC

This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 12 Corey Marthaler 2010-10-15 22:12:06 UTC

This bug is no longer reproducible with the latest rpms. Marking verified.

2.6.18-225.el5

lvm2-2.02.74-1.el5    BUILT: Fri Oct 15 10:26:21 CDT 2010
lvm2-cluster-2.02.74-1.el5    BUILT: Fri Oct 15 10:27:02 CDT 2010
device-mapper-1.02.55-1.el5    BUILT: Fri Oct 15 06:15:55 CDT 2010
cmirror-1.1.39-10.el5    BUILT: Wed Sep  8 16:32:05 CDT 2010
kmod-cmirror-0.1.22-3.el5    BUILT: Tue Dec 22 13:39:47 CST 2009

Comment 13 Jaromir Hradilek 2010-11-17 14:29:21 UTC

    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Data corruption could occur when using 3 or more mirrors. With this update, the underlying cluster code has been modified to address this issue, and the data corruption no longer occurs.

Comment 15 errata-xmlrpc 2011-01-13 22:48:56 UTC

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0057.html