There are two ways to create a "raid1" LV (example commands for both are shown at the end of this comment):
1) use the lvcreate command to create the whole thing at once
2) use the lvconvert command to "upconvert" from a linear LV

The way that initialization happens in RAID can be problematic for #2 if there is a primary device failure during the process. After encountering a failure (read, write, or whole-device) on the primary device, the sync process simply continues, using an alternate device to complete the sync. This is odd but not an issue with #1; however, in the case of #2 it can lead to data loss - even if the primary device is not replaced and is later revived.

Imagine three devices. The first device is filled with 'A's, the second with 'B's and the third with 'C's. Before the sync begins, the three disks would look like:

  A B C   <- 2nd half of the disks
  A B C   <- 1st half of the disks

Let's say the first device dies at the moment the first half of the addressable space is sync'ed. We would then have the following (where [] signifies the dead device):

  [A] B C
  [A] A A

The sync process will simply continue, using the next available device as the source after marking the first device as 'failed'. The final result will be:

  [A] B B
  [A] A A

If the primary device can be revived at this point, it is synchronized according to the rest of the copies, leaving:

  B B B
  A A A

Again, this is fine for a "raid1" that was created from scratch - the contents are perfectly in sync. However, if the array was being upconverted from a linear LV, the last half of the original contents is destroyed by the recovery process when the device is revived.

In the case of #2, we would much rather have the recovery process stop when the primary fails - giving the user a chance to revive it (remember, the failure could have been caused by something as simple as a single read/write error). Once revived, the desire would be to continue syncing from the original LV.
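For reference, the two approaches correspond to commands along these lines (the VG/LV names and size are placeholders, not taken from this report):

  # 1) create the whole raid1 LV at once
  lvcreate --type raid1 -m 1 -L 10G -n lv vg

  # 2) up-convert an existing linear LV to raid1
  lvconvert --type raid1 -m 1 vg/lv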
The MD raid1 personality behaves that way in the case of multi-legged mirrors (i.e. it selects a new primary when the current one dies). The behavioural change you're requesting ("...to continue syncing from the original LV.") implies first updating any dirty regions on the returning initial primary leg before restarting the previously interrupted resynchronization from where it was discontinued; otherwise any updates would be lost. That wouldn't work, though, because regions aren't mapped 1:1 to I/O payload sizes and offsets and in turn will typically not be fully written over, so parts of regions on the returned primary leg would be replaced with uninitialized data, causing data corruption. To compensate for that, we'd need a finer-grained write-intent bitmap (i.e. a tiny region size) to make sure whole regions get updated in this situation, which imposes overhead and scalability issues on large linear LVs being up-converted because of bitmap size limits.
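To illustrate the granularity/size trade-off (the figures below are an arbitrary example, not defaults): the write-intent bitmap needs one bit per region, so

  1 TiB LV / 512 KiB region size = 2,097,152 regions  ->  ~256 KiB of bitmap

and halving the region size doubles the bitmap. Where the LVM version supports it, a region size can be requested at conversion time with -R/--regionsize, e.g. (hypothetical VG/LV names):

  lvconvert --type raid1 -m 1 --regionsize 512K vg/lv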
It is important to note that an up-converted linear -> raid1 LV that is still performing its initial synchronization is no more resilient right after the conversion than the previous linear LV was. The initial sync in this conversion just causes the resilience ratio to grow to 100% over time. We may only be able to work around a transiently failing primary leg (the previous linear LV containing the user data) with a solution along the lines of comment #1.
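Until that initial sync completes, the redundancy actually achieved can be watched via the copy percentage, for example (hypothetical LV name; exact columns depend on the LVM version):

  lvs -a -o name,copy_percent vg/lv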
This bug will be deferred to 7.5, but needs a release note.
Fixed in RHEL7.4.

Fixed by:
ddb14b6 lvconvert: Disallow removal of primary when up-converting (recovering)
4c0e908 RAID (lvconvert/dmeventd): Cleanly handle primary failure during 'recover' op
d34d206 lvconvert: Don't require a 'force' option during RAID repair.
c87907d lvconvert: linear -> raid1 upconvert should cause "recover" not "resync"
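As a rough sketch of the fixed behaviour per the commit summaries above (hypothetical VG/LV names; output varies by LVM version):

  lvconvert --type raid1 -m 1 vg/lv
  lvs -a -o name,copy_percent,raid_sync_action vg/lv

The raid_sync_action field should report "recover" rather than "resync" while the new image is populated from the original linear leg, and a primary failure during that operation is handled cleanly instead of the sync silently continuing from another leg; the failed image can then be addressed with lvconvert --repair vg/lv.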