Description of problem:

It appears that newly allocated mimages are numbered starting just above the greatest number still present in the mirror, regardless of whether that slot was freed by one of the images that failed. Could we have them start at the next integer never previously used in the mirror? So if we have a mirror with:

mimage_0 mimage_1 mimage_2

and we fail mimage_2, shouldn't the next image logically be mimage_3? (If we fail mimage_1 we don't reuse _1 for the newly allocated image; we get _3, as expected.)

The current way makes automating lvm mirror failure testing much more difficult, because now I not only have to know how many legs I randomly failed, but also which positions they were in, and then figure out which goofy way lvm will rebuild the mirror before I can verify which image(s) should be left and which new one(s) should have appeared.

Version-Release number of selected component (if applicable):
2.6.18-160.el5

lvm2-2.02.56-2.el5             BUILT: Thu Dec 10 09:38:13 CST 2009
lvm2-cluster-2.02.56-2.el5     BUILT: Thu Dec 10 09:38:41 CST 2009
device-mapper-1.02.39-1.el5    BUILT: Wed Nov 11 12:31:44 CST 2009
cmirror-1.1.39-2.el5           BUILT: Mon Jul 27 15:39:05 CDT 2009
kmod-cmirror-0.1.22-1.el5      BUILT: Mon Jul 27 15:28:46 CDT 2009
After researching this some more, it turns out the allocate-new-mimage logic is inconsistent, which makes testing (and especially automated testing) super difficult.

If you have a 2-way mirror like so:

mimage_0 sda1
mimage_1 sdb1

and you fail the primary leg (sda1) with the allocate policy, you'll end up with the following:

mimage_0 sdb1
mimage_1 sdn1

If you have a 3-way mirror like so:

mimage_0 sda1
mimage_1 sdb1
mimage_2 sdc1

and you fail the primary leg (sda1) with the allocate policy, you'll end up with the following:

mimage_1 sdb1
mimage_2 sdc1
mimage_3 sdn1

Then add in the multiple-device failure scenarios listed in comment #0, and you can see why I'm pulling my hair out trying to write the logic to verify that, after each type of mirror failure in the matrix of possibilities, the correct mimages and devices are removed and the correct ones are added.
Corey, would it be possible to track the devices instead of the mimage numbers? What currently happens is this: upon failure (or any downconversion), the mimages are shifted so that the ones to be removed are at the end. Then the end is removed and possibly replaced with new images. The numbering of the mimage LVs is (apparently) done from left to right. The second case does look odd, though; so far I've gotten lost trying to understand the code that allocates these numbers. If it went like 0 -> sdb1, 2 -> sdc1 and 3 -> sdn1, that would be reasonable, right? (Looking at the 2-way case for reference.)
Petr, we do also track the actual devices. We attempt to track all devices and mimages both before and after a failure to ensure that everything is where it's expected. If it did end up going to "0 -> sdb1, 2 -> sdc1 and 3 -> sdn1" then yes, that would be reasonable, assuming all types of mirrors behave that way, because it would be consistent. However, I think that may still get confusing depending on the number of legs that were failed, as you'd have leg images getting shuffled all over. I think the best bet is to add them to the end. So if you have the following:

mimage_0 sdb1
mimage_1 sdc1
mimage_2 sdd1
mimage_3 sde1

and you fail sdc1 and sde1, then the newly allocated images should go to:

mimage_0 sdb1
mimage_2 sdd1
mimage_4 sdf1
mimage_5 sdg1

* Note how the new images start at 4, and not 3, even though 3 was failed.
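For what it's worth, the append-at-the-end scheme proposed above is simple to state as code. This is a hypothetical test-harness sketch (not lvm2 code); the function name and the dict-based layout are my own assumptions:

```python
# Sketch of the proposed numbering scheme: new images always take the next
# integer past the highest number ever used, and never reuse a failed slot.

def replace_failed_legs(images, failed, new_devices):
    """images: dict mapping mimage number -> device.
    failed: set of mimage numbers that failed.
    new_devices: replacement devices, one per failed leg."""
    next_num = max(images) + 1  # never reuse a number, even a failed one
    surviving = {n: dev for n, dev in images.items() if n not in failed}
    for dev in new_devices:
        surviving[next_num] = dev
        next_num += 1
    return surviving

# Example from above: fail sdc1 (mimage_1) and sde1 (mimage_3)
mirror = {0: "sdb1", 1: "sdc1", 2: "sdd1", 3: "sde1"}
print(replace_failed_legs(mirror, {1, 3}, ["sdf1", "sdg1"]))
# {0: 'sdb1', 2: 'sdd1', 4: 'sdf1', 5: 'sdg1'}
```

Under this rule the surviving legs never get renumbered, so verification only has to check that failed numbers disappeared and the new ones start past the old maximum.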
Ah, the reason 2-leg mirrors appear different is that they only have one leg remaining after the device failure. I just learned that n-way mirrors behave the same way when enough legs are failed to leave only one remaining. What happens is that the remaining device gets shuffled to mimage_0 (the new primary leg), and all the others that were failed and reallocated get incrementally numbered after it. So you end up with the exact same image numbers that you had before the failure.

Leg failures that leave more than one leg remaining get new images added starting at mimage_n (if the last leg was failed) or mimage_n+1 (if the last leg was not failed). So it's still odd, but a little more understandable than I had originally thought. Now we just need each customer to figure this out as well and nothing will appear random or crazy. :)
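To keep the test automation sane, the behavior described above can be captured in a small predictive model. This is my reading of the observed lvm2-2.02.56 behavior, not code derived from lvm2 itself; the function name and data layout are assumptions:

```python
# Rough model of the observed renumbering behavior after a mirror leg
# failure with the allocate policy. Not lvm2 source logic.

def predict_images(images, failed, new_devices):
    """images: list of (number, device) pairs in mirror order.
    failed: set of mimage numbers that failed.
    new_devices: replacement devices, one per failed leg.
    Returns the predicted (number, device) list after repair."""
    surviving = [(n, dev) for n, dev in images if n not in failed]
    if len(surviving) == 1:
        # All but one leg failed: the survivor becomes mimage_0 (the new
        # primary) and replacements fill in 1, 2, ... so the final numbers
        # match the pre-failure mirror.
        result = [(0, surviving[0][1])]
        next_num = 1
    else:
        result = list(surviving)  # survivors keep their numbers
        last = images[-1][0]
        # New legs start at mimage_n if the last leg failed, mimage_n+1 if not.
        next_num = last if last in failed else last + 1
    for dev in new_devices:
        result.append((next_num, dev))
        next_num += 1
    return result
```

Checking it against the two cases from comment #1: failing the primary of a 2-way mirror gives mimage_0 sdb1, mimage_1 sdn1 (one-survivor rule), and failing the primary of a 3-way mirror gives mimage_1 sdb1, mimage_2 sdc1, mimage_3 sdn1 (last leg survived, so the new leg starts at n+1).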
This should go into the man page.
Since it is too late to address this issue in RHEL 5.5, it has been proposed for RHEL 5.6. Contact your support representative if you need to escalate this issue.
Development Management has reviewed and declined this request. You may appeal this decision by reopening this request.