Bug 1041462

Summary: Cannot create RAID
Product: [Fedora] Fedora
Reporter: Jan Safranek <jsafrane>
Component: python-blivet
Assignee: mulhern <amulhern>
Status: CLOSED CURRENTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: unspecified
Version: rawhide
CC: amulhern, anaconda-maint-list, bcl, dlehman
Hardware: All
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2014-02-26 20:43:33 UTC
Type: Bug
Attachments:
Description    Flags
Reproducer     none
blivet log     none

Description Jan Safranek 2013-12-12 16:13:05 UTC
Created attachment 835884: Reproducer

I cannot create MD RAID with new blivet.

'b.devicetree.processActions(dryRun=False)' creates the RAID, but subsequent 'b.reset()' destroys it. I can see in the log 'ERROR:blivet: failed to scan md array blivet00' and blivet then calls 'mdadm -S'. See attachment for complete reproducer and log.

Version-Release number of selected component (if applicable):
python-blivet-0.31-1.fc21.noarch

How reproducible:
always

The same code worked a couple of weeks ago; I suspect the breakage is caused by the recent RAID rewrites.

Comment 1 Jan Safranek 2013-12-12 16:13:46 UTC
Created attachment 835885: blivet log

Comment 2 David Lehman 2013-12-12 16:21:43 UTC
INFO:program: Running... mdadm --examine --export /dev/sdc1
INFO:program: MD_LEVEL=raid6
INFO:program: MD_DEVICES=4
INFO:program: MD_NAME=rawhide:blivet00
INFO:program: MD_ARRAY_SIZE=195.04MB
INFO:program: MD_UUID=2bcefded:e757bcb3:05b3cd2c:e0337c3b
INFO:program: MD_UPDATE_TIME=1386864551
INFO:program: MD_DEV_UUID=e3923673:09396bb5:66a9dd4f:eafaa2f0
INFO:program: MD_EVENTS=1
DEBUG:program: Return code: 0
INFO:program: Running... mdadm --examine --brief /dev/sdc1
INFO:program: ARRAY /dev/md/blivet00  metadata=1.2 UUID=2bcefded:e757bcb3:05b3cd2c:e0337c3b name=rawhide:blivet00
DEBUG:program: Return code: 0
<snip>
ERROR:blivet: failed to create md array: memberDevices cannot be greater than totalDevices


During discovery we instantiate an MDRaidArrayDevice with just one member device, then add the other members as we find them.
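For illustration, the KEY=VALUE lines that mdadm --examine --export prints in the log above can be read into a dictionary. This is only a minimal sketch: parse_export and the variable names are made up for this example and are not part of blivet, and how MD_DEVICES maps onto blivet's memberDevices and totalDevices is not asserted here. It shows that the superblock of a single member already reports a device count of 4 while only one member has been scanned.

```python
def parse_export(lines):
    """Parse KEY=VALUE lines as printed by `mdadm --examine --export`."""
    info = {}
    for line in lines:
        key, sep, value = line.strip().partition("=")
        if sep:  # skip anything that is not KEY=VALUE
            info[key] = value
    return info


# Lines copied from the log above, with the "INFO:program: " prefix removed.
sample = [
    "MD_LEVEL=raid6",
    "MD_DEVICES=4",
    "MD_NAME=rawhide:blivet00",
    "MD_UUID=2bcefded:e757bcb3:05b3cd2c:e0337c3b",
]

info = parse_export(sample)
expected_devices = int(info["MD_DEVICES"])
found_members = ["/dev/sdc1"]  # only the first member has been scanned so far

print(info["MD_LEVEL"], expected_devices, len(found_members))
```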

Comment 3 mulhern 2014-02-07 17:56:57 UTC
This behaviour in the MDRaidArrayDevice constructor was, in all probability, introduced in commit 1256be6ee0795ade15edcd75bfcdf5eee6cda9d5, but that doesn't mean the behaviour is wrong. When would it ever be right for memberDevices to be larger than totalDevices?

I think what we should really do is make the check correct in the constructor and make a few of those semi-redundant parameters optional.

The problem is that this constructor gets called in two completely distinct contexts: when setting up a plan for an md raid device, and when finding out the current state of an existing device. Expectations on the parameters need to be different in those two contexts.

The exists parameter distinguishes between those two contexts, so some checks in the constructor should depend on whether exists is True.
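A minimal sketch of that idea, assuming a made-up stand-in class (ArraySketch is hypothetical and is not blivet's MDRaidArrayDevice; only the parameter names memberDevices, totalDevices and exists come from this report). The count check runs when planning a new array and is skipped when describing one that already exists:

```python
class ArraySketch:
    """Hypothetical stand-in for an MD array object; not blivet's class."""

    def __init__(self, name, memberDevices=0, totalDevices=0, exists=False):
        if not exists and memberDevices > totalDevices:
            # Planning a new array: the caller must supply consistent counts.
            raise ValueError(
                "memberDevices cannot be greater than totalDevices")
        self.name = name
        self.memberDevices = memberDevices
        self.totalDevices = totalDevices
        self.exists = exists


# Discovery: the array exists, but totalDevices is not known yet.
found = ArraySketch("blivet00", memberDevices=4, exists=True)

# Planning: inconsistent counts are still rejected.
try:
    ArraySketch("planned", memberDevices=4, totalDevices=2)
    rejected = False
except ValueError:
    rejected = True

print(found.memberDevices, found.totalDevices, rejected)
```

Whether the discovery path should skip the check entirely or apply a weaker one is a design choice this sketch does not settle.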

Comment 4 Jan Safranek 2014-02-10 10:00:52 UTC
On today's rawhide (python-blivet-0.40-1.fc21.noarch) the reproducer script works well and creates the MD RAID, so for me the bug is fixed. I never pass memberDevices higher than totalDevices (see the reproducer); blivet must have messed it up on its own, and that has been fixed.

Comment 5 mulhern 2014-02-10 16:42:41 UTC
I'm not convinced that this good situation will last. handleUdevMDMemberFormat() invokes the MDRaidArrayDevice constructor, but at that point the function has enough information to determine only memberDevices, not totalDevices.

I think that handleUdevMDMemberFormat() should be beefed up so that the totalDevices information is available. I believe this can be done with mdadm --detail, and that approach should be explored further. I'm confident that mdadm --detail yields that information; I've already written the code and tests for that. What isn't so obvious is how to use it properly in handleUdevMDMemberFormat().

The MDRaidArrayDevice constructor should be beefed up as well.

Comment 6 mulhern 2014-02-26 20:43:33 UTC
Ok. I'm going to close this but hold on to the code in case this comes up again.

It looks to me like what must be happening is that the array is being found by getDeviceByUuid() so the code that had the error last time is never reached.
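The lookup-before-construct pattern described here can be sketched roughly as follows. RegistrySketch, register and add_member are hypothetical names invented for this example; only getDeviceByUuid echoes a name from this report, and the real blivet code is certainly more involved. When the array is already known by UUID, the branch that would construct it, and therefore run the constructor's checks, is never taken:

```python
class RegistrySketch:
    """Hypothetical device registry; not blivet's DeviceTree."""

    def __init__(self):
        self._by_uuid = {}
        self.constructed = 0  # how often the "constructor" branch ran

    def getDeviceByUuid(self, uuid):
        return self._by_uuid.get(uuid)

    def register(self, uuid, array):
        self._by_uuid[uuid] = array

    def add_member(self, uuid, member):
        array = self.getDeviceByUuid(uuid)
        if array is None:
            # Only this branch would run the constructor and its checks.
            self.constructed += 1
            array = {"uuid": uuid, "members": []}
            self.register(uuid, array)
        array["members"].append(member)
        return array


UUID = "2bcefded:e757bcb3:05b3cd2c:e0337c3b"
tree = RegistrySketch()
tree.register(UUID, {"uuid": UUID, "members": []})  # array already known
tree.add_member(UUID, "/dev/sdc1")

print(tree.constructed, tree.getDeviceByUuid(UUID)["members"])
```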

If you could grab the logs for the working code next time something like this happens that would be much appreciated.

Thanks!

- mulhern