Bug 1804080

Summary: anaconda unable to finish installation with software raid partition
Product: [Fedora] Fedora
Reporter: lnie <lnie>
Component: python-blivet
Assignee: Vojtech Trefny <vtrefny>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 32
CC: anaconda-maint-list, awilliam, bcotton, blivet-maint-list, dlehman, fzatlouk, jkonecny, jonathan, kellin, lnie, lruzicka, mkolman, robatino, rvykydal, vanmeeuwen+fedora, vponcova, vtrefny, wwoods
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: openqa AcceptedBlocker
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-03-10 23:00:46 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1705303    
Attachments:
Description    Flags
anaconda.log   none
tb             none
storage.log    none

Description lnie 2020-02-18 07:57:43 UTC
Created attachment 1663662
anaconda.log

Description of problem:
Boot the installer in virt-manager and create a software RAID partition.
The installer crashes on the installation progress page.
 

Version-Release number of selected component (if applicable):
Fedora-Server-dvd-x86_64-32-20200217.n.0.iso 
Fedora-Workstation-Live-x86_64-32-20200217.n.0.iso     

How reproducible:
always


Comment 1 lnie 2020-02-18 07:58:46 UTC
Created attachment 1663664
tb

Comment 2 lnie 2020-02-18 08:04:08 UTC
Created attachment 1663666
storage.log

Comment 3 Fedora Blocker Bugs Application 2020-02-18 08:05:08 UTC
Proposed as a Blocker for 32-beta by Fedora user lnie using the blocker tracking app because:

 This affects:
The installer must be able to: Correctly interpret...any disk with a valid ms-dos or gpt disk label and partition table containing...software RAID arrays at RAID levels 0, 1 and 5 containing ext4 partitions

Comment 4 lnie 2020-02-18 09:15:00 UTC
*** Bug 1804066 has been marked as a duplicate of this bug. ***

Comment 5 Vendula Poncova 2020-02-18 09:28:19 UTC
It seems to be an issue in the storage configuration library. Reassigning to blivet.

Comment 6 Adam Williamson 2020-02-20 00:17:44 UTC
Now that other bugs are fixed, I can see openQA is running into this too.

Comment 7 Ben Cotton 2020-02-21 18:30:23 UTC
Is this the same as BZ 1798792, which is already accepted as an F32 Beta blocker?

https://bugzilla.redhat.com/show_bug.cgi?id=1798792

Comment 8 Adam Williamson 2020-02-21 18:55:29 UTC
No. That bug is about the installer handling a *pre-existing* software RAID array. This is about creating a *new* one. Different issues.

Comment 9 David Lehman 2020-02-21 21:33:37 UTC
We create the md array such that mdadm should auto-activate it after creation. Something a little weird happens with that, then blivet thinks the array is inactive right afterward, so it tries to activate it and mdadm complains that the devices are busy.
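The double activation described above can be guarded against by checking whether the member devices already belong to an active array before calling `mdadm --assemble`. The following is only a minimal sketch of that idea, not blivet's actual code: the function names and the sample `/proc/mdstat` text are hypothetical, and the parsing assumes the usual `mdNNN : active <level> <dev>[<slot>] ...` line layout.

```python
import re

# Hypothetical /proc/mdstat contents for the situation in this bug:
# the array is already running, but under the name md127.
SAMPLE_MDSTAT = """\
Personalities : [raid0]
md127 : active raid0 vdb2[1] vda3[0]
      12589056 blocks super 1.2 512k chunks

unused devices: <none>
"""


def active_md_members(mdstat_text):
    """Map each active md array name to the set of its member device names."""
    arrays = {}
    for line in mdstat_text.splitlines():
        # Only lines such as "md127 : active raid0 vdb2[1] vda3[0]" are of interest;
        # "inactive" arrays deliberately do not match.
        match = re.match(r"^(md\S*)\s*:\s*active\b(.*)$", line)
        if not match:
            continue
        name, rest = match.groups()
        # Members appear as "<device>[<slot>]".
        arrays[name] = set(re.findall(r"(\S+?)\[\d+\]", rest))
    return arrays


def array_holding(mdstat_text, members):
    """Return the active array that already contains all members, or None."""
    wanted = set(members)
    for name, devices in active_md_members(mdstat_text).items():
        if wanted <= devices:
            return name
    return None


if __name__ == "__main__":
    # If this prints an array name, a second "mdadm --assemble" on the same
    # members would be expected to fail with "device busy".
    print(array_holding(SAMPLE_MDSTAT, ["vda3", "vdb2"]))
```

On a live system the text would come from reading `/proc/mdstat`; the sample string is used here so the sketch runs anywhere.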

From program.log:
02:16:28,760 INF program: Running [57] mdadm --create /dev/md/root00 --run --level=raid0 --raid-devices=2 --metadata=default --chunk=512 /dev/vda3 /dev/vdb2 ...
02:16:33,544 INF program: stdout[57]: 
02:16:33,549 INF program: stderr[57]: mdadm: array /dev/md/root00 started.
mdadm: timeout waiting for /dev/md/root00

02:16:33,551 INF program: ...done [57] (exit code: 0)
02:16:33,552 INF program: Running... udevadm settle --timeout=300
02:16:33,609 DBG program: Return code: 0
02:16:33,628 INF program: Running [58] mdadm --assemble /dev/md/root00 --run /dev/vda3 /dev/vdb2 ...
02:16:34,348 INF program: Running... lsblk --bytes -o NAME,SIZE,OWNER,GROUP,MODE,FSTYPE,LABEL,UUID,PARTUUID,FSAVAIL,FSUSE%,MOUNTPOINT
02:16:34,389 INF program: NAME               SIZE OWNER GROUP MODE       FSTYPE            LABEL                  UUID                                   PARTUUID                                FSAVAIL FSUSE% MOUNTPOINT
<<snip>>
02:16:34,391 INF program: vda         17179869184 root  disk  brw-rw----
02:16:34,391 INF program: |-vda1       1073741824 root  disk  brw-rw---- xfs                                      d2588f60-6c49-4395-8ecd-26bfaa570bfb   b3abfe32-01
02:16:34,392 INF program: |-vda2       3225419776 root  disk  brw-rw----                                                                                 b3abfe32-02
02:16:34,392 INF program: `-vda3       6450839552 root  disk  brw-rw---- linux_raid_member localhost:root00       b218771f-c105-a69d-28b9-2d6d03a98ebd   b3abfe32-03
02:16:34,392 INF program:   `-md127   12891193344 root  disk  brw-rw----
02:16:34,392 INF program: vdb         17179869184 root  disk  brw-rw----
02:16:34,392 INF program: |-vdb1       3225419776 root  disk  brw-rw---- LVM2_member                              rmn2nu-kVYF-zoVx-rlIT-yGRg-c2jE-aOUSmS 2fb40cb3-01
02:16:34,392 INF program: `-vdb2       6450839552 root  disk  brw-rw---- linux_raid_member localhost:root00       b218771f-c105-a69d-28b9-2d6d03a98ebd   2fb40cb3-02
02:16:34,393 INF program:   `-md127   12891193344 root  disk  brw-rw----
02:16:34,393 DBG program: Return code: 0


Perhaps something has changed in mdadm recently?
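
Note that the `mdadm --create` call in the excerpt exits with code 0 even though its stderr reports a timeout, so the problem is easy to miss when only exit codes are checked. A rough sketch of a scanner for such cases follows. It assumes the record layout seen in the excerpt (`Running [N] ...`, `stderr[N]: ...`, `...done [N] (exit code: X)`, with untimestamped continuation lines); the function name and the `needle` parameter are hypothetical and not part of anaconda or blivet.

```python
import re

# Timestamped record as it appears in the excerpt: "02:16:33,551 INF program: <body>"
RECORD = re.compile(r"^\d\d:\d\d:\d\d,\d{3} \w+ program: ?(.*)$")
RUNNING = re.compile(r"^Running \[(\d+)\] (.*?)(?: \.\.\.)?$")
STDERR = re.compile(r"^stderr\[(\d+)\]: ?(.*)$")
DONE = re.compile(r"^\.\.\.done \[(\d+)\] \(exit code: (-?\d+)\)")

SAMPLE_LOG = """\
02:16:28,760 INF program: Running [57] mdadm --create /dev/md/root00 --run --level=raid0 --raid-devices=2 --metadata=default --chunk=512 /dev/vda3 /dev/vdb2 ...
02:16:33,544 INF program: stdout[57]:
02:16:33,549 INF program: stderr[57]: mdadm: array /dev/md/root00 started.
mdadm: timeout waiting for /dev/md/root00

02:16:33,551 INF program: ...done [57] (exit code: 0)
02:16:33,552 INF program: Running... udevadm settle --timeout=300
"""


def find_soft_failures(log_text, needle="timeout"):
    """Return (id, command, matching stderr lines) for commands that exited 0
    but whose stderr contains `needle`."""
    commands, stderr_lines, flagged = {}, {}, []
    current = None  # id of the command whose stderr may continue on the next line
    for raw in log_text.splitlines():
        record = RECORD.match(raw)
        if record is None:
            # Untimestamped, non-empty lines continue the previous stderr record.
            if current is not None and raw.strip():
                stderr_lines[current].append(raw.strip())
            continue
        body = record.group(1)
        current = None
        if (m := RUNNING.match(body)):
            commands[m.group(1)] = m.group(2)
        elif (m := STDERR.match(body)):
            current = m.group(1)
            stderr_lines.setdefault(current, []).append(m.group(2))
        elif (m := DONE.match(body)):
            cmd_id, code = m.group(1), int(m.group(2))
            hits = [line for line in stderr_lines.get(cmd_id, []) if needle in line]
            if code == 0 and hits:
                flagged.append((cmd_id, commands.get(cmd_id, ""), hits))
    return flagged


if __name__ == "__main__":
    for cmd_id, command, hits in find_soft_failures(SAMPLE_LOG):
        print(cmd_id, command.split()[0], hits)
```

The `udevadm settle --timeout=300` line is not flagged because only stderr records are searched, not command lines.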

Comment 10 František Zatloukal 2020-02-24 20:34:47 UTC
Discussed during the 2020-02-24 blocker review meeting: [1]

This bug was classified as an AcceptedBlocker because it violates the following criterion:

"The installer must be able to correctly interpret… any disk with a valid ms-dos or gpt disk label and partition table containing… software RAID arrays at RAID levels 0, 1 and 5 containing ext4 partitions."

[1] https://meetbot-raw.fedoraproject.org/fedora-blocker-review/2020-02-24/f32-blocker-review.2020-02-24-17.00.log.txt

Comment 11 Adam Williamson 2020-03-09 16:42:49 UTC
cmurf suggested this might possibly share a cause with https://bugzilla.redhat.com/show_bug.cgi?id=1809117 - does that seem plausible, David/Vojtech?

Comment 12 David Lehman 2020-03-09 17:03:44 UTC
(In reply to Adam Williamson from comment #11)
> cmurf suggested this might possibly share a cause with
> https://bugzilla.redhat.com/show_bug.cgi?id=1809117 - does that seem
> plausible, David/Vojtech?

Seems plausible to me, but I'm only a casual observer in this case.

Comment 13 Vojtech Trefny 2020-03-09 17:18:44 UTC
(In reply to Adam Williamson from comment #11)
> cmurf suggested this might possibly share a cause with
> https://bugzilla.redhat.com/show_bug.cgi?id=1809117 - does that seem
> plausible, David/Vojtech?

I'm quite sure this bug is caused by the misplaced mdadm udev rules. I didn't close it as a duplicate of 1809117 only because this one is accepted as a blocker.

Comment 14 Adam Williamson 2020-03-10 23:00:46 UTC
Confirmed that this is fixed in today's compose, with the udev rule placement fixed - openQA software RAID install tests all passed.

Comment 15 Vojtech Trefny 2020-03-25 09:10:30 UTC
*** Bug 1812513 has been marked as a duplicate of this bug. ***