Description of problem:
"dracut" uses "blkid", which in turn links "libblkid", to detect the UUID of a device, in particular RAID devices. It seems that "libblkid" from "util-linux-2.39.2-1" does not report the UUID properly, but sometimes only the PARTUUID. The consequence is that "dracut" does not identify the RAID and does not assemble it at boot (which is all but nice). The previous package, "util-linux-2.38.1-4", from Fedora 38 was working fine, and replacing "libblkid.so.1.1.0" with that one fixes the problem.

Version-Release number of selected component (if applicable):
util-linux-2.39.2-1

How reproducible:
Always

Steps to Reproduce:
1. Have a system with "md" RAID on DOS partitions or on non-partitioned devices.
2. dracut -f
3. blkid /dev/some_md_device

Actual results:
"dracut" does not generate the appropriate "initramfs"; "blkid" returns only the PARTUUID ("dracut" uses the UUID to identify the RAID).

Expected results:
"blkid" should return the full information: UUID, UUID_SUB, LABEL, TYPE, PARTUUID
The previous version of "libblkid" returns all of the above.

Additional info:
It seems that a partition with ID=0x83 (Linux) is handled properly, that is, the full correct information is returned, even if it belongs to a "md" device (RAID-1, version 1.0). A partition with ID=0xDA (Non-FS data) is not; only the PARTUUID is returned. The problem may be mainly visible with "md", but the new behavior does not seem correct in any case. Again, this prevents booting on some (if not all) systems with "md" RAID, so it is a pretty serious issue.

Hope this helps,
pg
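For reference, here is a minimal sketch of how to compare the two library versions (device names are just examples from my system; adjust to yours):

# Ask libblkid what it sees on a RAID member partition:
blkid /dev/sda4
# A healthy result includes UUID, UUID_SUB, LABEL, TYPE and PARTUUID;
# the broken libblkid prints only the PARTUUID.

# Rebuild the initramfs and check that dracut still pulls in mdraid:
dracut -f
lsinitrd /boot/initramfs-$(uname -r).img | grep -i mdraid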
It seems I ran into the same issue after upgrading from F37 to F39 a few days ago. Only one raid device was assembled properly after reboot. Two failed. At first all my drives were using mbr. In the course of troubleshooting I came across a warning regarding duplicate UUIDs and fixed that by converting one of the drives to gpt and using gdisk to randomize disk and partition GUIDs.

Here is the output of `blkid` for all raid partitions on Fedora 39 (blkid from util-linux 2.39.2 (libblkid 2.39.2, 17-Aug-2023)):

# sudo blkid /dev/sd[a-c][2-4]
/dev/sda2: PARTLABEL="Linux RAID" PARTUUID="f6f1aa67-4057-49a9-aad9-c86d08ad1936"
/dev/sda3: PARTLABEL="Linux RAID" PARTUUID="dc76e2b3-02e4-4bfc-b260-c7a2ec8bc103"
/dev/sda4: UUID="fb919273-c6bf-b891-ea1c-a83c0a8b3ad7" UUID_SUB="f0245de8-5d17-30f6-200d-32671744ffed" LABEL="urras.penguinpee.nl:54" TYPE="linux_raid_member" PARTLABEL="Linux RAID" PARTUUID="16d46970-7302-427b-ab54-36c774a006a0"
/dev/sdb2: PARTUUID="d2b6d202-02"
/dev/sdb3: PARTUUID="d2b6d202-03"
/dev/sdb4: UUID="fb919273-c6bf-b891-ea1c-a83c0a8b3ad7" UUID_SUB="af10814d-0950-ad5f-3cdf-effe7a4c97c6" LABEL="urras.penguinpee.nl:54" TYPE="linux_raid_member" PARTUUID="d2b6d202-04"
/dev/sdc2: PARTUUID="9c3d16cb-02"
/dev/sdc3: PARTUUID="9c3d16cb-03"
/dev/sdc4: UUID="fb919273-c6bf-b891-ea1c-a83c0a8b3ad7" UUID_SUB="60defeb5-e1fd-3807-5211-24e73927ee3a" LABEL="urras.penguinpee.nl:54" TYPE="linux_raid_member" PARTUUID="9c3d16cb-04"

Compare that to Fedora 38 (blkid from util-linux 2.38.1 (libblkid 2.38.1, 04-Aug-2022)):

# sudo blkid /dev/sd[a-c][2-4]
/dev/sda2: UUID="39295d93-e5a7-5797-b722-87f351563755" UUID_SUB="4bed6fca-21ec-efb3-c1ee-c35f0254ff2b" LABEL="urras.penguinpee.nl:5" TYPE="linux_raid_member" PARTLABEL="Linux RAID" PARTUUID="f6f1aa67-4057-49a9-aad9-c86d08ad1936"
/dev/sda3: UUID="4a2c44b5-25f2-a6c9-0e7f-6cae37a8a9cc" UUID_SUB="b5ab5a58-2c95-8982-f162-8e648048b6a8" LABEL="urras.penguinpee.nl:1" TYPE="linux_raid_member" PARTLABEL="Linux RAID" PARTUUID="dc76e2b3-02e4-4bfc-b260-c7a2ec8bc103"
/dev/sda4: UUID="fb919273-c6bf-b891-ea1c-a83c0a8b3ad7" UUID_SUB="f0245de8-5d17-30f6-200d-32671744ffed" LABEL="urras.penguinpee.nl:54" TYPE="linux_raid_member" PARTLABEL="Linux RAID" PARTUUID="16d46970-7302-427b-ab54-36c774a006a0"
/dev/sdb2: UUID="39295d93-e5a7-5797-b722-87f351563755" UUID_SUB="27b15be2-7aa2-3334-11eb-748bd7e57df1" LABEL="urras.penguinpee.nl:5" TYPE="linux_raid_member" PARTUUID="d2b6d202-02"
/dev/sdb3: UUID="4a2c44b5-25f2-a6c9-0e7f-6cae37a8a9cc" UUID_SUB="e89a1593-d3a3-9398-54ce-414dae37d801" LABEL="urras.penguinpee.nl:1" TYPE="linux_raid_member" PARTUUID="d2b6d202-03"
/dev/sdb4: UUID="fb919273-c6bf-b891-ea1c-a83c0a8b3ad7" UUID_SUB="af10814d-0950-ad5f-3cdf-effe7a4c97c6" LABEL="urras.penguinpee.nl:54" TYPE="linux_raid_member" PARTUUID="d2b6d202-04"
/dev/sdc2: UUID="39295d93-e5a7-5797-b722-87f351563755" UUID_SUB="af87a72c-1cb7-7e54-fa38-8f1c30d18b8e" LABEL="urras.penguinpee.nl:5" TYPE="linux_raid_member" PARTUUID="9c3d16cb-02"
/dev/sdc3: UUID="4a2c44b5-25f2-a6c9-0e7f-6cae37a8a9cc" UUID_SUB="3dc2ac9a-95cf-7103-8ebc-05820f3bed6f" LABEL="urras.penguinpee.nl:1" TYPE="linux_raid_member" PARTUUID="9c3d16cb-03"
/dev/sdc4: UUID="fb919273-c6bf-b891-ea1c-a83c0a8b3ad7" UUID_SUB="60defeb5-e1fd-3807-5211-24e73927ee3a" LABEL="urras.penguinpee.nl:54" TYPE="linux_raid_member" PARTUUID="9c3d16cb-04"

I'm not entirely sure if it's the missing `UUID` or the missing `TYPE` or both. But it sure breaks the assembly of raid devices on boot. Partitions sd[abc]4 make up raid device /dev/md54 on my system.
That's the one that does come up. The other two (sd[abc]2 and sd[abc]3) don't. The only notable difference I could make out so far is the superblock version. The failing devices use version 1.1, whereas /dev/md54 uses version 1.2.
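To check which superblock version a member uses, something like this should do (devices as in the output above). Note that a version 1.1 superblock sits at offset 0 of the member, while 1.2 sits 4 KiB in, which might explain why a change in probing order treats them differently:

# Failing member (superblock at the very start of the partition):
sudo mdadm --examine /dev/sdb2 | grep -E 'Version|Array UUID'
# Working member (superblock 4 KiB from the start):
sudo mdadm --examine /dev/sdb4 | grep -E 'Version|Array UUID'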
Proposed as a Blocker for 40-beta by Fedora user gui1ty using the blocker tracking app because:

This went unnoticed during F39 beta and release tests, but it breaks upgrading from a previous release (F37 in my case, but F38 is also affected) to F39. It will do the same for anyone wanting to upgrade from F38 to F40. There's no workaround if one of the essential parts needed for boot is on an md raid device.

In my particular case, I was unable to manually assemble one of the raid devices required for boot in the emergency environment. The system would freeze running `mdadm --assemble`. This might or might not be related, but it sure does prevent manual workarounds.
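For anyone else stranded in the emergency shell, the usual manual steps are roughly the following (a sketch with example device names; as noted above, on my system the assemble froze):

# Let mdadm try to find and assemble all arrays:
mdadm --assemble --scan

# Or assemble a single array explicitly from its members:
mdadm --assemble /dev/md5 /dev/sda2 /dev/sdb2 /dev/sdc2

# Inspect the result:
cat /proc/mdstat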
After some bisecting (barring errors on my side), the guilty patch appears to be:

b8889c0a214aeb3dd47bf1ab280fe5534b64d2aa is the first bad commit
commit b8889c0a214aeb3dd47bf1ab280fe5534b64d2aa
Author: Luca Boccassi <bluca>
Date:   Thu Feb 9 01:21:07 2023 +0000

    libblkid: try LUKS2 first when probing

    If a device is locked with OPAL we'll get an I/O error when probing
    past the LUKS2 header. Try it first to avoid this issue.

    It is useful regardless, as when we find a LUKS superblock we stop
    there anyway, no need to check futher.

    Signed-off-by: Luca Boccassi <bluca>

 libblkid/src/superblocks/superblocks.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

Hope this helps,
bye,
pg
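In case anyone wants to double-check, the bisection went roughly like this (a sketch; the good/bad tags are the upstream releases the two Fedora packages are based on, and the build commands are what a typical autotools checkout of util-linux needs):

git clone https://github.com/util-linux/util-linux.git
cd util-linux
git bisect start
git bisect bad v2.39.2     # blkid misses the md UUID here
git bisect good v2.38.1    # blkid reports it correctly here
# at each step, build and test the freshly built blkid:
./autogen.sh && ./configure && make blkid
sudo ./blkid /dev/sdX      # then mark 'git bisect good' or 'git bisect bad'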
The offending commit was reverted upstream:
https://github.com/util-linux/util-linux/commit/93ba7961779789217a1f814ce3110ff8c040c8c3

We got a new util-linux version the day before yesterday:
https://bodhi.fedoraproject.org/updates/FEDORA-2024-aab6aa64a3

It broke Silverblue installs, but... it should fix this. Can somebody try? Thanks!
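Before testing, it's probably worth double-checking which build is actually installed, e.g.:

rpm -q util-linux libblkid
blkid --version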
FEDORA-2024-b24d74c260 (util-linux-2.39.3-5.fc39) has been submitted as an update to Fedora 39. https://bodhi.fedoraproject.org/updates/FEDORA-2024-b24d74c260
Since it seems like a pretty clear-cut case, I backported the reversion for F39 too.
Hi, thanks for the update. Apparently this version fixes the problem. I just installed "libblkid" (together with "libmount" and "libuuid", which seem to be required) from Koji, and "blkid /dev/sdX" now returns the same information as the working version. Thanks again! pg
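In case it helps others, grabbing a build straight from Koji looks roughly like this (a sketch; substitute whichever NVR you are testing, here the F39 build from this bug):

koji download-build --arch=x86_64 util-linux-2.39.3-5.fc39
sudo dnf install ./libblkid-2.39.3-5.fc39.x86_64.rpm \
    ./libmount-2.39.3-5.fc39.x86_64.rpm \
    ./libuuid-2.39.3-5.fc39.x86_64.rpm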
Thanks! Did you test with the F39 update, or the Rawhide package?
FEDORA-2024-b24d74c260 has been pushed to the Fedora 39 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2024-b24d74c260` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2024-b24d74c260 See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
I installed the F39 update and can confirm that the issue with RAID assembly is solved. The output of `blkid` looks healthy and the system boots as before.
Can anyone confirm that this works OK with the rawhide build too, by any chance? It *should*, but it'd be nice to be sure. Just booting a current nightly from https://openqa.fedoraproject.org/nightlies.html and checking would be enough - get a nightly from today, don't worry that it's not marked as "last known good". Thanks!
FEDORA-2024-b24d74c260 (util-linux-2.39.3-5.fc39) has been pushed to the Fedora 39 stable repository. If problem still persists, please make note of it in this bug report.
I'm reopening this for now. I just updated my system, and upon reboot two md raid devices came up degraded because the partitions residing on /dev/sda were not added to the arrays.

lsblk -f
NAME   FSTYPE FSVER LABEL UUID                                 FSAVAIL FSUSE% MOUNTPOINTS
sda
├─sda1 swap   1           d9d88c35-eb44-446b-b65c-d831a19daeac                [SWAP]
├─sda2
├─sda3
└─sda4

Strangely enough, the swap partition on sda1 did not cause any issue.

cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4]
md1 : active raid1 sda3[3](S) sdc3[4] sdb3[2]
      524286976 blocks super 1.1 [2/2] [UU]
      bitmap: 0/4 pages [0KB], 65536KB chunk

md5 : active raid5 sdc2[4] sdb2[5]
      1048573952 blocks super 1.1 level 5, 512k chunk, algorithm 2 [3/2] [_UU]
      bitmap: 1/4 pages [4KB], 65536KB chunk

md54 : active raid5 sdc4[3] sdb4[4]
      1797031936 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/2] [_UU]
      bitmap: 4/7 pages [16KB], 65536KB chunk

I was able to restore the arrays using `mdadm /dev/md5 --re-add /dev/sda2` (similar for md54) without any problems. Recovery only took a little while thanks to the write intent bitmap. After everything was in working order again, the missing information was present:

lsblk -f
NAME                          FSTYPE      FSVER    LABEL UUID                                   FSAVAIL FSUSE% MOUNTPOINTS
sda
├─sda1                        swap        1              d9d88c35-eb44-446b-b65c-d831a19daeac                  [SWAP]
├─sda2
│ └─md5                       LVM2_member LVM2 001       b2nOV2-SYbK-N5WT-gi1p-99rp-Fvf6-QmhflM
│   ├─vg_urras_raid5-lv_home  ext4        1.0      home  3e8c03fd-dff5-468d-894b-9357f35117bb     10.5G    97% /home
│   ├─vg_urras_raid5-lv_video ext4        1.0      video 25487924-8f60-4f8a-bad8-f846cce352c6    682.4M   100% /data/video
│   └─vg_urras_raid5-lv_music ext4        1.0      music 5039214b-c2f6-49c2-a60c-20fafff82742     32.5G    34% /data/music
├─sda3
└─sda4
  └─md54                      LVM2_member LVM2 001       JaewFP-8Dx9-JakO-En0g-onza-Dhve-Oq3gfZ
    ├─vg_urras_raid5-lv_home  ext4        1.0      home  3e8c03fd-dff5-468d-894b-9357f35117bb     10.5G    97% /home
    ├─vg_urras_raid5-lv_video ext4        1.0      video 25487924-8f60-4f8a-bad8-f846cce352c6    682.4M   100% /data/video
    ├─vg_urras_raid5-lv_opt   ext4        1.0      opt   0831d40c-921d-4bb9-a989-81283a30af16      9.5G    51% /opt
    ├─vg_urras_raid5-lv_games ext4        1.0      games f35bac33-11bd-4cc2-8bb9-a8820599368b    125.6G    77% /data/games

I poked through the logs with `journalctl -xb` but I haven't found anything so far. I am on util-linux-2.39.3-5.fc39.x86_64.
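For completeness, the recovery boiled down to (commands as used above; devices match my system):

sudo mdadm /dev/md5 --re-add /dev/sda2
sudo mdadm /dev/md54 --re-add /dev/sda4
# watch the bitmap-based resync finish:
watch -n1 cat /proc/mdstat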
This bug appears to have been reported against 'rawhide' during the Fedora Linux 40 development cycle. Changing version to 40.
well, Piergiorgio bisected it. We never actually confirmed that you and he were seeing the same issue, I guess.
True. However, before the update, the issue was reproducible with every boot, as well as when booting from installation media, as I recorded in the companion troubleshooting thread on Discussion:

https://discussion.fedoraproject.org/t/system-fails-to-boot-after-dnf-system-upgrade-due-to-missing-md-raid-devices/100218

Since the update I have booted twice with success and once with issues. The troublesome boot that resulted in comment 13 later turned out to be even more troublesome: sync processes would hang, as discovered when running mock:

https://github.com/rpm-software-management/mock/issues/1327#issuecomment-1946046409

For now, I would like to keep this open. I may run some more investigation if the need arises. Though bisecting is not my strongest suit, and the system affected is my daily driver, which, for obvious reasons, I don't like to mess around with too much.
it's also a bit tricky as this is a proposed blocker for F40, and F40 has a newer util-linux that you're not testing. can you maybe at least try booting an f40 live image a few times and seeing if it brings up the array correctly?
Right, now I see why you were pushing for rawhide (now F40) tests in comment 11. I can certainly try booting with F40 media. I'll leave the needinfo flag set, so I don't forget to update the bug with my findings.
Discussed during the 2024-02-19 blocker review meeting: [0]

The decision to delay the classification of this as a blocker bug was made because the reporter who reopened this issue may not have the same bug as the original reporter; the situation seems less clear-cut than it was for the original report. We will delay the decision while attempting to get more information on Sandro's case, including testing on Fedora 40.

[0] https://meetbot.fedoraproject.org/blocker-review_matrix_fedoraproject-org/2024-02-19/f40-blocker-review.2024-02-19-17.00.log.txt
I have tested booting from an F40 ISO with util-linux-2.40. No issues occurred. The UUIDs of all drives were listed correctly and the RAID devices assembled.

Since I was curious whether the issue I saw might be a different bug in util-linux, I also bisected and arrived at the same commit as in comment 3. With that commit reverted, I'd say the issue that made me reopen this bug was probably one of the drives having a glitch, being unresponsive during boot, or the like.
awesome, thanks for testing.