Bug 2249392 - libblkid returns incomplete information for partitions (e.g. UUID and TYPE are missing), preventing assembly of md raid devices
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: util-linux
Version: 40
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Karel Zak
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: BetaBlocker, F40BetaBlocker
 
Reported: 2023-11-12 18:47 UTC by Piergiorgio Sartor
Modified: 2024-04-08 16:05 UTC
CC List: 8 users

Fixed In Version: util-linux-2.39.3-5.fc39
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2024-02-22 17:36:59 UTC
Type: Bug
Embargoed:



Description Piergiorgio Sartor 2023-11-12 18:47:03 UTC
Description of problem:
"dracut" uses "blkid" which in turn links "libblkid" to detect UUID of device, particularly, RAID devices.
It seems that "libblkid" from "util-linux-2.39.2-1" does not report the UUID properly, but, sometime, the PARTUUID.
The consequence is that "dracut" does not identify the RAID and it does not assemble it at boot (which is all but nice).
The previous package, "util-linux-2.38.1-4", from Fedora 38 was working fine and replacing "libblkid.so.1.1.0" with this one fixes the problem.
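
(For illustration, not part of the original report: "blkid -p" runs libblkid's low-level probing directly, bypassing the cache, so it is a quick way to see what the library itself returns for a member partition; the device name below is a placeholder.)

sudo blkid -p -o export /dev/sdXN   # placeholder for an md member partition
# On an affected libblkid this prints only the PARTUUID for the member;
# on a working libblkid it also prints UUID, LABEL and TYPE.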

Version-Release number of selected component (if applicable):
util-linux-2.39.2-1

How reproducible:
Always

Steps to Reproduce:
1. Have a system with "md" RAID on DOS partitions or on non-partitioned devices.
2. Run:
   dracut -f
   blkid /dev/some_md_device

Actual results:
"dracut" does not generate the appropriate "initramfs", "blkid" returns only the PARTUUID ("dracut" uses the UUID to identify the RAID).

Expected results:
"blkid" should return the full information: UUID, SUB_UUID, LABEL, TYPE, PARTUUID
The previous version of "libblkid" is returning all of the above.

Additional info:
It seems that a partition with ID=0x83 (Linux) is handled properly, that is, the full correct information is returned, even if it belongs to an "md" device (RAID-1, version 1.0). A partition with ID=0xDA (Non-FS data) is not; only the PARTUUID is returned.
It could be that the problem is mainly visible with "md", but the new behavior still does not seem correct.
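
(As a reference, the DOS partition type ID of each member can be checked with sfdisk; a sketch, the device name is a placeholder:)

sudo sfdisk -l /dev/sdX   # the Id/Type columns show 83 (Linux) vs da (Non-FS data)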

Again, this prevents booting on some (or all) systems with "md" RAID, so it is a pretty serious issue.

Hope this helps,

pg

Comment 1 Sandro 2023-12-31 17:39:49 UTC
It seems I ran into the same issue after upgrading from F37 to F39 a few days ago. Only one RAID device was assembled properly after reboot. Two failed. At first all my drives were using MBR. In the course of troubleshooting I came across a warning regarding duplicate UUIDs and fixed that by converting one of the drives to GPT and using gdisk to randomize the disk and partition GUIDs.

Here is the output of `blkid` for all raid partitions on Fedora 39 (blkid from util-linux 2.39.2  (libblkid 2.39.2, 17-Aug-2023)):

# sudo blkid /dev/sd[a-c][2-4]
/dev/sda2: PARTLABEL="Linux RAID" PARTUUID="f6f1aa67-4057-49a9-aad9-c86d08ad1936"
/dev/sda3: PARTLABEL="Linux RAID" PARTUUID="dc76e2b3-02e4-4bfc-b260-c7a2ec8bc103"
/dev/sda4: UUID="fb919273-c6bf-b891-ea1c-a83c0a8b3ad7" UUID_SUB="f0245de8-5d17-30f6-200d-32671744ffed" LABEL="urras.penguinpee.nl:54" TYPE="linux_raid_member" PARTLABEL="Linux RAID" PARTUUID="16d46970-7302-427b-ab54-36c774a006a0"
/dev/sdb2: PARTUUID="d2b6d202-02"
/dev/sdb3: PARTUUID="d2b6d202-03"
/dev/sdb4: UUID="fb919273-c6bf-b891-ea1c-a83c0a8b3ad7" UUID_SUB="af10814d-0950-ad5f-3cdf-effe7a4c97c6" LABEL="urras.penguinpee.nl:54" TYPE="linux_raid_member" PARTUUID="d2b6d202-04"
/dev/sdc2: PARTUUID="9c3d16cb-02"
/dev/sdc3: PARTUUID="9c3d16cb-03"
/dev/sdc4: UUID="fb919273-c6bf-b891-ea1c-a83c0a8b3ad7" UUID_SUB="60defeb5-e1fd-3807-5211-24e73927ee3a" LABEL="urras.penguinpee.nl:54" TYPE="linux_raid_member" PARTUUID="9c3d16cb-04"

Compare that to Fedora 38 (blkid from util-linux 2.38.1 (libblkid 2.38.1, 04-Aug-2022)):

# sudo blkid /dev/sd[a-c][2-4]
/dev/sda2: UUID="39295d93-e5a7-5797-b722-87f351563755" UUID_SUB="4bed6fca-21ec-efb3-c1ee-c35f0254ff2b" LABEL="urras.penguinpee.nl:5" TYPE="linux_raid_member" PARTLABEL="Linux RAID" PARTUUID="f6f1aa67-4057-49a9-aad9-c86d08ad1936"
/dev/sda3: UUID="4a2c44b5-25f2-a6c9-0e7f-6cae37a8a9cc" UUID_SUB="b5ab5a58-2c95-8982-f162-8e648048b6a8" LABEL="urras.penguinpee.nl:1" TYPE="linux_raid_member" PARTLABEL="Linux RAID" PARTUUID="dc76e2b3-02e4-4bfc-b260-c7a2ec8bc103"
/dev/sda4: UUID="fb919273-c6bf-b891-ea1c-a83c0a8b3ad7" UUID_SUB="f0245de8-5d17-30f6-200d-32671744ffed" LABEL="urras.penguinpee.nl:54" TYPE="linux_raid_member" PARTLABEL="Linux RAID" PARTUUID="16d46970-7302-427b-ab54-36c774a006a0"
/dev/sdb2: UUID="39295d93-e5a7-5797-b722-87f351563755" UUID_SUB="27b15be2-7aa2-3334-11eb-748bd7e57df1" LABEL="urras.penguinpee.nl:5" TYPE="linux_raid_member" PARTUUID="d2b6d202-02"
/dev/sdb3: UUID="4a2c44b5-25f2-a6c9-0e7f-6cae37a8a9cc" UUID_SUB="e89a1593-d3a3-9398-54ce-414dae37d801" LABEL="urras.penguinpee.nl:1" TYPE="linux_raid_member" PARTUUID="d2b6d202-03"
/dev/sdb4: UUID="fb919273-c6bf-b891-ea1c-a83c0a8b3ad7" UUID_SUB="af10814d-0950-ad5f-3cdf-effe7a4c97c6" LABEL="urras.penguinpee.nl:54" TYPE="linux_raid_member" PARTUUID="d2b6d202-04"
/dev/sdc2: UUID="39295d93-e5a7-5797-b722-87f351563755" UUID_SUB="af87a72c-1cb7-7e54-fa38-8f1c30d18b8e" LABEL="urras.penguinpee.nl:5" TYPE="linux_raid_member" PARTUUID="9c3d16cb-02"
/dev/sdc3: UUID="4a2c44b5-25f2-a6c9-0e7f-6cae37a8a9cc" UUID_SUB="3dc2ac9a-95cf-7103-8ebc-05820f3bed6f" LABEL="urras.penguinpee.nl:1" TYPE="linux_raid_member" PARTUUID="9c3d16cb-03"
/dev/sdc4: UUID="fb919273-c6bf-b891-ea1c-a83c0a8b3ad7" UUID_SUB="60defeb5-e1fd-3807-5211-24e73927ee3a" LABEL="urras.penguinpee.nl:54" TYPE="linux_raid_member" PARTUUID="9c3d16cb-04"

I'm not entirely sure if it's the missing `UUID` or the missing `TYPE` or both. But it sure breaks the assembly of raid devices on boot.

Partitions sd[abc]4 make up RAID device /dev/md54 on my system. That's the one that does come up. The other two (sd[abc]2 and sd[abc]3) don't. The only notable difference I could make out so far is the superblock version. The failing devices use version 1.1, whereas /dev/md54 uses version 1.2.
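
(For reference, a hedged sketch of how the superblock version of each member can be checked; device names as above:)

sudo mdadm --examine /dev/sda2 | grep -i version   # member of a failing array reports 1.1
sudo mdadm --examine /dev/sda4 | grep -i version   # member of /dev/md54 reports 1.2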

Comment 2 Fedora Blocker Bugs Application 2023-12-31 17:48:36 UTC
Proposed as a Blocker for 40-beta by Fedora user gui1ty using the blocker tracking app because:

 This went unnoticed during F39 beta and release tests, but it breaks upgrading from a previous release (F37 in my case, but F38 is also affected) to F39. It will do the same for anyone wanting to upgrade from F38 to F40.

There's no workaround if one of the essential parts needed for boot is on an md raid device. In my particular case, I was unable to assemble one of the raid devices required for boot manually in the emergency environment. The system would freeze running `mdadm --assemble`. This might or might not be related. But it sure does prevent manual workarounds.

Comment 3 Piergiorgio Sartor 2024-01-09 20:44:49 UTC
After some bisecting, and barring errors on my side, the guilty patch appears to be:

b8889c0a214aeb3dd47bf1ab280fe5534b64d2aa is the first bad commit
commit b8889c0a214aeb3dd47bf1ab280fe5534b64d2aa
Author: Luca Boccassi <bluca>
Date:   Thu Feb 9 01:21:07 2023 +0000

    libblkid: try LUKS2 first when probing
    
    If a device is locked with OPAL we'll get an I/O error when probing
    past the LUKS2 header. Try it first to avoid this issue. It is
    useful regardless, as when we find a LUKS superblock we stop there
    anyway, no need to check futher.
    
    Signed-off-by: Luca Boccassi <bluca>

 libblkid/src/superblocks/superblocks.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)
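
(A sketch of how such a bisect can be run against the util-linux tree; the good/bad tags follow the versions named above, and build details may vary:)

git clone https://github.com/util-linux/util-linux.git && cd util-linux
git bisect start
git bisect bad v2.39.2    # misses UUID/TYPE for the md member
git bisect good v2.38.1   # last known good version
# at each step, build and test the freshly built blkid:
./autogen.sh && ./configure && make blkid    # or just: make
sudo ./blkid -p -o export /dev/sdXN          # placeholder md member partition
git bisect good    # or "git bisect bad", depending on the output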

Hope this helps,

bye,

pg

Comment 4 Adam Williamson 2024-02-06 17:33:02 UTC
The offending commit was reverted upstream:
https://github.com/util-linux/util-linux/commit/93ba7961779789217a1f814ce3110ff8c040c8c3

We got a new util-linux version the day before yesterday:
https://bodhi.fedoraproject.org/updates/FEDORA-2024-aab6aa64a3

It broke Silverblue installs, but... it should fix this. Can somebody try? Thanks!

Comment 5 Fedora Update System 2024-02-06 18:04:45 UTC
FEDORA-2024-b24d74c260 (util-linux-2.39.3-5.fc39) has been submitted as an update to Fedora 39.
https://bodhi.fedoraproject.org/updates/FEDORA-2024-b24d74c260

Comment 6 Adam Williamson 2024-02-06 18:06:28 UTC
Since it seems like a pretty clear-cut case, I backported the reversion for F39 too.

Comment 7 Piergiorgio Sartor 2024-02-06 18:18:54 UTC
Hi, thanks for the update.

Apparently this version fixes the problem.
I just installed "libblkid" (together with "libmount" and "libuuid", which seem required) from Koji, and "blkid /dev/sdX" now returns the same information as the working version.
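
(For reference, one way to pull that build for a local test is via the Koji CLI; roughly, with the package names mentioned above:)

koji download-build --arch=x86_64 util-linux-2.39.3-5.fc39
sudo dnf install ./libblkid-2.39.3-5.fc39.x86_64.rpm \
                 ./libmount-2.39.3-5.fc39.x86_64.rpm \
                 ./libuuid-2.39.3-5.fc39.x86_64.rpm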

Thanks again!

pg

Comment 8 Adam Williamson 2024-02-06 23:22:22 UTC
Thanks! Did you test with the F39 update, or the Rawhide package?

Comment 9 Fedora Update System 2024-02-07 01:29:57 UTC
FEDORA-2024-b24d74c260 has been pushed to the Fedora 39 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2024-b24d74c260`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2024-b24d74c260

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 10 Sandro 2024-02-07 09:41:19 UTC
I installed the F39 update and can confirm that the issue with RAID assembly is solved. Output of `blkid` looks healthy and the system boots as it did before.

Comment 11 Adam Williamson 2024-02-07 18:27:31 UTC
Can anyone confirm that this works OK with the rawhide build too, by any chance? It *should*, but it'd be nice to be sure. Just booting a current nightly from https://openqa.fedoraproject.org/nightlies.html and checking would be enough - get a nightly from today, don't worry that it's not marked as "last known good". Thanks!

Comment 12 Fedora Update System 2024-02-10 01:26:11 UTC
FEDORA-2024-b24d74c260 (util-linux-2.39.3-5.fc39) has been pushed to the Fedora 39 stable repository.
If problem still persists, please make note of it in this bug report.

Comment 13 Sandro 2024-02-13 20:35:55 UTC
I'm reopening this provisionally. I just updated my system, and upon reboot two md RAID devices came up degraded because the partitions residing on /dev/sda were not added to the arrays.

lsblk -f
NAME                                 FSTYPE            FSVER    LABEL                  UUID                                   FSAVAIL FSUSE% MOUNTPOINTS
sda                                                                                                                                          
├─sda1                               swap              1                               d9d88c35-eb44-446b-b65c-d831a19daeac                  [SWAP]
├─sda2                                                                                                                                       
├─sda3                                                                                                                                       
└─sda4

Strangely enough the swap partition on sda1 did not cause any issue.

cat /proc/mdstat 
Personalities : [raid1] [raid6] [raid5] [raid4] 
md1 : active raid1 sda3[3](S) sdc3[4] sdb3[2]
      524286976 blocks super 1.1 [2/2] [UU]
      bitmap: 0/4 pages [0KB], 65536KB chunk

md5 : active raid5 sdc2[4] sdb2[5]
      1048573952 blocks super 1.1 level 5, 512k chunk, algorithm 2 [3/2] [_UU]
      bitmap: 1/4 pages [4KB], 65536KB chunk

md54 : active raid5 sdc4[3] sdb4[4]
      1797031936 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/2] [_UU]
      bitmap: 4/7 pages [16KB], 65536KB chunk

I was able to restore the arrays using `mdadm /dev/md5 --re-add /dev/sda2` (similar for md54) without any problems. Recovery only took a little while thanks to the write intent bitmap.
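
(Summarized as a sketch, the recovery amounted to re-adding each dropped member and watching the resync; device names as described above:)

mdadm /dev/md5  --re-add /dev/sda2
mdadm /dev/md54 --re-add /dev/sda4
watch cat /proc/mdstat   # the write-intent bitmap keeps the resync short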

After everything was in working order again, the missing information was present:

lsblk -f
NAME                                 FSTYPE            FSVER    LABEL                  UUID                                   FSAVAIL FSUSE% MOUNTPOINTS
sda                                                                                                                                          
├─sda1                               swap              1                               d9d88c35-eb44-446b-b65c-d831a19daeac                  [SWAP]
├─sda2                                                                                                                                       
│ └─md5                              LVM2_member       LVM2 001                        b2nOV2-SYbK-N5WT-gi1p-99rp-Fvf6-QmhflM                
│   ├─vg_urras_raid5-lv_home         ext4              1.0      home                   3e8c03fd-dff5-468d-894b-9357f35117bb     10.5G    97% /home
│   ├─vg_urras_raid5-lv_video        ext4              1.0      video                  25487924-8f60-4f8a-bad8-f846cce352c6    682.4M   100% /data/video
│   └─vg_urras_raid5-lv_music        ext4              1.0      music                  5039214b-c2f6-49c2-a60c-20fafff82742     32.5G    34% /data/music
├─sda3                                                                                                                                       
└─sda4                                                                                                                                       
  └─md54                             LVM2_member       LVM2 001                        JaewFP-8Dx9-JakO-En0g-onza-Dhve-Oq3gfZ                
    ├─vg_urras_raid5-lv_home         ext4              1.0      home                   3e8c03fd-dff5-468d-894b-9357f35117bb     10.5G    97% /home
    ├─vg_urras_raid5-lv_video        ext4              1.0      video                  25487924-8f60-4f8a-bad8-f846cce352c6    682.4M   100% /data/video
    ├─vg_urras_raid5-lv_opt          ext4              1.0      opt                    0831d40c-921d-4bb9-a989-81283a30af16      9.5G    51% /opt
    ├─vg_urras_raid5-lv_games        ext4              1.0      games                  f35bac33-11bd-4cc2-8bb9-a8820599368b    125.6G    77% /data/games

I poked through the logs with `journalctl -xb` but I haven't found anything so far. I am on util-linux-2.39.3-5.fc39.x86_64.

Comment 14 Aoife Moloney 2024-02-15 23:04:25 UTC
This bug appears to have been reported against 'rawhide' during the Fedora Linux 40 development cycle.
Changing version to 40.

Comment 15 Adam Williamson 2024-02-17 02:26:13 UTC
Well, Piergiorgio bisected it. We never actually confirmed that you and he were seeing the same issue, I guess.

Comment 16 Sandro 2024-02-17 09:02:25 UTC
True. However, before the update, the issue was reproducible with every boot as well as when booting from installation media, as I recorded in the companion troubleshooting thread on Discussion: https://discussion.fedoraproject.org/t/system-fails-to-boot-after-dnf-system-upgrade-due-to-missing-md-raid-devices/100218

Since the update I have booted twice with success and once with issues. The troublesome boot that resulted in comment 13 later turned out to be even more troublesome: sync processes would hang, as I discovered when running mock: https://github.com/rpm-software-management/mock/issues/1327#issuecomment-1946046409

For now, I would like to keep it open. I may run some more investigation if the need arises. Though bisecting is not my strong suit, and the affected system is my daily driver, which, for obvious reasons, I don't like to mess around with too much.

Comment 17 Adam Williamson 2024-02-19 18:51:58 UTC
It's also a bit tricky, as this is a proposed blocker for F40, and F40 has a newer util-linux that you're not testing. Can you maybe at least try booting an F40 live image a few times and see if it brings up the arrays correctly?

Comment 18 Sandro 2024-02-19 20:44:58 UTC
Right, now I see why you were pushing for rawhide (now F40) tests in comment 11. I can certainly try booting with F40 media.

I'll leave the needinfo flag set, so I don't forget to update the bug with my findings.

Comment 19 Geoffrey Marr 2024-02-20 05:45:12 UTC
Discussed during the 2024-02-19 blocker review meeting: [0]

The decision to delay classifying this as a blocker bug was made because the reporter who reopened this issue may not be seeing the same bug as the original reporter; the situation seems less clear-cut than it was for the original report. We will delay the decision while attempting to get more information on Sandro's case, including testing on Fedora 40.

[0] https://meetbot.fedoraproject.org/blocker-review_matrix_fedoraproject-org/2024-02-19/f40-blocker-review.2024-02-19-17.00.log.txt

Comment 20 Sandro 2024-02-22 17:36:59 UTC
I have tested booting from an F40 ISO with util-linux-2.40. No issues occurred. The UUIDs of all drives were listed correctly and the RAID devices assembled.

Since I was curious whether the issue I saw might be a different bug in util-linux, I also bisected and arrived at the same commit as in comment 3. With that commit reverted, I'd say the issue that made me reopen this bug was probably one of the drives having a glitch, being unresponsive during boot, or the like.

Comment 21 Adam Williamson 2024-02-22 21:02:34 UTC
Awesome, thanks for testing.

