Bug 1878970
| Summary: | kernel-5.8.8-200.fc32.x86_64 - IMSM RAID not recognized | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Clay Jordan <claywj> | ||||||
| Component: | kernel | Assignee: | Kernel Maintainer List <kernel-maint> | ||||||
| Status: | CLOSED NEXTRELEASE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | ||||||
| Severity: | high | Docs Contact: | |||||||
| Priority: | unspecified | ||||||||
| Version: | 32 | CC: | acaringi, airlied, bskeggs, hdegoede, ichavero, ipilcher, itamar, jarodwilson, jeremy, jglisse, john.j5live, jonathan, josef, kernel-maint, lgoncalv, linville, masami256, mchehab, Michael.Riss, mjg59, steved | ||||||
| Target Milestone: | --- | ||||||||
| Target Release: | --- | ||||||||
| Hardware: | x86_64 | ||||||||
| OS: | Linux | ||||||||
| Last Closed: | 2020-09-15 22:08:55 UTC | Type: | Bug | ||||||
I just saw something similar. In my case, only my /boot filesystem is on the IMSM RAID, so it was easy enough to boot by commenting out that line in /etc/fstab.
For whatever reason, it appears that the IMSM (Intel RAID) signature isn't recognized by the new kernel. Running 5.8.7-200.fc32.x86_64 (which works), I get the following:
[pilcher@ian ~]$ uname -r
5.8.7-200.fc32.x86_64
[pilcher@ian ~]$ sudo mdadm --detail-platform
Platform : Intel(R) Rapid Storage Technology
Version : 11.2.0.1527
RAID Levels : raid0 raid1 raid10 raid5
Chunk Sizes : 4k 8k 16k 32k 64k 128k
2TB volumes : supported
2TB disks : supported
Max Disks : 6
Max Volumes : 2 per array, 4 per controller
I/O Controller : /sys/devices/pci0000:00/0000:00:1f.2 (SATA)
Port3 : /dev/sdd (MSK5235H2PJ7TG)
Port1 : /dev/sdb (43P2YEVGS)
Port2 : /dev/sdc (MSK5235H29X18G)
Port5 : - non-disk device (HL-DT-ST BD-RE WH16NS60) -
Port0 : /dev/sda (S21CNSAG402179X)
Port4 : - no device attached -
[pilcher@ian ~]$ sudo mdadm --examine --verbose /dev/sdc
/dev/sdc:
Magic : Intel Raid ISM Cfg Sig.
Version : 1.1.00
Orig Family : d7e8a7e3
Family : d7e8a7e3
Generation : 005a1992
Attributes : All supported
UUID : 1ebd7712:2a74af1f:34298316:cb855b50
Checksum : d22bf36a correct
MPB Sectors : 1
Disks : 2
RAID Devices : 1
Disk00 Serial : MSK5235H29X18G
State : active
Id : 00000002
Usable Size : 1953519616 (931.51 GiB 1000.20 GB)
[Volume0]:
UUID : 3d7bd72f:82a8cbcc:2d217397:12f3ff95
RAID Level : 1
Members : 2
Slots : [UU]
Failed disk : none
This Slot : 0
Sector Size : 512
Array Size : 1953519616 (931.51 GiB 1000.20 GB)
Per Dev Size : 1953519880 (931.51 GiB 1000.20 GB)
Sector Offset : 0
Num Stripes : 7630936
Chunk Size : 64 KiB
Reserved : 0
Migrate State : idle
Map State : normal
Dirty State : clean
RWH Policy : off
Disk01 Serial : MSK5235H2PJ7TG
State : active
Id : 00000003
Usable Size : 1953519616 (931.51 GiB 1000.20 GB)
But running 5.8.8-200.fc32.x86_64 I see:
[pilcher@ian system]$ uname -r
5.8.8-200.fc32.x86_64
[pilcher@ian system]$ sudo mdadm --detail-platform
Platform : Intel(R) Rapid Storage Technology
Version : 11.2.0.1527
RAID Levels : raid0 raid1 raid10 raid5
Chunk Sizes : 4k 8k 16k 32k 64k 128k
2TB volumes : supported
2TB disks : supported
Max Disks : 6
Max Volumes : 2 per array, 4 per controller
I/O Controller : /sys/devices/pci0000:00/0000:00:1f.2 (SATA)
Port3 : /dev/sdd (MSK5235H2PJ7TG)
Port1 : /dev/sdb (43P2YEVGS)
Port2 : /dev/sdc (MSK5235H29X18G)
Port5 : - non-disk device (HL-DT-ST BD-RE WH16NS60) -
Port0 : /dev/sda (S21CNSAG402179X)
Port4 : - no device attached -
[pilcher@ian system]$ sudo mdadm --examine --verbose /dev/sdc
/dev/sdc:
MBR Magic : aa55
Partition[0] : 204800 sectors at 2048 (type 07)
Partition[1] : 2048000 sectors at 206848 (type 83)
Partition[2] : 61440000 sectors at 2254848 (type 07)
Partition[3] : 1889824768 sectors at 63694848 (type 05)
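The difference between the two `mdadm --examine` outputs above comes down to whether a `Magic` field carrying the IMSM signature is reported at all. A minimal sketch of checking for that automatically follows, assuming the output is available as text. The helper names `parse_examine` and `has_imsm_metadata` are hypothetical and not part of mdadm, and the sample strings are shortened copies of the outputs above.

```python
IMSM_MAGIC = "Intel Raid ISM Cfg Sig."


def parse_examine(text):
    """Collect 'Key : value' lines from mdadm --examine output into a dict.

    Only the first occurrence of each key is kept, so the container-level
    fields win over per-volume fields that reuse the same name (e.g. UUID).
    """
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" : ")
        if sep:
            fields.setdefault(key.strip(), value.strip())
    return fields


def has_imsm_metadata(text):
    """True if the output reports the IMSM signature in its Magic field."""
    return parse_examine(text).get("Magic", "").startswith(IMSM_MAGIC)


# Shortened samples taken from the two outputs in this report.
good = """/dev/sdc:
Magic : Intel Raid ISM Cfg Sig.
Version : 1.1.00
"""
bad = """/dev/sdc:
MBR Magic : aa55
Partition[0] : 204800 sectors at 2048 (type 07)
"""

print(has_imsm_metadata(good))  # True
print(has_imsm_metadata(bad))   # False
```

On the failing kernel only the `MBR Magic` key is present, so the check returns False even though the on-disk metadata is unchanged.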
Created attachment 1714949 [details]
Output of 'strace mdadm --examine --verbose /dev/sdc' on kernel 5.8.7 (works)
Created attachment 1714950 [details]
Output of 'strace mdadm --examine --detail /dev/sdc' on kernel 5.8.8 (doesn't work)
I've just attached the strace output of 'mdadm --examine --verbose /dev/sdc' on both kernel 5.8.7 (works) and 5.8.8 (doesn't work). The first significant difference I see is on line 185, where the error code returned by the BLKPG_DEL_PARTITION ioctl has changed from ENXIO to ENOMEM.
git bisect says:
692d0626557451c4b557397f20b7394b612d0289 is the first bad commit
commit 692d0626557451c4b557397f20b7394b612d0289
Author: Christoph Hellwig <hch>
Date: Tue Sep 1 11:59:41 2020 +0200
block: fix locking in bdev_del_partition
[ Upstream commit 08fc1ab6d748ab1a690fd483f41e2938984ce353 ]
We need to hold the whole device bd_mutex to protect against
other thread concurrently deleting out partition before we get
to it, and thus causing a use after free.
Fixes: cddae808aeb7 ("block: pass a hd_struct to delete_partition")
Reported-by: syzbot+6448f3c229bc52b82f69.com
Signed-off-by: Christoph Hellwig <hch>
Signed-off-by: Jens Axboe <axboe>
Signed-off-by: Sasha Levin <sashal>
block/partitions/core.c | 27 +++++++++++++--------------
1 file changed, 13 insertions(+), 14 deletions(-)
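For readers unfamiliar with how a bisect arrives at a "first bad commit", the search is a binary search over an ordered history. The sketch below illustrates the idea only and is not how git implements it. The commit names and the `is_bad` callback are hypothetical, and it assumes the first commit is good, the last is bad, and everything after the first bad commit is also bad.

```python
def first_bad(commits, is_bad):
    """Return the first commit for which is_bad() is true.

    Assumes commits are ordered oldest to newest, commits[0] is good,
    commits[-1] is bad, and every commit after the first bad one is bad.
    """
    lo, hi = 0, len(commits) - 1  # commits[lo] known good, commits[hi] known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # the first bad commit is at mid or earlier
        else:
            lo = mid  # the first bad commit is after mid
    return commits[hi]


# Hypothetical history in which the regression lands in "c5".
history = ["c0", "c1", "c2", "c3", "c4", "c5", "c6", "c7"]
print(first_bad(history, lambda c: history.index(c) >= 5))  # c5
```

In this report, the "is it bad" test at each step would be booting the candidate kernel and checking whether `mdadm --examine` still reports the IMSM metadata.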
And fixed by:
commit 88ce2a530cc9865a894454b2e40eba5957a60e1a
Author: Christoph Hellwig <hch>
Date: Tue Sep 8 16:15:06 2020 +0200
block: restore a specific error code in bdev_del_partition
mdadm relies on the fact that deleting an invalid partition returns
-ENXIO or -ENOTTY to detect if a block device is a partition or a
whole device.
Fixes: 08fc1ab6d748 ("block: fix locking in bdev_del_partition")
Reported-by: kernel test robot <rong.a.chen>
Signed-off-by: Christoph Hellwig <hch>
Signed-off-by: Jens Axboe <axboe>
diff --git a/block/partitions/core.c b/block/partitions/core.c
index 5b4869c08fb3..722406b841df 100644
--- a/block/partitions/core.c
+++ b/block/partitions/core.c
@@ -537,7 +537,7 @@ int bdev_del_partition(struct block_device *bdev, int partno)
 
 	bdevp = bdget_disk(bdev->bd_disk, partno);
 	if (!bdevp)
-		return -ENOMEM;
+		return -ENXIO;
 
 	mutex_lock(&bdevp->bd_mutex);
 	mutex_lock_nested(&bdev->bd_mutex, 1);
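Why a one-line errno change breaks RAID detection can be modelled in a few lines. The sketch below is not mdadm's code. It only encodes the rule stated in the commit message above: deleting an invalid partition is expected to fail with ENXIO or ENOTTY on a whole device. The function name `looks_like_partition` is hypothetical, and treating a successful ioctl as "whole device" is an assumption.

```python
import errno

# errno values that, per the commit message above, a caller such as mdadm
# treats as "this is a whole device, not a partition".
WHOLE_DEVICE_ERRNOS = {errno.ENXIO, errno.ENOTTY}


def looks_like_partition(ioctl_errno):
    """Classify a device from the result of a bogus partition delete.

    ioctl_errno is None if the ioctl succeeded (assumed here to mean
    "whole device"), otherwise the errno value it failed with.
    """
    if ioctl_errno is None:
        return False
    return ioctl_errno not in WHOLE_DEVICE_ERRNOS


print(looks_like_partition(errno.ENXIO))   # False: 5.8.7 and 5.8.9 behaviour
print(looks_like_partition(errno.ENOMEM))  # True: 5.8.8 behaviour
```

With ENOMEM, a whole disk such as /dev/sdc would be classified as a partition under this rule. That is consistent with the failing kernel no longer reporting the IMSM metadata on /dev/sdc.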
Fixed in kernel-5.8.9-200.fc32.x86_64. Update with 'dnf --enablerepo=updates-testing update kernel'.
Thank you Ian. That was amazingly fast. I had a chance to install the kernel this evening and can confirm it is fixed.
1. Please describe the problem:
Three systems, all using Intel RST RAID, were updated to kernel 5.8.8-200.fc32.x86_64 concurrently, along with a fourth that does not use Intel RST and whose root volume is located on a stand-alone NVMe drive. The three Intel RST systems all failed to boot, dropping to an emergency shell after a dracut-initqueue timeout. The fourth rebooted without issue.
The symptoms are all the same. In the emergency shell, journalctl shows the message "cannot activate LV's in VG Fedora_carrot while PV's appear on duplicate devices.", and "lvm vgscan" reports similar messages ("not using device /dev/sdb2 for PV ...", "PV ... prefers device /dev/sda2").
What also concerns me is that this update also updated grub-tools. I don't know if that is related, but it appears dracut is not assembling the md device as it should. I was able to reboot all three systems on the previous kernel, 5.8.7-200.fc32.x86_64.
Per dnf history list, this update installed:
Install kernel-5.8.8-200.fc32.x86_64 @updates
Install kernel-core-5.8.8-200.fc32.x86_64 @updates
Install kernel-modules-5.8.8-200.fc32.x86_64 @updates
Install kernel-modules-extra-5.8.8-200.fc32.x86_64 @updates
Upgrade grub2-common-1:2.04-22.fc32.noarch @updates
Upgraded grub2-common-1:2.04-21.fc32.noarch @@System
Upgrade grub2-efi-x64-1:2.04-22.fc32.x86_64 @updates
Upgraded grub2-efi-x64-1:2.04-21.fc32.x86_64 @@System
Upgrade grub2-tools-1:2.04-22.fc32.x86_64 @updates
Upgraded grub2-tools-1:2.04-21.fc32.x86_64 @@System
Upgrade grub2-tools-efi-1:2.04-22.fc32.x86_64 @updates
Upgraded grub2-tools-efi-1:2.04-21.fc32.x86_64 @@System
Upgrade grub2-tools-extra-1:2.04-22.fc32.x86_64 @updates
Upgraded grub2-tools-extra-1:2.04-21.fc32.x86_64 @@System
Upgrade grub2-tools-minimal-1:2.04-22.fc32.x86_64 @updates
Upgraded grub2-tools-minimal-1:2.04-21.fc32.x86_64 @@System
Upgrade kernel-headers-5.8.8-200.fc32.x86_64 @updates
Upgraded kernel-headers-5.8.6-200.fc32.x86_64 @@System
Removed kernel-5.6.6-300.fc32.x86_64 @@System
Removed kernel-core-5.6.6-300.fc32.x86_64 @@System
Removed kernel-modules-5.6.6-300.fc32.x86_64 @@System
Nothing in the grubby info distinguishes the bad entry from the good one. Example from one system:
index=0
kernel="/boot/vmlinuz-5.8.8-200.fc32.x86_64"
args="ro resume=/dev/mapper/fedora_popcorn-swap rd.lvm.lv=fedora_popcorn/root rd.lvm.lv=fedora_popcorn/swap nomodeset rhgb quiet"
root="/dev/mapper/fedora_popcorn-root"
initrd="/boot/initramfs-5.8.8-200.fc32.x86_64.img"
title="Fedora (5.8.8-200.fc32.x86_64) 32 (Thirty Two)"
id="cd37b505ac9e4ac4bf9dbad3bdd26142-5.8.8-200.fc32.x86_64"
index=1
kernel="/boot/vmlinuz-5.8.7-200.fc32.x86_64"
args="ro resume=/dev/mapper/fedora_popcorn-swap rd.lvm.lv=fedora_popcorn/root rd.lvm.lv=fedora_popcorn/swap nomodeset rhgb quiet"
root="/dev/mapper/fedora_popcorn-root"
initrd="/boot/initramfs-5.8.7-200.fc32.x86_64.img"
title="Fedora (5.8.7-200.fc32.x86_64) 32 (Thirty Two)"
id="cd37b505ac9e4ac4bf9dbad3bdd26142-5.8.7-200.fc32.x86_64"
From the same system as above, /etc/default/grub:
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="resume=/dev/mapper/fedora_popcorn-swap rd.lvm.lv=fedora_popcorn/root rd.lvm.lv=fedora_popcorn/swap nomodeset rhgb quiet"
GRUB_DISABLE_RECOVERY="true"
GRUB_ENABLE_BLSCFG=true
ls /etc/grub.d does not show any recent changes after the last three kernel updates, so it should not be the problem.
I ran lsinitrd on each initramfs followed by a vimdiff. The changes are all in kernel modules, 131 in total; the most relevant are:
-rw-r--r-- 1 root root 23060 May 29 13:35 usr/lib/modules/5.8.8-200.fc32.x86_64/kernel/drivers/md/raid1.ko.xz
-rw-r--r-- 1 root root 6552 May 29 13:35 usr/lib/modules/5.8.8-200.fc32.x86_64/kernel/drivers/pinctrl/intel/pinctrl-broxton.ko.xz
-rw-r--r-- 1 root root 6032 May 29 13:35 usr/lib/modules/5.8.8-200.fc32.x86_64/kernel/drivers/pinctrl/intel/pinctrl-cannonlake.ko.xz
-rw-r--r-- 1 root root 4572 May 29 13:35 usr/lib/modules/5.8.8-200.fc32.x86_64/kernel/drivers/pinctrl/intel/pinctrl-geminilake.ko.xz
-rw-r--r-- 1 root root 3768 May 29 13:35 usr/lib/modules/5.8.8-200.fc32.x86_64/kernel/drivers/pinctrl/intel/pinctrl-jasperlake.ko.xz
-rw-r--r-- 1 root root 3688 May 29 13:35 usr/lib/modules/5.8.8-200.fc32.x86_64/kernel/drivers/pinctrl/intel/pinctrl-lewisburg.ko.xz
-rw-r--r-- 1 root root 7556 May 29 13:35 usr/lib/modules/5.8.8-200.fc32.x86_64/kernel/drivers/pinctrl/intel/pinctrl-lynxpoint.ko.xz
-rw-r--r-- 1 root root 4796 May 29 13:35 usr/lib/modules/5.8.8-200.fc32.x86_64/kernel/drivers/pinctrl/intel/pinctrl-sunrisepoint.ko.xz
-rw-r--r-- 1 root root 4112 May 29 13:35 usr/lib/modules/5.8.8-200.fc32.x86_64/kernel/drivers/pinctrl/intel/pinctrl-tigerlake.ko.xz
-rw-r--r-- 1 root root 7648 May 29 13:35 usr/lib/modules/5.8.8-200.fc32.x86_64/kernel/drivers/pinctrl/pinctrl-amd.ko.xz
2. What is the Version-Release number of the kernel:
5.8.8-200.fc32.x86_64
3. Did it work previously in Fedora? If so, what kernel version did the issue *first* appear? Old kernels are available for download at
Appeared in: 5.8.8-200.fc32.x86_64
Not present: 5.8.7-200.fc32.x86_64
4. Can you reproduce this issue? If so, please provide the steps to reproduce the issue below:
Yes. Install the updates, allow grubby to regenerate the initramfs, and try to boot.
5. Does this problem occur with the latest Rawhide kernel? To install the Rawhide kernel, run ``sudo dnf install fedora-repos-rawhide`` followed by ``sudo dnf update --enablerepo=rawhide kernel``:
Have not tried due to lack of space.
6. Are you running any modules that are not shipped directly with Fedora's kernel?:
No.
7. Please attach the kernel logs. You can get the complete kernel log for a boot with ``journalctl --no-hostname -k > dmesg.txt``. If the issue occurred on a previous boot, use the journalctl ``-b`` flag.
Lost after reboot; I posted the relevant messages from the console.
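The initramfs comparison described in item 1 (lsinitrd on each image, then vimdiff) can also be scripted. The sketch below assumes `ls -l` style listing lines like the ones quoted in the report. The function name `changed_paths` is hypothetical, and the size in the older sample listing (23012) is invented for illustration. Comparing sizes only is a rough proxy: files that changed without changing size are missed.

```python
def changed_paths(listing_a, listing_b, version_a, version_b):
    """Return paths whose size differs or that appear in only one listing.

    Each listing is text made of `ls -l` style lines. The kernel version
    string is replaced by a placeholder so module paths line up.
    """
    def index(listing, version):
        entries = {}
        for line in listing.splitlines():
            parts = line.split()
            # Expect: mode, links, owner, group, size, month, day, time, path
            if len(parts) < 9 or not parts[4].isdigit():
                continue
            path = " ".join(parts[8:]).replace(version, "<kver>")
            entries[path] = int(parts[4])
        return entries

    a = index(listing_a, version_a)
    b = index(listing_b, version_b)
    return sorted(p for p in a.keys() | b.keys() if a.get(p) != b.get(p))


# The 23012 size for the older module is a made-up value for illustration.
old = """-rw-r--r-- 1 root root 23012 May 29 13:35 usr/lib/modules/5.8.7-200.fc32.x86_64/kernel/drivers/md/raid1.ko.xz
-rw-r--r-- 1 root root 100 May 29 13:35 etc/fstab
"""
new = """-rw-r--r-- 1 root root 23060 May 29 13:35 usr/lib/modules/5.8.8-200.fc32.x86_64/kernel/drivers/md/raid1.ko.xz
-rw-r--r-- 1 root root 100 May 29 13:35 etc/fstab
"""

print(changed_paths(old, new, "5.8.7-200.fc32.x86_64", "5.8.8-200.fc32.x86_64"))
# ['usr/lib/modules/<kver>/kernel/drivers/md/raid1.ko.xz']
```

As the rest of this bug shows, the regression was in the kernel's block layer rather than in any module that differs between the two images, so a diff like this narrows the search but cannot pinpoint the cause.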