== Comment: #0 - Viktor Mihajlovski <MIHAJLOV.com> - 2020-09-01 06:51:44 ==

---Problem Description---
Fedora kernel 5.8.4 fails to boot from DASD in a KVM guest.

Contact Information = Viktor Mihajlovski <mihajlov.com>

---uname output---
Linux localhost.localdomain 5.8.4-200.fc32.s390x #1 SMP Wed Aug 26 22:12:29 UTC 2020 s390x s390x s390x GNU/Linux

Machine Type = 3096-703

---System Hang---
After an update to kernel 5.8.4 the system fails to detect the filesystems and eventually ends up in the emergency shell. Looking at /dev I don't see any partitions but only the full disk /dev/vda. This also matches the dmesg output, so maybe the partition detection is broken for virtio-attached DASD.

---Debugger---
A debugger is not configured

---Steps to Reproduce---
1. Install a Fedora 32 KVM guest on a DASD, e.g. using virt-install
   $ virt-install --name s22 --memory 2048 --disk path=/dev/disk/by-path/ccw-0.0.a03f --location https://ftp-stud.hs-esslingen.de/pub/fedora-secondary/releases/32/Everything/s390x/os/
2. Accept all defaults in the text installers, let the installation finish and reboot
3. Login to the system and run dnf update
4. Reboot, this will lead to the hang. It is possible to recover by selecting the originally installed kernel.

Stack trace output: no
Oops output: no
System Dump Info: The system is not configured to capture a system dump.

I haven't tried with the latest upstream kernel, but as Fedora is pretty close to upstream I could imagine that this issue exists there as well.
This sounds familiar to me, perhaps there is a similar report on the enterprise side ...
Viktor, so you updated from a 5.7 kernel to 5.8.4 in the guest, right? And can it be reproduced with the network install, because that pulls the 5.8 kernel from updates during the installation?
I'm not aware of a similar problem on the enterprise side. Update to a newer RHEL-8.3 kernel (tested with kernel-4.18.0-234.el8) works fine in KVM (no boot problems). But I'm able to reproduce the problem after updating kernel to 5.8.6-201.fc32 in KVM, and also after re-creating the initrd in no-hostonly mode:

[root@localhost ~]# dracut --no-hostonly /boot/initramfs-5.8.6-201.fc32.s390x.img 5.8.6-201.fc32.s390x -f
dracut: Disabling early microcode, because kernel does not support it. CONFIG_MICROCODE_[AMD|INTEL]!=y
[root@localhost ~]# zipl
..
[root@localhost ~]# reboot
...
[ OK ] Reached target Basic System.
[ 6.411734] virtio_blk virtio1: [vda] 1803060 4096-byte logical blocks (7.39 GB/6.88 GiB)
[ 6.411918] vda: detected capacity change from 0 to 7385333760
[ 7.204466] alg: No test for crc32be (crc32be-vx)
[ 7.653375] virtio_net virtio2 enc1: renamed from eth0
[ 200.989524] dracut-initqueue[403]: Warning: dracut-initqueue timeout - starting timeout scripts

But all this testing was done on a RHEL-8 system. We can also try Fedora-Rawhide, which currently uses kernel-5.8.0-1.fc33 even for the installation - I will try tomorrow.
------- Comment From MIHAJLOV.com 2020-09-09 04:54 EDT-------
(In reply to comment #11)
> Viktor, so you have updated from 5.7 kernel to 5.8.4 in the guest, right?
> And can be reproduced with the network install, because it gets 5.8 kernel
> from updates during the installation?

I initially did a network install from the Fedora mirror https://ftp-stud.hs-esslingen.de/pub/fedora-secondary/releases/32/Everything/s390x/os. After the installation I had kernel 5.6.6 running, so the kernel doesn't seem to be updated during the installation. The trouble started when I ran dnf update.
In theory it could be related to https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=662155e2898dd1c3915e420378bb6c0826548e70 (which appears in 5.7 first). I think we will need IBM's s390 kernel people to take a look.
------- Comment From cborntra.com 2020-09-09 06:57 EDT-------
An alternative might be the rework from Christoph Hellwig regarding the removal of ioctl_by_bdev, which triggered a change in the dasd driver. Stefan Haberland did test that, though. Another thing: it seems that the vanilla upstream 5.8 kernel does not have an issue with partition detection on dasd via virtio-blk.
Now I wonder if it could be caused by a different kernel config and/or by a missing module in initrd ...
Tried installation of compose https://kojipkgs.fedoraproject.org/compose/rawhide/Fedora-Rawhide-20200902.n.1/compose/Server/s390x/os/ with kernel-5.8.0-1.fc33.s390x. The installation was successful, vda1 and vda2 were created, but the installed system didn't boot, failing with the error reported in this bug. When I restarted the installation using the same disk, only the /dev/vda device was created. Parted can see both partitions:

[anaconda root@fedora ~]# ls /dev/vda*
/dev/vda
[anaconda root@fedora ~]#
[anaconda root@fedora ~]# parted /dev/vda
GNU Parted 3.3
Using /dev/vda
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) print
Model: Virtio Block Device (virtblk)
Disk /dev/vda: 7385MB
Sector size (logical/physical): 4096B/4096B
Partition Table: dasd
Disk Flags:

Number  Start   End     Size    File system  Flags
 1      98.3kB  1074MB  1074MB  xfs
 2      1074MB  7385MB  6311MB               lvm

(parted)
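For anyone who wants to check the same symptom on their own guest, here is a minimal sketch that lists the partitions the kernel has registered for a disk, so the result can be compared with what parted prints. The function name kernel_partitions and its optional second argument are my own, not part of any tool; it assumes the usual four-column /proc/partitions layout (major, minor, #blocks, name).

```shell
#!/bin/sh
# Hypothetical helper (not from any package): print the partitions the
# kernel knows about for one disk, one name per line.
# $1 = disk name without /dev/, e.g. vda
# $2 = table to read, defaults to /proc/partitions
kernel_partitions() {
    disk=$1
    table=${2:-/proc/partitions}
    # Keep names that are the disk name followed only by digits (vda1, vda2),
    # which skips the header line, the whole disk, and other disks.
    awk -v d="$disk" '
        index($4, d) == 1 && substr($4, length(d) + 1) ~ /^[0-9]+$/ { print $4 }
    ' "$table"
}
```

On an affected system, `kernel_partitions vda` would be expected to print nothing even though parted shows two partitions in the table.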
Also reproduced with 5.9.0-0.rc3.20200902git9c7d619be5a0.1.fc34.s390x
------- Comment From cborntra.com 2020-10-07 04:33 EDT-------
So it seems to be dependent on the kernel config. With the Fedora 32 config I could reproduce this with the upstream kernels. 5.7 is fine, 5.8 is broken. So I could bisect this to

26d7e28e38206b1b3207af1409eee2269ab36f82 is the first bad commit
commit 26d7e28e38206b1b3207af1409eee2269ab36f82
Author: Stefan Haberland <sth.com>
Date:   Tue May 19 16:22:59 2020 +0200

    s390/dasd: remove ioctl_by_bdev calls

    The IBM partition parser requires device type specific information
    only available to the DASD driver to correctly register partitions.
    The current approach of using ioctl_by_bdev with a fake user space
    pointer is discouraged.

    Fix this by replacing IOCTL calls with direct in-kernel function calls.

    Suggested-by: Christoph Hellwig <hch>
    Signed-off-by: Stefan Haberland <sth.com>
    Reviewed-by: Jan Hoeppner <hoeppner.com>
    Reviewed-by: Peter Oberparleiter <oberpar.com>
    Reviewed-by: Christoph Hellwig <hch>
    Signed-off-by: Jens Axboe <axboe>

 MAINTAINERS                     |  1 +
 block/partitions/ibm.c          | 24 ++++++++++++++++++------
 drivers/s390/block/dasd_ioctl.c | 34 ++++++++++++++++++++++++++++++++++
 include/linux/dasd_mod.h        |  9 +++++++++
 4 files changed, 62 insertions(+), 6 deletions(-)
 create mode 100644 include/linux/dasd_mod.h

With the defconfig the problem is not present. Will try to identify which config option is problematic together with this patch.
------- Comment From cborntra.com 2020-10-07 05:22 EDT-------
So the problem happens when CONFIG_DASD=m, and it does not happen with CONFIG_DASD=y. This is sad, since we only need virtio-blk and the IBM partition code.
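A quick way to see which of the two cases a given kernel falls into is to look at its config file. The following is only a sketch: the function name dasd_config_state is made up for illustration, and it assumes the distribution installs the kernel config as a plain text file (Fedora puts it at /boot/config-<version>).

```shell
#!/bin/sh
# Hypothetical helper: report how CONFIG_DASD is set in a kernel config file.
# Prints "builtin" (=y), "module" (=m), "unset", or "unreadable".
dasd_config_state() {
    config_file=$1
    if [ ! -r "$config_file" ]; then
        echo "unreadable"
        return 1
    fi
    if grep -q '^CONFIG_DASD=y$' "$config_file"; then
        echo "builtin"
    elif grep -q '^CONFIG_DASD=m$' "$config_file"; then
        echo "module"
    else
        # Covers both a missing option and "# CONFIG_DASD is not set"
        echo "unset"
    fi
}
```

For the running kernel this would be called as `dasd_config_state "/boot/config-$(uname -r)"`; "module" corresponds to the failing case described above.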
------- Comment From cborntra.com 2020-10-07 11:39 EDT-------
Fix is queued in the linux-block tree for 5.9 and 5.8 stable.
https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/commit/?h=block-5.9&id=7370997d48520ad923e8eb4deb59ebf290396202
------- Comment From cborntra.com 2020-10-09 03:38 EDT-------
Patch merged upstream
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7370997d48520ad923e8eb4deb59ebf290396202
and queued for 5.8 stable.
Christian, thanks for your work on this issue.
------- Comment From cborntra.com 2020-10-19 06:41 EDT-------
Dan, any chance to get this into F33 before the release as well as into F32 updates soon?
If my git queries are correct, then the commit in question is first included (for stable) in 5.8.15, and kernel-5.8.15-301.fc33 is currently the latest in the F-33 nightly composes and thus should be in the GA too. For F-32 the 5.8.15 update has already been pushed out as stable. I think we are looking good.
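To check an installed 5.8 kernel against that cut-off, something like the sketch below could be used. It takes the 5.8.15 boundary from the comment above on trust, only handles the 5.8.x stable series (anything else, including the 5.9 release candidates, is reported as "unknown"), and the function name stable_58_has_fix is invented for this example. Note that "affected" only matters when CONFIG_DASD=m.

```shell
#!/bin/sh
# Hypothetical helper: classify a 5.8.x kernel release string against the
# 5.8.15 stable update said to carry the fix.
# Prints "fixed", "affected", or "unknown" (not a 5.8.x release).
stable_58_has_fix() {
    release=$1
    version=${release%%-*}    # drop the package release, e.g. -301.fc33.s390x
    case $version in
        5.8.*) patch=${version#5.8.} ;;
        *) echo "unknown"; return 0 ;;
    esac
    case $patch in
        ''|*[!0-9]*) echo "unknown"; return 0 ;;    # not a plain number
    esac
    if [ "$patch" -ge 15 ]; then
        echo "fixed"
    else
        echo "affected"
    fi
}
```

Typical use would be `stable_58_has_fix "$(uname -r)"`.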