Bug 2103766 - Display on kernels since kernel-5.19.0-0.rc4.20220701gita175eca0f3d7.36.fc37 hangs during boot on UEFI VMs
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: openqa
Depends On:
Blocks:
Reported: 2022-07-04 19:45 UTC by Adam Williamson
Modified: 2022-07-07 17:42 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2022-07-07 17:42:11 UTC
Type: Bug
Embargoed:


Attachments
journalctl output from failed boot kernel-5.19.0-0.rc5.39.fc37 (335.33 KB, text/plain)
2022-07-06 17:44 UTC, Steven Usdansky

Description Adam Williamson 2022-07-04 19:45:20 UTC
Since kernel-5.19.0-0.rc4.20220701gita175eca0f3d7.36.fc37 landed in Rawhide, all openQA UEFI tests fail as display output seems to hang during boot.

I can confirm this on a local VM, too. I installed from the 20220701.n.0 Workstation live image (the last one before the bug appeared), then installed both kernel-5.19.0-0.rc4.20220701gita175eca0f3d7.36.fc37 and kernel-5.19.0-0.rc5.39.fc37 and tried booting each of them multiple times. In all cases, they appear to hang during boot. The last working kernel is kernel-5.19.0-0.rc4.20220630gitd9b2ba67917c.35.fc37.

This happens with virtio, qxl and vga graphics. It happens whether the VM's disk drive is emulated SATA or virtio.

The system *does* boot, though. If I pass console=ttyS0, I get a login prompt on the serial console and can log in. However, the display stays stuck showing early boot messages.

The journals from failed boots have kernel NULL pointer dereferences in them. These seem to vary a bit depending on the graphics adapter chosen. Here's a sample with virtio graphics:

Jul 04 12:38:26 fedora kernel: BUG: kernel NULL pointer dereference, address: 0000000000000008
Jul 04 12:38:26 fedora kernel: #PF: supervisor read access in kernel mode
Jul 04 12:38:26 fedora kernel: #PF: error_code(0x0000) - not-present page
Jul 04 12:38:26 fedora kernel: PGD 0 P4D 0 
Jul 04 12:38:26 fedora kernel: Oops: 0000 [#1] PREEMPT SMP PTI
Jul 04 12:38:26 fedora kernel: CPU: 0 PID: 376 Comm: systemd-udevd Not tainted 5.19.0-0.rc5.39.fc37.x86_64 #1
Jul 04 12:38:26 fedora kernel: Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
Jul 04 12:38:26 fedora kernel: RIP: 0010:kernfs_find_and_get_ns+0x11/0x70
Jul 04 12:38:26 fedora kernel: Code: 08 48 83 40 40 01 49 8b 46 08 48 83 40 58 01 31 c0 eb d1 66 0f 1f 44 00 00 0f 1f 44 00 00 41 55 49 89 d5 41 54 49 89 f4 55 53 <48> 8b 47 08 48 89 fb 48 85 c0 48 0f 44 c7 48 8b 68 50 48 83 c5 60
Jul 04 12:38:26 fedora kernel: RSP: 0018:ffffab33405cba58 EFLAGS: 00010246
Jul 04 12:38:26 fedora kernel: RAX: 0000000000000000 RBX: ffffffff83323580 RCX: ffffab33405cba38
Jul 04 12:38:26 fedora kernel: RDX: 0000000000000000 RSI: ffffffff833236c8 RDI: 0000000000000000
Jul 04 12:38:26 fedora kernel: RBP: 0000000000000000 R08: 0000000000000040 R09: 00000000c0d4c000
Jul 04 12:38:26 fedora kernel: R10: 0000000000000000 R11: ffff8c9306a0c29c R12: ffffffff833236c8
Jul 04 12:38:26 fedora kernel: R13: 0000000000000000 R14: ffff8c9306f92cc0 R15: 0000000000000000
Jul 04 12:38:26 fedora kernel: FS:  00007f1edfa17b40(0000) GS:ffff8c9379e00000(0000) knlGS:0000000000000000
Jul 04 12:38:26 fedora kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 04 12:38:26 fedora kernel: CR2: 0000000000000008 CR3: 0000000106d80001 CR4: 0000000000370ef0
Jul 04 12:38:26 fedora kernel: Call Trace:
Jul 04 12:38:26 fedora kernel:  <TASK>
Jul 04 12:38:26 fedora kernel:  sysfs_unmerge_group+0x18/0x60
Jul 04 12:38:26 fedora kernel:  dpm_sysfs_remove+0x20/0x60
Jul 04 12:38:26 fedora kernel:  device_del+0xb2/0x3f0
Jul 04 12:38:26 fedora kernel:  platform_device_del.part.0+0x13/0x70
Jul 04 12:38:26 fedora kernel:  platform_device_unregister+0x1c/0x30
Jul 04 12:38:26 fedora kernel:  sysfb_disable+0x2b/0x60
Jul 04 12:38:26 fedora kernel:  remove_conflicting_framebuffers+0x1b/0xc0
Jul 04 12:38:26 fedora kernel:  remove_conflicting_pci_framebuffers+0xce/0x120
Jul 04 12:38:26 fedora kernel:  drm_aperture_remove_conflicting_pci_framebuffers+0x57/0x80
Jul 04 12:38:26 fedora kernel:  virtio_gpu_probe.cold+0x93/0x9e [virtio_gpu]
Jul 04 12:38:26 fedora kernel:  ? kernfs_create_link+0x5d/0xa0
Jul 04 12:38:26 fedora kernel:  ? vp_modern_set_features+0x3d/0x50
Jul 04 12:38:26 fedora kernel:  virtio_dev_probe+0x1af/0x260
Jul 04 12:38:26 fedora kernel:  really_probe+0x1bf/0x390
Jul 04 12:38:26 fedora kernel:  __driver_probe_device+0xfc/0x170
Jul 04 12:38:26 fedora kernel:  driver_probe_device+0x1f/0x90
Jul 04 12:38:26 fedora kernel:  __driver_attach+0xbb/0x1b0
Jul 04 12:38:26 fedora kernel:  ? __device_attach_driver+0xe0/0xe0
Jul 04 12:38:26 fedora kernel:  bus_for_each_dev+0x62/0x90
Jul 04 12:38:26 fedora kernel:  bus_add_driver+0x159/0x200
Jul 04 12:38:26 fedora kernel:  driver_register+0x89/0xe0
Jul 04 12:38:26 fedora kernel:  ? 0xffffffffc0357000
Jul 04 12:38:26 fedora kernel:  do_one_initcall+0x44/0x200
Jul 04 12:38:26 fedora kernel:  ? do_init_module+0x22/0x1f0
Jul 04 12:38:26 fedora kernel:  ? kmem_cache_alloc_trace+0x16c/0x2b0
Jul 04 12:38:26 fedora kernel:  do_init_module+0x4a/0x1f0
Jul 04 12:38:26 fedora kernel:  __do_sys_init_module+0x127/0x180
Jul 04 12:38:26 fedora kernel:  ? folio_add_lru+0x8b/0x100
Jul 04 12:38:26 fedora kernel:  do_syscall_64+0x5b/0x80
Jul 04 12:38:26 fedora kernel:  ? handle_mm_fault+0xae/0x280
Jul 04 12:38:26 fedora kernel:  ? do_user_addr_fault+0x1e2/0x670
Jul 04 12:38:26 fedora kernel:  ? exc_page_fault+0x70/0x170
Jul 04 12:38:26 fedora kernel:  entry_SYSCALL_64_after_hwframe+0x46/0xb0
Jul 04 12:38:26 fedora kernel: RIP: 0033:0x7f1ee05cfa0e
Jul 04 12:38:26 fedora kernel: Code: 48 8b 0d 15 54 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 af 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d e2 53 0c 00 f7 d8 64 89 01 48
Jul 04 12:38:26 fedora kernel: RSP: 002b:00007fffed987328 EFLAGS: 00000246 ORIG_RAX: 00000000000000af
Jul 04 12:38:26 fedora kernel: RAX: ffffffffffffffda RBX: 00005654c48c9670 RCX: 00007f1ee05cfa0e
Jul 04 12:38:26 fedora kernel: RDX: 00007f1ee070e43c RSI: 0000000000033d06 RDI: 00005654c50dd970
Jul 04 12:38:26 fedora kernel: RBP: 00007f1ee070e43c R08: 00005654c48c4a80 R09: 00007fffed984ed6
Jul 04 12:38:26 fedora kernel: R10: 0000000000000005 R11: 0000000000000246 R12: 0000000000020000
Jul 04 12:38:26 fedora kernel: R13: 00005654c48c3630 R14: 0000000000000000 R15: 00005654c48cf030
Jul 04 12:38:26 fedora kernel:  </TASK>
Jul 04 12:38:26 fedora kernel: Modules linked in: net_failover virtio_gpu(+) failover virtio_dma_buf ip6_tables ip_tables ipmi_devintf ipmi_msghandler fuse qemu_fw_cfg
Jul 04 12:38:26 fedora kernel: CR2: 0000000000000008
Jul 04 12:38:26 fedora kernel: ---[ end trace 0000000000000000 ]---
Jul 04 12:38:26 fedora kernel: RIP: 0010:kernfs_find_and_get_ns+0x11/0x70
Jul 04 12:38:26 fedora kernel: Code: 08 48 83 40 40 01 49 8b 46 08 48 83 40 58 01 31 c0 eb d1 66 0f 1f 44 00 00 0f 1f 44 00 00 41 55 49 89 d5 41 54 49 89 f4 55 53 <48> 8b 47 08 48 89 fb 48 85 c0 48 0f 44 c7 48 8b 68 50 48 83 c5 60
Jul 04 12:38:26 fedora kernel: RSP: 0018:ffffab33405cba58 EFLAGS: 00010246
Jul 04 12:38:26 fedora kernel: RAX: 0000000000000000 RBX: ffffffff83323580 RCX: ffffab33405cba38
Jul 04 12:38:26 fedora kernel: RDX: 0000000000000000 RSI: ffffffff833236c8 RDI: 0000000000000000
Jul 04 12:38:26 fedora kernel: RBP: 0000000000000000 R08: 0000000000000040 R09: 00000000c0d4c000
Jul 04 12:38:26 fedora kernel: R10: 0000000000000000 R11: ffff8c9306a0c29c R12: ffffffff833236c8
Jul 04 12:38:26 fedora kernel: R13: 0000000000000000 R14: ffff8c9306f92cc0 R15: 0000000000000000
Jul 04 12:38:26 fedora kernel: FS:  00007f1edfa17b40(0000) GS:ffff8c9379e00000(0000) knlGS:0000000000000000
Jul 04 12:38:26 fedora kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 04 12:38:26 fedora kernel: CR2: 0000000000000008 CR3: 0000000106d80001 CR4: 0000000000370ef0
Jul 04 12:38:26 fedora kernel: virtio_blk virtio6: 2/0/0 default/read/poll queues
Jul 04 12:38:26 fedora systemd-udevd[357]: virtio0: Worker [376] terminated by signal 9 (KILL).
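
For triage, the most useful parts of an oops like the one above are the faulting function on the kernel RIP line and the reliable frames between <TASK> and </TASK>. The following is a minimal Python sketch, not part of the original report, that pulls both out of saved journal text. The function name summarize_oops and the regular expressions are illustrative assumptions that match the x86_64 log format shown here; they may need adjusting for other kernels or architectures.

```python
import re

# Kernel-mode RIP line, e.g. "kernel: RIP: 0010:kernfs_find_and_get_ns+0x11/0x70".
# Assumption: 0010 is the kernel code segment, as in the x86_64 trace above.
FAULT_RE = re.compile(r"kernel: RIP: 0010:([A-Za-z_][\w.]*)\+0x")

# Call-trace frame, e.g. "kernel:  sysfb_disable+0x2b/0x60" or
# "kernel:  ? kernfs_create_link+0x5d/0xa0" (the "? " marks an unreliable frame).
FRAME_RE = re.compile(
    r"kernel:\s+(\? )?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def summarize_oops(journal_text):
    """Return (faulting_function, reliable_frames) for the first oops found.

    faulting_function is None and reliable_frames is empty when the text
    contains no kernel RIP line or call trace.
    """
    fault = None
    frames = []
    in_trace = False
    for line in journal_text.splitlines():
        if fault is None:
            match = FAULT_RE.search(line)
            if match:
                fault = match.group(1)
        if "<TASK>" in line:
            in_trace = True
            continue
        if "</TASK>" in line:
            break  # only the first trace is summarized
        if in_trace:
            match = FRAME_RE.search(line)
            # Skip "? " frames: they are stale stack contents, not the call chain.
            if match and not match.group(1):
                frames.append(match.group(2))
    return fault, frames
```

Fed the virtio trace above, this sketch reports kernfs_find_and_get_ns as the faulting function, and the frame list starts with sysfs_unmerge_group, dpm_sysfs_remove, device_del, platform_device_del.part.0, platform_device_unregister and sysfb_disable. Input in this format can be saved with something like `journalctl -k -b -1 > boot.log` after rebooting into a working kernel.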

I haven't yet tested if this affects a UEFI install on a bare metal system, but I'll try that next.

Comment 1 Adam Williamson 2022-07-04 21:46:53 UTC
Javier says this is resolved by https://cgit.freedesktop.org/drm/drm-misc/commit/?h=drm-misc-fixes&id=bf43e4521ff3223a613f3a496991a22a4d78e04b , and he'll open an MR for a downstream backport tomorrow.

Comment 2 Matt Fagnani 2022-07-04 22:36:05 UTC
I reported what I think might be the same problem, seen when booting 5.19.0-0.rc4.20220701gita175eca0f3d7.36.fc37 on bare metal when amdgpu started and in VMs when virtio-vga started, but only with EFI enabled, at https://bugzilla.redhat.com/show_bug.cgi?id=2103512 . Thanks for figuring things out further.

Comment 3 Steven Usdansky 2022-07-06 17:44:20 UTC
Created attachment 1894987
journalctl output from failed boot kernel-5.19.0-0.rc5.39.fc37

Comment 4 Adam Williamson 2022-07-06 18:11:11 UTC
Steven: we believe the cause of this is already known, and it should be fixed in kernel-5.19.0-0.rc5.20220706gite35e5b6f695d.42.fc37.

Comment 5 Steven Usdansky 2022-07-06 22:03:54 UTC
kernel-5.19.0-0.rc5.20220706gite35e5b6f695d.42.fc37 does indeed fix the problem for me.

Comment 6 Adam Williamson 2022-07-07 17:42:11 UTC
Yup, openQA confirms the fix as well.

