1. Please describe the problem:

The following issue happened when booting up a machine.

dracut-initqueue.…ice - dracut initqueue hook...
[ 16.855926] ccp 0000:a1:00.1: no command queues available
[ 16.862367] ccp 0000:a1:00.1: psp enabled
[ 16.867973] Microchip SmartPQI Driver (v2.1.14-035)
[ 16.874127] smartpqi 0000:43:00.0: Microchip Smart Family Controller found
[ 16.883871] BUG: kernel NULL pointer dereference, address: 0000000000000008
[ 16.890887] #PF: supervisor read access in kernel mode
[ 16.896067] #PF: error_code(0x0000) - not-present page
[ 16.896070] PGD 0 P4D 0
[ 16.896075] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 16.896079] CPU: 0 PID: 1616 Comm: kworker/0:3 Not tainted 023656 #1
[ 16.896084] Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/08/2021
[ 16.896087] Workqueue: events work_for_cpu_fn
[ 16.896096] RIP: 0010:kernfs_find_and_get_ns+0x11/0x70
[ 16.896104] Code: 08 48 83 40 40 01 49 8b 46 08 48 83 40 58 01 31 c0 eb d1 66 0f 1f 44 00 00 0f 1f 44 00 00 41 55 49 89 d5 41 54 49 89 f4 55 53 <48> 8b 47 08 48 89 fb 48 85 c0 48 0f 44 c7 48 8b 68 50 48 83 c5 60
[ 16.896106] RSP: 0018:ffffabc291e0fcc0 EFLAGS: 00010246
[ 16.896110] RAX: 0000000000000000 RBX: ffffffff9b323680 RCX: ffffabc291e0fca0
[ 16.896113] RDX: 0000000000000000 RSI: ffffffff9b3237c8 RDI: 0000000000000000
[ 16.896115] RBP: 0000000000000000 R08: 0000000000000040 R09: 00000000e5000000
[ 16.896117] R10: 0000000000000000 R11: ffff9966bb61729c R12: ffffffff9b3237c8
[ 16.896119] R13: 0000000000000000 R14: ffff996686f83bc0 R15: 0000000000000000
[ 16.896121] FS: 0000000000000000(0000) GS:ffff99857d000000(0000) knlGS:0000000000000000
[ 16.896123] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 16.896125] CR2: 0000000000000008 CR3: 00000001428a2000 CR4: 0000000000350ef0
[ 16.896128] Call Trace:
[ 16.896131] <TASK>
[ 16.896135] sysfs_unmerge_group+0x18/0x60
[ 16.896141] dpm_sysfs_remove+0x20/0x60
[ 16.896148] device_del+0xb2/0x3f0
[ 16.896156] platform_device_del.part.0+0x13/0x70
[ 16.896162] platform_device_unregister+0x1c/0x30
[ 16.896165] sysfb_disable+0x2b/0x60
[ 16.896171] remove_conflicting_framebuffers+0x1b/0xc0
[ 16.896178] remove_conflicting_pci_framebuffers+0xce/0x120
[ 16.896183] drm_aperture_remove_conflicting_pci_framebuffers+0x57/0x80
[ 16.896190] mgag200_pci_probe+0x26/0x5b0 [mgag200]
[ 16.896203] local_pci_probe+0x41/0x80
[ 16.896210] work_for_cpu_fn+0x16/0x20
[ 16.896214] process_one_work+0x1c7/0x380
[ 16.896219] worker_thread+0x1ab/0x380
[ 16.896224] ? _raw_spin_lock_irqsave+0x23/0x50
[ 16.896232] ? process_one_work+0x380/0x380
[ 16.896235] kthread+0xe9/0x110
[ 16.896241] ? kthread_complete_and_exit+0x20/0x20
[ 16.896244] ret_from_fork+0x22/0x30
[ 16.896254] </TASK>
[ 16.896255] Modules linked in: smartpqi(+) ghash_clmulni_intel ccp usb_storage mgag200(+) hpwdt scsi_transport_sas sp5100_tco wmi ipmi_devintf ipmi_msghandler
[ 16.896272] CR2: 0000000000000008
[ 16.896276] ---[ end trace 0000000000000000 ]---
[ 17.003538] RIP: 0010:kernfs_find_and_get_ns+0x11/0x70
[ 17.233698] Code: 08 48 83 40 40 01 49 8b 46 08 48 83 40 58 01 31 c0 eb d1 66 0f 1f 44 00 00 0f 1f 44 00 00 41 55 49 89 d5 41 54 49 89 f4 55 53 <48> 8b 47 08 48 89 fb 48 85 c0 48 0f 44 c7 48 8b 68 50 48 83 c5 60
[ 17.233702] RSP: 0018:ffffabc291e0fcc0 EFLAGS: 00010246
[ 17.233707] RAX: 0000000000000000 RBX: ffffffff9b323680 RCX: ffffabc291e0fca0
[ 17.233709] RDX: 0000000000000000 RSI: ffffffff9b3237c8 RDI: 0000000000000000
[ 17.233711] RBP: 0000000000000000 R08: 0000000000000040 R09: 00000000e5000000
[ 17.233715] R10: 0000000000000000 R11: ffff9966bb61729c R12: ffffffff9b3237c8
[ 17.286590] R13: 0000000000000000 R14: ffff996686f83bc0 R15: 0000000000000000
[ 17.286593] FS: 0000000000000000(0000) GS:ffff99857d000000(0000) knlGS:0000000000000000
[ 17.286595] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 17.286597] CR2: 0000000000000008 CR3: 00000001428a2000 CR4: 0000000000350ef0
Starting plymouth-start.se… - Show Plymouth Boot Screen...

2. What is the Version-Release number of the kernel:

kernel-5.19.0-0.rc4.a175eca0f3d7.36.test.fc37.x86_64

3. More logs:

https://datawarehouse.cki-project.org/kcidb/tests/4180584
According to Javier Martinez Canillas, one patch is missing from v5.19-rc4: commit fb84efa28a48 ("drm/aperture: Run fbdev removal before internal helpers"). This commit is in drm-fixes and should be in rc5.

There is also a temporary MR to test this on rawhide: https://gitlab.com/cki-project/kernel-ark/-/merge_requests/1904

@bgoncalv, is it possible to check this MR on the "problematic" hardware to confirm that the fix works?

Thanks,
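For anyone who wants to verify locally whether a given tag already contains commit fb84efa28a48, one option is `git merge-base --is-ancestor`. The following is only a minimal sketch, assuming a local clone of the kernel tree with the relevant tags fetched; the helper names `ancestry_cmd` and `commit_in_ref` and the `repo` path are hypothetical and not part of any existing tooling.

```python
import subprocess


def ancestry_cmd(commit, ref, repo="."):
    """Build the git command that tests whether `commit` is an ancestor of `ref`."""
    return ["git", "-C", repo, "merge-base", "--is-ancestor", commit, ref]


def commit_in_ref(commit, ref, repo="."):
    """Return True if `ref` contains `commit`, False if it does not.

    `git merge-base --is-ancestor` exits 0 when the commit is an ancestor,
    1 when it is not, and with another status on errors (unknown ref, no repo).
    """
    result = subprocess.run(
        ancestry_cmd(commit, ref, repo), capture_output=True, text=True
    )
    if result.returncode in (0, 1):
        return result.returncode == 0
    raise RuntimeError(result.stderr.strip() or "git merge-base failed")


# Example call (assumes ./linux is a clone that has the v5.19-rc tags):
#   commit_in_ref("fb84efa28a48", "v5.19-rc4", repo="linux")
#   commit_in_ref("fb84efa28a48", "v5.19-rc5", repo="linux")
```

Errors are raised rather than reported as "not contained", so a missing tag is not mistaken for a missing fix.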
Jocelyn,

I was not able to build a kernel using that MR, but I did test with kernel ark 5.19.0-0.rc5.e8a4e1c1bb69.44.test.fc37 on the same machine and I didn't hit the problem.

[ 16.741249] Microchip SmartPQI Driver (v2.1.14-035)
Starting plymouth-start.se… - Show Plymouth Boot Screen...
[ 16.746405] smartpqi 0000:43:00.0: Microchip Smart Family Controller found
[ 16.755822] mgag200 0000:61:00.1: vgaarb: deactivate vga console
[ OK ] Started plymouth-start.ser…e - Show Plymouth Boot Screen.
[ 16.768060] usbcore: registered new interface driver uas
[ 16.768247] [drm] Initialized mgag200 1.0.0 20110418 for 0000:61:00.1 on minor 0
[ 16.785333] fbcon: mgag200drmfb (fb0) is primary device
[ 16.785337] fbcon: Deferring console take-over
[ OK ] Started systemd-ask-passwo…uests to Plymouth Directory Watch.
[ 16.797196] mgag200 0000:61:00.1: [drm] fb0: mgag200drmfb frame buffer device
Hi Bruno,

Thanks a lot for testing and for confirming that the fix works and is included in 5.19-rc5.