Created attachment 1897658 [details]
journalctl --no-hostname -k > dmesg_5.18.11.txt

Hard freeze on kernel 5.18.11-200.fc36 when a Libvirt/KVM/QEMU Windows 7 VM is started

1. Please describe the problem:
After upgrading to kernel 5.18.11-200.fc36, my LENOVO ThinkPad L490 20Q6S4JH01 laptop started to crash. I could not find out what causes the crash: the journalctl output is cut off cleanly, with no warnings or kernel errors. The local console is also stuck (the cursor does not blink), the SSH connection is frozen, and the computer does not respond to ping.
I found that the freeze occurs shortly after starting a virtual machine with Windows 7. The VM with Windows 10 works without issue.
I tried disabling all mitigations, and the retbleed mitigation specifically, but that did not solve the problem.
A clean install of a Windows 7 VM also causes the crash (during installation).
I am filing this bug in the Fedora Bugzilla because the problem did not show up on my desktop with the vanilla 5.18.11 kernel (Debian 11).

2. What is the Version-Release number of the kernel:
5.18.11-200.fc36

3. Did it work previously in Fedora? If so, what kernel version did the issue *first* appear? Old kernels are available for download at https://koji.fedoraproject.org/koji/packageinfo?packageID=8 :
Yes, on 5.18.10-200.fc36 Windows 7 VMs run without problems.

4. Can you reproduce this issue? If so, please provide the steps to reproduce the issue below:
Boot the 5.18.11-200.fc36 kernel, then:
1. Install libvirt, virt-manager, qemu, qemu-kvm...
2. Run virt-manager
3. Create a new VM
4. Select local install media (ISO image or CD-ROM)
5. Insert the ISO (in my case cs_windows_7_professional_x64_dvd_x15-65799.iso)
6. Microsoft Windows 7 should be selected as the template. Forward
7. Forward (default CPUs and RAM)
8. Default storage. Forward
9. Finish (no customization)
The VM starts and within moments the computer crashes. The SSH connection does not respond, ping does not respond, and the cursor in the local terminal freezes.
An already installed Windows 7 VM also freezes the computer after startup.
This procedure works fine on kernel 5.18.10-200.fc36.

5. Does this problem occur with the latest Rawhide kernel? To install the Rawhide kernel, run ``sudo dnf install fedora-repos-rawhide`` followed by ``sudo dnf update --enablerepo=rawhide kernel``:
5.19.0-0.rc6.20220714git4a57a8400075.49.fc37 behaves the same way as 5.18.11-200.fc36.

6. Are you running any modules that are not shipped directly with Fedora's kernel?:
The output of dmesg | grep -i tainted is empty.

7. Please attach the kernel logs. You can get the complete kernel log for a boot with ``journalctl --no-hostname -k > dmesg.txt``. If the issue occurred on a previous boot, use the journalctl ``-b`` flag.
dmesg_5.18.11
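The taint check from item 6 can also be run against a saved kernel log instead of the live dmesg buffer. The following is only a minimal sketch: the function name `tainted_lines` is hypothetical, and it assumes the log was saved as plain text, for example with the journalctl command from item 7.

```python
def tainted_lines(log_text: str) -> list[str]:
    """Return the log lines that contain 'tainted', ignoring case.

    This mirrors `grep -i tainted`: a plain case-insensitive substring match.
    """
    return [line for line in log_text.splitlines() if "tainted" in line.lower()]


def tainted_lines_in_file(path: str) -> list[str]:
    """Apply the same check to a saved log file (the path is an assumption)."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return tainted_lines(handle.read())
```

An empty list corresponds to the empty grep output reported above.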
Created attachment 1897953 [details]
debug info from terminal

I did:
git bisect good kernel-5.18.10-0
git bisect bad kernel-5.18.11-0
from git https://gitlab.com/cki-project/kernel-ark.git

git bisect start
# bad: [974e3e09ea6fdef8a1b1b68b209ce849f05e0063] Turn on configs for retbleed
git bisect bad 974e3e09ea6fdef8a1b1b68b209ce849f05e0063
# good: [934aec1d7c14549447ba3b53a65f0eb948fc023b] [redhat] kernel-5.18.10-0
git bisect good 934aec1d7c14549447ba3b53a65f0eb948fc023b
# good: [2783414e6ef725bac946dc5d4d9288e34b6f5a13] ACPI: CPPC: Don't require _OSC if X86_FEATURE_CPPC is supported
git bisect good 2783414e6ef725bac946dc5d4d9288e34b6f5a13
# good: [5aca0c5b86a52e2487c4d846ac08f20d5fb9ce11] x86/vsyscall_emu/64: Don't use RET in vsyscall emulation
git bisect good 5aca0c5b86a52e2487c4d846ac08f20d5fb9ce11
# bad: [9072ecef88a18bba73dd59c78d202c9966574aab] x86/cpu/amd: Add Spectral Chicken
git bisect bad 9072ecef88a18bba73dd59c78d202c9966574aab
# bad: [b6755754d19816815235c8fca8979856763afbc9] x86/bugs: Optimize SPEC_CTRL MSR writes
git bisect bad b6755754d19816815235c8fca8979856763afbc9
# bad: [98db9034780970f94cf0fd66f6c3371ce5bd1da0] x86: Add magic AMD return-thunk
git bisect bad 98db9034780970f94cf0fd66f6c3371ce5bd1da0
# bad: [b881f755be2f276dca2ff2563d5cc4ae38561c51] x86: Use return-thunk in asm code
git bisect bad b881f755be2f276dca2ff2563d5cc4ae38561c51
# good: [9b9b256ca2665c776a56acd643e8e90d7c8ad1b4] x86/sev: Avoid using __x86_return_thunk
git bisect good 9b9b256ca2665c776a56acd643e8e90d7c8ad1b4
# first bad commit: [b881f755be2f276dca2ff2563d5cc4ae38561c51] x86: Use return-thunk in asm code

So according to the bisect, this commit is causing the crash, at least on my laptop with an i5-8365U CPU:
b881f755be2f276dca2ff2563d5cc4ae38561c51 x86: Use return-thunk in asm code

I compiled the kernels with make binrpm-pkg.
This commit cannot simply be reverted with git revert.

During the tests I caught the kernel panic on camera:
fastop
x86_emulate_insn
x86_emulate_instruction
kvm_arch_vcpu_ioctl_run
kvm_vcpu_ioctl
__seccomp_filter
__x64_sys_ioctl
__x64_sys_ioctl
syscall_exit_to_user_mode
do_syscall_64
More in the attachments.

So it looks like this is not a Fedora bug but a general kernel bug. I should probably file it in the kernel.org Bugzilla. Thank you
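A bisect log like the one in this comment can be summarized mechanically, which helps when comparing results from several machines. The sketch below is illustrative only: the function name `summarize_bisect` and the returned dictionary layout are hypothetical. It relies solely on the `# good:`, `# bad:` and `# first bad commit:` comment lines that `git bisect log` prints. The sample is the tail of the log above.

```python
import re

# Matches the comment lines written by `git bisect log`, for example:
#   # bad: [<sha>] <subject>
#   # first bad commit: [<sha>] <subject>
MARK = re.compile(r"^# (good|bad|first bad commit): \[([0-9a-f]{7,40})\] (.*)$")


def summarize_bisect(log_text: str) -> dict:
    """Collect good/bad verdicts and the first bad commit from a bisect log."""
    good, bad, first_bad = [], [], None
    for line in log_text.splitlines():
        match = MARK.match(line.strip())
        if match is None:
            continue  # skip the "git bisect ..." command lines
        kind, sha, subject = match.groups()
        if kind == "good":
            good.append((sha, subject))
        elif kind == "bad":
            bad.append((sha, subject))
        else:
            first_bad = (sha, subject)
    return {"good": good, "bad": bad, "first_bad": first_bad}


SAMPLE = """\
git bisect start
# bad: [b881f755be2f276dca2ff2563d5cc4ae38561c51] x86: Use return-thunk in asm code
git bisect bad b881f755be2f276dca2ff2563d5cc4ae38561c51
# good: [9b9b256ca2665c776a56acd643e8e90d7c8ad1b4] x86/sev: Avoid using __x86_return_thunk
git bisect good 9b9b256ca2665c776a56acd643e8e90d7c8ad1b4
# first bad commit: [b881f755be2f276dca2ff2563d5cc4ae38561c51] x86: Use return-thunk in asm code
"""
```

Calling `summarize_bisect(SAMPLE)["first_bad"]` yields the hash and subject of the commit named in the last line of the log.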
Created attachment 1897954 [details] debug info from terminal 2
Created attachment 1897955 [details] debug info from terminal 3
Created attachment 1898168 [details]
Netconsole kernel panic after W7 is started

I compiled a vanilla 5.18.11 kernel using the configuration from Fedora 36 5.18.11-200. (The resulting .config is quite different from Fedora's, and the retbleed mitigations are probably not enabled.) It did not trigger the kernel panic. However, I managed to capture the whole kernel panic via netconsole (attached).
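Because the host freezes completely, the panic has to be collected on a second machine. netconsole sends kernel messages as UDP datagrams, so a small listener on the receiving side is enough. The following is a rough sketch, not the setup used for the attachment: the function names are hypothetical, and port 6666 is assumed here as the conventional netconsole target port; the real port depends on how netconsole was configured on the crashing host.

```python
import socket


def open_listener(host: str = "0.0.0.0", port: int = 6666) -> socket.socket:
    """Bind a UDP socket for netconsole traffic (port 6666 is an assumption)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_messages(sock: socket.socket, count: int, timeout: float = 5.0) -> list[str]:
    """Collect up to `count` datagrams, stopping early if none arrives in time."""
    sock.settimeout(timeout)
    messages = []
    try:
        for _ in range(count):
            data, _sender = sock.recvfrom(65535)
            # Kernel messages are text; replace any undecodable bytes.
            messages.append(data.decode("utf-8", errors="replace"))
    except socket.timeout:
        pass  # no more traffic, e.g. because the sender has frozen
    return messages
```

On the receiving machine, something like `receive_messages(open_listener(), 1000, timeout=60.0)` would gather whatever the crashing host manages to send before it stops responding.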
I'm seeing this too. In my case, kernel-5.18.11-200.fc36.x86_64 reliably borks on one particular hardware. On different hardware it starts fine, so there's a hardware component involved here.
I have also experienced a hard lockup with kernel-5.18.11-200.fc36.x86_64 after starting a VM with VMM. This is on a 15-year-old system with an Intel Core 2 Duo processor, an Nvidia GeForce GT 630 video card using the nouveau driver, and 8 GB of memory. I also have a 2-year-old Acer Aspire system with an Intel Core i5 processor and integrated Intel graphics. I do not get the lockup there.
kernel-5.18.13-200.fc36 fixed it for me.
This issue went away for me with kernel-5.18.13-200.fc36, which I found in koji.
Yep, I can confirm 5.18.13-200.fc36 is OK for all my VMs. No hard freeze yet. Hope it stays this way :-)
5.18.15-200.fc36.x86_64 from koji works as well. Closing this bug, thank you