Description of problem:
Virtual machines reboot or freeze after a hypervisor upgrade is completed when the source version is 4.5.1 or lower.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Start a virtual machine on an RHVH 4.5.1.1 hypervisor.
2. Migrate it to a host deployed with RHVH 4.5.2 or a later release.

Actual results:
On Intel processors the VM is rebooted. On AMD processors the VM freezes.

Expected results:
The VM keeps running on the destination host as usual.

Additional info:
Hi José,

Thanks for reporting this bug. If possible, could you please also share the qemu command line with us? For example:

ps -aux|grep qemu-kvm

Thanks a lot.

Best regards
Min
Can you guys provide some more (public) info on this topic? Or maybe create a KB article? I am interested in knowing if I run into this issue on my next rhv upgrades :)
Hello all,

QE could reproduce a similar issue on two AMD hosts.

AMD:
dell-per7525-27.lab.eng.pek2.redhat.com
dell-per7525-28.lab.eng.pek2.redhat.com

SRC:
qemu-kvm-6.2.0-11.module+el8.6.0+14707+5aa4b42d.x86_64 or qemu-kvm-6.2.0-11.module+el8.6.0+18167+43cf40f3.8.x86_64 (even the *same* qemu-kvm version)
kernel-4.18.0-372.16.1.el8_6.x86_64

DST:
qemu-kvm-6.2.0-11.module+el8.6.0+18167+43cf40f3.8.x86_64
kernel-4.18.0-372.63.1.el8_6.x86_64

Steps:
1. Boot up a guest with:

/usr/libexec/qemu-kvm \
-name "mouse-vm" \
-sandbox on \
-machine pc-q35-rhel8.6.0,pflash0=drive_ovmf_code,pflash1=drive_ovmf_vars \
-nodefaults \
-cpu EPYC,ibpb=on,virt-ssbd=on,monitor=off,x2apic=on,hypervisor=on,svm=off,topoext=on,+kvm_pv_unhalt \
-device '{"driver":"pcie-root-port","port":1,"chassis":1,"id":"pcie-root-port0","multifunction":true,"bus":"pcie.0","addr":"0x4"}' \
-device '{"driver":"virtio-scsi-pci","id":"scsi0","bus":"pcie-root-port0"}' \
-device '{"driver":"pcie-root-port","port":2,"chassis":2,"id":"pcie-root-port1","bus":"pcie.0","addr":"0x4.0x1"}' \
-device '{"driver":"scsi-hd","bus":"scsi0.0","lun":0,"drive":"drive-virtio-disk0","id":"virtio-disk0","bootindex":1}' \
-device virtio-vga \
-blockdev '{"driver":"file","cache":{"direct":true,"no-flush":false},"filename":"/home/new_bug/rhel860-64-virtio-scsi-ovmf.qcow2","node-name":"drive_sys3"}' \
-blockdev '{"driver":"qcow2","node-name":"drive-virtio-disk0","file":"drive_sys3"}' \
-blockdev '{"node-name":"file_ovmf_code","driver":"file","filename":"/usr/share/OVMF/OVMF_CODE.secboot.fd","auto-read-only":true,"discard":"unmap"}' -blockdev '{"node-name":"drive_ovmf_code","driver":"raw","read-only":true,"file":"file_ovmf_code"}' -blockdev '{"node-name":"file_ovmf_vars","driver":"file","filename":"/home/new_bug/rhel860-64-virtio-scsi-ovmf.qcow2_VARS.fd","auto-read-only":true,"discard":"unmap"}' -blockdev '{"node-name":"drive_ovmf_vars","driver":"raw","read-only":false,"file":"file_ovmf_vars"}' \
-object '{"qom-type":"memory-backend-ram","id":"mem-1","prealloc":true,"size":2147483648,"host-nodes":[0],"policy":"bind"}' -object '{"qom-type":"memory-backend-ram","id":"mem-2","prealloc":true,"size":2147483648,"host-nodes":[0],"policy":"bind"}' \
-m 4096,slots=256,maxmem=32G -smp 8,cores=1,threads=1,sockets=8 -vnc :10 -rtc base=utc,clock=host -boot order=cdn,once=c,menu=on,strict=on -enable-kvm -qmp tcp:0:3333,server,nowait -qmp tcp:0:9999,server=on,wait=off -qmp tcp:0:9888,server=on,wait=off -serial tcp:0:4444,server,nowait -monitor stdio -watchdog-action reset

2. Do a live migration.

Actual results: the guest hangs.
Expected results: the guest is migrated successfully.

However, after I upgraded the kernel on the SRC side from kernel-4.18.0-372.16.1.el8_6.x86_64 to kernel-4.18.0-372.63.1.el8_6.x86_64, I can no longer reproduce it, no matter which qemu package (14707 or 18167) is used on the SRC side.

Now I'm doing a full regression test between qemu-kvm-6.2.0-11.module+el8.6.0+14707+5aa4b42d.x86_64 (src) with kernel-4.18.0-372.63.1.el8_6.x86_64 and qemu-kvm-6.2.0-11.module+el8.6.0+18167+43cf40f3.8.x86_64 (dst) with kernel-4.18.0-372.63.1.el8_6.x86_64. I will update the results later.

Thanks
Min
Hello Juan,

I have a question: is this customer's scenario a preferred one, where they do live migration (not stable guest ABI) between hosts with different minor qemu-kvm and kernel versions?

SRC: qemu version: 6.2.0 (qemu-kvm-6.2.0-11.module+el8.6.0+14707+5aa4b42d), kernel: 4.18.0-372.16.1.el8_6.x86_64
DST: qemu version: 6.2.0 (qemu-kvm-6.2.0-11.module+el8.6.0+18167+43cf40f3.8), kernel: 4.18.0-372.57.1.el8_6.x86_64

Thanks.
Min
Hi Min,

I guess that you are referring to me as Juan; I'm Jose :D, no worries.
The problem is that a minor RHV upgrade can't be done without disturbing virtual machines. Live migration functionality is compromised by this bug, so I guess that we should fix the issue.

Regards.
(In reply to Klaas Demter from comment #6)
> Can you guys provide some more (public) info on this topic? Or maybe create
> a KB article? I am interested in knowing if I run into this issue on my next
> rhv upgrades :)

You are right, I'm going to create a KCS regarding this issue, but I was waiting for it to be reproduced by QA.
(In reply to José Enrique from comment #9)
> Hi Min,
>
> I guess that you are referring me as Juan, I'm Jose :D, no worries.
> The problem is that a minor RHV upgrade can't be done without disturbing
> Virtual Machines, live migration functionality is compromised with this bug,
> so I guess that we should fix the issue.
>
> Regards.

Hi Jose,

Thanks for your feedback. Juan is our live migration developer; I just wanted to double-check the scenario with him ;)!
Could you please provide the error log, the qemu command line and the output of lscpu from those hosts? We need to compare the failure on your side with what I've got.
The live migration QE (xiaohli) pointed me to a fixed bug, please refer to bug 2131756 - VMs hang after migration [rhel-8.6.0.z]. I hope it's helpful for you!

Thank you.
Min

Failure from QE side:
From the VM's console (dst), I got the following output.

[ 0.000000] Linux version 4.18.0-372.46.1.el8_6.x86_64 (mockbuild.eng.bos.redhat.com) (gcc version 8.5.0 20210514 (Red Hat 8.5.0-10) (GCC)) #1 SMP Thu Feb 16 13:46:57 EST 2023
[ 0.000000] Command line: elfcorehdr=0x6e000000 BOOT_IMAGE=(hd0,gpt2)/vmlinuz-4.18.0-372.46.1.el8_6.x86_64 ro console=tty0 resume=/dev/mapper/rhel_vm--212--229-swap biosdevname=0 net.ifnames=0 console=ttyS0,115200 irqpoll nr_cpus=1 reset_devices cgroup_disable=memory mce=off numa=off udev.children-max=2 panic=10 rootflags=nofail acpi_no_memhotplug transparent_hugepage=never nokaslr novmcoredd hest_disable disable_cpu_apicid=0 iTCO_wdt.pretimeout=0
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
[ 0.000000] x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256
[ 0.000000] x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'standard' format.
[ 0.000000] signal: max sigframe size: 1776
[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x0000000000000fff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000000001000-0x000000000002ffff] usable
[ 0.000000] BIOS-e820: [mem 0x0000000000030000-0x000000000004ffff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000000050000-0x000000000009efff] usable
[ 0.000000] BIOS-e820: [mem 0x000000000009f000-0x000000000009ffff] reserved
[ 0.000000] BIOS-e820: [mem 0x000000006e001000-0x0000000079ffffff] usable
[ 0.000000] BIOS-e820: [mem 0x000000007d08d000-0x000000007d095fff] reserved
[ 0.000000] BIOS-e820: [mem 0x000000007e6e5000-0x000000007e964fff] reserved
[ 0.000000] BIOS-e820: [mem 0x000000007e965000-0x000000007e97dfff] ACPI data
[ 0.000000] BIOS-e820: [mem 0x000000007e97e000-0x000000007e9fdfff] ACPI NVS
[ 0.000000] BIOS-e820: [mem 0x000000007f000000-0x000000007fffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000b0000000-0x00000000bfffffff] reserved
[ 0.000000] NX (Execute Disable) protection: active
[ 0.000000] extended physical RAM map:
[ 0.000000] reserve setup_data: [mem 0x0000000000000000-0x0000000000000fff] reserved
[ 0.000000] reserve setup_data: [mem 0x0000000000001000-0x000000000002ffff] usable
[ 0.000000] reserve setup_data: [mem 0x0000000000030000-0x000000000004ffff] reserved
[ 0.000000] reserve setup_data: [mem 0x0000000000050000-0x000000000009efff] usable
[ 0.000000] reserve setup_data: [mem 0x000000000009f000-0x000000000009ffff] reserved
[ 0.000000] reserve setup_data: [mem 0x000000006e001000-0x0000000079ff82af] usable
[ 0.000000] reserve setup_data: [mem 0x0000000079ff82b0-0x0000000079ff831f] usable
[ 0.000000] reserve setup_data: [mem 0x0000000079ff8320-0x0000000079ffffff] usable
[ 0.000000] reserve setup_data: [mem 0x000000007d08d000-0x000000007d095fff] reserved
[ 0.000000] reserve setup_data: [mem 0x000000007e6e5000-0x000000007e964fff] reserved
[ 0.000000] reserve setup_data: [mem 0x000000007e965000-0x000000007e97dfff] ACPI data
[ 0.000000] reserve setup_data: [mem 0x000000007e97e000-0x000000007e9fdfff] ACPI NVS
[ 0.000000] reserve setup_data: [mem 0x000000007f000000-0x000000007fffffff] reserved
[ 0.000000] reserve setup_data: [mem 0x00000000b0000000-0x00000000bfffffff] reserved
[ 0.000000] efi: EFI v2.70 by EDK II
[ 0.000000] efi: SMBIOS=0x7e7cd000 ACPI=0x7e97d000 ACPI 2.0=0x7e97d014 MEMATTR=0x7d578198 MOKvar=0x7e77a000
[ 0.000000] secureboot: Secure boot disabled
[ 0.000000] SMBIOS 2.8 present.
[ 0.000000] DMI: Red Hat KVM/RHEL-AV, BIOS 0.0.0 02/06/2015
[ 0.000000] Hypervisor detected: KVM
[ 0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00
[ 0.000000] kvm-clock: cpu 0, msr 79201001, primary cpu clock
[ 0.000000] kvm-clock: using sched offset of 127727498361 cycles
[ 0.000000] clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
[ 0.000000] tsc: Detected 1996.246 MHz processor
[ 0.000000] last_pfn = 0x7a000 max_arch_pfn = 0x400000000
[ 0.000000] Disabled
[ 0.000000] x86/PAT: MTRRs disabled, skipping PAT initialization too.
[ 0.000000] CPU MTRRs all blank - virtualized system.
[ 0.000000] x86/PAT: Configuration [0-7]: WB WT UC- UC WB WT UC- UC
[ 0.000000] x2apic: enabled by BIOS, switching to x2apic ops
[ 0.000000] Using GB pages for direct mapping
[ 0.000000] RAMDISK: [mem 0x747c3000-0x761fffff]
[ 0.000000] ACPI: Early table checksum verification disabled
[ 0.000000] ACPI: RSDP 0x000000007E97D014 000024 (v02 BOCHS )
[ 0.000000] ACPI: XSDT 0x000000007E97C0E8 000054 (v01 BOCHS BXPC 00000001 01000013)
[ 0.000000] ACPI: FACP 0x000000007E970000 0000F4 (v03 BOCHS BXPC 00000001 BXPC 00000001)
[ 0.000000] ACPI: DSDT 0x000000007E971000 00A29E (v01 BOCHS BXPC 00000001 BXPC 00000001)
[ 0.000000] ACPI: FACS 0x000000007E9DC000 000040
[ 0.000000] ACPI: APIC 0x000000007E96F000 0000B0 (v01 BOCHS BXPC 00000001 BXPC 00000001)
[ 0.000000] ACPI: SRAT 0x000000007E96E000 000150 (v01 BOCHS BXPC 00000001 BXPC 00000001)
[ 0.000000] ACPI: MCFG 0x000000007E96D000 00003C (v01 BOCHS BXPC 00000001 BXPC 00000001)
[ 0.000000] ACPI: WAET 0x000000007E96C000 000028 (v01 BOCHS BXPC 00000001 BXPC 00000001)
[ 0.000000] ACPI: BGRT 0x000000007E96B000 000038 (v01 INTEL EDK2 00000002 01000013)
[ 0.000000] ACPI: Reserving FACP table memory at [mem 0x7e970000-0x7e9700f3]
[ 0.000000] ACPI: Reserving DSDT table memory at [mem 0x7e971000-0x7e97b29d]
[ 0.000000] ACPI: Reserving FACS table memory at [mem 0x7e9dc000-0x7e9dc03f]
[ 0.000000] ACPI: Reserving APIC table memory at [mem 0x7e96f000-0x7e96f0af]
[ 0.000000] ACPI: Reserving SRAT table memory at [mem 0x7e96e000-0x7e96e14f]
[ 0.000000] ACPI: Reserving MCFG table memory at [mem 0x7e96d000-0x7e96d03b]
[ 0.000000] ACPI: Reserving WAET table memory at [mem 0x7e96c000-0x7e96c027]
[ 0.000000] ACPI: Reserving BGRT table memory at [mem 0x7e96b000-0x7e96b037]
Hi

I have been looking at this bug for a while.

First, we thought that this could be related to the AMD bugs that we had in the recent past, but:

- Per comment 7, I infer that we are testing between identical machines:
  AMD: dell-per7525-27.lab.eng.pek2.redhat.com dell-per7525-28.lab.eng.pek2.redhat.com
  The bugs that we had in the recent past were Naples <-> Milan, if I remember correctly. The important bit is that they were between different AMD generations.

- Per comment 1, it also happens on Intel, so it can't be the bug that we fixed in the past.

In comment 1, you are testing with different kernels and different qemus. Would it be possible to try to reproduce with the same kernel and different qemus, and with the same qemu and different kernels, so that we can decide where the problem is?

In comment 7 they say that they can reproduce with the same qemu and different kernels, so the problem should be in the kernel. Just double checking.
In that comment they also say that the problem disappears when they upgrade the kernel on the source, so the problem is there.

I would ask @leobras if he has some idea. But the problem looks to be in the kernel, not in qemu.

I don't have any suggestion about what the differences between those two kernels are.
(In reply to Min Deng from comment #8)
> Hello Juan,
> I have question, is this customer's scenario preferred if they do live
> migration (not stable guest abi) with different minor qemu-kvm and kernel
> SRC:qemu version: 6.2.0qemu-kvm-6.2.0-11.module+el8.6.0+14707+5aa4b42d,
> kernel: 4.18.0-372.16.1.el8_6.x86_64
> DST:qemu version: 6.2.0qemu-kvm-6.2.0-11.module+el8.6.0+18167+43cf40f3.8,
> kernel: 4.18.0-372.57.1.el8_6.x86_64
>
> Thanks.
> Min

You say in comment 7 that:

different kernels + same qemu -> fails
same kernel + different qemu -> works

So the problem is clearly in the kernel.

Could you test with the same qemu and different kernels, changing a couple of things:

> -name "mouse-vm" \
> -sandbox on \

This shouldn't matter, but you can try to drop it.

> -machine pc-q35-rhel8.6.0,pflash0=drive_ovmf_code,pflash1=drive_ovmf_vars \

Nothing really stands out here.

> -nodefaults \
> -cpu EPYC,ibpb=on,virt-ssbd=on,monitor=off,x2apic=on,hypervisor=on,svm=off,topoext=on,+kvm_pv_unhalt \

Can you try just

-cpu EPYC

And if it works, we can bisect the other features that you are using.

> -device '{"driver":"pcie-root-port","port":1,"chassis":1,"id":"pcie-root-port0","multifunction":true,"bus":"pcie.0","addr":"0x4"}' \
> -device '{"driver":"virtio-scsi-pci","id":"scsi0","bus":"pcie-root-port0"}' \

You are using virtio-scsi; I normally use virtio-blk, but as the problem is in the kernel, not in qemu, I think this doesn't matter.
> -device '{"driver":"pcie-root-port","port":2,"chassis":2,"id":"pcie-root-port1","bus":"pcie.0","addr":"0x4.0x1"}' \
> -device '{"driver":"scsi-hd","bus":"scsi0.0","lun":0,"drive":"drive-virtio-disk0","id":"virtio-disk0","bootindex":1}' \
> -device virtio-vga \
> -blockdev '{"driver":"file","cache":{"direct":true,"no-flush":false},"filename":"/home/new_bug/rhel860-64-virtio-scsi-ovmf.qcow2","node-name":"drive_sys3"}' \
> -blockdev '{"driver":"qcow2","node-name":"drive-virtio-disk0","file":"drive_sys3"}' \
> -blockdev '{"node-name":"file_ovmf_code","driver":"file","filename":"/usr/share/OVMF/OVMF_CODE.secboot.fd","auto-read-only":true,"discard":"unmap"}' -blockdev '{"node-name":"drive_ovmf_code","driver":"raw","read-only":true,"file":"file_ovmf_code"}'
>
>
> -blockdev '{"node-name":"file_ovmf_vars","driver":"file","filename":"/home/new_bug/rhel860-64-virtio-scsi-ovmf.qcow2_VARS.fd","auto-read-only":true,"discard":"unmap"}' -blockdev '{"node-name":"drive_ovmf_vars","driver":"raw","read-only":false,"file":"file_ovmf_vars"}' \

Except for the use of virtio-scsi (and it is supported and shouldn't create problems), the rest of the storage looks really simple and correct.

> -object '{"qom-type":"memory-backend-ram","id":"mem-1","prealloc":true,"size":2147483648,"host-nodes":[0],"policy":"bind"}' -object '{"qom-type":"memory-backend-ram","id":"mem-2","prealloc":true,"size":2147483648,"host-nodes":[0],"policy":"bind"}' \

As a second option, I would try dropping the preallocation, just to see if the problem is there.

> -m 4096,slots=256,maxmem=32G -smp 8,cores=1,threads=1,sockets=8

And if this fails, can you just do:

-m 4096 -smp 8

And see if it helps.

> -vnc :10 -rtc base=utc,clock=host

Wild shot: can you check if the clocks of the two hosts drift?

on host 1:

date; ssh host2 date; date

And see if they are synchronized.
This could explain the freeze, but my understanding is that we already double-check in libvirt/rhev/cnv that the clocks are right. It is very easy to check, though, so ...

> -boot order=cdn,once=c,menu=on,strict=on -enable-kvm -qmp tcp:0:3333,server,nowait -qmp tcp:0:9999,server=on,wait=off -qmp tcp:0:9888,server=on,wait=off -serial tcp:0:4444,server,nowait -monitor stdio -watchdog-action reset

Nothing else has any relation to the kernel as far as I can see.

The only other thing that I can think of is for you to send the dmesg output of the guest after booting on the host with the old kernel and on the host with the new kernel, and see what differences the guest sees.

Sorry for not having a better suggestion, but we can start with this.

Later, Juan.
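The clock check suggested above (`date; ssh host2 date; date`) can be evaluated mechanically. The following is only a minimal Python sketch. It assumes the three timestamps have already been parsed into `datetime` objects. The function names and the one-second tolerance are illustrative assumptions, not part of libvirt, RHV or any other tool mentioned in this bug.

```python
from datetime import datetime, timedelta


def clock_offset_bounds(local_before, remote, local_after):
    """Bound the remote clock offset relative to the local clock.

    The remote timestamp was taken at some local time between
    local_before and local_after, so the true offset lies in
    [remote - local_after, remote - local_before].
    """
    return remote - local_after, remote - local_before


def clocks_look_synchronized(local_before, remote, local_after,
                             tolerance=timedelta(seconds=1)):
    """True if an offset within +/- tolerance is consistent with the samples.

    The tolerance default is an assumption: `date` prints whole seconds,
    so anything finer cannot be judged from its output.
    """
    low, high = clock_offset_bounds(local_before, remote, local_after)
    return low <= tolerance and high >= -tolerance


if __name__ == "__main__":
    # Hypothetical sample values, not taken from the hosts in this bug.
    before = datetime(2023, 3, 1, 12, 0, 0)
    remote = datetime(2023, 3, 1, 12, 0, 0)
    after = datetime(2023, 3, 1, 12, 0, 1)
    print(clocks_look_synchronized(before, remote, after))
```

Bracketing the remote `date` call between two local ones is what makes the comparison meaningful: the ssh round trip only widens the interval and does not bias it.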
Following Juan's suggestions, I can still reproduce the freeze with the simplified command line below. I am also providing the output from the two AMD hosts and the lscpu output from the guest running on both kernels.

/usr/libexec/qemu-kvm \
-name "mouse-vm" \
-sandbox on \
-machine pc-q35-rhel8.6.0,pflash0=drive_ovmf_code,pflash1=drive_ovmf_vars \
-nodefaults \
-cpu EPYC \
-device '{"driver":"pcie-root-port","port":1,"chassis":1,"id":"pcie-root-port0","multifunction":true,"bus":"pcie.0","addr":"0x4"}' \
-device '{"driver":"virtio-scsi-pci","id":"scsi0","bus":"pcie-root-port0"}' \
-device '{"driver":"pcie-root-port","port":2,"chassis":2,"id":"pcie-root-port1","bus":"pcie.0","addr":"0x4.0x1"}' \
-device '{"driver":"scsi-hd","bus":"scsi0.0","lun":0,"drive":"drive-virtio-disk0","id":"virtio-disk0","bootindex":1}' \
-device virtio-vga \
-blockdev '{"driver":"file","cache":{"direct":true,"no-flush":false},"filename":"/home/new_bug/rhel860-64-virtio-scsi-ovmf.qcow2","node-name":"drive_sys3"}' \
-blockdev '{"driver":"qcow2","node-name":"drive-virtio-disk0","file":"drive_sys3"}' \
-blockdev '{"node-name":"file_ovmf_code","driver":"file","filename":"/usr/share/OVMF/OVMF_CODE.secboot.fd","auto-read-only":true,"discard":"unmap"}' -blockdev '{"node-name":"drive_ovmf_code","driver":"raw","read-only":true,"file":"file_ovmf_code"}' -blockdev '{"node-name":"file_ovmf_vars","driver":"file","filename":"/home/new_bug/rhel860-64-virtio-scsi-ovmf.qcow2_VARS.fd","auto-read-only":true,"discard":"unmap"}' -blockdev '{"node-name":"drive_ovmf_vars","driver":"raw","read-only":false,"file":"file_ovmf_vars"}' \
-m 4096 -smp 8 -vnc :10 -rtc base=utc,clock=host -boot order=cdn,once=c,menu=on,strict=on -enable-kvm -qmp tcp:0:3333,server,nowait -qmp tcp:0:9999,server=on,wait=off -qmp tcp:0:9888,server=on,wait=off -serial tcp:0:4444,server,nowait -monitor stdio -watchdog-action reset
My theory points in that direction, but this is not the right problem (tm).

#include <stdint.h>
#include <stdio.h>

struct kvm_xsave1 {
	uint32_t region[1024];
};

struct kvm_xsave2 {
	uint32_t region[1024];
	uint32_t extra[0];	/* zero-length array: adds nothing to sizeof */
};

int main(void)
{
	/* %zu is the portable conversion for size_t */
	printf("xsave1 %zu xsave2 %zu\n", sizeof(struct kvm_xsave1), sizeof(struct kvm_xsave2));
	return 0;
}

$ ./kk
xsave1 4096 xsave2 4096

So the structs still have the right size when nothing extra is allocated. Arrays with zero size have this interesting behaviour; I know it, but I always forget.
Hi Min

Could you do a bisect on the released kernels between:

- kernel-4.18.0-372.16.1.el8_6
- kernel-4.18.0-372.57.1.el8_6

And say which is the first kernel rpm where this problem happens?

Thanks very much.
(In reply to Juan Quintela from comment #37)
> Hi Min
>
> Could you do a bisect on the released kernels between:
>
> - kernel-4.18.0-372.16.1.el8_6
> - kernel-4.18.0-372.57.1.el8_6
>
> And say what is the first kernel rpm where this problem happens?

Hi Juan,

I'd like to help with the bisect, but could you please be more specific? Is kernel-4.18.0-372.16.1.el8_6 the first one? Correct me if I am wrong.

Thank you
Min
Hi Min

Machine that is the source of the migration: it stays at

- kernel-4.18.0-372.16.1.el8_6

Machine that is the destination of the migration: we need to know the first released kernel that fails.

From 16 to 57 there are at most 42 kernels. Bisecting over the kernel numbers should take around 5 or 6 tries (log2 of 42 is about 5.4).

Once we have found that, we will look at the specific change that caused this regression.

Later, Juan.
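The bisect over released kernels described above can be sketched as a plain binary search. This is only an illustration in Python. It assumes the candidate destination kernels are ordered from oldest to newest, that the newest one is known to fail, and that once a kernel fails every later one fails as well. The `first_bad` helper, the numeric stand-ins for the kernel releases and the failure threshold in the demo are all hypothetical.

```python
def first_bad(candidates, is_bad):
    """Return (first failing candidate, number of migration tests run).

    candidates: destination kernels ordered oldest to newest; the last
    one is assumed to be known bad, so it is never tested again.
    is_bad: callable that runs one migration test and returns True when
    the guest hangs or reboots on that destination kernel.
    """
    low, high = 0, len(candidates) - 1
    tests = 0
    while low < high:
        mid = (low + high) // 2
        tests += 1
        if is_bad(candidates[mid]):
            high = mid          # first bad kernel is mid or older
        else:
            low = mid + 1       # first bad kernel is newer than mid
    return candidates[low], tests


if __name__ == "__main__":
    # Hypothetical stand-ins for the z-stream numbers 17..57; the real
    # list of released kernels may be shorter.
    kernels = list(range(17, 58))
    # Made-up threshold, used only to exercise the search.
    found, tests = first_bad(kernels, lambda n: n >= 30)
    print(found, tests)
```

With about 40 candidates the loop needs at most 6 tests, because each test halves the remaining range.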
Thanks for the info, will update the result ASAP.
(In reply to Min Deng from comment #40)
> Thanks for the info, will update the result ASAP.

It should break starting from kernel-4.18.0-372.21.1.el8_6.x86_64.

Guest console:

[ 72.073286] Hardware name: Red Hat KVM/RHEL-AV, BIOS 0.0.0 02/06/2015
[ 72.075101] RIP: 0010:ex_handler_fprestore+0x43/0x50
[ 72.076465] Code: 00 00 00 74 0f e8 9d 5b fb ff b8 01 00 00 00 e9 43 61 b8 00 48 89 c6 48 c7 c7 50 39 cd 9e c6 05 2c 22 ce 01 01 e8 4a 65 07 00 <0f> 0b eb d7 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 80 3d 0d 22
[ 72.081698] RSP: 0018:ffffbceb0126fe00 EFLAGS: 00010086
[ 72.083235] RAX: 0000000000000000 RBX: ffffbceb0126fe48 RCX: 0000000000000007
[ 72.085261] RDX: 0000000000000007 RSI: 00000000ffff7fff RDI: ffff9a7ffbc16790
[ 72.087096] RBP: 000000000000000d R08: 0000000000000000 R09: c0000000ffff7fff
[ 72.089113] R10: 0000000000000001 R11: ffffbceb0126fc18 R12: ffff9a7f88e82800
[ 72.091091] R13: ffff9a7f88e82800 R14: 0000000000000000 R15: 0000000000000000
[ 72.092940] FS: 00007f542b39eb80(0000) GS:ffff9a7ffbc00000(0000) knlGS:0000000000000000
[ 72.095190] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 72.096740] CR2: 000055bcfc06edf0 CR3: 0000000112f82000 CR4: 00000000003506f0
[ 72.098835] Call Trace:
[ 72.100227] fixup_exception+0x33/0x46
[ 72.101160] do_general_protection+0x42/0x150
[ 72.102333] general_protection+0x1e/0x30
[ 72.103486] RIP: 0010:restore_fpregs_from_fpstate+0x45/0xa0
[ 72.104965] Code: db 04 24 0f 1f 44 00 00 48 8b 0c 24 0f 1f 44 00 00 48 8b 05 d5 25 17 01 48 8d 79 40 48 21 d8 48 89 c2 48 c1 ea 20 48 0f ae 2f <48> 83 c4 08 5b 5d e9 d0 06 bd 00 48 8b 69 18 65 48 8b 05 64 4e 3e
[ 72.110282] RSP: 0018:ffffbceb0126fef8 EFLAGS: 00010056
[ 72.111707] RAX: 0000000000000007 RBX: 00000000000604ff RCX: ffff9a7f88e84c80
[ 72.113505] RDX: 0000000000000000 RSI: 00000000000604ff RDI: ffff9a7f88e84cc0
[ 72.115357] RBP: ffff9a7f88e82800 R08: 0000000000000000 R09: ffffbceb0126fb5c
[ 72.117149] R10: 0000000000000000 R11: 00000010c5dbe7c0 R12: ffff9a7f88e83c40
[ 72.118905] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 72.120684] ? switch_fpu_return+0x4b/0xd0
[ 72.121738] ? do_syscall_64+0x18f/0x1b0
[ 72.122795] ? entry_SYSCALL_64_after_hwframe+0x61/0xc6
[ 72.124095] ---[ end trace 714cd02b578aa77f ]---
[ 72.126030] traps: realmd[1126] general protection fault ip:7f54290d9ac1 sp:7ffe07040ad0 error:10 in libc-2.28.so[7f5428fb5000+1bc000]
Hello guys, This issue starts between kernel release 4.18.0-372.19.1 and 4.18.0-372.26.1.
I've uploaded the kernel.spec diff between both versions in HTML format, generated with the pkgdiff application.
(In reply to José Enrique from comment #42)
> Hello guys,
>
> This issue starts between kernel release 4.18.0-372.19.1 and 4.18.0-372.26.1.

Wow. Thanks very much for the bisect.

This even makes sense (TM) for migration. There is a patch there that changes how async page faults work.

@pbonzini, @peterx

Could you take a look?

commit 72285b20be4e2f9600cc210ee9e0294c9b2930b2
Author: Vitaly Kuznetsov <vkuznets>
Date:   Thu Apr 21 11:39:24 2022 +0200

    KVM: x86/mmu: make apf token non-zero to fix bug

    Bugzilla: https://bugzilla.redhat.com/2105340
    Y-Commit: 1286bb27b5a7b825db78a778c751bd5959fd9586

    O-Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2074835

    commit 6f3c1fc53d86d580d8d6d749c4af23705e4f6f79
    Author: Liang Zhang <zhangliang5>
    Date:   Tue Feb 22 11:12:39 2022 +0800

        KVM: x86/mmu: make apf token non-zero to fix bug

The code change in particular is:

@@ -3861,12 +3861,23 @@ static void shadow_page_table_clear_flood(struct kvm_vcpu *vcpu, gva_t addr)
 	walk_shadow_page_lockless_end(vcpu);
 }
 
+static u32 alloc_apf_token(struct kvm_vcpu *vcpu)
+{
+	/* make sure the token value is not 0 */
+	u32 id = vcpu->arch.apf.id;
+
+	if (id << 12 == 0)
+		vcpu->arch.apf.id = 1;
+
+	return (vcpu->arch.apf.id++ << 12) | vcpu->vcpu_id;
+}
+
 static bool kvm_arch_setup_async_pf(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 				    gfn_t gfn)
 {
 	struct kvm_arch_async_pf arch;
 
-	arch.token = (vcpu->arch.apf.id++ << 12) | vcpu->vcpu_id;
+	arch.token = alloc_apf_token(vcpu);
 	arch.gfn = gfn;
 	arch.direct_map = vcpu->arch.mmu->direct_map;
 	arch.cr3 = vcpu->arch.mmu->get_guest_pgd(vcpu);

And no, I have no clue why fixing the apf token value would break the fpu registers.

Could any of you confirm whether this could be the cause, and how to fix it?

The rest of the changes in that release are:
- docs
- amdgpu
- intel mmu
- sound

and those seem unlikely, as the problem happens on both AMD and Intel.

I don't know if they are using postcopy, so I am adding a needinfo in the next comment.
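To make the effect of the quoted hunk concrete, here is a small user-space model of the token computation before and after the patch. It is only a sketch in Python, not kernel code. The `Vcpu` class and `old_token` function are illustrative names, and the explicit masking stands in for the 32-bit wraparound of `u32` arithmetic in C. It shows what the patch changes and says nothing about whether this commit is related to the migration failure.

```python
U32_MASK = 0xFFFFFFFF  # emulate u32 truncation, which C does implicitly


class Vcpu:
    """Illustrative stand-in for the two fields the hunk touches."""

    def __init__(self, vcpu_id, apf_id=0):
        self.vcpu_id = vcpu_id   # models vcpu->vcpu_id
        self.apf_id = apf_id     # models vcpu->arch.apf.id (u32)


def old_token(vcpu):
    """Token computation before the patch: can yield 0 for vcpu 0."""
    token = ((vcpu.apf_id << 12) & U32_MASK) | vcpu.vcpu_id
    vcpu.apf_id = (vcpu.apf_id + 1) & U32_MASK
    return token


def alloc_apf_token(vcpu):
    """Token computation after the patch, following the quoted hunk."""
    # If the shifted id truncates to 0, restart the counter at 1.
    if ((vcpu.apf_id << 12) & U32_MASK) == 0:
        vcpu.apf_id = 1
    token = ((vcpu.apf_id << 12) & U32_MASK) | vcpu.vcpu_id
    vcpu.apf_id = (vcpu.apf_id + 1) & U32_MASK
    return token


if __name__ == "__main__":
    print(hex(old_token(Vcpu(0))))
    print(hex(alloc_apf_token(Vcpu(0))))
```

The shifted id truncates to zero whenever the low 20 bits of the counter are zero, which covers both a fresh counter and every wraparound of those bits. The patched helper handles both cases with the same check.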
Hi José Are you using postcopy when the problem happens? Later, Juan.
Juan,

Did you perhaps overlook Min's test result?

(In reply to Min Deng from comment #41)
> Should break from kernel-4.18.0-372.21.1.el8_6.x86_64

If this means it starts failing only from 21.1, then the offending commit should be within 20.1-21.1, am I right? I do see a bunch of fpu changes there indeed:

$ git ls kernel-4.18.0-372.20.1.el8_6..kernel-4.18.0-372.21.1.el8_6
567c79a430755 [redhat] kernel-4.18.0-372.21.1.el8_6
9fd907bc7276e Merge commit '9955186b3d934c470523ffd83729503b49d7f11c' into 8.6
9955186b3d934 Merge branch 'update-rhmaintainers' into 'main'
4d2e7e3d0d39e Merge branch 'mlx' into 'main'
4be65bd684449 Merge: Fix bad page state in process qemu-kvm when using TDP_MMU [rhel-8.6.z]
ed6d9b30e3cc7 Merge: Adding KVM AMX support / zstream 8.6
58593d4385bed update yaml2RHMAINTAINERS.go for better Description block
07f78e3278fd2 Merge branch 'memstick_changes' into 'main'
eb00c428aa048 Merge branch 'update-nfs-team' into 'main'
a11d2673428c3 owners.yaml: Correct memstick.h maintainer
002a4f7d3bcbc RHEL-only: KVM: selftests: Fix AArch64 compilation
6776e21d43f6c x86/fpu: KVM: Set the base guest FPU uABI size to sizeof(struct kvm_xsave)
8c3e840d9bec3 KVM: x86: Use ERR_PTR_USR() to return -EFAULT as a __user pointer
a1fb1a22c0398 KVM: x86: add system attribute to retrieve full set of supported xsave states
83804ad7d31f1 KVM: x86: Add a helper to retrieve userspace address from kvm_device_attr
6088f0e8b2796 tools: arch: x86: pull in pvclock headers
1f2f65b896a48 KVM: x86: Expose TSC offset controls to userspace
3e7e2be81c67e KVM: x86: Refactor tsc synchronization code
e9799ce21ad2b selftests: kvm: move vm_xsave_req_perm call to amx_test
18237434e7b50 RHEL-only: KVM: selftests: Remove unused modes
2c791f3cb1d8a tools headers UAPI: Sync linux/kvm.h with the kernel sources
7366a5e0663f2 kvm: selftests: sync uapi/linux/kvm.h with Linux header
fa0c388efd9a1 kvm: selftests: conditionally build vm_xsave_req_perm()
957359f9fad7e x86/kvm/fpu: Remove kvm_vcpu_arch.guest_supported_xcr0
37e8baced1a16 x86/kvm/fpu: Limit guest user_xfeatures to supported bits of XCR0
6f8de1daebca6 KVM: x86/cpuid: Exclude unpermitted xfeatures sizes at KVM_GET_SUPPORTED_CPUID
0d0b19f922815 KVM: x86: Move CPUID.(EAX=0x12,ECX=1) mangling to __kvm_update_cpuid_runtime()
0b831a986f896 KVM: x86/cpuid: Clear XFD for component i if the base feature is missing
f1ddf445cf92f KVM: x86: Do runtime CPUID update before updating vcpu->arch.cpuid_entries
4b165fa9e1b2f x86/fpu: Fix inline prefix warnings
570dc66c72f72 selftest: kvm: Add amx selftest
71b2b13149cb5 selftest: kvm: Move struct kvm_x86_state to header
a334679a1a519 selftest: kvm: Reorder vcpu_load_state steps for AMX
b8eedddf2acc0 kvm: x86: Disable interception for IA32_XFD on demand
41bb24919da7d x86/fpu: Provide fpu_sync_guest_vmexit_xfd_state()
9d8daf63538e3 kvm: selftests: Add support for KVM_CAP_XSAVE2
3dcfef223082e kvm: x86: Add support for getting/setting expanded xstate buffer
05d91f95a6b92 x86/fpu: Add uabi_size to guest_fpu
89f343bc7632c kvm: x86: Add CPUID support for Intel AMX
891e0ebaedb7e kvm: x86: Add XCR0 support for Intel AMX
9fb8021dbf842 kvm: x86: Disable RDMSR interception of IA32_XFD_ERR
0b947800b779f kvm: x86: Emulate IA32_XFD_ERR for guest
1ee527e0ee58f kvm: x86: Intercept #NM for saving IA32_XFD_ERR
8bda05485ea30 x86/fpu: Prepare xfd_err in struct fpu_guest
3c062dd3fd79f kvm: x86: Add emulation for IA32_XFD
dc6632b859d99 x86/fpu: Provide fpu_update_guest_xfd() for IA32_XFD emulation
d20b70310f7d7 kvm: x86: Enable dynamic xfeatures at KVM_SET_CPUID2
97bdbe0db20c3 x86/fpu: Provide fpu_enable_guest_xfd_features() for KVM
cc87f9a3b7ac9 x86/fpu: Add guest support to xfd_enable_feature()
395e8fb0ed032 x86/fpu: Make XFD initialization in __fpstate_reset() a function argument
fce076b59bce3 kvm: x86: Exclude unpermitted xfeatures at KVM_GET_SUPPORTED_CPUID
e7be8de766931 kvm: x86: Fix xstate_required_size() to follow XSTATE alignment rule
05b2dac029f2c x86/fpu: Prepare guest FPU for dynamically enabled FPU features
1157aec34ff7f x86/fpu: Extend fpu_xstate_prctl() with guest permissions
81befa094f374 kvm: selftests: move ucall declarations into ucall_common.h
bba5a6234628f kvm: selftests: move base kvm_util.h declarations to kvm_util_base.h
21b19cf1dde7b cpuid: kvm_find_kvm_cpuid_features() should be declared 'static'
ac6c7da53ecb3 KVM: x86: Make sure KVM_CPUID_FEATURES really are KVM_CPUID_FEATURES
01c0f72fc9772 KVM: x86: Add helper to consolidate core logic of SET_CPUID{2} flows
dff7d9f4545ce tools arch x86: Sync the msr-index.h copy with the kernel sources
ebca2a45c6e7e Nvidia: mellanox: Add Mohammad Kabat
850ee5796bce6 Nvidia: mellanox: s/Alaa/Amir/
2fb11c3ada0e9 owners: networking: add mlx5 documentation path
dfb972ebfddf9 owners: update NFS team
1808bea0a0028 KVM: x86/mmu: Don't advance iterator after restart due to yielding

Juan, is it worth trying a round of bisection? (Or maybe Paolo can already spot something in the list?)
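A commit list like the one above can be narrowed to the FPU and xsave related entries with a simple keyword filter before anyone reads it in detail. The following is a rough Python sketch. The keyword tuple is an assumption about which words mark a commit as FPU related, and the function name is illustrative. The sample subjects are copied from the list above.

```python
# Assumed keywords; adjust to taste. Matching is case-insensitive.
KEYWORDS = ("fpu", "xsave", "xstate", "xfd", "xcr0")


def fpu_related(subjects, keywords=KEYWORDS):
    """Return the commit lines whose text mentions any of the keywords."""
    return [s for s in subjects if any(k in s.lower() for k in keywords)]


if __name__ == "__main__":
    sample = [
        "6776e21d43f6c x86/fpu: KVM: Set the base guest FPU uABI size to sizeof(struct kvm_xsave)",
        "1f2f65b896a48 KVM: x86: Expose TSC offset controls to userspace",
        "3dcfef223082e kvm: x86: Add support for getting/setting expanded xstate buffer",
        "05d91f95a6b92 x86/fpu: Add uabi_size to guest_fpu",
        "ebca2a45c6e7e Nvidia: mellanox: Add Mohammad Kabat",
    ]
    for line in fpu_related(sample):
        print(line)
```

A filter like this only shortlists candidates for review. It cannot replace a bisection, since a regression may come from a commit whose subject mentions none of the keywords.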
(In reply to Juan Quintela from comment #46)
> Hi José
>
> Are you using postcopy when the problem happens?
>
> Later, Juan.

Hi Juan,

I can see the problem with both live migration methods, post-copy and pre-copy.