Bug 1986665
Summary: | [Fwcfg64] dump-guest-memory -w command reports error "win-dump: failed to read CPU #2 ContextFrame location" on Windows desktop | |
---|---|---|---
Product: | Red Hat Enterprise Linux 9 | Reporter: | Peixiu Hou <phou>
Component: | qemu-kvm | Assignee: | Virtualization Maintenance <virt-maint>
qemu-kvm sub component: | General | QA Contact: | leidwang <leidwang>
Status: | CLOSED ERRATA | Docs Contact: |
Severity: | high | |
Priority: | high | CC: | ailan, lijin, lmiksik, mdean, menli, mrezanin, qizhu, virt-maint, vrozenfe, yvugenfi
Version: | 9.0 | Keywords: | Triaged
Target Milestone: | rc | |
Target Release: | 9.0 | |
Hardware: | x86_64 | |
OS: | Windows | |
Whiteboard: | | |
Fixed In Version: | qemu-kvm-7.2.0-1.el9 | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2023-05-09 07:19:27 UTC | Type: | ---
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | 2135806 | |
Bug Blocks: | 1972056 | |
Description
Peixiu Hou
2021-07-28 03:30:39 UTC
Hi Vadim,

I tested this issue on a RHEL 9.0 host; it can be reproduced.

Versions:
kernel-5.14.0-9.el9.x86_64
qemu-kvm-6.1.0-5.el9.x86_64
seabios-bin-1.14.0-7.el9.noarch
virtio-win-prewhql-214

BTW, will fwcfg be included in the virtio-win rpm package of RHEL 8.5.z, or will it start being included from RHEL 8.6.0/RHEL 9?

Thanks~
Peixiu

(In reply to Peixiu Hou from comment #12)

Thank you, Peixiu,

AFAIK there is no plan to release this driver in 8.5.z. It probably will be done in the 8.6/9.0 time frame.

Best regards,
Vadim.

(In reply to Vadim Rozenfeld from comment #13)

Got it, thank you~

I tested on win10 (OVMF) and hit the same issue; the details follow:

Package info:
qemu-kvm-6.1.0-6.el9.x86_64
kernel-5.14.0-12.el9.x86_64
edk2-ovmf-20210527gite1999b264f1f-7.el9.noarch
seabios-bin-1.14.0-7.el9.noarch
virtio-win-prewhql-214

1.
Boot a guest with the following command:

/usr/libexec/qemu-kvm \
-name "mouse-vm" \
-machine q35,pflash0=drive_ovmf_code,pflash1=drive_ovmf_vars \
-nodefaults \
-cpu 'Skylake-Server',hv_stimer,hv_synic,hv_vpindex,hv_reset,hv_relaxed,hv_spinlocks=0x1fff,hv_vapic,hv_time,hv-tlbflush,+kvm_pv_unhalt \
-device pcie-root-port,port=0x10,chassis=1,id=root0,bus=pcie.0,multifunction=on,addr=0x2 \
-device pcie-root-port,port=0x11,chassis=2,id=root1,bus=pcie.0,addr=0x2.0x1 \
-device pcie-root-port,port=0x12,chassis=3,id=root2,bus=pcie.0,addr=0x2.0x2 \
-device pcie-root-port,port=0x14,chassis=4,id=root3,bus=pcie.0,addr=0x2.0x3 \
-device pcie-root-port,port=0x15,chassis=5,id=root4,bus=pcie.0,addr=0x2.0x4 \
-device pcie-root-port,port=0x16,chassis=6,id=root5,bus=pcie.0,addr=0x2.0x5 \
-device pcie-root-port,port=0x17,chassis=7,id=root6,bus=pcie.0,addr=0x2.0x6 \
-device pcie-root-port,port=0x18,chassis=8,id=root7,bus=pcie.0,addr=0x2.0x7 \
-blockdev driver=file,cache.direct=on,cache.no-flush=off,filename=/mnt/123/win10-64-virtio-scsi.qcow2,node-name=drive_sys3 \
-blockdev driver=qcow2,node-name=drive-virtio-disk0,file=drive_sys3 \
-device virtio-scsi-pci,id=virtio_scsi_pci0,bus=root1 \
-device scsi-hd,id=image1,drive=drive-virtio-disk0,bus=virtio_scsi_pci0.0,channel=0,scsi-id=0,lun=0,bootindex=0 \
-device virtio-net-pci,mac=9a:36:83:b6:3d:05,id=idJVpmsF,netdev=id23ZUK6 \
-netdev tap,id=id23ZUK6,vhost=on \
-vga std \
-device vmcoreinfo \
-blockdev node-name=file_ovmf_code,driver=file,filename=/usr/share/OVMF/OVMF_CODE.secboot.fd,auto-read-only=on,discard=unmap \
-blockdev node-name=drive_ovmf_code,driver=raw,read-only=on,file=file_ovmf_code \
-blockdev node-name=file_ovmf_vars,driver=file,filename=/mnt/123/win10-64-virtio-scsi.qcow2_VARS.fd,auto-read-only=on,discard=unmap \
-blockdev node-name=drive_ovmf_vars,driver=raw,read-only=off,file=file_ovmf_vars \
-m 4096 \
-smp 6 \
-vnc :10 \
-boot order=cdn,once=c,menu=on,strict=on \
-enable-kvm \
-qmp tcp:0:3333,server,nowait \
-monitor stdio \
-rtc base=localtime,clock=host,driftfix=slew

2.
Send the QMP command:

{"execute": "human-monitor-command", "arguments": {"command-line": "dump-guest-memory -w /var/tmp/Memory.dmp"}, "id": "MofT1uZU"}

Actual result:

{"timestamp": {"seconds": 1638355510, "microseconds": 262315}, "event": "STOP"}
{"timestamp": {"seconds": 1638355510, "microseconds": 262810}, "event": "DUMP_COMPLETED", "data": {"result": {"total": 4294770688, "status": "failed", "completed": 0}, "error": "win-dump: failed to read CPU #4 ContextFrame location"}}
{"timestamp": {"seconds": 1638355510, "microseconds": 262856}, "event": "RESUME"}
{"return": "Error: win-dump: failed to read CPU #4 ContextFrame location\r\n", "id": "MofT1uZU"}

Additional info:
1. Changing '-smp 6' to '-smp 4' avoids the issue.
2. The issue is not hit on win2016 with the same qemu command line.

I've faced a similar issue with Windows 10 (10.0.19619) and QEMU 6.2.50:

(qemu) dump-guest-memory -w 1.dmp
Error: win-dump: failed to read CPU #2 ContextFrame location

So, QEMU found the CPU #0 and #1 contexts but failed to find #2.
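For reference, the HMP command above can also be driven programmatically over the QMP socket. Below is a minimal Python sketch, assuming QEMU was started with '-qmp tcp:0:3333,server,nowait' as in the command line above; the helper names and the simplistic single-line response handling are illustrative, not part of this report:

```python
import json
import socket


def hmp_via_qmp(command_line, request_id="dump-1"):
    """Build the QMP 'human-monitor-command' payload that wraps an HMP command."""
    return {
        "execute": "human-monitor-command",
        "arguments": {"command-line": command_line},
        "id": request_id,
    }


def send_dump_request(host="127.0.0.1", port=3333, path="/var/tmp/Memory.dmp"):
    """Connect to a running QEMU's QMP socket, negotiate capabilities, and
    request a Windows crash dump. Requires QEMU started with
    '-qmp tcp:0:3333,server,nowait'."""
    with socket.create_connection((host, port)) as sock:
        f = sock.makefile("rw")
        f.readline()  # discard the QMP greeting banner
        f.write(json.dumps({"execute": "qmp_capabilities"}) + "\n")
        f.flush()
        f.readline()  # discard the capabilities ack
        req = hmp_via_qmp("dump-guest-memory -w " + path)
        f.write(json.dumps(req) + "\n")
        f.flush()
        # Naive: reads one line; a real client must skip async events first.
        return json.loads(f.readline())


# Build (without sending) the exact payload used in the reproduction above.
payload = hmp_via_qmp("dump-guest-memory -w /var/tmp/Memory.dmp", "MofT1uZU")
print(json.dumps(payload))
```

Note that real QMP traffic interleaves asynchronous events (STOP, DUMP_COMPLETED, RESUME) with the command response, so a robust client should match responses on the "id" field rather than reading a single line.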
/usr/local/bin/qemu-system-x86_64 \
-name guest=win10,debug-threads=on \
-machine pc-q35-6.1,accel=kvm,usb=off,vmport=off,dump-guest-core=off,memory-backend=pc.ram \
-cpu Icelake-Server,ss=on,vmx=on,pdcm=on,hypervisor=on,tsc-adjust=on,avx512ifma=on,sha-ni=on,rdpid=on,fsrm=on,md-clear=on,stibp=on,arch-capabilities=on,xsaves=on,ibpb=on,amd-stibp=on,amd-ssbd=on,rdctl-no=on,ibrs-all=on,skip-l1dfl-vmentry=on,mds-no=on,pschange-mc-no=on,hle=off,rtm=off,clwb=off,intel-pt=off,la57=off,wbnoinvd=off,hv-time,hv-relaxed,hv-vapic,hv-spinlocks=0x1fff \
-m 6144 \
-object memory-backend-memfd,id=pc.ram,share=yes,size=6442450944 \
-overcommit mem-lock=off \
-smp 4 \
-monitor stdio \
-rtc base=localtime,driftfix=slew \
-no-hpet \
-device pcie-root-port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2 \
-device pcie-root-port,port=0x12,chassis=3,id=pci.3,bus=pcie.0,addr=0x2.0x2 \
-device pcie-root-port,port=0x13,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x3 \
-device pcie-root-port,port=0x14,chassis=5,id=pci.5,bus=pcie.0,addr=0x2.0x4 \
-device pcie-root-port,port=0x15,chassis=6,id=pci.6,bus=pcie.0,addr=0x2.0x5 \
-device pcie-root-port,port=0x16,chassis=7,id=pci.7,bus=pcie.0,addr=0x2.0x6 \
-drive file=/home/vp/vms/Win10x64_2004_19619.qcow2,format=qcow2 \
-usb \
-usbdevice tablet \
-spice port=5905,addr=0.0.0.0,disable-ticketing,image-compression=off,seamless-migration=on \
-device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pcie.0,addr=0x3 \
-nic user \
-device vmcoreinfo

QEMU is configured with -smp 4, but Windows discovers only 2 sockets with 1 core and 1 thread:

Item | Value
---|---
Processor | Intel Xeon Processor (Icelake), 3600 Mhz, 1 Core(s), 1 Logical Processor(s)
Processor | Intel Xeon Processor (Icelake), 3600 Mhz, 1 Core(s), 1 Logical Processor(s)

Also, Windows reports NumberProcessors = 2 through the crash dump header. The other 2 CPUs are not used by the system for some reason, so there are no context frames for them.
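The NumberProcessors value mentioned above can be read straight out of a dump file's header. A hedged Python sketch follows; the DUMP_HEADER64 field offsets are an assumption based on the commonly published layout (also mirrored in QEMU's win-dump code), so verify them against your SDK headers before relying on them:

```python
import struct

# Assumed offsets in the 64-bit Windows crash dump header (DUMP_HEADER64),
# per the widely published layout -- not taken from this bug report.
SIG_OFFSET = 0x00                # ASCII 'PAGE'
VALID_DUMP_OFFSET = 0x04         # ASCII 'DU64' for 64-bit dumps
NUMBER_PROCESSORS_OFFSET = 0x34  # ULONG32 NumberProcessors


def number_processors(header: bytes) -> int:
    """Read NumberProcessors from a raw 64-bit dump header blob."""
    if header[SIG_OFFSET:SIG_OFFSET + 4] != b"PAGE":
        raise ValueError("not a Windows dump header")
    if header[VALID_DUMP_OFFSET:VALID_DUMP_OFFSET + 4] != b"DU64":
        raise ValueError("not a 64-bit dump header")
    return struct.unpack_from("<I", header, NUMBER_PROCESSORS_OFFSET)[0]


# Synthetic header for demonstration: magic values plus NumberProcessors = 2,
# matching what the Windows 10 guest above reported despite '-smp 4'.
hdr = bytearray(0x1000)
hdr[0x00:0x04] = b"PAGE"
hdr[0x04:0x08] = b"DU64"
struct.pack_into("<I", hdr, NUMBER_PROCESSORS_OFFSET, 2)
print(number_processors(bytes(hdr)))  # → 2
```

Applied to a real Memory.dmp, reading the first page of the file and passing it to number_processors() would show the same 2-versus-4 discrepancy described in this comment.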
Based on comment #18, the issue needs more investigation.

The number of CPU sockets is very limited in desktop versions of Windows:
https://codeinsecurity.wordpress.com/2022/04/07/cpu-socket-and-core-count-limits-in-windows-10-and-how-to-remove-them/

For example, only 2 sockets are available on Windows 10 Pro. So, if such a guest Windows is running on a system with 4 sockets, only 2 sockets are actually utilized by the OS (this is what we are observing).

Besides that, the behavior of the '-smp X' option changed between QEMU 6.1 and 6.2:

6.1: prefer sockets over cores; '-smp X' means X sockets with 1 core each
6.2: prefer cores over sockets; '-smp X' means 1 socket with X cores

The same rules apply to machine types pc-q35-6.1 and pc-q35-6.2. In QEMU 7.2.50 the described logic is in hw/core/machine-smp.c.

To sum up, desktop Windows on QEMU with machine type <= 6.1 and the number of CPUs given as a plain '-smp X' (without topology details) may not fit into the socket limit. In case of such a discrepancy, 'dump-guest-memory -w' fails because it assumes that QEMU's number of CPUs is the same as the number of CPUs utilized by Windows. So, I'm preparing a patch to limit the number of CPUs processed by 'dump-guest-memory -w' to the number of CPUs reported by the guest Windows.

My NeedInfo has been removed since the TestBlocker that triggered it is no longer relevant.

QE bot (pre-verify): Set 'Verified:Tested,SanityOnly' as gating/tier1 tests pass.

Tested with qemu-kvm-7.2.0-1.el9: the command "dump-guest-memory -w /home/a.dmp" executes successfully, and the memory dump file is saved normally. Moving this bz to VERIFIED. Thanks!

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: qemu-kvm security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:2162
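To illustrate the '-smp X' expansion rules described above, here is a small Python sketch. It is purely illustrative (QEMU's actual parsing lives in hw/core/machine-smp.c and also handles dies and threads); the function names and the simplification to sockets and cores only are assumptions for this example:

```python
def expand_smp(cpus, prefer_sockets):
    """Expand a bare '-smp cpus' into (sockets, cores).

    prefer_sockets=True  models QEMU <= 6.1 / machine type pc-q35-6.1
    prefer_sockets=False models QEMU >= 6.2 / machine type pc-q35-6.2
    Simplified: dies and threads are taken to be 1, unlike real QEMU."""
    if prefer_sockets:
        return cpus, 1   # X sockets, 1 core each
    return 1, cpus       # 1 socket, X cores


def windows_visible_cpus(sockets, cores, socket_limit):
    """CPUs a desktop Windows edition actually uses, given its socket limit
    (e.g. 2 sockets on Windows 10 Pro)."""
    return min(sockets, socket_limit) * cores


# '-smp 6' under the 6.1 policy: 6 sockets, so Windows 10 Pro uses only 2 CPUs.
s, c = expand_smp(6, prefer_sockets=True)
print(windows_visible_cpus(s, c, socket_limit=2))  # → 2

# '-smp 6' under the 6.2 policy: 1 socket with 6 cores, so all 6 CPUs are used.
s, c = expand_smp(6, prefer_sockets=False)
print(windows_visible_cpus(s, c, socket_limit=2))  # → 6
```

Under the 6.1 policy, '-smp 6' leaves 4 vCPUs unused by Windows 10 Pro, which is exactly the discrepancy that makes 'dump-guest-memory -w' fail; specifying an explicit topology such as '-smp 6,sockets=1,cores=6' should sidestep the socket limit.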