Bug 1445603 - Windows 2016 guest will crash after hot plug one vcpu
Summary: Windows 2016 guest will crash after hot plug one vcpu
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: qemu-kvm
Version: 8.0
Hardware: x86_64
OS: Windows
Priority: high
Severity: high
Target Milestone: rc
Target Release: 8.0
Assignee: ybendito
QA Contact: Yumei Huang
URL:
Whiteboard:
Depends On: 1377155
Blocks: 1473046 1558351 1649160 1746622
Reported: 2017-04-26 06:04 UTC by Guo, Zhiyi
Modified: 2021-03-31 08:45 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-12-01 07:28:38 UTC
Type: Bug
Target Upstream Version:
Embargoed:



Description Guo, Zhiyi 2017-04-26 06:04:20 UTC
Description of problem:
Boot a Windows 2016 guest with 2 GB of memory or less; the guest crashes after hot-plugging one vCPU.

Version-Release number of selected component (if applicable):
qemu-kvm-rhev-2.9.0-1.el7.x86_64
3.10.0-655.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot a Windows 2016 guest with the following command line:
/usr/libexec/qemu-kvm -name win2016 -m 2G -machine pc,accel=kvm \
        -S \
        -cpu qemu64,enforce \
        -smp 1,maxcpus=4 \
        -vnc :0 \
        -monitor stdio \
        -device VGA \
        -serial unix:/tmp/console,server,nowait \
        -drive file=/home/test1.qcow2,if=none,id=drive-scsi-disk0,format=qcow2,cache=none,werror=stop,rerror=stop \
        -device ide-drive,drive=drive-scsi-disk0 \
        -netdev tap,id=idinWyYp,vhost=on \
        -device e1000,mac=42:ce:a9:d2:4d:d7,id=idlbq7eA,netdev=idinWyYp \
        -qmp tcp:0:4444,server,nowait

2. After the guest boots, hot-plug one vCPU through QMP:
{ "execute": "qmp_capabilities" }
{ "execute": "device_add","arguments":{"driver":"qemu64-x86_64-cpu","core-id": 0, "thread-id":0, "socket-id": 1,"id":"core1"}}
3. Check the number of vCPUs inside the guest.

Actual results:
Guest will reboot immediately.

Expected results:
No reboot happens after CPU hotplug.

Additional info:
The issue does not occur if the guest is booted with 4G of RAM or more. It also does not occur with Windows 10.
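The QMP exchange in step 2 can also be driven from a script. The following is a minimal sketch, not part of the original report: it assumes the QMP socket from the command line above is reachable at 127.0.0.1:4444, and the helper names `build_device_add` and `hotplug_vcpu` are hypothetical.

```python
import json
import socket


def build_device_add(driver, dev_id, socket_id, core_id, thread_id):
    """Build the QMP device_add message used in step 2 of the report."""
    return {
        "execute": "device_add",
        "arguments": {
            "driver": driver,
            "id": dev_id,
            "socket-id": socket_id,
            "core-id": core_id,
            "thread-id": thread_id,
        },
    }


def read_message(stream):
    """Read one line-delimited JSON message from the QMP stream."""
    line = stream.readline()
    if not line:
        raise ConnectionError("QMP connection closed by peer")
    return json.loads(line)


def hotplug_vcpu(host="127.0.0.1", port=4444, timeout=10.0):
    """Negotiate QMP capabilities, then hot-plug one vCPU.

    host and port are assumptions matching '-qmp tcp:0:4444,server,nowait'.
    Returns every message received: greeting, replies, and any events.
    """
    received = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        stream = sock.makefile("rw", encoding="utf-8")
        received.append(read_message(stream))  # greeting sent on connect
        commands = [
            {"execute": "qmp_capabilities"},
            build_device_add("qemu64-x86_64-cpu", "core1", 1, 0, 0),
        ]
        for command in commands:
            stream.write(json.dumps(command) + "\n")
            stream.flush()
            # Asynchronous events may arrive before the command's reply.
            while True:
                message = read_message(stream)
                received.append(message)
                if "return" in message or "error" in message:
                    break
    return received
```

Calling `hotplug_vcpu()` against a running guest sends the same two commands as step 2; whether the guest then reboots has to be checked in the guest or in the QMP event stream.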

Comment 2 Igor Mammedov 2017-04-26 16:22:14 UTC
One probably needs to apply the workaround for WS2016's broken-by-default CPU hotplug
 https://bugzilla.redhat.com/show_bug.cgi?id=1377155#c17
to trigger the crash; otherwise Windows won't even try to online the hotplugged CPU.

Comment 4 Igor Mammedov 2017-04-26 16:48:32 UTC
The bug reproduces in both KVM and TCG modes. According to the KVM trace, the hotplugged CPU wakes up, but during bring-up it goes into a triple fault and the guest reboots.

Googling also shows that the same regression happens on VMware hosts.

Comment 8 ybendito 2019-06-13 08:53:04 UTC
The latest cumulative update for 2016 is KB4503267 (announced June 11).
It was probably expected to solve this problem, and the reboot no longer happens upon cpu-add.
But the CPU does not work, the PnP operation does not finish, and the system stops working correctly.
I'm running qemu with '-smp 2,maxcpus=4,sockets=4,cores=1,threads=1', then adding a 3rd CPU with 'cpu-add 2'.
msinfo32 does not work, taskmgr does not show tasks, and shutdown/reboot gets stuck.
All this happens when the memory size is set to 2G (2048M).
When it is set to 2080M, the CPU is added correctly.
Note that the same thing happens with 'core' server (without desktop experience), which does not declare 2G as minimal amount of memory.
I'm going to open a support ticket at Microsoft.
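For readers mapping 'cpu-add 2' onto the socket-id/core-id/thread-id arguments that device_add expects, here is a small sketch. It assumes the linear layout in which threads vary fastest, then cores, then sockets, and that the argument to cpu-add is that linear index; the names `CpuSlot`, `slot_for_index` and `free_slots` are hypothetical. On a live guest, QMP's `query-hotpluggable-cpus` is the authoritative source for the available slots.

```python
from typing import NamedTuple


class CpuSlot(NamedTuple):
    socket_id: int
    core_id: int
    thread_id: int


def slot_for_index(index, cores, threads):
    """Map a linear CPU index to (socket, core, thread) IDs.

    Assumes threads vary fastest, then cores, then sockets.
    """
    return CpuSlot(
        socket_id=index // (cores * threads),
        core_id=(index // threads) % cores,
        thread_id=index % threads,
    )


def free_slots(boot_cpus, maxcpus, cores, threads):
    """List the slots still open for hotplug after the boot CPUs are placed."""
    return [slot_for_index(i, cores, threads) for i in range(boot_cpus, maxcpus)]


# Topology from the comment above: -smp 2,maxcpus=4,sockets=4,cores=1,threads=1
third_cpu = slot_for_index(2, cores=1, threads=1)
```

Under these assumptions `third_cpu` is `CpuSlot(socket_id=2, core_id=0, thread_id=0)`, and `free_slots(2, 4, cores=1, threads=1)` lists sockets 2 and 3 as the remaining hotplug targets.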

Comment 9 ybendito 2019-06-13 09:21:52 UTC
Support request 119061321000566

Comment 10 ybendito 2019-07-04 13:17:58 UTC
According to Microsoft feedback: 
"the issue initially reported is in effect by a bug that affect Windows 2016 (it was solved in Windows 2019 in the KB4482887) that needs to be solved as soon as possible. According my notes from the develop team the solution for this bug is planned to be published with the last hotfix KB next month of August"
So, we will put this on hold till August and will check it with next cumulative update of 2016.

Comment 12 Igor Mammedov 2019-07-23 13:13:37 UTC
Reopening against RHEL 8 to keep track of a fix from the Microsoft side.

Comment 13 Marina Kalinin 2019-09-06 19:56:07 UTC
Is it even a realistic scenario for a Windows machine to have only 2G of RAM? I see they recommend a minimum of 512M, but in my experience it usually takes 4G+ to make things work.

Comment 15 Ademar Reis 2020-02-05 22:43:32 UTC
QEMU has been recently split into sub-components and as a one-time operation to avoid breakage of tools, we are setting the QEMU sub-component of this BZ to "General". Please review and change the sub-component if necessary the next time you review this BZ. Thanks

Comment 18 Yumei Huang 2020-11-19 02:53:48 UTC
The issue still exists on 8.3-av. 
(In reply to ybendito from comment #10)
> According to Microsoft feedback: 
> "the issue initially reported is in effect by a bug that affect Windows 2016
> (it was solved in Windows 2019 in the KB4482887) that needs to be solved as
> soon as possible. According my notes from the develop team the solution for
> this bug is planned to be published with the last hotfix KB next month of
> August"
> So, we will put this on hold till August and will check it with next
> cumulative update of 2016.

Hi Yuri, 

It seems KB4482887 is only provided for Windows 10 and 2019, according to [1]. Would you please double-check whether they will fix Windows 2016? Thanks.


[1] https://www.catalog.update.microsoft.com/Search.aspx?q=KB4482887

Comment 19 RHEL Program Management 2020-12-01 07:28:38 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

Comment 21 Yvugenfi@redhat.com 2020-12-21 13:15:40 UTC
As we cannot force MS to release a hotfix for Windows Server 2016, closing based on comment 18.

Comment 22 Peixiu Hou 2021-01-08 03:14:17 UTC
Also hit this issue with -m 14336 (mem > 4G) on a win2016 guest VM.

1. Boot win2016 with qemu commands:
MALLOC_PERTURB_=1  /usr/libexec/qemu-kvm \
    -name 'avocado-vt-vm1'  \
    -sandbox on  \
    -machine pc  \
    -nodefaults \
    -device VGA,bus=pci.0,addr=0x2 \
    -m 14336  \
    -smp 2,maxcpus=4,cores=2,threads=1,dies=1,sockets=2  \
    -cpu 'Skylake-Server',hv_stimer,hv_synic,hv_vpindex,hv_relaxed,hv_spinlocks=0xfff,hv_vapic,hv_time,hv_frequencies,hv_runtime,hv_tlbflush,hv_reenlightenment,hv_stimer_direct,hv_ipi,+kvm_pv_unhalt \
    -device pvpanic,ioport=0x505,id=idASHu6b \
    -device qemu-xhci,id=usb1,bus=pci.0,addr=0x3 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -device virtio-scsi-pci,id=virtio_scsi_pci0,num_queues=4,bus=pci.0,addr=0x4 \
    -blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/images/win2016-64-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off \
    -blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \
    -device scsi-hd,id=image1,drive=drive_image1,write-cache=on \
    -device virtio-net-pci,mac=9a:cf:62:20:54:41,id=ida7fdGT,netdev=idsWmwuh,bus=pci.0,addr=0x5  \
    -netdev tap,id=idsWmwuh,vhost=on \
    -blockdev node-name=file_cd1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/iso/windows/winutils.iso,cache.direct=on,cache.no-flush=off \
    -blockdev node-name=drive_cd1,driver=raw,read-only=on,cache.direct=on,cache.no-flush=off,file=file_cd1 \
    -device scsi-cd,id=cd1,drive=drive_cd1,write-cache=on  \
    -vnc :0  \
    -rtc base=localtime,clock=host,driftfix=slew  \
    -boot menu=off,order=cdn,once=c,strict=off \
    -qmp tcp:0:4445,server,nowait \
    -enable-kvm \
    -monitor stdio

2. {'execute': 'qmp_capabilities', 'id': 'i7jIHH13'}
{"return": {}, "id": "i7jIHH13"}
{'execute': 'device_add', 'arguments': {'id':'vcpu1','driver': 'Skylake-Server-x86_64-cpu', 'socket-id':1, 'die-id': 0, 'core-id':0, 'thread-id':0},'id': 'UBjvk5E2'}
{"return": {}, "id": "UBjvk5E2"}
{"timestamp": {"seconds": 1610025648, "microseconds": 249453}, "event": "RESET", "data": {"guest": true, "reason": "guest-reset"}}
{"timestamp": {"seconds": 1610025648, "microseconds": 262512}, "event": "RESET", "data": {"guest": true, "reason": "guest-reset"}}
{"timestamp": {"seconds": 1610025663, "microseconds": 181241}, "event": "RTC_CHANGE", "data": {"offset": 30512}}
{"timestamp": {"seconds": 1610025663, "microseconds": 181566}, "event": "RTC_CHANGE", "data": {"offset": 42932}}

3. Check the guest VM: the guest resets immediately after the CPU is hot-added.

used versions:
kernel-4.18.0-240.5.1.el8_3.x86_64
qemu-kvm-5.1.0-15.module+el8.3.1+8772+a3fdeccd.x86_64
virtio-win-1.9.15-0.el8
seabios-1.14.0-1.module+el8.3.0+7638+07cf13d2.x86_64

Best Regards~
Peixiu
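The RESET events in the QMP log of comment 22 are what mark the failure, so an automated check can scan for them. The sketch below is an illustration only, not the avocado-vt test that produced the log; `guest_resets` and `SAMPLE` are hypothetical names, and the sample lines are copied from the log above.

```python
import json


def guest_resets(lines):
    """Return (seconds, microseconds) for each guest-initiated RESET event."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            message = json.loads(line)
        except json.JSONDecodeError:
            # Skip lines that are not strict JSON, such as the
            # single-quoted commands echoed in the log.
            continue
        if not isinstance(message, dict):
            continue
        if message.get("event") != "RESET":
            continue
        if message.get("data", {}).get("guest"):
            stamp = message["timestamp"]
            hits.append((stamp["seconds"], stamp["microseconds"]))
    return hits


SAMPLE = [
    '{"return": {}, "id": "UBjvk5E2"}',
    '{"timestamp": {"seconds": 1610025648, "microseconds": 249453}, '
    '"event": "RESET", "data": {"guest": true, "reason": "guest-reset"}}',
    '{"timestamp": {"seconds": 1610025663, "microseconds": 181241}, '
    '"event": "RTC_CHANGE", "data": {"offset": 30512}}',
]
```

A non-empty result from `guest_resets` after a device_add reply indicates that the guest reset itself, which is the symptom reported here.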

Comment 23 Yumei Huang 2021-01-08 03:44:35 UTC
Hi Yuri, 

Would you please have a look at comment 22? The guest memory is more than 4G; is it the same Windows 2016 issue? Thanks!

Comment 24 ybendito 2021-01-08 12:59:05 UTC
Peixiu, do you have the VM fully updated?

Comment 25 Peixiu Hou 2021-01-12 14:20:38 UTC
(In reply to ybendito from comment #24)
> Peixiu, do you have the VM fully updated?

Hi Yuri,

Sorry for the late reply. The result in comment #22 was obtained without all updates installed.
I checked for the latest updates and installed them today. Three updates were installed: KB4593226, KB4576750, and KB4049065. After rerunning the job, the guest VM hangs after the CPU is hotplugged.

Thanks~
Peixiu

Comment 26 Peixiu Hou 2021-01-12 14:26:44 UTC
The qemu command line was:
/usr/libexec/qemu-kvm \
    -S  \
    -name 'avocado-vt-vm1'  \
    -sandbox on  \
    -machine pc  \
    -nodefaults \
    -device VGA,bus=pci.0,addr=0x2 \
    -m 14336  \
    -smp 2,maxcpus=4,cores=2,threads=1,dies=1,sockets=2  \
    -cpu 'Skylake-Server',hv_stimer,hv_synic,hv_vpindex,hv_relaxed,hv_spinlocks=0xfff,hv_vapic,hv_time,hv_frequencies,hv_runtime,hv_tlbflush,hv_reenlightenment,hv_stimer_direct,hv_ipi,+kvm_pv_unhalt \
    -chardev socket,nowait,path=/tmp/avocado_vjgk4b08/monitor-qmpmonitor1-20210112-090715-W7219jUP,id=qmp_id_qmpmonitor1,server  \
    -mon chardev=qmp_id_qmpmonitor1,mode=control \
    -chardev socket,nowait,path=/tmp/avocado_vjgk4b08/monitor-catch_monitor-20210112-090715-W7219jUP,id=qmp_id_catch_monitor,server  \
    -mon chardev=qmp_id_catch_monitor,mode=control \
    -device pvpanic,ioport=0x505,id=idMxL9VE \
    -chardev socket,nowait,path=/tmp/avocado_vjgk4b08/serial-serial0-20210112-090715-W7219jUP,id=chardev_serial0,server \
    -device isa-serial,id=serial0,chardev=chardev_serial0  \
    -chardev socket,id=seabioslog_id_20210112-090715-W7219jUP,path=/tmp/avocado_vjgk4b08/seabios-20210112-090715-W7219jUP,server,nowait \
    -device isa-debugcon,chardev=seabioslog_id_20210112-090715-W7219jUP,iobase=0x402 \
    -device qemu-xhci,id=usb1,bus=pci.0,addr=0x3 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -device virtio-scsi-pci,id=virtio_scsi_pci0,num_queues=4,bus=pci.0,addr=0x4 \
    -blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/images/win2016-64-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off \
    -blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \
    -device scsi-hd,id=image1,drive=drive_image1,write-cache=on \
    -device virtio-net-pci,mac=9a:3b:a1:89:70:38,id=idckNDqD,netdev=idZlnS4u,bus=pci.0,addr=0x5  \
    -netdev tap,id=idZlnS4u,vhost=on,vhostfd=20,fd=14 \
    -blockdev node-name=file_cd1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/iso/windows/winutils.iso,cache.direct=on,cache.no-flush=off \
    -blockdev node-name=drive_cd1,driver=raw,read-only=on,cache.direct=on,cache.no-flush=off,file=file_cd1 \
    -device scsi-cd,id=cd1,drive=drive_cd1,write-cache=on  \
    -vnc :1  \
    -rtc base=localtime,clock=host,driftfix=slew  \
    -boot menu=off,order=cdn,once=c,strict=off \
    -enable-kvm

Comment 27 ybendito 2021-01-12 14:45:25 UTC
Please open a BZ for that. This BZ is for hotplug with small memory size.
For new BZ please specify the qemu version, we'll need to check whether this is a regression of qemu or not.
Probably such test was done in the past for 2016.

Comment 28 Peixiu Hou 2021-01-13 10:28:00 UTC
(In reply to ybendito from comment #27)
> Please open a BZ for that. This BZ is for hotplug with small memory size.
> For new BZ please specify the qemu version, we'll need to check whether this
> is a regression of qemu or not.
> Probably such test was done in the past for 2016.

Ok, filed a new Bug 1915715 - Windows 2016 guest will reboot/hang/quit after hot plug a vcpu with large memory

Thanks a lot~
Peixiu

