Created attachment 1272265 [details]
133-pass

Description of problem:
The job "PNP Rebalance Request New Resources Device Test" and two other jobs fail on win10+ guests.

Version-Release number of selected component (if applicable):
virtio-win-prewhql-133/135
qemu-kvm-rhev-2.8.0-5/6.el7.x86_64
kernel-3.10.0-574/634.el7.x86_64
seabios-1.10.1-2.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot a guest with qemupciserial:
/usr/libexec/qemu-kvm \
    -name 133QSRW10S64TGQ \
    -enable-kvm \
    -m 6G \
    -smp 8 \
    -uuid f21be03e-dc00-4892-acab-1e88c3303ad6 \
    -nodefconfig \
    -nodefaults \
    -chardev socket,id=charmonitor,path=/tmp/133QSRW10S64TGQ,server,nowait \
    -mon chardev=charmonitor,id=monitor,mode=control \
    -rtc base=localtime,driftfix=slew \
    -boot order=cd,menu=on \
    -device piix3-usb-uhci,id=usb \
    -drive file=133QSRW10S64TGQ,if=none,id=drive-ide0-0-0,format=raw,serial=mike_cao,cache=none \
    -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 \
    -drive file=en_windows_server_2016_x64_dvd_9327751.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw \
    -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 \
    -drive file=133QSRW10S64TGQ.vfd,if=floppy,id=drive-fdc0-0-0,format=raw,cache=none \
    -netdev tap,script=/etc/qemu-ifup,downscript=no,id=hostnet0 \
    -device e1000,netdev=hostnet0,id=net0,mac=00:52:6a:42:d4:3e \
    -chardev pty,id=charserial0 \
    -device isa-serial,chardev=charserial0,id=isa_serial0 \
    -device usb-tablet,id=input0 \
    -vnc 0.0.0.0:1 \
    -vga std \
    -M pc \
    -chardev socket,path=/tmp/133QSRW10S64TGQ_serial0,server,nowait,id=serial0 \
    -device pci-serial,chardev=serial0,id=pciserial0
2. Submit the job.

Actual results:
The jobs still fail, even with the filter applied.

Expected results:
The jobs pass, either outright or with the filter.

Additional info:
1. The jobs used to pass with the filter, but they now fail even with the former versions (listed below):
virtio-win-prewhql-133
qemu-kvm-rhev-2.8.0-5.el7.x86_64
kernel-3.10.0-574.el7.x86_64
seabios-1.10.1-2.el7.x86_64
2. Tried on 3 HLK servers; the jobs still fail.
3. For the logs of build 133 (pass) and build 135 (fail), refer to the attachments. The error is the same, but the jobs can no longer pass.
4. The following jobs failed:
DF - PNP Stop (Rebalance) Device Test (Reliability) failed on win10-32/64/ws2016
DF - PNP Rebalance Request New Resources Device Test (Reliability) failed on win10-32/64/ws2016
DF - PNP Remove Device Test (Reliability) only failed on win2016
Created attachment 1272267 [details]
135-fail

The error is the same as with build 133, but this time the filter does not make the jobs pass. They now fail with build 133 too.

WDTF_SIMPLE_IO : - Open(Communications Port (COM3) MF\PCI#VEN_1B36&DEV_0002&SUBSYS_11001AF4&REV_01\4&3227F39&0&28#CHILD0000 ) Failed : Device is reporting a problem code (Status Flags=0x1806400 (DN_HAS_PROBLEM DN_DISABLEABLE DN_REMOVABLE DN_NT_ENUMERATOR DN_NT_DRIVER) Problem Code=a (CM_PROB_FAILED_START)) HRESULT=0x80004005
WDTF_SIMPLE_IO : Device Status: Status Flags=0x1806400 (DN_HAS_PROBLEM DN_DISABLEABLE DN_REMOVABLE DN_NT_ENUMERATOR DN_NT_DRIVER) Problem Code=a (CM_PROB_FAILED_START)
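For anyone reading the WDTF log, here is a minimal Python sketch of how the status word 0x1806400 breaks down into the flag names printed next to it. The flag names and the problem code come from the log; the numeric bit values are quoted from memory of the Windows cfg.h header and should be treated as an assumption. The helper names (decode_status, decode_problem) are made up for illustration.

```python
# Devnode status bits named in the WDTF log. The numeric values are an
# assumption of this sketch (recalled from cfg.h), not taken from the log.
DN_FLAGS = {
    0x00000400: "DN_HAS_PROBLEM",
    0x00002000: "DN_DISABLEABLE",
    0x00004000: "DN_REMOVABLE",
    0x00800000: "DN_NT_ENUMERATOR",
    0x01000000: "DN_NT_DRIVER",
}

# Only the problem code that appears in the log is listed here.
CM_PROBLEMS = {0x0A: "CM_PROB_FAILED_START"}


def decode_status(flags):
    """Return (names of known bits set in flags, bits this table does not know)."""
    names = [name for bit, name in sorted(DN_FLAGS.items()) if flags & bit]
    known_mask = 0
    for bit in DN_FLAGS:
        known_mask |= bit
    return names, flags & ~known_mask


def decode_problem(code):
    """Map a problem code to its name, or a placeholder if it is not in the table."""
    return CM_PROBLEMS.get(code, "unknown problem 0x%x" % code)


names, leftover = decode_status(0x1806400)
print(names, hex(leftover), decode_problem(0xA))
```

Under these assumed values the five flags account for every bit in 0x1806400, and problem code 0xa is decimal 10, which is what Device Manager shows as Code 10.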
Created attachment 1272479 [details]
com-error

When running the jobs from comment#0, the device reports an error (Code 10), as shown in the screenshot:
Insufficient system resources exist to complete the API

Thanks
Yu Wang
Any update about this bug?
I was able to reproduce this and have a theory on what might be wrong. Gal, feel free to re-assign to me.
Debugging notes:

I can reproduce this consistently by running the test named "DF - PNP Rebalance Request New Resources Device Test (Reliability)". In the middle of the test, after resources are rebalanced, the COMn device gets the yellow exclamation point and reports Code 10, insufficient resources. Running "info pci" in HMP, I see the IRQ number changing and the I/O BAR staying the same.

Originally I thought that the problem is in QEMU not catching the IRQ update. Unlike the multi-port PCI serial device, the single-port one doesn't use pci_set_irq to set the interrupt so it doesn't (AFAICT) re-read the relevant part of the PCI config space. Sadly, fixing that didn't help.

Fortunately serial.sys shipped with Windows was compiled with some debugging info so I could do:

0: kd> bp serial!SerialDbgPrintEx "da rdx; g"

to get crude debug print outs and see the code flow. The second invocation of serial!SerialFinishStartDevice didn't seem to run to completion and sure enough, tracing the function confirmed that serial!SerialGetPortInfo called at this callstack:

serial!SerialFinishStartDevice+0x2b0
serial!SerialStartDevice+0xcf
serial!SerialPnpDispatch+0x390
nt!IovCallDriver+0x252

returned c000009a (STATUS_INSUFFICIENT_RESOURCES).
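As a side note on reading the return value, here is a small Python sketch that splits c000009a into the standard NTSTATUS bit fields (severity in bits 31-30, customer flag in bit 29, facility in bits 27-16, code in bits 15-0). The function name decode_ntstatus is made up for this illustration.

```python
def decode_ntstatus(value):
    """Split a 32-bit NTSTATUS value into its bit fields."""
    severities = ["SUCCESS", "INFORMATIONAL", "WARNING", "ERROR"]
    return {
        "severity": severities[(value >> 30) & 0x3],  # bits 31-30
        "customer": bool((value >> 29) & 0x1),        # bit 29
        "facility": (value >> 16) & 0xFFF,            # bits 27-16
        "code": value & 0xFFFF,                       # bits 15-0
    }


status = decode_ntstatus(0xC000009A)
print(status)
```

The value decodes to an error-severity, Microsoft-defined status with facility 0 and code 0x9a, which is consistent with the debugger reporting it as STATUS_INSUFFICIENT_RESOURCES.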
STATUS_INSUFFICIENT_RESOURCES is caused by the IRP not having the expected non-NULL parameters. Specifically for IRP_MN_START_DEVICE, arg1 and arg2 of the current IO stack location should contain AllocatedResources and AllocatedResourcesTranslated but are both NULL.

Let's see what kind of PnP IRPs the driver is getting:

0: kd> bp serial!SerialPnpDispatch "!irp @rdx; g"

...

First IRP_MN_START_DEVICE (all is good):

Irp is active with 3 stacks 2 is current (= 0xffffd8843744cf28)
 No Mdl: No System Buffer: Thread ffff9d88eb634040:  Irp stack trace.
     cmd  flg cl Device   File     Completion-Context
 [N/A(0), N/A(0)]
            0 10 00000000 00000000 00000000-00000000
                        Args: 00000000 00000000 00000000 00000000
>[IRP_MJ_PNP(1b), IRP_MN_START_DEVICE(0)]
            0 e0 ffff9d88eb6f3070 00000000 fffff808443c1200-ffffc800a30c8628 Success Error Cancel
               \Driver\Serial   serenum!SerenumSyncCompletion
                        Args: ffff88046d5c3690 ffff88046d6f8db0 00000000 00000000
 [IRP_MJ_PNP(1b), IRP_MN_START_DEVICE(0)]
            0 e0 ffff9d88eb6f3d60 00000000 fffff80200f73088-ffff9d88ebc27400 Success Error Cancel
               \Driver\Serenum  nt!PnpDeviceCompletionRoutine
                        Args: ffff88046d5c3690 ffff88046d6f8db0 00000000 00000000

...

Second IRP_MN_START_DEVICE (device fails to start):

Irp is active with 3 stacks 2 is current (= 0xffffd884373c6f28)
 No Mdl: No System Buffer: Thread ffff9d88ebef2040:  Irp stack trace.
     cmd  flg cl Device   File     Completion-Context
 [N/A(0), N/A(0)]
            0 10 00000000 00000000 00000000-00000000
                        Args: 00000000 00000000 00000000 00000000
>[IRP_MJ_PNP(1b), IRP_MN_START_DEVICE(0)]
            0 e0 ffff9d88eb6f3070 00000000 fffff808443c1200-ffffc800a3662628 Success Error Cancel
               \Driver\Serial   serenum!SerenumSyncCompletion
                        Args: 00000000 00000000 00000000 00000000
 [IRP_MJ_PNP(1b), IRP_MN_START_DEVICE(0)]
            0 e0 ffff9d88eb6f3d60 00000000 fffff80200f73088-ffff9d88e9e95880 Success Error Cancel
               \Driver\Serenum  nt!PnpDeviceCompletionRoutine
                        Args: 00000000 00000000 00000000 00000000

They look the same except for the missing Args.
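To make the comparison between the two dumps less error-prone, here is a rough Python sketch that pulls the Args of the current (">"-marked) stack location out of "!irp" text and checks whether the first two are non-zero. The sample strings are trimmed copies of the current stack location from the two dumps in this comment; the function names are made up, and the parser only assumes the line layout seen here rather than any documented output format.

```python
import re

# Current stack location of the first (good) and second (failing) IRP,
# trimmed from the debugger output quoted in this comment.
FIRST_START = r"""
>[IRP_MJ_PNP(1b), IRP_MN_START_DEVICE(0)]
 0 e0 ffff9d88eb6f3070 00000000 fffff808443c1200-ffffc800a30c8628 Success Error Cancel
 \Driver\Serial serenum!SerenumSyncCompletion
 Args: ffff88046d5c3690 ffff88046d6f8db0 00000000 00000000
"""

SECOND_START = r"""
>[IRP_MJ_PNP(1b), IRP_MN_START_DEVICE(0)]
 0 e0 ffff9d88eb6f3070 00000000 fffff808443c1200-ffffc800a3662628 Success Error Cancel
 \Driver\Serial serenum!SerenumSyncCompletion
 Args: 00000000 00000000 00000000 00000000
"""


def current_start_device_args(dump):
    """Return the Args of the '>'-marked IRP_MN_START_DEVICE location, or None."""
    in_current = False
    for line in dump.splitlines():
        stripped = line.strip()
        if stripped.startswith("["):
            in_current = False  # a different, non-current stack location begins
        elif stripped.startswith(">["):
            in_current = "IRP_MN_START_DEVICE" in stripped
        match = re.match(r"Args:\s+(.*)", stripped)
        if in_current and match:
            return [int(word, 16) for word in match.group(1).split()]
    return None


def resources_present(args):
    """True if arg1 and arg2 (the two resource list pointers) are both non-NULL."""
    return args is not None and args[0] != 0 and args[1] != 0


first_args = current_start_device_args(FIRST_START)
second_args = current_start_device_args(SECOND_START)
print(resources_present(first_args), resources_present(second_args))
```

Run against the two excerpts, the first start request carries both resource pointers and the second carries neither, which matches the observation that the dumps differ only in the missing Args.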
Posted a question on the ntdev list: http://www.osronline.com/showthread.cfm?link=283584
I have tried assigning resources to the port with a simple:

HKR,Child0000,ResourceMap,1,00,01,02

instead of:

HKR,Child0000,VaryingResourceMap,1,00, 00,00,00,00, 08,00,00,00
HKR,Child0000,ResourceMap,1,02

but it didn't help.
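For readers unfamiliar with these entries, here is a hedged Python sketch of how the binary payloads can be read. It assumes the layout I remember from the Microsoft documentation for mf.sys: the leading "1" in the AddReg line is the flags field marking the value as binary, each VaryingResourceMap entry is a 1-byte parent resource index followed by a 4-byte little-endian offset and a 4-byte little-endian length, and ResourceMap is a plain list of 1-byte parent resource indexes. The function names are made up for illustration.

```python
import struct

# Assumed entry layout: 1-byte index, 4-byte LE offset, 4-byte LE length.
ENTRY = struct.Struct("<BII")


def parse_varying_resource_map(data):
    """Decode a VaryingResourceMap payload into (index, offset, length) tuples."""
    if len(data) % ENTRY.size:
        raise ValueError("payload is not a whole number of 9-byte entries")
    return [ENTRY.unpack_from(data, pos) for pos in range(0, len(data), ENTRY.size)]


def parse_resource_map(data):
    """Decode a ResourceMap payload into a list of parent resource indexes."""
    return list(data)


# Byte values copied from the INF lines above, without the leading flags field.
varying = parse_varying_resource_map(bytes.fromhex("00 00000000 08000000"))
fixed = parse_resource_map(bytes.fromhex("02"))
simple = parse_resource_map(bytes.fromhex("00 01 02"))
print(varying, fixed, simple)
```

Read this way, the original INF hands the child 8 bytes at offset 0 of parent resource 0 plus all of parent resource 2, while the simplified variant hands it parent resources 0, 1 and 2 whole.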
This is very likely a Windows bug. Here's a brief overview of the architecture of the qemupciserial driver:

We ship qemupciserial.inf, which references the in-box Windows MF (multi-function) driver and provides a recipe for splitting resources among individual UARTs. This is what the VaryingResourceMap and ResourceMap entries are for. They are read by MF.sys, which acts like a bus driver and enumerates a PNP0501 device for each UART; each of those devices is then driven by serial.sys. There is a Windows-internal and undocumented interface between MF.sys and serial.sys to communicate this resource allocation.

In order for these HLK tests to pass, MF.sys must be able to handle resource rebalancing and do the right thing with respect to its child devices. And that seems to be broken. I have found reports of MF.sys crashing:
https://social.msdn.microsoft.com/Forums/en-US/1003a2be-3463-4601-ae91-55cacc29904c/bsod-when-disabling-or-uninstalling-mfsys
and have experienced a verifier violation in MF.sys myself:
https://www.osronline.com/showthread.cfm?link=283584#T5

So blaming MF.sys for this is a plausible theory. Unfortunately we can't write our own bus driver to replace MF because of its undocumented resource arbitration protocol. If we wanted to write something, it would have to be a full UART driver and that is not trivial. But, luckily, we support only the 1x flavor of the QEMU PCI serial device, so in theory we should be able to make serial.sys drive the UART without MF.sys. That's the next thing to try.
An .inf with no MF.sys dependency appears to work and passes the test for me. Fix committed as: https://github.com/virtio-win/kvm-guest-drivers-windows/commit/539da1a0f8f1d233051f90fef3ee620527c946e6 Note that we now build two copies of the qemupciserial driver. The upstream one supports all three devices and stays in the same location (root of the internal pre-WHQL build). The RHEL driver supports only the 1x device and will be dropped to a new 'rhel' directory in the internal pre-WHQL build. Please use the one in 'rhel' for testing.
With build 137, all WHQL jobs passed. Thanks, Ladi.

Changing the status to VERIFIED.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:2341