Description of problem:
A BSOD happens on Windows startup or reboot with a vioscsi system drive. It happens with Win 8.1 (x86/x64). There is no crash dump on the guest; the blue screen always stays at 0%. The problem was investigated with a kernel debugger attached.

Version-Release number of selected component (if applicable):
Seen with virtio-win 214 and 215; other versions were not tried.
Note that very similar behavior (probably the same) also happened with all the enlightenments disabled.

How reproducible:
A loop of 200 tests on avocado with "testcase=single_driver_install" on Win8.1 32/64 reproduced the problem 2/2.

Steps to Reproduce:
I used the following technique to investigate the problem:
1. Prepare a windbg host machine on the same network; say its IP is 10.x.x.x.
2. Avocado script for the first run:
   #!/bin/bash
   #OSLIST=Win2016,Win8..1,Win2019,Win2012,Win10
   OSLIST=Win8..1
   TEST=--testcase=single_driver_install
   python3 ConfigTest.py $TEST --guestname=$OSLIST \
       --platform=x86_64,i386 --clone=yes --machines=q35 --debug \
       --customsparams='cpu_model_flags +=,hv_vendor_id=KVMKVM\nnics += " nic2"\nnic_model_nic2 = "e1000e"'
3. During OS installation, manually configure the kernel debugger:
   bcdedit /set debug on
   bcdedit /set dbgsettings net hostip:10.x.x.x port:50000 key:1.2.3.4
4. For the subsequent large loop, add --nrepeat=200 and change to --clone=no.
5. Increase the avocado timeout to prevent a guest reboot when the login timeout expires (ask Avocado experts how exactly to do that; I used a locally patched avocado-vt). Otherwise the default timeout might be as short as ~4 min.

For an example dump file (collected using windbg), the exact qemu command line, etc., see https://bugzilla.redhat.com/show_bug.cgi?id=1868572#c147
Hi Qing, Peixiu, Menghuan,

Please help check the above comment 2, and decide if we need to add this parameter to our automation tests. Thanks.
Please also consider whether layered products use it by default.
Hi Vadim,

CNV does not seem to use this parameter by default, so if we must have it, we might need to ask for a documentation update. Thanks.
(In reply to Qianqian Zhu from comment #4)
> Hi Vadim,
>
> CNV does not seem to use this parameter by default, so if we must have it,
> we might need to ask for a documentation update. Thanks.

Probably not.
Both viostor and vioscsi can auto-generate a serial number by themselves if this parameter is not present in the QEMU command line.
The only problem with this approach is that the serial will change if the PCI topology (the virtio device controller location) changes too, while the serial reported from the qemu command line is persistent, at least as long as we keep it unchanged.

Best,
Vadim.
(In reply to Vadim Rozenfeld from comment #5)
> (In reply to Qianqian Zhu from comment #4)
> > Hi Vadim,
> >
> > CNV does not seem to use this parameter by default, so if we must have it,
> > we might need to ask for a documentation update. Thanks.
>
> Probably not.
> Both viostor and vioscsi can auto-generate a serial number by themselves if
> this parameter is not present in the QEMU command line.
> The only problem with this approach is that the serial will change if
> the PCI topology (the virtio device controller location) changes too,
> while the serial reported from the qemu command line is persistent,
> at least as long as we keep it unchanged.
>
> Best,
> Vadim.

Hi Qianqian,

I was wrong in my previous answer. We don't generate VPD page 0x80 (serial number), only page 0x83.
So we need to explicitly specify the serial number to help Windows generate the DUID:
https://learn.microsoft.com/en-us/windows-hardware/drivers/storage/device-unique-identifiers--duids--for-storage-devices

Best,
Vadim.
Hi Germano,

Would you please help check the above comments about the qemu storage option 'serial'? It is recommended to specify it in the qemu command line; otherwise users could hit this BSOD issue with older versions of the virtio-win driver. I suppose we need to document it, but I am not sure in which type of document, and we should probably recommend that layered products like CNV specify it by default (currently they don't).

Regards,
Qianqian
Hi Qianqian,

If it's really just 8.1 and with older virtio-win, I don't think this is important enough for a proactive KCS.
RHV will always add that serial, and on CNV it's optional.

But given that no customer has complained yet, I'd expect newer installs to use newer virtio-win, and most likely 8.1 will not be used, as it's already EOL:
https://learn.microsoft.com/en-us/lifecycle/products/windows-81

So just to confirm: if any of these is true, the problem does not happen?
a) Windows > 8.1
b) Latest virtio-win

If my understanding is not correct, please let me know.

Thanks,
Germano
(In reply to Germano Veit Michel from comment #10)
> Hi Qianqian,
>
> If it's really just 8.1 and with older virtio-win, I don't think this is
> important enough for a proactive KCS.
> RHV will always add that serial, and on CNV it's optional.
>
> But given that no customer has complained yet, I'd expect newer installs to
> use newer virtio-win, and most likely 8.1 will not be used, as it's already EOL:
> https://learn.microsoft.com/en-us/lifecycle/products/windows-81
>
> So just to confirm: if any of these is true, the problem does not happen?
> a) Windows > 8.1
> b) Latest virtio-win
>
> If my understanding is not correct, please let me know.
>
> Thanks,
> Germano

Hi Germano,

It would be good to provide "serial" for the newest Windows OSes too.
The following MSFT resource
https://learn.microsoft.com/en-us/windows-hardware/drivers/storage/device-unique-identifiers--duids--for-storage-devices
mentions that the information in STORAGE_DEVICE_UNIQUE_IDENTIFIER is a combination of:
- STORAGE_DEVICE_ID_DESCRIPTOR, which we generate automatically, based on the controller's location on the PCI bus, and
- STORAGE_DEVICE_DESCRIPTOR, the information provided by qemu with the parameter "serial" (for both virtio-scsi and virtio-blk).

All the best,
Vadim.
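For illustration only, a minimal sketch of how an explicit serial might be passed for both device types mentioned above. The helper names, drive ids, image paths, and serial values are hypothetical placeholders and are not taken from this report; the script only prints the arguments and does not start QEMU.

```shell
#!/bin/bash
# Sketch only: print QEMU arguments that attach a disk with an explicit serial.
# Drive ids, image paths and serial values below are hypothetical placeholders.

# virtio-scsi: the serial is set on the scsi-hd device behind the controller.
scsi_disk_args() {
    local id="$1" image="$2" serial="$3"
    printf '%s\n' \
        "-device virtio-scsi-pci,id=scsi0" \
        "-drive file=${image},if=none,id=${id},format=qcow2" \
        "-device scsi-hd,bus=scsi0.0,drive=${id},serial=${serial}"
}

# virtio-blk: the serial is set on the virtio-blk-pci device itself.
blk_disk_args() {
    local id="$1" image="$2" serial="$3"
    printf '%s\n' \
        "-drive file=${image},if=none,id=${id},format=qcow2" \
        "-device virtio-blk-pci,drive=${id},serial=${serial}"
}

scsi_disk_args drive0 /path/to/win81.qcow2 WIN81SYS0001
blk_disk_args drive1 /path/to/data.qcow2 WIN81DATA0001
```

The printed lines can be appended to an existing qemu command line; keeping the serial value stable across PCI topology changes is the point of setting it explicitly.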
Hi Germano,

Thanks for your reply.
It is true that the issue won't happen on Windows > 8.1 + latest virtio-win, so I agree documentation for this specific issue might not be necessary.
However, as Vadim suggested, a serial number is recommended by MS. I just worry that if CNV does not specify it, it may lead to other problems. So what we are trying to do is to ask for documentation (or probably another approach) to recommend that layered products enable it, to avoid any potential customer issue.
I am okay if you find there is no risk in keeping their current configuration. Please let me know your decision; we might update our test plan accordingly. Thanks a lot.

Regards,
Qianqian
Thanks Vadim and Qianqian

(In reply to Qianqian Zhu from comment #12)
> It is true that the issue won't happen on Windows > 8.1 + latest virtio-win,
> so I agree documentation for this specific issue might not be necessary.

OK, thanks. So we agree a KCS is not needed.

> However, as Vadim suggested, a serial number is recommended by MS. I just
> worry that if CNV does not specify it, it may lead to other problems.
> So what we are trying to do is to ask for documentation (or probably
> another approach) to recommend that layered products enable it, to avoid
> any potential customer issue.

I was reading the linked Microsoft page, and it seems a serial can indeed help if a device changes location, since it gives more information for the disk to be properly enumerated.
It is also useful on Linux: we recommend that customers use the disk id too, which is based on the serial. However, on Linux there is no bug that crashes the OS as on Win 8.1.

So I agree with you. I think we can discuss with the CNV team adding a serial by default for new VMs, like RHV does. It could possibly be added for all VMs though, not just Windows.

I'll open an RFE against CNV and link it here.

Thank you!
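As a rough illustration of the Linux side mentioned above, the sketch below maps a virtio-blk serial to the by-id symlink that the standard udev persistent-storage rules normally create. It assumes those default rules are in place; the helper name and the serial value are hypothetical.

```shell
#!/bin/bash
# Sketch only: derive the by-id path for a virtio-blk disk from its serial,
# assuming the default udev persistent-storage rules (virtio-<serial>).
virtio_by_id_path() {
    printf '/dev/disk/by-id/virtio-%s\n' "$1"
}

serial="WIN81DATA0001"   # hypothetical serial passed on the qemu command line
path="$(virtio_by_id_path "$serial")"

if [ -e "$path" ]; then
    # Show which kernel device the stable name currently points to.
    echo "$path -> $(readlink -f "$path")"
else
    echo "no by-id link for serial $serial (serial not set, or not a virtio-blk disk)"
fi
```

Referring to the disk through this path instead of /dev/vdX keeps the reference stable when the device is moved to a different PCI location.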