Bug 1139928
Summary: win2012r2 guest would be black screen when booting vm
Product: Red Hat Enterprise Linux 7
Reporter: ShupingCui <scui>
Component: qemu-kvm-rhev
Assignee: Vadim Rozenfeld <vrozenfe>
Status: CLOSED WORKSFORME
QA Contact: Yiqian Wei <yiwei>
Severity: unspecified
Docs Contact: Jiri Herrmann <jherrman>
Priority: high
Version: 7.1
CC: coli, ehabkost, hhuang, huding, jherrman, jinzhao, juzhang, knoel, michen, pbonzini, ricky.schneberger, rkrcmar, shuang, virt-maint, vrozenfe, xuhan, yfu
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1253168
Environment:
Last Closed: 2017-11-09 07:41:42 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1253168, 1270165
Attachments:
Description
ShupingCui
2014-09-10 03:16:55 UTC
Created attachment 935968 [details]
kvm_stat
First suspect is QXL. I think that we already have such bug.
Does it happen before Windows boot screen?
If so can you please provide bugcheck code or post
the relevant screenshot image?

Thanks,
Vadim.

Created attachment 936874 [details]
screenshot
(In reply to Vadim Rozenfeld from comment #4)
> Does it happen before Windows boot screen?
> If so can you please provide bugcheck code or post
> the relevant screenshot image?
>
> Thanks,
> Vadim.

Hi Vadim,

Could you check comment 5 about screenshot images?

Thanks,
Shuping

(In reply to ShupingCui from comment #6)
> (In reply to Vadim Rozenfeld from comment #4)
> > Does it happen before Windows boot screen?
> > If so can you please provide bugcheck code or post
> > the relevant screenshot image?
> >
> > Thanks,
> > Vadim.
>
> Hi Vadim,
>
> Could you check comment 5 about screenshot images?
>
> Thanks,
> Shuping

Thanks Shuping,
Does it work well if you switch to std instead of qxl?

Best regards,
Vadim.

Vadim thinks that the issue is that hv-enlightenment + sandy-bridge turns on CPU flags that are missing. See the see-also bugs too.

Radim, Eduardo, will any of you take it?

(In reply to ShupingCui from comment #0)
[...]
> -cpu
> 'SandyBridge',+kvm_pv_unhalt,hv_relaxed,hv_spinlocks=0x1fff,hv_vapic,hv_time
> \
[...]

I am considering to start automatically closing any CPUID-related bug report where the "enforce" CPU flag is not used in the command-line. Please test using the "enforce" flag, otherwise there's no guarantee that all CPUID flags are being properly exposed to the guest.

(Using the "check" flag instead of "enforce" may be acceptable if there's any issue with "enforce" blocking the testing. But in this case, all the warnings shown by QEMU should be included in the bug report. Omitting both "check" and "enforce" is not acceptable, though.)

(In reply to Eduardo Habkost from comment #15)
> (In reply to ShupingCui from comment #0)
> [...]
> > -cpu
> > 'SandyBridge',+kvm_pv_unhalt,hv_relaxed,hv_spinlocks=0x1fff,hv_vapic,hv_time
> > \
> [...]
>
> I am considering to start automatically closing any CPUID-related bug report
> where the "enforce" CPU flag is not used in the command-line. Please test
> using the "enforce" flag, otherwise there's no guarantee that all CPUID
> flags are being properly exposed to the guest.
>
> (Using the "check" flag instead of "enforce" may be acceptable if there's
> any issue with "enforce" blocking the testing. But in this case, all the
> warnings shown by QEMU should be included in the bug report. Omitting both
> "check" and "enforce" is not acceptable, though.)

OK, I will try with using the "enforce" flag and let you know the result.

We need some additional testing to find out which CPU model change may be triggering it:

I am assuming that the bug is not reproduced using Westmere, and the differences between Westmere and SandyBridge are:
* level (11 vs 13)
* model (44 vs 42)
* CPU features: avx, xsave, tsc-deadline, x2apic, rdtscp

My main suspects are:
* model
* level/xsave/avx (level 13 enables the xsave CPUID leaf)
* x2apic

So, we need to check the following:
* Confirm if the bug is really not reproducible using -cpu Westmere,force
  * If it can be reproduced, then there's no need to test the combinations below. In that case, we need to check if Nehalem, Penryn, Conroe, and qemu64 allow the bug to be triggered.
* Confirm if the bug is reproducible using:
  -cpu Westmere,level=13,model=42,+avx,+xsave,+tsc-deadline,+x2apic,+rdtscp,enforce
  * It SHOULD be reproducible using the following, as it is equivalent to -cpu SandyBridge. If it can't be reproduced, please let us know (and there's no need to test the items below).
* Test using different flag combinations, to find out which flags really trigger the bug. My suggestion is to test them in the following order:
  * -cpu Westmere,model=42,enforce
  * -cpu Westmere,level=13,+avx,+xsave,enforce
  * -cpu Westmere,model=42,level=13,+avx,+xsave,enforce
  * -cpu Westmere,+x2apic,enforce
  * -cpu Westmere,model=42,+x2apic,enforce
  * -cpu Westmere,+tsc-deadline,enforce
  * -cpu Westmere,model=42,+tsc-deadline,enforce
  * -cpu Westmere,+rdtscp,enforce
  * -cpu Westmere,model=42,+rdtscp,enforce

I expect at least one of the above combinations to allow the bug to be reproduced.

Is it still reproducible with the most recent qemu and virtio-win drivers setup?

Thanks,
Vadim.

Hi Vadim,

It happens frequently.
qemu-kvm-rhev: qemu-kvm-rhev-2.2.0-16.el7.x86_64
virtio-win: virtio-win-1.7.4.iso

What do we need to provide?

Regards,
Suqin

This bug is getting old...

Suqin, qemu-kvm-rhev-2.2.0 is not the latest. Did you test with 2.3.0? Host kernel version?

Radim, is there a way to debug this through KVM on the host?

it can be reproduced with qemu-kvm-rhev-2.3.0-22.el7.x86_64

it can be reproduced with qemu-kvm-rhev-2.3.0-22.el7.x86_64

Hi Vadim,

QE team can reproduce it with comment#27.

(In reply to ShupingCui from comment #28)
> Hi Vadim,
>
> QE team can reproduce it with comment#27.

Thanks,
Can you confirm that it is still stuck in a black screen with no messages or debug information?

Best regards,
Vadim.

(In reply to Vadim Rozenfeld from comment #29)
> (In reply to ShupingCui from comment #28)
> > Hi Vadim,
> >
> > QE team can reproduce it with comment#27.
>
> Thanks,
> Can you confirm that it is still stuck in a black screen with no
> messages or debug information ?
>
> Best regards,
> Vadim.

Hi Shuang,

Can you reply to this?

Best Regards,
Junyi

Hi Vadim,

scui is reproducing the issue. We will check:
1). network status
2). top info
3). kvm_stat

Do you need any other messages?

Thanks
Suqin

Not messages, but can we try generating a crashdump file by triggering an NMI interrupt when the system is stuck with a black screen next time?

Thanks,
Vadim.
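Editorial sketch: the CPU-model bisection in Eduardo's test plan above can be generated rather than typed by hand. The snippet below is only an illustration, not something posted in the bug. It builds the `-cpu` argument strings in the suggested order. The function and constant names are hypothetical, and the baseline is written with `enforce` on the assumption that `-cpu Westmere,force` in the plan was meant to carry the same flag as every other entry.

```python
# Base model and the full set of differences from SandyBridge, as listed in the plan.
BASE_MODEL = "Westmere"
FULL_SANDYBRIDGE_DELTA = "level=13,model=42,+avx,+xsave,+tsc-deadline,+x2apic,+rdtscp"

# Property combinations in the order suggested in the plan.
# The empty string is the plain Westmere baseline.
VARIANTS = [
    "",
    FULL_SANDYBRIDGE_DELTA,
    "model=42",
    "level=13,+avx,+xsave",
    "model=42,level=13,+avx,+xsave",
    "+x2apic",
    "model=42,+x2apic",
    "+tsc-deadline",
    "model=42,+tsc-deadline",
    "+rdtscp",
    "model=42,+rdtscp",
]


def cpu_test_matrix():
    """Return one '-cpu' argument value per test run, each ending in 'enforce'."""
    matrix = []
    for variant in VARIANTS:
        parts = [BASE_MODEL]
        if variant:
            parts.append(variant)
        parts.append("enforce")  # assumption: every run uses enforce, including the baseline
        matrix.append(",".join(parts))
    return matrix


if __name__ == "__main__":
    for value in cpu_test_matrix():
        print("-cpu", value)
```

Each printed value would replace the `-cpu` argument of the reproducer command line; the first combination that shows the black screen narrows down which model change matters.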
(In reply to Vadim Rozenfeld from comment #32)
> Not messages, but can we try generating a crashdump file by triggering NMI
> interrupt when the system stuck with a black screen next time?
> Thanks,
> Vadim.

Hi Vadim,

Tried on host(3.10.0-310.el7.x86_64) and qemu-kvm-rhev-2.3.0-21.el7.x86_64:
when guest black screen, i cannot get a crashdump file by triggering NMI interrupt, always on '0% complete', using top and kvm_stat check status, the logs are:

# top
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
60115 root      20   0 4928004 3.957g  12136 S 401.0 17.0  10:01.39 qemu-kvm

# kvm_stat
kvm_entry 9096030 183936

Created attachment 1073901 [details]
screendumps
screendumps and vm_register log
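Editorial sketch: the thread does not say how the NMI was delivered to the guest. One way, assuming the guest was started with a QMP monitor on a Unix socket, is QEMU's `inject-nmi` QMP command. The snippet below is a minimal illustration, not the reporter's procedure. The function names are hypothetical, and the socket path passed to `inject_nmi` has to match whatever monitor the guest was actually started with.

```python
import json
import socket


def qmp_nmi_messages():
    """Return the QMP commands needed to inject an NMI, one JSON document per line."""
    commands = [
        {"execute": "qmp_capabilities"},  # leave capabilities-negotiation mode first
        {"execute": "inject-nmi"},        # ask QEMU to raise an NMI in the guest
    ]
    return [json.dumps(command).encode() + b"\r\n" for command in commands]


def inject_nmi(sock_path):
    """Send the commands to a QMP Unix socket and return the raw reply lines.

    Simplification: asynchronous QMP events may arrive between replies,
    so a line read here is not guaranteed to be the matching response.
    """
    replies = []
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(sock_path)
        reader = sock.makefile("rb")
        replies.append(reader.readline())  # server greeting
        for message in qmp_nmi_messages():
            sock.sendall(message)
            replies.append(reader.readline())
    return replies
```

Calling `inject_nmi` with the path of the guest's QMP socket while the screen is black would be the point at which Windows is expected to start writing the crash dump.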
(In reply to ShupingCui from comment #34)
> Created attachment 1073901 [details]
> screendumps
>
> screendumps and vm_register log

Many thanks for your feedback.
Does it only happen with virtio-scsi driver? How does it work with ide and virtio-blk devices?

Thanks,
Vadim.

(In reply to Vadim Rozenfeld from comment #35)
> (In reply to ShupingCui from comment #34)
> > Created attachment 1073901 [details]
> > screendumps
> >
> > screendumps and vm_register log
>
> Many thanks for your feedback.
> Does it only happen with virtio-scsi driver? How does it work with ide and
> virtio-blk devices?
>
> Thanks,
> Vadim.

I'm working on it with virtio-blk and ide now, will update the result when I have finished the test.

(In reply to Vadim Rozenfeld from comment #35)
> (In reply to ShupingCui from comment #34)
> > Created attachment 1073901 [details]
> > screendumps
> >
> > screendumps and vm_register log
>
> Many thanks for your feedback.
> Does it only happen with virtio-scsi driver? How does it work with ide and
> virtio-blk devices?
>
> Thanks,
> Vadim.

Tried with ide and virtio-blk devices, still can reproduce black screen issue.

(In reply to ShupingCui from comment #37)
> (In reply to Vadim Rozenfeld from comment #35)
> > (In reply to ShupingCui from comment #34)
> > > Created attachment 1073901 [details]
> > > screendumps
> > >
> > > screendumps and vm_register log
> >
> > Many thanks for your feedback.
> > Does it only happen with virtio-scsi driver? How does it work with ide and
> > virtio-blk devices?
> >
> > Thanks,
> > Vadim.
>
> Tried with ide and virtio-blk devices, still can reproduce black screen
> issue.

Thanks.
Can you please check if NMI can create a crash dump file with ide disk?
(In reply to Vadim Rozenfeld from comment #38)
> (In reply to ShupingCui from comment #37)
> > (In reply to Vadim Rozenfeld from comment #35)
> > > (In reply to ShupingCui from comment #34)
> > > > Created attachment 1073901 [details]
> > > > screendumps
> > > >
> > > > screendumps and vm_register log
> > >
> > > Many thanks for your feedback.
> > > Does it only happen with virtio-scsi driver? How does it work with ide and
> > > virtio-blk devices?
> > >
> > > Thanks,
> > > Vadim.
> >
> > Tried with ide and virtio-blk devices, still can reproduce black screen
> > issue.
>
> Thanks.
> Can you please check if NMI can create a crash dump file with ide disk?

Tried it, cannot create a crash dump file with ide using NMI. Do you have another method to create a dump file?

(In reply to ShupingCui from comment #39)
> (In reply to Vadim Rozenfeld from comment #38)
> > (In reply to ShupingCui from comment #37)
> > > (In reply to Vadim Rozenfeld from comment #35)
> > > > (In reply to ShupingCui from comment #34)
> > > > > Created attachment 1073901 [details]
> > > > > screendumps
> > > > >
> > > > > screendumps and vm_register log
> > > >
> > > > Many thanks for your feedback.
> > > > Does it only happen with virtio-scsi driver? How does it work with ide and
> > > > virtio-blk devices?
> > > >
> > > > Thanks,
> > > > Vadim.
> > >
> > > Tried with ide and virtio-blk devices, still can reproduce black screen
> > > issue.
> >
> > Thanks.
> > Can you please check if NMI can create a crash dump file with ide disk?
>
> Tried it, cannot create a crash dump file with ide using NMI, do you have
> other method to create dump file?

Unfortunately, NMI is the best option I can offer.
I will try reproducing this issue on my local system, with WinDbg attached.

Best regards,
Vadim

Cannot reproduce this bug with the latest version.

host version:
qemu-kvm-rhev-2.9.0-10.el7.x86_64
kernel-3.10.0-684.el7.x86_64
virtio-win-1.9.1-0.el7.noarch
guest: win2012r2-64bit

test steps
1. Boot a guest with virtio-scsi:
/usr/libexec/qemu-kvm \
-name win2012r2 \
-M pc \
-cpu SandyBridge,enforce \
-m 4096 \
-realtime mlock=off \
-smp 4,sockets=2,cores=2,threads=1 \
-uuid d7828e5c-3a66-421e-9e83-beea9457fcb8 \
-no-user-config \
-nodefaults \
-chardev socket,id=charmonitor,path=/tmp/monitor.sock,server,nowait \
-mon chardev=charmonitor,id=monitor,mode=control \
-chardev socket,id=serial_id_serial0,path=/tmp/serial-serial0,server,nowait \
-device isa-serial,chardev=serial_id_serial0 \
-chardev socket,id=seabioslog_id,path=/tmp/seabios123,server,nowait \
-device isa-debugcon,chardev=seabioslog_id,iobase=0x402 \
-rtc base=localtime,driftfix=slew \
-global kvm-pit.lost_tick_policy=delay \
-global PIIX4_PM.disable_s3=1 \
-global PIIX4_PM.disable_s4=1 \
-device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x5.0x7 \
-device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x5 \
-device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x5.0x1 \
-device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x5.0x2 \
-device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x7 \
-device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x6 \
-drive file=/root/win2012-64r2-virtio-scsi.qcow2,if=none,id=drive-scsi0-0-0-0,cache=none,snapshot=off,aio=native \
-device scsi-hd,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0 \
-netdev tap,id=hostnet0,vhost=on \
-device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:74:3e:aa,bus=pci.0,addr=0x3 \
-chardev spicevmc,id=charchannel0,name=vdagent \
-device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 \
-device usb-tablet,id=input0,bus=usb.0,port=1 \
-spice port=5932,disable-ticketing \
-device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,bus=pci.0,addr=0x2 \
-device intel-hda,id=sound0,bus=pci.0,addr=0x4 \
-device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 \
-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8 \
-boot order=cdn,once=c,menu=off \
-enable-kvm \
-monitor stdio \
-msg timestamp=on \
-drive id=drive_cd1,if=none,media=cdrom,file=/usr/share/virtio-win/virtio-win-1.9.1.iso \
-device scsi-cd,id=cd1,drive=drive_cd1 \
-monitor unix:/tmp/monitor2,server,nowait

2. Wait for the guest to boot up.
3. Repeat step 1 50 times with the script:

#!/bin/bash
for i in $(seq 1 50); do
    echo
    echo "===== This is the $i iteration ====="
    # start the guest in the background using the command line from step 1
    eval "sh /root/bz1139928/cmd.sh"&
    sleep 48
    # shut the guest down through the second monitor socket
    echo "system_powerdown" | nc -U /tmp/monitor2
    sleep 17
done

test results:
The win2012r2 guest boots up successfully; the guest does not go to a black screen.

Additional info:
1. Booting a guest with virtio-blk does not reproduce this bug.
2. host cpu info:
processor       : 7
vendor_id       : GenuineIntel
cpu family      : 6
model           : 42
model name      : Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
stepping        : 7
microcode       : 0x29
cpu MHz         : 2280.789
cache size      : 8192 KB
physical id     : 0
siblings        : 8
core id         : 3
cpu cores       : 4
apicid          : 7
initial apicid  : 7
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm epb tpr_shadow vnmi flexpriority ept vpid xsaveopt dtherm ida arat pln pts
bogomips        : 6784.06
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

# free -m
              total        used        free      shared  buff/cache   available
Mem:          11812        4485        6883          16         443        7039
Swap:          6015           0        6015

(In reply to Yiqian Wei from comment #45)
> can not reproduce this bug with latest version.
> host version:
> qemu-kvm-rhev-2.9.0-10.el7.x86_64
> kernel-3.10.0-684.el7.x86_64
> virtio-win-1.9.1-0.el7.noarch
> guest:win2012r2-64bit
>

Thank you.
Going to close the bug as worksforme.

Best regards,
Vadim.
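Editorial sketch: the host CPU info in comment #45 can be checked against the features that, per the test plan earlier in the thread, separate SandyBridge from Westmere (avx, xsave, tsc-deadline, x2apic, rdtscp). With `enforce`, QEMU refuses to start when the host cannot provide a requested feature, so a pre-check like the one below is only a convenience. The function name is hypothetical, and the flag spellings are the `/proc/cpuinfo` ones as they appear in that comment (for example `tsc_deadline_timer` rather than `tsc-deadline`).

```python
# /proc/cpuinfo spellings of the features that the test plan lists as the
# SandyBridge vs Westmere differences.
SANDYBRIDGE_EXTRA_FLAGS = {"avx", "xsave", "tsc_deadline_timer", "x2apic", "rdtscp"}


def missing_host_flags(cpuinfo_text, required=SANDYBRIDGE_EXTRA_FLAGS):
    """Return the required flags absent from the first 'flags' line, sorted."""
    for line in cpuinfo_text.splitlines():
        name, _, value = line.partition(":")
        if name.strip() == "flags":
            return sorted(set(required) - set(value.split()))
    # No 'flags' line found: report every required flag as missing.
    return sorted(required)


if __name__ == "__main__":
    with open("/proc/cpuinfo") as cpuinfo:
        print(missing_host_flags(cpuinfo.read()))
```

On the i7-2600 host quoted in comment #45, all five flags appear in the `flags` line, which is consistent with `-cpu SandyBridge,enforce` starting there.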