Bug 835872
Summary: | [WHQL]Windows 8 cannot resume from S4 with ide driver | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Yvugenfi <yvugenfi> |
Component: | qemu-kvm | Assignee: | John Snow <jsnow> |
Status: | CLOSED NOTABUG | QA Contact: | Virtualization Bugs <virt-bugs> |
Severity: | urgent | Docs Contact: | |
Priority: | urgent | ||
Version: | 7.0 | CC: | amit.shah, bcao, bsarathy, juzhang, knoel, lijin, mdeng, mkenneth, qzhang, rbalakri, stefanha, tburke, virt-maint |
Target Milestone: | rc | ||
Target Release: | 7.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | | Doc Type: | Bug Fix |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2015-04-28 16:03:41 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 921526, 923626, 1105334, 1205796 |
Description
Yvugenfi@redhat.com
2012-06-27 11:45:28 UTC
Why do you think this is a qemu bug? Do you have an analysis of what's happening? The description is too terse for someone not familiar with Windows to make sense of.

I wasn't sure which component to open the bug against. Windows was used without any virtio devices (as you can see from the command line), so this is definitely not a problem in the virtio-win component.

Theoretically, yes, it can still be a problem with the storage driver on Windows. Or it can be a seabios issue. I will spend more time investigating to get additional info. I just wanted to have a placeholder for this issue.

Cannot reproduce this bug with virtio-block.

Looks like the problem is in the Windows 8 IDE driver. Debugging shows that Windows uses a cyclic buffer to read the hibernate image from disk and, at the same time, decompresses the image from the same buffer. At some point new data from the disk starts to overwrite data that is still being decompressed. If I slow down IDE reads substantially in QEMU by inserting deliberate delays, resume succeeds.

(In reply to comment #5)
> Cannot reproduce this bug with virtio-block

Hi, Yan
Does it reproduce 100% of the time? I tried 3 times and cannot reproduce it. What's more, I ran win8 balloon/serial WHQL jobs recently and did not hit this issue either.

(In reply to comment #7)
> Does it 100% reproduce ?
> I tried 3 times ,can not reproduce .
> What's more .I run win8 balloon/serial whql jobs recently ,did not hit this
> issue as well.

What is your command line? How do you know it did not reproduce? For me it does not print an error message like in comment #0, but shuts down immediately. On the next start it does a regular boot.

(In reply to comment #8)
> (In reply to comment #7)
> > Does it 100% reproduce ?
> > I tried 3 times ,can not reproduce .
> > What's more .I run win8 balloon/serial whql jobs recently ,did not hit this
> > issue as well.
> What is you command line? How do you know it did not reproduce? For me it
> does not print an error message like in comment #0 but shutdown immediately.
> On next start it does regular boot.

Steps:
1. Start the guest. CLI:
usr/libexec/qemu-kvm -boot dc -m 4G -smp 4 --nodefaults -cpu cpu64-rhel6,+x2apic -usb -device usb-tablet -netdev tap,sndbuf=0,id=hostnet2,script=/etc/qemu-ifup,downscript=no -device e1000,netdev=hostnet2,mac=00:52:13:20:F5:22,bus=pci.0,addr=0x6 -uuid 7976cd92-6557-493d-86a3-7e2055a2d4cd -no-kvm-pit-reinjection -monitor stdio -rtc base=localtime,clock=host,driftfix=slew -drive file=win8-64.raw,if=none,media=disk,format=raw,rerror=stop,werror=stop,cache=none,aio=native,id=drive-disk0 -device ide-drive,drive=drive-disk0,id=disk -cdrom /home/Windows8-ReleasePreview-64bit-English.iso -vnc :1 -bios /usr/share/seabios/bios-pm.bin -vga cirrus
2. Hibernate the guest:
#shutdown /h
3. Resume the guest via the same CLI as in step 1.

Actual Results:
1. Tried 4 times, did not reproduce; the guest resumed successfully.

(In reply to comment #9)
> Steps
> 1. Start guest
> CLI:usr/libexec/qemu-kvm -boot dc -m 4G -smp 4 --nodefaults -cpu
> cpu64-rhel6,+x2apic -usb -device usb-tablet -netdev
> tap,sndbuf=0,id=hostnet2,script=/etc/qemu-ifup,downscript=no -device
> e1000,netdev=hostnet2,mac=00:52:13:20:F5:22,bus=pci.0,addr=0x6 -uuid
> 7976cd92-6557-493d-86a3-7e2055a2d4cd -no-kvm-pit-reinjection -monitor stdio
> -rtc base=localtime,clock=host,driftfix=slew -drive
> file=win8-64.raw,if=none,media=disk,format=raw,rerror=stop,werror=stop,
> cache=none,aio=native,id=drive-disk0 -device

Try with cache=unsafe, or if that does not exist on RHEL 6, just drop the cache part.

> ide-drive,drive=drive-disk0,id=disk -cdrom
> /home/Windows8-ReleasePreview-64bit-English.iso -vnc :1 -bios
> /usr/share/seabios/bios-pm.bin -vga cirrus
> 2.
> hibernate guest
> #shutdown /h
> 3.resume the guest via the same CLI as step1
>
> Actual Results:
> 1. Tried 4 times ,did not reproduce ,guest could resume sucessfully

(In reply to comment #10)
> try with cache=unsafe or if this does not exists on rhel6 just drop cache
> part.

Reproduced with cache=unsafe, thanks.
It seems cache=unsafe/writethrough/writeback is not in QE's test plan.

(In reply to comment #11)
> Reproduced w/ cache=unsafe . thanks,
> Seems cache=unsafe/writethrough/writeback is not in QE's test plan

For me it is reproducible with cache=none too, because my local disk is very fast. What is your host HW?
(In reply to comment #12)
> For me it is reproducable with cache=none too because my local disk is very
> fast. What is you host HW?

local disk

00:1f.2 RAID bus controller: Intel Corporation 82801 SATA Controller [RAID mode] (rev 04)
    Subsystem: Dell Device 047e
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
    Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0
    Interrupt: pin C routed to IRQ 32
    Region 0: I/O ports at 30d0 [size=8]
    Region 1: I/O ports at 30c0 [size=4]
    Region 2: I/O ports at 30b0 [size=8]
    Region 3: I/O ports at 30a0 [size=4]
    Region 4: I/O ports at 3060 [size=32]
    Region 5: Memory at e1a40000 (32-bit, non-prefetchable) [size=2K]
    Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
        Address: fee003d8 Data: 0000
    Capabilities: [70] Power Management version 3
        Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot+,D3cold-)
        Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [a8] SATA HBA <?>
    Capabilities: [b0] PCI Advanced Features
        AFCap: TP+ FLR+
        AFCtrl: FLR-
        AFStatus: TP-
    Kernel driver in use: ahci
    Kernel modules: ahci

(In reply to comment #13)
> local disk

Run "hdparm -tT /dev/xxx" on it please. Use the real device name instead of xxx.

(In reply to comment #14)
> Run "hdparm -tT /dev/xxx" on it please. Use read dev name instead of xxx.
# hdparm -tT /dev/sda1

/dev/sda1:
 Timing cached reads: 21920 MB in 2.00 seconds = 10971.78 MB/sec
 Timing buffered disk reads: 466 MB in 3.00 seconds = 155.28 MB/sec
[root@localhost ~]# hdparm -tT /dev/sda

/dev/sda:
 Timing cached reads: 21254 MB in 2.00 seconds = 10638.18 MB/sec
 Timing buffered disk reads: 466 MB in 3.00 seconds = 155.25 MB/sec

(In reply to comment #15)
> /dev/sda:
> Timing cached reads: 21254 MB in 2.00 seconds = 10638.18 MB/sec
> Timing buffered disk reads: 466 MB in 3.00 seconds = 155.25 MB/sec

Hey, this is actually faster than my disk :) OK. One last test to satisfy my curiosity. Can you try your original command line (with cache=none and all), but with a qcow2 image?

(In reply to comment #16)
> (In reply to comment #15)
> > # hdparm -tT /dev/sda1
> >
> > /dev/sda1:
> > Timing cached reads: 21920 MB in 2.00 seconds = 10971.78 MB/sec
> > Timing buffered disk reads: 466 MB in 3.00 seconds = 155.28 MB/sec
> > [root@localhost ~]# hdparm -tT /dev/sda
> >
> > /dev/sda:
> > Timing cached reads: 21254 MB in 2.00 seconds = 10638.18 MB/sec
> > Timing buffered disk reads: 466 MB in 3.00 seconds = 155.25 MB/sec
>
> Hey, this is actually faster than my disk :) OK. One last test to satisfy my
> curiosity. Can you try your original command line (with cache=none and all),
> but with qcow2 image?

Actually, two tests. The other one is to use raw but drop the aio=native part.

(In reply to comment #17)
> Actually two tests. Another one is to use raw by drop aio=native part.

Tried 5 times, cannot reproduce. I will install a new one with a qcow2 image.
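
The variations tried up to this point (cache mode, image format, with or without aio=native) can be enumerated mechanically. The following is only a sketch: the option string mirrors the -drive argument from the command line in this report, the qcow2 file name is hypothetical, and pairing aio=native only with cache=none is an assumption taken from the original command line rather than a statement about what qemu-kvm accepts.

```python
from itertools import product

# Cache modes mentioned in this report.
CACHE_MODES = ["none", "unsafe", "writethrough", "writeback"]

# Image format -> file name. The raw name comes from the report;
# the qcow2 name is hypothetical.
IMAGES = {"raw": "win8-64.raw", "qcow2": "win8-64.qcow2"}


def drive_option(image, fmt, cache, aio=None):
    """Build a -drive option value in the style used in this report."""
    parts = [
        f"file={image}",
        "if=none",
        "media=disk",
        f"format={fmt}",
        "rerror=stop",
        "werror=stop",
        f"cache={cache}",
    ]
    if aio is not None:
        parts.append(f"aio={aio}")
    parts.append("id=drive-disk0")
    return ",".join(parts)


def variations():
    """Yield one -drive option value per combination to test."""
    for (fmt, image), cache in product(IMAGES.items(), CACHE_MODES):
        # Assumption: aio=native is only combined with cache=none,
        # as in the original command line.
        aio_choices = ["native", None] if cache == "none" else [None]
        for aio in aio_choices:
            yield drive_option(image, fmt, cache, aio)


if __name__ == "__main__":
    for opt in variations():
        print("-drive", opt)
```

Each printed line replaces the -drive argument of the original command line; the rest of the command stays unchanged, so a difference in resume behaviour can be attributed to the disk configuration.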
(In reply to comment #16)
> One last test to satisfy my curiosity. Can you try your original command
> line (with cache=none and all), but with qcow2 image?

Tried 4 times, still cannot reproduce.

Booting a win2012-64 guest hits the issue; not sure if it is the same problem.
Host:
# uname -r
2.6.32-358.0.1.el6.x86_64
# rpm -q qemu-kvm
qemu-kvm-0.12.1.2-2.355.el6.x86_64

Guest: windows2012

1. Boot the guest:
/usr/libexec/qemu-kvm -M rhel6.4.0 -enable-kvm -m 2G -smp 2 -uuid `uuidgen` -nodefaults -rtc base=utc -drive file=/home/win2012-64.qcow2,if=none,id=drive-system-disk,format=qcow2,cache=writeback,serial=QEMU-DISK1 -device ide-drive,drive=drive-system-disk,id=sytem-disk -netdev tap,id=hostnet0,vhost=on,script=/etc/qemu-ifup -device e1000,netdev=hostnet0,id=net0,mac=52:54:00:60:3f:29,addr=0x4 -monitor stdio -boot menu=on,order=d -usb -device usb-tablet,id=input0 -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -qmp tcp:0:4445,server,nowait -spice port=5830,disable-ticketing -vga qxl
2. Configure S4 for the win2012 guest:
Open Control Panel ---> "Change what the power buttons do" ---> set "When I press the power button:" to Hibernate ---> save changes
3. Do S4:
(qemu) system_powerdown
4. Boot the guest with the same CLI.

Results:
After step 4, wait about 3 seconds; qemu quits automatically.

Additional info:
1) If the guest is booted with "cache=none", the problem is not hit.
2) If the guest is booted with a virtio disk, the problem is not hit.

My host:
processor : 3
vendor_id : GenuineIntel
cpu family : 6
model : 23
model name : Intel(R) Core(TM)2 Quad CPU Q9400 @ 2.66GHz
stepping : 10
cpu MHz : 2000.000
cache size : 3072 KB
physical id : 0
siblings : 4
core id : 3
cpu cores : 4
apicid : 3
initial apicid : 3
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 xsave lahf_lm dts tpr_shadow vnmi flexpriority
bogomips : 5320.25
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:

(In reply to comment #21)
> Boot win2012-64 guest hit the issue,not sure if is same problem.
Since you cannot hit the problem with cache=none, it indeed looks like the same problem, but instead of a BSOD Windows restarts. Maybe it is possible to tell Windows to restart on BSOD, and that is the default on 2012?

(In reply to comment #22)
> Since you cannot hit the problem with cache=none it indeed looks like the
> same problem, but instead of BSOD windows restarts. May be it is possible to
> tell Windwos to restart on BSOD and it is a default on 2012?

Hi, Gleb
Thanks for your guidance. I tried the following method to test the issue.
1) Open Control Panel ---> View advanced system settings ---> choose the "Advanced" tab ---> Startup and Recovery (Settings) ---> uncheck "Automatically restart" ---> press "OK" to save changes.
2) Use the same steps and the same CLI as in comment 21 to test the issue.

Results:
After step 4 (boot the guest with the same CLI), the guest seems to hang.
(qemu) info status
VM status: running

Host:
# top
Tasks: 154 total, 1 running, 153 sleeping, 0 stopped, 0 zombie
Cpu(s): 36.5%us, 14.0%sy, 0.0%ni, 49.4%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 7509248k total, 5084280k used, 2424968k free, 84356k buffers
Swap: 58720240k total, 0k used, 58720240k free, 4560176k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
29739 root 20 0 2607m 383m 5304 S 200.2 5.2 11:41.62 qemu-kvm

Hi,
Can you check if there is a crash dump? If you see one, please send it to me.
Thanks,
Yan.

(In reply to comment #24)
> Can you check if there is a crash dump? If you see one, please send it to me.

Hi, Yan
With the steps in comment 21 and comment 23, I cannot get a crash dump in either case.
Thanks
fang lang

*** Bug 1050801 has been marked as a duplicate of this bug. ***

According to Bug 1050801, we hit this issue during the HCK test.

*** Bug 1057543 has been marked as a duplicate of this bug. ***

Is this related to QEMU's -win2k-hack option?
It delays every 16th IDE write completion IRQ so that Windows 2000 install works.

John Snow has determined that this problem is "fixed by a Windows Update (KB#2822241 -- Update Rollup for April 2013) which includes a fix for the problem as described in KB 2823506".

This is a race condition in Windows. There is no functional problem in KVM, but the timing seems to hit the Windows bug more easily than on bare metal.
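
To illustrate why timing matters for this kind of race, here is a toy model of the behaviour described earlier in this report: a disk reader fills a cyclic buffer while a decompressor drains it, and the reader does not wait for a slot to be consumed before reusing it. This is not Windows' resume code; the function name, the tick-based scheduling, and all parameters are hypothetical and exist only to show the mechanism.

```python
def simulate(total_blocks, slots, reads_per_tick, decompress_per_tick):
    """Count blocks that were overwritten before the decompressor used them.

    Toy model only: blocks are numbered 0..total_blocks-1, block i is stored
    in slot i % slots, and the reader never checks whether the slot it is
    about to reuse has been consumed (the bug being modelled).
    """
    ring = [None] * slots
    written = 0    # next block number the disk reader will store
    consumed = 0   # next block number the decompressor expects
    corrupted = 0

    while consumed < total_blocks:
        # Disk reader: store up to reads_per_tick blocks, unconditionally.
        for _ in range(reads_per_tick):
            if written < total_blocks:
                ring[written % slots] = written
                written += 1
        # Decompressor: consume up to decompress_per_tick blocks already read.
        for _ in range(decompress_per_tick):
            if consumed < written:
                if ring[consumed % slots] != consumed:
                    corrupted += 1  # slot was reused before we got to it
                consumed += 1
    return corrupted


if __name__ == "__main__":
    # Reader no faster than the decompressor: nothing is overwritten.
    print("slow disk:", simulate(64, 8, 1, 1))
    # Reader much faster than the decompressor: slots get reused too early.
    print("fast disk:", simulate(64, 8, 4, 1))
```

In this model a reader that is no faster than the decompressor reports zero corrupted blocks, while a much faster reader reports a nonzero count. That matches the observations in this report: slowing IDE reads down made resume succeed, and faster storage paths made the failure easier to hit.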