Bug 722954

Summary: Guest BSOD immediately after changing screen resolution (-vga qxl)
Product: Red Hat Enterprise Linux 8
Reporter: Xiaoqing Wei <xwei>
Component: spice-qxl-xddm
Assignee: Yonit Halperin <yhalperi>
Status: CLOSED CURRENTRELEASE
QA Contact: Desktop QE <desktop-qa-list>
Severity: high
Docs Contact:
Priority: high
Version: ---
CC: acathrow, alevy, cfergeau, cmeadors, cpelland, dblechte, djasa, iheim, jrb, juzhang, michen, mkenneth, mkrcmari, Rhev-m-bugs, shuang, uril, yhalperi, ykaul
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Windows
Whiteboard:
Fixed In Version: qxl-win-0.1-10
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2012-07-24 15:22:08 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments (description, flags):
 - echo kvm > set_event / debug trace info (flags: none)
 - BSOD (flags: none)
 - !analyze -v (flags: none)
 - call stack for BSOD when extending display to additional monitor (flags: none)

Description Xiaoqing Wei 2011-07-18 15:29:49 UTC
Created attachment 513638 [details]
echo kvm > set_event / debug trace info

Description of problem:
The guest BSODs immediately after the screen resolution is changed (-vga qxl).

Version-Release number of selected component (if applicable):
spice-server-0.8.0-1.el6.x86_64
spice-client-0.8.0-2.el6.x86_64

Guest driver: qxl-win-0.1-7

How reproducible:
3/4

Steps to Reproduce:
1. Boot the guest: qemu-kvm .... -vga qxl -global qxl-vga.vram_size=67108864
2. Install the guest driver.
3. Change the guest resolution: right-click on the desktop -> Properties -> Settings -> set a higher resolution (here I am changing from 800x600 to 1024x768).
  
Actual results:
The guest BSODs with stop code 0x0000008E.

Expected results:
The guest resolution changes successfully and the guest keeps working.

Additional info:
Host info :
CPU:
processor	: 3
vendor_id	: GenuineIntel
cpu family	: 6
model		: 37
model name	: Intel(R) Core(TM) i5 CPU       M 480  @ 2.67GHz

2GB RAM

spice-server-0.8.0-1.el6.x86_64
spice-client-0.8.0-2.el6.x86_64

Guest info :

Win2003-32 with qxl-win-0.1-7

full cmd :
/usr/libexec/qemu-kvm -S -M rhel6.1.0 -enable-kvm -m 1024 -smp 2,sockets=2,cores=1,threads=1 -name win2003_VV -uuid 0bae4632-f6de-980c-9349-b408b20fe878 -monitor stdio -rtc base=localtime -boot order=c,menu=on -drive file=/win2003-32-virtio.qcow2,if=none,id=drive-virtio0-0-0,format=qcow2,cache=none,aio=threads -device virtio-blk-pci,drive=drive-virtio0-0-0,id=virtio0-0-0 -drive file=/home/xwei/Downloads/qxl-0.1-7.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw,cache=none,aio=threads -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -usb -device usb-tablet,id=input0 -spice port=5900,addr=127.0.0.1,disable-ticketing \
\
-vga qxl -global qxl-vga.vram_size=67108864 \
\
-device AC97,id=sound0,bus=pci.0,addr=0x7 -netdev tap,id=tap1,ifname=vhost1,vhost=on -device virtio-net-pci,mac=aa:bb:c0:12:33:44,netdev=tap1,id=virt1


Dropping "-vga qxl -global qxl-vga.vram_size=67108864" from the command line makes it work well.
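
For readers checking the vram_size value: 67108864 bytes is exactly 64 MiB (64 * 1024 * 1024). The helper names below are hypothetical and only illustrate the arithmetic; they are not part of qemu-kvm or the qxl driver.

```c
#include <assert.h>
#include <stdint.h>

#define BYTES_PER_MIB (1024u * 1024u)

/* Convert a whole number of MiB to bytes, the unit used by a
 * byte-valued property such as qxl-vga.vram_size. */
static uint64_t mib_to_bytes(uint64_t mib)
{
    return mib * BYTES_PER_MIB;
}

/* Convert bytes back to whole MiB; asserts the value is MiB-aligned. */
static uint64_t bytes_to_mib(uint64_t bytes)
{
    assert(bytes % BYTES_PER_MIB == 0);
    return bytes / BYTES_PER_MIB;
}
```

Calling mib_to_bytes(64) yields the value used on the command line above.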

Comment 1 Xiaoqing Wei 2011-07-18 15:32:23 UTC
Cpu(s): 16.4%us, 14.3%sy,  0.2%ni, 63.7%id,  5.3%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   1980560k total,  1903092k used,    77468k free,    14804k buffers
Swap:  3964920k total,   108632k used,  3856288k free,   192172k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                                                   
11744 root      20   0 1978m 1.2g 2812 S 190.4 61.1   0:05.72 qemu-kvm                                                                                                 
11742 root      20   0 1978m 1.2g 2812 S 64.9 61.1   5:27.71 qemu-kvm                                                                                                  
11743 root      20   0 1978m 1.2g 2812 S 44.9 61.1   5:01.37 qemu-kvm                                                                                                  
11706 root      20   0 1978m 1.2g 2812 S  4.7 61.1   2:14.99 qemu-kvm                                                                                                  

I will try to reproduce it to see whether it generates a memory dump.

Comment 3 Marian Krcmarik 2011-07-18 17:35:15 UTC
I am not able to reproduce that on Win7/32bit guest with qxl-win-0.1-7.
Can you reproduce on Win7/2008?

Comment 4 Uri Lublin 2011-07-18 21:16:44 UTC
Did you reboot your guest after installing the qxl driver ?
Does it also happen with one of the following guests:
 - Windows XP
 - Windows 7 (either 32 or 64 bit)

Comment 6 Xiaoqing Wei 2011-07-19 05:26:12 UTC
(In reply to comment #3)
Hi Marian Krcmarik ,
> I am not able to reproduce that on Win7/32bit guest with qxl-win-0.1-7.
> Can you reproduce on Win7/2008?

It's not 100% reproducible for me (even using the image that previously hit the BSOD).

I have tried Win7 and 2008 five times each and have not reproduced it yet. I will try more to confirm.

Best Regards
Xiaoqing.

Comment 7 Xiaoqing Wei 2011-07-19 05:28:34 UTC
(In reply to comment #4)
Hi Uri Lublin,
> Did you reboot your guest after installing the qxl driver ?
The guest did reboot after qxl-win-0.1-7 was installed.

> Does it also happen with one of the following guests:
>  - Windows XP
>  - Windows 7 (either 32 or 64 bit)
It's not 100% reproducible for me (even using the Win2003 image that previously hit the BSOD).
I will try more to confirm.

Best Regards,
Xiaoqing.

Comment 8 David Jaša 2011-07-19 09:10:39 UTC
Created attachment 513739 [details]
BSOD

Confirmed on Windows XP, 100% reproducible - install RHEV 3 guest tools, then manually update the driver to latest qxl 0.1-7 from Brew, reboot. Any change of resolution results in crash.

Comment 9 Yonit Halperin 2011-07-19 10:25:55 UTC
The S3 patches affect the resolution change pathway dramatically. Hopefully they also solve this BSOD.

Comment 10 Yonit Halperin 2011-07-19 10:27:30 UTC
(In reply to comment #9)
> The S3 patches affect the resolution change pathway dramatically. Hopefully it
> also solves this BSOD.
see #688883

Comment 14 Christophe Fergeau 2011-08-05 15:31:26 UTC
(In reply to comment #8)
> Created attachment 513739 [details]
> BSOD
> 
> Confirmed on Windows XP, 100% reproducible - install RHEV 3 guest tools, then
> manually update the driver to latest qxl 0.1-7 from Brew, reboot. Any change of
> resolution results in crash.

With a slightly different setup, i.e. spice 0.8.2, latest qemu from the 0.14 branch, Windows XP guest, agent installed, qxl 0.1-7 and also qxl 0.1-9. However, I couldn't install the guest additions from the RHEVM ISO because it checks whether the host is RHEL6, which is not true on my development box, so that's a difference compared to the setup described here.
Is it still reproducible on the latest RHEVM build and with the latest qxl drivers?

Comment 15 Cameron Meadors 2011-08-05 18:43:22 UTC
qemu-kvm 0.12 is what is in RHEL 6.2. I don't think testing with 0.14 is valid.

Comment 16 Christophe Fergeau 2011-08-05 21:02:10 UTC
Well, I'm testing with what I have easily available :)

Comment 17 Yonit Halperin 2011-08-08 07:46:36 UTC
I couldn't reproduce it with qemu-kvm 0.12, qxl 0.1-9, spice-server 0.8.2, xp guest. I tried with different qemu-kvm cpu configurations
-smp 2,sockets=2,cores=1,threads=1
and also -smp 2 (two threads).

If you manage to reproduce it, please analyse the BSOD (see comment 13), or at least supply us the memory dump.

Comment 18 Marian Krcmarik 2011-08-08 08:54:03 UTC
(In reply to comment #17)
> I couldn't reproduce it with qemu-kvm 0.12, qxl 0.1-9, spice-server 0.8.2, xp
> guest. I tried with different qemu-kvm cpu configurations
> -smp2,sockets=2,cores=1,threads=1
> and also -smp 2 (two threads).
> 
> If you manage to reproduce it, please analyse the BSOD (see comment 13), or at
> least supply us the memory dump.

I believe qxl 0.1-9 is built from the qxl 2.2 version, but Xiaoqing used 0.1-7, which is based on "qxl 3.0".
I am not saying I reproduced it; maybe this is the only difference.

Comment 22 Xiaoqing Wei 2011-08-08 10:09:31 UTC
Created attachment 517158 [details]
!analyze  -v

Comment 24 Alon Levy 2011-08-08 10:33:25 UTC
I'm moving to POST; the patches are already in the repository
(cgit.engineering.redhat.com/users/alevy/qxl.git):
 1. If we take the current latest, 0.1-9, which is the RHEV-2.2 driver without offscreen surface support, then the problem cannot be reproduced, according to Yonit's comment 19.
 2. If we finish WHQL'ing the current driver, 0.1-8, the patches that fix S3 also change the resolution-change behavior, and again both Yonit and I tested them for resolution change.

Xiaoqing Wei - the new drivers should only solve this problem when run with a revision 3 device, so you need a scratch build of qemu-kvm (https://brewweb.devel.redhat.com/taskinfo?taskID=3539069 is based off 177) and the new drivers, either 0.1-8 from brew or (better) the ones on spice space - http://spice-space.org/download/binaries/qxl-win-0.1010-20110308-d9eb3203bd.zip

Alon

Comment 25 Yonit Halperin 2011-08-08 10:36:15 UTC
Sorry, moving this back to NEW, since qemu-kvm with a revision 2 device might trigger this BSOD again, even with the new driver.

Comment 26 Yonit Halperin 2011-08-08 12:12:22 UTC
Moving to POST. Tested again with http://spice-space.org/download/binaries/qxl-win-0.1010-20110308-d9eb3203bd.zip (future qxl 0.1-10) and didn't reproduce.
Version of other components: as in comment #17.

Comment 27 Xiaoqing Wei 2011-08-08 12:56:12 UTC
(In reply to comment #24)
Hi Alon,

Using your scratch build, I installed a new WinXP-32 guest and installed the driver from spice-space.
After the driver was installed, I rebooted the guest and changed the resolution. This does not trigger the issue.

Best Regards,
Xiaoqing.

Comment 28 Yonit Halperin 2011-08-14 12:07:36 UTC
Opening this again. Encountered a BSOD while detaching and reattaching a monitor on a dual-monitor XP guest.
qxl 0.1-10 
qemu-kvm 0.12

Call stack attached.
Not sure yet if it is the same as the original bug.
It looks like we double release an off-screen surface or, alternatively, do not clean the release ring even though all the surfaces have already been destroyed.
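
To illustrate the suspected failure mode (a release arriving for a surface that was already destroyed), here is a minimal, self-contained C sketch. It is not the qxl driver's code: the Surface and SurfaceTable types, MAX_SURFACES, and all function names are hypothetical, and malloc/free stand in for the driver's own allocator. The idea shown is simply to track whether a surface is still live, so that a late or repeated release is counted and ignored instead of freeing memory twice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_SURFACES 4   /* hypothetical limit for this sketch */

/* Hypothetical bookkeeping for one off-screen surface. */
typedef struct {
    void *mem;      /* backing allocation, NULL once released */
    bool  in_use;   /* true between create and release */
} Surface;

typedef struct {
    Surface surfaces[MAX_SURFACES];
    int     stale_releases;   /* releases that hit a dead surface */
} SurfaceTable;

/* Allocate backing memory for surface `id`; fails if it is already live. */
static bool surface_create(SurfaceTable *t, size_t id, size_t size)
{
    if (id >= MAX_SURFACES || t->surfaces[id].in_use) {
        return false;
    }
    t->surfaces[id].mem = malloc(size);
    if (t->surfaces[id].mem == NULL) {
        return false;
    }
    t->surfaces[id].in_use = true;
    return true;
}

/* Release surface `id`. A release for a surface that is not live is
 * counted and ignored rather than passed to free() a second time. */
static bool surface_release(SurfaceTable *t, size_t id)
{
    if (id >= MAX_SURFACES || !t->surfaces[id].in_use) {
        t->stale_releases++;
        return false;
    }
    free(t->surfaces[id].mem);
    t->surfaces[id].mem = NULL;
    t->surfaces[id].in_use = false;
    return true;
}

/* Destroy every live surface, e.g. when the display is disabled. */
static void surface_destroy_all(SurfaceTable *t)
{
    for (size_t id = 0; id < MAX_SURFACES; id++) {
        if (t->surfaces[id].in_use) {
            surface_release(t, id);
        }
    }
}
```

In this sketch, a release request that is processed after surface_destroy_all() only increments stale_releases; without the in_use check it would call free() on memory that is already gone.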

Comment 29 Yonit Halperin 2011-08-14 12:09:49 UTC
Created attachment 518179 [details]
call stack for BSOD when extending display to additional monitor

Comment 30 Yaniv Kaul 2011-08-14 12:35:48 UTC
(In reply to comment #29)
> Created attachment 518179 [details]
> call stack for BSOD when extending display to additional monitor

(Having the stack in the comments allows searching for similar cases in the future):
ee48b730 8052037a 00000050 ee9fb018 00000000 nt!KeBugCheckEx+0x1b
ee48b798 80544588 00000000 ee9fb018 00000000 nt!MmAccessFault+0x9a8
ee48b798 bf9e31a8 00000000 ee9fb018 00000000 nt!KiTrap0E+0xd0
ee48b838 bf9dab94 ee9fb008 eea243c0 e130a790 qxldd!mspace_free+0x18 [c:\cygwin\tmp\build\source\qxl\display\mspace.c @ 2270]
ee48b84c bf9db5f4 eea243c0 e130a790 f5274d10 qxldd!FreeMem+0xa4 [c:\cygwin\tmp\build\source\qxl\display\res.c @ 451]
ee48b860 bf9db673 e130a790 eea243c0 00000002 qxldd!QXLDelSurface+0x64 [c:\cygwin\tmp\build\source\qxl\display\res.c @ 833]
ee48b880 bf9dd2e2 e130a790 f5274d10 e130a790 qxldd!FreeDelSurface+0x53 [c:\cygwin\tmp\build\source\qxl\display\res.c @ 857]
ee48b89c bf9dd368 f3d10fa4 f3d10f98 ffffffff qxldd!ReleaseOutput+0x72 [c:\cygwin\tmp\build\source\qxl\display\res.c @ 184]
ee48b8b8 bf9dd4ef e130a790 ee48ba04 ee48b92c qxldd!FlushReleaseRing+0x28 [c:\cygwin\tmp\build\source\qxl\display\res.c @ 357]
ee48b8cc bf9dd5e6 000000df 00000001 e130a790 qxldd!__AllocMem+0xaf [c:\cygwin\tmp\build\source\qxl\display\res.c @ 410]
ee48b8e0 bf9e05cd e130a790 ee48ba04 ee48b92c qxldd!GetDrawable+0x16 [c:\cygwin\tmp\build\source\qxl\display\res.c @ 690]
ee48b8f4 bf9d9bea e130a790 0000000d ee48b92c qxldd!Drawable+0x3d [c:\cygwin\tmp\build\source\qxl\display\res.c @ 707]
ee48b940 bf8364e7 e10f6e90 e1e0a010 00000000 qxldd!DrvAlphaBlend+0x1da [c:\cygwin\tmp\build\source\qxl\display\rop.c @ 1635]
ee48b990 bf8a5f95 e10f6e90 e1e0a010 ee48b9cc win32k!WatchdogDrvAlphaBlend+0x56
ee48ba44 bf8a3946 ee48bbe4 e17918f8 00000000 win32k!PDEVOBJ::vProfileDriver+0x18e
ee48ba70 bf8a4201 e17918f8 e16c3c00 e1be91e8 win32k!hCreateHDEV+0x548
ee48bbe8 bf8aab44 00000000 00000000 00000001 win32k!DrvCreateMDEV+0x4dc
ee48bcdc bf8acdaf 00000000 e162e008 00000000 win32k!DrvChangeDisplaySettings+0x251
ee48bd20 bf8acca2 00000000 00000000 00000000 win32k!xxxUserChangeDisplaySettings+0x141
ee48bd48 8054162c 00000000 00000000 00000000 win32k!NtUserChangeDisplaySettings+0x4a
ee48bd48 7c90e514 00000000 00000000 00000000 nt!KiFastCallEntry+0xfc
WARNING: Frame IP not in any known module. Following frames may be wrong.
0007e5a0 00000000 00000000 00000000 00000000 0x7c90e514

Comment 31 Yonit Halperin 2011-08-14 13:15:32 UTC
(In reply to comment #28)
> Opening this again. Encountered A BSOD while detaching and reattaching a
> monitor on a dual monitors XP Guest. 
> qxl 0.1-10 
> qemu-kvm 0.12
qemu-kvm with revision 2 qxl device
> 
> Call stack attached.
> Not sure yet if it is the same as the original bug.
> Looks like we double release an off-screen surface, or alternatively, not clean
> the release ring while all the surfaces were already destroyed.

Comment 32 Yonit Halperin 2011-08-15 11:17:52 UTC
Cause of the bug:
When disabling a monitor and then enabling it again, Drv<Disable/Enable>Surface are called. In DrvDisableSurface we unmap the vram, and in DrvEnableSurface we map it again, and the miniport calls VideoPortMapMemory.
The problem is that before the S3 support (qxl-0.9), we didn't make sure the vram was cleared when the surface was disabled, and we assumed that after remapping the vram, its virtual address would stay the same. However, VideoPortMapMemory returns a different virtual address; thus, when we try to access the vram through the old virtual address, we crash.

This bug shouldn't happen when running with qxl-0.10 + RHEL 6.2 qemu-kvm with qxl->revision=3. But with an older qemu-kvm, or with qxl->revision=2, it can occur.

I reproduced the BSOD with an XP guest with dual monitors. I disabled and enabled the primary monitor, and before enabling it I changed its resolution.
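
The stale-address hazard described above can be sketched in plain C. This is a simulation under stated assumptions, not the driver's fix: VramMapping, vram_map, vram_unmap and vram_at are hypothetical names, and calloc/free stand in for mapping and unmapping video memory (so, unlike real vram, the contents start out zeroed after each simulated map). The sketch shows one generic way to avoid the problem: remember offsets into the mapping and resolve them against the current base address, rather than caching absolute virtual addresses across an unmap/map cycle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a mapped video RAM window. */
typedef struct {
    uint8_t *base;   /* current virtual address of the mapping, or NULL */
    size_t   size;   /* size of the mapping in bytes */
} VramMapping;

/* Simulated map: every call may hand back a different virtual address. */
static int vram_map(VramMapping *m, size_t size)
{
    m->base = calloc(1, size);
    m->size = (m->base != NULL) ? size : 0;
    return m->base != NULL;
}

/* Simulated unmap: pointers derived from the old base become invalid. */
static void vram_unmap(VramMapping *m)
{
    free(m->base);
    m->base = NULL;
    m->size = 0;
}

/* Resolve an offset against the *current* mapping.
 * Returns NULL if the window is unmapped or the offset is out of range. */
static uint8_t *vram_at(const VramMapping *m, size_t offset)
{
    if (m->base == NULL || offset >= m->size) {
        return NULL;
    }
    return m->base + offset;
}
```

A caller that stores only the offset (for example 128) and calls vram_at() on each access keeps working after a remap, whereas a caller that saved the pointer returned before the unmap would dereference an address that is no longer valid.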

Comment 33 Alon Levy 2011-08-29 13:42:33 UTC
Fixed in this driver brew build (by commit 6f5cf2dbcc876c82db5cd870ef21104b7b83c838 in upstream, or d0decd09cb6fea7ed0310dea248c7770b99cdb94 in rhev-3.0 branch internally - only difference is patch reordering):

 https://brewweb.devel.redhat.com/buildinfo?buildID=177280

Note, it is not tagged with the rhev-m-3-candidate tag, it will not appear in an ic build, hence not setting Fixed In Version or moving to MODIFIED.

This is a strange situation. Basically the only way this can work is if every driver passes WHQL and we never break that :(

Alon

Comment 34 Uri Lublin 2012-07-24 15:22:08 UTC
The build mentioned in comment 33 is qxl-win-0.1-10
Closing as current-release.