Bug 1411105

Summary: Windows Server 2008-32 crashes on startup with q35 if cdrom attached
Product: Red Hat Enterprise Linux 7
Reporter: ybendito
Component: qemu-kvm-rhev
Assignee: Ladi Prosek <lprosek>
Status: CLOSED ERRATA
QA Contact: jingzhao <jinzhao>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 7.4
CC: ailan, chayang, jinzhao, jsnow, juzhang, knoel, lijin, lprosek, michen, rbalakri, virt-maint, yvugenfi
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: qemu-kvm-rhev-2.8.0-5.el7
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-08-01 23:42:15 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1418320
Bug Blocks:

Description ybendito 2017-01-08 12:21:55 UTC
Description of problem:

QEMU runs with -M q35 and has a CD-ROM device attached. In a repetitive reboot test, the guest system crashes with signs of code memory corruption (usually NDIS procedures are affected).


Version-Release number of selected component (if applicable):

qemu-kvm-rhev-2.6.0-29.el7.x86_64

How reproducible:

Approximately 1 in 20 reboots

Steps to Reproduce:
1. Install windows_server_2008_with_sp2_x86_dvd_342333.iso on a q35 machine
2. Configure the system not to restart on BSOD and to keep a kernel dump
3. Restart the system repeatedly

Actual results:

BSOD with signs of code memory corruption

Expected results:

No BSOD; the guest restarts normally.

Additional info:

Was discovered during investigation of 
https://bugzilla.redhat.com/show_bug.cgi?id=1408771

$qemu \
-drive file=/home/yurib/vms/Client1_windows_server_2008_with_sp2_x86-q35.qcow2,cache=unsafe \
-M q35 \
-nodefaults -nodefconfig -vga cirrus \
-device ioh3420,bus=pcie.0,id=root1.0,slot=1 \
-netdev tap,id=hostnet11,vhost=on,script=br0-ifup,ifname=nw1 \
-device e1000,netdev=hostnet11,mac=00:54:45:56:12:10,id=poc1,bus=pcie.0 \
-netdev tap,id=hostnet12,vhost=on,script=br0-ifup,ifname=nw2 \
-device virtio-net-pci,netdev=hostnet12,mac=00:54:45:56:11:10,bus=root1.0,id=poc2 \
-m 4G -smp 4,cores=4 -cpu qemu64,+x2apic,+fsgsbase,model=13 \
-enable-kvm -usbdevice tablet -vnc :$vncport \
-monitor telnet::$telnetport,server,nowait \
-cdrom en_windows_server_2008_with_sp2_x86_dvd_342333.iso

Comment 2 ybendito 2017-01-08 12:38:43 UTC
(from https://bugzilla.redhat.com/show_bug.cgi?id=1408771)
The corrupting pattern in several dump files looks a lot like SCSI sense buffer data, 18 bytes long:
f0 00 05 00 00 00 00 0a 00 00 00 00 24 00 00 00 00 00

Is there a known or in-progress issue related to this bug?
Should it be considered a bug in 2008's cdrom/msahci stack only, or a problem in existing QEMU?

Comment 3 Ladi Prosek 2017-01-09 15:43:19 UTC
This is roughly what's happening:

Windows issues the MODE SENSE (5a) ATAPI command with action = 0 and code = LUN Mapping (1b). Because QEMU does not support the LUN Mapping page, the command fails with ILLEGAL_REQUEST / ASC_INV_FIELD_IN_CMD_PACKET.

Windows then issues the REQUEST SENSE (3) command to figure out why 5a failed, and because the transfer is (very likely) not set up correctly, the DMA of the sense data buffer corrupts guest memory, sometimes causing a BSOD later on.

It's hard to tell what exactly Windows is doing without extensive reverse engineering. But the story with a poorly tested code path (presumably real HW tends to support the LUN Mapping page?) is plausible.

This is where the DMA transfer is triggered, from Windows' point of view:

00 803d8530 803f0e64 nt!WRITE_REGISTER_ULONG+0xa
01 0x803f0e64
02 msahci!P_Running_WaitOnBSYDRQ+0x10a
03 msahci!P_Running_WaitOnFRE+0xcd
04 msahci!P_Running_WaitOnDET+0xef
05 msahci!P_Running+0x10b
06 msahci!P_Running_StartAttempt+0x36
07 msahci!AhciNonQueuedErrorRecovery+0x243
08 msahci!WorkerDispatch+0x60
09 ataport!IdeProcessMiniportDpcRequest+0x5e
0a ataport!IdePortCompletionDpc+0x6c
0b nt!KiRetireDpcList+0x147
0c nt!KiDispatchInterrupt+0x45
0d hal!HalpCheckForSoftwareInterrupt+0x64
0e hal!KfLowerIrql+0x64
0f ataport!IdeStartDeviceRequest+0x107
10 ataport!IssueCrbSync+0x30
11 ataport!IdeDiscoverDevice+0x131
12 ataport!IdeEnumerateDevices+0x8b
13 ataport!IdePortScanChannel+0x36
14 ataport!ChannelQueryBusRelation+0x3d
15 nt!IopProcessWorkItem+0x23
16 nt!ExpWorkerThread+0xfd
17 nt!PspSystemThreadStartup+0x9d
18 nt!KiThreadStartup+0x16


So - a more specific question for John: How hard would it be to support the MODE SENSE command with LUN Mapping (1b)? Thanks!

Comment 4 Ladi Prosek 2017-01-10 12:29:17 UTC
I tried the same Q35 VM config with Windows Server 2008R2 64-bit and I see the same issue. It's just that nothing important happens to live in the corrupted memory region on 2008R2.

In my run the DMA physical address was 7ff40540. Then in windbg:

// get directory base
1: kd> !ptov 00187000  
Amd64PtoV: pagedir 187000
...
7ff40000 fffff880`0299f000
...

1: kd> !address 0xfffff880`0299f000+0x540
...
Usage:                  
Base Address:           fffff880`023a0000
End Address:            fffff8a0`00000000
Region Size:            0000001f`fdc60000
VA Type:                SystemPTEs // <== doesn't look like a good DMA destination

Comment 5 Ladi Prosek 2017-01-13 08:30:11 UTC
(In reply to Ladi Prosek from comment #3)
> It's hard to tell what exactly Windows is doing without extensive reverse
> engineering. But the story with a poorly tested code path (presumably real
> HW tends to support the LUN Mapping page?) is plausible.

This was not a correct assessment. Also, comment 4 should be ignored. Apparently the memory returned by AtaPortGetUnCachedExtension, where all the AHCI data structures live, really does identify as "SystemPTEs" in !address, as strange as that looks.

The problem is somewhere else. Here's a snippet from msahci.sys which sets up the DMA destination (data base addresses in the PRDT):

8cb2d14c call msahci!AtaPortGetPhysicalAddress (8cb2d3ac)
8cb2d151 test al,1
8cb2d153 jne  msahci!IRBtoPRDT+0x198 (8cb2d192)
8cb2d155 mov  dword ptr [esi+80h],eax
8cb2d15b test dword ptr [edi+468h],80000000h
8cb2d165 je   msahci!IRBtoPRDT+0x173 (8cb2d16d)       [br=1]
8cb2d167 mov  dword ptr [esi+84h],edx

[esi+80h] is the lower 32 bits (DBA), [esi+84h] is the upper 32 bits (DBAU). The corruption we observe is caused by not setting DBAU while the physical address is in fact above 4 GB and has a non-zero upper dword. Side notes: 1) I originally couldn't reproduce this because I was running the VM with less than 4 GB of RAM. 2) 4 GB of RAM is enough to hit it because some guest pages end up above 4 GB due to the memory gaps.

[edi+468h] has the contents of the HBA Capabilities register and bit 31 tested above is described in the spec like so:

"Supports 64-bit Addressing (S64A): Indicates whether the HBA can access 64-bit
data structures. When set to ‘1’, the HBA shall make the 32-bit upper bits of the port DMA Descriptor, the PRD Base, and each PRD entry read/write. When cleared to ‘0’, these are read-only and treated as ‘0’ by the HBA."


So whose fault is this? What Windows is doing is definitely odd. If the address has a non-zero upper 32 bits and the HBA claims that it doesn't support 64-bit DMA, they ignore the upper 32 bits and hope that it will somehow work. Right. :)

On the other hand, the QEMU HBA does support 64-bit DMA, so bit 31 in the HBA Capabilities register should be set. I have verified that this fixes the issue, i.e. it makes Windows correctly supply 64-bit physical addresses and it all works.

Comment 6 Ladi Prosek 2017-01-13 11:45:45 UTC
I have posted a QEMU patch to advertise the 64-bit capability:
https://lists.nongnu.org/archive/html/qemu-devel/2017-01/msg02596.html

And a SeaBIOS patch to correctly initialize the AHCI controller (found during testing; manifests as a hang or crash after reboot):
https://www.coreboot.org/pipermail/seabios/2017-January/011062.html

Comment 7 Ladi Prosek 2017-02-01 14:30:41 UTC
Bug 1418320 tracks the SeaBIOS fix in 7.4. This bug continues to track the QEMU fix.

Comment 8 Ladi Prosek 2017-02-01 14:32:59 UTC
Changing the component as Windows guests are not supported in qemu-kvm.

Comment 12 Miroslav Rezanina 2017-02-20 10:06:59 UTC
Fix included in qemu-kvm-rhev-2.8.0-5.el7

Comment 14 jingzhao 2017-02-23 05:37:14 UTC
Reproduced the issue on qemu-kvm-rhev-2.6.0-29.el7.

Verified it on qemu-kvm-rhev-2.8.0-5.el7.

PS: the qemu command line:
/usr/libexec/qemu-kvm \
-M q35 \
-cpu SandyBridge \
-nodefaults -rtc base=utc \
-m 4G \
-smp 2,sockets=2,cores=1,threads=1 \
-enable-kvm \
-name rhel7.4 \
-uuid 990ea161-6b67-47b2-b803-19fb01d30d12 \
-smbios type=1,manufacturer='Red Hat',product='RHEV Hypervisor',version=el6,serial=koTUXQrb,uuid=feebc8fd-f8b0-4e75-abc3-e63fcdb67170 \
-k en-us \
-serial unix:/tmp/console,server,nowait \
-boot menu=on \
-bios /usr/share/seabios/bios.bin \
-chardev file,path=/home/test/seabios.log,id=seabios \
-device isa-debugcon,chardev=seabios,iobase=0x402 \
-qmp tcp::8887,server,nowait \
-vga qxl \
-spice port=5932,disable-ticketing \
-device ioh3420,id=root.0,slot=1 \
-drive file=/home/test/win8-32.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none,werror=stop,rerror=stop \
-device virtio-blk-pci,bus=root.0,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
-device ioh3420,id=root.1,slot=2 \
-device ioh3420,id=root.2,slot=3 \
-netdev tap,id=hostnet1 \
-device virtio-net-pci,netdev=hostnet1,id=net1,mac=54:52:00:B6:40:22,bus=root.2 \
-monitor stdio \
-cdrom /home/en_windows_server_2008_datacenter_enterprise_standard_sp2_x64_dvd_342336.iso \
-drive file=/usr/share/virtio-win/virtio-win-1.9.0.iso,if=none,media=cdrom,id=drive-ide1,format=raw \
-device ide-drive,bus=ide.0,drive=drive-ide1,id=ide1


Thanks
Jing Zhao

Comment 16 errata-xmlrpc 2017-08-01 23:42:15 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:2392
