Description of problem:
QEMU is run with -M q35 and a CD-ROM device. During a repetitive reboot test, the guest system crashes with signs of code memory corruption (NDIS procedures are usually affected).
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Install windows_server_2008_with_sp2_x86_dvd_342333.iso on q35
2. Configure the system not to restart on BSOD and to keep a kernel dump
3. Run repeated system restarts
BSOD with signs of code memory corruption
Was discovered during investigation of
-drive file=/home/yurib/vms/Client1_windows_server_2008_with_sp2_x86-q35.qcow2,cache=unsafe \
-M q35 \
-nodefaults -nodefconfig -vga cirrus \
-device ioh3420,bus=pcie.0,id=root1.0,slot=1 \
-netdev tap,id=hostnet11,vhost=on,script=br0-ifup,ifname=nw1 \
-device e1000,netdev=hostnet11,mac=00:54:45:56:12:10,id=poc1,bus=pcie.0 \
-netdev tap,id=hostnet12,vhost=on,script=br0-ifup,ifname=nw2 \
-device virtio-net-pci,netdev=hostnet12,mac=00:54:45:56:11:10,bus=root1.0,id=poc2 \
-m 4G -smp 4,cores=4 -cpu qemu64,+x2apic,+fsgsbase,model=13 \
-enable-kvm -usbdevice tablet -vnc :$vncport \
-monitor telnet::$telnetport,server,nowait \
The corrupting pattern in several dump files looks a lot like SCSI sense buffer data, 18 bytes long:
f0 00 05 00 00 00 00 0a 00 00 00 00 24 00 00 00 00 00
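To check that reading, the 18 bytes can be decoded as fixed-format SCSI sense data. The following is a minimal Python sketch, not something taken from the dump analysis itself. The helper name decode_fixed_sense and the two small lookup tables are illustrative and only cover the values seen here. The offsets assume the fixed-format sense layout: response code in byte 0, sense key in byte 2, additional sense length in byte 7, ASC and ASCQ in bytes 12 and 13.

```python
# Illustrative helper (hypothetical name); the tables cover only the values in this dump.
SENSE_KEYS = {0x05: "ILLEGAL REQUEST"}
ASC_NAMES = {(0x24, 0x00): "INVALID FIELD IN CDB"}


def decode_fixed_sense(raw: bytes) -> dict:
    """Decode the main fields of fixed-format SCSI sense data."""
    if len(raw) < 14:
        raise ValueError("need at least 14 bytes of sense data")
    response_code = raw[0] & 0x7F          # bit 7 is the VALID flag
    if response_code not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    sense_key = raw[2] & 0x0F
    asc, ascq = raw[12], raw[13]
    return {
        "valid": bool(raw[0] & 0x80),
        "response_code": response_code,
        "sense_key": sense_key,
        "sense_key_name": SENSE_KEYS.get(sense_key, "unknown"),
        "asc": asc,
        "ascq": ascq,
        "asc_name": ASC_NAMES.get((asc, ascq), "unknown"),
        # Byte 7 counts the bytes that follow it, so the total is 8 + that value.
        "total_length": 8 + raw[7],
    }


pattern = bytes.fromhex("f0 00 05 00 00 00 00 0a 00 00 00 00 24 00 00 00 00 00")
decoded = decode_fixed_sense(pattern)
for field, value in decoded.items():
    print(f"{field}: {value}")
```

Under those assumptions the pattern decodes to sense key 5 with ASC 24h and a total length of 18 bytes, which matches the size of the corrupted region.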
Is there a known or in-progress issue related to this bug?
Should it be considered a bug in the 2008-only cdrom/msahci drivers, or a problem in current QEMU?
This is roughly what's happening:
Windows issues the MODE SENSE (5a) ATAPI command with action = 0 and code = LUN Mapping (1b). Because QEMU does not support the LUN Mapping page, the command fails with ILLEGAL_REQUEST / ASC_INV_FIELD_IN_CMD_PACKET.
Windows then issues the REQUEST SENSE (3) command to find out why 5a failed. Because that command is (very likely) not set up correctly, the DMA transfer of the sense data buffer corrupts guest memory, sometimes causing a BSOD later on.
It's hard to tell what exactly Windows is doing without extensive reverse engineering. But the story with a poorly tested code path (presumably real HW tends to support the LUN Mapping page?) is plausible.
This is where the DMA transfer is triggered from Windows' point of view:
00 803d8530 803f0e64 nt!WRITE_REGISTER_ULONG+0xa
So - a more specific question for John: How hard would it be to support the MODE SENSE command with LUN Mapping (1b)? Thanks!
I tried the same Q35 VM config with Windows Server 2008R2 64-bit and I see the same issue. It's just that nothing important happens to live in the corrupted memory region on 2008R2.
In my run the DMA physical address was 7ff40540. Then in windbg:
// get directory base
1: kd> !ptov 00187000
Amd64PtoV: pagedir 187000
1: kd> !address 0xfffff880`0299f000+0x540
Base Address: fffff880`023a0000
End Address: fffff8a0`00000000
Region Size: 0000001f`fdc60000
VA Type: SystemPTEs // <== doesn't look like a good DMA destination
(In reply to Ladi Prosek from comment #3)
> It's hard to tell what exactly Windows is doing without extensive reverse
> engineering. But the story with a poorly tested code path (presumably real
> HW tends to support the LUN Mapping page?) is plausible.
This was not a correct assessment. Also, comment 4 should be ignored. Apparently the memory returned by AtaPortGetUnCachedExtension, where all the AHCI data structures live, really identifies as "SystemPTEs" in !address, as strange as it looks.
The problem is somewhere else. Here's a snippet from msahci.sys which sets up the DMA destination (data base addresses in the PRDT):
8cb2d14c call msahci!AtaPortGetPhysicalAddress (8cb2d3ac)
8cb2d151 test al,1
8cb2d153 jne msahci!IRBtoPRDT+0x198 (8cb2d192)
8cb2d155 mov dword ptr [esi+80h],eax
8cb2d15b test dword ptr [edi+468h],80000000h
8cb2d165 je msahci!IRBtoPRDT+0x173 (8cb2d16d) [br=1]
8cb2d167 mov dword ptr [esi+84h],edx
[esi+80h] is the lower 32 bits (DBA), [esi+84h] is the upper 32 bits (DBAU). The corruption we observe is caused by DBAU not being set while the physical address is in fact >4GB and has a non-zero upper dword. Side notes: 1) I couldn't originally repro this because I was running the VM with less than 4GB of RAM. 2) 4GB of RAM is enough to hit this because there will be pages >4GB due to all those memory gaps.
[edi+468h] has the contents of the HBA Capabilities register and bit 31 tested above is described in the spec like so:
"Supports 64-bit Addressing (S64A): Indicates whether the HBA can access 64-bit
data structures. When set to ‘1’, the HBA shall make the 32-bit upper bits of the port DMA Descriptor, the PRD Base, and each PRD entry read/write. When cleared to ‘0’, these are read-only and treated as ‘0’ by the HBA."
So whose fault is this? What Windows is doing is definitely odd. If the address has non-zero upper 32 bits and the HBA claims that it doesn't support 64-bit DMA, then they ignore the upper 32 bits and hope that it will somehow work. Right. :)
On the other hand, the QEMU HBA does support 64-bit DMA, so bit 31 in the HBA Capabilities register should be set. I have verified that setting it fixes the issue, i.e. it makes Windows correctly supply 64-bit physical addresses and it all works.
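The effect of the missing DBAU write can be modeled in a few lines. This is a rough Python sketch of the logic in the IRBtoPRDT snippet, not the driver's actual code: the function name is made up, the alignment check (test al,1) is ignored, and the example address is hypothetical. It simply puts an upper dword of 1 on top of the 7ff40540 value seen earlier. The two capability values are likewise stand-ins that differ only in bit 31.

```python
S64A = 1 << 31  # "Supports 64-bit Addressing" bit in the HBA Capabilities register


def prd_address_seen_by_hba(phys_addr: int, hba_cap: int) -> int:
    """Return the data base address the HBA ends up using for one PRD entry.

    DBA (lower 32 bits) is always written. DBAU (upper 32 bits) is only
    written when the HBA advertises S64A; otherwise it stays zero.
    """
    dba = phys_addr & 0xFFFFFFFF
    dbau = 0
    if hba_cap & S64A:
        dbau = (phys_addr >> 32) & 0xFFFFFFFF
    return (dbau << 32) | dba


# Hypothetical buffer address above 4GB (assumption for illustration only).
buffer_addr = 0x1_7FF4_0540

cap_without_s64a = 0x0000_0000  # stand-in: bit 31 clear
cap_with_s64a = S64A            # stand-in: bit 31 set

truncated = prd_address_seen_by_hba(buffer_addr, cap_without_s64a)
correct = prd_address_seen_by_hba(buffer_addr, cap_with_s64a)

print(f"S64A clear: HBA writes to {truncated:#x}")
print(f"S64A set:   HBA writes to {correct:#x}")
```

With bit 31 clear, the transfer lands 4GB below the intended buffer, on whatever happens to live at that physical address. With the bit set, the full 64-bit address is used.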
I have posted a QEMU patch to advertise the 64-bit capability:
And a SeaBIOS patch to correctly initialize the AHCI controller (found during testing; manifests as a hang or crash after reboot):
Bug 1418320 tracks the SeaBIOS fix in 7.4. This bug continues to track the QEMU fix.
Changing the component as Windows guests are not supported in qemu-kvm.
Fix included in qemu-kvm-rhev-2.8.0-5.el7
Reproduced the issue on qemu-kvm-rhev-2.6.0-29.el7.
Verified the fix on qemu-kvm-rhev-2.8.0-5.el7.
P.S. The QEMU command line:
-M q35 \
-cpu SandyBridge \
-nodefaults -rtc base=utc \
-m 4G \
-smp 2,sockets=2,cores=1,threads=1 \
-name rhel7.4 \
-uuid 990ea161-6b67-47b2-b803-19fb01d30d12 \
-smbios type=1,manufacturer='Red Hat',product='RHEV Hypervisor',version=el6,serial=koTUXQrb,uuid=feebc8fd-f8b0-4e75-abc3-e63fcdb67170 \
-k en-us \
-serial unix:/tmp/console,server,nowait \
-boot menu=on \
-bios /usr/share/seabios/bios.bin \
-chardev file,path=/home/test/seabios.log,id=seabios \
-device isa-debugcon,chardev=seabios,iobase=0x402 \
-qmp tcp::8887,server,nowait \
-vga qxl \
-spice port=5932,disable-ticketing \
-device ioh3420,id=root.0,slot=1 \
-drive file=/home/test/win8-32.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none,werror=stop,rerror=stop \
-device virtio-blk-pci,bus=root.0,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
-device ioh3420,id=root.1,slot=2 \
-device ioh3420,id=root.2,slot=3 \
-netdev tap,id=hostnet1 \
-device virtio-net-pci,netdev=hostnet1,id=net1,mac=54:52:00:B6:40:22,bus=root.2 \
-monitor stdio \
-cdrom /home/en_windows_server_2008_datacenter_enterprise_standard_sp2_x64_dvd_342336.iso \
-drive file=/usr/share/virtio-win/virtio-win-1.9.0.iso,if=none,media=cdrom,id=drive-ide1,format=raw \
-device ide-drive,bus=ide.0,drive=drive-ide1,id=ide1 \
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.