Bug 1135824 - [virtio-scsi] > virtio-win-0.1-65: Windows 64-bit guest: BSOD/file system corruption
Keywords:
Status: CLOSED DEFERRED
Alias: None
Product: Virtualization Tools
Classification: Community
Component: virtio-win
Version: unspecified
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Vadim Rozenfeld
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2014-09-01 00:31 UTC by Matthew Stapleton
Modified: 2015-05-05 19:41 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2015-05-05 19:41:07 UTC
Embargoed:


Attachments
win7-64 scsi driver (2.81 MB, application/octet-stream)
2014-09-05 11:02 UTC, Mike Cao

Description Matthew Stapleton 2014-09-01 00:31:57 UTC
On a Windows 7 64-bit guest, after changing the OS drive from IDE to virtio-scsi-pci and then shutting down and rebooting Windows, I get BSOD errors that seem to indicate it is having problems accessing the drive.  Sometimes I get a 0x000000F4 code (timeout?) and sometimes a 0x00000221 in one of the boot files (corruption).  This occurs with virtio-win 0.1-74 and 0.1-81 (downloaded from http://alt.fedoraproject.org/pub/alt/virtio-win).  It always boots fine after going back to IDE, and after changing to the 0.1-65 version of the virtio-scsi-pci driver Windows also appears to work fine with no errors.

So far I've only tested this on a Windows 7 64-bit guest with qemu 2.0.0, and I've tried without discard and with cache set to none as well.  The host machine is running Gentoo Linux 64-bit, but since the problem goes away when changing the guest driver version, it doesn't seem like an issue with the host.

The main difference between 0.1-65 and 0.1-74 in the vioscsi driver seemed to be MSI support, although the IRQ is still positive in 0.1-74 (as long as I don't attach the virtio-scsi interface to the main OS drive, Windows is able to boot), and there were some other changes as well.

Here is an example config for testing the problem:
qemu-system-x86_64 -nodefaults -machine pc-i440fx-2.0,accel=kvm -smp maxcpus=4,cores=4,sockets=1 -cpu Penryn,+vmx,+svm -k en-us -net tap,vlan=0,ifname=tap0,script=no,vhost=on -net nic,vlan=0,macaddr=52:54:00:01:01:01,model=virtio -rtc base=localtime,clock=host,driftfix=slew -m 2560 -drive file=disk.raw,if=none,id=disk-main,cache=writeback,discard=on,format=raw,aio=native -drive file=virtio-win-0.1-65.iso,if=none,id=cdrom-virtiodrv,readonly=on,aio=native -device virtio-scsi-pci,id=vscsi0 -device scsi-hd,bus=vscsi0.0,scsi-id=0,drive=disk-main,bootindex=1 -device ide-cd,bus=ide.1,unit=1,drive=cdrom-virtiodrv -no-hpet -device VGA -spice port=5931,addr=127.0.0.1,password=password -name windows -runas user -realtime mlock=on -daemonize -monitor unix:windows.monitor.sock,server,nowait -daemonize -pidfile windows.pid -name windows

Comment 1 Vadim Rozenfeld 2014-09-01 07:53:47 UTC
Please post the output from "info pci" before and after crash.
Thanks,
Vadim.

Comment 2 Matthew Stapleton 2014-09-01 12:28:34 UTC
Here is the pci info.  It doesn't seem to change during the running of the system.  I did a fresh Win7 install and although it 
  Bus  0, device   0, function 0:
    Host bridge: PCI device 8086:1237
      id ""
  Bus  0, device   1, function 0:
    ISA bridge: PCI device 8086:7000
      id ""
  Bus  0, device   1, function 1:
    IDE controller: PCI device 8086:7010
      BAR4: I/O at 0xc080 [0xc08f].
      id ""
  Bus  0, device   1, function 3:
    Bridge: PCI device 8086:7113
      IRQ 9.
      id ""
  Bus  0, device   2, function 0:
    Ethernet controller: PCI device 8086:100e
      IRQ 11.
      BAR0: 32 bit memory at 0xfebc0000 [0xfebdffff].
      BAR1: I/O at 0xc000 [0xc03f].
      BAR6: 32 bit memory at 0xffffffffffffffff [0x0003fffe].
      id "virtnet"
  Bus  0, device   3, function 0:
    SCSI controller: PCI device 1af4:1004
      IRQ 0.
      BAR0: I/O at 0xc040 [0xc07f].
      BAR1: 32 bit memory at 0xfebf0000 [0xfebf0fff].
      id "vscsi0"
  Bus  0, device   4, function 0:
    VGA controller: PCI device 1234:1111
      BAR0: 32 bit prefetchable memory at 0xfd000000 [0xfdffffff].
      BAR2: 32 bit memory at 0xfebf1000 [0xfebf1fff].
      BAR6: 32 bit memory at 0xffffffffffffffff [0x0000fffe].
      id ""

Also, I have a test environment set up now where I can trigger the problem every time during a Windows install (currently testing with Windows 7 Pro 64-bit with SP1):
Run qemu-img create -f raw disk.raw 40G
Then qemu-system-x86_64 -nodefaults -machine pc-i440fx-2.0,accel=kvm -enable-kvm -smp maxcpus=2,cores=2,sockets=1 -cpu Penryn -k en-us -netdev user,id=mainnet,net=192.168.240.0/24,dhcpstart=192.168.240.5,hostfwd=tcp::10022-:22,hostfwd=tcp::13389-:3389 -device e1000,netdev=mainnet,id=virtnet -rtc base=localtime,clock=host -m 2048 -drive file=disk.raw,if=none,id=disk-main,cache=unsafe,discard=on,format=raw,aio=native -drive file=Windows7Pro.iso,if=none,id=cdrom-win7cd,readonly=on,aio=native -drive file=virtio-win-0.1-65.iso,if=none,id=cdrom-virtiodrv,readonly=on,aio=native -device virtio-scsi-pci,id=vscsi0 -device scsi-hd,bus=vscsi0.0,scsi-id=0,drive=disk-main,bootindex=7 -device ide-cd,bus=ide.1,unit=0,drive=cdrom-win7cd,bootindex=5 -device ide-cd,bus=ide.1,unit=1,drive=cdrom-virtiodrv -no-hpet -device VGA -spice port=5931,addr=127.0.0.1,password=password -name windows -sdl -no-quit -display gtk -monitor unix:windows.monitor.sock,server,nowait
  (cache=unsafe is just to speed up the install as cache=writeback also gets the error)
Click through the install wizard until asked where to install Windows, and then select load driver and browse to WIN7\AMD64, load the driver, and install Windows.  With virtio-win-0.1-65.iso Windows installs without any problems.  With virtio-win-0.1-74.iso and virtio-win-0.1-81.iso I get a corruption error: 0x80070570 while "Expanding Windows files".
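
For readers decoding the installer error above: 0x80070570 has the shape of a standard Windows HRESULT, so it can be split into its bit fields. The sketch below is only an illustration of that arithmetic and is not part of the original report; the function name decode_hresult is made up for this example. The facility value 7 (FACILITY_WIN32) and Win32 error 1392 (ERROR_FILE_CORRUPT, "The file or directory is corrupted and unreadable") are the documented Windows meanings, which is consistent with the corruption described here.

```python
def decode_hresult(value: int) -> dict:
    """Split a 32-bit HRESULT into its standard bit fields."""
    value &= 0xFFFFFFFF
    return {
        "failure": bool(value >> 31),       # bit 31: severity (1 = failure)
        "facility": (value >> 16) & 0x7FF,  # bits 16-26: facility code
        "code": value & 0xFFFF,             # bits 0-15: error code
    }


FACILITY_WIN32 = 7         # the HRESULT wraps a plain Win32 error code
ERROR_FILE_CORRUPT = 1392  # Win32: file or directory is corrupted and unreadable

# The error reported while "Expanding Windows files".
fields = decode_hresult(0x80070570)
print(fields)
```

The decoded fields only say that Windows Setup saw corrupt data; they do not say which layer (driver, qemu, or host) corrupted it.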

Comment 4 Matthew Stapleton 2014-09-01 12:51:18 UTC
Machine type: q35 has similar results:
qemu-system-x86_64 -nodefaults -machine q35,accel=kvm -enable-kvm -smp maxcpus=2,cores=2,sockets=1 -cpu Penryn -k en-us -netdev user,id=mainnet,net=192.168.240.0/24,dhcpstart=192.168.240.5,hostfwd=tcp::10022-:22,hostfwd=tcp::13389-:3389 -device e1000,netdev=mainnet,id=virtnet -rtc base=localtime,clock=host -m 2048 -drive file=disk.raw,if=none,id=disk-main,cache=unsafe,discard=on,format=raw,aio=native -drive file=Windows7Pro.iso,if=none,id=cdrom-win7cd,readonly=on,aio=native -drive file=virtio-win-0.1-65.iso,if=none,id=cdrom-virtiodrv,readonly=on,aio=native -device virtio-scsi-pci,id=vscsi0 -device scsi-hd,bus=vscsi0.0,scsi-id=0,drive=disk-main,bootindex=7 -device ide-cd,bus=ide.0,drive=cdrom-win7cd,bootindex=5 -device ide-cd,bus=ide.1,drive=cdrom-virtiodrv -no-hpet -device VGA -spice port=5931,addr=127.0.0.1,password=password -name windows -sdl -no-quit -display gtk -monitor unix:windows.monitor.sock,server,nowait

info pci:
  Bus  0, device   0, function 0:
    Host bridge: PCI device 8086:29c0
      id ""
  Bus  0, device   1, function 0:
    Ethernet controller: PCI device 8086:100e
      IRQ 10.
      BAR0: 32 bit memory at 0xfebc0000 [0xfebdffff].
      BAR1: I/O at 0xc000 [0xc03f].
      BAR6: 32 bit memory at 0xffffffffffffffff [0x0003fffe].
      id "virtnet"
  Bus  0, device   2, function 0:
    SCSI controller: PCI device 1af4:1004
      IRQ 22.
      BAR0: I/O at 0xc040 [0xc07f].
      BAR1: 32 bit memory at 0xfebf0000 [0xfebf0fff].
      id "vscsi0"
  Bus  0, device   3, function 0:
    VGA controller: PCI device 1234:1111
      BAR0: 32 bit prefetchable memory at 0xfd000000 [0xfdffffff].
      BAR2: 32 bit memory at 0xfebf1000 [0xfebf1fff].
      BAR6: 32 bit memory at 0xffffffffffffffff [0x0000fffe].
      id ""
  Bus  0, device  31, function 0:
    ISA bridge: PCI device 8086:2918
      id ""
  Bus  0, device  31, function 2:
    SATA controller: PCI device 8086:2922
      IRQ 16.
      BAR4: I/O at 0xc0c0 [0xc0df].
      BAR5: 32 bit memory at 0xfebf2000 [0xfebf2fff].
      id ""
  Bus  0, device  31, function 3:
    SMBus: PCI device 8086:2930
      IRQ 10.
      BAR4: I/O at 0xb100 [0xb13f].
      id ""

Comment 6 Matthew Stapleton 2014-09-01 13:37:49 UTC
Oh, "info pci" outputs the same before and after the crash (checked before I had the test environment set up).  With Windows Setup, the drive is still visible when restarting the install after it aborts with the error.
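
A before/after comparison like this can be checked mechanically rather than by eye. The following is a minimal sketch, assuming the "info pci" text has already been saved from the monitor into two strings; the helper names parse_info_pci and diff_info_pci are hypothetical, and the sample is the SCSI controller entry from comment 2.

```python
import re

# Matches header lines such as "  Bus  0, device   3, function 0:"
HEADER = re.compile(r"^\s*Bus\s+(\d+), device\s+(\d+), function\s+(\d+):\s*$")


def parse_info_pci(text: str) -> dict:
    """Map (bus, device, function) to the list of detail lines for that device."""
    devices = {}
    current = None
    for line in text.splitlines():
        match = HEADER.match(line)
        if match:
            current = tuple(int(group) for group in match.groups())
            devices[current] = []
        elif current is not None and line.strip():
            devices[current].append(line.strip())
    return devices


def diff_info_pci(before: str, after: str) -> dict:
    """Return only the devices whose detail lines differ between two captures."""
    a, b = parse_info_pci(before), parse_info_pci(after)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


sample = """\
  Bus  0, device   3, function 0:
    SCSI controller: PCI device 1af4:1004
      IRQ 0.
      BAR0: I/O at 0xc040 [0xc07f].
      BAR1: 32 bit memory at 0xfebf0000 [0xfebf0fff].
      id "vscsi0"
"""

# Identical captures produce an empty diff, matching what is reported here.
print(diff_info_pci(sample, sample))
```

An empty result corresponds to the observation in this comment; any changed IRQ or BAR line would show up keyed by its bus, device, and function.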

Comment 7 Matthew Stapleton 2014-09-03 00:19:51 UTC
I've tested a few other versions of Windows now and it looks like 32-bit isn't affected (tested with Vista 32-bit and Windows 7 32-bit) only 64-bit.  Also, with Windows 2012 and 2012 R2 series it gets a different error (using WIN8\AMD64 driver) and aborts at a different stage of the Windows installation compared to Windows 7, but it is basically the same issue.  So I've updated the title to BSOD/file system corruption as that's what it seems to be causing.

Would it help if I test on a Fedora system or are you able to replicate this problem?

Comment 11 Mike Cao 2014-09-03 10:31:34 UTC
(In reply to Matthew Stapleton from comment #7)
> I've tested a few other versions of Windows now and it looks like 32-bit
> isn't affected (tested with Vista 32-bit and Windows 7 32-bit) only 64-bit. 
> Also, with Windows 2012 and 2012 R2 series it gets a different error (using
> WIN8\AMD64 driver) and aborts at a different stage of the Windows
> installation compared to Windows 7, but it is basically the same issue.  So
> I've updated the title to BSOD/file system corruption as that's what it
> seems to be causing.
> 
> Would it help if I test on a Fedora system or are you able to replicate this
> problem?

Please let me know which host you are using, and provide the qemu-kvm info.

Thanks,
Mike

Comment 13 Matthew Stapleton 2014-09-04 00:15:58 UTC
The host system that I'm using to test on is running Gentoo Linux x86_64 on an Intel i7-4700HQ, but I also noticed the problem on a server with an Intel i7-3820 CPU.

Other info on the test machine: 3.15.6-gentoo kernel, module: kvm_intel nested=1, gcc version 4.6.3 (Gentoo 4.6.3 p1.13, pie-0.5.2), and qemu-2.0.0-r1.ebuild which is based on qemu-2.0.0 from www.qemu.org, but has a few added patches.  seabios is 1.7.4,  sgabios is 0.1_pre8 and vgabios is 0.7a.

Configure flags for qemu softmmu: ./configure --prefix=/usr --sysconfdir=/etc --libdir=/usr/lib64 --docdir=/usr/share/doc/qemu-2.0.0-r1/html --disable-bsd-user --disable-guest-agent --disable-strip --disable-werror --python=/usr/bin/python2.7 --cc=x86_64-pc-linux-gnu-gcc --cxx=x86_64-pc-linux-gnu-g++ --host-cc=x86_64-pc-linux-gnu-gcc --disable-debug-info --disable-debug-tcg --enable-docs --enable-tcg-interpreter --disable-linux-user --enable-system --with-system-pixman --target-list=,i386-softmmu,x86_64-softmmu --enable-bluez --enable-gtk --enable-sdl --enable-linux-aio --disable-brlapi --enable-cap-ng --disable-curl --disable-fdt --disable-glusterfs --disable-rdma --disable-libiscsi --enable-vnc-jpeg --enable-kvm --disable-curses --disable-libnfs --enable-glx --enable-vnc-png --disable-rbd --disable-vnc-sasl --enable-seccomp --disable-smartcard-nss --enable-spice --disable-libssh2 --enable-quorum --enable-vnc-tls --enable-vnc-ws --enable-libusb --enable-usb-redir --enable-uuid --enable-vde --enable-vhost-net --enable-virtfs --enable-vnc --enable-attr --disable-xen --disable-xen-pci-passthrough --disable-xfsctl --audio-drv-list=pa,sdl,oss --with-gtkabi=3.0

CFLAGS for qemu softmmu: -pthread -I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include  -O2 -march=corei7-avx -fomit-frame-pointer -mmmx -msse3 -msse4.1 -ftree-vectorize -fpredictive-commoning -fno-tree-vect-loop-version -pipe

Comment 15 Matthew Stapleton 2014-09-05 07:29:12 UTC
I just tested on a test Fedora 20 install with qemu-system-x86-1.6.2-7.fc20.x86_64 and didn't get the problem, so this might not be an issue with the Fedora build of qemu.  I'll do some more testing over the next few days with various versions.  The CPU on the test system is an AMD Athlon 64 X2 Dual Core Processor 5000+.

Comment 16 Matthew Stapleton 2014-09-05 10:48:18 UTC
On my test system, I've now isolated the problem to any version of qemu >= 1.7.1 downloaded from qemu.org; qemu-1.7.0 and below don't appear to have the problem (at least with a Windows 7 64-bit installation).  Trying -M pc-q35-1.6 didn't help either.  Also, I am waiting for my Fedora test install to finish updating to rawhide so I can test qemu-2.1.0 on that system.

Comment 17 Mike Cao 2014-09-05 11:02:00 UTC
Created attachment 934745
win7-64 scsi driver

Comment 18 Mike Cao 2014-09-05 11:04:04 UTC
Hi Matthew,

Can you confirm whether the driver in comment #17 resolves your issue?

Please delete autounattend.xml before using it.

Thanks,
Mike

Comment 19 Matthew Stapleton 2014-09-05 13:02:18 UTC
The driver in comment #17 didn't fix the issue with qemu-1.7.1 or qemu-2.0.0.

I'm not seeing corruption in qemu-2.1.0 on my Gentoo system with virtio-win 0.1-81, so I must have forgotten to check that version during my initial testing :(  Also, I see that Fedora 21 and Rawhide are using qemu 2.1.0, so they probably don't have the problem either.  However, other distros that ship the affected versions of qemu would probably be affected, so I'm not sure if a warning should be put up somewhere.

In summary, the versions of virtio scsi driver in virtio-win 0.1-74, 0.1-81, and possibly other versions above 0.1-65 with 64-bit Windows trigger a bug in released versions of qemu > 1.7.0 and < 2.1.0 .
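
The combination summarized above can be written as a small check. This is only a sketch of the reporter's observations in this thread, not a confirmed root cause or an official compatibility rule; the function names are hypothetical, and virtio_win_build is assumed to be the number after "0.1-" in the ISO name. It does not handle distro suffixes such as "-r1".

```python
def parse_version(text: str) -> tuple:
    """Turn a dotted version such as '2.0.0' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def matches_reported_range(qemu_version: str,
                           virtio_win_build: int,
                           guest_is_64bit: bool) -> bool:
    """True if a setup falls inside the combination reported in this bug.

    Reported: 64-bit Windows guest, vioscsi from virtio-win newer than
    0.1-65, and released qemu > 1.7.0 and < 2.1.0.
    """
    qemu = parse_version(qemu_version)
    return (
        guest_is_64bit
        and virtio_win_build > 65
        and (1, 7, 0) < qemu < (2, 1, 0)
    )


# Combinations actually tested in this thread.
print(matches_reported_range("2.0.0", 81, True))  # corruption was seen
print(matches_reported_range("2.1.0", 81, True))  # no corruption was seen
```

Only 0.1-74 and 0.1-81 were tested as failing drivers, so treating every build above 65 as affected follows the "possibly other versions" wording and is an assumption.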

Comment 20 Mike Cao 2014-09-05 13:36:46 UTC
(In reply to Matthew Stapleton from comment #19)
> The driver in comment #17 didn't fix the issue with qemu-1.7.1 or qemu-2.0.0.
> 
> I'm not seeing corruption in qemu-2.1.0 on my gentoo system with virtio-win
> 0.1-81, so I must have forgot to check that version during my initial
> testing :(  Also, I see that Fedora 21 and Rawhide are using qemu 2.1.0 so
> they probably don't have the problem either.  Although other distros that
> use the affected versions of qemu probably would be affected so I'm not sure
> if a warning should be put up somewhere.
> 
> In summary, the versions of virtio scsi driver in virtio-win 0.1-74, 0.1-81,
> and possibly other versions above 0.1-65 with 64-bit Windows trigger a bug
> in released versions of qemu > 1.7.0 and < 2.1.0 .


I tried downstream qemu-1.5.3 and did not hit the issue either.
Based on your comment, I think this is a known issue on the qemu side that has already been fixed.


Mike

Comment 21 Jaroslav Reznik 2015-03-03 16:15:55 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 22 development cycle.
Changing version to '22'.

More information and reason for this action is here:
https://fedoraproject.org/wiki/Fedora_Program_Management/HouseKeeping/Fedora22

Comment 22 Cole Robinson 2015-05-05 19:41:07 UTC
Sounds like the bug was only an issue in certain qemu versions, so closing this.

If anyone can still reproduce with the latest qemu version and the latest virtio-win drivers, please file a new report and provide all the info requested in this bug.

