Bug 1294105

Summary: virtio drivers cause disk errors when Windows is "finalizing settings" after install
Product: [Community] Virtualization Tools Reporter: Shawn Debnath <shawn>
Component: virtio-win Assignee: Vadim Rozenfeld <vrozenfe>
Status: CLOSED INSUFFICIENT_DATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: high Docs Contact:
Priority: unspecified    
Version: unspecified CC: ghammer, lijin, michen, shawn, virt-maint, vrozenfe, wyu, yvugenfi
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-12-18 17:43:44 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments (Description, Flags):
screenshot of errors on the physical screen, none
beginning of errors (first attachment got cut off) from scrolling, none
screenshot for installation and device info, none

Description Shawn Debnath 2015-12-24 21:52:57 UTC
Created attachment 1109310 [details]
screenshot of errors on the physical screen

Description of problem:
I was attempting to install Windows Server 2012 R2 on a Proxmox installation and, for performance reasons, wanted to use the paravirtualized drivers. I followed the best practices available at https://pve.proxmox.com/wiki/Windows_2012_guest_best_practices, but every time, after I entered the admin password, the installer went into "Finalizing settings" and immediately resulted in disk errors on sda (pic attached, not found in log). The whole machine locked up. After a reboot, things are fine. I ran isolated tests on the drive to rule out disk issues. The problem doesn't occur if I swap the VIRTIO drive for an IDE drive in the guest settings. The laptop is a Lenovo T530 with a Core i7 and a 167GB SSD.

NOTE: virtio for network works perfectly with the IDE-based drive in the Windows guest.

Version-Release number of selected component (if applicable):

virtio drivers for Windows 2012R2 amd64 (virtio-win 0.1.102)
distribution: proxmox 4.1-1/2f9650d4
root@pve:~# kvm --version
QEMU emulator version 2.4.1 pve-qemu-kvm_2.4-17, Copyright (c) 2003-2008 Fabrice Bellard
Linux pve 4.2.6-1-pve #1 SMP Wed Dec 9 10:49:55 CET 2015 x86_64 GNU/Linux

How reproducible:

Easily reproducible with Windows 2012 R2 64-bit on Proxmox 4, with the guest HDD settings using VIRTIO as the bus, QEMU format, and any cache setting (I tried no cache, write back, and write through).

Steps to Reproduce:
1. Install Proxmox 4.1 on a system.
2. Choose Windows 8/2012 for the OS type, the Windows 2012 R2 ISO for cd1, and the virtio driver CD for cd2.
3. Create a new VM using VIRTIO as the bus for the primary hard drive.
4. I used the qemu format for the image.
5. Continue with the rest of the setup wizard and start the guest.
6. Follow the typical Windows installation wizard. When it comes to drives, there shouldn't be any. "Load drivers" by selecting the viostor directory on the virtio CD. The drive will be found; continue the install.


Actual results:

The system will reboot a couple of times and then prompt for the admin password. Continue to "Finalizing settings". At some point during this step you will start to see errors on the host console, and the step, which usually takes about 5 seconds, will be stuck forever.

Expected results:

After typing the password, "Finalizing settings" should take a few seconds and then proceed to the Ctrl+Alt+Del screen.

Additional info:

I can repro this quite easily so if you need more information, let me know what steps I need to take to collect them.

Comment 1 Shawn Debnath 2015-12-24 22:17:39 UTC
Re-created the VM. The kvm command line, from 'ps' output:

/usr/bin/kvm -id 103 -chardev socket,id=qmp,path=/var/run/qemu-server/103.qmp,server,nowait -mon chardev=qmp,mode=control -vnc unix:/var/run/qemu-server/103.vnc,x509,password -pidfile /var/run/qemu-server/103.pid -daemonize -smbios type=1,uuid=3ce5f4bb-fed1-4665-8e60-34f003847cb2 -name virtiodbg -smp 2,sockets=1,cores=2,maxcpus=2 -nodefaults -boot menu=on,strict=on,reboot-timeout=1000 -vga std -no-hpet -cpu host,hv_spinlocks=0x1fff,hv_vapic,hv_time,hv_relaxed,+kvm_pv_unhalt,+kvm_pv_eoi -m 1024 -k en-us -device pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e -device pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f -device piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2 -device usb-tablet,id=tablet,bus=uhci.0,port=1 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3 -iscsi initiator-name=iqn.1993-08.org.debian:01:8e0e96258ff -drive file=/var/lib/vz/template/iso/en_windows_server_2012_r2_with_update_x64_dvd_6052708.iso,if=none,id=drive-ide2,media=cdrom,aio=threads -device ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=200 -drive file=/var/lib/vz/template/iso/virtio-win-0.1.102.iso,if=none,id=drive-ide0,media=cdrom,aio=threads -device ide-cd,bus=ide.0,unit=0,drive=drive-ide0,id=ide0,bootindex=201 -drive file=/var/lib/vz/images/103/vm-103-disk-1.qcow2,if=none,id=drive-virtio0,cache=writeback,format=qcow2,aio=threads,detect-zeroes=on -device virtio-blk-pci,drive=drive-virtio0,id=virtio0,bus=pci.0,addr=0xa,bootindex=100 -netdev type=tap,id=net0,ifname=tap103i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown -device e1000,mac=62:65:38:39:39:66,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300 -rtc driftfix=slew,base=localtime -global kvm-pit.lost_tick_policy=discard

Comment 2 Shawn Debnath 2015-12-24 22:28:42 UTC
Created attachment 1109324 [details]
beginning of errors (first attachment got cut off) from scrolling

Comment 3 Shawn Debnath 2015-12-24 22:30:55 UTC
Captured the beginning of the errors on the console (failed flush). This results in a cyclic blue screen of death.

From kern.log:

Dec 24 14:19:39 pve pvedaemon[16236]: <root@pam> successful auth for user 'root@pam'
Dec 24 14:22:14 pve kernel: [32346.437712] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Dec 24 14:22:14 pve kernel: [32346.437739] ata1.00: failed command: FLUSH CACHE EXT
Dec 24 14:22:14 pve kernel: [32346.437755] ata1.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 11
Dec 24 14:22:14 pve kernel: [32346.437755]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Dec 24 14:22:14 pve kernel: [32346.437788] ata1.00: status: { DRDY }
Dec 24 14:22:14 pve kernel: [32346.437801] ata1: hard resetting link
Dec 24 14:22:15 pve kernel: [32347.165818] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded
Dec 24 14:22:15 pve kernel: [32347.165828] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out
Dec 24 14:22:15 pve kernel: [32347.185669] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded
Dec 24 14:22:15 pve kernel: [32347.185679] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out
Dec 24 14:22:15 pve kernel: [32347.195710] ata1.00: configured for UDMA/133
Dec 24 14:22:15 pve kernel: [32347.198012] ata1: EH complete
Dec 24 14:26:32 pve kernel: [32604.876242] ata1: EH complete
Dec 24 14:26:32 pve kernel: [32604.876343] sd 0:0:0:0: [sda] tag#21 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Dec 24 14:26:32 pve kernel: [32604.876349] blk_update_request: I/O error, dev sda, sector 133398032
Dec 24 14:26:32 pve kernel: [32604.876391] sd 0:0:0:0: [sda] tag#22 CDB: Write(10) 2a 00 09 20 d9 38 00 00 08 00
Dec 24 14:26:32 pve kernel: [32604.876435] Buffer I/O error on device dm-2, logical block 4233767
Dec 24 14:26:32 pve kernel: [32604.876482] blk_update_request: I/O error, dev sda, sector 156178744
Dec 24 14:26:32 pve kernel: [32604.876542] blk_update_request: I/O error, dev sda, sector 0
Dec 24 14:26:32 pve kernel: [32604.876574] blk_update_request: I/O error, dev sda, sector 177997936
Dec 24 14:26:32 pve kernel: [32604.876635] sd 0:0:0:0: [sda] tag#25 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Dec 24 14:26:32 pve kernel: [32604.876638] blk_update_request: I/O error, dev sda, sector 180813184
Dec 24 14:26:32 pve kernel: [32604.876704] sd 0:0:0:0: [sda] tag#26 CDB: Write(10) 2a 00 0c e1 b7 00 00 00 10 00
Dec 24 14:26:32 pve kernel: [32604.876748] sd 0:0:0:0: [sda] tag#27 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Dec 24 14:26:32 pve kernel: [32604.876782] EXT4-fs warning (device dm-0): ext4_end_bio:332: I/O error -5 writing to inode 2363115 (offset 2490368 size 4096 starting block 1608287)
Dec 24 14:26:32 pve kernel: [32604.876843] sd 0:0:0:0: [sda] tag#28 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Dec 24 14:26:32 pve kernel: [32604.876871] sd 0:0:0:0: [sda] tag#6 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Dec 24 14:26:32 pve kernel: [32604.876878] Buffer I/O error on device dm-2, logical block 4233767
Dec 24 14:26:32 pve kernel: [32604.876888] EXT4-fs warning (device dm-2): ext4_end_bio:332: I/O error -5 writing to inode 1835013 (offset 0 size 0 starting block 4249354)
Dec 24 14:26:32 pve kernel: [32604.876893] EXT4-fs warning (device dm-2): ext4_end_bio:332: I/O error -5 writing to inode 1835013 (offset 0 size 0 starting block 4530439)
Dec 24 14:26:32 pve kernel: [32604.876950] EXT4-fs error (device loop0): ext4_journal_check_start:56: Detected aborted journal
Dec 24 14:26:32 pve kernel: [32604.877107] EXT4-fs warning (device dm-0): ext4_end_bio:332: I/O error -5 writing to inode 2234419 (offset 0 size 0 starting block 8995333)
Dec 24 14:26:32 pve kernel: [32604.877141] Buffer I/O error on device dm-0, logical block 8995337
Dec 24 14:26:32 pve kernel: [32604.877211] Buffer I/O error on dev dm-0, logical block 9437459, lost async page write
Dec 24 14:26:32 pve kernel: [32604.877322] Buffer I/O error on dev dm-2, logical block 2, lost async page write
Dec 24 14:26:32 pve kernel: [32604.877505] Buffer I/O error on dev dm-2, logical block 12091392, lost sync page write
Dec 24 14:26:32 pve kernel: [32604.877549] EXT4-fs error (device dm-2): ext4_journal_check_start:56: Detected aborted journal
Dec 24 14:26:32 pve kernel: [32604.877556] WARNING: CPU: 2 PID: 4862 at fs/buffer.c:1160 mark_buffer_dirty+0xf3/0x100()
Dec 24 14:26:32 pve kernel: [32604.877607] CPU: 2 PID: 4862 Comm: kworker/u17:2 Tainted: P           O    4.2.6-1-pve #1
Dec 24 14:26:32 pve kernel: [32604.877613]  0000000000000000 00000000524e25d3 ffff8802800d7818 ffffffff81801028
Dec 24 14:26:32 pve kernel: [32604.877616] Call Trace:
Dec 24 14:26:32 pve kernel: [32604.877625]  [<ffffffff8107b79a>] warn_slowpath_null+0x1a/0x20
Dec 24 14:26:32 pve kernel: [32604.877631]  [<ffffffff812a690b>] __ext4_abort+0x12b/0x150
Dec 24 14:26:32 pve kernel: [32604.877639]  [<ffffffff812b9b86>] __ext4_journal_start_sb+0x36/0xd0
Dec 24 14:26:32 pve kernel: [32604.877646]  [<ffffffff8128d7a4>] ? ext4_finish_bio+0x164/0x280
Dec 24 14:26:32 pve kernel: [32604.877654]  [<ffffffff811832ca>] filemap_write_and_wait_range+0x2a/0x70
Dec 24 14:26:32 pve kernel: [32604.877662]  [<ffffffff813a473c>] ? __blk_mq_complete_request+0xbc/0xf0
Dec 24 14:26:32 pve kernel: [32604.877668]  [<ffffffff810ae926>] ? set_next_entity+0xa6/0x4d0
Dec 24 14:26:32 pve kernel: [32604.877676]  [<ffffffff81094927>] process_one_work+0x157/0x3f0
Dec 24 14:26:32 pve kernel: [32604.877682]  [<ffffffff81095370>] ? rescuer_thread+0x330/0x330
Dec 24 14:26:32 pve kernel: [32604.877690]  [<ffffffff8180841f>] ret_from_fork+0x3f/0x70
Dec 24 14:26:32 pve kernel: [32604.879596] EXT4-fs error (device dm-2): ext4_journal_check_start:56: 
Dec 24 14:26:32 pve kernel: 
Dec 24 14:26:32 pve kernel: 
Dec 24 14:26:32 pve kernel: [32604.879711] EXT4-fs (dm-2): ext4_writepages: jbd2_start: 9223372036854775807 pages, ino 1835015; err -30
Dec 24 14:26:32 pve kernel: [32604.879743] Aborting journal on device loop1-8.
Dec 24 14:26:32 pve kernel: [32604.879758] JBD2: Error -5 detected when updating journal superblock for loop1-8.
Dec 24 14:26:32 pve kernel: [32604.880252] EXT4-fs warning (device dm-0): ext4_end_bio:332: I/O error -5 writing to inode 2363182 (offset 0 size 0 starting block 1607905)
Dec 24 14:26:32 pve kernel: [32604.880632] EXT4-fs (loop1): previous I/O error to superblock detected
Dec 24 14:26:32 pve kernel: [32604.880644] loop: Write error at byte offset 0, length 4096.
Dec 24 14:26:32 pve kernel: [32604.880652] EXT4-fs error (device loop1): ext4_journal_check_start:56: Detected aborted journal
Dec 24 14:26:32 pve kernel: [32604.880653] EXT4-fs (loop1): Remounting filesystem read-only
Dec 24 14:26:32 pve kernel: [32604.880654] EXT4-fs (loop1): previous I/O error to superblock detected
Dec 24 14:26:32 pve kernel: [32604.880659] loop: Write error at byte offset 0, length 4096.
^C
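
For anyone triaging a similar log, here is a minimal Python sketch (not part of the original report) that counts the block-layer I/O errors per device and decodes the Write(10) CDBs shown above. The sample lines are copied from the kern.log excerpt; the function names and the regular expressions are illustrative assumptions, and the CDB decoding assumes the standard 10-byte WRITE(10) layout (opcode 0x2a, big-endian LBA in bytes 2-5, transfer length in bytes 7-8).

```python
import re
from collections import Counter

# Sample lines copied from the kern.log excerpt in this comment.
SAMPLE_LOG = """\
Dec 24 14:22:14 pve kernel: [32346.437739] ata1.00: failed command: FLUSH CACHE EXT
Dec 24 14:26:32 pve kernel: [32604.876349] blk_update_request: I/O error, dev sda, sector 133398032
Dec 24 14:26:32 pve kernel: [32604.876391] sd 0:0:0:0: [sda] tag#22 CDB: Write(10) 2a 00 09 20 d9 38 00 00 08 00
Dec 24 14:26:32 pve kernel: [32604.876482] blk_update_request: I/O error, dev sda, sector 156178744
Dec 24 14:26:32 pve kernel: [32604.876704] sd 0:0:0:0: [sda] tag#26 CDB: Write(10) 2a 00 0c e1 b7 00 00 00 10 00
"""

# Hypothetical patterns written for this log format; adjust for other kernels.
BLK_ERROR = re.compile(r"blk_update_request: I/O error, dev (\w+), sector (\d+)")
WRITE10 = re.compile(r"CDB: Write\(10\) ((?:[0-9a-f]{2} ?){10})")


def decode_write10(cdb_hex):
    """Return (lba, block_count) from a space-separated WRITE(10) CDB string."""
    cdb = bytes.fromhex(cdb_hex.replace(" ", ""))
    if len(cdb) != 10 or cdb[0] != 0x2A:
        raise ValueError("not a WRITE(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2-5: logical block address
    blocks = int.from_bytes(cdb[7:9], "big")  # bytes 7-8: transfer length
    return lba, blocks


def summarize(log_text):
    """Count block I/O errors per device and collect decoded WRITE(10) commands."""
    errors = Counter()
    writes = []
    for line in log_text.splitlines():
        match = BLK_ERROR.search(line)
        if match:
            errors[match.group(1)] += 1
        match = WRITE10.search(line)
        if match:
            writes.append(decode_write10(match.group(1)))
    return errors, writes


if __name__ == "__main__":
    errors, writes = summarize(SAMPLE_LOG)
    for dev, count in errors.items():
        print(f"{dev}: {count} I/O errors")
    for lba, blocks in writes:
        print(f"WRITE(10) lba={lba:#x} blocks={blocks}")
```

Running the same function over the full kern.log instead of SAMPLE_LOG would show whether the failures are confined to sda or spread across devices.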

Comment 4 Vadim Rozenfeld 2015-12-25 05:56:58 UTC
Usually we don't deal with Proxmox or any other products outside of the ones provided by RH.

Just in case, can you try a more recent virtio-win driver release (build 112)?

Vadim.

Comment 5 Shawn Debnath 2015-12-27 23:44:24 UTC
Same issue with build 112. 

I understand this is for a different distribution, but given the kernel and virtio driver combination, it likely is the virtio drivers at fault. Not a 100% of course.

Comment 6 Vadim Rozenfeld 2015-12-28 06:31:33 UTC
(In reply to Shawn Debnath from comment #5)
> Same issue with build 112. 
> 
> I understand this is for a different distribution, but given the kernel and
> virtio driver combination, it likely is the virtio drivers at fault. Not a
> 100% of course.

Thanks,

Li Jin, can we try reproducing this problem on our setups?

Best regards,
Vadim.

Comment 8 Yu Wang 2015-12-29 03:23:01 UTC
QE cannot reproduce this on our setup w/ virtio-blk-device (virtio-win 0.1.102/virtio-win 0.1.106)

Steps as in comment #0

boot cli:

/usr/bin/kvm -id 100 -chardev socket,id=qmp,path=/var/run/qemu-server/100.qmp,server,nowait -mon chardev=qmp,mode=control -vnc unix:/var/run/qemu-server/100.vnc,x509,password -pidfile /var/run/qemu-server/100.pid -daemonize -smbios type=1,uuid=7421a640-7629-40df-8fda-875de2b279ac -name vm100 -smp 1,sockets=1,cores=1,maxcpus=1 -nodefaults -boot menu=on,strict=on,reboot-timeout=1000 -vga std -no-hpet -cpu kvm64,hv_spinlocks=0x1fff,hv_vapic,hv_time,hv_relaxed,+lahf_lm,+sep,+kvm_pv_unhalt,+kvm_pv_eoi,enforce -m 2048 -k en-us -device pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e -device pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f -device piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2 -device usb-tablet,id=tablet,bus=uhci.0,port=1 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3 -iscsi initiator-name=iqn.1993-08.org.debian:01:9677f21380d4 -drive file=/var/lib/vz/template/iso/driver.iso,if=none,id=drive-ide3,media=cdrom,aio=threads -device ide-cd,bus=ide.1,unit=1,drive=drive-ide3,id=ide3,bootindex=200 -drive file=/var/lib/vz/template/iso/en_windows_server_2012_r2_x64_dvd_2707946.iso,if=none,id=drive-ide2,media=cdrom,aio=threads -device ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=201 -drive file=/var/lib/vz/images/100/vm-100-disk-1.qcow2,if=none,id=drive-virtio1,format=qcow2,cache=none,aio=native,detect-zeroes=on -device virtio-blk-pci,drive=drive-virtio1,id=virtio1,bus=pci.0,addr=0xb,bootindex=100 -netdev type=tap,id=net0,ifname=tap100i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown -device e1000,mac=62:32:63:62:34:62,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300 -rtc driftfix=slew,base=localtime -global kvm-pit.lost_tick_policy=discard


Actual result:

Can install win2012R2 successfully w/o any error

Version-Release number of selected component (if applicable):

virtio drivers for Windows 2012R2 amd64 (virtio-win 0.1.102/virtio-win 0.1.106)
distribution: proxmox 4.1-1/2f9650d4
root@pve:~# kvm --version
QEMU emulator version 2.4.1 pve-qemu-kvm_2.4-17, Copyright (c) 2003-2008 Fabrice Bellard

For more info, please refer to the attachment.
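
To make the two setups easier to compare, here is a small Python sketch (not part of the original comment) that diffs the -drive options of the virtio disk from the command line in comment 1 against the one in this comment. The option strings are copied from those command lines; the helper names are hypothetical, and the parser is a simplification that assumes no escaped commas (",,") inside values. A difference found this way does not by itself establish a cause, and the reporter states that other cache settings were also tried.

```python
# -drive option strings for the virtio disk, copied from the two command lines.
REPORTER_DRIVE = (
    "file=/var/lib/vz/images/103/vm-103-disk-1.qcow2,if=none,id=drive-virtio0,"
    "cache=writeback,format=qcow2,aio=threads,detect-zeroes=on"
)
QE_DRIVE = (
    "file=/var/lib/vz/images/100/vm-100-disk-1.qcow2,if=none,id=drive-virtio1,"
    "format=qcow2,cache=none,aio=native,detect-zeroes=on"
)


def parse_drive_opts(opts):
    """Split a comma-separated key=value option string into a dict.

    Simplification: values containing escaped commas (",,") are not handled.
    """
    return dict(item.split("=", 1) for item in opts.split(",") if "=" in item)


def diff_opts(a, b, ignore=("file", "id")):
    """Return {key: (value_in_a, value_in_b)} for options that differ."""
    opts_a, opts_b = parse_drive_opts(a), parse_drive_opts(b)
    keys = sorted((set(opts_a) | set(opts_b)) - set(ignore))
    return {
        key: (opts_a.get(key), opts_b.get(key))
        for key in keys
        if opts_a.get(key) != opts_b.get(key)
    }


if __name__ == "__main__":
    for key, (reporter, qe) in diff_opts(REPORTER_DRIVE, QE_DRIVE).items():
        print(f"{key}: reporter={reporter} qe={qe}")
```

The per-VM file path and drive id are ignored by default, since they are expected to differ between any two guests.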

Comment 9 Yu Wang 2015-12-29 03:24:37 UTC
Created attachment 1110073 [details]
screenshot for installation and device info

Comment 10 Shawn Debnath 2015-12-29 03:28:05 UTC
Curious, were the drives being used to reproduce the issue SSDs?

Comment 11 lijin 2017-08-17 07:16:20 UTC
Hi Shawn,

Is this still an issue for you? Could you try with the latest virtio-win version?

Thanks.

Comment 12 Shawn Debnath 2017-12-18 17:43:44 UTC
No longer an issue as I moved off the platform.