Description of problem:
Disk corruption is reported by both qemu-img check and chkdsk within a win10 guest.

Version-Release number of selected component (if applicable):
qemu 4.1, libvirt 5.6.0, virtio-win 0.1.172

How reproducible:
100% - win10 home 1903-v2 or pro 1903-v1, cache=writethrough or cache=writeback, BIOS or UEFI guests.

Steps to Reproduce:
virt-install \
  --virt-type kvm \
  --name=windows10 \
  --os-variant=win10 \
  --vcpus 2 \
  --cpu host-passthrough \
  --memory 4096 \
  --features kvm_hidden=on \
  --disk path=~/win10.qcow2,size=50,format=qcow2,sparse=true,bus=scsi,discard=unmap,io=threads \
  --controller type=scsi,model=virtio-scsi \
  --graphics spice \
  --channel spicevmc,target_type=virtio,name=com.redhat.spice.0 \
  --video model=qxl,vgamem=32768,ram=131072,vram=131072,heads=1 \
  --network bridge=br0,model=virtio \
  --input type=tablet,bus=virtio \
  --disk ~/virtio-win-0.1.172.iso,device=cdrom \
  --cdrom ~/Win10_1903_V2_English_x64.iso

Actual results:
After a couple of boot/reboot cycles the guest warns of disk errors. "qemu-img check -r all" fixes the qcow2 image but sometimes leaves a totally unbootable VM.

Expected results:
VM boots 100% of the time in a reliable fashion, just like Debian 10 guests running virtio-scsi do. Host SSD is not the culprit; I believe it to be a vioscsi Windows driver bug.

Additional info:
Host SSD is not mounted with the discard option (using fstrim.timer instead), ext4 filesystem. Host runs a 5.2.17-1 kernel.
Hi Li Jin, Can QE try to reproduce this problem? Thanks, Vadim.
(In reply to Vadim Rozenfeld from comment #1)
> Hi Li Jin,
>
> Can QE try to reproduce this problem?
>
> Thanks,
> Vadim.

Hi Yu,

Could you help to reproduce this issue?

Thanks
Some more info on the types of problems found by qemu-img check:

Repairing cluster 196604 refcount=1 reference=0
Repairing cluster 196605 refcount=1 reference=0
Repairing cluster 196606 refcount=1 reference=0
Repairing cluster 196607 refcount=1 reference=0
Repairing OFLAG_COPIED data cluster: l2_entry=8000000279430000 refcount=2
Repairing OFLAG_COPIED data cluster: l2_entry=8000000279440000 refcount=2
Repairing OFLAG_COPIED data cluster: l2_entry=8000000279430000 refcount=2
Repairing OFLAG_COPIED data cluster: l2_entry=8000000279440000 refcount=2
The following inconsistencies were found and repaired:

    4101 leaked clusters
    6 corruptions

Also:

Repairing OFLAG_COPIED data cluster: l2_entry=80000001fdf90000 refcount=2
Repairing OFLAG_COPIED data cluster: l2_entry=800000021a690000 refcount=2
Repairing OFLAG_COPIED data cluster: l2_entry=800000021a6a0000 refcount=2
Repairing OFLAG_COPIED data cluster: l2_entry=800000021a6b0000 refcount=2
Repairing OFLAG_COPIED data cluster: l2_entry=80000001b44c0000 refcount=2
The following inconsistencies were found and repaired:

    1 leaked clusters
    4253 corruptions

Oddly enough Windows can get stuck in a repair/reboot loop as it's detecting errors from chkdsk even when qemu-img isn't.

I cannot reproduce on Win2019 (that seems to mark the drive as not optimizable) so this seems specific to Win10. All the latest spice-tools, qemu-agent and virtio-win drivers are installed.
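For anyone reading these logs: as far as I understand the qcow2 refcount check, a cluster whose stored refcount is higher than the number of references actually found is a leak, while a stored refcount lower than the references found is a corruption. A minimal sketch of that rule in Python follows; the function name and regex are my own (hypothetical), not part of qemu, and the OFLAG_COPIED lines are deliberately left unclassified because they carry no "reference=" field.

```python
import re

# Matches lines such as
#   "Repairing cluster 196604 refcount=1 reference=0"
#   "ERROR cluster 143838 refcount=1 reference=2"
# It does not match the OFLAG_COPIED lines, which have no "reference=" field.
LINE_RE = re.compile(r"cluster (\d+) refcount=(\d+) reference=(\d+)")


def classify(line):
    """Return 'leak', 'corruption', 'ok', or None if the line does not match.

    Assumption (my reading of the check, not taken from this report):
    stored refcount > references found  -> leaked cluster
    stored refcount < references found  -> corruption
    """
    m = LINE_RE.search(line)
    if m is None:
        return None
    refcount, reference = int(m.group(2)), int(m.group(3))
    if refcount > reference:
        return "leak"
    if refcount < reference:
        return "corruption"
    return "ok"


print(classify("Repairing cluster 196604 refcount=1 reference=0"))
print(classify("ERROR cluster 143838 refcount=1 reference=2"))
```

Under that reading, the "refcount=1 reference=0" lines above are the leaked clusters, and lines with "refcount=1 reference=2" would count as corruptions.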
Also the number of errors and fixes seems odd - 3 errors found but 6 listed, then 3 fixed and 9 now reported!

$ qemu-img check win10uefi.qcow2
ERROR cluster 143838 refcount=1 reference=2
ERROR cluster 143839 refcount=1 reference=2
ERROR cluster 143840 refcount=1 reference=2

3 errors were found on the image.
Data may be corrupted, or further writes to the image may corrupt it.
237450/819200 = 28.99% allocated, 11.01% fragmented, 0.00% compressed clusters
Image end offset: 17673814016

$ qemu-img check -r leaks win10uefi.qcow2
ERROR cluster 143838 refcount=1 reference=2
ERROR cluster 143839 refcount=1 reference=2
ERROR cluster 143840 refcount=1 reference=2
ERROR cluster 143838 refcount=1 reference=2
ERROR cluster 143839 refcount=1 reference=2
ERROR cluster 143840 refcount=1 reference=2

3 errors were found on the image.
Data may be corrupted, or further writes to the image may corrupt it.
237450/819200 = 28.99% allocated, 11.01% fragmented, 0.00% compressed clusters
Image end offset: 17673814016

$ qemu-img check win10uefi.qcow2
ERROR cluster 143838 refcount=1 reference=2
ERROR cluster 143839 refcount=1 reference=2
ERROR cluster 143840 refcount=1 reference=2

3 errors were found on the image.
Data may be corrupted, or further writes to the image may corrupt it.
237450/819200 = 28.99% allocated, 11.01% fragmented, 0.00% compressed clusters
Image end offset: 17673814016

$ qemu-img check -r all win10uefi.qcow2
ERROR cluster 143838 refcount=1 reference=2
ERROR cluster 143839 refcount=1 reference=2
ERROR cluster 143840 refcount=1 reference=2
Repairing cluster 143838 refcount=1 reference=2
Repairing cluster 143839 refcount=1 reference=2
Repairing cluster 143840 refcount=1 reference=2
Repairing OFLAG_COPIED data cluster: l2_entry=8000000231df0000 refcount=2
Repairing OFLAG_COPIED data cluster: l2_entry=8000000231de0000 refcount=2
Repairing OFLAG_COPIED data cluster: l2_entry=8000000231e00000 refcount=2
Repairing OFLAG_COPIED data cluster: l2_entry=8000000231de0000 refcount=2
Repairing OFLAG_COPIED data cluster: l2_entry=8000000231df0000 refcount=2
Repairing OFLAG_COPIED data cluster: l2_entry=8000000231e00000 refcount=2
The following inconsistencies were found and repaired:

    0 leaked clusters
    9 corruptions

Double checking the fixed image now...
No errors were found on the image.
237450/819200 = 28.99% allocated, 11.01% fragmented, 0.00% compressed clusters
Image end offset: 17673814016

$ qemu-img check win10uefi.qcow2
No errors were found on the image.
237450/819200 = 28.99% allocated, 11.01% fragmented, 0.00% compressed clusters
Image end offset: 17673814016
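The "3 errors found but 6 listed" mismatch can be checked mechanically: in the "-r leaks" run the six ERROR lines name only three distinct clusters, each listed twice. Likewise the 9 corruptions in the "-r all" run equal the 3 "Repairing cluster" lines plus the 6 OFLAG_COPIED lines. A small sketch that counts listed lines versus distinct clusters; the helper name is hypothetical and the log text is copied from the "-r leaks" output above.

```python
import re

ERROR_RE = re.compile(r"ERROR cluster (\d+) refcount=(\d+) reference=(\d+)")


def summarize_errors(log_text):
    """Return (number of ERROR lines, sorted list of distinct cluster numbers)."""
    matches = ERROR_RE.findall(log_text)
    clusters = sorted({int(cluster) for cluster, _, _ in matches})
    return len(matches), clusters


# ERROR lines from the "qemu-img check -r leaks" run in this comment.
log = """\
ERROR cluster 143838 refcount=1 reference=2
ERROR cluster 143839 refcount=1 reference=2
ERROR cluster 143840 refcount=1 reference=2
ERROR cluster 143838 refcount=1 reference=2
ERROR cluster 143839 refcount=1 reference=2
ERROR cluster 143840 refcount=1 reference=2
"""

listed, distinct = summarize_errors(log)
print(listed, distinct)  # 6 [143838, 143839, 143840]
```

Why qemu-img prints each cluster twice in a repair mode is not established here; my guess is that it makes a second pass over the refcounts when a repair is requested, but that is an assumption.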
Hi,

I cannot reproduce this in my environment (with either qemu or libvirt).

Steps:
1. Boot a win10-64 guest installed from en_windows_10_business_editions_version_1903_x64_dvd_37200948:

virt-install \
  --virt-type kvm \
  --name=windows10 \
  --os-variant=win10 \
  --vcpus 2 \
  --cpu host-passthrough \
  --memory 4096 \
  --features kvm_hidden=on \
  --disk path=~/win10.qcow2,size=50,format=qcow2,sparse=true,bus=scsi,discard=unmap,io=threads \
  --controller type=scsi,model=virtio-scsi \
  --graphics spice \
  --channel spicevmc,target_type=virtio,name=com.redhat.spice.0 \
  --video model=qxl,vgamem=32768,ram=131072,vram=131072,heads=1 \
  --network bridge=br0,model=virtio \
  --input type=tablet,bus=virtio

2. Reboot the guest continuously for 24 hours.

3. Check the image with:

qemu-img check win10-64-virtio-scsi.qcow2
qemu-img check -r leaks win10-64-virtio-scsi.qcow2
qemu-img check -r all win10-64-virtio-scsi.qcow2

Results:
No errors are found on the host.

[root@dell-per440-05 images]# qemu-img check win10-64-virtio-scsi.qcow2
No errors were found on the image.
172971/491520 = 35.19% allocated, 30.01% fragmented, 0.00% compressed clusters
Image end offset: 11803033600
[root@dell-per440-05 images]# qemu-img check -r leaks win10-64-virtio-scsi.qcow2
No errors were found on the image.
172971/491520 = 35.19% allocated, 30.01% fragmented, 0.00% compressed clusters
Image end offset: 11803033600
[root@dell-per440-05 images]# qemu-img check -r all win10-64-virtio-scsi.qcow2
No errors were found on the image.
172971/491520 = 35.19% allocated, 30.01% fragmented, 0.00% compressed clusters
Image end offset: 11803033600

Version:
Guest: en_windows_10_business_editions_version_1903_x64_dvd_37200948
virtio-win-prewhql-172.iso
Host:
qemu-kvm-4.1.0-13.module+el8.1.0+4313+ef76ec61.x86_64
seabios-1.12.0-5.module+el8.1.0+4022+29a53beb.x86_64
kernel-4.18.0-137.el8.x86_64

I cannot find the ISO version you used on MSDN subscriber downloads. Could you try with the ISO for the 1903 release? Maybe this version does not hit the issue.

Or could you upload the ISO you used?

Thanks
Yu Wang
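When scripting step 3 between reboot cycles, the exit status of "qemu-img check" is easier to act on than the text output. The sketch below maps the exit codes listed in the qemu-img man page to a short description. The function names are hypothetical, the image path is whatever the caller passes, and I am assuming the check is only run while the guest is shut down.

```python
import subprocess

# Exit codes for "qemu-img check" as listed in the qemu-img man page.
EXIT_MEANINGS = {
    0: "no errors",
    1: "check not completed (internal error)",
    2: "corruption found",
    3: "leaked clusters found, no corruption",
    63: "checks not supported by this image format",
}


def interpret(code):
    """Translate a qemu-img check exit code into a short description."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def check_image(path):
    """Run a read-only "qemu-img check" and return (exit_code, meaning).

    Assumes qemu-img is on PATH and the guest using the image is shut down.
    """
    proc = subprocess.run(
        ["qemu-img", "check", path],
        capture_output=True,
        text=True,
    )
    return proc.returncode, interpret(proc.returncode)
```

A reboot loop could call check_image() after each guest shutdown and stop at the first result other than 0, which would pin down the cycle on which the corruption first appears.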
It's the regular 1903 image from:

https://www.microsoft.com/en-gb/software-download/windows10ISO

I don't have an MSDN subscription. I had the same issue with the Home and Pro versions.
Tried to reproduce this issue with the 1903 V2 image from https://www.microsoft.com/en-gb/software-download/windows10ISO and still cannot reproduce it. No errors during 24 hours of continuous reboots (same steps as comment #6).

Guest: Win10_1903_V2_English_x64.iso (Home)
virtio-win-prewhql-172.iso
Host:
qemu-kvm-4.1.0-13.module+el8.1.0+4313+ef76ec61.x86_64
seabios-1.12.0-5.module+el8.1.0+4022+29a53beb.x86_64
kernel-4.18.0-137.el8.x86_64

Thanks
Yu Wang
That kernel is hideously old; I'm testing with 5.2/5.3 kernels.

It seems it may be a qcow2 bug rather than virtio-scsi or virtio-win:

https://bugs.launchpad.net/qemu/+bug/1847793
(In reply to bugzilla from comment #9)
> that kernel is hideously old, i'm testing with 5.2/5.3 kernels.
>
> seems it may be a qcow2 bug rather than virtio-scsi or virtio-win:
>
> https://bugs.launchpad.net/qemu/+bug/1847793

Please take a look at https://bugzilla.redhat.com/show_bug.cgi?id=1743176
It might be related as well.

Best,
Vadim.
Just seen qcow2 corruption on a Debian 10 guest, so it's not a virtio-win issue; it is probably one of the multitude of qcow2 corruption bugs in qemu 4.1.