Bug 1890810 - Win10 disk corruption after TRIM for 4Kn viostor drives
Summary: Win10 disk corruption after TRIM for 4Kn viostor drives
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: virtio-win
Version: 8.3
Hardware: x86_64
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: 8.3
Assignee: Vadim Rozenfeld
QA Contact: menli@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-10-22 21:38 UTC by Vitaliy Gusev
Modified: 2021-03-23 02:52 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-02-16 14:24:38 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments:
Screenshot with message about corruption E drive (40.09 KB, image/png), 2020-10-22 21:38 UTC, Vitaliy Gusev
driver corruption (210.88 KB, image/png), 2020-10-26 10:23 UTC, menli@redhat.com
Lost disk G: after reboot (22.45 KB, image/png), 2020-10-26 10:44 UTC, Vitaliy Gusev
raw format scenario (103.77 KB, image/png), 2020-10-26 13:16 UTC, menli@redhat.com

Description Vitaliy Gusev 2020-10-22 21:38:31 UTC
Created attachment 1723631 [details]
Screenshot with message about corruption E drive

Description of problem:

With 4Kn virtual disks (virtio block devices), Windows 10 sees disk corruption after TRIM.


Version-Release number of selected component (if applicable):

   08/10/2020,100.83.104.18900

How reproducible: Always, whenever a discard command is sent.


Steps to Reproduce:
 1. Using bhyve, pass a 4Kn virtio block device to the VM (Windows 10):
    bhyve -A -H -c 2 -m 2147483648 -s 0,hostbridge -s 11,virtio-blk,$vm_disk,sectorsize=4096/4096 ...

 2. Create some large files, then delete them.
 3. Reboot, or run this command in PowerShell:
    "Optimize-Volume -DriveLetter E -ReTrim -Verbose"


Actual results:

Disk E becomes corrupted, or even cannot be accessed from the OS.

Expected results:

No corruption

Additional info:

Comment 1 Vadim Rozenfeld 2020-10-23 00:04:01 UTC
Can QE try reproducing this issue on RHEL/upstream qemu?

Thanks,
Vadim.

Comment 2 menli@redhat.com 2020-10-26 02:18:50 UTC
Sorry, I have no upstream environment. I tested with the following but could not reproduce it.

test env:

qemu-kvm-5.1.0-14.module+el8.3.0+8438+644aff69.x86_64


qemu command line:

 /usr/libexec/qemu-kvm \
    -name 'avocado-vt-vm3' \
    -machine q35 \
    -nodefaults \
    -vga std  \
    -device pcie-root-port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x3 \
    -device pcie-root-port,port=0x11,chassis=2,id=pci.2,bus=pcie.0,addr=0x3.0x1 \
    -device pcie-root-port,port=0x12,chassis=3,id=pci.3,bus=pcie.0,addr=0x3.0x2 \
    -device pcie-root-port,port=0x13,chassis=4,id=pci.4,bus=pcie.0,addr=0x3.0x3 \
    -device pcie-root-port,port=0x14,chassis=5,id=pci.5,bus=pcie.0,addr=0x3.0x4 \
    -device pcie-root-port,port=0x15,chassis=6,id=pci.6,bus=pcie.0,addr=0x3.0x5 \
    -device pcie-root-port,port=0x16,chassis=7,id=pci.7,bus=pcie.0,addr=0x3.0x6 \
    -device pcie-root-port,port=0x17,chassis=8,id=pci.8,bus=pcie.0,addr=0x3.0x7 \
    -drive file=/home/test/win10.qcow2,if=none,id=drive-ide0-0-0,format=qcow2,cache=none -device ide-drive,drive=drive-ide0-0-0,id=ide0-0-0,bus=ide.0,unit=0,bootindex=0 \
    -blockdev node-name=file_stg1,driver=file,cache.direct=on,cache.no-flush=off,filename=/home/test/storage.qcow2,aio=threads \
    -blockdev node-name=drive_stg1,driver=qcow2,cache.direct=on,cache.no-flush=off,file=file_stg1 \
    -device virtio-blk-pci,id=stg1,drive=drive_stg1,bus=pci.8,addr=0x0,logical_block_size=4096,physical_block_size=4096 \
    -device virtio-net-pci,mac=9a:36:83:b6:3d:05,id=idJVpmsF,netdev=id23ZUK6,bus=pci.3  \
    -netdev tap,id=id23ZUK6,vhost=on,script=/etc/qemu-ifup\
    -m 4G  \
    -smp 4,maxcpus=4 \
    -cpu 'Skylake-Server',hv_stimer,hv_synic,hv_vpindex,hv_reset,hv_relaxed,hv_spinlocks=0x1fff,hv_vapic,hv_time,hv-tlbflush,+kvm_pv_unhalt \
    -cdrom /home/kvm_autotest_root/iso/windows/virtio-win-prewhql-0.1-189.iso  \
    -device piix3-usb-uhci,id=usb -device usb-tablet,id=input0 \
    -vnc :11  \
    -rtc base=localtime,clock=host,driftfix=slew  \
    -boot order=cdn,once=c,menu=off,strict=off \
    -enable-kvm \
    -qmp tcp:0:1232,server,nowait \
    -monitor stdio


I searched for some information. Could you try the following to check whether it fixes the issue?

1. Open Computer by clicking the Start button, and then clicking Computer.
2. Right-click the drive that you want to check, and then click Properties.
3. Click the Tools tab, and then, under Error-checking, click Check now. If you're prompted for an administrator password or confirmation, type the password or provide confirmation.

To automatically repair problems with files and folders that the scan detects, select Automatically fix file system errors. Otherwise, the disk check will report problems but not fix them.

To perform a thorough check, select Scan for and attempt recovery of bad sectors. This scan attempts to find and repair physical errors on the drive itself, and it can take much longer to complete.

To check for both file errors and physical errors, select both Automatically fix file system errors and Scan for and attempt recovery of bad sectors.

4. Click Start.

Depending on the size of your drive, this might take several minutes. For best results, don't use your computer for any other tasks while it is checking for errors.

Comment 3 Vitaliy Gusev 2020-10-26 08:55:51 UTC
Meni, what does the command "fsutil fsinfo sectorInfo F:" report in Windows (replace F with your drive letter)?


The output should look like this:

       LogicalBytesPerSector :                                 4096
       PhysicalBytesPerSectorForAtomicity :                    4096
       PhysicalBytesPerSectorForPerformance :                  4096
       FileSystemEffectivePhysicalBytesPerSectorForAtomicity : 4096
       Device Alignment :                                      Aligned (0x000)
       Partition alignment on device :                         Aligned (0x000)
       Performs Normal Seeks
       Trim Supported

Make sure that "Trim Supported" is shown. 


Next, please run this command in PowerShell (replace F with your drive letter):


     Optimize-Volume -DriveLetter F -ReTrim -Verbose


Wait 10-30 seconds and reboot the VM. With the "raw" format, disk F should become unavailable.
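
The precondition check described in this comment (4096-byte sectors and "Trim Supported") could be automated along these lines. This is only a minimal Python sketch: the helper names parse_sector_info and is_4kn_with_trim are hypothetical, and the parsing assumes the output layout quoted above, which may differ between Windows versions.

```python
def parse_sector_info(text):
    """Split `fsutil fsinfo sectorInfo` output into key/value fields and bare flags.

    Lines containing a colon are treated as "key : value" fields; the
    remaining non-empty lines (e.g. "Trim Supported") are treated as flags.
    """
    fields = {}
    flags = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if ":" in line:
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
        else:
            flags.append(line)
    return fields, flags


def is_4kn_with_trim(text):
    """Return True if the volume reports 4096-byte sectors and TRIM support."""
    fields, flags = parse_sector_info(text)
    return (
        fields.get("LogicalBytesPerSector") == "4096"
        and fields.get("PhysicalBytesPerSectorForAtomicity") == "4096"
        and "Trim Supported" in flags
    )
```

The text passed in would be the captured output of "fsutil fsinfo sectorInfo F:" from the guest. If the function returns False, the disk does not match the 4Kn-plus-TRIM setup this bug is about.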

Comment 4 Vitaliy Gusev 2020-10-26 09:30:01 UTC
With the "qcow2" format, you should see drive corruption during Tools -> Error-checking or during the boot stage (automatic filesystem check).

Comment 5 menli@redhat.com 2020-10-26 10:23:41 UTC
Created attachment 1724137 [details]
driver corruption

Comment 6 menli@redhat.com 2020-10-26 10:28:23 UTC
Vitaliy, thank you for highlighting the checkpoint.

Reproduced it on build:

qemu-kvm-core-5.1.0-14.module+el8.3.0+8438+644aff69.x86_64
virtio-win-prewhql-0.1-189.iso


I see the drive corruption during the boot stage, but did not hit the "raw" format scenario.

Corrected qemu command line from comment 2, which was missing the discard option:

/usr/libexec/qemu-kvm \
    -name 'avocado-vt-vm3' \
    -machine q35 \
    -nodefaults \
    -vga std  \
    -device pcie-root-port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x3 \
    -device pcie-root-port,port=0x11,chassis=2,id=pci.2,bus=pcie.0,addr=0x3.0x1 \
    -device pcie-root-port,port=0x12,chassis=3,id=pci.3,bus=pcie.0,addr=0x3.0x2 \
    -device pcie-root-port,port=0x13,chassis=4,id=pci.4,bus=pcie.0,addr=0x3.0x3 \
    -device pcie-root-port,port=0x14,chassis=5,id=pci.5,bus=pcie.0,addr=0x3.0x4 \
    -device pcie-root-port,port=0x15,chassis=6,id=pci.6,bus=pcie.0,addr=0x3.0x5 \
    -device pcie-root-port,port=0x16,chassis=7,id=pci.7,bus=pcie.0,addr=0x3.0x6 \
    -device pcie-root-port,port=0x17,chassis=8,id=pci.8,bus=pcie.0,addr=0x3.0x7 \
    -drive file=/home/kvm_autotest_root/images/win10-64-virtio.qcow2,if=none,id=drive-ide0-0-0,format=qcow2,cache=none -device ide-drive,drive=drive-ide0-0-0,id=ide0-0-0,bus=ide.0,unit=0,bootindex=0 \
    -blockdev node-name=file_stg2,driver=file,cache.direct=on,cache.no-flush=off,filename=/home/test/storage.qcow2,aio=threads,discard=unmap \
    -blockdev node-name=drive_stg2,driver=qcow2,cache.direct=on,cache.no-flush=off,file=file_stg2,discard=unmap \
    -device virtio-blk-pci,id=stg2,drive=drive_stg2,bus=pci.4,addr=0x0,logical_block_size=4096,physical_block_size=4096 \
    -device virtio-net-pci,mac=9a:36:83:b6:3d:05,id=idJVpmsF,netdev=id23ZUK6,bus=pci.3  \
    -netdev tap,id=id23ZUK6,vhost=on,script=/etc/qemu-ifup\
    -m 4G  \
    -smp 4,maxcpus=4 \
    -cpu 'Skylake-Server',hv_stimer,hv_synic,hv_vpindex,hv_reset,hv_relaxed,hv_spinlocks=0x1fff,hv_vapic,hv_time,hv-tlbflush,+kvm_pv_unhalt \
    -device piix3-usb-uhci,id=usb -device usb-tablet,id=input0 \
    -vnc :11  \
    -rtc base=localtime,clock=host,driftfix=slew  \
    -boot order=cdn,once=c,menu=off,strict=off \
    -enable-kvm \
    -qmp tcp:0:1232,server,nowait \
    -monitor stdio
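
Since the only functional change from comment 2 is discard=unmap on the two -blockdev nodes, that kind of omission can be caught before booting the guest. The following is a minimal Python sketch under stated assumptions: the helper name blockdevs_missing_discard is hypothetical, and it only parses the command-line text; it does not query QEMU.

```python
import shlex


def blockdevs_missing_discard(command_line):
    """Return the node-names of -blockdev entries that lack discard=unmap.

    `command_line` is the qemu-kvm invocation as one string, optionally
    spread over several lines with trailing backslashes.
    """
    # Join backslash-continued lines and drop a dangling trailing backslash,
    # which shlex would otherwise reject as an incomplete escape.
    joined = command_line.replace("\\\n", " ").rstrip().rstrip("\\")
    args = shlex.split(joined)
    missing = []
    for flag, value in zip(args, args[1:]):
        if flag != "-blockdev":
            continue
        # Each -blockdev value is a comma-separated list of key=value options.
        opts = dict(item.partition("=")[::2] for item in value.split(","))
        if opts.get("discard") != "unmap":
            missing.append(opts.get("node-name", "<unnamed>"))
    return missing
```

Applied to the command line in comment 2 this would report file_stg1 and drive_stg1; applied to the corrected command line above it should return an empty list.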


Thanks
Menghuan

Comment 7 Vitaliy Gusev 2020-10-26 10:42:33 UTC
Menghuan, to completely lose the drive, you should prepare a qcow2 disk with the same cluster_size as the NTFS block size.

For instance, I created a qcow2 disk with "-o cluster_size=32K" and formatted NTFS with a 32K allocation unit size.

After "Optimize-Volume -DriveLetter G -ReTrim -Verbose" and a reboot, drive G was completely lost (I will attach a screenshot).

Comment 8 Vitaliy Gusev 2020-10-26 10:44:52 UTC
Created attachment 1724138 [details]
Lost disk G: after reboot

Comment 9 menli@redhat.com 2020-10-26 13:15:23 UTC
Vitaliy, thank you for providing the details.


I tried 10 times: I hit the "raw" format scenario once, and always hit the scenario "drive corruption during Tools -> Error-checking or during the boot stage".

1. Prepare a qcow2 disk:

qemu-img create -f qcow2 storage.qcow2 -o cluster_size=32k 20g

2. Format NTFS with a 32K allocation unit size.

3. Boot the guest as in comment 6.

4. Execute "Optimize-Volume -DriveLetter G -ReTrim -Verbose" in PowerShell.

5. Reboot the guest.
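
Steps 1 and 2 depend on the qcow2 cluster size matching the NTFS allocation unit size, as comment 7 points out. A minimal Python sketch that builds the qemu-img arguments only when the two sizes agree; the helper names parse_size and qemu_img_create_args are hypothetical, and only K/M/G suffixes are handled.

```python
def parse_size(text):
    """Convert a size such as '32K', '32k' or '20g' to bytes (K, M, G suffixes only)."""
    units = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}
    text = text.strip().lower()
    if text and text[-1] in units:
        return int(text[:-1]) * units[text[-1]]
    return int(text)


def qemu_img_create_args(path, disk_size, cluster_size, ntfs_unit):
    """Build the `qemu-img create` argument list for the test disk.

    Raises ValueError if the qcow2 cluster size differs from the NTFS
    allocation unit size, since the scenario in comment 7 needs them equal.
    """
    if parse_size(cluster_size) != parse_size(ntfs_unit):
        raise ValueError(
            f"cluster_size {cluster_size} does not match NTFS unit {ntfs_unit}"
        )
    return [
        "qemu-img", "create", "-f", "qcow2",
        "-o", f"cluster_size={cluster_size}",
        path, disk_size,
    ]
```

The returned list can be passed to subprocess.run on a host where qemu-img is installed; formatting the volume with the matching allocation unit size still has to be done inside the Windows guest.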



Thanks

Menghuan

Comment 10 menli@redhat.com 2020-10-26 13:16:38 UTC
Created attachment 1724153 [details]
raw format scenario

Comment 11 Vadim Rozenfeld 2020-10-27 01:56:16 UTC
Hi Menghuan,

If you can reproduce the problem with the official viostor driver,
could you try this one, available at http://people.redhat.com/vrozenfe/4Vitaliy.zip,
and let us know if it helps solve the problem?
This driver was built outside the brew build system, and it doesn't have a proper
version number. So you might need to uninstall the already installed driver
first and then install the new one, rather than trying to update the viostor driver.

Best,
Vadim.

Comment 12 menli@redhat.com 2020-10-27 07:59:19 UTC
Using the same qemu command line as comment 6 and installing the driver from http://people.redhat.com/vrozenfe/4Vitaliy.zip as described in comment 11, I tried 10 times and could not reproduce it.




Thanks

Menghuan

Comment 13 Vadim Rozenfeld 2020-10-27 08:03:30 UTC
(In reply to menli from comment #12)
> Use the same qemu command line as comment 6, just install the driver
> http://people.redhat.com/vrozenfe/4Vitaliy.zip  like comment 11, try 10
> times ,not reproduce it.
> 
> 
> 
> 
> Thanks
> 
> Menghuan

Thank you, Menghuan.

Comment 15 Vadim Rozenfeld 2020-11-12 05:08:34 UTC
Please verify with the latest drivers from build 190:
https://brewweb.engineering.redhat.com/brew/buildinfo?buildID=1383208

Comment 16 menli@redhat.com 2020-11-12 14:00:20 UTC
Reproduced with virtio-win-prewhql-0.1-189.iso (same steps as comment 0); the result is as in comment 10.
Verified with virtio-win-prewhql-0.1-190.iso (same steps as comment 0): no drive corruption.
So this issue is fixed; changing status to VERIFIED.


Thanks 

Menghuan

Comment 19 errata-xmlrpc 2021-02-16 14:24:38 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (virtio-win bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2021:0535

