Bug 1879388 - red hat virtio scsi disk device 6.3.9600.18758
Summary: red hat virtio scsi disk device 6.3.9600.18758
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: virtio-win
Version: ---
Hardware: x86_64
OS: Windows
Priority: unspecified
Severity: urgent
Target Milestone: rc
Target Release: ---
Assignee: Vadim Rozenfeld
QA Contact: Peixiu Hou
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-09-16 07:35 UTC by Evgen Puzanov
Modified: 2023-03-14 19:34 UTC
CC List: 5 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-07-03 23:34:57 UTC
Type: Bug
Target Upstream Version:
Embargoed:



Description Evgen Puzanov 2020-09-16 07:35:35 UTC
Hello,
 
We have a cloud environment that consists of hosts backed by KVM hypervisors. About half of the virtual machines run Windows Server operating systems (2008R2, 2012R2, 2016, 2019); there are hundreds of such instances, and almost all of them use VirtIO drivers (mostly 0.1.160).
 
Sometimes (it has occurred about 3-4 times) we encounter the following glitch: a guest operating system decides that its primary disk storage is larger than it actually is. For example, an instance had a 200 GB virtual drive that had worked fine for years, but at some moment (no one knows exactly when) the primary partition (we mean "drive C:", which is usually the 2nd partition, as the 1st one is used by the operating system) became 210 GB out of the blue. After that, the system event log started filling with the following error message: `The driver detected a controller error on \Device\Harddisk0\DR`. Obviously, this happens when the operating system tries to write data to sectors that don't exist.
 
Once we expand the virtual drive to 210 GB, the error messages no longer appear. Still, after that we find some of the data corrupted (perhaps fragments of files were written to the non-existent sectors), so this is a real problem for us when it happens.
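
For reference, a quick way to cross-check what the guest believes against the host-side image (a minimal sketch; the image path below is a placeholder, adjust for your storage backend):

  # On the KVM host: print the virtual size of the backing image.
  qemu-img info /var/lib/libvirt/images/guest-disk.qcow2 | grep 'virtual size'

  # Compare this against what the guest itself reports (e.g. Get-Disk in
  # PowerShell on Server 2012 and newer, or diskpart's "list disk" on
  # 2008R2). If the guest-reported size is larger, writes can land past
  # the end of the image, which matches the controller errors above.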
 
Alas, we haven't found a way to reproduce this. As stated above, it has happened only 3-4 times, though each time the outcome was quite unpleasant.
 
Should we provide more data regarding this issue? Should we consider upgrading the driver? Perhaps this is a bug that was already fixed in a release after 0.1.160 and we simply don't know it? Just curious: has anyone filed a similar bug report before? We tried to find one, though with no luck.
 
Thanks in advance for your feedback.

Comment 6 Peixiu Hou 2022-05-24 04:16:48 UTC
Hi Evgen Puzanov,

Sorry for the late reply on this issue~
I want to try to reproduce it, but I need some information about your environment:

1) What kind of cloud environment were you using? RHV, OpenStack, CNV, or others?
2) What version of the KVM hypervisor were you using, and what is the host kernel version? What is the BIOS mode, SeaBIOS or OVMF?
3) Did you hit the issue again later? If possible, could you provide the VM's QEMU command line? On the host, run "ps aux | grep qemu" to get this info (see the example commands below).
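
For example, on a libvirt/QEMU host you could collect these with something like the following (binary paths vary by distribution, so treat them as illustrative):

  # Host kernel version (question 2)
  uname -r

  # QEMU/KVM version (on other distributions the binary may be
  # qemu-system-x86_64 instead of /usr/libexec/qemu-kvm)
  /usr/libexec/qemu-kvm --version

  # Full QEMU command line of the running guest (question 3);
  # the [q] keeps grep from matching its own process
  ps aux | grep [q]emu

  # BIOS mode (question 2): an OVMF guest's command line references an
  # OVMF/edk2 firmware image; otherwise it is SeaBIOS by default
  ps aux | grep -i ovmf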

Thanks a lot~
Peixiu

