Bug 1067225

Summary: Windows guest performing out-of-bounds accesses on virtio device
Product: Red Hat Enterprise Linux 6 Reporter: David Gibson <dgibson>
Component: virtio-win Assignee: Vadim Rozenfeld <vrozenfe>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: high Docs Contact:
Priority: high    
Version: 6.4 CC: adevolder, areis, dgibson, dgilbert, jherrman, juzhang, knoel, lijin, mdeng, michen, mkalinin, rbalakri, rpacheco, vrozenfe
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Previously, Windows guests in some cases issued out-of-bounds read and write requests to a virtio device, which caused the guest to become unresponsive. Now, the Windows virtio-blk device driver performs logical block addressing (LBA) sanity checks before submitting requests to QEMU, and does not submit requests that are out of bounds. As a result, the described problem no longer occurs.
Story Points: ---
Clone Of:
: 1195487 Environment:
Last Closed: 2016-05-10 16:35:27 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1195487    
Attachments:
Description Flags
Patch showing diagnostic alterations to qemu none
couldnotshutdown none
rhevandspice-install-service-up none
fromdevicemanager none
nclogfromhost none

Description David Gibson 2014-02-19 23:59:02 UTC
Description of problem:

NOTE: virtio-win component is just a guess.  The problem is definitely somewhere inside the Windows guest, and virtio-win is the only plausibly responsible component that is RH provided, so we need to either find the bug there, or rule it out.

Several Windows Server 2008R2 guests under RHEV were periodically pausing due to an IO error.  Adding debugging to qemu showed that these EIO errors were because the guest itself was attempting virtio accesses beyond the logical end of device.

Version-Release number of selected component (if applicable):

virtio-win-1.4.0.iso (I think)

How reproducible:

Customer has several systems which trigger this problem daily.

Steps to Reproduce:

Happens daily on customer system.  Usually about the same time each day, so it's probably triggered by a scheduled job, but the exact sequence of guest-side events which trigger this problem is unknown so far.

Actual results:

Guest issues out-of-bounds accesses to the virtio virtual hardware, causing qemu to pause the guest.

Expected results:

virtio-win drivers reject out of bounds access requests and report them on the guest side.

Additional info:

The accesses have a strange pattern.  They appear to come in a burst: the first is to the sector immediately beyond the logical end of the device, followed by several more, each 2-4x as far beyond the end of the device as the previous one.

qemu diagnostics with more details coming.
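
For anyone building a reproducer, the burst pattern described above can be imitated with a small helper. This is only a sketch: make_oob_burst, its parameters, and the use of a single fixed growth factor are assumptions for illustration and are not taken from the qemu diagnostics. Sectors are assumed to be numbered from 0, so the first sector beyond the end has the same number as the capacity.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: fill out[0..n) with sector numbers that imitate the
 * reported burst. The first entry is the sector immediately past the end of
 * the device; each later entry is `factor` times as far past the end as the
 * previous one. Returns the number of entries written (fewer than n if the
 * arithmetic would overflow, 0 if factor is outside the reported 2-4 range).
 */
static size_t make_oob_burst(uint64_t capacity_sectors, uint64_t factor,
                             uint64_t *out, size_t n)
{
    uint64_t distance = 1; /* 1 == first sector past the last valid one */
    size_t count = 0;

    if (factor < 2 || factor > 4)
        return 0;

    while (count < n) {
        uint64_t offset = distance - 1;

        /* Stop rather than wrap around the 64-bit sector space. */
        if (capacity_sectors > UINT64_MAX - offset)
            break;
        out[count++] = capacity_sectors + offset;

        if (distance > UINT64_MAX / factor)
            break;
        distance *= factor;
    }
    return count;
}
```

With a 100-sector device and a factor of 2, this yields sectors 100, 101, 103 and 107, that is, distances of 1, 2, 4 and 8 past the last valid sector.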

Comment 3 David Gibson 2014-02-20 00:21:07 UTC
Created attachment 865302 [details]
Patch showing diagnostic alterations to qemu

This is the debug patch I used to generate the diagnostics for out of bounds accesses.  The code can also be found in the 'private-dgibson-sfdc01018528' branch of the qemu-kvm-rhev rhpkg tree.

Comment 4 Mike Cao 2014-02-20 01:27:50 UTC
Can you try to reproduce it with virtio-win-1.6.8-4?
Do you know which app is running when this issue occurs?

Comment 9 Mike Cao 2014-04-09 03:29:42 UTC
This bug may be a dup of https://bugzilla.redhat.com/show_bug.cgi?id=1080996

Comment 10 David Gibson 2014-04-10 00:45:06 UTC
@Mike,

This is closely related to bug 1080996, but it's not a dupe per se.

Bug 1080996 (itself a dupe of 1064643) covers the fact that qemu and the stack above it don't deal well with errors like this, which are unambiguously the guest's fault.  The proposed fix is to classify errors caused by guest parameters separately, and (usually) report them rather than pausing the VM.

This bug is addressing the fact that the guest is initiating the bad accesses in the first place.  The proposal is to either fix the virtio-win drivers so it doesn't make these accesses, or to determine that the problem lies within something else on the guest side.

Comment 11 Vadim Rozenfeld 2014-09-17 05:59:16 UTC
Hi Mike,

Can we try reproducing this problem on a fresh system with the following apps
installed:

Sophos Anti-Virus
Sophos AutoUpdate
Sophos Remote Management System

Thanks,
Vadim.

Comment 12 Mike Cao 2014-09-18 02:59:44 UTC
mdeng, please handle the needinfo per comment #11.

Comment 14 Min Deng 2014-10-21 07:07:33 UTC
Created attachment 948817 [details]
couldnotshutdown

Comment 15 Min Deng 2014-10-21 07:08:19 UTC
Created attachment 948818 [details]
rhevandspice-install-service-up

Comment 16 Min Deng 2014-10-21 07:09:00 UTC
Created attachment 948819 [details]
fromdevicemanager

Comment 17 Min Deng 2014-10-21 07:11:16 UTC
Created attachment 948820 [details]
nclogfromhost

Comment 19 Vadim Rozenfeld 2015-01-16 07:46:27 UTC
(In reply to David Gibson from comment #10)

> This bug is addressing the fact that the guest is initiating the bad
> accesses in the first place.  The proposal is to either fix the virtio-win
> drivers so it doesn't make these accesses, or to determine that the problem
> lies within something else on the guest side.

It looks like some application(s), presumably Sophos Anti-Virus, perform raw reads/writes, bypassing the file system driver.
I will add an extra sanity check to validate IO boundaries.
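
A minimal sketch of what such a boundary check could look like, assuming the driver knows the device capacity in sectors. The name request_in_bounds and its parameters are hypothetical, and this is not the actual viostor code. It only illustrates comparing the transfer length against the remaining capacity instead of computing lba + sectors, which could wrap around for very large LBAs.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true when the range [lba, lba + sectors)
 * lies entirely within a device of capacity_sectors sectors.
 */
static bool request_in_bounds(uint64_t lba, uint32_t sectors,
                              uint64_t capacity_sectors)
{
    /* This sketch treats zero-length transfers as invalid (an assumption). */
    if (sectors == 0)
        return false;

    /* The starting sector itself must exist. */
    if (lba >= capacity_sectors)
        return false;

    /* capacity_sectors - lba cannot underflow here, so no wraparound. */
    return sectors <= capacity_sectors - lba;
}
```

A request that fails this check would be completed with an error on the guest side rather than being submitted to QEMU, which matches the expected results in the description.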

Comment 25 Vadim Rozenfeld 2015-03-03 02:52:19 UTC
Should be fixed in build 101, available at http://download.devel.redhat.com/brewroot/packages/virtio-win-prewhql/0.1/101/win/virtio-win-prewhql-0.1.zip

Comment 26 lijin 2015-04-09 06:52:46 UTC
Cannot reproduce this issue from the QE side; the guest works fine.

package info:
kernel-2.6.32-540.el6.x86_64
qemu-kvm-rhev-0.12.1.2-2.415.el6_5.14.x86_64
virtio-win-1.6.4-1.el6_4.noarch/virtio-win-prewhql-102
seabios-0.6.1.2-29.el6.x86_64

steps:
1. Boot a win2k8R2 guest with a virtio-blk device.
2. Install "Sophos Endpoint Security and Control 10.3", which includes "Sophos Anti-Virus" and "Sophos AutoUpdate", on the guest.
3. Configure the Sophos scheduled computer scan and autoupdate.
4. Keep the guest running for two days.

The guest works fine after Sophos Anti-Virus scans the computer.
I got the following error messages after Sophos AutoUpdate tried to update; I guess this is because I have no product license:
Message: ERROR:   Download of Sophos Endpoint Security and Control failed from server sophos
Message: ERROR:   Could not find a source for updated packages

Comment 27 Vadim Rozenfeld 2015-05-24 06:13:22 UTC
Please re-check again with the latest build http://download.devel.redhat.com/brewroot/packages/virtio-win-prewhql/0.1/104/win/virtio-win-prewhql-0.1.zip

Thanks,
Vadim.

Comment 28 lijin 2015-05-28 02:44:29 UTC
Mike, please verify the bug with build 104.

Comment 29 Mike Cao 2015-06-02 05:19:28 UTC
(In reply to Vadim Rozenfeld from comment #27)
> Please re-check again with the latest build
> http://download.devel.redhat.com/brewroot/packages/virtio-win-prewhql/0.1/
> 104/win/virtio-win-prewhql-0.1.zip
> 
> Thanks,
> Vadim.

Vadim, I cannot find Sophos Endpoint Security and Control 10.3 on the internet.
Can you suggest some other similar tools instead?

Mike

Comment 30 Vadim Rozenfeld 2015-06-02 06:03:06 UTC
(In reply to Mike Cao from comment #29)
> (In reply to Vadim Rozenfeld from comment #27)
> > Please re-check again with the latest build
> > http://download.devel.redhat.com/brewroot/packages/virtio-win-prewhql/0.1/
> > 104/win/virtio-win-prewhql-0.1.zip
> > 
> > Thanks,
> > Vadim.
> 
> Vadim, I cannot find Sophos Endpoint Security and Control 10.3 on the
> internet.
> Can you suggest some other similar tools instead?

Hi Mike,
No idea. But if needed, I can create a simple app which will do out-of-bound IOs.

Best regards,
Vadim.


Comment 31 Mike Cao 2015-06-02 06:16:23 UTC
(In reply to Vadim Rozenfeld from comment #30)
> (In reply to Mike Cao from comment #29)
> > (In reply to Vadim Rozenfeld from comment #27)
> > > Please re-check again with the latest build
> > > http://download.devel.redhat.com/brewroot/packages/virtio-win-prewhql/0.1/
> > > 104/win/virtio-win-prewhql-0.1.zip
> > > 
> > > Thanks,
> > > Vadim.
> > 
> > Vadim, I cannot find Sophos Endpoint Security and Control 10.3 on the
> > internet.
> > Can you suggest some other similar tools instead?
> 
> Hi Mike,
> No idea. But if needed, I can create a simple app which will do out-of-bound
> IOs.
> 
> Best regards,
> Vadim.

Hi Vadim,

Please do.

Thanks,
Mike

Comment 34 lijin 2015-12-23 07:52:55 UTC
As rhel6.8 will ship the same viostor version as rhel7.2, and the same bug (1195487) has been verified on rhel7.2, changing status to verified.

Comment 36 errata-xmlrpc 2016-05-10 16:35:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-1011.html

Comment 37 Ladi Prosek 2016-10-26 16:22:15 UTC
*** Bug 1388553 has been marked as a duplicate of this bug. ***