Bug 1296099 - [virtio-win][vioser] Race condition in read and write cancellation logic
Summary: [virtio-win][vioser] Race condition in read and write cancellation logic
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: virtio-win
Version: 7.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Ladi Prosek
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-01-06 10:27 UTC by Ladi Prosek
Modified: 2017-01-06 06:58 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-01-06 06:58:40 UTC
Target Upstream Version:



Description Ladi Prosek 2016-01-06 10:27:02 UTC
Description of problem:
The vioser I/O cancellation logic has a race condition which may result in an I/O request never completing.

How reproducible:
- easy for write, see below
- hard for read, requires specific host side timing

Steps to Reproduce:
1. Issue an async write on a serial port with WriteFile()
2. Cancel the write immediately with CancelIoEx()

Actual results:
The cancel never completes and the process hangs with ~1% probability on my system.

Expected results:
Writes and reads can always be cancelled.

Additional info:
The driver does not properly synchronize between cancellation and regular I/O completion. If the I/O is being completed at about the same time as the EvtRequestCancel is being delivered to the driver, it is possible that both code paths will skip the completion of the I/O request because both will think that the other party has already done it.
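
A minimal sketch of the usual cure for this kind of race, in portable C11 rather than driver code: both paths try to claim the request with one atomic compare-and-swap, and only the winner completes it. The names (request_claim, on_io_done, on_cancel) and the state enum are hypothetical and are not taken from the vioser sources; a real KMDF driver would more likely rely on framework mechanisms such as WdfRequestUnmarkCancelable, so treat this only as an illustration of the ownership rule.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical request states; not taken from the vioser sources. */
enum req_state { REQ_PENDING = 0, REQ_COMPLETED = 1, REQ_CANCELLED = 2 };

struct request {
    _Atomic int state;   /* one of enum req_state */
    int completions;     /* how many times the request was completed */
};

static void request_init(struct request *r)
{
    assert(r != NULL);
    atomic_init(&r->state, REQ_PENDING);
    r->completions = 0;
}

/* Try to take ownership of a pending request.
 * The compare-and-swap succeeds for exactly one caller. */
static bool request_claim(struct request *r, int new_state)
{
    int expected = REQ_PENDING;
    return atomic_compare_exchange_strong(&r->state, &expected, new_state);
}

/* Regular completion path: the device has finished the I/O. */
static bool on_io_done(struct request *r)
{
    if (!request_claim(r, REQ_COMPLETED))
        return false;        /* the cancel path owns the request */
    r->completions++;        /* stands in for completing with success */
    return true;
}

/* Cancellation path: the cancel callback has been delivered. */
static bool on_cancel(struct request *r)
{
    if (!request_claim(r, REQ_CANCELLED))
        return false;        /* the completion path owns the request */
    r->completions++;        /* stands in for completing as cancelled */
    return true;
}
```

Whichever path loses the claim does nothing, so the request is completed exactly once. The bug described above corresponds to both paths concluding that they lost.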

Related reading:
https://msdn.microsoft.com/en-us/library/windows/hardware/ff544726(v=vs.85).aspx

Comment 3 Peixiu Hou 2016-03-23 10:22:46 UTC
Hi Ladi,

About this bug:
I need to reproduce this issue and verify the fix on the latest version.
1. Which vioser driver version was in use when the issue happened?
2. What are the detailed steps to reproduce it? For example: how do I issue an async write on a serial port with WriteFile()? How do I cancel the write immediately with CancelIoEx()? Where can WriteFile() and CancelIoEx() be found?


Thanks~~
Peixiu Hou

Comment 4 Ladi Prosek 2016-03-28 14:58:28 UTC
Hi Peixiu Hou,

You should be able to reproduce this with version 112 and older. One approach to building a repro would be to start with the I/O benchmark (vioserial/benchmark in the source tree) which uses async I/O so the only thing missing is the CancelIoEx (or CancelIo) call right after WriteFile. WriteFile, CancelIo/Ex are Win32 API calls:

https://msdn.microsoft.com/en-us/library/windows/desktop/aa365747(v=vs.85).aspx
https://msdn.microsoft.com/en-us/library/windows/desktop/aa363791(v=vs.85).aspx
https://msdn.microsoft.com/en-us/library/windows/desktop/aa363792(v=vs.85).aspx

Calling Win32 directly using C/C++ may not be the only way though. If the language and framework you're using supports I/O cancellation, you should be able to hit the bug too.

Here's an article about I/O cancellation:

https://msdn.microsoft.com/en-us/library/windows/desktop/aa363789(v=vs.85).aspx

Let me know if this helps. Thanks!

Comment 8 Peixiu Hou 2016-07-12 07:04:27 UTC
Hi Ladi,

I reproduced this bug with virtio-win-prewhql-112.

Running benchmark.exe w com.redhat.rhevm.vdsm resulted in a hang: benchmark.exe stopped producing output and could not be killed.

Verified with virtio-win-prewhql-121. The behavior is the same as on your system: it prints 'CompleteIO failed with error 995' (ERROR_OPERATION_ABORTED) but does not get stuck.

According to the above, the issue has been fixed.

Thank you so much~~


Best Regards~
Peixiu Hou

Comment 9 lijin 2016-08-01 02:58:33 UTC
change status to verified according to comment#8

Comment 10 lijin 2017-01-06 06:58:40 UTC
close as this issue has already been fixed in rhel7.3 virtio-win package

