Description of problem:
The vioser I/O cancellation logic has a race condition which may result in an I/O request never completing.

How reproducible:
- easy for write, see below
- hard for read, requires specific host side timing

Steps to Reproduce:
1. Issue an async write on a serial port with WriteFile()
2. Cancel the write immediately with CancelIoEx()

Actual results:
The cancel never completes and the process hangs with ~1% probability on my system.

Expected results:
Writes and reads can always be cancelled.

Additional info:
The driver does not properly synchronize between cancellation and regular I/O completion. If the I/O is being completed at about the same time as the EvtRequestCancel is being delivered to the driver, it is possible that both code paths will skip the completion of the I/O request because both will think that the other party has already done it.

Related reading:
https://msdn.microsoft.com/en-us/library/windows/hardware/ff544726(v=vs.85).aspx
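
To illustrate the race described under "Additional info", here is a minimal, portable C11 sketch of one common way to avoid it: both the completion path and the cancel path try to claim the request with a single atomic compare-and-exchange, and only the winner completes it. This is a model under stated assumptions, not the vioser driver's code or its actual fix. The names struct request, request_claim, on_io_done and on_cancel are hypothetical; 995 is the Win32 ERROR_OPERATION_ABORTED value.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical request model; NOT the vioser driver's real structures. */
enum req_state { REQ_PENDING = 0, REQ_OWNED = 1 };

struct request {
    atomic_int state;   /* REQ_PENDING until exactly one path claims it */
    int completions;    /* how many times the request was completed */
    int status;         /* 0 = success, 995 = ERROR_OPERATION_ABORTED */
};

static void request_init(struct request *r)
{
    atomic_init(&r->state, REQ_PENDING);
    r->completions = 0;
    r->status = -1;     /* not completed yet */
}

/* Returns true for exactly one caller, no matter how the two paths
 * interleave. The loser must not touch the request afterwards. */
static bool request_claim(struct request *r)
{
    int expected = REQ_PENDING;
    return atomic_compare_exchange_strong(&r->state, &expected, REQ_OWNED);
}

/* Regular completion path (e.g. the host consumed the buffer). */
static bool on_io_done(struct request *r)
{
    if (!request_claim(r))
        return false;   /* the cancel path already owns the request */
    r->status = 0;
    r->completions++;
    return true;
}

/* Cancellation path (the role EvtRequestCancel plays in the driver). */
static bool on_cancel(struct request *r)
{
    if (!request_claim(r))
        return false;   /* the completion path already owns the request */
    r->status = 995;
    r->completions++;
    return true;
}
```

The point of the design is that ownership is decided by one atomic step rather than by each path inferring what the other has done, so the "both sides skip completion" outcome cannot occur. Whichever path calls request_claim first completes the request exactly once, and the other path backs off.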
Hi Ladi,

About this bug: I need to reproduce this issue and verify the fix on the latest version.
1. Which vioser driver version was in use when the issue occurred?
2. What are the detailed steps to reproduce it? For example, how do I issue an async write on a serial port with WriteFile(), and how do I cancel the write immediately with CancelIoEx()? Where can WriteFile() and CancelIoEx() be found?

Thanks~~
Peixiu Hou
Hi Peixiu Hou,

You should be able to reproduce this with version 112 and older. One approach to building a repro would be to start with the I/O benchmark (vioserial/benchmark in the source tree) which uses async I/O so the only thing missing is the CancelIoEx (or CancelIo) call right after WriteFile.

WriteFile, CancelIo/Ex are Win32 API calls:
https://msdn.microsoft.com/en-us/library/windows/desktop/aa365747(v=vs.85).aspx
https://msdn.microsoft.com/en-us/library/windows/desktop/aa363791(v=vs.85).aspx
https://msdn.microsoft.com/en-us/library/windows/desktop/aa363792(v=vs.85).aspx

Calling Win32 directly using C/C++ may not be the only way though. If the language and framework you're using supports I/O cancellation, you should be able to hit the bug too. Here's an article about I/O cancellation:
https://msdn.microsoft.com/en-us/library/windows/desktop/aa363789(v=vs.85).aspx

Let me know if this helps.

Thanks!
Hi Ladi,

I reproduced this bug with virtio-win-prewhql-112: running "benchmark.exe w com.redhat.rhevm.vdsm" hangs; benchmark.exe stops producing output and cannot be killed.

Verified with virtio-win-prewhql-121: the behavior is the same as on your system, printing 'CompleteIO failed with error 995' but not getting stuck.

Based on the above, the issue has been fixed.

Thank you so much~~

Best Regards~
Peixiu Hou
Changing status to VERIFIED according to comment #8.
Closing, as this issue has already been fixed in the RHEL 7.3 virtio-win package.