Bug 643750

Summary: virtio_console driver never returns from selecting for write when the queue is full
Product: Red Hat Enterprise Linux 6 Reporter: Hans de Goede <hdegoede>
Component: kernel Assignee: Amit Shah <amit.shah>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium Docs Contact:
Priority: low    
Version: 6.0 CC: bcao, dhoward, mjenner, plyons, virt-maint
Target Milestone: rc Keywords: ZStream
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: kernel-2.6.32-112.el6 Doc Type: Bug Fix
Doc Text:
When an application used a virtio serial port in non-blocking mode, filled it until the write call returned -EAGAIN, and then executed a select call for write, the select call never returned. When the port was used in blocking mode, the write call waited until the host indicated that it had used up the buffers. This was because the poll operation waited on port->waitqueue; however, nothing woke the waitqueue when there was room again in the queue. With this update, the waitqueue is woken via host notifications so that buffers consumed by the host can be reclaimed, the queue freed, and the application's write operations can proceed again.
Story Points: ---
Clone Of:
: 673459 Environment:
Last Closed: 2011-05-23 20:24:13 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 580954, 673459, 678558    

Description Hans de Goede 2010-10-17 19:58:49 UTC
When using a virtio serial port from an application and putting it in non-blocking mode, then filling it until write returns -EAGAIN and then doing a select
for write, the select never returns.
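
The sequence described above can be sketched from user space roughly as follows. This is only an illustration, not the reporter's test program: the helper names `fill_until_eagain` and `wait_writable` are made up for this example, and it works on any file descriptor so that it can be tried on a pipe as well as on a virtio port.

```python
import os
import select


def fill_until_eagain(fd, chunk=4096):
    """Write to fd in non-blocking mode until the kernel reports EAGAIN.

    Returns the number of bytes accepted before the queue filled up.
    """
    os.set_blocking(fd, False)
    data = b"x" * chunk
    written = 0
    while True:
        try:
            written += os.write(fd, data)
        except BlockingIOError:
            # Python raises BlockingIOError for EAGAIN / EWOULDBLOCK.
            return written


def wait_writable(fd, timeout):
    """Return True if select() reports fd writable within timeout seconds."""
    _, writable, _ = select.select([], [fd], [], timeout)
    return fd in writable
```

Going by the description, on an affected guest one would open the port (assumed here to be something like /dev/vport0p1) for writing, call `fill_until_eagain(fd)`, and then find that `wait_writable(fd, timeout)` keeps returning False even after the host has consumed data. With the fix it should return True once room is available again.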

The reason for this is that poll waits for port->waitqueue, but nothing
wakes the waitqueue when there is room again in the queue, quoting from virtio_console.c: init_vqs():

        io_callbacks[j] = in_intr;
        io_callbacks[j + 1] = NULL;

The fix is simply to define a callback for the j + 1 case and make it wake the waitqueue; all the other needed bits are already present.
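
To make the missing wakeup concrete, here is a small user-space model of the situation. It is not kernel code and not the actual patch: `ToyPort`, `host_consume` and `select_after_drain` are hypothetical names, a `threading.Condition` stands in for port->waitqueue, and a timeout stands in for "never returns".

```python
import threading


class ToyPort:
    """User-space stand-in for a port with a bounded output queue."""

    def __init__(self, capacity, wake_on_host_consume):
        self.free_slots = capacity
        self.wake_on_host_consume = wake_on_host_consume
        self.waitqueue = threading.Condition()  # plays the role of port->waitqueue
        self.sleeping = threading.Event()       # set once a waiter is about to sleep

    def write(self):
        """Queue one buffer; return False (think -EAGAIN) if the queue is full."""
        with self.waitqueue:
            if self.free_slots == 0:
                return False
            self.free_slots -= 1
            return True

    def poll_writable(self, timeout):
        """Sleep on the waitqueue; report writable only if somebody woke us."""
        with self.waitqueue:
            if self.free_slots > 0:
                return True
            self.sleeping.set()
            woken = self.waitqueue.wait(timeout)
            return woken and self.free_slots > 0

    def host_consume(self):
        """The host used one buffer; models the callback slot for the output queue."""
        with self.waitqueue:
            self.free_slots += 1
            if self.wake_on_host_consume:
                # The fix: wake whoever is waiting for room.
                self.waitqueue.notify_all()


def select_after_drain(wake, timeout=0.2):
    """Fill the port, wait for writability in a thread, then let the host drain."""
    port = ToyPort(capacity=2, wake_on_host_consume=wake)
    while port.write():
        pass
    result = []
    waiter = threading.Thread(
        target=lambda: result.append(port.poll_writable(timeout)))
    waiter.start()
    port.sleeping.wait()   # make sure the waiter is asleep before draining
    port.host_consume()
    waiter.join()
    return result[0]
```

With `wake=False` (the NULL callback case) the waiter is not notified even though room has become available, so it only gives up when the timeout expires. With `wake=True` it is woken as soon as the host consumes a buffer.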

Comment 1 Hans de Goede 2010-10-17 20:02:09 UTC
Note: I believe this bug should be assigned to Amit Shah (but I'm not sure whether it is OK to do this myself with respect to the kernel team's procedures).

Comment 3 Hans de Goede 2010-10-19 07:57:30 UTC
Some notes: my original description of this problem comes from reading the code, not from hitting it in practice. Amit Shah has run some tests and cannot reproduce the problem that one would expect upon reading the code.

We've discussed this and decided to keep this bug open for further investigation later to see if the waitqueue in question is somehow actually woken when room becomes available in the out_vq, or if things currently happen to work because of some side-effect somewhere else.

Comment 4 RHEL Program Management 2011-01-07 04:34:22 UTC
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated
in the current release, Red Hat is unfortunately unable to
address this request at this time. Red Hat invites you to
ask your support representative to propose this request, if
appropriate and relevant, in the next release of Red Hat
Enterprise Linux. If you would like it considered as an
exception in the current release, please ask your support
representative.

Comment 5 Suzanne Logcher 2011-01-07 16:10:16 UTC
This request was erroneously denied for the current release of Red Hat
Enterprise Linux.  The error has been fixed and this request has been
re-proposed for the current release.

Comment 7 RHEL Program Management 2011-01-28 11:50:39 UTC
This request was evaluated by Red Hat Product Management for inclusion
in a Red Hat Enterprise Linux maintenance release. Product Management has 
requested further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed 
products. This request is not yet committed for inclusion in an Update release.

Comment 8 RHEL Program Management 2011-01-28 14:10:50 UTC
This request was evaluated by Red Hat Product Management for inclusion
in a Red Hat Enterprise Linux maintenance release. Product Management has 
requested further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed 
products. This request is not yet committed for inclusion in an Update release.

Comment 9 Amit Shah 2011-01-28 14:12:48 UTC
Testing notes:

With qemu-kvm with the fix for bug 588916, start a guest with:

-chardev socket,path=/tmp/foo,server,nowait,id=c0 -device virtio-serial -device virtserialport,chardev=c0

Then redirect the port to a file:

nc -U /tmp/foo > /tmp/guest-file

In the guest, transfer a big file (anything > 1G) to the virtio port:

cat /tmp/bigfile > /dev/vport0p1


In some cases, the guest command never finishes and the size of the host file stops increasing at some point.

With a kernel that includes the fix for this bug, the 'cat' command in the guest finishes and the size of the file on the host matches the size of the file in the guest.
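
The "size stops increasing" check above can be automated on the host with a small helper. This is only a sketch: the function names are hypothetical, and the path and expected size mentioned in the note below are assumptions based on the commands in these testing notes.

```python
import os
import time


def sample_sizes(path, samples, interval):
    """Record the size of path `samples` times, `interval` seconds apart."""
    sizes = []
    for i in range(samples):
        sizes.append(os.path.getsize(path))
        if i + 1 < samples:
            time.sleep(interval)
    return sizes


def transfer_stalled(sizes, expected_size):
    """True if the file did not grow over the sampling window
    and is still smaller than expected_size."""
    if len(sizes) < 2:
        raise ValueError("need at least two samples to judge growth")
    return sizes[-1] == sizes[0] and sizes[-1] < expected_size
```

For the test above, `path` would be /tmp/guest-file on the host and `expected_size` the size of /tmp/bigfile in the guest. A stalled transfer would show up as `transfer_stalled(sample_sizes(path, 5, 2.0), expected_size)` returning True.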

Comment 11 Aristeu Rozanski 2011-02-03 19:06:36 UTC
Patch(es) available on kernel-2.6.32-112.el6

Comment 14 Mike Cao 2011-02-16 08:30:05 UTC
Verified on qemu-kvm-0.12.1.2-2.144.el6.
guest kernel : kernel-2.6.32-113.el6

steps:
1. Start the VM with a virtio-serial port, without the -M parameter.
2. Open the socket file on the host and do not read from it,
e.g.: # cat open-socket
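#!/usr/bin/python
import socket
import sys
import time

# Connect to the chardev's UNIX socket (path given as the first argument)
# and deliberately never read from it, so data sent by the guest piles up.
s = socket.socket(socket.AF_UNIX)
s.connect(sys.argv[1])

while True:
    # Keep the connection open without consuming any data.
    time.sleep(1)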

# python open-socket /tmp/vport0
3. Transfer a file larger than 2G via virtio-serial,
e.g.: # cat /tt > /dev/vport0p1

Actual Results:
The qemu-kvm process does not freeze.

Based on the above, this issue has been fixed.
Changing status to VERIFIED.

Comment 17 Martin Prpič 2011-04-12 12:41:14 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
When an application used a virtio serial port in non-blocking mode, filled it until the write call returned -EAGAIN, and then executed a select call for write, the select call never returned. When the port was used in blocking mode, the write call waited until the host indicated that it had used up the buffers. This was because the poll operation waited on port->waitqueue; however, nothing woke the waitqueue when there was room again in the queue. With this update, the waitqueue is woken via host notifications so that buffers consumed by the host can be reclaimed, the queue freed, and the application's write operations can proceed again.

Comment 18 errata-xmlrpc 2011-05-23 20:24:13 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0542.html