Bug 643751 - writing to a virtio serial port while no one is listening on the host side hangs the guest
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Version: 6.0
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: 6.1
Assignee: Amit Shah
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks: 580954 644735 678562
Reported: 2010-10-17 20:06 UTC by Hans de Goede
Modified: 2013-01-11 03:24 UTC

Fixed In Version: kernel-2.6.32-91.el6
Doc Type: Bug Fix
Doc Text:
If a host was slow in reading data or did not read data at all, blocking write() calls not only blocked the program that called the write() call but also the entire guest. This was caused by the write() calls waiting until an acknowledgment that the data consumed was received from the host. With this update, write() calls no longer wait for such acknowledgment: control is immediately returned to the user space application. This ensures that even if the host is busy processing other data or is not consuming data at all, the guest is not blocked.
Clone Of:
: 644735
Environment:
Last Closed: 2011-05-23 20:26:31 UTC
Target Upstream Version:
Embargoed:


Attachments
serial-console.py (195 bytes, text/plain)
2011-04-19 03:27 UTC, dawu


Links
System: Red Hat Product Errata
ID: RHSA-2011:0542
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: Red Hat Enterprise Linux 6.1 kernel security, bug fix and enhancement update
Last Updated: 2011-05-19 11:58:07 UTC

Description Hans de Goede 2010-10-17 20:06:43 UTC
The problem is this "beauty" in virtio_console.c: send_buf()

        /*
         * Wait till the host acknowledges it pushed out the data we
         * sent.  This is done for ports in blocking mode or for data
         * from the hvc_console; the tty operations are performed with
         * spinlocks held so we can't sleep here.
         */
        while (!virtqueue_get_buf(out_vq, &len))
                cpu_relax();

I see a number of possible (partial) solutions here:

1) The code says it is using cpu_relax rather than sleep because the tty
functions are called with a spinlock held. But fops-write is not called with
any spinlock held, I believe. How about a parameter to send_buf, called
"may_sleep", and then sleep rather than relax if may_sleep is true?

2) The comment says the waiting "is done for ports in blocking mode or for
data from the hvc_console". I wonder why the waiting is done in blocking mode
too. I guess this is some sort of workaround for the missing waitqueue wakeups
(see bug 643750). With those waitqueue wakeups added, I would expect that
waiting for the host acknowledgment is only needed for tty usage, and that we
could skip the wait entirely (making 1 moot) when called from fops_write?

I know that work is being done on a more permanent solution, but if the above two are possible, this would be a nice way to reduce the cases where this problem happens, which will also help when running in VMs without the new, more permanent fix.

Note: I believe this bug should be assigned to Amit Shah (but I'm not sure
whether it is OK to do this myself with respect to the kernel team's procedures).
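
As a rough illustration of option 1, here is a user-space toy model in Python. It is not kernel code. Only the names send_buf and may_sleep come from the description above; the Event and Timer standing in for the host acknowledgment, and the ack_delay parameter, are assumptions made for this sketch.

```python
import threading


def send_buf(ack_delay, may_sleep):
    """Toy model: wait for a simulated host acknowledgment.

    The acknowledgment arrives after ack_delay seconds. Returns the number
    of polling iterations spent waiting (0 when the caller sleeps).
    """
    acked = threading.Event()
    # Hypothetical stand-in for the host consuming the buffer.
    host = threading.Timer(ack_delay, acked.set)
    host.start()

    spins = 0
    if may_sleep:
        # Sleeping wait: the caller is descheduled until the ack arrives.
        acked.wait()
    else:
        # Busy wait, analogous to the cpu_relax() loop quoted above.
        while not acked.is_set():
            spins += 1

    host.join()
    return spins


print("sleeping path polled", send_buf(0.05, may_sleep=True), "times")
print("spinning path polled", send_buf(0.05, may_sleep=False), "times")
```

The sleeping path reports zero polling iterations, while the spinning path burns CPU for the whole time the acknowledgment is outstanding. If the acknowledgment never arrives, the spinning path never returns, which corresponds to the hang described in this bug.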

Comment 2 Hans de Goede 2010-10-19 07:58:10 UTC
Amit has posted a patch for this, assigning to Amit.

Comment 3 Amit Shah 2010-10-20 06:07:51 UTC
To test:

- open guest virtio-console port
- open host virtio-console port

Without reading from the host side, keep writing to the guest port.  After a
few writes, the guest will freeze.

After applying the patch that fixes this, the guest will not freeze, but the
application writing data to the guest port will wait till the host side data is
read off.

This test is the test_blocking_write() in test-virtserial.git:

http://fedorapeople.org/gitweb?p=amitshah/public_git/test-virtserial.git;a=commitdiff;h=e5cbe2be47ca7cf5fce86da694869fc8e922d41c
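
The behaviour this test relies on, namely that a writer can queue only a bounded amount of data while the reader stays idle, can be sketched in user space. This is not test_blocking_write() itself: it uses socket.socketpair() as a stand-in for the guest and host ends of a port, and fill_until_blocked and drain are hypothetical helper names.

```python
import socket


def fill_until_blocked(writer, chunk=b"x" * 4096):
    """Send without blocking until no more data is accepted; return bytes queued."""
    writer.setblocking(False)
    queued = 0
    while True:
        try:
            queued += writer.send(chunk)
        except BlockingIOError:
            # Buffers are full: a blocking write() would stall here.
            return queued


def drain(reader):
    """Read everything currently pending on the reader; return bytes read."""
    reader.setblocking(False)
    drained = 0
    while True:
        try:
            data = reader.recv(65536)
        except BlockingIOError:
            return drained
        if not data:
            return drained
        drained += len(data)


guest_end, host_end = socket.socketpair()
queued = fill_until_blocked(guest_end)  # "host" never reads, so writes stall
drained = drain(host_end)               # "host" finally reads the data
resumed = guest_end.send(b"x")          # the writer can make progress again
print(f"queued {queued} bytes, drained {drained}, then sent {resumed} more")
```

With the fix applied, the expected outcome in the guest is the same shape: only the writing application waits once the buffers are full, and it resumes when the host side reads.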

Comment 4 RHEL Program Management 2010-10-20 12:10:05 UTC
This request was evaluated by Red Hat Product Management for inclusion
in a Red Hat Enterprise Linux maintenance release. Product Management has 
requested further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed 
products. This request is not yet committed for inclusion in an Update release.

Comment 5 Amit Shah 2010-10-21 07:45:56 UTC
Additional testing note: the effect of this patch will be visible with host-side qemu modifications which aren't yet in RHEL6.  When testing, ask me for a brew build.

Comment 6 Aristeu Rozanski 2010-12-15 16:05:30 UTC
Patch(es) available on kernel-2.6.32-91.el6

Comment 11 juzhang 2011-03-23 05:48:11 UTC
(In reply to comment #3)
> To test:
> 
> - open guest virtio-console port
> - open host virtio-console port
> 
> Without reading from the host side, keep writing to the guest port.  After a
> few writes, the guest will freeze.
Reproduced on kernel-2.6.32-90.el6: about 30 seconds after step 3, the guest hangs.

Verified on kernel-2.6.32-118.el6 with qemu-kvm-0.12.1.2-2.151.el6.x86_64.

Steps:
1. boot guest.
#/usr/libexec/qemu-kvm -m 2G -smp 4 -drive file=/root/zhangjunyi/rhel6.1-ide.qcow2,if=none,id=test,cache=none,format=qcow2,werror=stop,rerror=stop -device virtio-blk-pci,drive=test -cpu qemu64,+sse2,+x2apic -boot c -netdev tap,id=hostnet0,vhost=on -device virtio-net-pci,netdev=hostnet0,id=net0,mac=22:11:22:45:66:94 -vnc :10  -device virtio-serial-pci,id=virtio-serial0,max_ports=31 -chardev socket,id=channel0,path=/var/zhangjunyi0,server,nowait -device virtserialport,bus=virtio-serial0.0,chardev=channel0,name=org.port.0,id=port1 -serial stdio -qmp tcp:0:4444,server,nowait

2. In the guest, write a big file:
cat partaa > /dev/vport0p1

3. On the host, just open the virtio-console port without reading from it.

Results:
After 10 minutes, the guest is still running fine.

Comment 12 juzhang 2011-03-23 05:49:02 UTC
Per comment 11, setting this issue as verified.

Comment 13 Martin Prpič 2011-04-12 12:42:10 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
If a host was slow in reading data or did not read data at all, blocking write() calls not only blocked the program that called the write() call but also the entire guest. This was caused by the write() calls waiting until an acknowledgment that the data consumed was received from the host. With this update, write() calls no longer wait for such acknowledgment: control is immediately returned to the user space application. This ensures that even if the host is busy processing other data or is not consuming data at all, the guest is not blocked.

Comment 14 dawu 2011-04-19 03:25:47 UTC
Verified on kernel-2.6.32-133.el6 with qemu-kvm-0.12.1.2-2.158.el6.x86_64.
The issue does not reproduce: after 10 minutes, the guest is still running fine. Details follow:

Steps:
1. boot guest.
/usr/libexec/qemu-kvm -m 2G -smp 4 -drive file=RHEL-Server-6.1-64-virtio.qcow2,if=none,id=test,cache=none,format=qcow2,werror=stop,rerror=stop -device virtio-blk-pci,drive=test -cpu qemu64,+sse2,+x2apic -boot c -netdev tap,id=hostnet0,vhost=on,script=/etc/qemu-ifup -device virtio-net-pci,netdev=hostnet0,id=net0,mac=22:11:22:45:66:94 -vnc :1  -device virtio-serial-pci,id=virtio-serial0,max_ports=31 -chardev socket,id=channel0,path=/var/zhangjunyi0,server,nowait -device virtserialport,bus=virtio-serial0.0,chardev=channel0,name=org.port.0,id=port1-serial -qmp tcp:0:4444,server,nowait

2. In the guest, write a big file:
cat partaa > /dev/vport0p1

3. On the host, just open the virtio-console port without reading from it (please refer to the attached Python script "serial-console.py"):
#python serial-console.py /var/zhangjunyi0

Results:
After 10 minutes, the guest is still running fine.
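
The contents of the attached serial-console.py are not shown in this report. A minimal sketch of a script with the described behaviour (connect to the chardev Unix socket and never read from it) might look like the following. The function name open_without_reading and the 600-second hold time are assumptions.

```python
import socket
import sys
import time


def open_without_reading(path, hold_seconds):
    """Connect to the Unix socket at path and hold it open without reading."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.connect(path)
    # Deliberately no recv(): data written by the guest stays unread.
    time.sleep(hold_seconds)
    return sock


if __name__ == "__main__" and len(sys.argv) == 2:
    # Example: python this_script.py /var/zhangjunyi0
    open_without_reading(sys.argv[1], 600).close()
```

Because the qemu-kvm command line above creates the chardev socket with "server,nowait", the script connects as a client rather than listening.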

Comment 15 dawu 2011-04-19 03:27:28 UTC
Created attachment 493062 [details]
serial-console.py

Comment 16 errata-xmlrpc 2011-05-23 20:26:31 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0542.html

