Bug 1259380
| Field | Value |
|---|---|
| Summary | Windows guests consumes full CPU core when host side of virtio-serial is closed |
| Product | Red Hat Enterprise Linux 7 |
| Component | qemu-kvm |
| Version | 7.1 |
| Status | CLOSED CURRENTRELEASE |
| Severity | urgent |
| Priority | unspecified |
| Hardware | x86_64 |
| OS | Windows |
| Target Milestone | rc |
| Target Release | --- |
| Reporter | Nat Meo <nat> |
| Assignee | Gal Hammer <ghammer> |
| QA Contact | Virtualization Bugs <virt-bugs> |
| CC | huding, juzhang, knoel, qzhang, rbalakri, virt-maint, xfu |
| Doc Type | Bug Fix |
| Type | Bug |
| Last Closed | 2016-08-22 19:33:08 UTC |

Description
Nat Meo
2015-09-02 13:40:07 UTC

The described behavior is by (lack of) design. When the host side is not connected, all read/write requests to/from the driver are completed with an error. If the guest runs in a loop and keeps retrying without a delay, then you'll see 100% CPU usage on the running process's core. In order to avoid this you can either:

1. Use IOCTL_GET_INFORMATION and check the HostConnected field's value.
2. Register for the GUID_VIOSERIAL_PORT_CHANGE_STATUS notification (similar to a CD change notification).

Created attachment 1071336
vmc-cat V2 to demonstrate host connected flag
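
The first workaround suggested above (check the host-connected flag before writing, and never retry without a delay) can be sketched as follows. This is only an illustrative simulation in Python, not the vmc-cat source and not the Windows driver API: `FakePort`, `get_host_connected`, `write`, and `guarded_write` are hypothetical names. `get_host_connected` stands in for an IOCTL_GET_INFORMATION call that reads HostConnected, and `write` stands in for WriteFile.

```python
import time
from dataclasses import dataclass


@dataclass
class FakePort:
    """Hypothetical stand-in for a virtio-serial port handle in the guest."""
    host_connected: bool = True
    writes_attempted: int = 0

    def get_host_connected(self) -> bool:
        # Stand-in for IOCTL_GET_INFORMATION followed by reading HostConnected.
        return self.host_connected

    def write(self, data: bytes) -> int:
        # Stand-in for WriteFile: completes with an error while the host is closed.
        self.writes_attempted += 1
        if not self.host_connected:
            raise OSError("host side not connected")
        return len(data)


def guarded_write(port, data, retries=3, delay=0.01, sleep=time.sleep):
    """Write only when the host looks connected; back off instead of spinning.

    Returns the number of bytes written, or None if every attempt was skipped
    or failed.
    """
    for attempt in range(retries):
        if port.get_host_connected():
            try:
                return port.write(data)
            except OSError:
                # The reported status may be stale, so treat a failed write
                # like a disconnected host and back off before trying again.
                pass
        sleep(delay * (2 ** attempt))  # exponential back-off between attempts
    return None
```

With `FakePort(host_connected=False)`, `guarded_write` sleeps between checks and gives up without ever calling `write`, so the caller does not spin. The guard is only as good as the status it reads, which is why a failed write is also followed by a delay.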

Unfortunately those options don't really work out. As for #1, the HostConnected status does not correctly report when the host side has disconnected. I have attached an updated version of the vmc-cat tool to demonstrate this. The code has been updated so that before each call to WriteFile it will check the HostConnected status with IOCTL_GET_INFORMATION. The following steps are performed:

1. Execute "cat /dev/pts/1" on the host side.
2. Execute "vmc-cat w test" on the guest side.
3. Type a message as follows and observe the output of the status:

   Test1
   Host Status: Connected

4. Observe that "Test1" is received on the host side.
5. Press CTRL-C in "cat" on the host side to close the host side of the virtio-serial device.
6. Type a second message as follows and observe the output of the status:

   Test2
   Host Status: Connected

7. CPU spike occurs.

So the HostConnected status is not reporting correctly that it is disconnected, and the guest is unable to actually determine that it should not be writing data.

Using GUID_VIOSERIAL_PORT_CHANGE_STATUS doesn't appear to be a viable option either. Ignoring the potential race condition that a write could be occurring in another thread when the event is received, it does not appear that the virtio-serial driver actually reports when the host side has closed. I have looked through the source code at the following link and don't see anything that would be doing this:

https://github.com/YanVugenfirer/kvm-guest-drivers-windows/tree/master/vioserial

There are events for when the device is physically added and removed as well as when it is opened, but nothing apparent for when it is closed:

https://github.com/YanVugenfirer/kvm-guest-drivers-windows/blob/master/vioserial/sys/Control.c#L122

If I am misreading the source code for virtio-serial then please let me know.

The observed CPU spike appears to be inside qemu-kvm itself, though, and not in the Windows driver, as Task Manager does not report any increase in CPU usage when this happens. The strange part is that this behavior is not observed on Linux guests. Disconnecting the host side of a virtio-serial device for a Linux guest does not cause the CPU to spike when data is written on the guest side.

(In reply to Nat Meo from comment #4)
> Unfortunately those options don't really work out. As for #1, the
> HostConnected status does not correctly report when the host side has
> disconnected. I have attached an updated version of the vmc-cat tool to
> demonstrate this. The code has been updated so that before each call to
> WriteFile it will check the HostConnected status with IOCTL_GET_INFORMATION.
> The following steps are performed:
>
> 1. Execute "cat /dev/pts/1" on the host side.
> 2. Execute "vmc-cat w test" on the guest side.
> 3. Type a message as follows and observe the output of the status:
>
> Test1
> Host Status: Connected
>
> 4. Observe the "Test1" is received on the host side.
> 5. Press CTRL-C in "cat" on the host side to close the host side of the
> virtio-serial device.
> 6. Type a second message as follows and observe the output of the status:
>
> Test2
> Host Status: Connected

If the host status field is not updated then it might be a bug, either in the driver or in qemu. I need to check it.

> 7. CPU spike occurs
>
> So the HostConnected status is not reporting correctly that it is
> disconnected and the guest is unable to actually determine that it should
> not be writing data.
>
> Using GUID_VIOSERIAL_PORT_CHANGE_STATUS doesn't appear to be a viable option
> either. Ignoring the potential race condition that a write could be
> occurring in another thread when the event is received, it does not appear
> the virtio-serial driver actually reports when the host side has closed. I
> have looked through the source code at the following link and don't see
> anything that would be doing this:
>
> https://github.com/YanVugenfirer/kvm-guest-drivers-windows/tree/master/vioserial
>
> There are events for when the device is physically added and removed as well
> as when it is opened, but nothing apparent for when it is closed:
>
> https://github.com/YanVugenfirer/kvm-guest-drivers-windows/blob/master/vioserial/sys/Control.c#L122

The VIRTIO_CONSOLE_PORT_OPEN event is expected when the host side is either closed or opened.

> If I am misreading the source code for virtio-serial then please let me
> know. The observed behavior of the CPU spike appears to be inside qemu-kvm
> itself though and not the Windows driver as task manager does not report any
> increase in CPU usage when this happens. This behavior is not observed on
> Linux guests though which is the rather strange part. Disconnecting the host
> side of a virtio-serial device for a Linux guest does not cause the CPU to
> spike when data is written on the guest side.

Are you sure that the CPU is not consumed by the vmc-cat program? Can you try with a newer version of qemu-kvm? qemu-kvm-1.5.3-102 is the latest, I think.

The HostConnected status not being updated correctly appears to be a bug to me. You can use the attached source code to see the behavior yourself. It is not a small race condition either, where there could be a microsecond between WriteFile and IOCTL_GET_INFORMATION: I can close the host side, perform the IOCTL_GET_INFORMATION call 15 seconds later, and it will still report back as connected.

The CPU is definitely not being consumed by the example vmc-cat program. I have attached the source code so you can see for yourself. It works by blocking on a ReadFile call from standard input and only performs a WriteFile call when something is typed by the user. It makes the WriteFile call only once and does not retry on error. There is no repeated busy loop that would cause the CPU to thrash. As I have mentioned before, if you observe Task Manager inside the Windows guest it will show no CPU usage at all, but on the host side the qemu-kvm process will consume a full CPU core.

Thanks for the info that VIRTIO_CONSOLE_PORT_OPEN is also sent on close. That may help avoid the need to call IOCTL_GET_INFORMATION before every call to WriteFile. Given that it is used for both port open and port close, though, it sounds like I still need to check the HostConnected state upon that event, and that state currently does not report back correctly, so I am still stuck with the same problem.

Where can I get qemu-kvm-1.5.3-102? Currently I am using qemu-kvm-1.5.3-86, which is showing up as the latest version through yum. I have observed this problem for over a year now, though, and it existed on EL6 as well. It just hasn't been too much trouble in what I have been working on until recently, since I am now depending more on virtio-serial to communicate between guests and hosts.

I tried making a build of QEMU 2.3.1, and with that version this problem no longer occurs. There is no CPU spike when data is written to the virtio-serial device after the host side has been closed. I may go along with using QEMU 2.3.1 for my purposes, but this problem still exists in the EL7 QEMU 1.5.3 packages and may affect other people, so it may be worth looking into backporting whatever fixes this problem from 2.3.1.

Problem is solved in QEMU 2.3.1 (comment #8). No reason to backport the fix to previous versions was given.
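
For reference, the point raised in the thread that VIRTIO_CONSOLE_PORT_OPEN covers both open and close can be illustrated with a small Python sketch. It assumes the control message layout given in the virtio console specification (a little-endian 32-bit port id, a 16-bit event, and a 16-bit value, with event 6 being VIRTIO_CONSOLE_PORT_OPEN and the value being 1 for open and 0 for closed). The function name and the dictionary used to track state are hypothetical, and this is not the Windows driver's code.

```python
import struct

# Event number as listed in the virtio console specification.
VIRTIO_CONSOLE_PORT_OPEN = 6

# Control message layout assumed from the spec: le32 id, le16 event, le16 value.
_CONTROL = struct.Struct("<IHH")


def apply_control_message(host_connected, raw):
    """Update a {port_id: bool} map from one raw control message.

    The same event is used for open and close; only the value field
    distinguishes the two, so it must be read rather than assumed.
    """
    port_id, event, value = _CONTROL.unpack(raw[:_CONTROL.size])
    if event == VIRTIO_CONSOLE_PORT_OPEN:
        host_connected[port_id] = bool(value)
    return host_connected
```

Feeding it a PORT_OPEN message with value 1 and then one with value 0 for the same port flips the tracked state from connected to disconnected. Messages with other event numbers leave the map unchanged.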