Bug 805533

Summary: qemu-ga: possible race while suspending the guest
Product: Red Hat Enterprise Linux 6
Component: qemu-kvm
Version: 6.3
Hardware: All
OS: Linux
Status: CLOSED ERRATA
Severity: high
Priority: high
Target Milestone: rc
Reporter: Luiz Capitulino <lcapitulino>
Assignee: Luiz Capitulino <lcapitulino>
QA Contact: Virtualization Bugs <virt-bugs>
CC: acathrow, areis, bsarathy, jcody, juzhang, lcapitulino, lersek, mkenneth, qiguo, qzhang, virt-maint
Fixed In Version: qemu-kvm-0.12.1.2-2.307.el6
Doc Type: Bug Fix
Last Closed: 2013-02-21 07:33:16 UTC
Bug Blocks: 804161, 831387

Description Luiz Capitulino 2012-03-21 14:27:41 UTC
During qemu-ga patch review it was found that there's a possible race in the code used to detect whether the guest supports suspend. When the race is triggered, qemu-ga could erroneously report that the guest has no suspend support.

The code in question is in the bios_supports_mode() function. Theoretically, the following calls could be interrupted by SIGCHLD if one of the children created by bios_supports_mode() exits, or if any other child created by qemu-ga exits (although no other code that can run in parallel with bios_supports_mode() creates children today):

 close(pipefds[1]);
 g_free(pmutils_path);

 ret = read(pipefds[0], &status, sizeof(status));

The quick & easy solution for RHEL6.3 is to retry read() when it fails with EINTR and to block SIGCHLD around the close() and g_free() calls.

The Right solution for upstream is to add a general interface to create & safely wait for children to terminate. This would also simplify the suspend functions.

Comment 1 Ademar Reis 2012-04-09 23:53:37 UTC
Corner case on a tech-preview feature, postponing to 6.4.

Comment 2 Luiz Capitulino 2012-04-10 14:47:07 UTC
Took this upstream and the recommendation is to implement the easy fix:

  http://lists.gnu.org/archive/html/qemu-devel/2012-04/msg00998.html

This is doable for 6.3, but I agree it's a corner case.

Comment 3 Luiz Capitulino 2012-05-16 20:18:14 UTC
After some discussion, we decided to make the guest-suspend-* commands synchronous. This removes the need for the SIGCHLD signal altogether, which fixes this issue as a side effect.
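
As a rough sketch of what "synchronous" means here (not the upstream patches themselves): the parent forks, then waits for that specific child with waitpid() instead of relying on a SIGCHLD handler to reap it. The helper name run_and_wait() is hypothetical.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: run argv[0] in a child process and wait for it.
 * Returns the child's exit status, or -1 if it could not be started,
 * could not be waited for, or did not exit normally.
 */
static int run_and_wait(char *const argv[])
{
    int status;
    pid_t rpid;
    pid_t pid = fork();

    if (pid < 0) {
        return -1;
    }

    if (pid == 0) {
        /* Child: replace the process image; 127 signals exec failure. */
        execvp(argv[0], argv);
        _exit(127);
    }

    /* Parent: wait for this child only, restarting if interrupted. */
    do {
        rpid = waitpid(pid, &status, 0);
    } while (rpid == -1 && errno == EINTR);

    if (rpid != pid || !WIFEXITED(status)) {
        return -1;
    }

    return WEXITSTATUS(status);
}
```

Because the parent reaps the child itself, no SIGCHLD handler is needed, so there is no handler that could interrupt unrelated system calls. This assumes SIGCHLD is left at its default disposition; if it is set to SIG_IGN, waitpid() would fail with ECHILD.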

The patches were submitted a few days ago and are already included in Michael Roth's latest pull request:

http://lists.gnu.org/archive/html/qemu-devel/2012-05/msg02093.html

Comment 10 Luiz Capitulino 2012-11-29 12:29:41 UTC
As far as testing is concerned, this issue was found in code review and is theoretical. There's no recipe to trigger it. So I think verification should be skipped.

Comment 11 juzhang 2012-12-03 05:35:25 UTC
Checked on qemu-kvm-0.12.1.2-2.337.el6; the fix is indeed included according to the changelog.

#rpm -q qemu-kvm-0.12.1.2-2.337.el6 --changelog | grep 805533
- Update information: Add bug 805533 information to changelog (fix for 827612 fixed also 805533)
- Resolves: bz#805533

Comment 13 errata-xmlrpc 2013-02-21 07:33:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-0527.html

Comment 14 Qian Guo 2013-10-15 07:42:21 UTC
Hi, Luiz

Do we need to clone this bug to the RHEL7 product?

Thanks

Comment 15 Luiz Capitulino 2013-10-15 14:08:22 UTC
No, this is a very old issue fixed since qemu v1.1.0, but thanks for checking.