Bug 874330
Summary: | First autostarted guest has always id 1 | | |
---|---|---|---|
Product: | Red Hat Enterprise Linux 6 | Reporter: | Peter Krempa <pkrempa> |
Component: | libvirt | Assignee: | Peter Krempa <pkrempa> |
Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs> |
Severity: | medium | Docs Contact: | |
Priority: | unspecified | | |
Version: | 6.4 | CC: | acathrow, bili, dallan, dyasny, dyuan, eblake, mzhan, rwu, whuang, ydu |
Target Milestone: | rc | Keywords: | Regression |
Target Release: | --- | | |
Hardware: | Unspecified | | |
OS: | Unspecified | | |
Whiteboard: | | | |
Fixed In Version: | libvirt-0.10.2-8.el6 | Doc Type: | Bug Fix |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2013-02-21 07:26:14 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Description
Peter Krempa
2012-11-07 23:14:50 UTC
This is caused by a race between the thread that autostarts machines on daemon startup and the threads that re-connect to existing processes. The maximum of the IDs of guests that are still running when libvirt restarts has to be determined before the separate threads are started. I'm working on a fix.

Fix posted upstream: http://www.redhat.com/archives/libvir-list/2012-November/msg00403.html

Fixed upstream:

commit 02cf57c0d0d2333dceadb7f84b08ec28a35ef540
Author: Peter Krempa <pkrempa>
Date: Thu Nov 8 13:48:37 2012 +0100

    qemu: Fix domain ID numbering race condition

    When the libvirt daemon is restarted it tries to reconnect to running
    qemu domains. Since commit d38897a5d4b1880e1998394b2a37bba979bbdff1 the
    re-connection code runs in separate threads. In the original
    implementation the maximum of the domain IDs (which is used as an
    initializer for numbering guests created next) was determined while
    libvirt was reconnecting to the guest.

    With the threaded implementation this opens a possibility for race
    conditions with the thread that is autostarting guests. When there's a
    guest running with id 1 and the daemon is restarted, the autostart code
    is reached first and spawns the first guest that should be autostarted
    as id 1. This results in the following unwanted situation:

     # virsh list
      Id    Name                           State
     ----------------------------------------------------
      1     guest1                         running
      1     guest2                         running

    This patch extracts the detection code before the re-connection threads
    are started so that the maximum id of the guests being reconnected to is
    known.

    The only semantic change is that if the guest with the greatest ID quits
    before we are able to reconnect to it, its ID is still used as the
    greatest one; without this patch, the greatest ID of a process we could
    successfully reconnect to would be used.

I can reproduce it with libvirt-0.10.2-7.el6.
# service libvirtd restart
Stopping libvirtd daemon: [ OK ]
Starting libvirtd daemon: [ OK ]

# virsh list
 Id    Name                           State
----------------------------------------------------

# virsh start rhel63
Domain rhel63 started

# virsh list --all
 Id    Name                           State
----------------------------------------------------
 1     rhel63                         running
 -     rhel62                         shut off

# virsh autostart rhel62
Domain rhel62 marked as autostarted

# service libvirtd restart
Stopping libvirtd daemon: [ OK ]
Starting libvirtd daemon: [ OK ]

# virsh list
 Id    Name                           State
----------------------------------------------------
 1     rhel63                         running
 1     rhel62                         running

Verified the bug with libvirt-0.10.2-8.el6:

# service libvirtd restart
Stopping libvirtd daemon: [ OK ]
Starting libvirtd daemon: [ OK ]

# virsh start raw
Domain raw started

# virsh list
 Id    Name                           State
----------------------------------------------------
 1     raw                            running

# virsh autostart aa
Domain aa marked as autostarted

# virsh list --all --autostart
 Id    Name                           State
----------------------------------------------------
 -     aa                             shut off

# service libvirtd restart
Stopping libvirtd daemon: [ OK ]
Starting libvirtd daemon: [ OK ]

# virsh list --all
 Id    Name                           State
----------------------------------------------------
 1     raw                            running
 2     aa                             running

So moving to VERIFIED.

Marking this as regression, since it was introduced in upstream commit d38897a (0.9.5); RHEL 6.1 did not have this issue.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-0276.html
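
For readers who want the ID collision from the commit message in a compact form, here is a minimal Python sketch. It is not libvirt code: the function names `assign_ids_racy` and `assign_ids_fixed` are hypothetical, and the sketch assumes a simple counter that starts at 1 after a daemon restart. It models only the one bad interleaving described above (autostart runs before the reconnect threads have reported any IDs) and the ordering the patch enforces (the maximum running ID is determined before any thread starts).

```python
def assign_ids_racy(running_ids, autostart_names):
    """Model the bad interleaving: autostart runs before any reconnect
    thread has reported the ID of the guest it reattached to."""
    next_id = 1                      # assumed: counter starts fresh after restart
    assigned = {}
    for name in autostart_names:     # autostart thread gets there first
        assigned[name] = next_id
        next_id += 1
    for dom_id in running_ids:       # reconnect threads raise the counter too late
        next_id = max(next_id, dom_id + 1)
    return assigned


def assign_ids_fixed(running_ids, autostart_names):
    """Seed the counter from the running guests before any thread starts."""
    next_id = max(running_ids, default=0) + 1
    assigned = {}
    for name in autostart_names:
        assigned[name] = next_id
        next_id += 1
    return assigned


if __name__ == "__main__":
    # One guest already running with id 1, one guest marked for autostart.
    print(assign_ids_racy([1], ["rhel62"]))   # {'rhel62': 1}, collides with the running guest
    print(assign_ids_fixed([1], ["rhel62"]))  # {'rhel62': 2}
```

The fixed variant also shows the semantic change noted in the commit message: the counter is seeded from every guest that was running at restart, whether or not the reconnect to it later succeeds.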