Bug 874330 - First autostarted guest always has ID 1
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: libvirt
Version: 6.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Peter Krempa
QA Contact: Virtualization Bugs
Keywords: Regression
Reported: 2012-11-07 18:14 EST by Peter Krempa
Modified: 2014-07-01 08:03 EDT
CC: 10 users

Fixed In Version: libvirt-0.10.2-8.el6
Doc Type: Bug Fix
Last Closed: 2013-02-21 02:26:14 EST
Type: Bug

Description Peter Krempa 2012-11-07 18:14:50 EST
Description of problem:
When guests are selected to be autostarted on libvirtd startup, the first one always gets ID 1, regardless of the guests that are already running. This leads to unfortunate situations like:

# virsh list
 Id    Name                           State
----------------------------------------------------
 1     guest1                         running
 1     guest2                         running


Version-Release number of selected component (if applicable):
Found in the upstream version, but the affected code has not been touched in a long time, so recent downstream versions are affected as well.


How reproducible:
100%


Steps to Reproduce:
1. shutdown/destroy all guests
2. restart libvirtd
3. start a guest; it will get ID 1
4. mark a different guest as autostartable
5. restart libvirtd

Actual results:
Two guests will share ID 1.


Expected results:
Guests will have different IDs.


Additional info:
This bug only affects the first guest started. After the first one is started, subsequent guests continue the numbering series from the highest ID among the guests that were running at the time of the libvirtd restart.
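
For illustration, the numbering rule described here can be sketched in a few lines of C (a minimal sketch only, not libvirt source; next_id, seed_from_running and start_guest are invented names):

/* Minimal sketch of the numbering rule, not libvirt code: the ID
 * counter is seeded from the highest ID among domains that survived
 * the daemon restart, then handed out sequentially. */
#include <stdio.h>

static int next_id = 1;                      /* hypothetical ID counter */

static void seed_from_running(const int *ids, int n)
{
    for (int i = 0; i < n; i++)
        if (ids[i] >= next_id)
            next_id = ids[i] + 1;            /* continue above the highest running ID */
}

static void start_guest(const char *name)
{
    printf("%s got ID %d\n", name, next_id++);
}

int main(void)
{
    int running[] = { 1 };                   /* guest1 kept ID 1 across the restart */

    seed_from_running(running, 1);           /* must happen before any guest is started */
    start_guest("guest2");                   /* prints: guest2 got ID 2 */
    return 0;
}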
Comment 2 Peter Krempa 2012-11-08 06:23:21 EST
This is caused by a race between the thread that autostarts machines on daemon startup and the threads that re-connect to existing processes. The maximum of the IDs of the guests that are still running at libvirtd restart has to be determined before the separate threads are forked. I'm working on a fix.
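
A rough pthread sketch of that race (illustrative only, with invented names; the real code lives in libvirt's QEMU driver): the reconnect thread raises the shared counter when it rediscovers a running domain, but the autostart thread may read the counter first and hand out an ID that is already in use.

/* Illustrative race sketch, not libvirt source. next_id is shared and
 * deliberately left unsynchronized to mirror the bug: whichever thread
 * runs first decides whether the autostarted guest gets a fresh ID. */
#include <pthread.h>
#include <stdio.h>

static int next_id = 1;                         /* shared ID counter */

static void *reconnect_thread(void *arg)        /* re-connects to a running domain */
{
    int existing_id = *(int *)arg;
    if (existing_id >= next_id)
        next_id = existing_id + 1;              /* may run too late */
    return NULL;
}

static void *autostart_thread(void *arg)        /* autostarts a new domain */
{
    (void)arg;
    printf("autostarted guest got ID %d\n", next_id++);   /* can still print 1 */
    return NULL;
}

int main(void)
{
    pthread_t aut, rec;
    int running_id = 1;                         /* guest1 kept ID 1 across the restart */

    /* If the autostart thread wins, two guests end up sharing ID 1. */
    pthread_create(&aut, NULL, autostart_thread, NULL);
    pthread_create(&rec, NULL, reconnect_thread, &running_id);

    pthread_join(aut, NULL);
    pthread_join(rec, NULL);
    return 0;
}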
Comment 3 Peter Krempa 2012-11-08 08:13:25 EST
Fix posted upstream: http://www.redhat.com/archives/libvir-list/2012-November/msg00403.html
Comment 4 Peter Krempa 2012-11-08 18:15:55 EST
Fixed upstream:

commit 02cf57c0d0d2333dceadb7f84b08ec28a35ef540
Author: Peter Krempa <pkrempa@redhat.com>
Date:   Thu Nov 8 13:48:37 2012 +0100

    qemu: Fix domain ID numbering race condition
    
    When the libvirt daemon is restarted it tries to reconnect to running
    qemu domains. Since commit d38897a5d4b1880e1998394b2a37bba979bbdff1 the
    re-connection code runs in separate threads. In the original
    implementation the maximum of the domain IDs (which is used as an
    initializer for numbering guests created next) was determined while
    libvirt was reconnecting to the guests.
    
    With the threaded implementation this opens a possibility for a race
    condition with the thread that is autostarting guests. When there is a
    guest running with ID 1 and the daemon is restarted, the autostart code
    is reached first and spawns the first guest that should be autostarted
    with ID 1. This results in the following unwanted situation:
    
     # virsh list
       Id    Name                           State
      ----------------------------------------------------
       1     guest1                         running
       1     guest2                         running
    
    This patch moves the detection code to before the re-connection threads
    are started, so that the maximum ID of the guests being reconnected to
    is known in advance.
    
    The only semantic change introduced by this is that if the guest with the
    greatest ID quits before we are able to reconnect to it, its ID is still
    used as the greatest one; without this patch, the greatest ID of a process
    we could successfully reconnect to would be used instead.
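
The shape of the fix described in the commit message, in the same illustrative style (invented names, not the actual patch): the maximum existing ID is determined once, before any re-connection or autostart thread is created, so the counter is already seeded when the first autostarted guest asks for an ID.

/* Illustrative sketch of the fixed ordering, not the actual patch:
 * seeding the ID counter happens single-threaded, before the threads
 * that reconnect and autostart domains are spawned. */
#include <pthread.h>
#include <stdio.h>

static int next_id = 1;

static void seed_next_id(const int *ids, int n)   /* runs before any thread exists */
{
    for (int i = 0; i < n; i++)
        if (ids[i] >= next_id)
            next_id = ids[i] + 1;
}

static void *autostart_thread(void *arg)
{
    (void)arg;
    printf("autostarted guest got ID %d\n", next_id++);   /* now prints 2 */
    return NULL;
}

int main(void)
{
    int running_ids[] = { 1 };                    /* guest1 survived the restart */
    pthread_t aut;

    seed_next_id(running_ids, 1);                 /* done up front, no race possible */
    pthread_create(&aut, NULL, autostart_thread, NULL);
    pthread_join(aut, NULL);
    return 0;
}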
Comment 6 dyuan 2012-11-09 01:04:52 EST
I can reproduce it with libvirt-0.10.2-7.el6.

# service libvirtd restart
Stopping libvirtd daemon:                                  [  OK  ]
Starting libvirtd daemon:                                  [  OK  ]

# virsh list
 Id    Name                           State
----------------------------------------------------

# virsh start rhel63
Domain rhel63 started

# virsh list --all
 Id    Name                           State
----------------------------------------------------
 1     rhel63                         running
 -     rhel62                         shut off

# virsh autostart rhel62
Domain rhel62 marked as autostarted

# service libvirtd restart
Stopping libvirtd daemon:                                  [  OK  ]
Starting libvirtd daemon:                                  [  OK  ]

# virsh list
 Id    Name                           State
----------------------------------------------------
 1     rhel63                         running
 1     rhel62                         running
Comment 8 EricLee 2012-11-15 03:09:26 EST
Verified the bug with libvirt-0.10.2-8.el6:

# service libvirtd restart
Stopping libvirtd daemon:                                  [  OK  ]
Starting libvirtd daemon:                                  [  OK  ]

# virsh start raw
Domain raw started

# virsh list 
 Id    Name                           State
----------------------------------------------------
 1     raw                            running

# virsh autostart aa
Domain aa marked as autostarted

# virsh list --all --autostart
 Id    Name                           State
----------------------------------------------------
 -     aa                             shut off

# service libvirtd restart
Stopping libvirtd daemon:                                  [  OK  ]
Starting libvirtd daemon:                                  [  OK  ]

# virsh list --all
 Id    Name                           State
----------------------------------------------------
 1     raw                            running
 2     aa                             running

So moving to VERIFIED.
Comment 9 Eric Blake 2012-11-20 10:24:40 EST
Marking this as a regression, since it was introduced in upstream commit d38897a (0.9.5); RHEL 6.1 did not have this issue.
Comment 11 errata-xmlrpc 2013-02-21 02:26:14 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-0276.html
