Bug 243884 - Race in vfb/vkbd device setup
Summary: Race in vfb/vkbd device setup
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: xen
Version: 5.0
Hardware: All
OS: Linux
Priority: low
Severity: low
Target Milestone: ---
Target Release: ---
Assignee: Markus Armbruster
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2007-06-12 15:25 UTC by Markus Armbruster
Modified: 2007-11-30 22:07 UTC (History)

Fixed In Version: RHEA-2007-0635
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2007-11-07 17:10:45 UTC
Target Upstream Version:
Embargoed:




Links
System: Red Hat Product Errata
ID: RHEA-2007:0635
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: xen enhancement update
Last Updated: 2007-10-30 15:49:02 UTC

Description Markus Armbruster 2007-06-12 15:25:06 UTC
Description of problem:
Fixed in xen-unstable cset 14926.  Quoting its log message:

 1. XendDomainInfo._createDevices() gets a list of devices to be
    created from XendConfig.ordered_device_refs().
    On a simple guest, this has 4 devices, vfb, vbd, vif, vkbd - in
    that order.

 2. It iterates over those devices, creating the appropriate
    DevController subclass instance, and then calling createDevice()
    on that object.

 3. When createDevice() is called on the vfb device, it spawns
    xen-vncfb daemon.

 4. During startup xen-vncfb writes into the backend paths
            /local/domain/0/backend/vfb/0
     and
            /local/domain/0/backend/vkbd/0

 5. When createDevice() is called on the vkbd device in XenD, if the
    2nd xenstore path write from step 4 has occurred, then you'll hit
    the 'Device 0 (vkbd) is already connected' error. If the 2nd path
    write didn't complete yet then everything is fine.

I think the reason it often works once after boot is that loading
xen-vncfb from disk the first time around is just enough of a
slowdown to ensure step 5 occurs before the 2nd xenstore write in
step 4 has occurred.

The key seems to be to ensure the vkbd device is initialized in
xenstore before the vfb device - this ensures all the xenstored setup
from XenD is complete before the xen-vncfb daemon starts. I'm now able
to create & destroy a domain many times over with this patch & never
hit the error message any more.

End quote.
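
To make the sequence in the quoted log message concrete, here is a toy model of it in Python. It is only a sketch and not xend code: the dictionary standing in for xenstore, and the names AlreadyConnected, daemon_startup, create_device and create_devices, are all hypothetical. It models the worst case, in which the daemon finishes both backend writes before the next device is created.

```python
# Toy model of the race described above. This is NOT xend code; every
# name below is hypothetical and exists only for illustration.

class AlreadyConnected(Exception):
    """Raised when a backend path already exists at device creation."""


VFB_PATH = "/local/domain/0/backend/vfb/0"
VKBD_PATH = "/local/domain/0/backend/vkbd/0"
BACKEND_PATHS = {"vfb": VFB_PATH, "vkbd": VKBD_PATH}


def daemon_startup(store):
    """Stand-in for xen-vncfb startup: it writes into both backend paths."""
    store.setdefault(VFB_PATH, {})["written_by_daemon"] = True
    store.setdefault(VKBD_PATH, {})["written_by_daemon"] = True


def create_device(store, dev_type):
    """Stand-in for createDevice(): refuse if the backend path exists."""
    path = BACKEND_PATHS.get(dev_type)
    if path is not None:
        if path in store:
            raise AlreadyConnected(
                "Device 0 (%s) is already connected" % dev_type)
        store[path] = {"written_by_xend": True}
    if dev_type == "vfb":
        # Worst case for the race: the daemon completes both writes
        # before the loop reaches the next device.
        daemon_startup(store)


def create_devices(order):
    """Create devices in the given order against a fresh toy store."""
    store = {}
    for dev_type in order:
        create_device(store, dev_type)
    return store
```

In this model, create_devices(["vfb", "vbd", "vif", "vkbd"]) fails on vkbd, while create_devices(["vkbd", "vfb", "vbd", "vif"]) succeeds, which matches the reasoning in the log message.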

The order in which devices are created is deterministic but hard to predict:
small, innocent-looking changes to the xend code can change it, as can a Python
upgrade.  With the current xend code and Python version, vkbd happens to be
created before vfb, so the bug cannot trigger.
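
One way to stop depending on an incidental order is to impose it explicitly. The following is a minimal sketch of that idea, assuming a list of device type names as input. The helper creation_order and the CREATE_FIRST and CREATE_LAST tuples are hypothetical, and the actual change in cset 14926 may well be implemented differently.

```python
# Hypothetical helper, not taken from xend: force vkbd to be created
# before vfb regardless of the order the device list arrives in.

CREATE_FIRST = ("vkbd",)  # must exist in xenstore before the daemon starts
CREATE_LAST = ("vfb",)    # creating this one spawns the daemon


def creation_order(dev_types):
    """Return dev_types reordered so vkbd precedes vfb.

    sorted() is stable, so all other devices keep their relative order.
    """
    def rank(dev_type):
        if dev_type in CREATE_FIRST:
            return 0
        if dev_type in CREATE_LAST:
            return 2
        return 1

    return sorted(dev_types, key=rank)
```

Because the rank depends only on the device type, the result no longer changes when the input order does.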

How reproducible:
Not reproducible with the current version.

Actual results:
If it were reproducible, xend would complain:
Device 0 (vkbd) is already connected
and refuse to create the vkbd device, which breaks the pvfb.

Expected results:
Device is created, pvfb works.

Additional info:

Comment 1 RHEL Program Management 2007-06-12 18:34:21 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 3 Daniel Berrangé 2007-06-16 00:28:23 UTC
This is now built into xen-3.0.3-29.el5 in dist-5E-qu-candidate.

Comment 6 errata-xmlrpc 2007-11-07 17:10:45 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2007-0635.html


