Bug 243884

Summary: Race in vfb/vkbd device setup
Product: Red Hat Enterprise Linux 5
Reporter: Markus Armbruster <armbru>
Component: xen
Assignee: Markus Armbruster <armbru>
Status: CLOSED ERRATA
Severity: low
Priority: low
Version: 5.0
CC: xen-maint
Target Milestone: ---
Target Release: ---
Hardware: All
OS: Linux
Fixed In Version: RHEA-2007-0635
Doc Type: Bug Fix
Last Closed: 2007-11-07 17:10:45 UTC

Description Markus Armbruster 2007-06-12 15:25:06 UTC
Description of problem:
Fixed in xen-unstable cset 14926.  Quoting its log message:

 1. XendDomainInfo._createDevices() gets a list of devices to be
    created from XendConfig.ordered_device_refs().
    On a simple guest, this has 4 devices, vfb, vbd, vif, vkbd - in
    that order.

 2. It iterates over those devices, creating the appropriate
    DevController subclass instance, and then calling createDevice()
    on that object.

 3. When createDevice() is called on the vfb device, it spawns the
    xen-vncfb daemon.

 4. During startup xen-vncfb writes into the backend paths
            /local/domain/0/backend/vfb/0
     And
            /local/domain/0/backend/vkbd/0

 5. When createDevice() is called on the vkbd device in XenD, if the
    2nd xenstore path write from step 4 has occurred, then you'll hit
    the 'Device 0 (vkbd) is already connected' error. If the 2nd path
    write didn't complete yet then everything is fine.
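
The five steps above can be modelled with a small, deterministic sketch. This
is not xend code: FakeStore, create_device(), create_devices() and the
daemon_is_fast flag are hypothetical names made up for illustration, and the
"daemon" is reduced to two writes that either land before xend's next step or
never land at all. Only the backend paths and the error text are taken from the
log message.

```python
class DeviceAlreadyConnected(Exception):
    """Stands in for the error xend reports in step 5."""


class FakeStore:
    """Hypothetical stand-in for xenstore: just a set of existing paths."""

    def __init__(self):
        self.paths = set()

    def exists(self, path):
        return path in self.paths

    def write(self, path):
        self.paths.add(path)


def backend_path(devclass, devid=0):
    # Path layout as quoted in step 4 of the log message.
    return "/local/domain/0/backend/%s/%d" % (devclass, devid)


def create_device(store, devclass, daemon_is_fast):
    """Model of one createDevice() call (assumed behaviour, simplified)."""
    path = backend_path(devclass)
    if store.exists(path):
        raise DeviceAlreadyConnected(
            "Device 0 (%s) is already connected" % devclass)
    store.write(path)
    if devclass == "vfb" and daemon_is_fast:
        # Steps 3 and 4: the spawned daemon writes both backend paths
        # before xend gets around to creating the next device.
        store.write(backend_path("vfb"))
        store.write(backend_path("vkbd"))


def create_devices(order, daemon_is_fast):
    """Model of steps 1 and 2: create the devices in the given order."""
    store = FakeStore()
    for devclass in order:
        create_device(store, devclass, daemon_is_fast)
    return sorted(store.paths)
```

In this model, create_devices(["vfb", "vbd", "vif", "vkbd"], daemon_is_fast=True)
raises DeviceAlreadyConnected for vkbd, while the same order with
daemon_is_fast=False succeeds, which corresponds to the "often works once after
boot" case.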

I think the reason it often works once after boot is that loading
xen-vncfb from disk the first time around is just enough of a slow
down to ensure step 5 occurs before the 2nd xenstore write in step 4
has occurred.

The key seems to be to ensure the vkbd device is initialized in
xenstore before the vfb device - this ensures all the xenstored setup
from XenD is complete before the xen-vncfb daemon starts. I'm now able
to create & destroy a domain many times over with this patch & never
hit the error message any more.
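
The ordering constraint described above could be sketched as follows. This is
an illustration only, not the actual changeset: order_devices() is a
hypothetical helper, and devices are represented as plain class-name strings
rather than whatever XendConfig.ordered_device_refs() really returns.

```python
def order_devices(devclasses):
    """Return the device classes with every vkbd ahead of everything else.

    sorted() is stable, so the relative order of all other devices
    (including vfb) is left as it was; only vkbd entries move to the front.
    """
    return sorted(devclasses, key=lambda devclass: devclass != "vkbd")


if __name__ == "__main__":
    print(order_devices(["vfb", "vbd", "vif", "vkbd"]))
```

With the four devices of the simple guest in step 1, this yields vkbd, vfb,
vbd, vif, so the vkbd entry is set up in xenstore before the vfb device spawns
its daemon.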

End quote.

The order in which devices are created is deterministic but effectively
unpredictable: small, innocent-looking changes to the xend code can change it,
as can a Python upgrade.  With the current xend code and Python version, vkbd
happens to be created before vfb, so the bug cannot trigger.

How reproducible:
Not reproducible with the current version.

Actual results:
If it were reproducible, xend would complain:
Device 0 (vkbd) is already connected
and refuse to create the vkbd device, which breaks the pvfb.

Expected results:
Device is created, pvfb works.

Additional info:

Comment 1 RHEL Program Management 2007-06-12 18:34:21 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 3 Daniel Berrangé 2007-06-16 00:28:23 UTC
This is now built into xen-3.0.3-29.el5 in dist-5E-qu-candidate.

Comment 6 errata-xmlrpc 2007-11-07 17:10:45 UTC
An advisory has been issued which should address the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2007-0635.html