Bug 541325

Summary: [RHEL5]: A new xenfb thread is created on every save/restore
Product: Red Hat Enterprise Linux 5
Reporter: john.haxby <john.haxby>
Component: kernel-xen
Assignee: Chris Lalancette <clalance>
Status: CLOSED ERRATA
QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: medium
Priority: low
Version: 5.4
CC: armbru, clalance, cward, jch, llim, pbonzini, xen-maint
Target Milestone: rc
Target Release: ---
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2010-03-30 07:40:54 UTC
Bug Blocks: 543823
Attachments: Patch to fix the problem (flags: none)

Description john.haxby@oracle.com 2009-11-25 14:22:16 UTC
Created attachment 373753 [details]
Patch to fix the problem

Description of problem:

Previously, in RHEL5.3, ps in a RHEL5 Xen PV guest reported just one [xenfb thread], and that single thread persisted across subsequent live migrations or save/restore cycles.  At some point in RHEL5.4 (I'm not sure when), the patch linux-2.6-xen-fbfront-dirty-race.patch was introduced to fix bug 456893 (which I can't read).  Since then the guest starts with two xenfb threads and gains another two every time it is live migrated or saved and restored.

Version-Release number of selected component (if applicable):
    kernel-xen-2.6.18-164.el5 and later (and possibly earlier)


How reproducible: Always


Steps to Reproduce:
1. ps -ef | grep xenfb
2. xm save; xm restore in dom0 or live migrate
3. ps -ef | grep xenfb
  
Actual results:
   At step 1 you'll see ps reporting two [xenfb thread] kernel threads, at step
   3 you'll see four of them.  If you repeat step 2 you'll get another two
   xenfb threads.   Eventually, I'm sure Bad Things Will Happen.

Expected results:
   Just one [xenfb thread]

Additional info:
   In linux-2.6-xen-fbfront-dirty-race.patch creation of the xenfb_thread
   thread is correctly deferred until XenbusStateConnected.  However, that
   state is entered every time the framebuffer is connected -- this seems to
   happen twice on initial boot and another twice after each successful
   save/restore or live migration.

   The attached patch simply avoids creating the thread if it already exists.
   There doesn't seem to be any possibility of introducing another race
   condition (the previous check for dev->state would seem to be enough of
   a barrier) and there also doesn't seem to be any need to create a new thread
   rather than using the existing one.
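
   The idea behind the patch can be sketched in user-space C as follows.
   This is only an illustration, not the attached patch or the real fbfront
   code: the struct, the enum, and the functions start_worker() and
   backend_changed() are hypothetical stand-ins, and a plain flag takes the
   place of the kernel thread pointer.  It shows that once the Connected
   handler checks for an existing thread, repeated Connected events no
   longer add threads.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the xenbus states relevant here. */
enum bus_state { STATE_INITIALISING, STATE_CONNECTED, STATE_CLOSED };

/* Hypothetical stand-in for the driver's per-device data. */
struct fb_info_sketch {
    int thread_running;   /* plays the role of a non-NULL thread pointer */
    int threads_created;  /* counter kept only for illustration */
};

/* Simulates starting the worker thread; a real driver would create
 * a kernel thread here. */
static void start_worker(struct fb_info_sketch *info)
{
    info->thread_running = 1;
    info->threads_created++;
}

/* Called on every backend state change. The Connected state can be
 * entered more than once (at boot and after each save/restore), so
 * the worker is started only if none exists yet. */
void backend_changed(struct fb_info_sketch *info, enum bus_state state)
{
    assert(info != NULL);

    if (state != STATE_CONNECTED)
        return;

    if (!info->thread_running)  /* the guard: skip if already started */
        start_worker(info);
}
```

   Without the thread_running check, every call with STATE_CONNECTED would
   start another worker, which matches the two-threads-per-cycle growth
   described above.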

Comment 2 Markus Armbruster 2009-12-02 09:22:08 UTC
Chris Lalancette was able to reproduce the bug.  The proposed patch looks like the simplest way to fix it.

Comment 3 Markus Armbruster 2009-12-03 10:28:23 UTC
John, do you intend to submit your patch for linux-2.6.18-xen.hg?

Comment 4 John Haxby 2009-12-03 12:20:46 UTC
Markus,

Just as I started looking into this I found that Chris Lalancette had already
sent it to xen-devel (thanks Chris).

http://markmail.org/thread/3lynfle5mr6x2ro6

Comment 5 Markus Armbruster 2009-12-03 12:41:55 UTC
Sorry, I should have touched base with Chris before bothering you.  Thanks!

Comment 6 Don Zickus 2009-12-11 19:31:13 UTC
in kernel-2.6.18-179.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5

Please update the appropriate value in the Verified field
(cf_verified) to indicate this fix has been successfully
verified. Include a comment with verification details.

Comment 8 john.haxby@oracle.com 2009-12-23 12:59:38 UTC
Your kernel works for us (sorry, forgot to update the bug earlier).

Comment 10 errata-xmlrpc 2010-03-30 07:40:54 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0178.html