Created attachment 373753
Patch to fix the problem

Description of problem:
Previously, in RHEL5.3, ps reported just one [xenfb thread] in an RHEL5 Xen PV guest, and that one thread persisted across subsequent live migrations or save/restore cycles. Some time in RHEL5.4 (I'm not sure when), the patch linux-2.6-xen-fbfront-dirty-race.patch was introduced to fix bug 456893 (which I can't read). As a result, the guest now starts with two xenfb threads and gains another two every time it is live migrated or saved and restored.

Version-Release number of selected component (if applicable):
kernel-xen-2.6.18-164.el5 and later (and possibly earlier)

How reproducible:
Always

Steps to Reproduce:
1. ps -ef | grep xenfb
2. xm save; xm restore in dom0 or live migrate
3. ps -ef | grep xenfb

Actual results:
At step 1 you'll see ps reporting two [xenfb thread] kernel threads; at step 3 you'll see four of them. If you repeat step 2 you'll get another two xenfb threads. Eventually, I'm sure Bad Things Will Happen.

Expected results:
Just one [xenfb thread]

Additional info:
In linux-2.6-xen-fbfront-dirty-race.patch, creation of the xenfb_thread thread is correctly deferred until XenbusStateConnected. However, that state is entered every time the framebuffer is connected: this seems to happen twice on initial boot and another twice after each successful save/restore or live migration. The attached patch simply avoids creating the thread if it already exists. There doesn't seem to be any possibility of introducing another race condition (the previous check for dev->state would seem to be enough of a barrier), and there also doesn't seem to be any need to create a new thread rather than using the existing one.
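For readers without access to the attachment, the idea of the fix can be sketched in user-space C. This is not the attached patch: the struct, its kthread field, and the helpers start_thread_sketch and on_connected_sketch are hypothetical stand-ins for the driver's per-device state, for kthread_run(), and for the XenbusStateConnected handler. It only illustrates the guard "create the thread only if none exists yet".

```c
#include <stddef.h>

/* Hypothetical stand-in for the driver's per-device state. */
struct fb_info_sketch {
    void *kthread;        /* NULL until the worker thread has been started */
    int threads_started;  /* counter used only to observe the behaviour */
};

static int dummy_thread;  /* placeholder object representing the thread */

/* Stand-in for kthread_run(): records that a thread was started. */
static void *start_thread_sketch(struct fb_info_sketch *info)
{
    info->threads_started++;
    return &dummy_thread;
}

/* Stand-in for the handler run each time the Connected state is entered. */
static void on_connected_sketch(struct fb_info_sketch *info)
{
    /* The guard: reuse the existing thread instead of creating another. */
    if (info->kthread == NULL)
        info->kthread = start_thread_sketch(info);
}
```

With this guard, calling on_connected_sketch() twice at boot and twice more after each save/restore still leaves threads_started at 1; without the NULL check, every call would start another thread, which matches the growth seen in ps.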
Chris Lalancette was able to reproduce the bug. The proposed patch looks like the simplest way to fix it.
John, do you intend to submit your patch for linux-2.6.18-xen.hg?
Markus, just as I started looking into this, I found that Chris Lalancette had already sent it to xen-devel (thanks Chris). http://markmail.org/thread/3lynfle5mr6x2ro6
Sorry, I should have touched base with Chris before bothering you. Thanks!
in kernel-2.6.18-179.el5

You can download this test kernel from http://people.redhat.com/dzickus/el5

Please update the appropriate value in the Verified field (cf_verified) to indicate this fix has been successfully verified. Include a comment with verification details.
Your kernel works for us (sorry, forgot to update the bug earlier).
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2010-0178.html