Bug 228890 - Pinned CPUs of guests are unpinned after save/restore sequence
Summary: Pinned CPUs of guests are unpinned after save/restore sequence
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: xen
Version: 5.0
Hardware: x86_64
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Jiri Denemark
QA Contact:
URL:
Whiteboard:
Duplicates: 345331
Depends On:
Blocks: 477162
Reported: 2007-02-15 19:17 UTC by Joseph Szczypek
Modified: 2018-10-20 02:34 UTC (History)

Fixed In Version: xen-3.0.3-85.el5
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-09-02 10:06:58 UTC
Target Upstream Version:
Embargoed:


Attachments
Patch to fix this bug (4.44 KB, patch)
2008-12-02 17:36 UTC, Jiri Denemark


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2009:1328 | 0 | normal | SHIPPED_LIVE | xen bug fix and enhancement update | 2009-09-01 10:32:30 UTC

Description Joseph Szczypek 2007-02-15 19:17:07 UTC
Description of problem:
I have a 4 core 16GB x86_64 system.  Dom0 is set up to use 1GB.  There are three
paravirtualized guests, where each guest has one VCPU which is pinned to one
physical core (using the 'cpus=' entry in the guest's config file, e.g. cpus =
"1").  Each guest gets its own physical core.  Guests are started with 'xm
create' and 'xm vcpu-list' shows the correct CPU affinity.  Guests are set up to
start automatically upon bootup of Dom0 (there are entries for the guests in
/etc/xen/auto).  After a reboot of Dom0, 'xm vcpu-list' indicates that the
guests now have 'any cpu' affinity.
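
For reference, a minimal sketch of the kind of guest config file described
above. Only the cpus = "1" entry comes from this report; the file name, guest
name and memory size are hypothetical placeholders.

```python
# /etc/xen/guest1  (hypothetical file name; xm config files use Python syntax)
name = "guest1"    # hypothetical guest name
memory = 4096      # hypothetical memory size in MB
vcpus = 1          # one VCPU per guest, as in the report
cpus = "1"         # pin the guest's VCPU to physical CPU 1
```

With three such guests, each file would name a different physical CPU in its
cpus entry.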

Dom0 and guests are all running the same version of OS.

Version-Release number of selected component (if applicable):
2.6.18-8.el5xen 

How reproducible:
Every time

Steps to Reproduce:
1. Pin a VCPU to a physical cpu via the config file entry 'cpus=', start up the
guest, and verify cpu affinity with the 'xm vcpu-list' command
2. Reboot dom0, which will save/restore the guest
3. Do another 'xm vcpu-list' and check cpu affinity
  
Actual results:
'xm vcpu-list' states that cpu-affinity is 'any cpu'.

Expected results:
'xm vcpu-list' would list cpu-affinity as set up in guest config file.

Additional info:
Disabling save/restore feature in /etc/sysconfig/xendomains (by making
XENDOMAINS_SAVE equal to nothing) does result in cpu affinity being as expected
after a reboot.

Comment 1 Daniel Berrangé 2007-03-27 15:45:40 UTC
After a little investigation this is a flaw in the save/restore code, rather
than the initscript:

# virsh vcpuinfo demo
VCPU:           0
CPU:            0
State:          blocked
CPU time:       9.6s
CPU Affinity:   yy

# virsh vcpupin demo 0 1

[root@pumpkin ~]# virsh vcpuinfo demo
VCPU:           0
CPU:            1
State:          blocked
CPU time:       17.1s
CPU Affinity:   -y

# virsh save demo /var/lib/xen/demo.save 
Domain demo saved to /var/lib/xen/demo.save

# virsh restore /var/lib/xen/demo.save 
Domain restored from /var/lib/xen/demo.save

# virsh vcpuinfo demo
VCPU:           0
CPU:            0
State:          blocked
CPU time:       0.0s
CPU Affinity:   yy
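
The before/after comparison in the transcript above could be automated roughly
as follows. This is only a sketch: the helper names are made up for
illustration, the sample text is copied from the transcript, and it assumes
that each position in the "CPU Affinity" mask stands for one physical CPU, with
'y' meaning the VCPU may run there.

```python
def parse_vcpuinfo(text):
    """Parse `virsh vcpuinfo` output into a list of dicts, one per VCPU."""
    vcpus = []
    for line in text.splitlines():
        if ":" not in line:
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        if key == "VCPU":
            vcpus.append({})          # a "VCPU:" line starts a new record
        if vcpus:
            vcpus[-1][key] = value
    return vcpus


def pinned_cpus(mask):
    """Turn an affinity mask such as '-y' into a list of allowed CPU numbers."""
    return [cpu for cpu, flag in enumerate(mask) if flag == "y"]


# Sample output taken from the transcript: after vcpupin, and after restore.
BEFORE_SAVE = """\
VCPU:           0
CPU:            1
State:          blocked
CPU time:       17.1s
CPU Affinity:   -y
"""

AFTER_RESTORE = """\
VCPU:           0
CPU:            0
State:          blocked
CPU time:       0.0s
CPU Affinity:   yy
"""

before = [pinned_cpus(v["CPU Affinity"]) for v in parse_vcpuinfo(BEFORE_SAVE)]
after = [pinned_cpus(v["CPU Affinity"]) for v in parse_vcpuinfo(AFTER_RESTORE)]
print(before, after, before == after)   # [[1]] [[0, 1]] False
```

A mismatch between the two lists is the symptom reported in this bug.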



Comment 2 RHEL Program Management 2007-03-27 16:03:54 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 3 RHEL Program Management 2007-09-07 19:55:06 UTC
This request was previously evaluated by Red Hat Product Management
for inclusion in the current Red Hat Enterprise Linux release, but
Red Hat was unable to resolve it in time.  This request will be
reviewed for a future Red Hat Enterprise Linux release.

Comment 4 Joe Pruett 2008-10-03 16:22:46 UTC
i can confirm that this bug still exists in 5.2.  i'll see if there is a fix from xensource that can be rolled in.

Comment 6 Chris Lalancette 2008-10-14 17:45:42 UTC
*** Bug 345331 has been marked as a duplicate of this bug. ***

Comment 7 Jiri Denemark 2008-12-02 17:36:29 UTC
Created attachment 325394
Patch to fix this bug

First, CPU affinity information is lost when saving a guest as it is not saved
to its state file. Second, CPU affinity is ignored when resuming a guest.

The attached patch fixes both sides of this bug by combining several upstream 
patches...

The second and the third chunk (-317,7 +325,7; -348,22 +356,33) is a backport 
from http://xenbits.xensource.com/xen-unstable.hg?rev/853853686147. Despite 
its upstream description, it does not fix this bug; it only allows CPUs to be 
specified as a list.

The fourth chunk (-1271,6 +1290,9) outputs the cpus list into the sxpr, which 
is then saved into the guest's state file. This is similar to 
http://xenbits.xensource.com/xen-unstable.hg?rev/85198c4d4da5.

That covers the first side of this bug. The other side is covered by the rest 
of the patch (1st, 5th, 6th chunk), which is a backport of 
http://xenbits.xensource.com/xen-unstable.hg?rev/76e90ac5067e and 
http://xenbits.xensource.com/xen-unstable.hg?rev/857bda0c15b3 (to be more 
exact, the upstream patches were forward-ported from the RHEL fix).
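
The two sides of the bug can be illustrated with a small model. This is not the
xend code and not the attached patch; the function names and the nested-list
layout are assumptions made purely for illustration. The first function stands
for writing the cpus list into the saved state, the last one for honouring it
on restore.

```python
def config_to_sxpr(config):
    """Serialise a guest config dict to a nested-list S-expression.

    Side one of the bug: without the 'cpus' entry here, the affinity
    never reaches the state file.
    """
    sxpr = ["domain", ["name", config["name"]], ["vcpus", config["vcpus"]]]
    if config.get("cpus"):
        sxpr.append(["cpus", list(config["cpus"])])
    return sxpr


def sxpr_to_config(sxpr):
    """Rebuild the config dict from the saved S-expression."""
    return {key: value for key, value in sxpr[1:]}


def affinity_on_restore(config, physical_cpus):
    """Side two of the bug: use the saved 'cpus' list instead of ignoring it.

    Falls back to every physical CPU ('any cpu') when nothing was saved.
    """
    return config.get("cpus") or list(range(physical_cpus))


saved = config_to_sxpr({"name": "demo", "vcpus": 1, "cpus": [1]})
restored = sxpr_to_config(saved)
print(affinity_on_restore(restored, physical_cpus=2))   # [1]
```

Dropping either step reproduces the symptom in this model: the restored guest
ends up allowed on every CPU.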

Comment 8 Jiri Denemark 2009-02-23 11:07:13 UTC
A test package which fixes this issue (and several others as well) has been
made available at:

http://people.redhat.com/jdenemar/xen/

Could the reporter try it out and report if it fixes the problem or not?

Thank you for your cooperation.

Comment 9 Joseph Szczypek 2009-02-25 16:24:48 UTC
(In reply to comment #8)
> A test package which fixes this issue (and several others as well) has been
> made available at:
> 
> http://people.redhat.com/jdenemar/xen/
> 
> Could the reporter try it out and report if it fixes the problem or not?
> 
> Thank you for your cooperation.

I will try out the test package.

Comment 10 Joseph Szczypek 2009-02-27 23:14:51 UTC
Took a look at the packages.   I have stock bits installed, not updated bits (I have a 5.0 system set up, and also had a 5.1 system; I applied the packages to it, but now it doesn't seem to work right).   I may need to apply the source code and build a new xen.   To which releases can I apply the source?  5.1? 5.2? 5.0?   The packages are all labeled 3.0.3, but it looks like I need Xen 3.1?

Comment 11 Chris Lalancette 2009-02-28 11:12:49 UTC
Oh yeah, 5.0 is way too old.  Even 5.1 is probably too old.  I would suggest upgrading the machine to 5.3 if you can, and then trying the updated packages based on that.  If you don't have 5.3, 5.2 will *probably* also work, although there was a fair bit of work done between 5.2 and 5.3, so YMMV.

Chris Lalancette

Comment 12 Joseph Szczypek 2009-03-06 02:52:50 UTC
Applied the xen (x86_64) and xen-libs (x86_64 and i386) rpms to a 5.3 system.  Reran the reproducer.   Problem no longer present.  Guest had cpu affinity defined in the config file.   It was started.   A reboot of Dom0 was done (with a Save/Restore across this reboot).  Affinity was correct after reboot.

I also tried an experiment with a guest which had no cpu affinity in the config file.  I pinned the guest to a couple of cpus with virsh vcpupin.  I then rebooted Dom0.   Save/Restore was done, but CPU affinity was not preserved across reboot for this case.

Comment 13 Jiri Denemark 2009-03-06 10:21:05 UTC
Great, thank you for the testing.

The result of the second experiment is actually expected behavior. Runtime changes to CPU affinity are not preserved during save/restore. Whatever dynamically changes CPU affinity for running domains has to update it again after restoring a saved domain.

The idea is that affinity set in a config file is meant for environments with a static set of running guests. If there is a need to change the affinity at runtime, it's most likely because the set of running guests changes dynamically, and preserving CPU affinity during save/restore would likely lead to collisions with other guests.
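
A tool that pins VCPUs at runtime could re-apply its pins after a restore along
these lines. This is a sketch only: the function name and the shape of the
pins mapping are hypothetical, and the commands are built but not executed.

```python
def vcpupin_commands(domain, pins):
    """Build one `virsh vcpupin` argument list per VCPU.

    pins maps a VCPU number to the list of physical CPUs it may run on
    (hypothetical bookkeeping kept by whatever manages the pinning).
    """
    commands = []
    for vcpu in sorted(pins):
        cpulist = ",".join(str(cpu) for cpu in pins[vcpu])
        commands.append(["virsh", "vcpupin", domain, str(vcpu), cpulist])
    return commands


# Same pin as in comment 1: VCPU 0 of guest "demo" on physical CPU 1.
for argv in vcpupin_commands("demo", {0: [1]}):
    print(" ".join(argv))   # virsh vcpupin demo 0 1
```

Each argument list could then be passed to subprocess.check_call() once the
domain has been restored.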

Comment 14 Jiri Denemark 2009-05-11 13:39:45 UTC
Fix built into xen-3.0.3-85.el5

Comment 16 Chris Ward 2009-07-03 17:57:03 UTC
~~ Attention - RHEL 5.4 Beta Released! ~~

RHEL 5.4 Beta has been released! There should be a fix present in the Beta release that addresses this particular request. Please test and report back results here, at your earliest convenience. RHEL 5.4 General Availability release is just around the corner!

If you encounter any issues while testing Beta, please describe the issues you have encountered and set the bug into NEED_INFO. If you encounter new issues, please clone this bug to open a new issue and request it be reviewed for inclusion in RHEL 5.4 or a later update, if it is not of urgent severity.

Please do not flip the bug status to VERIFIED. Only post your verification results, and if available, update Verified field with the appropriate value.

Questions can be posted to this bug or your customer or partner representative.

Comment 17 zhanghaiyan 2009-07-30 02:28:55 UTC
Verified this bug on xen-3.0.3-91.el5 and cannot reproduce.

Comment 19 errata-xmlrpc 2009-09-02 10:06:58 UTC
An advisory has been issued which should resolve the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-1328.html

