Red Hat Bugzilla – Bug 228890
Pinned CPUs of guests are unpinned after save/restore sequence
Last modified: 2010-10-22 09:04:01 EDT
Description of problem:
I have a 4-core, 16 GB x86_64 system. Dom0 is set up to use 1 GB. There are three
paravirtualized guests, each with one VCPU pinned to one physical core (using
the 'cpus=' entry in the guest's config file, i.e. cpus = "1"). Each guest gets
its own physical core. Guests are started with 'xm create', and 'xm vcpu-list'
shows the correct CPU affinity. Guests are set up to start automatically when
Dom0 boots (there are entries for the guests in /etc/xen/auto). After a reboot
of Dom0, 'xm vcpu-list' indicates the guests now have a CPU affinity of 'any cpu'.
Dom0 and the guests are all running the same version of the OS.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Pin a VCPU to a physical CPU via the config file entry 'cpus=', start up the
guest, and verify CPU affinity with the 'xm vcpu-list' command
2. Reboot Dom0, which will save/restore the guest
3. Run 'xm vcpu-list' again and check CPU affinity
Actual results:
'xm vcpu-list' states that cpu-affinity is 'any cpu'.
Expected results:
'xm vcpu-list' would list cpu-affinity as set up in the guest config file.
Additional info:
Disabling the save/restore feature in /etc/sysconfig/xendomains (by setting
XENDOMAINS_SAVE to an empty value) does result in CPU affinity being as expected
after a reboot.
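The check in steps 1 and 3 can be automated. The following is only a rough sketch: the column layout of the sample text is an assumption modelled on typical 'xm vcpu-list' output, the guest names are made up, and parse_vcpu_list and unpinned are hypothetical helper names.

```python
def parse_vcpu_list(text):
    """Return {(domain_name, vcpu_number): affinity_string} from vcpu-list style text."""
    affinities = {}
    for line in text.strip().splitlines()[1:]:  # skip the header row
        # Assumed layout: Name, ID, VCPU, CPU, State, Time(s), CPU Affinity.
        # The affinity column comes last and may contain spaces ("any cpu").
        fields = line.split(None, 6)
        if len(fields) < 7:
            continue
        affinities[(fields[0], int(fields[2]))] = fields[6].strip()
    return affinities


def unpinned(affinities, expected):
    """List the (domain, vcpu) pairs whose affinity differs from the expected value."""
    return sorted(key for key, want in expected.items() if affinities.get(key) != want)


# Illustrative text only, not captured from a real system.
SAMPLE = """\
Name                              ID  VCPU   CPU State   Time(s) CPU Affinity
Domain-0                           0     0     0   r--     120.5 any cpu
guest1                             1     0     2   -b-      10.2 any cpu
guest2                             2     0     2   -b-       9.8 2
"""

# What the 'cpus=' entries in the (hypothetical) guest config files ask for.
EXPECTED = {("guest1", 0): "1", ("guest2", 0): "2"}

print(unpinned(parse_vcpu_list(SAMPLE), EXPECTED))  # [('guest1', 0)]
```

Running the same comparison before and after the Dom0 reboot would show which guests lost their pinning.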
After a little investigation, this turns out to be a flaw in the save/restore
code rather than the initscript:
# virsh vcpuinfo demo
CPU time: 9.6s
CPU Affinity: yy
# virsh vcpupin demo 0 1
[root@pumpkin ~]# virsh vcpuinfo demo
CPU time: 17.1s
CPU Affinity: -y
# virsh save demo /var/lib/xen/demo.save
Domain demo saved to /var/lib/xen/demo.save
# virsh restore /var/lib/xen/demo.save
Domain restored from /var/lib/xen/demo.save
# virsh vcpuinfo demo
CPU time: 0.0s
CPU Affinity: yy
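For anyone reading the transcript above: in the 'CPU Affinity' string each character stands for one physical CPU, with 'y' meaning the VCPU may run there and '-' meaning it may not. A small sketch of decoding it (allowed_cpus and is_pinned are hypothetical helper names, not part of virsh):

```python
def allowed_cpus(affinity):
    """Decode a virsh 'CPU Affinity' string: 'y' = CPU allowed, '-' = CPU not allowed."""
    return [cpu for cpu, flag in enumerate(affinity.strip()) if flag == "y"]


def is_pinned(affinity):
    """A VCPU is restricted as soon as at least one physical CPU is excluded."""
    return "-" in affinity


# The three masks seen in the transcript above, on a 2-CPU host.
for label, mask in [("before pin", "yy"), ("after pin", "-y"), ("after restore", "yy")]:
    print(label, allowed_cpus(mask), "pinned" if is_pinned(mask) else "any cpu")
```

So 'yy' after the restore means the VCPU is back to running on either CPU, i.e. the pinning to CPU 1 was lost.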
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release. Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products. This request is not yet committed for inclusion in an Update release.
This request was previously evaluated by Red Hat Product Management
for inclusion in the current Red Hat Enterprise Linux release, but
Red Hat was unable to resolve it in time. This request will be
reviewed for a future Red Hat Enterprise Linux release.
I can confirm that this bug still exists in 5.2. I'll see if there is a fix from XenSource that can be rolled in.
Duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=345331
*** Bug 345331 has been marked as a duplicate of this bug. ***
Created attachment 325394
Patch to fix this bug
First, CPU affinity information is lost when saving a guest as it is not saved
to its state file. Second, CPU affinity is ignored when resuming a guest.
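To make the two sides concrete, here is a toy model, not xend code. It assumes a domain can be represented as a plain dict and uses JSON as a stand-in for the real state file format; save_state, restore_state and their flags are hypothetical names chosen for illustration.

```python
import json


def save_state(domain, include_cpus):
    """Serialize a domain; include_cpus=False models the save side of the bug."""
    state = {"name": domain["name"], "vcpus": domain["vcpus"]}
    if include_cpus:
        state["cpus"] = domain["cpus"]
    return json.dumps(state)


def restore_state(blob, host_cpus, apply_cpus):
    """Rebuild a domain; apply_cpus=False models the restore side of the bug."""
    state = json.loads(blob)
    any_cpu = list(range(host_cpus))
    # Fall back to "any cpu" when affinity is missing from the state or ignored.
    cpus = state.get("cpus", any_cpu) if apply_cpus else any_cpu
    return {"name": state["name"], "vcpus": state["vcpus"], "cpus": cpus}


guest = {"name": "demo", "vcpus": 1, "cpus": [1]}
for include_cpus in (False, True):
    for apply_cpus in (False, True):
        restored = restore_state(save_state(guest, include_cpus), 2, apply_cpus)
        print(include_cpus, apply_cpus, restored["cpus"])
```

Only the combination where affinity is both written on save and applied on restore brings the guest back pinned to CPU 1, which is why fixing just one side is not enough.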
The attached patch fixes both sides of this bug by combining several upstream
changes.
The second and third chunks (-317,7 +325,7; -348,22 +356,33) are a backport
of http://xenbits.xensource.com/xen-unstable.hg?rev/853853686147. Despite
its upstream description, it does not fix this bug; it only allows CPUs to be
specified as a list.
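As an illustration of what turning a 'cpus' value into a list of CPUs involves, here is a minimal sketch. It assumes the range syntax shown in Xen's example guest configs (e.g. "0-3,5,^1"); the exact behavior of the backported changeset is not spelled out in this report, and parse_cpus is a hypothetical name, not the function in the patch.

```python
def parse_cpus(spec):
    """Expand a cpus string such as "0-3,5,^1" into a sorted list of CPU numbers."""
    included, excluded = set(), set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        target = included
        if part.startswith("^"):  # "^N" removes CPU N from the set
            target, part = excluded, part[1:]
        if "-" in part:  # "A-B" is an inclusive range
            low, high = (int(n) for n in part.split("-", 1))
            target.update(range(low, high + 1))
        else:
            target.add(int(part))
    return sorted(included - excluded)


print(parse_cpus("1"))         # [1]
print(parse_cpus("0-3,5,^1"))  # [0, 2, 3, 5]
```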
The fourth chunk (-1271,6 +1290,9) outputs the cpus list into the sxpr, which is
then saved into the guest's state file. This is similar to
That covers the first side of this bug. The other side is covered by the rest
of the patch (1st, 5th, and 6th chunks), which is a backport of
http://xenbits.xensource.com/xen-unstable.hg?rev/857bda0c15b3 (actually, to be
more exact, the upstream patches were forward-ported from the RHEL fix).
A test package which fixes this issue (and several others as well) has been
made available at:
Could the reporter try it out and report if it fixes the problem or not?
Thank you for your cooperation.
(In reply to comment #8)
> A test package which fixes this issue (and several others as well) has been
> made available at:
> Could the reporter try it out and report if it fixes the problem or not?
> Thank you for your cooperation.
I will try out the test package.
Took a look at the packages. I have stock bits installed, not updated bits (I have a 5.0 system set up, and also had a 5.1 system; I applied the packages to the 5.1 system, but now it doesn't seem to work right). I may need to apply the source code and build a new xen. To which releases can I apply the source? 5.1? 5.2? 5.0? The packages are all labeled 3.0.3, but it looks like I need Xen 3.1?
Oh yeah, 5.0 is way too old. Even 5.1 is probably too old. I would suggest upgrading the machine to 5.3 if you can, and then trying the updated packages based on that. If you don't have 5.3, 5.2 will *probably* also work, although there was a fair bit of work done between 5.2 and 5.3, so YMMV.
Applied the xen (x86_64) and xen-libs (x86_64 and i386) rpms to a 5.3 system. Reran the reproducer. Problem no longer present. Guest had cpu affinity defined in the config file. It was started. A reboot of Dom0 was done (with a Save/Restore across this reboot). Affinity was correct after reboot.
I also tried an experiment with a guest which had no cpu affinity in the config file. I pinned the guest to a couple of cpus with virsh vcpupin. I then rebooted Dom0. Save/Restore was done, but CPU affinity was not preserved across reboot for this case.
Great, thank you for the testing.
The result of the second experiment is actually expected behavior. Runtime changes to CPU affinity are not preserved during save/restore. Whatever dynamically changes CPU affinity for running domains has to update it again after restoring a saved domain.
The idea is that if it's in a config file, it's used for environments with a static set of running guests. If there is a need to change the affinity at runtime, it's most likely because the set of running guests changes dynamically, and preserving CPU affinity during save/restore would likely lead to collisions with other guests.
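For setups that do pin at runtime, the re-pinning after a restore could be scripted roughly like this. This is a sketch only: it assumes the management tool keeps its own record of the desired pinning (the DESIRED dict and the helper names here are hypothetical) and replays it with the same 'virsh vcpupin' command used earlier in this report.

```python
import subprocess


def vcpupin_commands(pinning):
    """Build one 'virsh vcpupin <domain> <vcpu> <cpulist>' command per pinned VCPU."""
    commands = []
    for (domain, vcpu), cpus in sorted(pinning.items()):
        cpulist = ",".join(str(cpu) for cpu in cpus)
        commands.append(["virsh", "vcpupin", domain, str(vcpu), cpulist])
    return commands


def reapply(pinning, run=subprocess.check_call):
    """Run the commands; 'run' can be swapped out for a dry run or a test."""
    for command in vcpupin_commands(pinning):
        run(command)


# Hypothetical record of runtime pinning: VCPU 0 of guest 'demo' on physical CPU 1.
DESIRED = {("demo", 0): [1]}
print(vcpupin_commands(DESIRED))  # [['virsh', 'vcpupin', 'demo', '0', '1']]
```

Calling reapply(DESIRED) after 'virsh restore' would then put the runtime pinning back; only the command list is printed here so the sketch can be tried without a running guest.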
Fix built into xen-3.0.3-85.el5
~~ Attention - RHEL 5.4 Beta Released! ~~
RHEL 5.4 Beta has been released! There should be a fix present in the Beta release that addresses this particular request. Please test and report back results here, at your earliest convenience. RHEL 5.4 General Availability release is just around the corner!
If you encounter any issues while testing Beta, please describe the issues you have encountered and set the bug into NEED_INFO. If you encounter new issues, please clone this bug to open a new issue and request it be reviewed for inclusion in RHEL 5.4 or a later update, if it is not of urgent severity.
Please do not flip the bug status to VERIFIED. Only post your verification results, and if available, update Verified field with the appropriate value.
Questions can be posted to this bug or your customer or partner representative.
Verified this bug on xen-3.0.3-91.el5 and cannot reproduce.
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.