Bug 513211 - save/restore will reset cpu weight for the domain
Summary: save/restore will reset cpu weight for the domain
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: xen
Version: 5.3
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Andrew Jones
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2009-07-22 14:38 UTC by Sachin Prabhu
Modified: 2018-10-27 15:56 UTC (History)

Fixed In Version: xen-3.0.3-97.el5
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-03-30 08:58:59 UTC
Target Upstream Version:
Embargoed:


Attachments
xend: save-restore resets cpu params (704 bytes, patch)
2009-08-05 15:31 UTC, Andrew Jones
ver2: save-restore preserve cpu params plus remove redundant param setting (2.32 KB, patch)
2009-10-07 11:17 UTC, Andrew Jones


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2010:0294 | 0 | normal | SHIPPED_LIVE | xen bug fix and enhancement update | 2010-03-29 14:20:32 UTC

Description Sachin Prabhu 2009-07-22 14:38:51 UTC
The cpu weight of a domain gets reset to the default one after a save/restore.

Steps to Reproduce:
1. Create a guest whose cpu_weight is 512

# xm sched-cr -d rhel5_fv_21
{'cap': 0, 'weight': 256}
# xm sched-credit -d rhel5_fv_21 -w 512 
# xm sched-cr -d rhel5_fv_21
{'cap': 0, 'weight': 512}


2. Save the guest with xm save <guest's name> savefile
# xm save rhel5_fv_21 51

3. Restore the guest with xm restore savefile
# xm restore 51

4. Show cpu_weight of the guest with xm sched-cr -d <guest's name>
# xm sched-cr -d rhel5_fv_21
{'cap': 0, 'weight': 256}


Actual Results:
  {'cap': 0, 'weight': 256}

Expected Results:
   {'cap': 0, 'weight': 512}
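
The before/after comparison in the steps above could be automated along the following lines. This is only a minimal sketch: it assumes the output of xm sched-credit -d is a single Python-style dict line as shown in this report, and the helper names parse_sched_credit and params_preserved are hypothetical, not part of xm or xend.

```python
import ast


def parse_sched_credit(output):
    """Parse one dict-style line as printed by `xm sched-credit -d <domain>`.

    Assumes the format shown in this report, e.g. {'cap': 0, 'weight': 512}.
    Returns a (weight, cap) tuple of ints.
    """
    params = ast.literal_eval(output.strip())
    return int(params["weight"]), int(params["cap"])


def params_preserved(before, after):
    """Return True when weight and cap are identical across save/restore."""
    return parse_sched_credit(before) == parse_sched_credit(after)


# Output captured before the save and after the restore, copied from the steps.
before = "{'cap': 0, 'weight': 512}"
after = "{'cap': 0, 'weight': 256}"

print(params_preserved(before, after))
```

With the values from this report the check reports a mismatch, which is the bug; on a fixed build both lines should parse to the same pair.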

Comment 1 Andrew Jones 2009-07-24 11:15:39 UTC
I've recreated the problem on the latest 5.4 and now I'm taking a look.

Comment 2 Andrew Jones 2009-07-24 14:24:21 UTC
I don't have an upstream build to test on right now.  I'm working on that.  In the meantime though, I'll just ask.  Do you know if this problem exists upstream as well?

Comment 8 Andrew Jones 2009-07-27 15:31:21 UTC
The save/restore of weight and cap doesn't appear to be implemented in either the RHEL code or upstream.  In both streams there is a block of code like the following in xend/XendDomainInfo.py that gets run during the restore path.

        # Check for cpu_{cap|weight} validity for credit scheduler if used
        if xen.lowlevel.xc.xc().sched_id_get() == xen.lowlevel.xc.XEN_SCHEDULER_CREDIT:
            cap = self.getCap()
            weight = self.getWeight()

            assert type(weight) == int
            assert type(cap) == int

            if weight < 1 or weight > 65535:
                raise VmError("Cpu weight out of range, valid values are within range from 1 to 65535")

            if cap < 0 or cap > self.getVCpuCount() * 100:
                raise VmError("Cpu cap out of range, valid range is from 0 to %s for specified number of vcpus" %
                              (self.getVCpuCount() * 100))


What's strange is we do nothing with weight and cap after we've grabbed and validated them.  The fix is to send these values back down to the hypervisor with a domctl call, i.e.

diff xend/XendDomainInfo.py xend/XendDomainInfo.py-bz513211 
1863a1864,1865
>             xc.sched_credit_domain_set(self.domid, weight, cap)
> 

I believe the same fix will work for upstream.  After some upstream testing, I'll submit a patch there first to get feedback before submitting a final patch here.
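
To illustrate the validate-then-apply pattern described above outside of xend, here is a rough standalone sketch. It is not the actual patch: FakeXc is a hypothetical stub that records calls instead of issuing a domctl, VmError is a local stand-in for xend's exception, and apply_credit_params is an invented helper name. Only the sched_credit_domain_set(domid, weight, cap) call and the range checks are taken from the snippets in this comment.

```python
class VmError(Exception):
    """Local stand-in for xend's VmError so the sketch runs on its own."""


class FakeXc(object):
    """Hypothetical stub for the xc handle; records calls for inspection."""

    def __init__(self):
        self.calls = []

    def sched_credit_domain_set(self, domid, weight, cap):
        # The real call would push the values down to the hypervisor.
        self.calls.append((domid, weight, cap))


def apply_credit_params(xc, domid, weight, cap, vcpu_count):
    """Validate weight and cap, then hand them to the scheduler."""
    if weight < 1 or weight > 65535:
        raise VmError("Cpu weight out of range, valid values are within "
                      "range from 1 to 65535")
    if cap < 0 or cap > vcpu_count * 100:
        raise VmError("Cpu cap out of range, valid range is from 0 to %s "
                      "for specified number of vcpus" % (vcpu_count * 100))
    # The step that was missing on the restore path: values were validated
    # but never sent back down.
    xc.sched_credit_domain_set(domid, weight, cap)


xc = FakeXc()
apply_credit_params(xc, 5, 512, 0, 2)
print(xc.calls)
```

Validating before the call keeps an out-of-range value from ever reaching the scheduler, which matches the order of operations in the existing xend block.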

Comment 9 Andrew Jones 2009-07-31 12:06:44 UTC
I submitted a patch upstream.  In the upstream the save path also needed a small fix, but the restore path change is the same as in #c8.  I'm waiting for some feedback from that post, but I think the fix for at least RHEL will remain as is.

Comment 10 Andrew Jones 2009-08-05 13:59:35 UTC
It turns out there was an upstream changeset added 2 weeks ago that fixes the exact same issue.  I missed it since I messed up and didn't do my testing on the tip.  Feedback from upstream pointed me to it, which is here, http://xenbits.xensource.com/xen-unstable.hg?rev/e07726c03d31.

All that said, the "port" of the upstream patch to RHEL will remain the same one-liner as in #c8. I'll post the patch in a moment.

Comment 11 Andrew Jones 2009-08-05 15:31:10 UTC
Created attachment 356327 [details]
xend: save-restore resets cpu params

Comment 12 Andrew Jones 2009-10-07 11:17:31 UTC
Created attachment 363955 [details]
ver2: save-restore preserve cpu params plus remove redundant param setting

In review it was found that this patch introduces a redundant param setting on the create path. The extra param setting call is removed in this version (2) of the patch.

Comment 14 Jiri Denemark 2009-11-13 22:23:31 UTC
Fix built into xen-3.0.3-97.el5

Comment 18 errata-xmlrpc 2010-03-30 08:58:59 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2010-0294.html

Comment 20 Paolo Bonzini 2010-04-08 15:49:11 UTC
This bug was closed during 5.5 development and it's being removed from the internal tracking bugs (which are now for 5.6).

