Description of Problem:
This is a REGRESSION. The issue does not occur on RHEL5.3GA. On RHEL5.4Alpha, the default value of a guest's cpu_weight is 1. This is incompatible with RHEL5.3 and earlier, where the default value of cpu_weight is 256. The change affects scheduling results.

Version-Release number of selected component:
Red Hat Enterprise Linux Version Number: 5
Release Number: 4 Alpha
Architecture: i386 x86_64 ia64
Kernel Version: 2.6.18-152.el5xen
Related Package Version: xen-3.0.3-87.el5
Related Middleware / Application: None

Drivers or hardware or architecture dependency: None.

How reproducible: Always

Steps to Reproduce:
1. Create a guest without cpu_weight.
2. Run `xm sched-cr -d <guest's name>`. The default value of cpu_weight is shown.

Actual Results: {'cap': 0, 'weight': 1}

Expected Results: {'cap': 0, 'weight': 256}

Business Impact:
Customers who do not set cpu_weight have a serious problem. Dom0's cpu_weight is still 256. Therefore, when dom0 is busy, guests cannot run.
Customers who do set cpu_weight may see unexpected CPU utilization in their guests.

Target Release: 5.4
Errata Request: None
Hotfix Request: None

Additional Info:
- This issue was introduced by Bug 345321: [RHEL 5.2] Xen 3.1.1: Fix sched params to stick on reboot and be accurate in xm list https://bugzilla.redhat.com/show_bug.cgi?id=345321
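To make the business impact concrete, here is a rough sketch of how the two defaults divide a contended CPU. It assumes an idealized model in which the credit scheduler gives each always-runnable domain a share proportional to its weight, with no cap and a single physical CPU. The function name `cpu_shares` and the domain name `guest` are made up for illustration and are not part of xm or xend.

```python
def cpu_shares(weights):
    """Return each domain's fraction of a fully contended CPU.

    Assumption: share is proportional to weight (idealized credit
    scheduler, no cap, every domain always runnable).
    """
    total = float(sum(weights.values()))
    return {name: weight / total for name, weight in weights.items()}


# RHEL 5.3 default: dom0 and the guest both have weight 256.
old = cpu_shares({"Domain-0": 256, "guest": 256})

# RHEL 5.4 Alpha default: dom0 keeps 256, the guest gets 1.
new = cpu_shares({"Domain-0": 256, "guest": 1})

print("guest share with default 256: {:.2%}".format(old["guest"]))  # 50.00%
print("guest share with default 1:   {:.2%}".format(new["guest"]))  # 0.39%
```

Under these assumptions a guest competing with a busy dom0 drops from half of the CPU to 1/257 of it.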
That's right, the default value changed, but according to the source code this has no effect, because there was no cpu_cap before and, although cpu_weight did exist, the function that handled it had no effect. That C function in libxc consisted of a single "return" line, so it did nothing. Therefore this is not a regression: the value was never used before. The change is not mainly about adding cpu_cap alongside cpu_weight but about adding the entire cpu_cap/weight code, because none of this worked prior to the fix in BZ #345321. When there was only cpu_weight, the code did not call xc.sched_credit_domain_set(), which is properly defined in libxc. Instead it called a bogus function, xc.set_cpu_weight() or something similar, whose body contained only a single return, as a workaround to add only the cpu_weight code at the time. Finally, since this did not match upstream, BZ #345321 was filed to add not only cpu_weight support but cpu_cap support as well, so this is not a regression.
Just a note: the value was changed to 1 to match upstream. The function used already existed upstream, and upstream also changed the default from 256 to 1, so this is not a regression if cpu_weight did nothing before this patch. Therefore I am closing it as NOTABUG...
There is no CPU_CAP, but the weight has adjusted the scheduler priority of guests ever since RHEL 5 GA. Changing the default weight from 256 to 1 could have some effects: 1) unable to create lower-than-default-priority guests 2) breaking scripts that customers may have
But the weight value was never used prior to the version with the fix for BZ #345321. It was prepared, but the argument was passed to a function that did absolutely nothing because it contained only a return statement. Ad 1: I don't think so. This was done to match upstream, so that a float is passed there instead of an int. So even 0.1 is possible now, which it was not before. Ad 2: Could you give me an example of why it should break something? Once again, the value was *never* used. It was just passed to a dummy function that had only a return statement and nothing else. The call to this function has been replaced by a call to xc.sched_credit_domain_set(), which was already defined in libxc but never used. My patch makes use of this function.
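The difference this comment describes can be sketched as follows. This is not the xend or libxc code; `FakeScheduler`, `set_weight_stub` and `set_weight_real` are hypothetical names that only illustrate the claim that a setter whose body is a bare return never changes scheduler state, while a real setter does.

```python
class FakeScheduler(object):
    """Hypothetical stand-in for the hypervisor's per-domain scheduler state."""

    def __init__(self):
        self.weight = 256


def set_weight_stub(sched, weight):
    # The body is only a return, so the argument never reaches the scheduler.
    return


def set_weight_real(sched, weight):
    # A working setter actually stores the new weight.
    sched.weight = weight


sched = FakeScheduler()
set_weight_stub(sched, 2048)
print(sched.weight)  # still 256

set_weight_real(sched, 2048)
print(sched.weight)  # now 2048
```

Whether the pre-5.4 code path really behaved like the stub is exactly what is under discussion in this bug.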
This command most definitely does something in RHEL 5.3 and before: xm sched-cred -d Domain-0 -w 2048 It raises the priority of domain 0 over that of the guests. The result is very noticeable and visible in top (less steal time in dom0, more in the guests).
Ok, Rik, I need to clarify this issue, because now I see what you mean. The command appeared to work all along, but only because it saves the currently set value; that value was never passed on to anything. It ended up in a libxc function that contained only a return statement. I see what you are likely to ask next: why did `xm sched-cred -d domain` return a value that you, or something else (e.g. the initial domain creation itself), had set? The answer is that this is managed by xen itself. Nothing is done when the domain is changed, so in RHEL 5.3 the setting did nothing; the code was prepared to support it but there was no working implementation. BZ #345321 added this functionality and also extended it with cpu_cap. Chris and I worked on this together, and he finally pointed out that the function being called does nothing, which is exactly what I am trying to tell you. I see no reason why people should define cpu_weight manually before xen-3.0.3-85.el5 (where this was built in), because it had no effect. The command you wrote does nothing: it sets cpu_weight to 2048 and nothing more. Support for cpu_weight (making it actually do something) was added in BZ #345321 and never worked before, so if a user set it in RHEL 5.3, it had no effect.
Adam, could you give me your domain config file? As you can see from the previous comments, this was not working in the previous xend version. If you have cpu_weight set in your config files, it did nothing before the new RPMs but does have an effect now, so this is basically a user error (relying on something that was not working) and not a regression...
Created attachment 350757: Fix for this BZ
This fix sets the CPU_WEIGHT default value back to 256.
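The intended behaviour after the fix can be sketched as below. This is an illustration only, not the contents of attachment 350757: `sched_params`, the constant names and the dict-based config are assumptions; only the keys `cpu_weight` and `cpu_cap` and the values 256 and 0 come from this report.

```python
DEFAULT_CPU_WEIGHT = 256  # default restored by the fix, per this report
DEFAULT_CPU_CAP = 0       # cap value shown in the reported xm output


def sched_params(config):
    """Return the scheduler parameters for a guest config (hypothetical helper).

    A guest that does not set cpu_weight falls back to the default.
    """
    return {
        "cap": config.get("cpu_cap", DEFAULT_CPU_CAP),
        "weight": config.get("cpu_weight", DEFAULT_CPU_WEIGHT),
    }


# Guest created without cpu_weight: matches the Expected Results above.
print(sched_params({}))

# Guest that sets its own weight keeps it.
print(sched_params({"cpu_weight": 512}))
```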
Fix built into xen-3.0.3-89.el5
An advisory has been issued which should address the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2009-1328.html