Bug 629614
Description
Jon Thomas
2010-09-02 14:24:45 UTC
Created attachment 443544 [details]
concurrency-v3.patch
Here's the third version of the patch.

The first version was:
- begin negotiation loop
- negotiate
- tally usage (any rejections?)
- fairshare(# rejections)
- goto begin negotiation loop

This works well for flat configs where the depth of the tree is 1, with no subgroups.

The second and third versions are:
- begin loop
- bubble up unused slots
- fairshare(unused)
- roundoff
- negotiate
- tally usage
- set #submitters to reflect usage
- goto begin loop

The second version handles fair-share for subgroups.

To reproduce, create a group config with autoregroup and set a LIMIT in the config, such as:

users_LIMIT = 1

Then change the sdf of a job to have:

concurrency_limits = users

What happens is that although the job should have gotten x slots, it gets only one due to concurrency limits. That's correct behavior. The problem is that the (x-1) slots don't get distributed across the other groups, and you can see this by looking at expected versus actual slots used.

To thoroughly test this, the concurrency limit needs to be used by:
1) a non-group user
2) a node that has submitters and children
3) a child
4) multiple groups - concurrency limits are first come/first served (I think one can config this issue away). The second group that has the same limit may not get any slots, but we need to make sure their slots get used by other groups too.
5) different users within a group

Created attachment 443635 [details]
concurency-v4.patch
Yet another patch. I made a last-minute change, but to the wrong value.
Created attachment 445772 [details]
concurrency-v5.patch
Another update. This would only impact the 5th test scenario. A value was initialized inside a loop when it should have been initialized outside it.
Created attachment 446080 [details]
tarball of testing config and submit files w/ README
tarball of some tests derived from jrt's description, plus some more I added
I verified that the -v5 patch works for the testing scenarios jrt described above.
There is a test I added to see if rejected slots get redistributed in a way that preserves fair-share. However, I think fair-share may not be preserved:
(following is also in the tarball I attached to this bz, under "test 7" in the readme file)
HFS config (plus CL setup)
==============================================
T1_LIMIT = 1
T2_LIMIT = 2
NUM_CPUS = 20
GROUP_NAMES = a, b, a.m, a.n, b.q, b.r
GROUP_QUOTA_DYNAMIC_a = 0.5
GROUP_QUOTA_DYNAMIC_b = 0.4
GROUP_QUOTA_DYNAMIC_a.m = 0.5
GROUP_QUOTA_DYNAMIC_a.n = 0.4
GROUP_QUOTA_DYNAMIC_b.q = 0.5
GROUP_QUOTA_DYNAMIC_b.r = 0.25
GROUP_AUTOREGROUP_a = TRUE
GROUP_AUTOREGROUP_b = TRUE
GROUP_AUTOREGROUP_a.m = TRUE
GROUP_AUTOREGROUP_a.n = TRUE
GROUP_AUTOREGROUP_b.q = TRUE
GROUP_AUTOREGROUP_b.r = TRUE
==============================================
Submit 10 jobs against CL "T2" and against group "a.user" (two of those jobs will run, leaving 18 slots).
Submit 15 jobs each against groups "b.q.user" and "b.r.user". What I expect is that fair-share distribution of the 18 slots gives 12 to "b.q" and 6 to "b.r" (their quota ratio is 2:1); however, the distribution I get is:
% condor_q -run -l | grep AccountingGroup | sort | uniq -c
2 AccountingGroup = "a.user"
14 AccountingGroup = "b.q.user"
4 AccountingGroup = "b.r.user"
Another data point: if I take the CL out of the picture (that is, no rejects) and just submit two jobs against "a", 15 against "b.q", and 15 against "b.r", the fair-share works as expected:
% condor_q -run -l | grep AccountingGroup | sort | uniq -c
2 AccountingGroup = "a.user"
12 AccountingGroup = "b.q.user"
6 AccountingGroup = "b.r.user"
Created attachment 446106 [details]
negotiator log of match cycle from fair-share failure in comment #9
Created attachment 446107 [details]
negotiator log of match cycle from similar submission w/out rejects
This is a sort of control submission that works as expected, for comparison against the previous attachment (test8.jsub.neglog), which shows the failure to achieve fair-share ratios when redistributing rejected slots.
Created attachment 446119 [details]
concurrency-v6.patch
This seems to fix the one test case; I have not yet tried the others.
The main fix was trimming groups with no submitters out of the second iteration. These groups don't negotiate, but their slots were already accounted for in the first pass, which resulted in starvation.
if ( groupArray[i].submitterAds.MyLength() == 0 ) {
    // No submitters: skip the group and zero its allowance so its
    // slots are not counted again in the second iteration.
    dprintf(D_ALWAYS, "Group %s - skipping, no submitters\n",
            groupArray[i].groupName.c_str());
    if (groupArray[i].child == -1) groupArray[i].maxAllowed = 0;
    else groupArray[i].nodemaxAllowed = 0;
    continue;
}
This patch also sets loophalt to 2 rather than 5.
Created attachment 446137 [details]
updated test suite for HFS and rejected slot redistribution
Rewrote the test suite to be cleaner and easier to use. Added a "non-CL reject" test just to sanity-check that the reason for rejection doesn't affect the behavior of HFS.
The -v6 patch passes all these tests.
The -v6 patch is passing all my tests. We may want to consider setting loophalt higher than 2; that would catch cases where slots untried in a previous iteration were rejected in the latest iteration. The previous setting of 5 is probably plenty.

That was my reasoning for 5; I'm fine with leaving it in. I was trying to come up with a number that made sense but didn't impact performance too much.

The fix has been pushed to V7_4-HFS-branch. It is equivalent to patch -v6, except the loophalt threshold is set back to 5.

Build in 7.4.4-0.12

Created attachment 450437 [details]
negotiator log for test7.jsub
I ran test7 from the attachment after submitting two jobs, one run as "b.user" and the second as "none.user". The result of test7 was not as expected; the problem is that the wrong number of slots goes to the second iteration of the negotiation cycle.
Tested with:
condor-7.4.4-0.16
Scenario:
$ cat submit.file
cmd = /bin/sleep
args = 2
+AccountingGroup = "b.user"
queue
+AccountingGroup = "none.user"
queue
$ condor_submit submit.file
Submitting job(s)..
2 job(s) submitted to cluster 1.
--- wait until all jobs finish
$ condor_q | grep jobs
0 jobs; 0 idle, 0 running, 0 held
$ condor_submit test7.jsub
Submitting job(s)........................................
40 job(s) submitted to cluster 2.
--- The result was
$ condor_q -run -l | grep AccountingGroup | sort | uniq -c
2 AccountingGroup = "a.user"
14 AccountingGroup = "b.q.user"
4 AccountingGroup = "b.r.user"
The result should be "a.user": 2, "b.q.user": 12, "b.r.user": 6
In the negotiator log, it can be seen that in the first iteration "a.user" gets 2 slots because of the concurrency limit, "b.q.user" gets 6 slots, and "b.r.user" gets 4 slots.
So only 8 slots are unused, but the log says: "Failed to match 12.800001 slots on iteration 1."
It looks like what happened was that the jobs for the nongroup and the jobs for group b were in the process of ending at the start of the negotiation cycle. There were classAds for these jobs, but the values returned by
ad->LookupInteger(ATTR_RUNNING_JOBS, numrunning);
ad->LookupInteger(ATTR_IDLE_JOBS, numidle);
were zero. This meant numsubmits was zero. Since there was a classAd, the ad was still inserted into the group's list:
groupArray[0].submitterAds.Insert(ad);
Later, this caused a problem with
if ( groupArray[i].submitterAds.MyLength() == 0 ) {
So I think,
if ( groupArray[i].submitterAds.MyLength() == 0 ) {
should be changed to
if ( groupArray[i].numsubmits == 0 ) {
We use numsubmits in a number of places, so this should be better.

Negotiation is skipped in negotiatewithgroup based upon:
num_idle_jobs = 0;
schedd->LookupInteger(ATTR_IDLE_JOBS, num_idle_jobs);
if ( num_idle_jobs < 0 ) {
    num_idle_jobs = 0;
}
This is where we get the "... skipped because no idle jobs" message. So I think the later stages of negotiation rely on these same attributes. If we make the change, negotiation would be skipped higher in the call chain, and it would not add to the number of unused slots.
New bug 639244 was created to cover Comment 18.

Tested with (version):
condor-7.4.4-0.16
Reproduction:
All tests from the attachment were successful when run on a clean condor - no jobs had been running before them.
Tested on:
RHEL5 x86_64,i386 - passed
RHEL4 x86_64,i386 - passed
>>> VERIFIED