Bug 712987 - sort on most starved in negotiateWithGroup
Summary: sort on most starved in negotiateWithGroup
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: condor
Version: 2.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: 2.0.1
Target Release: ---
Assignee: Erik Erlandson
QA Contact: Tomas Rusnak
URL:
Whiteboard:
Depends On:
Blocks: 723887
 
Reported: 2011-06-13 19:51 UTC by Jon Thomas
Modified: 2011-09-07 16:43 UTC (History)

Fixed In Version: condor-7.6.3-0.2
Doc Type: Bug Fix
Doc Text:
Cause: The secondary submitter sort by name was previously used to avoid counting submitters twice when computing the normalization factor.
Consequence: When all submitters had the same priority (e.g. when configured with a very large PRIORITY_HALFLIFE), the secondary sort by name resulted in submitter starvation, because the same submitters negotiated first each time.
Fix: The computation of the normalization factor was updated so that it no longer depends on the secondary sort by name. The submitter sort was changed so that the primary sort is by submitter priority and the secondary sort is by starvation ratio, usage/(usage+submitter_limit), on the 1st pie spin.
Result: Submitters are no longer starved when a large PRIORITY_HALFLIFE forces priorities to equality.
Clone Of:
Environment:
Last Closed: 2011-09-07 16:43:13 UTC
Target Upstream Version:
Embargoed:




Links
System: Red Hat Product Errata
ID: RHSA-2011:1249
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Moderate: Red Hat Enterprise MRG Grid 2.0 security, bug fix and enhancement update
Last Updated: 2011-09-07 16:40:45 UTC

Description Jon Thomas 2011-06-13 19:51:56 UTC
// ----- Sort the schedd list in decreasing priority order
dprintf( D_ALWAYS, "Phase 3:  Sorting submitter ads by priority ...\n" );
scheddAds.Sort( (lessThanFunc)comparisonFunction, this );

Currently the code sorts on priority. Comments indicate that it does a random secondary sort on names, but it does not appear to do so. The sort needs to be changed to sort either:

1) on starvation alone, or

2) on priority, with a true secondary sort on starvation.

Comment 2 Erik Erlandson 2011-06-15 17:15:01 UTC
upstream: https://condor-wiki.cs.wisc.edu/index.cgi/tktview?tn=2240

Comment 4 Erik Erlandson 2011-06-24 16:37:55 UTC
repro/test

The following configuration sets a very large PRIORITY_HALFLIFE so that user priorities all remain equal at 0.5:

CLAIM_WORKLIFE = 0
NEGOTIATOR_CONSIDER_PREEMPTION = FALSE
NEGOTIATOR_DEBUG = D_FULLDEBUG

NEGOTIATOR_USE_SLOT_WEIGHTS = FALSE

GROUP_QUOTA_MAX_ALLOCATION_ROUNDS = 1

NEGOTIATOR_INTERVAL = 30
SCHEDD_INTERVAL = 15

PRIORITY_HALFLIFE = 1e100

NUM_CPUS = 20

GROUP_NAMES = a
GROUP_QUOTA_a = 20

GROUP_ACCEPT_SURPLUS = FALSE



The following three job submissions are used in sequence (submission 2 is used twice). They are designed to switch the starvation order, to demonstrate that the new secondary sort on starvation is working (and that no such re-ordering happens in the previous code).

Submission (1): will result in "a.u1" having a higher ratio than "a.u2" (affects order on subsequent negotiation round)

universe = vanilla
cmd = /bin/sleep
args = 600
should_transfer_files = if_needed
when_to_transfer_output = on_exit
+AccountingGroup="a.u2"
queue 1
+AccountingGroup="a.u1"
queue 2

Submission (2): leaves the starvation order unchanged, so the negotiation order seen for it demonstrates the re-ordering of submitters caused by the previous submission

universe = vanilla
cmd = /bin/sleep
args = 600
should_transfer_files = if_needed
when_to_transfer_output = on_exit
+AccountingGroup="a.u2"
queue 1
+AccountingGroup="a.u1"
queue 1

Submission (3): switches the order so that now "a.u2" has the higher ratio

universe = vanilla
cmd = /bin/sleep
args = 600
should_transfer_files = if_needed
when_to_transfer_output = on_exit
+AccountingGroup="a.u2"
queue 3
+AccountingGroup="a.u1"
queue 1



The order of submissions is (1), (2), (3), (2) (allow each submission to negotiate before submitting the next).

Repro: before fix, we see that "a.u1" always negotiates first, because all priorities are equal, and secondary sort is by submitter name:

$ tail -f NegotiatorLog | grep -e 'Negotiating with.* at' -e 'SubmitterPrio .*=' -e 'submitterLimit .*=' -e 'submitterUsage .*='
# submission (1)
06/24/11 08:32:13   Negotiating with a.u1@localdomain at <192.168.1.2:45791>
06/24/11 08:32:13     SubmitterPrio       = 0.500000
06/24/11 08:32:13     submitterLimit    = 1.500000
06/24/11 08:32:13     submitterUsage    = 0.000000
06/24/11 08:32:13   Negotiating with a.u2@localdomain at <192.168.1.2:45791>
06/24/11 08:32:13     SubmitterPrio       = 0.500000
06/24/11 08:32:13     submitterLimit    = 1.500000
06/24/11 08:32:13     submitterUsage    = 0.000000
06/24/11 08:32:14   Negotiating with a.u1@localdomain at <192.168.1.2:45791>
06/24/11 08:32:14     SubmitterPrio       = 0.500000
06/24/11 08:32:14     submitterLimit    = 1.000000
06/24/11 08:32:14     submitterUsage    = 1.000000

# submission (2)
06/24/11 08:32:34   Negotiating with a.u1@localdomain at <192.168.1.2:45791>
06/24/11 08:32:34     SubmitterPrio       = 0.500000
06/24/11 08:32:34     submitterLimit    = 0.500000
06/24/11 08:32:34     submitterUsage    = 2.000000
06/24/11 08:32:34   Negotiating with a.u2@localdomain at <192.168.1.2:45791>
06/24/11 08:32:34     SubmitterPrio       = 0.500000
06/24/11 08:32:34     submitterLimit    = 1.000000 (starved 0.500000)
06/24/11 08:32:34     submitterUsage    = 1.000000

# submission (3)
06/24/11 08:32:55   Negotiating with a.u1@localdomain at <192.168.1.2:45791>
06/24/11 08:32:55     SubmitterPrio       = 0.500000
06/24/11 08:32:55     submitterLimit    = 1.500000
06/24/11 08:32:55     submitterUsage    = 3.000000
06/24/11 08:32:55   Negotiating with a.u2@localdomain at <192.168.1.2:45791>
06/24/11 08:32:55     SubmitterPrio       = 0.500000
06/24/11 08:32:55     submitterLimit    = 2.500000
06/24/11 08:32:55     submitterUsage    = 2.000000
06/24/11 08:32:55   Negotiating with a.u2@localdomain at <192.168.1.2:45791>
06/24/11 08:32:55     SubmitterPrio       = 0.500000
06/24/11 08:32:55     submitterLimit    = 1.000000
06/24/11 08:32:55     submitterUsage    = 4.000000

# submission (2) again
06/24/11 08:33:16   Negotiating with a.u1@localdomain at <192.168.1.2:45791>
06/24/11 08:33:16     SubmitterPrio       = 0.500000
06/24/11 08:33:16     submitterLimit    = 1.500000
06/24/11 08:33:16     submitterUsage    = 4.000000
06/24/11 08:33:16   Negotiating with a.u2@localdomain at <192.168.1.2:45791>
06/24/11 08:33:16     SubmitterPrio       = 0.500000
06/24/11 08:33:16     submitterLimit    = 0.500000
06/24/11 08:33:16     submitterUsage    = 5.000000



After the fix: we see that "a.u2" gets to negotiate first when its (incoming) ratio is the lowest:

$ tail -f NegotiatorLog | grep -e 'Negotiating with.* at' -e 'SubmitterPrio .*=' -e 'submitterLimit .*=' -e 'submitterUsage .*='
# submission (1) (set up a.u2 to go first next round)
06/24/11 08:03:06   Negotiating with a.u1@localdomain at <192.168.1.2:39904>
06/24/11 08:03:06     SubmitterPrio       = 0.500000
06/24/11 08:03:06     submitterLimit    = 1.500000
06/24/11 08:03:06     submitterUsage    = 0.000000
06/24/11 08:03:06   Negotiating with a.u2@localdomain at <192.168.1.2:39904>
06/24/11 08:03:06     SubmitterPrio       = 0.500000
06/24/11 08:03:06     submitterLimit    = 1.500000
06/24/11 08:03:06     submitterUsage    = 0.000000
06/24/11 08:03:06   Negotiating with a.u1@localdomain at <192.168.1.2:39904>
06/24/11 08:03:06     SubmitterPrio       = 0.500000
06/24/11 08:03:06     submitterLimit    = 1.000000
06/24/11 08:03:06     submitterUsage    = 1.000000

# submission (2)  (a.u2 goes first)
06/24/11 08:03:26   Negotiating with a.u2@localdomain at <192.168.1.2:39904>
06/24/11 08:03:26     SubmitterPrio       = 0.500000
06/24/11 08:03:26     submitterLimit    = 1.500000
06/24/11 08:03:26     submitterUsage    = 1.000000
06/24/11 08:03:26   Negotiating with a.u1@localdomain at <192.168.1.2:39904>
06/24/11 08:03:26     SubmitterPrio       = 0.500000
06/24/11 08:03:26     submitterLimit    = 0.500000
06/24/11 08:03:26     submitterUsage    = 2.000000

# submission (3) (set up a.u1 to go first next round)
06/24/11 08:03:48   Negotiating with a.u2@localdomain at <192.168.1.2:39904>
06/24/11 08:03:48     SubmitterPrio       = 0.500000
06/24/11 08:03:48     submitterLimit    = 2.500000
06/24/11 08:03:48     submitterUsage    = 2.000000
06/24/11 08:03:48   Negotiating with a.u1@localdomain at <192.168.1.2:39904>
06/24/11 08:03:48     SubmitterPrio       = 0.500000
06/24/11 08:03:48     submitterLimit    = 1.500000
06/24/11 08:03:48     submitterUsage    = 3.000000
06/24/11 08:03:48   Negotiating with a.u2@localdomain at <192.168.1.2:39904>
06/24/11 08:03:48     SubmitterPrio       = 0.500000
06/24/11 08:03:48     submitterLimit    = 1.000000
06/24/11 08:03:48     submitterUsage    = 4.000000

# submission (2) (a.u1 goes first)
06/24/11 08:04:09   Negotiating with a.u1@localdomain at <192.168.1.2:39904>
06/24/11 08:04:09     SubmitterPrio       = 0.500000
06/24/11 08:04:09     submitterLimit    = 1.500000
06/24/11 08:04:09     submitterUsage    = 4.000000
06/24/11 08:04:09   Negotiating with a.u2@localdomain at <192.168.1.2:39904>
06/24/11 08:04:09     SubmitterPrio       = 0.500000
06/24/11 08:04:09     submitterLimit    = 0.500000
06/24/11 08:04:09     submitterUsage    = 5.000000
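
The negotiation order in the log above can be checked by hand against the starvation ratio usage/(usage+submitter_limit). The following is a minimal sketch of that arithmetic, not negotiator code: the struct and function names are hypothetical, and it assumes that the submitterUsage and submitterLimit values printed for each submitter's first turn in a cycle are the ones that determine the order. The tie-break toward "a.u1" is chosen only because it matches the first cycle in the log.

```cpp
#include <string>
#include <vector>

// One negotiation cycle from the "after the fix" log: the submitterUsage and
// submitterLimit printed for each submitter's first turn in that cycle.
// (Hypothetical type, for illustration only.)
struct Cycle {
    std::string label;
    double u1Usage, u1Limit;  // values logged for a.u1
    double u2Usage, u2Limit;  // values logged for a.u2
};

// Starvation ratio: usage / (usage + submitter_limit).
// Returns 0 when both terms are 0 (an assumed guard against division by zero).
double ratio(double usage, double limit) {
    const double denom = usage + limit;
    return denom > 0.0 ? usage / denom : 0.0;
}

// Name of the submitter expected to negotiate first: the lower ratio wins.
// Ties go to a.u1 here, which is an assumption that matches the first cycle.
std::string expectedFirst(const Cycle& c) {
    return ratio(c.u2Usage, c.u2Limit) < ratio(c.u1Usage, c.u1Limit) ? "a.u2"
                                                                     : "a.u1";
}

// Values copied from the log excerpt above.
std::vector<Cycle> loggedCycles() {
    return {
        {"submission (1)",       0.0, 1.5, 0.0, 1.5},
        {"submission (2)",       2.0, 0.5, 1.0, 1.5},
        {"submission (3)",       3.0, 1.5, 2.0, 2.5},
        {"submission (2) again", 4.0, 1.5, 5.0, 0.5},
    };
}
```

Under these assumptions the ratios for (a.u1, a.u2) are (0, 0), (0.80, 0.40), (0.67, 0.44) and (0.73, 0.91), so expectedFirst() gives a.u1, a.u2, a.u2, a.u1 for the four cycles, which is the order of the first "Negotiating with" line in each cycle of the log.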

Comment 5 Erik Erlandson 2011-06-24 16:37:55 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Cause:
The secondary submitter sort by name was previously used to avoid counting submitters twice when computing the normalization factor.

Consequence:
When all submitters had the same priority (e.g. when configured with a very large PRIORITY_HALFLIFE), the secondary sort by name resulted in submitter starvation, because the same submitters negotiated first each time.

Fix:
The computation of the normalization factor was updated so that it no longer depends on the secondary sort by name. The submitter sort was changed so that the primary sort is by submitter priority and the secondary sort is by starvation ratio, usage/(usage+submitter_limit), on the 1st pie spin.

Result:
Submitters are no longer starved when a large PRIORITY_HALFLIFE forces priorities to equality.
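
The two-level sort described in the note can be sketched as a comparator. This is an illustration under stated assumptions, not the code from the fix: the Submitter struct, its field names, and the function names are hypothetical, a numerically lower priority value is assumed to negotiate earlier, and the guard for a zero denominator is an addition of this sketch.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical record for one submitter; the field names are illustrative
// and are not taken from the Condor source.
struct Submitter {
    std::string name;
    double priority;  // assumed: lower value negotiates earlier
    double usage;     // resources currently in use
    double limit;     // submitter limit for this negotiation cycle
};

// Starvation ratio from the note: usage / (usage + submitter_limit).
// Returns 0 when both terms are 0 (an assumed guard against division by zero).
double starvationRatio(const Submitter& s) {
    const double denom = s.usage + s.limit;
    return denom > 0.0 ? s.usage / denom : 0.0;
}

// Primary key: submitter priority.
// Secondary key: starvation ratio, lowest (most starved) first.
bool negotiatesBefore(const Submitter& a, const Submitter& b) {
    if (a.priority != b.priority) {
        return a.priority < b.priority;
    }
    return starvationRatio(a) < starvationRatio(b);
}

// Sorts submitters into negotiation order. stable_sort keeps the incoming
// order for submitters that tie on both keys.
void sortSubmitters(std::vector<Submitter>& submitters) {
    std::stable_sort(submitters.begin(), submitters.end(), negotiatesBefore);
}
```

With two submitters at equal priority 0.5, one with usage 2 and limit 0.5 and the other with usage 1 and limit 1.5, the second has the lower ratio and sorts first. When priorities differ, the ratio is never consulted.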

Comment 6 Erik Erlandson 2011-07-11 23:25:30 UTC
tracking branch: UPSTREAM-7.7.1-BZ712987-submitter-sort

Comment 8 Tomas Rusnak 2011-07-27 09:46:28 UTC
Reproduced on:

$CondorVersion: 7.6.1 Jun 02 2011 BuildID: RH-7.6.1-0.10.el5 $
$CondorPlatform: X86_64-RedHat_5.6 $

# tail -f /var/log/condor/NegotiatorLog | grep -e 'Negotiating with.* at' -e 'SubmitterPrio.*=' -e 'submitterLimit .*=' -e 'submitterUsage .*='
07/27/11 12:24:39   Negotiating with a.u1@localhost at <IP:39578>
07/27/11 12:24:39     SubmitterPrio       = 0.500000
07/27/11 12:24:39     SubmitterPrioFactor = 1.000000
07/27/11 12:24:39     submitterLimit    = 1.500000
07/27/11 12:24:39     submitterUsage    = 0.000000
07/27/11 12:24:39   Negotiating with a.u2@localhost at <IP:39578>
07/27/11 12:24:39     SubmitterPrio       = 0.500000
07/27/11 12:24:39     SubmitterPrioFactor = 1.000000
07/27/11 12:24:39     submitterLimit    = 1.500000
07/27/11 12:24:39     submitterUsage    = 0.000000
07/27/11 12:24:39   Negotiating with a.u1@localhost at <IP:39578>
07/27/11 12:24:39     SubmitterPrio       = 0.500000
07/27/11 12:24:39     SubmitterPrioFactor = 1.000000
07/27/11 12:24:39     submitterLimit    = 1.000000
07/27/11 12:24:39     submitterUsage    = 1.000000
07/27/11 12:25:08   Negotiating with a.u1@localhost at <IP:39578>
07/27/11 12:25:08     SubmitterPrio       = 0.500000
07/27/11 12:25:08     SubmitterPrioFactor = 1.000000
07/27/11 12:25:08     submitterLimit    = 0.500000
07/27/11 12:25:08     submitterUsage    = 2.000000
07/27/11 12:25:08   Negotiating with a.u2@localhost at <IP:39578>
07/27/11 12:25:08     SubmitterPrio       = 0.500000
07/27/11 12:25:08     SubmitterPrioFactor = 1.000000
07/27/11 12:25:08     submitterLimit    = 1.000000 (starved 0.500000)
07/27/11 12:25:08     submitterUsage    = 1.000000
07/27/11 12:25:56   Negotiating with a.u1@localhost at <IP:39578>
07/27/11 12:25:56     SubmitterPrio       = 0.500000
07/27/11 12:25:56     SubmitterPrioFactor = 1.000000
07/27/11 12:25:56     submitterLimit    = 1.500000
07/27/11 12:25:56     submitterUsage    = 3.000000
07/27/11 12:25:56   Negotiating with a.u2@localhost at <IP:39578>
07/27/11 12:25:56     SubmitterPrio       = 0.500000
07/27/11 12:25:56     SubmitterPrioFactor = 1.000000
07/27/11 12:25:56     submitterLimit    = 2.500000
07/27/11 12:25:56     submitterUsage    = 2.000000
07/27/11 12:25:56   Negotiating with a.u2@localhost at <IP:39578>
07/27/11 12:25:56     SubmitterPrio       = 0.500000
07/27/11 12:25:56     SubmitterPrioFactor = 1.000000
07/27/11 12:25:56     submitterLimit    = 1.000000
07/27/11 12:25:56     submitterUsage    = 4.000000
07/27/11 12:26:48   Negotiating with a.u1@localhost at <IP:39578>
07/27/11 12:26:48     SubmitterPrio       = 0.500000
07/27/11 12:26:48     SubmitterPrioFactor = 1.000000
07/27/11 12:26:48     submitterLimit    = 1.500000
07/27/11 12:26:48     submitterUsage    = 4.000000
07/27/11 12:26:48   Negotiating with a.u2@localhost at <IP:39578>
07/27/11 12:26:48     SubmitterPrio       = 0.500000
07/27/11 12:26:48     SubmitterPrioFactor = 1.000000
07/27/11 12:26:48     submitterLimit    = 0.500000
07/27/11 12:26:48     submitterUsage    = 5.000000

Comment 9 Tomas Rusnak 2011-07-27 12:14:49 UTC
Retested over all supported platforms x86,x86_64/RHEL5,RHEL6 with:

condor-7.6.3-0.2

# tail -f /var/log/condor/NegotiatorLog | grep -e 'Negotiating with.* at' -e 'SubmitterPrio.*=' -e 'submitterLimit .*=' -e 'submitterUsage .*='
07/27/11 14:33:24   Negotiating with a.u1@localhost at <IP:49109>
07/27/11 14:33:24     SubmitterPrio       = 0.500000
07/27/11 14:33:24     SubmitterPrioFactor = 1.000000
07/27/11 14:33:24     submitterLimit    = 1.500000
07/27/11 14:33:24     submitterUsage    = 0.000000
07/27/11 14:33:24   Negotiating with a.u2@localhost at <IP:49109>
07/27/11 14:33:24     SubmitterPrio       = 0.500000
07/27/11 14:33:24     SubmitterPrioFactor = 1.000000
07/27/11 14:33:24     submitterLimit    = 1.500000
07/27/11 14:33:24     submitterUsage    = 0.000000
07/27/11 14:33:25   Negotiating with a.u1@localhost at <IP:49109>
07/27/11 14:33:25     SubmitterPrio       = 0.500000
07/27/11 14:33:25     SubmitterPrioFactor = 1.000000
07/27/11 14:33:25     submitterLimit    = 1.000000
07/27/11 14:33:25     submitterUsage    = 1.000000

07/27/11 14:36:19   Negotiating with a.u2@localhost at <IP:49109>
07/27/11 14:36:19     SubmitterPrio       = 0.500000
07/27/11 14:36:19     SubmitterPrioFactor = 1.000000
07/27/11 14:36:19     submitterLimit    = 1.500000
07/27/11 14:36:19     submitterUsage    = 1.000000
07/27/11 14:36:19   Negotiating with a.u1@localhost at <IP:49109>
07/27/11 14:36:19     SubmitterPrio       = 0.500000
07/27/11 14:36:19     SubmitterPrioFactor = 1.000000
07/27/11 14:36:19     submitterLimit    = 0.500000
07/27/11 14:36:19     submitterUsage    = 2.000000

07/27/11 14:36:39   Negotiating with a.u2@localhost at <IP:49109>
07/27/11 14:36:39     SubmitterPrio       = 0.500000
07/27/11 14:36:39     SubmitterPrioFactor = 1.000000
07/27/11 14:36:39     submitterLimit    = 2.500000
07/27/11 14:36:39     submitterUsage    = 2.000000
07/27/11 14:36:39   Negotiating with a.u1@localhost at <IP:49109>
07/27/11 14:36:39     SubmitterPrio       = 0.500000
07/27/11 14:36:39     SubmitterPrioFactor = 1.000000
07/27/11 14:36:39     submitterLimit    = 1.500000
07/27/11 14:36:39     submitterUsage    = 3.000000
07/27/11 14:36:39   Negotiating with a.u2@localhost at <IP:49109>
07/27/11 14:36:39     SubmitterPrio       = 0.500000
07/27/11 14:36:39     SubmitterPrioFactor = 1.000000
07/27/11 14:36:39     submitterLimit    = 1.000000
07/27/11 14:36:39     submitterUsage    = 4.000000

07/27/11 14:37:01   Negotiating with a.u1@localhost at <IP:49109>
07/27/11 14:37:01     SubmitterPrio       = 0.500000
07/27/11 14:37:01     SubmitterPrioFactor = 1.000000
07/27/11 14:37:01     submitterLimit    = 1.500000
07/27/11 14:37:01     submitterUsage    = 4.000000
07/27/11 14:37:01   Negotiating with a.u2@localhost at <IP:49109>
07/27/11 14:37:01     SubmitterPrio       = 0.500000
07/27/11 14:37:01     SubmitterPrioFactor = 1.000000
07/27/11 14:37:01     submitterLimit    = 0.500000
07/27/11 14:37:01     submitterUsage    = 5.000000

The second submitter (a.u2) negotiated first when it had the lower ratio, as expected.

>>> VERIFIED

Comment 10 errata-xmlrpc 2011-09-07 16:43:13 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-1249.html

