    // ----- Sort the schedd list in decreasing priority order
    dprintf( D_ALWAYS, "Phase 3: Sorting submitter ads by priority ...\n" );
    scheddAds.Sort( (lessThanFunc)comparisonFunction, this );

Currently the code sorts on priority. The comments indicate that it does a random secondary sort on names, but it does not appear to. The sort needs to be changed to one of:
1) a sort based on starvation
2) a sort based on priority, with a true secondary sort on starvation
upstream: https://condor-wiki.cs.wisc.edu/index.cgi/tktview?tn=2240
repro/test

Using the following configuration, which sets a very large PRIORITY_HALFLIFE so that user priorities all remain equal at 0.5:

    CLAIM_WORKLIFE = 0
    NEGOTIATOR_CONSIDER_PREEMPTION = FALSE
    NEGOTIATOR_DEBUG = D_FULLDEBUG
    NEGOTIATOR_USE_SLOT_WEIGHTS = FALSE
    GROUP_QUOTA_MAX_ALLOCATION_ROUNDS = 1
    NEGOTIATOR_INTERVAL = 30
    SCHEDD_INTERVAL = 15
    PRIORITY_HALFLIFE = 1e100
    NUM_CPUS = 20
    GROUP_NAMES = a
    GROUP_QUOTA_a = 20
    GROUP_ACCEPT_SURPLUS = FALSE

The following three job submissions are used in a sequence (submission 2 will be used twice). They are designed to switch the starvation order, to demonstrate that the new secondary sort on starvation is working (and demonstrate it doesn't happen in previous code).

Submission (1): will result in "a.u1" having a higher ratio than "a.u2" (affects order on subsequent negotiation round)

    universe = vanilla
    cmd = /bin/sleep
    args = 600
    should_transfer_files = if_needed
    when_to_transfer_output = on_exit
    +AccountingGroup="a.u2"
    queue 1
    +AccountingGroup="a.u1"
    queue 2

Submission (2): leaves starvation order unchanged from previous negotiation round to demonstrate re-ordering of submitters from previous submission

    universe = vanilla
    cmd = /bin/sleep
    args = 600
    should_transfer_files = if_needed
    when_to_transfer_output = on_exit
    +AccountingGroup="a.u2"
    queue 1
    +AccountingGroup="a.u1"
    queue 1

Submission (3): switches the order so that now "a.u2" has the higher ratio

    universe = vanilla
    cmd = /bin/sleep
    args = 600
    should_transfer_files = if_needed
    when_to_transfer_output = on_exit
    +AccountingGroup="a.u2"
    queue 3
    +AccountingGroup="a.u1"
    queue 1

The order of submissions is (1), (2), (3), (2) (allow each submission to negotiate before submitting the next).
Repro: before fix, we see that "a.u1" always negotiates first, because all priorities are equal, and secondary sort is by submitter name:

$ tail -f NegotiatorLog | grep -e 'Negotiating with.* at' -e 'SubmitterPrio .*=' -e 'submitterLimit .*=' -e 'submitterUsage .*='
# submission (1)
06/24/11 08:32:13 Negotiating with a.u1@localdomain at <192.168.1.2:45791>
06/24/11 08:32:13 SubmitterPrio = 0.500000
06/24/11 08:32:13 submitterLimit = 1.500000
06/24/11 08:32:13 submitterUsage = 0.000000
06/24/11 08:32:13 Negotiating with a.u2@localdomain at <192.168.1.2:45791>
06/24/11 08:32:13 SubmitterPrio = 0.500000
06/24/11 08:32:13 submitterLimit = 1.500000
06/24/11 08:32:13 submitterUsage = 0.000000
06/24/11 08:32:14 Negotiating with a.u1@localdomain at <192.168.1.2:45791>
06/24/11 08:32:14 SubmitterPrio = 0.500000
06/24/11 08:32:14 submitterLimit = 1.000000
06/24/11 08:32:14 submitterUsage = 1.000000
# submission (2)
06/24/11 08:32:34 Negotiating with a.u1@localdomain at <192.168.1.2:45791>
06/24/11 08:32:34 SubmitterPrio = 0.500000
06/24/11 08:32:34 submitterLimit = 0.500000
06/24/11 08:32:34 submitterUsage = 2.000000
06/24/11 08:32:34 Negotiating with a.u2@localdomain at <192.168.1.2:45791>
06/24/11 08:32:34 SubmitterPrio = 0.500000
06/24/11 08:32:34 submitterLimit = 1.000000 (starved 0.500000)
06/24/11 08:32:34 submitterUsage = 1.000000
# submission (3)
06/24/11 08:32:55 Negotiating with a.u1@localdomain at <192.168.1.2:45791>
06/24/11 08:32:55 SubmitterPrio = 0.500000
06/24/11 08:32:55 submitterLimit = 1.500000
06/24/11 08:32:55 submitterUsage = 3.000000
06/24/11 08:32:55 Negotiating with a.u2@localdomain at <192.168.1.2:45791>
06/24/11 08:32:55 SubmitterPrio = 0.500000
06/24/11 08:32:55 submitterLimit = 2.500000
06/24/11 08:32:55 submitterUsage = 2.000000
06/24/11 08:32:55 Negotiating with a.u2@localdomain at <192.168.1.2:45791>
06/24/11 08:32:55 SubmitterPrio = 0.500000
06/24/11 08:32:55 submitterLimit = 1.000000
06/24/11 08:32:55 submitterUsage = 4.000000
# submission (2) again
06/24/11 08:33:16 Negotiating with a.u1@localdomain at <192.168.1.2:45791>
06/24/11 08:33:16 SubmitterPrio = 0.500000
06/24/11 08:33:16 submitterLimit = 1.500000
06/24/11 08:33:16 submitterUsage = 4.000000
06/24/11 08:33:16 Negotiating with a.u2@localdomain at <192.168.1.2:45791>
06/24/11 08:33:16 SubmitterPrio = 0.500000
06/24/11 08:33:16 submitterLimit = 0.500000
06/24/11 08:33:16 submitterUsage = 5.000000

After the fix: we see that "a.u2" gets to negotiate first when its (incoming) ratio is the lowest:

$ tail -f NegotiatorLog | grep -e 'Negotiating with.* at' -e 'SubmitterPrio .*=' -e 'submitterLimit .*=' -e 'submitterUsage .*='
# submission (1) (set up a.u2 to go first next round)
06/24/11 08:03:06 Negotiating with a.u1@localdomain at <192.168.1.2:39904>
06/24/11 08:03:06 SubmitterPrio = 0.500000
06/24/11 08:03:06 submitterLimit = 1.500000
06/24/11 08:03:06 submitterUsage = 0.000000
06/24/11 08:03:06 Negotiating with a.u2@localdomain at <192.168.1.2:39904>
06/24/11 08:03:06 SubmitterPrio = 0.500000
06/24/11 08:03:06 submitterLimit = 1.500000
06/24/11 08:03:06 submitterUsage = 0.000000
06/24/11 08:03:06 Negotiating with a.u1@localdomain at <192.168.1.2:39904>
06/24/11 08:03:06 SubmitterPrio = 0.500000
06/24/11 08:03:06 submitterLimit = 1.000000
06/24/11 08:03:06 submitterUsage = 1.000000
# submission (2) (a.u2 goes first)
06/24/11 08:03:26 Negotiating with a.u2@localdomain at <192.168.1.2:39904>
06/24/11 08:03:26 SubmitterPrio = 0.500000
06/24/11 08:03:26 submitterLimit = 1.500000
06/24/11 08:03:26 submitterUsage = 1.000000
06/24/11 08:03:26 Negotiating with a.u1@localdomain at <192.168.1.2:39904>
06/24/11 08:03:26 SubmitterPrio = 0.500000
06/24/11 08:03:26 submitterLimit = 0.500000
06/24/11 08:03:26 submitterUsage = 2.000000
# submission (3) (set up a.u1 to go first next round)
06/24/11 08:03:48 Negotiating with a.u2@localdomain at <192.168.1.2:39904>
06/24/11 08:03:48 SubmitterPrio = 0.500000
06/24/11 08:03:48 submitterLimit = 2.500000
06/24/11 08:03:48 submitterUsage = 2.000000
06/24/11 08:03:48 Negotiating with a.u1@localdomain at <192.168.1.2:39904>
06/24/11 08:03:48 SubmitterPrio = 0.500000
06/24/11 08:03:48 submitterLimit = 1.500000
06/24/11 08:03:48 submitterUsage = 3.000000
06/24/11 08:03:48 Negotiating with a.u2@localdomain at <192.168.1.2:39904>
06/24/11 08:03:48 SubmitterPrio = 0.500000
06/24/11 08:03:48 submitterLimit = 1.000000
06/24/11 08:03:48 submitterUsage = 4.000000
# submission (2) (a.u1 goes first)
06/24/11 08:04:09 Negotiating with a.u1@localdomain at <192.168.1.2:39904>
06/24/11 08:04:09 SubmitterPrio = 0.500000
06/24/11 08:04:09 submitterLimit = 1.500000
06/24/11 08:04:09 submitterUsage = 4.000000
06/24/11 08:04:09 Negotiating with a.u2@localdomain at <192.168.1.2:39904>
06/24/11 08:04:09 SubmitterPrio = 0.500000
06/24/11 08:04:09 submitterLimit = 0.500000
06/24/11 08:04:09 submitterUsage = 5.000000
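To make the ordering in the after-fix log easier to check by hand, here is a small sketch of the ratio arithmetic. It assumes the ratio is usage/(usage+submitterLimit), as stated in this report's technical note. The function name starvationRatio is hypothetical and is not taken from the HTCondor source.

```cpp
#include <cassert>

// Hypothetical helper (not HTCondor code): starvation ratio for one submitter,
// assumed to be usage / (usage + limit). A lower ratio means the submitter has
// received less of what it is entitled to, so it should negotiate earlier.
double starvationRatio(double usage, double limit) {
    assert(usage >= 0.0 && limit >= 0.0);
    const double denom = usage + limit;
    // Guard against 0/0 when a submitter has neither usage nor a limit.
    return denom > 0.0 ? usage / denom : 0.0;
}
```

Plugging in the values logged for the first after-fix submission (2) round gives 1/(1+1.5) = 0.4 for a.u2 and 2/(2+0.5) = 0.8 for a.u1, which matches a.u2 negotiating first. For the final round, a.u1 has 4/(4+1.5) and a.u2 has 5/(5+0.5), so a.u1 has the lower ratio and goes first.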
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
Cause: The secondary submitter sort was by name, which was previously used to avoid counting submitters twice when computing the normalization factor.
Consequence: When all submitters had the same priority (e.g. when configured with a very large PRIORITY_HALFLIFE), the secondary sort by name resulted in submitter starvation, because the same submitters negotiated first each time.
Fix: The computation of the normalization factor was updated so that it no longer depends on the secondary sort by name. The submitter sort was changed so that the primary sort is by submitter priority and the secondary sort is by the starvation ratio usage/(usage+submitter_limit), on the 1st pie spin.
Result: Submitters are no longer starved when a large PRIORITY_HALFLIFE forces priorities to equality.
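The two-level sort described in the note can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual negotiator patch: the Submitter struct, negotiatesBefore, and sortSubmitters are hypothetical names, the real code operates on ClassAds, a numerically lower priority value is assumed to be the better one, and the final tie-break on name is an assumption added only to keep the order deterministic.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a submitter ad (the real negotiator uses ClassAds).
struct Submitter {
    std::string name;
    double prio;   // SubmitterPrio; assumed: lower value = better priority
    double usage;  // submitterUsage
    double limit;  // submitterLimit
};

// Starvation ratio from the technical note: usage / (usage + submitter_limit).
double starvationRatio(const Submitter& s) {
    const double denom = s.usage + s.limit;
    return denom > 0.0 ? s.usage / denom : 0.0;
}

// Strict weak ordering: primary key is priority, secondary key is starvation
// ratio (lowest first), and name is used only as a last-resort tie-break.
bool negotiatesBefore(const Submitter& a, const Submitter& b) {
    if (a.prio != b.prio) return a.prio < b.prio;
    const double ra = starvationRatio(a);
    const double rb = starvationRatio(b);
    if (ra != rb) return ra < rb;
    return a.name < b.name;  // assumption: keeps equal submitters in a fixed order
}

void sortSubmitters(std::vector<Submitter>& submitters) {
    std::stable_sort(submitters.begin(), submitters.end(), negotiatesBefore);
}
```

With equal priorities of 0.5, a submitter with usage 1 and limit 1.5 sorts ahead of one with usage 2 and limit 0.5, regardless of their names. When priorities differ, the ratio is never consulted.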
tracking branch: UPSTREAM-7.7.1-BZ712987-submitter-sort
Reproduced on:
$CondorVersion: 7.6.1 Jun 02 2011 BuildID: RH-7.6.1-0.10.el5 $
$CondorPlatform: X86_64-RedHat_5.6 $

# tail -f /var/log/condor/NegotiatorLog | grep -e 'Negotiating with.* at' -e 'SubmitterPrio.*=' -e 'submitterLimit .*=' -e 'submitterUsage .*='
07/27/11 12:24:39 Negotiating with a.u1@localhost at <IP:39578>
07/27/11 12:24:39 SubmitterPrio = 0.500000
07/27/11 12:24:39 SubmitterPrioFactor = 1.000000
07/27/11 12:24:39 submitterLimit = 1.500000
07/27/11 12:24:39 submitterUsage = 0.000000
07/27/11 12:24:39 Negotiating with a.u2@localhost at <IP:39578>
07/27/11 12:24:39 SubmitterPrio = 0.500000
07/27/11 12:24:39 SubmitterPrioFactor = 1.000000
07/27/11 12:24:39 submitterLimit = 1.500000
07/27/11 12:24:39 submitterUsage = 0.000000
07/27/11 12:24:39 Negotiating with a.u1@localhost at <IP:39578>
07/27/11 12:24:39 SubmitterPrio = 0.500000
07/27/11 12:24:39 SubmitterPrioFactor = 1.000000
07/27/11 12:24:39 submitterLimit = 1.000000
07/27/11 12:24:39 submitterUsage = 1.000000

07/27/11 12:25:08 Negotiating with a.u1@localhost at <IP:39578>
07/27/11 12:25:08 SubmitterPrio = 0.500000
07/27/11 12:25:08 SubmitterPrioFactor = 1.000000
07/27/11 12:25:08 submitterLimit = 0.500000
07/27/11 12:25:08 submitterUsage = 2.000000
07/27/11 12:25:08 Negotiating with a.u2@localhost at <IP:39578>
07/27/11 12:25:08 SubmitterPrio = 0.500000
07/27/11 12:25:08 SubmitterPrioFactor = 1.000000
07/27/11 12:25:08 submitterLimit = 1.000000 (starved 0.500000)
07/27/11 12:25:08 submitterUsage = 1.000000

07/27/11 12:25:56 Negotiating with a.u1@localhost at <IP:39578>
07/27/11 12:25:56 SubmitterPrio = 0.500000
07/27/11 12:25:56 SubmitterPrioFactor = 1.000000
07/27/11 12:25:56 submitterLimit = 1.500000
07/27/11 12:25:56 submitterUsage = 3.000000
07/27/11 12:25:56 Negotiating with a.u2@localhost at <IP:39578>
07/27/11 12:25:56 SubmitterPrio = 0.500000
07/27/11 12:25:56 SubmitterPrioFactor = 1.000000
07/27/11 12:25:56 submitterLimit = 2.500000
07/27/11 12:25:56 submitterUsage = 2.000000
07/27/11 12:25:56 Negotiating with a.u2@localhost at <IP:39578>
07/27/11 12:25:56 SubmitterPrio = 0.500000
07/27/11 12:25:56 SubmitterPrioFactor = 1.000000
07/27/11 12:25:56 submitterLimit = 1.000000
07/27/11 12:25:56 submitterUsage = 4.000000

07/27/11 12:26:48 Negotiating with a.u1@localhost at <IP:39578>
07/27/11 12:26:48 SubmitterPrio = 0.500000
07/27/11 12:26:48 SubmitterPrioFactor = 1.000000
07/27/11 12:26:48 submitterLimit = 1.500000
07/27/11 12:26:48 submitterUsage = 4.000000
07/27/11 12:26:48 Negotiating with a.u2@localhost at <IP:39578>
07/27/11 12:26:48 SubmitterPrio = 0.500000
07/27/11 12:26:48 SubmitterPrioFactor = 1.000000
07/27/11 12:26:48 submitterLimit = 0.500000
07/27/11 12:26:48 submitterUsage = 5.000000
Retested over all supported platforms x86,x86_64/RHEL5,RHEL6 with:
condor-7.6.3-0.2

# tail -f /var/log/condor/NegotiatorLog | grep -e 'Negotiating with.* at' -e 'SubmitterPrio.*=' -e 'submitterLimit .*=' -e 'submitterUsage .*='
07/27/11 14:33:24 Negotiating with a.u1@localhost at <IP:49109>
07/27/11 14:33:24 SubmitterPrio = 0.500000
07/27/11 14:33:24 SubmitterPrioFactor = 1.000000
07/27/11 14:33:24 submitterLimit = 1.500000
07/27/11 14:33:24 submitterUsage = 0.000000
07/27/11 14:33:24 Negotiating with a.u2@localhost at <IP:49109>
07/27/11 14:33:24 SubmitterPrio = 0.500000
07/27/11 14:33:24 SubmitterPrioFactor = 1.000000
07/27/11 14:33:24 submitterLimit = 1.500000
07/27/11 14:33:24 submitterUsage = 0.000000
07/27/11 14:33:25 Negotiating with a.u1@localhost at <IP:49109>
07/27/11 14:33:25 SubmitterPrio = 0.500000
07/27/11 14:33:25 SubmitterPrioFactor = 1.000000
07/27/11 14:33:25 submitterLimit = 1.000000
07/27/11 14:33:25 submitterUsage = 1.000000

07/27/11 14:36:19 Negotiating with a.u2@localhost at <IP:49109>
07/27/11 14:36:19 SubmitterPrio = 0.500000
07/27/11 14:36:19 SubmitterPrioFactor = 1.000000
07/27/11 14:36:19 submitterLimit = 1.500000
07/27/11 14:36:19 submitterUsage = 1.000000
07/27/11 14:36:19 Negotiating with a.u1@localhost at <IP:49109>
07/27/11 14:36:19 SubmitterPrio = 0.500000
07/27/11 14:36:19 SubmitterPrioFactor = 1.000000
07/27/11 14:36:19 submitterLimit = 0.500000
07/27/11 14:36:19 submitterUsage = 2.000000

07/27/11 14:36:39 Negotiating with a.u2@localhost at <IP:49109>
07/27/11 14:36:39 SubmitterPrio = 0.500000
07/27/11 14:36:39 SubmitterPrioFactor = 1.000000
07/27/11 14:36:39 submitterLimit = 2.500000
07/27/11 14:36:39 submitterUsage = 2.000000
07/27/11 14:36:39 Negotiating with a.u1@localhost at <IP:49109>
07/27/11 14:36:39 SubmitterPrio = 0.500000
07/27/11 14:36:39 SubmitterPrioFactor = 1.000000
07/27/11 14:36:39 submitterLimit = 1.500000
07/27/11 14:36:39 submitterUsage = 3.000000
07/27/11 14:36:39 Negotiating with a.u2@localhost at <IP:49109>
07/27/11 14:36:39 SubmitterPrio = 0.500000
07/27/11 14:36:39 SubmitterPrioFactor = 1.000000
07/27/11 14:36:39 submitterLimit = 1.000000
07/27/11 14:36:39 submitterUsage = 4.000000

07/27/11 14:37:01 Negotiating with a.u1@localhost at <IP:49109>
07/27/11 14:37:01 SubmitterPrio = 0.500000
07/27/11 14:37:01 SubmitterPrioFactor = 1.000000
07/27/11 14:37:01 submitterLimit = 1.500000
07/27/11 14:37:01 submitterUsage = 4.000000
07/27/11 14:37:01 Negotiating with a.u2@localhost at <IP:49109>
07/27/11 14:37:01 SubmitterPrio = 0.500000
07/27/11 14:37:01 SubmitterPrioFactor = 1.000000
07/27/11 14:37:01 submitterLimit = 0.500000
07/27/11 14:37:01 submitterUsage = 5.000000

The 2nd group was negotiated first with lower ratio as expected.

>>> VERIFIED
An advisory has been issued which should help with the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-1249.html