Bug 805448 - bad submitter limit
Summary: bad submitter limit
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: condor
Version: Development
Hardware: All
OS: Linux
Priority: high
Severity: medium
Target Milestone: 2.3
Assignee: Erik Erlandson
QA Contact: Lubos Trilety
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2012-03-21 10:40 UTC by Lubos Trilety
Modified: 2013-03-06 18:43 UTC

Fixed In Version: condor-7.8.2-0.1
Doc Type: Bug Fix
Doc Text:
Cause: The previously enhanced logic for computing submitter limits did not take the NEGOTIATOR_CONSIDER_PREEMPTION setting into account.
Consequence: When NEGOTIATOR_CONSIDER_PREEMPTION was set to false, submitter limits were not as tight as possible, resulting in some inefficiency.
Fix: The logic was updated to take the consider-preemption setting into account.
Result: Submitter limits are tighter when the consider-preemption setting is off.
Clone Of:
Environment:
Last Closed: 2013-03-06 18:43:08 UTC
Target Upstream Version:
Embargoed:




Links
System ID | Private | Priority | Status | Summary | Last Updated
Condor 2952 | 0 | None | None | None | Never
Red Hat Bugzilla 805581 | 0 | medium | CLOSED | Number of group quota exceeded | 2021-02-22 00:41:40 UTC
Red Hat Product Errata RHSA-2013:0564 | 0 | normal | SHIPPED_LIVE | Low: Red Hat Enterprise MRG Grid 2.3 security update | 2013-03-06 23:37:09 UTC

Internal Links: 805581

Description Lubos Trilety 2012-03-21 10:40:45 UTC
Description of problem:
The submitter limit reported in the negotiator log is larger than it should be.

Version-Release number of selected component (if applicable):
condor-7.6.7-0.7

How reproducible:
100%

Steps to Reproduce:
Configuration:
$ more 95.bz706512.config 
NEGOTIATOR_CONSIDER_PREEMPTION = FALSE
NUM_CPUS = 10
GROUP_NAMES = a, b
GROUP_QUOTA_a = 5
GROUP_QUOTA_b = 5


The test uses two submissions.  The first submits a single job under submitter
"a.u1":
$ more a.u1.jsub
universe = vanilla
cmd = /bin/sleep
args = 300
should_transfer_files = if_needed
when_to_transfer_output = on_exit
+AccountingGroup="a.u1"
queue 1

The second will submit 4 or more jobs for submitter "a.u2":
$ more a.u2.jsub
universe = vanilla
cmd = /bin/sleep
args = 300
should_transfer_files = if_needed
when_to_transfer_output = on_exit
+AccountingGroup="a.u2"
queue 4

The test is to first submit "a.u1.jsub" and let it negotiate.  After it has
negotiated, submit "a.u2.jsub".


$ tail -f NegotiatorLog | grep -e 'Phase 4..:' -e 'Negotiating with.* at' -e 'submitterLimit *='
03/21/12 05:59:43 Phase 4.1:  Negotiating with schedds ...
03/21/12 05:59:43   Negotiating with a.u1@host at <IP:57823>
03/21/12 05:59:43     submitterLimit    = 1.000000
03/21/12 06:08:19 Phase 4.1:  Negotiating with schedds ...
03/21/12 06:08:19   Negotiating with a.u2@host at <IP:57823>
03/21/12 06:08:19     submitterLimit    = 5.000000
  
Actual results:
The submitter limit for a.u2 is 5.000000.

Expected results:
The log should show 'submitterLimit    = 4.000000' for a.u2.

Additional info:
The number of running jobs under group 'a' does not exceed the group quota when 5 jobs are submitted in the second submission. The negotiator log contains the following lines:
# cat NegotiatorLog
...
03/21/12 06:08:19 matchmakingAlgorithm: limit 5.000000 used 4.000000 pieLeft 1.000000
03/21/12 06:08:19 Attempting to use cached MatchList: Failed (MatchList length: 0, Autocluster: 1, Schedd Name: a.u2.eng.bos.redhat.com, Schedd Address: <10.16.64.239:57823>)
03/21/12 06:08:19       Rejected 2.4 a.u2.eng.bos.redhat.com <10.16.64.239:57823>: group quota exceeded
03/21/12 06:08:19     Hit submitter limit: done negotiating
03/21/12 06:08:19   This submitter hit its submitterLimit.
03/21/12 06:08:19  resources used scheddUsed= 4.000000
03/21/12 06:08:19 Group a is using its quota 5 - halting negotiation
03/21/12 06:08:19  negotiateWithGroup resources used scheddAds length 1
...
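
The sequence in the log excerpt above can be mimicked with a toy model. This is only an illustrative sketch, not the negotiator's code: the function name negotiate and its parameters are hypothetical, and it assumes one slot per job and a single submitter negotiating in the round. It shows why a limit of 5 leads to a "group quota exceeded" rejection on the fifth job, while a limit of 4 would stop the round before any rejection.

```python
def negotiate(submitter_limit, group_quota, group_in_use, idle_jobs):
    """Toy model of one negotiation round for a single submitter.

    Hypothetical helper, not an HTCondor API. Returns (matched, rejected).
    """
    matched = 0
    rejected = 0
    for _ in range(idle_jobs):
        if matched >= submitter_limit:
            break  # submitter limit reached, stop asking for more jobs
        if group_in_use + matched >= group_quota:
            rejected += 1  # plays the role of "group quota exceeded" in the log
            break
        matched += 1
    return matched, rejected


# Limit observed in this report: the fifth a.u2 job is tried and rejected.
print(negotiate(5.0, 5, 1, 5))
# Limit the reporter expected: the round ends before any rejection.
print(negotiate(4.0, 5, 1, 5))
```

In both cases four jobs are matched, which agrees with "scheddUsed= 4.000000" in the excerpt. Only the wasted match attempt differs.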

Comment 1 Erik Erlandson 2012-04-25 17:39:31 UTC
fix pushed to UPSTREAM-7.9.0-BZ805448-submitter-limits

Comment 2 Erik Erlandson 2012-04-25 17:40:55 UTC
REPRO/TEST.

Configuration:

NEGOTIATOR_CONSIDER_PREEMPTION = FALSE
NUM_CPUS = 5
GROUP_NAMES = a
GROUP_QUOTA_a = 5

Spin up pool. Watch negotiator log:

$ tail -f NegotiatorLog | grep -e 'Phase 4..:' -e 'Negotiating with.* at' -e 'submitterLimit *='

Submit two jobs. Allow the first to negotiate before submitting the second:

universe = vanilla
cmd = /bin/sleep
args = 300
should_transfer_files = if_needed
when_to_transfer_output = on_exit
+AccountingGroup="a.u1"
queue 1

universe = vanilla
cmd = /bin/sleep
args = 300
should_transfer_files = if_needed
when_to_transfer_output = on_exit
+AccountingGroup="a.u2"
queue 4

BEFORE FIX, you should see the following (2nd submitter limit is 5)

$ tail -f NegotiatorLog | grep -e 'Phase 4..:' -e 'Negotiating with.* at' -e 'submitterLimit *='
04/25/12 10:08:20 Phase 4.1:  Negotiating with schedds ...
04/25/12 10:08:20   Negotiating with a.u1@localdomain at <192.168.1.2:33501>
04/25/12 10:08:20     submitterLimit    = 1.000000
04/25/12 10:08:40 Phase 4.1:  Negotiating with schedds ...
04/25/12 10:08:40   Negotiating with a.u2@localdomain at <192.168.1.2:33501>
04/25/12 10:08:40     submitterLimit    = 5.000000

AFTER FIX, you should see (2nd submitter limit is 4)

$ tail -f NegotiatorLog | grep -e 'Phase 4..:' -e 'Negotiating with.* at' -e 'submitterLimit *='
04/25/12 10:09:47 Phase 4.1:  Negotiating with schedds ...
04/25/12 10:09:47   Negotiating with a.u1@localdomain at <192.168.1.2:54831>
04/25/12 10:09:47     submitterLimit    = 1.000000
04/25/12 10:10:07 Phase 4.1:  Negotiating with schedds ...
04/25/12 10:10:07   Negotiating with a.u2@localdomain at <192.168.1.2:54831>
04/25/12 10:10:07     submitterLimit    = 4.000000
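
For repeated test runs, the before/after comparison above could be automated rather than read by eye. The following is a minimal sketch, assuming the log lines keep the format shown in this comment. The helper name submitter_limits and the two regular expressions are my own, not part of HTCondor.

```python
import re

# Patterns based on the log lines quoted in this report (assumed stable).
NEGOTIATING = re.compile(r"Negotiating with (\S+) at ")
LIMIT = re.compile(r"submitterLimit\s*=\s*([0-9.]+)")


def submitter_limits(lines):
    """Pair each 'Negotiating with <submitter> at' line with the next submitterLimit."""
    results = []
    current = None
    for line in lines:
        m = NEGOTIATING.search(line)
        if m:
            current = m.group(1)
            continue
        m = LIMIT.search(line)
        if m and current is not None:
            results.append((current, float(m.group(1))))
            current = None
    return results


sample = """\
04/25/12 10:09:47 Phase 4.1:  Negotiating with schedds ...
04/25/12 10:09:47   Negotiating with a.u1@localdomain at <192.168.1.2:54831>
04/25/12 10:09:47     submitterLimit    = 1.000000
04/25/12 10:10:07 Phase 4.1:  Negotiating with schedds ...
04/25/12 10:10:07   Negotiating with a.u2@localdomain at <192.168.1.2:54831>
04/25/12 10:10:07     submitterLimit    = 4.000000
""".splitlines()

print(submitter_limits(sample))
```

A test harness could then assert that the limit recorded for a.u2 is 4.0 after the fix.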

Comment 3 Erik Erlandson 2012-04-25 17:45:08 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Cause:
The previously enhanced logic for computing submitter limits did not take the NEGOTIATOR_CONSIDER_PREEMPTION setting into account.

Consequence:
When NEGOTIATOR_CONSIDER_PREEMPTION was set to false, submitter limits were not as tight as possible, resulting in some inefficiency.

Fix:
The logic was updated to take the consider-preemption setting into account.

Result:
Submitter limits are tighter when consider-preemption setting is off.
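
The numbers in this bug (quota 5, one slot already in use, limit 5 before the fix and 4 after) are consistent with the simplified relation sketched below. This is an assumption made for illustration, not the actual negotiator computation, which also involves priorities and pie shares. The function submitter_limit_cap is hypothetical.

```python
def submitter_limit_cap(group_quota, group_in_use, consider_preemption):
    """Upper bound on a submitter's limit in a simplified model (hypothetical)."""
    if consider_preemption:
        # Assumption: claimed slots could be preempted, so they still count
        # as obtainable and the cap is the full group quota.
        return float(group_quota)
    # With preemption off, slots already in use cannot be obtained this round.
    return float(max(group_quota - group_in_use, 0))


# Values from the repro: GROUP_QUOTA_a = 5, one a.u1 job already running.
print(submitter_limit_cap(5, 1, True))
print(submitter_limit_cap(5, 1, False))
```

Ignoring the preemption setting corresponds to the first call, which gives the loose value of 5. The second call gives the tighter value of 4 seen after the fix.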

Comment 7 Lubos Trilety 2012-11-12 20:06:13 UTC
Successfully reproduced on condor-7.6.7-0.7

Tested with condor-7.8.7-0.4

Tested on:
RHEL5 x86_64,i386
RHEL6 x86_64,i386

# tail -f NegotiatorLog | grep -e 'Phase 4..:' -e 'Negotiating with.* at' -e 'submitterLimit *='
11/12/12 19:40:15 Phase 4.1:  Negotiating with schedds ...
11/12/12 19:40:15   Negotiating with a.u1@host at <IP:35649>
11/12/12 19:40:15     submitterLimit    = 5.000000
11/12/12 19:40:39 Phase 4.1:  Negotiating with schedds ...
11/12/12 19:40:39   Negotiating with a.u2@host at <IP:35649>
11/12/12 19:40:39     submitterLimit    = 4.000000

The submitterLimit is now 4, as it should be, and there are no "group quota exceeded" errors in NegotiatorLog.

>>> verified

Comment 9 errata-xmlrpc 2013-03-06 18:43:08 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-0564.html

