Bug 721110 - RFE: Concurrency limit default grouping
Status: CLOSED ERRATA
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: condor
Version: 1.3
Hardware: x86_64 Linux
Priority: high, Severity: medium
Target Milestone: 2.2
Target Release: ---
Assigned To: Erik Erlandson
QA Contact: Lubos Trilety
done
Keywords: FutureFeature
Depends On:
Blocks: 805351 828434
Reported: 2011-07-13 14:34 EDT by Scott Spurrier
Modified: 2012-09-19 14:03 EDT

See Also:
Fixed In Version: condor-7.6.5-0.15
Doc Type: Enhancement
Doc Text:
Cause: Customer wanted to alter the available concurrency limits for jobs on a frequent basis. Consequence: Frequent negotiator reconfigurations were required, and impacted pool performance. Change: The negotiator accountant was enhanced to support named groups for scoping multiple concurrency limit defaults based on a limit name prefix. Result: Concurrency limits can be defined with multiple possible default values based on name prefix, without needing to invoke frequent negotiator reconfigurations.
Story Points: ---
Clone Of:
Clones: 805351
Environment:
Last Closed: 2012-09-19 13:41:33 EDT

External Trackers
Tracker | ID | Priority | Status | Summary | Last Updated
Condor | 2863 | None | None | None | Never
Red Hat Product Errata | RHSA-2012:1278 | normal | SHIPPED_LIVE | Moderate: Red Hat Enterprise MRG Grid 2.2 security update | 2012-09-19 17:40:26 EDT

Description Scott Spurrier 2011-07-13 14:34:23 EDT
Description of problem:
We would like to be able to define a one-level grouping of concurrency limit defaults, so that adjusting concurrency limits does not require frequent negotiator reconfigurations. We envision configuring it with something like:

CONCURRENCY_LIMIT_DEFAULT = 150
CONCURRENCY_LIMIT_DEFAULT_LIST = direct,batch
CONCURRENCY_LIMIT_DEFAULT_LIST_direct = 40
CONCURRENCY_LIMIT_DEFAULT_LIST_batch = 120

A concurrency limit of "gl5000" would start with a value of 150. A concurrency limit of "direct.gl5000" would start with a value of 40. A concurrency limit of "batch.gl5000" would start with a value of 120.
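The lookup requested here can be sketched in a few lines of Python. This is only an illustration of the intended semantics, not Condor code: the names `GLOBAL_DEFAULT`, `GROUP_DEFAULTS` and `starting_limit` are hypothetical, and the values are copied from the proposed configuration.

```python
# Hypothetical illustration of the requested semantics; not Condor code.
GLOBAL_DEFAULT = 150                           # CONCURRENCY_LIMIT_DEFAULT
GROUP_DEFAULTS = {"direct": 40, "batch": 120}  # CONCURRENCY_LIMIT_DEFAULT_LIST_<group>


def starting_limit(name: str) -> int:
    """Return the starting value for a concurrency limit name.

    A name of the form "<group>.<rest>" uses the group's default when the
    group is listed; every other name falls back to the global default.
    """
    group, sep, _rest = name.partition(".")
    if sep and group in GROUP_DEFAULTS:
        return GROUP_DEFAULTS[group]
    return GLOBAL_DEFAULT


for name in ("gl5000", "direct.gl5000", "batch.gl5000"):
    print(name, starting_limit(name))
```

Only the text before the first '.' is treated as the group, which matches the single level of grouping asked for.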
Comment 8 Erik Erlandson 2012-03-08 15:03:56 EST
Fix pushed to: UPSTREAM-7.7.6-BZ721110-scoped-default-limits
Comment 9 Erik Erlandson 2012-03-08 15:04:40 EST
TESTING:

Begin with this configuration, which defines some scoped cc-limit defaults and a traditional cc-limit "TEST":

NEGOTIATOR_DEBUG = D_ACCOUNTANT | D_FULLDEBUG
NEGOTIATOR_INTERVAL = 20
NEGOTIATOR_CYCLE_DELAY = 5

# oversubscribe some slots to make testing easier:
NUM_CPUS = 25

# concurrency limit defaults:
CONCURRENCY_LIMIT_DEFAULT = 1
CONCURRENCY_LIMIT_DEFAULT_small = 2
CONCURRENCY_LIMIT_DEFAULT_medium = 5
CONCURRENCY_LIMIT_DEFAULT_large = 11

TEST_LIMIT = 3

Spin up a pool with this configuration, and submit this job:

universe = vanilla
cmd = /bin/sleep
args = 600
transfer_executable = false
should_transfer_files = if_needed
when_to_transfer_output = on_exit
concurrency_limits = large.test
queue 20
concurrency_limits = medium.test
queue 20
concurrency_limits = small.test
queue 20
concurrency_limits = test
queue 20
concurrency_limits = undef.test
queue 20
concurrency_limits = undef
queue 20

After submission, grep the negotiator log as follows; you should see:

$ tail -f NegotiatorLog | grep -e '-------' -e 'Limits --' -e 'Limit:'
03/06/12 16:28:05 ---------- Started Negotiation Cycle ----------
03/06/12 16:28:06 Previous Limits --
03/06/12 16:28:06 Current Limits --
03/06/12 16:28:06 ---------- Finished Negotiation Cycle ----------
03/06/12 16:28:12 ---------- Started Negotiation Cycle ----------
03/06/12 16:28:13 Previous Limits --
03/06/12 16:28:13 Current Limits --
03/06/12 16:28:13 Concurrency Limit: large.test is 0.000000
03/06/12 16:28:13 Concurrency Limit: large.test is 1.000000
03/06/12 16:28:13 Concurrency Limit: large.test is 2.000000
03/06/12 16:28:13 Concurrency Limit: large.test is 3.000000
03/06/12 16:28:13 Concurrency Limit: large.test is 4.000000
03/06/12 16:28:13 Concurrency Limit: large.test is 5.000000
03/06/12 16:28:13 Concurrency Limit: large.test is 6.000000
03/06/12 16:28:13 Concurrency Limit: large.test is 7.000000
03/06/12 16:28:13 Concurrency Limit: large.test is 8.000000
03/06/12 16:28:13 Concurrency Limit: large.test is 9.000000
03/06/12 16:28:13 Concurrency Limit: large.test is 10.000000
03/06/12 16:28:13 Concurrency Limit: large.test is 11.000000
03/06/12 16:28:13 Concurrency Limit: medium.test is 0.000000
03/06/12 16:28:14 Concurrency Limit: medium.test is 1.000000
03/06/12 16:28:14 Concurrency Limit: medium.test is 2.000000
03/06/12 16:28:14 Concurrency Limit: medium.test is 3.000000
03/06/12 16:28:14 Concurrency Limit: medium.test is 4.000000
03/06/12 16:28:14 Concurrency Limit: medium.test is 5.000000
03/06/12 16:28:14 Concurrency Limit: small.test is 0.000000
03/06/12 16:28:14 Concurrency Limit: small.test is 1.000000
03/06/12 16:28:14 Concurrency Limit: small.test is 2.000000
03/06/12 16:28:14 Concurrency Limit: test is 0.000000
03/06/12 16:28:14 Concurrency Limit: test is 1.000000
03/06/12 16:28:14 Concurrency Limit: test is 2.000000
03/06/12 16:28:14 Concurrency Limit: test is 3.000000
03/06/12 16:28:14 Concurrency Limit: undef.test is 0.000000
03/06/12 16:28:14 Concurrency Limit: undef.test is 1.000000
03/06/12 16:28:14 Concurrency Limit: undef is 0.000000
03/06/12 16:28:14 Concurrency Limit: undef is 1.000000
03/06/12 16:28:15 ---------- Finished Negotiation Cycle ----------
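To check a longer log without reading it by eye, the highest value logged for each limit can be extracted with a short script. This is a sketch that assumes the line format shown above ("Concurrency Limit: <name> is <value>"); the regular expression and the `peak_usage` helper are illustrative and not part of Condor.

```python
import re

# Matches lines such as:
#   03/06/12 16:28:13 Concurrency Limit: large.test is 11.000000
LINE_RE = re.compile(r"Concurrency Limit: (\S+) is (\d+(?:\.\d+)?)")


def peak_usage(lines):
    """Return {limit name: highest value logged} for the given log lines."""
    peaks = {}
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # cycle markers and other lines are ignored
        name, value = match.group(1), float(match.group(2))
        peaks[name] = max(value, peaks.get(name, 0.0))
    return peaks


sample = [
    "03/06/12 16:28:13 Concurrency Limit: large.test is 10.000000",
    "03/06/12 16:28:13 Concurrency Limit: large.test is 11.000000",
    "03/06/12 16:28:14 Concurrency Limit: small.test is 2.000000",
    "03/06/12 16:28:14 Concurrency Limit: undef is 1.000000",
    "03/06/12 16:28:15 ---------- Finished Negotiation Cycle ----------",
]
print(peak_usage(sample))
```

In the run above, the highest value per limit should equal the configured limit for that name.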

Another verification that the concurrency limits were obeyed as defined:

$ qvhist ConcurrencyLimits -c 'JobStatus == 2'
     11 large.test
      5 medium.test
      2 small.test
      3 test
      1 undef
      1 undef.test
     23 total

Note: qvhist can be found at https://github.com/erikerlandson/bash_condor_tools
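The expected histogram can also be derived from the test configuration. The sketch below assumes that an explicit <NAME>_LIMIT takes precedence over a prefix default, that the prefix is the text before the first '.', and that names are matched case-insensitively; these are assumptions made for illustration, and `resolve_limit` is a hypothetical helper rather than the negotiator's implementation.

```python
# Values copied from the test configuration and submit file above.
GLOBAL_DEFAULT = 1                                        # CONCURRENCY_LIMIT_DEFAULT
PREFIX_DEFAULTS = {"small": 2, "medium": 5, "large": 11}  # CONCURRENCY_LIMIT_DEFAULT_<prefix>
EXPLICIT_LIMITS = {"test": 3}                             # TEST_LIMIT
JOBS_PER_LIMIT = 20                                       # each "queue 20"
NUM_CPUS = 25


def resolve_limit(name: str) -> int:
    """Assumed order: explicit limit, then prefix default, then global default."""
    key = name.lower()
    if key in EXPLICIT_LIMITS:
        return EXPLICIT_LIMITS[key]
    prefix, sep, _rest = key.partition(".")
    if sep and prefix in PREFIX_DEFAULTS:
        return PREFIX_DEFAULTS[prefix]
    return GLOBAL_DEFAULT


names = ["large.test", "medium.test", "small.test", "test", "undef", "undef.test"]

# A limit cannot admit more jobs than were queued against it.
expected = {name: min(JOBS_PER_LIMIT, resolve_limit(name)) for name in names}
total = sum(expected.values())

for name, count in expected.items():
    print(f"{count:7d} {name}")
print(f"{total:7d} total")
```

Under these assumptions the total is 11 + 5 + 2 + 3 + 1 + 1 = 23, which fits within the 25 slots, so the concurrency limits rather than the slot count determine how many jobs run.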
Comment 10 Erik Erlandson 2012-03-08 15:12:16 EST
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Cause:
Customer wanted to alter the available concurrency limits for jobs on a frequent basis.

Consequence:
Frequent negotiator reconfigurations were required, and impacted pool performance.

Change:
The negotiator accountant was enhanced to support named groups for scoping multiple concurrency limit defaults based on a limit name prefix.

Result:
Concurrency limits can be defined with multiple possible default values based on name prefix, without needing to invoke frequent negotiator reconfigurations.
Comment 12 Lubos Trilety 2012-04-05 09:44:50 EDT
I ran the scenario from Comment 9 and then ran condor_userprio -l. condor_userprio prints only two concurrency limits, 'undef' and 'test':
# condor_userprio -l | grep -i limit
ConcurrencyLimit_undef = 1.000000
ConcurrencyLimit_test = 3.000000

I would expect all limits in use to be listed, including the scoped ones with a dot in their name.
Comment 13 Erik Erlandson 2012-04-05 19:37:56 EDT
(In reply to comment #12)
> I run the scenario from Comment 9 and then I run condor_userprio -l.
> condor_userprio prints only two concurrency limits 'undef' and 'test':
> # condor_userprio -l | grep -i limit
> ConcurrencyLimit_undef = 1.000000
> ConcurrencyLimit_test = 3.000000
> 
> I suppose there should be all used limits even those special with point in
> name.

The problem here is interesting: the concurrency limits are advertised by constructing an attribute name ConcurrencyLimit_<name>, but in the case of these new names, there is a '.' in them, which has special meaning in classads.  

The accountant attempts to invoke something like Assign('ConcurrencyLimit_large.test = 3'), but that fails and is ignored, because the '.' is interpreted as a selection operator on 'ConcurrencyLimit_large'.

It's not easy to get around, as the classads go over the wire, which means they are unparsed and then reparsed on the receiving end (userprio). So attempting to use any other delimiting character is going to fail, because the lexer only accepts identifiers made of alphanumerics and '_'.

So, we could advertise 'ConcurrencyLimit_large_test' instead of 'ConcurrencyLimit_large.test', but I think anything else would require rethinking how we report concurrency limits to userprio, or somehow leveraging support for single-quoted identifiers from the classad standard.
Comment 14 Erik Erlandson 2012-04-09 13:13:28 EDT
Update that advertises grouped cc-limits by replacing '.' with '_' (e.g. ConcurrencyLimit_large_test) committed to UPSTREAM-7.7.6-BZ721110-scoped-default-limits
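The renaming can be illustrated with a small Python sketch. The identifier pattern follows the description in comment 13 (alphanumerics and '_' only); `advertised_attr` and `is_plain_identifier` are hypothetical helpers, not functions from the Condor source.

```python
import re

# Plain identifiers as described in comment 13: alphanumerics and '_' only.
# Requiring a non-digit first character is an added assumption.
IDENTIFIER_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_plain_identifier(text: str) -> bool:
    """True if the whole string lexes as a single plain identifier."""
    return IDENTIFIER_RE.fullmatch(text) is not None


def advertised_attr(limit_name: str) -> str:
    """Build the advertised attribute name, replacing '.' with '_'."""
    return "ConcurrencyLimit_" + limit_name.replace(".", "_")


for limit in ("test", "large.test"):
    raw = "ConcurrencyLimit_" + limit
    print(raw, is_plain_identifier(raw), "->", advertised_attr(limit))
```

One consequence of this mapping, at least in the sketch, is that a limit named "large_test" and a limit named "large.test" would be advertised under the same attribute name.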
Comment 16 Lubos Trilety 2012-05-25 02:57:08 EDT
Tested with:
condor-7.6.8-0.1

Tested on:
RHEL5 x86_64,i386  - passed
RHEL6 x86_64,i386  - passed

>>> VERIFIED
Comment 18 Lubos Trilety 2012-06-18 10:27:04 EDT
Tested with:
condor-7.6.5-0.15

Tested on:
RHEL5 x86_64,i386  - passed
RHEL6 x86_64,i386  - passed

>>> VERIFIED
Comment 20 errata-xmlrpc 2012-09-19 13:41:33 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2012-1278.html
