Description of problem:

We would like the ability to set a 1-tier grouping of concurrency limit defaults to assist in avoiding a high rate of negotiator reconfigs for concurrency limit adjustments. We envision consuming that as something like:

CONCURRENCY_LIMIT_DEFAULT = 150
CONCURRENCY_LIMIT_DEFAULT_LIST = direct,batch
CONCURRENCY_LIMIT_DEFAULT_LIST_direct = 40
CONCURRENCY_LIMIT_DEFAULT_LIST_batch = 120

A concurrency limit of "gl5000" would start with a value of 150.
A concurrency limit of "direct.gl5000" would start with a value of 40.
A concurrency limit of "batch.gl5000" would start with a value of 120.
Fix pushed to: UPSTREAM-7.7.6-BZ721110-scoped-default-limits
TESTING:

Begin with this configuration, that defines some scoped cc-limit defaults and a traditional cc-limit "TEST":

NEGOTIATOR_DEBUG = D_ACCOUNTANT | D_FULLDEBUG
NEGOTIATOR_INTERVAL = 20
NEGOTIATOR_CYCLE_DELAY = 5
# oversubscribe some slots to make testing easier:
NUM_CPUS = 25
# concurrency limit defaults:
CONCURRENCY_LIMIT_DEFAULT = 1
CONCURRENCY_LIMIT_DEFAULT_small = 2
CONCURRENCY_LIMIT_DEFAULT_medium = 5
CONCURRENCY_LIMIT_DEFAULT_large = 11
TEST_LIMIT = 3

Spin up a pool with this configuration, and submit this job:

universe = vanilla
cmd = /bin/sleep
args = 600
transfer_executable = false
should_transfer_files = if_needed
when_to_transfer_output = on_exit
concurrency_limits = large.test
queue 20
concurrency_limits = medium.test
queue 20
concurrency_limits = small.test
queue 20
concurrency_limits = test
queue 20
concurrency_limits = undef.test
queue 20
concurrency_limits = undef
queue 20

If you grep the negotiator log as follows, after submission you should see:

$ tail -f NegotiatorLog | grep -e '-------' -e 'Limits --' -e 'Limit:'
03/06/12 16:28:05 ---------- Started Negotiation Cycle ----------
03/06/12 16:28:06 Previous Limits --
03/06/12 16:28:06 Current Limits --
03/06/12 16:28:06 ---------- Finished Negotiation Cycle ----------
03/06/12 16:28:12 ---------- Started Negotiation Cycle ----------
03/06/12 16:28:13 Previous Limits --
03/06/12 16:28:13 Current Limits --
03/06/12 16:28:13 Concurrency Limit: large.test is 0.000000
03/06/12 16:28:13 Concurrency Limit: large.test is 1.000000
03/06/12 16:28:13 Concurrency Limit: large.test is 2.000000
03/06/12 16:28:13 Concurrency Limit: large.test is 3.000000
03/06/12 16:28:13 Concurrency Limit: large.test is 4.000000
03/06/12 16:28:13 Concurrency Limit: large.test is 5.000000
03/06/12 16:28:13 Concurrency Limit: large.test is 6.000000
03/06/12 16:28:13 Concurrency Limit: large.test is 7.000000
03/06/12 16:28:13 Concurrency Limit: large.test is 8.000000
03/06/12 16:28:13 Concurrency Limit: large.test is 9.000000
03/06/12 16:28:13 Concurrency Limit: large.test is 10.000000
03/06/12 16:28:13 Concurrency Limit: large.test is 11.000000
03/06/12 16:28:13 Concurrency Limit: medium.test is 0.000000
03/06/12 16:28:14 Concurrency Limit: medium.test is 1.000000
03/06/12 16:28:14 Concurrency Limit: medium.test is 2.000000
03/06/12 16:28:14 Concurrency Limit: medium.test is 3.000000
03/06/12 16:28:14 Concurrency Limit: medium.test is 4.000000
03/06/12 16:28:14 Concurrency Limit: medium.test is 5.000000
03/06/12 16:28:14 Concurrency Limit: small.test is 0.000000
03/06/12 16:28:14 Concurrency Limit: small.test is 1.000000
03/06/12 16:28:14 Concurrency Limit: small.test is 2.000000
03/06/12 16:28:14 Concurrency Limit: test is 0.000000
03/06/12 16:28:14 Concurrency Limit: test is 1.000000
03/06/12 16:28:14 Concurrency Limit: test is 2.000000
03/06/12 16:28:14 Concurrency Limit: test is 3.000000
03/06/12 16:28:14 Concurrency Limit: undef.test is 0.000000
03/06/12 16:28:14 Concurrency Limit: undef.test is 1.000000
03/06/12 16:28:14 Concurrency Limit: undef is 0.000000
03/06/12 16:28:14 Concurrency Limit: undef is 1.000000
03/06/12 16:28:15 ---------- Finished Negotiation Cycle ----------

Another verification that the concurrency limits were obeyed as defined:

$ qvhist ConcurrencyLimits -c 'JobStatus == 2'
11 large.test
5 medium.test
2 small.test
3 test
1 undef
1 undef.test
23 total

Note: qvhist can be found here: https://github.com/erikerlandson/bash_condor_tools
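For anyone repeating this check without qvhist, the grep output can also be tallied with a few lines of Python. This is only a sketch: the helper name peak_counts is made up for illustration, and reading the highest logged value as the number of running jobs is an inference from the fact that it matches the qvhist counts above, not something the log format guarantees.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   03/06/12 16:28:13 Concurrency Limit: large.test is 11.000000
LIMIT_LINE = re.compile(r"Concurrency Limit: (\S+) is ([0-9.]+)")


def peak_counts(log_lines):
    """Return the highest value logged for each concurrency limit name.

    Hypothetical helper: lines that do not match LIMIT_LINE are skipped.
    """
    peaks = defaultdict(float)
    for line in log_lines:
        match = LIMIT_LINE.search(line)
        if match:
            name, value = match.group(1), float(match.group(2))
            peaks[name] = max(peaks[name], value)
    return dict(peaks)


# A few lines copied from the NegotiatorLog excerpt in this comment.
sample = [
    "03/06/12 16:28:13 Current Limits --",
    "03/06/12 16:28:13 Concurrency Limit: large.test is 10.000000",
    "03/06/12 16:28:13 Concurrency Limit: large.test is 11.000000",
    "03/06/12 16:28:14 Concurrency Limit: small.test is 2.000000",
    "03/06/12 16:28:14 Concurrency Limit: undef.test is 1.000000",
]
print(peak_counts(sample))
```

Feeding the full grep output through peak_counts should give one entry per limit name, which can then be compared against the configured defaults.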
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
Cause: A customer wanted to alter the available concurrency limits for jobs on a frequent basis.
Consequence: Frequent negotiator reconfigurations were required, which impacted pool performance.
Change: The negotiator accountant was enhanced to support named groups for scoping multiple concurrency limit defaults based on a limit name prefix.
Result: Concurrency limits can be defined with multiple possible default values based on name prefix, without needing to invoke frequent negotiator reconfigurations.
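The prefix-based scoping described in the note can be sketched roughly as follows, using the configuration from the TESTING comment. This is not the accountant's code: the function name resolve_limit, the lookup order, and the case-insensitive matching are all assumptions made for illustration.

```python
def resolve_limit(name, config):
    """Resolve the limit value for a concurrency limit name.

    Assumed lookup order (a sketch, not the negotiator's implementation):
      1. an explicit <NAME>_LIMIT entry,
      2. CONCURRENCY_LIMIT_DEFAULT_<prefix>, where <prefix> is the text
         before the first '.' in the limit name,
      3. the global CONCURRENCY_LIMIT_DEFAULT.
    """
    # Assumption: names compare case-insensitively, which is what lets the
    # limit "test" pick up TEST_LIMIT in the test configuration.
    cfg = {key.upper(): value for key, value in config.items()}
    key = name.upper()

    explicit = f"{key}_LIMIT"
    if explicit in cfg:
        return cfg[explicit]

    if "." in key:
        prefix = key.split(".", 1)[0]
        scoped = f"CONCURRENCY_LIMIT_DEFAULT_{prefix}"
        if scoped in cfg:
            return cfg[scoped]

    # Returns None when no global default is configured.
    return cfg.get("CONCURRENCY_LIMIT_DEFAULT")


# Values taken from the test configuration in the TESTING comment.
config = {
    "CONCURRENCY_LIMIT_DEFAULT": 1,
    "CONCURRENCY_LIMIT_DEFAULT_small": 2,
    "CONCURRENCY_LIMIT_DEFAULT_medium": 5,
    "CONCURRENCY_LIMIT_DEFAULT_large": 11,
    "TEST_LIMIT": 3,
}

for limit in ["large.test", "medium.test", "small.test", "test", "undef.test", "undef"]:
    print(limit, resolve_limit(limit, config))
```

Under these assumptions the six limit names resolve to 11, 5, 2, 3, 1 and 1, which lines up with the running-job counts reported by qvhist in the test.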
I ran the scenario from Comment 9 and then ran condor_userprio -l. condor_userprio prints only two concurrency limits, 'undef' and 'test':

# condor_userprio -l | grep -i limit
ConcurrencyLimit_undef = 1.000000
ConcurrencyLimit_test = 3.000000

I would expect all limits in use to be listed, including the new ones with a dot in the name.
(In reply to comment #12)
> I ran the scenario from Comment 9 and then ran condor_userprio -l.
> condor_userprio prints only two concurrency limits, 'undef' and 'test':
> # condor_userprio -l | grep -i limit
> ConcurrencyLimit_undef = 1.000000
> ConcurrencyLimit_test = 3.000000
>
> I would expect all limits in use to be listed, including the new ones with
> a dot in the name.

The problem here is interesting: the concurrency limits are advertised by constructing an attribute name ConcurrencyLimit_<name>, but the new names contain a '.', which has special meaning in classads. The accountant attempts to invoke something like Assign('ConcurrencyLimit_large.test = 3'), but that fails and is ignored, because the '.' is interpreted as a selection operator on 'ConcurrencyLimit_large'.

It's not easy to get around, as the classads go over the wire, which means they are unparsed and then reparsed on the receiving end (userprio). So attempting to use any other delimiting character is going to fail, because the lexer only accepts identifiers made of alphanumerics and '_'.

So, we could advertise 'ConcurrencyLimit_large_test' instead of 'ConcurrencyLimit_large.test', but I think anything else would require rethinking how we report concurrency limits to userprio, or somehow leveraging support for single-quoted identifiers from the classad standard.
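To make the identifier restriction concrete, here is a small standalone check, offered only as an illustration. It does not call the classad library; the regular expression is an assumed approximation of the lexer rule stated above (alphanumerics and '_'), and the rule that a name may not start with a digit is an additional assumption.

```python
import re

# Assumed approximation of a plain classad identifier: letters, digits and
# '_' only, not starting with a digit. This is not the real lexer.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_plain_attribute_name(attr):
    """Return True if attr consists solely of identifier characters."""
    return IDENTIFIER.fullmatch(attr) is not None


for limit in ["test", "undef", "large.test", "undef.test"]:
    attr = f"ConcurrencyLimit_{limit}"
    print(attr, is_plain_attribute_name(attr))
```

Only the names built from 'test' and 'undef' pass this check, which matches the two attributes that condor_userprio -l actually printed.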
An update that advertises grouped cc-limits by replacing '.' with '_' (e.g. ConcurrencyLimit_large_test) has been committed to UPSTREAM-7.7.6-BZ721110-scoped-default-limits
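The name mapping amounts to a single character substitution. A minimal sketch follows; the function name advertised_attr is hypothetical and this is not the committed code.

```python
def advertised_attr(limit_name):
    """Build the advertised attribute name for a concurrency limit.

    Hypothetical helper: replaces '.' with '_' so the result contains only
    identifier characters, as in ConcurrencyLimit_large_test.
    """
    return "ConcurrencyLimit_" + limit_name.replace(".", "_")


for limit in ["large.test", "medium.test", "small.test", "test", "undef.test", "undef"]:
    print(limit, "->", advertised_attr(limit))
```

One consequence of a plain substitution like this is that a limit named 'large.test' and a limit named 'large_test' would map to the same attribute name, so the advertised names are unambiguous only if such pairs are not used together.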
Tested with:
condor-7.6.8-0.1

Tested on:
RHEL5 x86_64,i386 - passed
RHEL6 x86_64,i386 - passed

>>> VERIFIED
Tested with:
condor-7.6.5-0.15

Tested on:
RHEL5 x86_64,i386 - passed
RHEL6 x86_64,i386 - passed

>>> VERIFIED
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHSA-2012-1278.html