Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1080491

Summary: Concurrency limits exceed their setting
Product: Red Hat Enterprise MRG
Reporter: Lubos Trilety <ltrilety>
Component: condor
Assignee: grid-maint-list <grid-maint-list>
Status: CLOSED NOTABUG
QA Contact: MRG Quality Engineering <mrgqe-bugs>
Severity: unspecified
Priority: unspecified
Version: 2.5
CC: eerlands, matt, sgraf
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Type: Bug
Last Closed: 2014-03-27 13:19:02 UTC
Attachments: NegotiatorLog

Description Lubos Trilety 2014-03-25 14:30:17 UTC
Created attachment 878485 [details]
NegotiatorLog

Description of problem:
Sometimes the negotiator matches more jobs against a concurrency limit than the limit's configured size allows.

Version-Release number of selected component (if applicable):
condor-7.8.9-0.8

How reproducible:
30%

Steps to Reproduce:
1. set condor
NUM_CPUS=30
NEGOTIATOR_INTERVAL=20
TEST_LIMIT=3
NEGOTIATOR_CYCLE_DELAY=5
NEGOTIATOR_DEBUG=D_ACCOUNTANT | D_FULLDEBUG
CONCURRENCY_LIMIT_DEFAULT_small=2
CONCURRENCY_LIMIT_DEFAULT_medium=5
CONCURRENCY_LIMIT_DEFAULT=1
CONCURRENCY_LIMIT_DEFAULT_large=11

2. submit job
should_transfer_files=IF_NEEDED
concurrency_limits=large.test
executable=/bin/sleep
iwd=/tmp
requirements=(FileSystemDomain =!= UNDEFINED && Arch =!= UNDEFINED)
transfer_executable=False
universe=vanilla
arguments=6000
when_to_transfer_output=ON_EXIT
queue 20
concurrency_limits=medium.test
queue 20
concurrency_limits=small.test
queue 20
concurrency_limits=test
queue 20
concurrency_limits=undef.test
queue 20
concurrency_limits=undef
queue 20
concurrency_limits=medium.undef
queue 20

3. see limits
# condor_userprio -l | grep "ConcurrencyLimit"
ConcurrencyLimit_medium_test = 5.000000
ConcurrencyLimit_small_test = 2.000000
ConcurrencyLimit_medium_undef = 5.000000
ConcurrencyLimit_large_test = 13.000000
ConcurrencyLimit_undef = 1.000000
ConcurrencyLimit_undef_test = 1.000000
ConcurrencyLimit_test = 3.000000

# condor_q -c 'JobStatus == 2' -l | grep "ConcurrencyLimits " | uniq -c
     13 ConcurrencyLimits = "large.test"
      5 ConcurrencyLimits = "medium.test"
      2 ConcurrencyLimits = "small.test"
      3 ConcurrencyLimits = "test"
      1 ConcurrencyLimits = "undef.test"
      1 ConcurrencyLimits = "undef"
      5 ConcurrencyLimits = "medium.undef"


Actual results:
More jobs are running than the configured limits allow. In this case the 'large.test' concurrency limit is exceeded; in other runs it is, for example, 'medium.test'.
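The exceedance can be cross-checked mechanically. Below is a hypothetical sketch (not an HTCondor tool): the caps come from the CONCURRENCY_LIMIT_DEFAULT_* / TEST_LIMIT settings in step 1, the running counts from the condor_q output in step 3, and prefix-based defaulting ("<group>.<name>" falls back to the <group> default) is assumed.

```python
# Configured caps, taken from the step-1 configuration above.
group_defaults = {"small": 2, "medium": 5, "large": 11}
explicit_limits = {"test": 3}          # TEST_LIMIT = 3
global_default = 1                     # CONCURRENCY_LIMIT_DEFAULT = 1

# Running-job counts, taken from the step-3 condor_q output above.
running = {
    "large.test": 13, "medium.test": 5, "small.test": 2, "test": 3,
    "undef.test": 1, "undef": 1, "medium.undef": 5,
}

def cap_for(limit):
    """Resolve the configured cap for a concurrency limit name."""
    if limit in explicit_limits:
        return explicit_limits[limit]
    group = limit.split(".", 1)[0] if "." in limit else None
    return group_defaults.get(group, global_default)

violations = [(name, count, cap_for(name))
              for name, count in running.items() if count > cap_for(name)]
print(violations)   # the only exceeded limit in this run is large.test
```

Only 'large.test' (13 running against a cap of 11) is over its limit; every other count is at or below its cap.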

Expected results:
No concurrency limit should be exceeded.

Additional info:
See the attached NegotiatorLog.

Comment 1 Erik Erlandson 2014-03-25 17:46:55 UTC
From the attached log, it looks like the accountant is picking up stale ads from the collector in its "(Accountant) Checking Matches" phase. The concurrency limits are being properly respected at the point where they are checked, but the accountant is seeing 'extra' ads that are probably stale in the collector. It is not adding new matches against the cc-limits, which is the desired behavior.

In the configuration above, NEGOTIATOR_INTERVAL is set to 20 seconds, which is a tight interval that can get ahead of the state changes in the startds and the collector.  That can cause the negotiator's counting of various resources to get a bit out of sync.

Setting NEGOTIATOR_INTERVAL to be longer (e.g. 60 seconds), and making sure to configure the startd update interval, UPDATE_INTERVAL, to be shorter than NEGOTIATOR_INTERVAL, should prevent the negotiator from getting ahead of the collector and startds, and the cc-limit accounting will stop looking out of sync.
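A sketch of the configuration change suggested above (the interval values are the examples from this comment, not mandated settings):

```
# Let startd state changes reach the collector before the next negotiation
# cycle: keep UPDATE_INTERVAL (startd -> collector) shorter than
# NEGOTIATOR_INTERVAL.
NEGOTIATOR_INTERVAL = 60
UPDATE_INTERVAL = 55
```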

Comment 2 Lubos Trilety 2014-03-26 13:49:57 UTC
(In reply to Erik Erlandson from comment #1)

I took the configuration from Bug 721110. It worked pretty well in previous builds, or we were just really 'lucky'.
I thought that the negotiator communicates directly with the startd, not via the collector.
Anyway, it also happens with update_interval set to 28 and negotiator_interval set to 30. However, I was not able to reproduce the bug with negotiator_interval set to 60 and update_interval set to 55, or with the default settings (both intervals unset).

Comment 3 Erik Erlandson 2014-03-26 17:54:37 UTC
(In reply to Lubos Trilety from comment #2)
> (In reply to Erik Erlandson from comment #1)

> I thought that negotiator communicates directly with startd not via
> collector.

The negotiator gets all its information about the state of resource usage across the pool from the collector.   The startds only update their state to the collector at intervals, so the negotiator technically never sees the 'true' instantaneous state of the system.   For example, the negotiator may not see resources that have freed up very recently. 

On a related note: the negotiator sends its match information to the scheduler(s), where it gets passed to the various startds. The startds then go through the process of spinning up starters, possibly creating dynamic slots, changing slot states to 'claimed', and eventually sending claim information back to the collector.

So, when negotiator intervals are short, it is even possible for the negotiator to make a match whose information has not yet circulated back to the collector when the negotiator begins its next cycle.

At any rate, when testing behaviors of things like accounting groups and cc-limits, it is important to take these various propagation latencies into account.   Any changes to resource usage, either using resources or freeing them up, need time to get back around to the collector before the negotiator will see them and respond.
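The latency effect described in this comment can be illustrated with a toy discrete-time model (a sketch under simplifying assumptions, not HTCondor's actual accounting; all names here are hypothetical): if the negotiator cycles faster than its matches become visible in the collector, it over-commits the limit.

```python
# Toy model: matches become visible to the "collector" only after
# `update_latency` ticks; the "negotiator" fills up to the limit based on
# the collector's (possibly stale) view every `negotiator_interval` ticks.

def peak_matches(limit, negotiator_interval, update_latency, ticks=100):
    """Return the peak number of simultaneously matched jobs."""
    matched = 0           # true number of matched jobs
    collector_view = 0    # count the collector currently reports
    in_flight = []        # (tick when update becomes visible, new count)
    peak = 0
    for t in range(ticks):
        # Deliver any updates that have now reached the collector.
        while in_flight and in_flight[0][0] <= t:
            collector_view = in_flight.pop(0)[1]
        # Negotiation cycle: match against the collector's stale view.
        if t % negotiator_interval == 0 and collector_view < limit:
            matched += limit - collector_view   # fill up to the apparent limit
            in_flight.append((t + update_latency, matched))
            peak = max(peak, matched)
    return peak

# Short cycle: the negotiator re-matches against a stale view and exceeds 11.
print(peak_matches(limit=11, negotiator_interval=2, update_latency=5))   # 33
# Cycle longer than the update latency: the limit holds.
print(peak_matches(limit=11, negotiator_interval=10, update_latency=5))  # 11
```

In the first run the negotiator completes three cycles before its first match report ever reaches the collector, tripling the intended cap; in the second, each cycle sees an up-to-date count.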

Comment 4 Lubos Trilety 2014-03-27 13:19:02 UTC
(In reply to Erik Erlandson from comment #3)

This seems sufficiently explained to me; closing as not a bug.