Bug 641431 - RFE: implement placeholder for new negotiator stat ATTR_LAST_NEGOTIATION_CYCLE_SUBMITTERS_SHARE_LIMIT
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: condor
Version: 1.3
Hardware: All
OS: All
Priority: low
Severity: low
Target Milestone: 1.3.2
: ---
Assignee: Erik Erlandson
QA Contact: Lubos Trilety
URL:
Whiteboard:
Depends On: 641418
Blocks: 674669
 
Reported: 2010-10-08 17:38 UTC by Erik Erlandson
Modified: 2018-11-14 16:37 UTC (History)

Fixed In Version: condor-7.4.5-0.2
Doc Type: Enhancement
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-02-15 13:02:27 UTC
Target Upstream Version:
Embargoed:



Description Erik Erlandson 2010-10-08 17:38:16 UTC
ATTR_LAST_NEGOTIATION_CYCLE_SUBMITTERS_SHARE_LIMIT is a placeholder for the submitters that hit the HFS group share limit during a negotiation cycle.

Implementing this stat correctly depends on addressing this:
https://bugzilla.redhat.com/show_bug.cgi?id=641418

Comment 2 Erik Erlandson 2010-11-24 23:48:57 UTC
Addressed in devel branch: V7_4-BZ619557-HFS-tree-structure

Comment 5 Lubos Trilety 2011-01-27 09:24:29 UTC
How can this functionality be tested? Where can the placeholder ATTR_LAST_NEGOTIATION_CYCLE_SUBMITTERS_SHARE_LIMIT be seen?

Comment 6 Erik Erlandson 2011-01-27 13:26:11 UTC
(In reply to comment #5)

It is implemented in the latest releases, and can be seen in the Negotiator classad.

Comment 7 Lubos Trilety 2011-01-27 15:16:08 UTC
config file:
NUM_CPUS = 2
GROUP_NAMES = a, b
GROUP_QUOTA_DYNAMIC_a = 0.5
GROUP_QUOTA_DYNAMIC_b = 0.5


# echo -e "universe=vanilla\ncmd=/bin/sleep\nargs=1d\n+AccountingGroup=\"a.u3\"\nqueue 2\n+AccountingGroup=\"a.u1\"\nqueue 2\n+AccountingGroup=\"a.u2\"\nqueue 2\n" | runuser condor -s /bin/bash -c "condor_submit"
Submitting job(s)......
6 job(s) submitted to cluster 1.

# condor_q
-- Submitter: hostname : <IP:47042> : hostname
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD               
   1.0   condor          1/27 09:57   0+00:00:00 I  0   0.0  sleep 1d          
   1.1   condor          1/27 09:57   0+00:00:00 I  0   0.0  sleep 1d          
   1.2   condor          1/27 09:57   0+00:10:16 R  0   0.0  sleep 1d          
   1.3   condor          1/27 09:57   0+00:00:00 I  0   0.0  sleep 1d          
   1.4   condor          1/27 09:57   0+00:00:00 I  0   0.0  sleep 1d          
   1.5   condor          1/27 09:57   0+00:00:00 I  0   0.0  sleep 1d          
6 jobs; 5 idle, 1 running, 0 held

# condor_status -subsystem negotiator -l | grep SubmittersShareLimit
LastNegotiationCycleSubmittersShareLimit0 = ""
LastNegotiationCycleSubmittersShareLimit1 = ""
LastNegotiationCycleSubmittersShareLimit2 = ""
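
For scripted checks, the grep output above can be parsed instead of being read by eye. The following is a minimal Python sketch, not part of condor: the function name parse_share_limit is hypothetical, and it assumes each attribute value is a quoted, comma-separated list of submitter names and that the numeric suffix identifies one of the recent negotiation cycles.

```python
import re

# Matches lines such as: LastNegotiationCycleSubmittersShareLimit0 = "b.user@hostname"
ATTR_RE = re.compile(
    r'^LastNegotiationCycleSubmittersShareLimit(\d+)\s*=\s*"(.*)"\s*$'
)


def parse_share_limit(classad_text):
    """Return {cycle_index: [submitter, ...]} from classad text.

    Hypothetical helper. Assumes values are comma-separated submitter
    names; an empty string yields an empty list.
    """
    result = {}
    for line in classad_text.splitlines():
        match = ATTR_RE.match(line.strip())
        if match is None:
            continue  # ignore unrelated classad attributes
        index = int(match.group(1))
        names = [n.strip() for n in match.group(2).split(",") if n.strip()]
        result[index] = names
    return result


# The three lines reported in this comment: no submitter is listed yet.
observed = '''LastNegotiationCycleSubmittersShareLimit0 = ""
LastNegotiationCycleSubmittersShareLimit1 = ""
LastNegotiationCycleSubmittersShareLimit2 = ""
'''
print(parse_share_limit(observed))
```

An empty list for every cycle corresponds to the state reported here, where no submitter is shown as having hit the share limit.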

>>> ASSIGNED

Comment 8 Erik Erlandson 2011-01-28 20:42:28 UTC
I tracked down the reconfig connection. In fact, the behavior prior to reconfig is correct: the HFS logic skips negotiation for group "a" because it is already at quota:

01/28/11 12:08:02 Group <none> - skipping, zero slots allocated
01/28/11 12:08:02 Group a - skipping, at or over quota (usage=1)
01/28/11 12:08:02 Group b - skipping, zero slots allocated

## Note userprio:
[eje@rorschach gt986]$ condor_userprio -all
Last Priority Update:  1/28 12:09
                                    Effective   Real     Priority   Res   Total Usage       Usage            Last      
User Name                           Priority  Priority    Factor    Used (wghted-hrs)    Start Time       Usage Time   
------------------------------      --------- -------- ------------ ----  ----------- ---------------- ----------------
a                                        0.50     0.50         1.00    1         0.03  1/28/2011 12:08  1/28/2011 12:09
a.user1@localdomain                      0.50     0.50         1.00    1         0.03  1/28/2011 12:08  1/28/2011 12:09


## Now, after reconfig, there is a glitch in the usage update for group "a": it got set to zero when it should have been 1, as above:
[eje@rorschach gt986]$ condor_userprio -all
Last Priority Update:  1/28 12:10
                                    Effective   Real     Priority   Res   Total Usage       Usage            Last      
User Name                           Priority  Priority    Factor    Used (wghted-hrs)    Start Time       Usage Time   
------------------------------      --------- -------- ------------ ----  ----------- ---------------- ----------------
a                                        0.50     0.50         1.00    0         0.04  1/28/2011 12:08  1/28/2011 12:10
a.user1@localdomain                      0.50     0.50         1.00    1         0.05  1/28/2011 12:08  1/28/2011 12:10


## Now, the usage for group "a" is (incorrectly) zero, and the HFS logic does not skip the negotiation:
01/28/11 12:10:34 Group <none> - skipping, zero slots allocated
01/28/11 12:10:34 Group b - skipping, zero slots allocated

Therefore, any excess submitters go through negotiation, are rejected because of the share limit, and show up in the negotiator stats.
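
The skip decision visible in the two log excerpts above can be modeled roughly as follows. This is an illustrative Python sketch, not the negotiator's actual code: groups_to_negotiate and its inputs are hypothetical, and the allocation of 1 slot for group "a" is an assumption chosen to match the usage=1 log line.

```python
def groups_to_negotiate(groups):
    """Decide which groups enter negotiation.

    groups maps a group name to (allocated_slots, usage).
    Returns (names_negotiated, log_lines). Hypothetical model only.
    """
    negotiated = []
    log = []
    for name, (allocated, usage) in groups.items():
        if allocated <= 0:
            log.append(f"Group {name} - skipping, zero slots allocated")
        elif usage >= allocated:
            log.append(
                f"Group {name} - skipping, at or over quota (usage={usage})"
            )
        else:
            negotiated.append(name)
    return negotiated, log


# Before reconfig: group "a" has usage 1 against an assumed allocation of 1.
before = {"<none>": (0, 0), "a": (1, 1), "b": (0, 0)}
# After reconfig: the accountant (incorrectly) reports usage 0 for group "a".
after = {"<none>": (0, 0), "a": (1, 0), "b": (0, 0)}

for label, state in (("before", before), ("after", after)):
    names, lines = groups_to_negotiate(state)
    print(label, "negotiated:", names)
    for entry in lines:
        print("  ", entry)
```

With the zeroed usage, group "a" is no longer skipped, which is the path by which the excess submitters reach negotiation and get recorded in the stats.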

NOTE: the above also assumes that NEGOTIATOR_CONSIDER_PREEMPTION=TRUE

My conclusion is that the negotiator stats in the classad are being maintained correctly. My recommendation is that we close this bug as verified, but I have opened a new one to fix the glitch in the accountant stats for groups on reconfig:
https://bugzilla.redhat.com/show_bug.cgi?id=673592

Comment 11 Lubos Trilety 2011-02-07 15:41:08 UTC
Currently only submitter limits are listed; it would be more intuitive to list both submitter and accounting group limits. A new bug 674669 was created for that.

Tested with (version):
condor-7.4.5-0.8

NEGOTIATOR_UPDATE_INTERVAL = 5
NUM_CPUS = 2
GROUP_NAMES = a, b
GROUP_QUOTA_DYNAMIC_a = 0.8
GROUP_QUOTA_DYNAMIC_b = 0.2
GROUP_AUTOREGROUP_b = TRUE
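
As a rough sanity check on this configuration, the nominal share of each group can be computed as its dynamic quota fraction times the slot count. This is a minimal Python sketch, assuming a single machine whose slot count equals NUM_CPUS; nominal_shares is a hypothetical helper, and how the negotiator rounds fractional shares is not modeled.

```python
def nominal_shares(config_text):
    """Return {group: quota_fraction * slots} for a condor-style config.

    Hypothetical helper. Assumes "KEY = value" lines and that the pool
    consists of exactly NUM_CPUS slots.
    """
    settings = {}
    for line in config_text.splitlines():
        if "=" in line:
            key, value = line.split("=", 1)
            settings[key.strip()] = value.strip()
    slots = int(settings["NUM_CPUS"])
    groups = [g.strip() for g in settings["GROUP_NAMES"].split(",")]
    return {
        g: float(settings[f"GROUP_QUOTA_DYNAMIC_{g}"]) * slots for g in groups
    }


config = """
NEGOTIATOR_UPDATE_INTERVAL = 5
NUM_CPUS = 2
GROUP_NAMES = a, b
GROUP_QUOTA_DYNAMIC_a = 0.8
GROUP_QUOTA_DYNAMIC_b = 0.2
GROUP_AUTOREGROUP_b = TRUE
"""
print(nominal_shares(config))
```

Under these assumptions group "b" is nominally entitled to less than one slot, while the test below submits two jobs to it.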

In the first terminal:
watch "condor_status -subsystem negotiator -l | grep SubmittersShareLimit"

In the second terminal:
echo -e "cmd=/bin/sleep\nargs=1d\n+AccountingGroup=\"b.user\"\nqueue 2" | runuser condor -s /bin/bash -c "condor_submit"

Observe in the first terminal:
LastNegotiationCycleSubmittersShareLimitX = "b.user@hostname"


Tested on:
RHEL5 i386,x86_64  - passed
RHEL4 i386,x86_64  - passed

>>> VERIFIED

