Bug 675697 - reset of group user priority
Summary: reset of group user priority
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: condor
Version: 1.2
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: 2.1
Target Release: ---
Assignee: Erik Erlandson
QA Contact: Daniel Horák
URL:
Whiteboard:
Depends On:
Blocks: 743350
Reported: 2011-02-07 12:10 UTC by Lubos Trilety
Modified: 2012-01-23 17:25 UTC (History)

Fixed In Version: 7.6.4-0.3
Doc Type: Bug Fix
Doc Text:
The check that determined whether a submitter record could safely be deleted from the Accountant contained a logic error. The logic checked whether the priority factor was equal to the default priority factor, DEFAULT_PRIO_FACTOR, which allowed record deletion if the submitter's factor happened to be set to the default. The deletion-checking logic has been corrected in these updated packages so that any explicitly set value for the priority factor, default or otherwise, is now detected. As a result, submitter records are no longer improperly deleted if the user-set priority factor happens to be equal to the DEFAULT_PRIO_FACTOR value.
Clone Of:
Environment:
Last Closed: 2012-01-23 17:25:50 UTC
Target Upstream Version:
Embargoed:


Attachments
NegotiatorLog (177.75 KB, text/plain)
2011-02-07 13:14 UTC, Lubos Trilety


Links
System: Red Hat Product Errata
ID: RHEA-2012:0045
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Red Hat Enterprise MRG Grid 2.1 bug fix and enhancement update
Last Updated: 2012-01-23 22:22:58 UTC

Comment 1 Lubos Trilety 2011-02-07 13:14:19 UTC
Description of problem:
Group user priority change is reset to GROUP_PRIO_FACTOR_<GROUP> by condor if
the changed value is equal to DEFAULT_PRIO_FACTOR value.

Version-Release number of selected component (if applicable):
condor-7.4.5-0.8

How reproducible:
100%

Steps to Reproduce:
1. set the following configuration (the line 'DEFAULT_PRIO_FACTOR = 1' is not
needed, because that is the default setting):
NUM_CPUS = 100
GROUP_NAMES = a1
GROUP_QUOTA_DYNAMIC_a1 = 0.05
GROUP_PRIO_FACTOR_a1 = 4.0
GROUP_AUTOREGROUP_a1 = TRUE

2. start condor

3. set group user priority factor for a1.user2 to 1
# condor_userprio -setfactor a1.user2@`hostname` 1
The priority factor of a1.user2@<hostname> was set to 1.000000

4. run 'condor_userprio -l'
# condor_userprio -l
LastUpdate = 1297075475
Name1 = "a1.user2@<hostname>"
Priority1 = 0.500000
ResourcesUsed1 = 0
WeightedResourcesUsed1 = 0.000000
AccumulatedUsage1 = 0.000000
WeightedAccumulatedUsage1 = 0.000000
BeginUsageTime1 = 0
LastUsageTime1 = 0
PriorityFactor1 = 1.000000
NumSubmittors = 1

5. wait a minute, run 'condor_userprio -l' again
# condor_userprio -l
LastUpdate = 1297075515
NumSubmittors = 0

6. submit 100 jobs for each A1.user1 and A1.user2
# su condor_user -c 'echo -e
"cmd=/bin/sleep\nargs=1d\n+AccountingGroup=\"a1.user1\"\nqueue
100\n+AccountingGroup=\"a1.user2\"\nqueue 100" | condor_submit'
Submitting
job(s)........................................................................................................................................................................................................
200 job(s) submitted to cluster 1.

7. wait until 100 jobs are running, then run 'condor_userprio -l' again
# condor_userprio -l
LastUpdate = 1297075555
Name1 = "a1"
Priority1 = 2.000000
ResourcesUsed1 = 100
...
PriorityFactor1 = 4.000000
Name2 = "a1.user1@<hostname>"
Priority2 = 2.000000
ResourcesUsed2 = 50
...
PriorityFactor2 = 4.000000
Name3 = "a1.user2@<hostname>"
Priority3 = 2.000000
ResourcesUsed3 = 50
...
PriorityFactor3 = 4.000000
NumSubmittors = 3


Actual results:
a1.user1 - resources ... 50
         - priority factor ... 4
a1.user2 - resources ... 50
         - priority factor ... 4


Expected results:
a1.user1 - resources ... 20
         - priority factor ... 4
a1.user2 - resources ... 80
         - priority factor ... 1
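
The expected 20/80 split can be checked with a small worked example. This is only a sketch of the fair-share arithmetic, assuming that slots are divided in inverse proportion to each user's effective priority (real priority multiplied by priority factor) and that both users sit at the same real priority of 0.5, as the Priority and PriorityFactor values in the output above suggest. The function name `expected_slots` is made up for illustration, and group quotas and negotiation rounds are ignored.

```python
def expected_slots(total_slots, real_priority, factors):
    """Split slots in inverse proportion to effective priority.

    Effective priority is modelled as real_priority * factor (an assumption
    of this sketch), so a smaller factor earns a larger share.
    """
    weights = {user: 1.0 / (real_priority * factor)
               for user, factor in factors.items()}
    total = sum(weights.values())
    return {user: round(total_slots * weight / total)
            for user, weight in weights.items()}


# Factor 1 set on a1.user2 is honoured (expected results).
expected = expected_slots(100, 0.5, {"a1.user1": 4.0, "a1.user2": 1.0})

# Factor of a1.user2 reset to GROUP_PRIO_FACTOR_a1 = 4.0 (actual results).
actual = expected_slots(100, 0.5, {"a1.user1": 4.0, "a1.user2": 4.0})

print(expected)
print(actual)
```

With equal factors the weights are equal, which matches the 50/50 split reported under actual results.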

Additional info:
see NegotiatorLog in attachment

Comment 2 Lubos Trilety 2011-02-07 13:14:42 UTC
Created attachment 477407 [details]
NegotiatorLog

Comment 9 Erik Erlandson 2011-09-07 20:48:52 UTC
upstream: https://condor-wiki.cs.wisc.edu/index.cgi/tktview?tn=2442

Comment 10 Erik Erlandson 2011-09-12 21:58:06 UTC
REPRO/TEST

Using the following configuration:

NEGOTIATOR_DEBUG = D_FULLDEBUG
NEGOTIATOR_INTERVAL = 30
SCHEDD_INTERVAL = 15
NEGOTIATOR_USE_SLOT_WEIGHTS = FALSE
GROUP_QUOTA_MAX_ALLOCATION_ROUNDS = 1

NUM_CPUS = 10

GROUP_NAMES = a
GROUP_QUOTA_a = 10
GROUP_PRIO_FACTOR_a = 4.0
GROUP_ACCEPT_SURPLUS_a = TRUE

To reproduce, set the prio-factor on "a.u1" to be 1, and "a.u2" to be 2. Before the fix, the entry for "a.u1" will be deleted on the next accountant update, because its prio-factor was set to DEFAULT_PRIO_FACTOR (=1):

# when the pool starts up, we have only entries for "a" and "<none>"
[eje@rorschach ~]$ condor_userprio -l | grep -e '^Name' -e '^PriorityFactor'
Name1 = "a"
Name2 = "<none>"
PriorityFactor1 = 4.000000
PriorityFactor2 = 1.000000

# now set prio-factor for "a.u1" and "a.u2"
[eje@rorschach ~]$ condor_userprio -setfactor a.u1@localdomain 1
The priority factor of a.u1@localdomain was set to 1.000000
[eje@rorschach ~]$ condor_userprio -setfactor a.u2@localdomain 2
The priority factor of a.u2@localdomain was set to 2.000000

# immediately after setting, we see entries for "a.u1" and "a.u2"
[eje@rorschach ~]$ condor_userprio -l | grep -e '^Name' -e '^PriorityFactor'
Name1 = "a"
Name2 = "<none>"
Name3 = "a.u2@localdomain"
Name4 = "a.u1@localdomain"
PriorityFactor1 = 4.000000
PriorityFactor2 = 1.000000
PriorityFactor3 = 2.000000
PriorityFactor4 = 1.000000

# next time the accountant updates, the entry for "a.u1" is erased, because its prio factor happened to be equal to DEFAULT_PRIO_FACTOR.  "a.u2" still exists.
[eje@rorschach ~]$ condor_userprio -l | grep -e '^Name' -e '^PriorityFactor'
Name1 = "a"
Name2 = "<none>"
Name3 = "a.u2@localdomain"
PriorityFactor1 = 4.000000
PriorityFactor2 = 1.000000
PriorityFactor3 = 2.000000

After the fix, the entry for "a.u1" is not removed, which is the correct behavior:

# accountant entries before any submitters:
[eje@rorschach ~]$ condor_userprio -l | grep -e '^Name' -e '^PriorityFactor'
Name1 = "a"
Name2 = "<none>"
PriorityFactor1 = 4.000000
PriorityFactor2 = 1.000000

# set prio factor for "a.u1" and "a.u2" as before:
[eje@rorschach ~]$ condor_userprio -setfactor a.u1@localdomain 1
The priority factor of a.u1@localdomain was set to 1.000000
[eje@rorschach ~]$ condor_userprio -setfactor a.u2@localdomain 2
The priority factor of a.u2@localdomain was set to 2.000000

# entries immediately after setting prio factors:
[eje@rorschach ~]$ condor_userprio -l | grep -e '^Name' -e '^PriorityFactor'
Name1 = "a"
Name2 = "<none>"
Name3 = "a.u2@localdomain"
Name4 = "a.u1@localdomain"
PriorityFactor1 = 4.000000
PriorityFactor2 = 1.000000
PriorityFactor3 = 2.000000
PriorityFactor4 = 1.000000

# Now we see that both "a.u1" and "a.u2" still exist after update, as expected:
[eje@rorschach ~]$ condor_userprio -l | grep -e '^Name' -e '^PriorityFactor'
Name1 = "a"
Name2 = "<none>"
Name3 = "a.u2@localdomain"
Name4 = "a.u1@localdomain"
PriorityFactor1 = 4.000000
PriorityFactor2 = 1.000000
PriorityFactor3 = 2.000000
PriorityFactor4 = 1.000000
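
The manual comparison above could be automated along these lines. This is a rough sketch, not part of the original test: the helper `parse_userprio` is a made-up name, and it only parses text in the `NameN` / `PriorityFactorN` format shown in the transcripts. The two sample strings are copied from the post-update output before and after the fix.

```python
import re

# Matches lines such as: Name3 = "a.u2@localdomain" or PriorityFactor3 = 2.000000
_LINE = re.compile(r'\s*(Name|PriorityFactor)(\d+)\s*=\s*(.+?)\s*$')


def parse_userprio(text):
    """Map submitter name -> priority factor from 'condor_userprio -l' text."""
    names, factors = {}, {}
    for line in text.splitlines():
        match = _LINE.match(line)
        if not match:
            continue  # skip LastUpdate, NumSubmittors, Priority, etc.
        kind, index, value = match.groups()
        if kind == "Name":
            names[index] = value.strip('"')
        else:
            factors[index] = float(value)
    return {names[i]: factors[i] for i in names if i in factors}


AFTER_UPDATE_BEFORE_FIX = '''Name1 = "a"
Name2 = "<none>"
Name3 = "a.u2@localdomain"
PriorityFactor1 = 4.000000
PriorityFactor2 = 1.000000
PriorityFactor3 = 2.000000'''

AFTER_UPDATE_WITH_FIX = '''Name1 = "a"
Name2 = "<none>"
Name3 = "a.u2@localdomain"
Name4 = "a.u1@localdomain"
PriorityFactor1 = 4.000000
PriorityFactor2 = 1.000000
PriorityFactor3 = 2.000000
PriorityFactor4 = 1.000000'''

broken = parse_userprio(AFTER_UPDATE_BEFORE_FIX)
fixed = parse_userprio(AFTER_UPDATE_WITH_FIX)
print("a.u1 survives before fix:", "a.u1@localdomain" in broken)
print("a.u1 survives after fix:", "a.u1@localdomain" in fixed)
```

In a live pool the text would come from running `condor_userprio -l` after at least one NEGOTIATOR_INTERVAL has elapsed. The sketch takes a string so that it can be tried without a running pool.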

Comment 11 Erik Erlandson 2011-09-12 22:03:05 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Cause:
A logic error in the check that determines whether a submitter record can be safely deleted from the Accountant.  The logic checked for priority factor equal to default priority factor DEFAULT_PRIO_FACTOR, which allowed record deletion if the submitter's factor happened to be set to the default.

Consequence:
A submitter record with an explicitly set priority factor could be deleted inappropriately if that factor was set to DEFAULT_PRIO_FACTOR.

Fix:
The deletion checking logic was corrected so that it detected any explicitly set value for priority factor, default or otherwise.

Result:
Submitter records are no longer improperly deleted if user-set priority factor happens to be equal to DEFAULT_PRIO_FACTOR.
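
The difference between the two checks can be illustrated with a toy model. This is a hypothetical sketch and not the HTCondor Accountant code: the `SubmitterRecord` class, its fields, and the "idle" condition are all assumptions made for illustration. It only shows why comparing the stored value against DEFAULT_PRIO_FACTOR cannot distinguish "never set" from "explicitly set to the default".

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_PRIO_FACTOR = 1.0  # default value named in this report


@dataclass
class SubmitterRecord:
    """Hypothetical stand-in for an Accountant submitter record."""
    name: str
    resources_used: int = 0
    accumulated_usage: float = 0.0
    explicit_factor: Optional[float] = None  # None: factor was never set

    @property
    def factor(self) -> float:
        if self.explicit_factor is None:
            return DEFAULT_PRIO_FACTOR
        return self.explicit_factor


def is_idle(rec: SubmitterRecord) -> bool:
    # Assumed precondition for cleanup: no current or accumulated usage.
    return rec.resources_used == 0 and rec.accumulated_usage == 0.0


def can_delete_buggy(rec: SubmitterRecord) -> bool:
    # Compares the value with the default, so an explicit 1.0 looks unset.
    return is_idle(rec) and rec.factor == DEFAULT_PRIO_FACTOR


def can_delete_fixed(rec: SubmitterRecord) -> bool:
    # Checks whether a factor was explicitly set at all, whatever its value.
    return is_idle(rec) and rec.explicit_factor is None


u1 = SubmitterRecord("a.u1@localdomain", explicit_factor=1.0)
u2 = SubmitterRecord("a.u2@localdomain", explicit_factor=2.0)
unset = SubmitterRecord("a.u3@localdomain")

for rec in (u1, u2, unset):
    print(rec.name, can_delete_buggy(rec), can_delete_fixed(rec))
```

Under this model only "a.u1" behaves differently between the two checks, which mirrors the reproduction in comment 10.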

Comment 13 Daniel Horák 2011-10-13 13:21:48 UTC
Reproduced on RHEL 5.7 i386, following the steps in comment 10:
# rpm -qa | grep condor
  condor-classads-7.6.3-0.3.el5
  condor-7.6.3-0.3.el5

Configuration:
  NEGOTIATOR_DEBUG = D_FULLDEBUG
  NEGOTIATOR_INTERVAL = 30
  SCHEDD_INTERVAL = 15
  NEGOTIATOR_USE_SLOT_WEIGHTS = FALSE
  GROUP_QUOTA_MAX_ALLOCATION_ROUNDS = 1
  NUM_CPUS = 10
  GROUP_NAMES = a
  GROUP_QUOTA_a = 10
  GROUP_PRIO_FACTOR_a = 4.0
  GROUP_ACCEPT_SURPLUS_a = TRUE

# condor_userprio -l | grep -e '^Name' -e '^PriorityFactor'
  Name1 = "a"
  Name2 = "<none>"
  PriorityFactor1 = 4.000000
  PriorityFactor2 = 1.000000

# condor_userprio -setfactor a.u1@$(hostname) 1
  The priority factor of a.u1@<hostname> was set to 1.000000

# condor_userprio -setfactor a.u2@$(hostname) 2
  The priority factor of a.u2@<hostname> was set to 2.000000

# condor_userprio -l | grep -e '^Name' -e '^PriorityFactor'
  Name1 = "a.u1@<hostname>"
  Name2 = "a"
  Name3 = "<none>"
  Name4 = "a.u2@<hostname>"
  PriorityFactor1 = 1.000000
  PriorityFactor2 = 4.000000
  PriorityFactor3 = 1.000000
  PriorityFactor4 = 2.000000

# condor_userprio -l | grep -e '^Name' -e '^PriorityFactor'
  Name1 = "a"
  Name2 = "<none>"
  Name3 = "a.u2@<hostname>"
  PriorityFactor1 = 4.000000
  PriorityFactor2 = 1.000000
  PriorityFactor3 = 2.000000



Verified on RHEL 5.7 i386, following the steps in comment 10:
# rpm -qa | grep condor
  condor-7.6.4-0.7.el5
  condor-classads-7.6.4-0.7.el5

# condor_userprio -l | grep -e '^Name' -e '^PriorityFactor'
  Name1 = "<none>"
  Name2 = "a"
  PriorityFactor1 = 1.000000
  PriorityFactor2 = 4.000000

# condor_userprio -setfactor a.u1@$(hostname) 1
  The priority factor of a.u1@<hostname> was set to 1.000000

# condor_userprio -setfactor a.u2@$(hostname) 2
  The priority factor of a.u2@<hostname> was set to 2.000000

# condor_userprio -l | grep -e '^Name' -e '^PriorityFactor'
  Name1 = "<none>"
  Name2 = "a"
  Name3 = "a.u1@<hostname>"
  Name4 = "a.u2@<hostname>"
  PriorityFactor1 = 1.000000
  PriorityFactor2 = 4.000000
  PriorityFactor3 = 1.000000
  PriorityFactor4 = 2.000000

# condor_userprio -l | grep -e '^Name' -e '^PriorityFactor'
  Name1 = "<none>"
  Name2 = "a"
  Name3 = "a.u1@<hostname>"
  Name4 = "a.u2@<hostname>"
  PriorityFactor1 = 1.000000
  PriorityFactor2 = 4.000000
  PriorityFactor3 = 1.000000
  PriorityFactor4 = 2.000000

Output on platforms RHEL 5.7 x86_64, RHEL 6.1 i386 and RHEL 6.1 x86_64 is similar.


Following the steps in the original description (#0): reproduced on RHEL 5.7 i386 and verified on RHEL 5.7 i386, RHEL 5.7 x86_64, RHEL 6.1 i386 and RHEL 6.1 x86_64 => expected result.



>>> VERIFIED

Comment 14 Douglas Silas 2011-11-16 16:19:35 UTC
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1,11 +1 @@
-Cause:
+The check that determined whether a submitter record could safely be deleted from the Accountant contained a logic error. The logic checked for priority factor equal to default priority factor DEFAULT_PRIO_FACTOR, which allowed record deletion if the submitter's factor happened to be set to the default. The deletion-checking logic has been corrected in these updated packages so that any explicitly-set value for priority factor, default or otherwise, is now detected. As a result, submitter records are no longer improperly deleted if the user-set priority factor happens to be equal to the DEFAULT_PRIO_FACTOR value.
-A logic error in the check that determines whether a submitter record can be safely deleted from the Accountant.  The logic checked for priority factor equal to default priority factor DEFAULT_PRIO_FACTOR, which allowed record deletion if the submitter's factor happened to be set to the default.
-
-Consequence:
-A submitter record with an explicitly set priority factor could be deleted inappropriately if that factor was set to DEFAULT_PRIO_FACTOR.
-
-Fix:
-The deletion checking logic was corrected so that it detected any explicitly set value for priority factor, default or otherwise.
-
-Result:
-Submitter records are no longer improperly deleted if user-set priority factor happens to be equal to DEFAULT_PRIO_FACTOR.

Comment 15 errata-xmlrpc 2012-01-23 17:25:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2012-0045.html

