| Summary: | reset of group user priority | | |
|---|---|---|---|
| Product: | Red Hat Enterprise MRG | Reporter: | Lubos Trilety <ltrilety> |
| Component: | condor | Assignee: | Erik Erlandson <eerlands> |
| Status: | CLOSED ERRATA | QA Contact: | Daniel Horák <dahorak> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 1.2 | CC: | dahorak, iboverma, jneedle, matt, mkudlej, tstclair |
| Target Milestone: | 2.1 | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | 7.6.4-0.3 | Doc Type: | Bug Fix |
| Story Points: | --- | | |
| Clone Of: | | Environment: | |
| Last Closed: | 2012-01-23 17:25:50 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Bug Depends On: | | | |
| Bug Blocks: | 743350 | | |
| Attachments: | | | |

Doc Text:
The check that determined whether a submitter record could safely be deleted from the Accountant contained a logic error. The logic checked for a priority factor equal to the default priority factor DEFAULT_PRIO_FACTOR, which allowed record deletion if the submitter's factor happened to be set to the default. The deletion-checking logic has been corrected in these updated packages so that any explicitly set value for priority factor, default or otherwise, is now detected. As a result, submitter records are no longer improperly deleted if the user-set priority factor happens to be equal to the DEFAULT_PRIO_FACTOR value.
Created attachment 477407 [details]
NegotiatorLog
REPRO/TEST

Using the following configuration:

NEGOTIATOR_DEBUG = D_FULLDEBUG
NEGOTIATOR_INTERVAL = 30
SCHEDD_INTERVAL = 15
NEGOTIATOR_USE_SLOT_WEIGHTS = FALSE
GROUP_QUOTA_MAX_ALLOCATION_ROUNDS = 1
NUM_CPUS = 10
GROUP_NAMES = a
GROUP_QUOTA_a = 10
GROUP_PRIO_FACTOR_a = 4.0
GROUP_ACCEPT_SURPLUS_a = TRUE

To reproduce, set the prio-factor on "a.u1" to 1, and on "a.u2" to 2. Before the fix, the entry for "a.u1" is deleted on the next accountant update, because its prio-factor was set to DEFAULT_PRIO_FACTOR (=1):

# when the pool starts up, we have only entries for "a" and "<none>"
[eje@rorschach ~]$ condor_userprio -l | grep -e '^Name' -e '^PriorityFactor'
Name1 = "a"
Name2 = "<none>"
PriorityFactor1 = 4.000000
PriorityFactor2 = 1.000000

# now set prio-factor for "a.u1" and "a.u2"
[eje@rorschach ~]$ condor_userprio -setfactor a.u1@localdomain 1
The priority factor of a.u1@localdomain was set to 1.000000
[eje@rorschach ~]$ condor_userprio -setfactor a.u2@localdomain 2
The priority factor of a.u2@localdomain was set to 2.000000

# immediately after setting, we see entries for "a.u1" and "a.u2"
[eje@rorschach ~]$ condor_userprio -l | grep -e '^Name' -e '^PriorityFactor'
Name1 = "a"
Name2 = "<none>"
Name3 = "a.u2@localdomain"
Name4 = "a.u1@localdomain"
PriorityFactor1 = 4.000000
PriorityFactor2 = 1.000000
PriorityFactor3 = 2.000000
PriorityFactor4 = 1.000000

# the next time the accountant updates, the entry for "a.u1" is erased,
# because its prio factor happened to equal DEFAULT_PRIO_FACTOR; "a.u2" still exists
[eje@rorschach ~]$ condor_userprio -l | grep -e '^Name' -e '^PriorityFactor'
Name1 = "a"
Name2 = "<none>"
Name3 = "a.u2@localdomain"
PriorityFactor1 = 4.000000
PriorityFactor2 = 1.000000
PriorityFactor3 = 2.000000

After the fix, the entry for "a.u1" does not get removed, which is the correct behavior:

# accountant entries before any submitters:
[eje@rorschach ~]$ condor_userprio -l | grep -e '^Name' -e '^PriorityFactor'
Name1 = "a"
Name2 = "<none>"
PriorityFactor1 = 4.000000
PriorityFactor2 = 1.000000

# set prio factor for "a.u1" and "a.u2" as before:
[eje@rorschach ~]$ condor_userprio -setfactor a.u1@localdomain 1
The priority factor of a.u1@localdomain was set to 1.000000
[eje@rorschach ~]$ condor_userprio -setfactor a.u2@localdomain 2
The priority factor of a.u2@localdomain was set to 2.000000

# entries immediately after setting prio factors:
[eje@rorschach ~]$ condor_userprio -l | grep -e '^Name' -e '^PriorityFactor'
Name1 = "a"
Name2 = "<none>"
Name3 = "a.u2@localdomain"
Name4 = "a.u1@localdomain"
PriorityFactor1 = 4.000000
PriorityFactor2 = 1.000000
PriorityFactor3 = 2.000000
PriorityFactor4 = 1.000000

# now we see that both "a.u1" and "a.u2" still exist after update, as expected:
[eje@rorschach ~]$ condor_userprio -l | grep -e '^Name' -e '^PriorityFactor'
Name1 = "a"
Name2 = "<none>"
Name3 = "a.u2@localdomain"
Name4 = "a.u1@localdomain"
PriorityFactor1 = 4.000000
PriorityFactor2 = 1.000000
PriorityFactor3 = 2.000000
PriorityFactor4 = 1.000000
Technical note added. If any revisions are required, please edit the "Technical Notes" field
accordingly. All revisions will be proofread by the Engineering Content Services team.
New Contents:
Cause:
A logic error in the check that determines whether a submitter record can be safely deleted from the Accountant. The logic checked for priority factor equal to default priority factor DEFAULT_PRIO_FACTOR, which allowed record deletion if the submitter's factor happened to be set to the default.
Consequence:
A submitter record with an explicitly set priority factor could be deleted inappropriately if that factor was set to DEFAULT_PRIO_FACTOR.
Fix:
The deletion-checking logic was corrected so that it detects any explicitly set value for priority factor, default or otherwise.
Result:
Submitter records are no longer improperly deleted if the user-set priority factor happens to be equal to DEFAULT_PRIO_FACTOR.
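The corrected check can be illustrated with a minimal sketch (Python, with hypothetical field names; the actual Accountant logic lives in condor's C++ negotiator and is not reproduced here). The buggy version inferred "not explicitly set" by comparing the factor's value against the default, while the fix tracks whether the factor was ever set at all:

```python
DEFAULT_PRIO_FACTOR = 1.0

def may_delete_buggy(record):
    # Buggy check: treats any factor equal to the default as "not
    # explicitly set", so an idle record whose user-set factor happens
    # to equal DEFAULT_PRIO_FACTOR is considered safe to delete.
    return record["usage"] == 0 and record["factor"] == DEFAULT_PRIO_FACTOR

def may_delete_fixed(record):
    # Corrected check: rely on an explicit "factor was set" flag instead
    # of comparing the value against the default, so an explicitly set
    # factor protects the record even when it equals the default.
    return record["usage"] == 0 and not record["factor_was_set"]

# A zero-usage record whose user-set factor happens to equal the default
# (as for "a.u1" in the reproduction above):
record = {"usage": 0, "factor": 1.0, "factor_was_set": True}
```

With this record, the buggy check wrongly permits deletion while the fixed check does not.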
Reproduced on RHEL 5.7 i386 following the steps in comment #10:
# rpm -qa | grep condor
condor-classads-7.6.3-0.3.el5
condor-7.6.3-0.3.el5
Configuration:
NEGOTIATOR_DEBUG = D_FULLDEBUG
NEGOTIATOR_INTERVAL = 30
SCHEDD_INTERVAL = 15
NEGOTIATOR_USE_SLOT_WEIGHTS = FALSE
GROUP_QUOTA_MAX_ALLOCATION_ROUNDS = 1
NUM_CPUS = 10
GROUP_NAMES = a
GROUP_QUOTA_a = 10
GROUP_PRIO_FACTOR_a = 4.0
GROUP_ACCEPT_SURPLUS_a = TRUE
# condor_userprio -l | grep -e '^Name' -e '^PriorityFactor'
Name1 = "a"
Name2 = "<none>"
PriorityFactor1 = 4.000000
PriorityFactor2 = 1.000000
# condor_userprio -setfactor a.u1@$(hostname) 1
The priority factor of a.u1@<hostname> was set to 1.000000
# condor_userprio -setfactor a.u2@$(hostname) 2
The priority factor of a.u2@<hostname> was set to 2.000000
# condor_userprio -l | grep -e '^Name' -e '^PriorityFactor'
Name1 = "a.u1@<hostname>"
Name2 = "a"
Name3 = "<none>"
Name4 = "a.u2@<hostname>"
PriorityFactor1 = 1.000000
PriorityFactor2 = 4.000000
PriorityFactor3 = 1.000000
PriorityFactor4 = 2.000000
# condor_userprio -l | grep -e '^Name' -e '^PriorityFactor'
Name1 = "a"
Name2 = "<none>"
Name3 = "a.u2@<hostname>"
PriorityFactor1 = 4.000000
PriorityFactor2 = 1.000000
PriorityFactor3 = 2.000000
Verified on RHEL 5.7 i386 following the steps in comment #10:
# rpm -qa | grep condor
condor-7.6.4-0.7.el5
condor-classads-7.6.4-0.7.el5
# condor_userprio -l | grep -e '^Name' -e '^PriorityFactor'
Name1 = "<none>"
Name2 = "a"
PriorityFactor1 = 1.000000
PriorityFactor2 = 4.000000
# condor_userprio -setfactor a.u1@$(hostname) 1
The priority factor of a.u1@<hostname> was set to 1.000000
# condor_userprio -setfactor a.u2@$(hostname) 2
The priority factor of a.u2@<hostname> was set to 2.000000
# condor_userprio -l | grep -e '^Name' -e '^PriorityFactor'
Name1 = "<none>"
Name2 = "a"
Name3 = "a.u1@<hostname>"
Name4 = "a.u2@<hostname>"
PriorityFactor1 = 1.000000
PriorityFactor2 = 4.000000
PriorityFactor3 = 1.000000
PriorityFactor4 = 2.000000
# condor_userprio -l | grep -e '^Name' -e '^PriorityFactor'
Name1 = "<none>"
Name2 = "a"
Name3 = "a.u1@<hostname>"
Name4 = "a.u2@<hostname>"
PriorityFactor1 = 1.000000
PriorityFactor2 = 4.000000
PriorityFactor3 = 1.000000
PriorityFactor4 = 2.000000
Output on platforms RHEL 5.7 x86_64, RHEL 6.1 i386 and RHEL 6.1 x86_64 is similar.
Following comment #0, reproduced on RHEL 5.7 i386 and verified on RHEL 5.7 i386, RHEL 5.7 x86_64, RHEL 6.1 i386 and RHEL 6.1 x86_64 => expected result.
>>> VERIFIED
Technical note updated. If any revisions are required, please edit the "Technical Notes" field
accordingly. All revisions will be proofread by the Engineering Content Services team.
Diffed Contents:
@@ -1,11 +1 @@
-Cause:
-A logic error in the check that determines whether a submitter record can be safely deleted from the Accountant. The logic checked for priority factor equal to default priority factor DEFAULT_PRIO_FACTOR, which allowed record deletion if the submitter's factor happened to be set to the default.
-
-Consequence:
-A submitter record with an explicitly set priority factor could be deleted inappropriately if that factor was set to DEFAULT_PRIO_FACTOR.
-
-Fix:
-The deletion-checking logic was corrected so that it detects any explicitly set value for priority factor, default or otherwise.
-
-Result:
-Submitter records are no longer improperly deleted if the user-set priority factor happens to be equal to DEFAULT_PRIO_FACTOR.
+The check that determined whether a submitter record could safely be deleted from the Accountant contained a logic error. The logic checked for a priority factor equal to the default priority factor DEFAULT_PRIO_FACTOR, which allowed record deletion if the submitter's factor happened to be set to the default. The deletion-checking logic has been corrected in these updated packages so that any explicitly set value for priority factor, default or otherwise, is now detected. As a result, submitter records are no longer improperly deleted if the user-set priority factor happens to be equal to the DEFAULT_PRIO_FACTOR value.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
http://rhn.redhat.com/errata/RHEA-2012-0045.html
Description of problem:
Group user priority change is reset to GROUP_PRIO_FACTOR_<GROUP> by condor if the changed value is equal to the DEFAULT_PRIO_FACTOR value.

Version-Release number of selected component (if applicable):
condor-7.4.5-0.8

How reproducible: 100%

Steps to Reproduce:
1. set (the line 'DEFAULT_PRIO_FACTOR = 1' is not needed, because that is the default setting)
NUM_CPUS = 100
GROUP_NAMES = a1
GROUP_QUOTA_DYNAMIC_a1 = 0.05
GROUP_PRIO_FACTOR_a1 = 4.0
GROUP_AUTOREGROUP_a1 = TRUE
2. start condor
3. set the group user priority factor for a1.user2 to 1
# condor_userprio -setfactor a1.user2@`hostname` 1
The priority factor of a1.user2@<hostname> was set to 1.000000
4. run 'condor_userprio -l'
# condor_userprio -l
LastUpdate = 1297075475
Name1 = "a1.user2@<hostname>"
Priority1 = 0.500000
ResourcesUsed1 = 0
WeightedResourcesUsed1 = 0.000000
AccumulatedUsage1 = 0.000000
WeightedAccumulatedUsage1 = 0.000000
BeginUsageTime1 = 0
LastUsageTime1 = 0
PriorityFactor1 = 1.000000
NumSubmittors = 1
5. wait a minute, run 'condor_userprio -l' again
# condor_userprio -l
LastUpdate = 1297075515
NumSubmittors = 0
6. submit 100 jobs for each of a1.user1 and a1.user2
# su condor_user -c 'echo -e "cmd=/bin/sleep\nargs=1d\n+AccountingGroup=\"a1.user1\"\nqueue 100\n+AccountingGroup=\"a1.user2\"\nqueue 100" | condor_submit'
Submitting job(s)........................................................................................................................................................................................................
200 job(s) submitted to cluster 1.
7. wait until all 100 jobs are running, then run 'condor_userprio -l' again
# condor_userprio -l
LastUpdate = 1297075555
Name1 = "a1"
Priority1 = 2.000000
ResourcesUsed1 = 100
...
PriorityFactor1 = 4.000000
Name2 = "a1.user1@<hostname>"
Priority2 = 2.000000
ResourcesUsed2 = 50
...
PriorityFactor2 = 4.000000
Name3 = "a1.user2@<hostname>"
Priority3 = 2.000000
ResourcesUsed3 = 50
...
PriorityFactor3 = 4.000000
NumSubmittors = 3

Actual results:
a1.user1 - resources ... 50 - priority factor ... 4
a1.user2 - resources ... 50 - priority factor ... 4

Expected results:
a1.user1 - resources ... 20 - priority factor ... 4
a1.user2 - resources ... 80 - priority factor ... 1

Additional info: see NegotiatorLog in attachment
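The expected 20/80 split above follows from fair-share allocation being, to first order, inversely proportional to each submitter's effective priority factor: with factors 4 and 1, a1.user2 should receive four times the resources of a1.user1. A minimal sketch of that arithmetic (a deliberate simplification of the negotiator's actual matchmaking; the function name is hypothetical):

```python
def expected_shares(total_slots, prio_factors):
    # Fair share is modeled here as inversely proportional to the
    # effective priority factor: a lower factor means a larger share.
    weights = {user: 1.0 / f for user, f in prio_factors.items()}
    total = sum(weights.values())
    return {user: round(total_slots * w / total) for user, w in weights.items()}

# If the user-set factor of 1 were preserved for a1.user2 (with the
# group factor 4 applying to a1.user1), the 100 slots should split 20/80:
shares = expected_shares(100, {"a1.user1": 4.0, "a1.user2": 1.0})
```

Because the bug resets a1.user2's factor back to 4, both users end up with equal weights and the observed 50/50 split instead.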