Bug 719019

Summary: Startd RANK will only preempt if user priorities allow it
Product: Red Hat Enterprise MRG
Reporter: Timothy St. Clair <tstclair>
Component: condor
Assignee: Timothy St. Clair <tstclair>
Status: CLOSED ERRATA
QA Contact: Lubos Trilety <ltrilety>
Severity: medium
Docs Contact:
Priority: medium
Version: 1.3
CC: jneedle, ltoscano, ltrilety, matt, mkudlej, tstclair
Target Milestone: 2.0.1   
Target Release: ---   
Hardware: All   
OS: All   
Whiteboard:
Fixed In Version: condor-7.6.3-0.1
Doc Type: Bug Fix
Doc Text:
Cause: Support for slot weight broke startd RANK.
Consequence: Startd RANK will only preempt if user priorities allow it.
Fix: Update negotiator logic which handles evaluation of RANK.
Result: Negotiator will correctly handle RANK.
Last Closed: 2011-09-07 16:44:17 UTC
Bug Blocks: 723887    

Description Timothy St. Clair 2011-07-05 13:56:41 UTC
Description of problem:
Startd RANK preemption only occurs if the preempting user has sufficient user priority to claim another machine

Version-Release number of selected component (if applicable):
7.3.2 and later


Additional info:
Upstream tracking https://condor-wiki.cs.wisc.edu/index.cgi/tktview?tn=2275
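
The behaviour described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the negotiator's actual code: the names Candidate, decide and rank_exempt are hypothetical, and the slot weight of 1.0 is assumed (the logs in this report do not show it). The limit and rank values are taken from the negotiator logs in the comments below, assuming the same RANK setup in both runs.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical, simplified view of one claimed slot considered for a request."""
    slot_weight: float     # cost of the slot against the submitter's share (assumed)
    current_rank: float    # startd RANK of the job currently running on the slot
    requester_rank: float  # startd RANK of the job asking for the slot


def decide(c: Candidate, limit: float, used: float, rank_exempt: bool) -> str:
    """Return the outcome of one match attempt in this toy model."""
    rank_preempts = c.requester_rank > c.current_rank
    over_share = used + c.slot_weight > limit
    if rank_preempts and rank_exempt:
        # Expected behaviour: the startd prefers the new job, so the
        # submitter's fair share does not block the preemption.
        return "preempt by startd rank"
    if over_share:
        # Reported behaviour: the fair-share check wins over startd RANK.
        return "rejected: fair share exceeded"
    return "preempt by startd rank" if rank_preempts else "no rank preemption"


slot = Candidate(slot_weight=1.0, current_rank=0.0, requester_rank=10.0)
print(decide(slot, limit=0.727285, used=0.0, rank_exempt=False))
print(decide(slot, limit=0.727275, used=0.0, rank_exempt=True))
```

With rank_exempt=False the model reproduces the "fair share exceeded" rejection; with rank_exempt=True the request preempts on startd RANK even though the submitter limit is below the assumed slot weight.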

Comment 1 Timothy St. Clair 2011-07-06 20:57:55 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Cause: Support for slot weight broke startd RANK
Consequence: Startd RANK will only preempt if user priorities allow it
Fix: Update negotiator logic which handles evaluation of RANK
Result: Negotiator will correctly handle RANK

Comment 3 Lubos Trilety 2011-08-03 12:40:00 UTC
Successfully reproduced on:
$CondorVersion: 7.6.1 Jun 02 2011 BuildID: RH-7.6.1-0.10.el5 $
$CondorPlatform: X86_64-RedHat_5.6 $

08/03/11 13:56:32   Negotiating with condor_user@host at <IP:54681>
08/03/11 13:56:32 0 seconds so far
08/03/11 13:56:32   Calculating submitter limit with the following parameters
08/03/11 13:56:32     SubmitterPrio       = 9977.242188
08/03/11 13:56:32     SubmitterPrioFactor = 1.000000
08/03/11 13:56:32     submitterShare      = 0.090911
08/03/11 13:56:32     submitterAbsShare   = 0.500000
08/03/11 13:56:32     submitterLimit    = 0.727285
08/03/11 13:56:32     submitterUsage    = 0.000000
08/03/11 13:56:32 Socket to condor_user@host (<IP:54681>) already in cache, reusing
08/03/11 13:56:32     Sending SEND_JOB_INFO/eom
08/03/11 13:56:32     Getting reply from schedd ...
08/03/11 13:56:32     Got JOB_INFO command; getting classad/eom
08/03/11 13:56:32     Request 00003.00000:
08/03/11 13:56:32 matchmakingAlgorithm: limit 0.727285 used 0.000000 pieLeft 0.727285
08/03/11 13:56:32       Rejected 3.0 condor_user@host <IP:54681>: fair share exceeded

Comment 4 Lubos Trilety 2011-08-03 12:45:39 UTC
Tested on:
$CondorVersion: 7.6.3 Jul 27 2011 BuildID: RH-7.6.3-0.3.el5 $
$CondorPlatform: I686-RedHat_5.7 $

$CondorVersion: 7.6.3 Jul 27 2011 BuildID: RH-7.6.3-0.3.el5 $
$CondorPlatform: X86_64-RedHat_5.7 $

$CondorVersion: 7.6.3 Jul 27 2011 BuildID: RH-7.6.3-0.3.el6 $
$CondorPlatform: I686-RedHat_6.1 $

$CondorVersion: 7.6.3 Jul 27 2011 BuildID: RH-7.6.3-0.3.el6 $
$CondorPlatform: X86_64-RedHat_6.1 $


08/03/11 16:19:50   Negotiating with condor_user@host at <IP:58256>
08/03/11 16:19:50 0 seconds so far
08/03/11 16:19:50   Calculating submitter limit with the following parameters
08/03/11 16:19:50     SubmitterPrio       = 9996.630859
08/03/11 16:19:50     SubmitterPrioFactor = 1.000000
08/03/11 16:19:50     submitterShare      = 0.090909
08/03/11 16:19:50     submitterAbsShare   = 0.500000
08/03/11 16:19:50     submitterLimit    = 0.727275
08/03/11 16:19:50     submitterUsage    = 0.000000
08/03/11 16:19:50 Socket to condor_user@host (<IP:58256>) already in cache, reusing
08/03/11 16:19:50     Sending SEND_JOB_INFO/eom
08/03/11 16:19:50     Getting reply from schedd ...
08/03/11 16:19:50     Got JOB_INFO command; getting classad/eom
08/03/11 16:19:50     Request 00003.00000:
08/03/11 16:19:50 matchmakingAlgorithm: limit 0.727275 used 0.000000 pieLeft 0.727275
08/03/11 16:19:50 Start of sorting MatchList (len=8)
08/03/11 16:19:50 Finished sorting MatchList
08/03/11 16:19:50       Preempting condor@host (user prio=999.67, startd rank=0.00) on slot8@host for condor_user@host (user prio=9996.63, startd rank=10.00)
08/03/11 16:19:50       Connecting to startd slot8@host at <IP:47952>
08/03/11 16:19:50       Sending MATCH_INFO/claim id to slot8@host
08/03/11 16:19:50       (Claim ID is "<IP:47952>#1312381128#9#..." )
08/03/11 16:19:50       Sending PERMISSION, claim id, startdAd to schedd
08/03/11 16:19:50       Matched 3.0 condor_user@host <IP:58256> preempting condor@host <IP:47952> slot8@host
08/03/11 16:19:50       Notifying the accountant
08/03/11 16:19:51       Successfully matched with slot8@host

>>> VERIFIED
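
For anyone repeating this check, the "Preempting" line can be parsed to confirm that the preemption was driven by startd RANK rather than by user priority. The sketch below is an assumption-laden helper, not part of Condor: the regular expression is written against the exact line format shown in the log above, and the names PREEMPT_RE and parse_preemption are hypothetical.

```python
import re

# Pattern written against the "Preempting ..." line in the NegotiatorLog excerpt above.
PREEMPT_RE = re.compile(
    r"Preempting (?P<victim>\S+) \(user prio=(?P<victim_prio>[\d.]+), "
    r"startd rank=(?P<victim_rank>[\d.]+)\) on (?P<slot>\S+) for "
    r"(?P<winner>\S+) \(user prio=(?P<winner_prio>[\d.]+), "
    r"startd rank=(?P<winner_rank>[\d.]+)\)"
)


def parse_preemption(line):
    """Return the fields of a 'Preempting' log line as a dict, or None if no match."""
    m = PREEMPT_RE.search(line)
    if m is None:
        return None
    info = m.groupdict()
    for key in ("victim_prio", "victim_rank", "winner_prio", "winner_rank"):
        info[key] = float(info[key])
    return info


line = ("08/03/11 16:19:50       Preempting condor@host (user prio=999.67, "
        "startd rank=0.00) on slot8@host for condor_user@host "
        "(user prio=9996.63, startd rank=10.00)")
info = parse_preemption(line)

# A numerically larger user priority is a worse priority in Condor, so the
# preempting user here has the worse priority but the higher startd rank.
print(info["winner_prio"] > info["victim_prio"])
print(info["winner_rank"] > info["victim_rank"])
```

Both comparisons being true is the condition this bug is about: the job preferred by the startd wins the slot even though its submitter has the worse user priority.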

Comment 5 errata-xmlrpc 2011-09-07 16:44:17 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-1249.html