Bug 716519 - RFE: earlier spin halting when zero pie left
Summary: RFE: earlier spin halting when zero pie left
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: condor
Version: 2.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: low
Target Milestone: 2.0.1
Target Release: ---
Assignee: Erik Erlandson
QA Contact: Jan Sarenik
URL:
Whiteboard:
Depends On:
Blocks: 723887
Reported: 2011-06-24 18:21 UTC by Erik Erlandson
Modified: 2012-03-28 09:43 UTC

Fixed In Version: condor-7.6.3-0.2
Doc Type: Bug Fix
Doc Text:
Cause: Halting conditions for outer negotiator loop are at the end of the loop. Consequence: In case of pieLeft == 0.0, the loop may execute at least once even though there are no slots (pie) to allocate. Change: An earlier check for zero pieLeft was added to halt the loop. Result: Extra loop execution on pieLeft == 0.0 is now avoided.
Clone Of:
Environment:
Last Closed: 2011-09-07 16:43:16 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHSA-2011:1249 | 0 | normal | SHIPPED_LIVE | Moderate: Red Hat Enterprise MRG Grid 2.0 security, bug fix and enhancement update | 2011-09-07 16:40:45 UTC

Description Erik Erlandson 2011-06-24 18:21:26 UTC
Description of problem:

It looks like a performance tweak can be made in the pieLeft logic.
In negotiateWithGroup, we enter the do { negotiate() } while (pieLeft..) loop
and calculate pieLeft at the top of the loop. We then continue through the loop and
eventually hit negotiate() even if pieLeft == 0 && ConsiderPreemption == false. The
tweak would be to break on (pieLeft == 0 && ConsiderPreemption == false).
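
The proposed tweak can be sketched roughly as follows. This is a simplified model, not the actual negotiator source: runOuterLoop, CycleStats, and the earlyHalt flag are hypothetical names introduced for illustration, the counter stands in for the real negotiate() call, and the per-iteration decrement of pieLeft is an assumption made only so the loop terminates.

```cpp
#include <cassert>

// Hypothetical bookkeeping: how many times negotiate() would have run.
struct CycleStats {
    int negotiateCalls = 0;
};

// Simplified model of the outer do/while loop in negotiateWithGroup.
//   pieLeft            - slots (pie) still available to hand out
//   considerPreemption - mirrors NEGOTIATOR_CONSIDER_PREEMPTION
//   earlyHalt          - true models the proposed early check, false the old behavior
CycleStats runOuterLoop(double pieLeft, bool considerPreemption, bool earlyHalt) {
    CycleStats stats;
    do {
        // In the real code pieLeft is calculated here, at the top of the loop.
        // Assumption: "<= 0.0" is used instead of "== 0" to be safe with doubles.
        if (earlyHalt && pieLeft <= 0.0 && !considerPreemption) {
            break;  // nothing to allocate and no preemption: skip negotiate()
        }

        ++stats.negotiateCalls;  // stands in for negotiate()

        // Assumption for the sketch: each pass hands out one slot.
        if (pieLeft > 0.0) {
            pieLeft -= 1.0;
        }
    } while (pieLeft > 0.0);  // the halting condition sits at the end of the loop
    return stats;
}
```

With earlyHalt set to false, a call with pieLeft == 0.0 still runs the body once, which is the extra pass this report is about. With earlyHalt set to true, that pass is skipped unless preemption is being considered.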

Comment 1 Erik Erlandson 2011-06-24 18:26:26 UTC
upstream: https://condor-wiki.cs.wisc.edu/index.cgi/tktview?tn=2267

Comment 2 Timothy St. Clair 2011-07-08 18:30:57 UTC
Posted a patch upstream, per the comments.

Comment 3 Erik Erlandson 2011-07-12 23:27:31 UTC
tracking branch: UPSTREAM-7.7.1-BZ716519-zero-pie-halt

Note: I created this branch off of tracking branch UPSTREAM-7.7.1-BZ712987-submitter-sort, because it involves a change that is adjacent to the new code for BZ712987. So I think UPSTREAM-7.7.1-BZ712987-submitter-sort needs to be merged before UPSTREAM-7.7.1-BZ716519-zero-pie-halt for as long as we carry both.

Comment 4 Erik Erlandson 2011-07-12 23:28:02 UTC
repro/test

Using the following configuration (no group quotas):

CLAIM_WORKLIFE = 0
NEGOTIATOR_CONSIDER_PREEMPTION = FALSE
NEGOTIATOR_DEBUG = D_FULLDEBUG

NEGOTIATOR_INTERVAL = 30
SCHEDD_INTERVAL = 15

NUM_CPUS = 5

Before fix: Spool up a condor pool. No job submissions are required. Observe the negotiator log and see that the loop proceeds even when pieLeft is zero:

$ tail -f NegotiatorLog | grep -e 'Started.*Cycle' -e 'Phase 4' -e 'pieLeft ='
07/12/11 15:31:18 ---------- Started Negotiation Cycle ----------
07/12/11 15:31:18 Phase 4.1:  Negotiating with schedds ...
07/12/11 15:31:18     pieLeft = 0.000
07/12/11 15:31:48 ---------- Started Negotiation Cycle ----------
07/12/11 15:31:48 Phase 4.1:  Negotiating with schedds ...
07/12/11 15:31:48     pieLeft = 0.000

After fix: see that the loop halts early, so neither the Phase 4.1 line nor the pieLeft line is logged:

$ tail -f NegotiatorLog | grep -e 'Started.*Cycle' -e 'Phase 4' -e 'pieLeft ='
07/12/11 15:35:14 ---------- Started Negotiation Cycle ----------
07/12/11 15:35:44 ---------- Started Negotiation Cycle ----------
07/12/11 15:36:14 ---------- Started Negotiation Cycle ----------
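
The manual grep check above could also be automated. The following is a minimal sketch, assuming the log lines look like the excerpts in this comment. countZeroPieNegotiations is a hypothetical helper, not part of Condor; it counts cycles in which a "Phase 4" line was followed by a pieLeft value of zero.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: counts negotiation cycles in which the negotiator
// logged "Phase 4" and then reported a pieLeft of zero, i.e. cycles where
// the loop body ran although there was nothing to allocate.
int countZeroPieNegotiations(const std::vector<std::string>& lines) {
    const std::string key = "pieLeft = ";
    int count = 0;
    bool sawPhase4 = false;  // true once "Phase 4" was seen in the current cycle

    for (const std::string& line : lines) {
        if (line.find("Started Negotiation Cycle") != std::string::npos) {
            sawPhase4 = false;  // a new cycle begins
        } else if (line.find("Phase 4") != std::string::npos) {
            sawPhase4 = true;
        } else {
            std::size_t pos = line.find(key);
            if (pos != std::string::npos && sawPhase4) {
                // Parse the number that follows "pieLeft = ".
                double pie = std::strtod(line.c_str() + pos + key.size(), nullptr);
                if (pie == 0.0) {
                    ++count;
                    sawPhase4 = false;  // count each cycle at most once
                }
            }
        }
    }
    return count;
}
```

Fed the "before fix" excerpt, the helper reports one hit per cycle; fed the "after fix" excerpt, it reports none.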

Comment 5 Erik Erlandson 2011-07-12 23:31:46 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Cause:
Halting conditions for outer negotiator loop are at the end of the loop.

Consequence:
In case of pieLeft == 0.0, the loop may execute at least once even though there are no slots (pie) to allocate.

Change:
An earlier check for zero pieLeft was added to halt the loop.

Result:
Extra loop execution on pieLeft == 0.0 is now avoided.

Comment 7 Jan Sarenik 2011-07-27 12:08:53 UTC
Reproduced with MRG 2.0 package condor-7.6.1-0.10.el6.i686

---------- Started Negotiation Cycle ----------
Phase 4.1:  Negotiating with schedds ...
    pieLeft = 5.000
---------- Started Negotiation Cycle ----------
Phase 4.1:  Negotiating with schedds ...
    pieLeft = 4.000
---------- Started Negotiation Cycle ----------
Phase 4.1:  Negotiating with schedds ...
    pieLeft = 4.000
---------- Started Negotiation Cycle ----------
Phase 4.1:  Negotiating with schedds ...
    pieLeft = 4.000
---------- Started Negotiation Cycle ----------
Phase 4.1:  Negotiating with schedds ...
    pieLeft = 4.000

Comment 8 Jan Sarenik 2011-07-27 12:52:28 UTC
Verified with
  condor-7.6.3-0.2.el6.i686
  condor-7.6.3-0.2.el6.x86_64
  condor-7.6.3-0.2.el5.i686
  condor-7.6.3-0.2.el5.x86_64
  
---------- Started Negotiation Cycle ----------
Phase 4.1:  Negotiating with schedds ...
    pieLeft = 5.000
---------- Started Negotiation Cycle ----------
---------- Started Negotiation Cycle ----------
---------- Started Negotiation Cycle ----------

Comment 9 errata-xmlrpc 2011-09-07 16:43:16 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-1249.html
