Description of problem: It looks like a performance tweak can be made in the pieleft logic. In negotiateWithGroup, we enter the do { negotiate() } while (pieleft..) loop and calculate pieleft at the top of the loop. We then continue through the loop body and eventually call negotiate() even if pieleft==0 && ConsiderPreemption==false. The tweak would be to break out of the loop when (pieleft==0 && ConsiderPreemption==false).
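The proposed tweak can be sketched as a simplified model of the loop. This is not the actual negotiator code: CycleStats, runOuterLoop, and the earlyHalt flag are hypothetical names used only for illustration, the loop's real halting condition is elided in the description above, so the one used here is an assumption, and the sketch tests pieLeft <= 0.0 rather than exact equality.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for negotiator bookkeeping; not HTCondor's real types.
struct CycleStats {
    int iterations = 0;      // loop passes that were started
    int negotiateCalls = 0;  // passes that reached the expensive negotiate() step
};

// Simplified model of the outer do/while loop in negotiateWithGroup.
// pieLeftPerPass holds the value computed for pieLeft at the top of each pass.
// earlyHalt switches the proposed check on or off so both behaviours can be compared.
CycleStats runOuterLoop(const std::vector<double>& pieLeftPerPass,
                        bool considerPreemption,
                        bool earlyHalt) {
    CycleStats stats;
    std::size_t pass = 0;
    double pieLeft = 0.0;
    do {
        // pieLeft is calculated at the top of the loop.
        pieLeft = pass < pieLeftPerPass.size() ? pieLeftPerPass[pass] : 0.0;
        ++pass;
        ++stats.iterations;

        // Proposed tweak: nothing left to hand out and preemption is disabled,
        // so leave before doing any negotiation work.
        if (earlyHalt && pieLeft <= 0.0 && !considerPreemption) {
            break;
        }

        ++stats.negotiateCalls;  // stands in for the call to negotiate()

        // Assumed halting condition; the real one is elided in the report.
    } while (pieLeft > 0.0 && pass < pieLeftPerPass.size());
    return stats;
}
```

With a single pass where pieLeft is zero and preemption is disabled, the model calls negotiate() once without the check and not at all with it. When preemption is enabled, the check does not fire and the loop behaves as before.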
upstream: https://condor-wiki.cs.wisc.edu/index.cgi/tktview?tn=2267
posted a patch upstream according to comments.
tracking branch: UPSTREAM-7.7.1-BZ716519-zero-pie-halt

Note: I created this branch off of tracking branch UPSTREAM-7.7.1-BZ712987-submitter-sort, because it involves a change that is adjacent to the new code for BZ712987. So I think tracking branch UPSTREAM-7.7.1-BZ712987-submitter-sort needs to be merged prior to UPSTREAM-7.7.1-BZ716519-zero-pie-halt while we carry them.
repro/test

Using following configuration (no group quotas):

CLAIM_WORKLIFE = 0
NEGOTIATOR_CONSIDER_PREEMPTION = FALSE
NEGOTIATOR_DEBUG = D_FULLDEBUG
NEGOTIATOR_INTERVAL = 30
SCHEDD_INTERVAL = 15
NUM_CPUS = 5

Before fix: Spool up a condor pool. No job submissions are required. Observe output of negotiator log, see that loop proceeds when pie left is zero:

$ tail -f NegotiatorLog | grep -e 'Started.*Cycle' -e 'Phase 4' -e 'pieLeft ='
07/12/11 15:31:18 ---------- Started Negotiation Cycle ----------
07/12/11 15:31:18 Phase 4.1: Negotiating with schedds ...
07/12/11 15:31:18 pieLeft = 0.000
07/12/11 15:31:48 ---------- Started Negotiation Cycle ----------
07/12/11 15:31:48 Phase 4.1: Negotiating with schedds ...
07/12/11 15:31:48 pieLeft = 0.000

After fix: see that loop halts early:

$ tail -f NegotiatorLog | grep -e 'Started.*Cycle' -e 'Phase 4' -e 'pieLeft ='
07/12/11 15:35:14 ---------- Started Negotiation Cycle ----------
07/12/11 15:35:44 ---------- Started Negotiation Cycle ----------
07/12/11 15:36:14 ---------- Started Negotiation Cycle ----------
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
Cause: Halting conditions for outer negotiator loop are at the end of the loop.
Consequence: In case of pieLeft == 0.0, the loop may execute at least once even though there are no slots (pie) to allocate.
Change: An earlier check for zero pieLeft was added to halt the loop.
Result: Extra loop execution on pieLeft == 0.0 is now avoided.
Reproduced with MRG 2.0 package condor-7.6.1-0.10.el6.i686

---------- Started Negotiation Cycle ----------
Phase 4.1: Negotiating with schedds ...
pieLeft = 5.000
---------- Started Negotiation Cycle ----------
Phase 4.1: Negotiating with schedds ...
pieLeft = 4.000
---------- Started Negotiation Cycle ----------
Phase 4.1: Negotiating with schedds ...
pieLeft = 4.000
---------- Started Negotiation Cycle ----------
Phase 4.1: Negotiating with schedds ...
pieLeft = 4.000
---------- Started Negotiation Cycle ----------
Phase 4.1: Negotiating with schedds ...
pieLeft = 4.000
Verified with
condor-7.6.3-0.2.el6.i686
condor-7.6.3-0.2.el6.x86_64
condor-7.6.3-0.2.el5.i686
condor-7.6.3-0.2.el5.x86_64

---------- Started Negotiation Cycle ----------
Phase 4.1: Negotiating with schedds ...
pieLeft = 5.000
---------- Started Negotiation Cycle ----------
---------- Started Negotiation Cycle ----------
---------- Started Negotiation Cycle ----------
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-1249.html