Bug 491038

Summary: No jobs processed when Negotiator reporting: Over submitter resource limit (0) ... only consider startd ranks
Product: Red Hat Enterprise MRG
Reporter: Charlie Wyse <cwyse>
Component: condor
Assignee: Matthew Farrellee <matt>
Status: CLOSED ERRATA
QA Contact: ppecka <ppecka>
Severity: medium
Priority: low
Version: 1.1
CC: lans.carstensen, ltoscano, matt, ppecka, tao
Target Milestone: 1.2
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2010-03-16 14:31:43 UTC

Description Charlie Wyse 2009-03-19 03:09:42 UTC
Description of problem:
After enabling dynamic provisioning we ran into an issue where machines with resources were not taking jobs.  It looked like our user was hitting a resource limit.

Version-Release number of selected component (if applicable):
condor-7.2.2-0.8.el5

How reproducible:
Very

Steps to Reproduce:
1. Submit a few hundred jobs.
2. Watch them idle.
3. Turn on full debugging for the Negotiator log and look for the resource limit error.
  
Actual results:
Jobs just sitting around and doing nothing.

Expected results:
Jobs running about, free and happy.

Additional info:
Found release notes saying this was fixed in the UW version.
http://www.cs.wisc.edu/condor/manual/v7.2/8_5Stable_Release.html
Fixed a problem in the condor_negotiator in which machines go unassigned when user priorities result in the machines getting split into shares that are rounded down to 0. For example if there are 10 machines and 100 equal priority submitters, then each submitter was getting 0.1 machines, which got rounded down to 0, so no machines were assigned to anybody. The message in the condor_negotiator log in this case was this:

Over submitter resource limit (0) ... only consider startd ranks
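
The arithmetic in the release note can be sketched in a few lines of Python. This is only an illustration of the rounding effect described above, not the condor_negotiator code; the function name `floor_shares` and the equal-split assumption are made up for the example.

```python
import math


def floor_shares(num_machines, num_submitters):
    """Split machines equally among submitters, rounding each share down.

    Hypothetical helper: it mimics the rounding described in the release
    note, not the actual negotiator implementation.
    """
    fair_share = num_machines / num_submitters  # e.g. 10 / 100 = 0.1
    return [math.floor(fair_share)] * num_submitters


limits = floor_shares(10, 100)
# Every submitter ends up with a limit of 0, so no machine is handed out.
print(sum(limits))
```

With 10 machines and 100 submitters every per-submitter limit is 0, which matches the "(0)" in the log message.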

Comment 1 Charlie Wyse 2009-03-19 03:10:35 UTC
After setting the following config variable on our negotiator, everything began to work as normal:
NEGOTIATOR_IGNORE_USER_PRIORITIES = True

Comment 2 Matthew Farrellee 2009-10-28 20:02:53 UTC
This is believed to be fixed in the 7.4 series.

Comment 4 ppecka 2009-11-19 16:07:37 UTC
With condor-7.4.1-0.5.el5, /var/log/condor/NegotiatorLog contains:
Over submitter resource limit (0.000000, used 0.000000) ... only consider startd ranks

With condor-7.2.2-0.9.el5, /var/log/condor/NegotiatorLog contains:
Over submitter resource limit (0) ... only consider startd ranks
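
For anyone checking a NegotiatorLog for this symptom, a small Python sketch that recognises both message forms quoted above may help. The regular expression is inferred only from the two lines in this comment, and `parse_limit` is a hypothetical helper, so other Condor versions may format the message differently.

```python
import re

# Matches "(0)" as logged by 7.2.2 and "(0.000000, used 0.000000)" as logged by 7.4.1.
PATTERN = re.compile(
    r"Over submitter resource limit "
    r"\((?P<limit>[0-9.]+)(?:, used (?P<used>[0-9.]+))?\)"
)


def parse_limit(line):
    """Return (limit, used) from a matching log line, or None if no match.

    `used` is None for the older message form, which does not report it.
    """
    match = PATTERN.search(line)
    if match is None:
        return None
    used = match.group("used")
    return float(match.group("limit")), (float(used) if used is not None else None)


old = "Over submitter resource limit (0) ... only consider startd ranks"
new = "Over submitter resource limit (0.000000, used 0.000000) ... only consider startd ranks"
print(parse_limit(old), parse_limit(new))
```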

Comment 6 ppecka 2009-11-24 13:49:34 UTC
The issue has been fixed on RHEL 4.8 and 5.4 (i386 and x86_64) with these packages:

# rpm -qa | grep condor | sort
condor-7.4.1-0.7.el5
condor-qmf-plugins-7.4.1-0.7.el5

-> VERIFIED

Comment 8 Jeff Needle 2010-03-16 14:31:43 UTC
The fix for this bug was included in the MRG 1.2 release.

Comment 9 Jeff Needle 2010-03-16 14:35:37 UTC
The fix for this bug was included in the MRG 1.2 release.