Bug 673538

Summary: Default NEGOTIATOR_UPDATE_INTERVAL versus NEGOTIATOR_INTERVAL plus NEGOTIATION_CYCLE_STATS_LENGTH
Product: Red Hat Enterprise MRG Reporter: Lubos Trilety <ltrilety>
Component: condor Assignee: Matthew Farrellee <matt>
Status: CLOSED ERRATA QA Contact: Lubos Trilety <ltrilety>
Severity: medium Docs Contact:
Priority: medium    
Version: 1.3 CC: iboverma, jneedle, matt
Target Milestone: 2.0   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: condor-7.6.0-0.2 Doc Type: Bug Fix
Doc Text:
C: Negotiator published statistics every NEGOTIATOR_UPDATE_INTERVAL
C: LastNegotiationCycle* stats can be missed
F: New parameter added, NEGOTIATOR_UPDATE_AFTER_CYCLE, to publish after every cycle
R: Statistics are published on every cycle
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-06-23 15:39:15 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 693778    
Attachments:
Description Flags
patch v1 none

Description Lubos Trilety 2011-01-28 16:08:36 UTC
Description of problem:
Default values are:
NEGOTIATOR_UPDATE_INTERVAL = 300
NEGOTIATOR_INTERVAL = 20
NEGOTIATION_CYCLE_STATS_LENGTH = 3

This means that a negotiation cycle runs every 20 seconds (NEGOTIATOR_INTERVAL), and the negotiator remembers statistics from the last three cycles (NEGOTIATION_CYCLE_STATS_LENGTH). In other words, the negotiator retains statistics only for roughly the last minute. However, it sends those statistics to the collector only every 5 minutes (NEGOTIATOR_UPDATE_INTERVAL).

These settings lead to a situation where some statistics are never propagated to the collector and thus are never shown to the user.
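The mismatch can be illustrated with a small simulation. This is only a sketch under simplifying assumptions that are not part of the report: cycles are treated as instantaneous and finishing exactly every NEGOTIATOR_INTERVAL seconds, updates are sent exactly every NEGOTIATOR_UPDATE_INTERVAL seconds, and both timers start at zero. The function name visible_cycles and the 600-second horizon are hypothetical choices for illustration.

```python
def visible_cycles(update_interval, cycle_interval, stats_length, horizon):
    """Return the 1-based indices of cycles whose stats reach the collector.

    Simplified model (an assumption): cycle k finishes at k * cycle_interval,
    and an update is sent at every multiple of update_interval up to horizon.
    """
    seen = set()
    for t in range(update_interval, horizon + 1, update_interval):
        last = t // cycle_interval                # newest cycle finished by time t
        first = max(1, last - stats_length + 1)   # oldest cycle still remembered
        seen.update(range(first, last + 1))
    return seen


horizon = 600                       # ten minutes of simulated time
total_cycles = horizon // 20        # 30 cycles at NEGOTIATOR_INTERVAL = 20

# Defaults from the report: update every 300 s, remember 3 cycles.
default_seen = visible_cycles(300, 20, 3, horizon)
# Same setup with NEGOTIATOR_UPDATE_INTERVAL = 5.
frequent_seen = visible_cycles(5, 20, 3, horizon)

print(len(default_seen), "of", total_cycles)    # 6 of 30
print(len(frequent_seen), "of", total_cycles)   # 30 of 30
```

Under this model only the three cycles preceding each update are ever published, so a rejection that happens in any other cycle is lost before the collector sees it.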


Version-Release number of selected component (if applicable):
condor-7.4.5-0.7

How reproducible:
100%

Steps to Reproduce:
1. Set the condor configuration:
NUM_CPUS = 2
GROUP_NAMES = a
GROUP_QUOTA_a = 1

2. Wait until the negotiator updates its ClassAd on the collector, and then submit a job:
# echo -e "universe=vanilla\ncmd=/bin/sleep\nargs=30\n+AccountingGroup=\"a.u1\"\nqueue\n+AccountingGroup=\"a.u2\"\nqueue" | runuser condor -s /bin/bash -c condor_submit

3. Wait until the negotiator updates again, and then look at the LastNegotiationCycleRejections statistics:
# condor_status -subsystem negotiator -l | grep -i CycleRejections
LastNegotiationCycleRejections0 = 0
LastNegotiationCycleRejections1 = 0
LastNegotiationCycleRejections2 = 0
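
When repeating step 3 many times, the check can be scripted. The following is a minimal sketch that parses text in the format shown above and reports whether any rejection was published. The helper name parse_rejections is hypothetical, and the sample string is simply the output quoted in this report, not a fresh run.

```python
import re


def parse_rejections(status_output):
    """Map each LastNegotiationCycleRejectionsN index to its integer value."""
    pattern = re.compile(r"^LastNegotiationCycleRejections(\d+)\s*=\s*(\d+)$")
    result = {}
    for line in status_output.splitlines():
        match = pattern.match(line.strip())
        if match:
            result[int(match.group(1))] = int(match.group(2))
    return result


# Output quoted in step 3 of this report.
sample = """LastNegotiationCycleRejections0 = 0
LastNegotiationCycleRejections1 = 0
LastNegotiationCycleRejections2 = 0"""

rejections = parse_rejections(sample)
any_rejection = any(value > 0 for value in rejections.values())
print(rejections, any_rejection)
```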

  
Actual results:
No rejection is visible on the collector.

Expected results:
Change the default so that the user can see all available statistics from the negotiator, OR add a warning to the documentation that this can happen.

Additional info:
If the scenario is run with NEGOTIATOR_UPDATE_INTERVAL set to 5, it is possible to see LastNegotiationCycleRejectionsX = 1.

Comment 1 Matthew Farrellee 2011-03-04 12:32:13 UTC
Created attachment 482283 [details]
patch v1

Comment 2 Matthew Farrellee 2011-03-04 12:33:31 UTC
Submitted NEGOTIATOR_UPDATE_AFTER_CYCLE, defaulting to False, upstream. When true, it will cause the Negotiator to publish an update ad to the Collector after every negotiation cycle.
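
For illustration, opting in to the new behavior would be a single configuration line, in the same style as the settings in the description. The parameter name and its False default come from this comment. Which configuration file it belongs in depends on the deployment and is not specified here.

```
NEGOTIATOR_UPDATE_AFTER_CYCLE = True
```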

Comment 3 Matthew Farrellee 2011-03-04 12:41:07 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
C: Negotiator published statistics every NEGOTIATOR_UPDATE_INTERVAL
C: LastNegotiationCycle* stats can be missed
F: New parameter added, NEGOTIATOR_UPDATE_AFTER_CYCLE, to publish after every cycle
R: Statistics are published on every cycle

Comment 5 Lubos Trilety 2011-04-28 10:35:43 UTC
A new bug 700403 was created for the documentation.

Tested with:
condor-7.6.1-0.1

Tested on:
RHEL5 i386,x86_64  - passed
RHEL6 i386,x86_64  - passed

>>> VERIFIED

Comment 6 errata-xmlrpc 2011-06-23 15:39:15 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2011-0889.html