Bug 689785

Summary: Change default QMF update interval to 30 seconds, special config for submissions
Product: Red Hat Enterprise MRG
Reporter: Trevor McKay <tmckay>
Component: Management_Console_Installation_Guide
Assignee: Alison Young <alyoung>
Status: CLOSED CURRENTRELEASE
QA Contact: ecs-bugs
Severity: medium
Priority: medium
Version: Development
CC: iboverma, jneedle, jsarenik, jskeoch
Target Milestone: 2.0   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Doc Type: Bug Fix
Last Closed: 2011-06-24 01:37:56 UTC
Attachments:
BZ689785 content (flags: none)
BZ689785 content, revised (flags: none)

Description Trevor McKay 2011-03-22 12:57:25 UTC
Description of problem:

QMF traffic and processing load on cumin can be decreased by setting the default QMF update interval for all condor daemons to 30 seconds (currently 10 seconds).

However, the 10 second update interval should be retained for submission objects.  This can be set in the default condor config files for schedd and/or jobserver.
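
A minimal sketch of what this could look like in a condor config file. It assumes that an unprefixed QMF_UPDATE_INTERVAL acts as the default for every daemon and that a SCHEDD.-prefixed entry overrides it for the schedd, following the usual Condor subsystem-prefix convention; neither form is confirmed in this report. The jobserver would need an equivalent override under its own subsystem name, which is not stated here.

```
# Assumed global default for all condor daemons, in seconds
QMF_UPDATE_INTERVAL = 30

# Assumed per-subsystem override: the schedd keeps the 10 second interval
SCHEDD.QMF_UPDATE_INTERVAL = 10
```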

Comment 1 Matthew Farrellee 2011-03-22 19:22:34 UTC
NOTE: The update interval is per agent, not type.

(agent) - (type)*
condor_job_server - JobServer, Submission
MgmtCollectorPlugin - Collector, Grid, Slot
MgmtMasterPlugin - Master
MgmtNegotiatorPlugin - Negotiator
MgmtScheddPlugin - Scheduler, Submitter, JobServer, Submission
MgmtStartdPlugin - Slot (when loaded)
condor_triggerd - CondorTriggerService, CondorTrigger, EventCondorTriggerNotify

MgmtCollectorPlugin publishes Slots when QMF_IGNORE_UPDATE_STARTD_AD=false, default is true.
MgmtStartdPlugin publishes Slots unconditionally.
MgmtScheddPlugin publishes JobServer and Submission when QMF_PUBLISH_SUBMISSIONS=true, which is the default.
condor_job_server publishes JobServer and Submission unconditionally.

Configuration is scoped to Subsystem, not Agent, which means SLOT.QMF_UPDATE_INTERVAL is gibberish. Setting the interval for Slots requires both COLLECTOR.QMF_UPDATE_INTERVAL and STARTD.QMF_UPDATE_INTERVAL, and the Collector configuration change will also impact Grid and Collector objects.
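
As a sketch of the Slot case described above, the two settings might be written as follows. The value of 30 seconds is the interval proposed in this bug and is used here only for illustration.

```
# Slots published by MgmtStartdPlugin
STARTD.QMF_UPDATE_INTERVAL = 30

# Slots published by MgmtCollectorPlugin; this also changes the
# interval for Grid and Collector objects
COLLECTOR.QMF_UPDATE_INTERVAL = 30
```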

Comment 2 Matthew Farrellee 2011-03-23 13:09:15 UTC
NOTE: The NEGOTIATOR_INTERVAL defaults to 60 (seconds) and is often pushed down to 20. The update interval should be <= NEGOTIATOR_INTERVAL to avoid missing stat updates.
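
For illustration, a configuration that keeps this relationship when NEGOTIATOR_INTERVAL is pushed down to 20 might look like the fragment below. The value chosen for the update interval is an assumption; any value at or below NEGOTIATOR_INTERVAL satisfies the constraint.

```
# Negotiation cycle shortened from the 60 second default
NEGOTIATOR_INTERVAL = 20

# Keep the negotiator's QMF update interval <= NEGOTIATOR_INTERVAL
NEGOTIATOR.QMF_UPDATE_INTERVAL = 20
```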

Having multiple, layered sampling intervals is unfortunate.

Comment 3 Matthew Farrellee 2011-03-23 19:58:57 UTC
Though the Collector can publish Slot data, it can do so only with a hidden configuration variable ('til now!).

To change the update interval for Slots only, use: STARTD.QMF_UPDATE_INTERVAL

See bug 690285 for a request to allow class level interval adjustments.

Comment 4 Trevor McKay 2011-03-30 17:26:25 UTC
Note, this is actually going to be handled as a configuration change and covered as part of the Management Console Installation Guide section on configuring for scale.  No code changes here.

Comment 5 Trevor McKay 2011-04-04 19:57:46 UTC
Created attachment 489846
BZ689785 content

Comment 6 Trevor McKay 2011-04-07 00:56:44 UTC
Created attachment 490442
BZ689785 content, revised

Comment 7 Alison Young 2011-04-07 05:33:33 UTC
Change made in revision 0.1-2, build: Red_Hat_Enterprise_MRG-Management_Console_Installation_Guide-2.0-web-en-US-0.1-2.el5


Code snippet: 
<section id="sect-Management_Console_Installation_Guide-Medium_Deployment_Configuration-Increase_Default_QMF_Update_Interval_for_GRID_Components">
<title>Increasing the Default QMF Update Interval for &GRID; Components</title>
<para>
	The default QMF update interval for &GRID; components is 10 seconds. This interval affects how frequently &GRID; notifies the &CONSOLE; of changes in status. Increasing this interval for certain components can noticeably decrease load on the &CONSOLE;. Edit the <filename>/etc/condor/config.d/40QMF.config</filename> file created in <xref linkend="chap-Management_Console_Installation_Guide-Using_the_CONSOLE" /> to add the following recommended setting for a medium scale deployment:
</para>
<programlisting>
STARTD.QMF_UPDATE_INTERVAL = 30
</programlisting>
<important>
<para>
	The <command>NEGOTIATOR.QMF_UPDATE_INTERVAL</command> should be less than or equal to the <command>NEGOTIATOR_INTERVAL</command> (which defaults to 60 seconds). If either of these intervals is modified, check that this relationship still holds.
</para>
</important>
</section>