Bug 867989
| Summary: | Cumin missing scheduler stats | ||
|---|---|---|---|
| Product: | Red Hat Enterprise MRG | Reporter: | Pete MacKinnon <pmackinn> |
| Component: | condor-qmf | Assignee: | Pete MacKinnon <pmackinn> |
| Status: | CLOSED ERRATA | QA Contact: | Daniel Horák <dahorak> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | Development | CC: | dahorak, eerlands, ltoscano, matt, sgraf, tmckay, tstclair |
| Target Milestone: | 2.3 | ||
| Target Release: | --- | ||
| Hardware: | All | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | condor-7.8.7-0.1 | Doc Type: | Bug Fix |
| Doc Text: |
Cause: Upstream changes in HTCondor modified the names of various condor_schedd daemon ClassAd statistical attributes.
Consequence: Statistics for the QMF scheduler object show 0 values for attributes that should be non-zero.
Fix: Enhanced the implementation of the QMF schedd plug-in to implicitly map from the old attribute names (7.6 series) to those renamed in the 7.8 series.
Result: Statistics for the QMF scheduler object show correct values for attributes as appropriate.
|
Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2013-03-06 18:47:16 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 869414 | ||
For historical and reference purposes, here's a table that represents a mapping of semantics between previous and current statistic attributes:

| Old Names | New Names |
|---|---|
| WINDOWED_STAT_WIDTH | STATISTICS_WINDOW_SECONDS // quantized to schedd_stats_window_quantum = 200 |
| WindowedStatWidth | RecentStatsLifetime |
| JobsSubmitted | RecentJobsSubmitted |
| JobsSubmittedCumulative | JobsSubmitted |
| JobsStarted | RecentJobsStarted |
| JobsStartedCumulative | JobsStarted |
| JobsExited | RecentJobsExited |
| JobsExitedCumulative | JobsExited |
| JobsCompleted | RecentJobsCompleted |
| JobsCompletedCumulative | JobsCompleted |
| ShadowExceptions | RecentJobsExitException |
| ShadowExceptionsCumulative | JobsExitException |
| <null> | RecentJobsAccumTimeToStart |
| SumTimeToStartCumulative | JobsAccumTimeToStart |
| <null> | RecentJobsAccumRunningTime |
| SumRunningTimeCumulative | JobsAccumRunningTime |
| JobSubmissionRate | RecentJobsSubmitted / RecentStatsLifetime |
| JobCompletionRate | RecentJobsCompleted / RecentStatsLifetime |
| JobStartRate | RecentJobsStarted / RecentStatsLifetime |
| MeanTimeToStart | RecentJobsAccumTimeToStart / RecentJobsStarted |
| MeanTimeToStartCumulative | JobsAccumTimeToStart / JobsStarted |
| MeanRunningTime | RecentJobsAccumRunningTime / RecentJobsCompleted |
| MeanRunningTimeCumulative | JobsAccumRunningTime / JobsCompleted |
| UpdateTime | <null> // subtract StatsLastUpdateTime from consecutive ads |

Upstream confirms that recent/windowed stats make use of a ring buffer whose behavior is essentially equivalent to the previous windowed stat behavior, but with lower resolution in time. The main impact is that when a ring-buffer bin falls off the back end of the time window, it can cause a larger step-function drop in the value. How visible this is to anybody consuming the stats depends on how the timing interacts with the ad publication interval. Upstream confirmed that they are amenable to exposing the quantization level to configuration.
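The step-function drop described above can be illustrated with a toy windowed counter. This is only a sketch of the general ring-buffer idea, not HTCondor's implementation: the class name `RecentCounter`, its methods, and the 600-second window are hypothetical, and only the 200-second quantum is taken from the note in the table.

```python
from collections import deque


class RecentCounter:
    """Toy windowed counter: one bin per quantum, oldest bin dropped whole."""

    def __init__(self, window_seconds, quantum_seconds):
        # Assumes the window is a whole multiple of the quantum.
        n_bins = window_seconds // quantum_seconds
        self.quantum = quantum_seconds
        self.bins = deque([0] * n_bins, maxlen=n_bins)
        self.bin_start = 0  # start time of the newest bin

    def _advance(self, now):
        # Rotate in one empty bin per whole quantum that has elapsed.
        # A deque with maxlen silently discards the oldest bin on append.
        while now - self.bin_start >= self.quantum:
            self.bins.append(0)
            self.bin_start += self.quantum

    def add(self, now, count=1):
        self._advance(now)
        self.bins[-1] += count

    def value(self, now):
        self._advance(now)
        return sum(self.bins)


# Hypothetical 600 s window with the 200 s quantum mentioned above.
counter = RecentCounter(window_seconds=600, quantum_seconds=200)
counter.add(now=0, count=50)    # burst of events in the first bin
counter.add(now=250, count=5)   # a few more in the second bin
before = counter.value(now=599)
after = counter.value(now=600)  # the first bin leaves the window here
print(before, after)
```

In this sketch the whole first bin is discarded at once when it leaves the window, so the reported value drops in a single step rather than decaying gradually. A smaller quantum would mean more bins and smaller steps.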
Exposing the quantization level this way would require relatively little effort to implement and pull back via a tracking branch.

Addressed for Cumin in the QMF schedd plug-in, based on the provided stat mapping.

Are there any visible changes, or is it just an internal change which should lead to "working exactly as before"? Or will the "working exactly as before" be ready when both this and 869414 are fixed?

Ideally, "working exactly as before" with the resolution of 869414. Note, we are planning to include the old->new mapping table in Comment 1 in the tech note.

Are the changes mentioned in comment 1 visible somewhere? I checked the scheduler from qpid-tool and there are attributes (only) from the left "Old" column. Is it ok? (Are the changes made only somewhere internally?)

Internal changes only so that there is zero impact to the QMF schema.

Tested on RHEL 5.9, 6.4 - i386, x86_64.
Compared between:
# rpm -qa | grep -e condor -e cumin -e qmd | sort
condor-7.8.8-0.3.el6.x86_64
condor-aviary-7.8.8-0.3.el6.x86_64
condor-classads-7.8.8-0.3.el6.x86_64
condor-qmf-7.8.8-0.3.el6.x86_64
cumin-0.1.5648-1.el6.noarch
and:
# rpm -qa | grep -e condor -e cumin -e qmd | sort
condor-7.6.5-0.22.el6.i686
condor-aviary-7.6.5-0.22.el6.i686
condor-classads-7.6.5-0.22.el6.i686
condor-qmf-7.6.5-0.22.el6.i686
cumin-0.1.5444-3.el6.noarch
And it is "working exactly as before".
>>> VERIFIED
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHSA-2013-0564.html |
Scheduler statistical attributes are internally computed and represented slightly differently between the upstream 7.6 and 7.8 series. This BZ represents work within the QMF scheduler plugin to ensure that the existing statistical attributes exposed to QMF are mapped and represented the same as they were in previous versions, particularly as consumed by Cumin. For example, under "Job scheduler info", the Job submission rate, Job start rate, Job completion rate, and Mean time to start should all have non-zero values when the pool/scheduler is at steady state and processing jobs.
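As a rough illustration of the old-to-new mapping from Comment 1, the 7.6-style values could be derived from a 7.8-style attribute set along these lines. This is a minimal sketch, not the plug-in's actual code: the function name `legacy_schedd_stats`, the use of a plain dict to stand in for the schedd ClassAd, and the sample values are all assumptions made for illustration. `UpdateTime` is omitted because it needs two consecutive ads.

```python
def legacy_schedd_stats(ad):
    """Derive 7.6-style statistic names from a dict of 7.8-style attributes."""

    def ratio(num, den):
        # Report 0.0 when the denominator is missing or zero (idle scheduler).
        return ad.get(num, 0) / ad[den] if ad.get(den) else 0.0

    return {
        # Renamed attributes: old windowed name <- new "Recent" name,
        # old "Cumulative" name <- new unprefixed name.
        "WindowedStatWidth": ad.get("RecentStatsLifetime", 0),
        "JobsSubmitted": ad.get("RecentJobsSubmitted", 0),
        "JobsSubmittedCumulative": ad.get("JobsSubmitted", 0),
        "JobsStarted": ad.get("RecentJobsStarted", 0),
        "JobsStartedCumulative": ad.get("JobsStarted", 0),
        "JobsExited": ad.get("RecentJobsExited", 0),
        "JobsExitedCumulative": ad.get("JobsExited", 0),
        "JobsCompleted": ad.get("RecentJobsCompleted", 0),
        "JobsCompletedCumulative": ad.get("JobsCompleted", 0),
        "ShadowExceptions": ad.get("RecentJobsExitException", 0),
        "ShadowExceptionsCumulative": ad.get("JobsExitException", 0),
        "SumTimeToStartCumulative": ad.get("JobsAccumTimeToStart", 0),
        "SumRunningTimeCumulative": ad.get("JobsAccumRunningTime", 0),
        # Derived attributes: computed from the new names.
        "JobSubmissionRate": ratio("RecentJobsSubmitted", "RecentStatsLifetime"),
        "JobCompletionRate": ratio("RecentJobsCompleted", "RecentStatsLifetime"),
        "JobStartRate": ratio("RecentJobsStarted", "RecentStatsLifetime"),
        "MeanTimeToStart": ratio("RecentJobsAccumTimeToStart", "RecentJobsStarted"),
        "MeanTimeToStartCumulative": ratio("JobsAccumTimeToStart", "JobsStarted"),
        "MeanRunningTime": ratio("RecentJobsAccumRunningTime", "RecentJobsCompleted"),
        "MeanRunningTimeCumulative": ratio("JobsAccumRunningTime", "JobsCompleted"),
    }


# Made-up sample values, for illustration only.
sample_ad = {
    "RecentStatsLifetime": 600,
    "RecentJobsSubmitted": 30,
    "RecentJobsStarted": 20,
    "RecentJobsCompleted": 10,
    "RecentJobsAccumTimeToStart": 100,
    "JobsSubmitted": 500,
}
legacy = legacy_schedd_stats(sample_ad)
print(legacy["JobSubmissionRate"], legacy["MeanTimeToStart"])
```

Note that a name such as `JobsSubmitted` means the windowed count in the old scheme and the cumulative count in the new one, which is why reading the 7.8 attributes under their old names without a mapping gives misleading or zero values.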