Description of problem:
The RunningJobs and IdleJobs attributes advertised by condor_collector are updated more slowly than other attributes (CurrentJobsRunningAll, HostsTotal, HostsClaimed, ...). It would be nice if their update rate were aligned with that of the other attributes (COLLECTOR_UPDATE_INTERVAL).

Version-Release number of selected component (if applicable):
condor-7.4.4-0.9, all supported architectures.

How reproducible:
Always
The Jobs attributes are aggregated from Schedd updates and the Host attributes from Startds. The information arrives at the Collector at different rates. It isn't desirable to create a sync point for the statistics. What problem is this causing?
The information provided is not coherent. I don't think a sync point should be created, but the update rates of the different sources should at least be closer to each other. Moreover, the current update behavior is a bit strange. Configure condor with:

CONDOR_DEVELOPERS_COLLECTOR = localhost
COLLECTOR_UPDATE_INTERVAL = 10

and submit this simple job:

----
universe = vanilla
executable = /bin/sleep
arguments = 10
Queue 10
----

IdleJobs is updated to 10 (almost) immediately, as are HostsClaimed/HostsUnclaimed and CurrentJobsRunningAll. Subsequent updates are quite strange: IdleJobs and RunningJobs do not change (with the example above, they are _never_ updated, even if there is only one slot); but if the jobs are held or removed, IdleJobs changes to 0 when Host* and CurrentJobsRunningAll are updated.
COLLECTOR_UPDATE_INTERVAL defines when the Collector calculates the aggregates, so a delayed publish from a Schedd (SCHEDD_UPDATE_INTERVAL) could easily result in the inconsistency seen. I'm inclined to close this as NOTABUG, though the behavior may be confusing. I would welcome an RFE covering a means to rationalize the data published by the Collector, possibly something similar to bug 673179.
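
To illustrate the timing argument, here is a rough toy model in Python. It is not Condor code: the function names, the 300-second Schedd publish interval, and the assumption that one job leaves the idle queue every 10 seconds are all made up for illustration. It only shows how an aggregate computed every 10 seconds can keep repeating a stale value when its source publishes less often.

```python
def true_idle_jobs(t):
    """Toy ground truth: 10 jobs queued at t=0, one leaves the idle queue every 10 s."""
    return max(0, 10 - t // 10)


def last_publish(t, interval):
    """Most recent publish time at or before t for a daemon publishing every `interval` s from t=0."""
    return (t // interval) * interval


def collector_idle_jobs(t, collector_interval, schedd_interval):
    """IdleJobs as a collector would aggregate it at its latest cycle at or before t.

    The collector can only use the newest Schedd ad it has received, which
    (in this simplified model) was published at a multiple of schedd_interval.
    """
    aggregate_time = last_publish(t, collector_interval)
    schedd_ad_time = last_publish(aggregate_time, schedd_interval)
    return true_idle_jobs(schedd_ad_time)


if __name__ == "__main__":
    # Hypothetical intervals: collector aggregates every 10 s, Schedd publishes every 300 s.
    for t in (0, 30, 60, 90):
        print(t, true_idle_jobs(t), collector_idle_jobs(t, 10, 300))
```

With these assumed intervals the aggregated IdleJobs stays at 10 for the whole run while the real count drops to 1 by t=90. Setting both intervals to 10 in the model makes the two columns agree, which is roughly the alignment the reporter is asking for.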