Description of problem:
The metrics aggregation that runs during the DataPurgeJob deadlocks on a machine with a single processor core. The thread pool size used during aggregation is configured as follows:

    private int numAggregationWorkers = Math.min(
        Integer.parseInt(System.getProperty("rhq.metrics.aggregation.workers", "4")),
        Runtime.getRuntime().availableProcessors());

If Runtime.getRuntime().availableProcessors() returns 1, the pool has a single thread and the aggregation deadlocks.

BatchAggregationScheduler runs as a thread pool task. It queries metrics_index and schedules aggregation tasks. Reads are throttled using a Semaphore. The relevant code looks like this:

    for (Row row : rows) {
        aggregationState.getPermits().acquire();
        // schedule tasks...
    }

Permits are released by the tasks that get scheduled. With a single thread, those tasks cannot run until BatchAggregationScheduler finishes, so it blocks indefinitely once there are no more permits.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
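The stall described above can be reproduced outside RHQ with a small stand-alone sketch. This is not the RHQ code: the class name AggregationDeadlockDemo, the method runScheduler, and the permit and row counts are hypothetical, and it assumes Java 8 or later for the lambda syntax. It also uses tryAcquire with a timeout where the real scheduler calls acquire(), so that the demo reports the stall instead of hanging.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of RHQ.
class AggregationDeadlockDemo {

    // Returns true if the scheduler task processed every row,
    // false if it stalled waiting for a permit.
    static boolean runScheduler(int workers, int permits, int rows) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        Semaphore semaphore = new Semaphore(permits);
        try {
            // The scheduler itself occupies one pool thread, as BatchAggregationScheduler does.
            Future<Boolean> scheduler = pool.submit(() -> {
                for (int i = 0; i < rows; i++) {
                    // Assumption: a timeout replaces the blocking acquire() so the demo terminates.
                    if (!semaphore.tryAcquire(500, TimeUnit.MILLISECONDS)) {
                        return false;
                    }
                    // Each scheduled task releases its permit when it runs.
                    pool.execute(() -> semaphore.release());
                }
                return true;
            });
            return scheduler.get();
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("1 worker completed:  " + runScheduler(1, 2, 5));
        System.out.println("2 workers completed: " + runScheduler(2, 2, 5));
    }
}
```

With one worker, the release tasks sit in the queue behind the scheduler and never run, so the third acquire attempt times out. With two workers, the second thread runs the release tasks and the scheduler is expected to finish.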
Note that this bug ONLY affects RHQ 4.10.0.
Changes have been pushed to master. We now default to 4 threads regardless of the number of processors. If the user overrides the rhq.metrics.aggregation.workers system property with a value of 1, we use 2 instead to avoid the deadlock. master commit hash: 039044395d
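A minimal sketch of the sizing rule described in this comment, for illustration only. It is not the code from the commit: the class name AggregationWorkers and the method resolve are hypothetical, and raising every value below 2 (not just 1) to the minimum is an assumption.

```java
// Hypothetical helper, not the actual RHQ change.
final class AggregationWorkers {
    static final String PROPERTY = "rhq.metrics.aggregation.workers";
    static final int DEFAULT_WORKERS = 4;
    // One thread for the scheduler task plus at least one to run the tasks that release permits.
    static final int MIN_WORKERS = 2;

    // Resolves the pool size from the configured property value (null if unset).
    static int resolve(String configured) {
        int requested = (configured == null)
            ? DEFAULT_WORKERS
            : Integer.parseInt(configured.trim());
        // Assumption: any value below the minimum is raised to it.
        return Math.max(MIN_WORKERS, requested);
    }

    public static void main(String[] args) {
        System.out.println(resolve(System.getProperty(PROPERTY)));
    }
}
```

The number of available processors no longer appears in the calculation, so a single-core machine still gets a pool large enough for the scheduler and its tasks to make progress.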
I have also updated the RHQ Server Measurement Subsystem resource type in the rhq server plugin. I have added a minimum value constraint for the AggregationWorkers property. master commit hash: 1beac55fa
Bulk closing of RHQ 4.11 issues, now that RHQ 4.12 is out. If you find an issue with those, please open a new BZ, linking to the old one.