Bug 862447

Summary: Platform plugin times out getting information on 64 processor system
Product: [Other] RHQ Project
Reporter: Elias Ross <genman>
Component: Agent
Assignee: RHQ Project Maintainer <rhq-maint>
Status: CLOSED NOTABUG
QA Contact: Mike Foley <mfoley>
Severity: unspecified
Priority: unspecified
Version: 4.2
CC: hrupp
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Doc Type: Bug Fix
Last Closed: 2012-10-03 19:30:29 UTC
Type: Bug

Description Elias Ross 2012-10-02 23:47:20 UTC
I've been seeing the timeout below on a machine with 64 processors. Each one is described as follows in /proc/cpuinfo:

model name      : Intel(R) Xeon(R) CPU           X7560  @ 2.27GHz

2012-10-02 23:41:08,789 WARN  [MeasurementManager.collector-1] (rhq.core.pc.measurement.MeasurementCollectorRunner)- Failure to collect measurement data for Resource[id=98910, type=CPU, key=43, name=CPU 43, parent=xxx, version=Xeon] - cause: org.rhq.core.pc.inventory.TimeoutException:Call to [org.rhq.plugins.platform.CpuComponent.getValues()] with args [[org.rhq.core.domain.measurement.MeasurementReport@d98c113, [ScheduledMeasurementInfo[res=98910, name=CpuTrait.vendor, sched=1096200], ScheduledMeasurementInfo[res=98910, name=CpuTrait.model, sched=1096370], ScheduledMeasurementInfo[res=98910, name=CpuTrait.mhz, sched=1096262], ScheduledMeasurementInfo[res=98910, name=CpuTrait.cacheSize, sched=1095845], ScheduledMeasurementInfo[res=98910, name=CpuPerc.wait, sched=1095927], ScheduledMeasurementInfo[res=98910, name=CpuPerc.user, sched=1095723], ScheduledMeasurementInfo[res=98910, name=CpuPerc.sys, sched=1095012]]]] timed out. Invocation thread will be interrupted

RHQ 4.5 with the newer Sigar might work better?

It is unclear whether this is related to the number of CPUs, to slowness specific to this processor model, or to a slow kernel.
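
For readers unfamiliar with the pattern behind the log message ("timed out. Invocation thread will be interrupted"), here is a minimal Java sketch of a call that is run on a worker thread, abandoned after a deadline, and then interrupted. This is not RHQ's actual implementation: the class name `TimedCallSketch`, the method `callWithTimeout`, and the sleep that stands in for a slow metric read are all hypothetical and only illustrate the general technique with `java.util.concurrent`.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration only; not taken from the RHQ code base.
final class TimedCallSketch {

    /**
     * Runs the task on a worker thread and waits at most timeoutMillis.
     * On timeout the worker thread is interrupted and TimeoutException is rethrown.
     */
    static <T> T callWithTimeout(Callable<T> task, long timeoutMillis)
            throws TimeoutException, ExecutionException, InterruptedException {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<T> future = executor.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // true = interrupt the thread that is still running the task
            future.cancel(true);
            throw e;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // A fast call completes within the deadline.
        String fast = callWithTimeout(() -> "fast", 1000);
        System.out.println(fast);

        // A slow call (stand-in for a slow metric read) exceeds the deadline.
        try {
            String slow = callWithTimeout(() -> {
                Thread.sleep(5000);
                return "slow";
            }, 100);
            System.out.println(slow);
        } catch (TimeoutException e) {
            System.out.println("timed out");
        }
    }
}
```

With this pattern, anything that makes the underlying call slow (many CPUs to query, a slow kernel interface, or a starved host) surfaces as the same timeout, which is why the log alone does not identify the cause.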

Comment 1 Elias Ross 2012-10-03 19:30:29 UTC
I think this was caused by a problem with the host itself, possibly resource exhaustion such as running out of file handles, or something else.

Closing for now.
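
For anyone who hits the same symptom and wants to test the file-handle theory from Comment 1, the following is a rough sketch of how a JVM can report its own descriptor usage. It assumes a Unix-like host and a JDK that provides `com.sun.management.UnixOperatingSystemMXBean`. The class name `FileHandleCheck` is hypothetical, and the code only sees the JVM it runs in, so it would have to run inside the process of interest to say anything about it.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Hypothetical diagnostic helper; not part of RHQ.
final class FileHandleCheck {

    /**
     * Returns {open, max} file descriptor counts for the current JVM,
     * or null when the platform bean does not expose them.
     */
    static long[] descriptorUsage() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            com.sun.management.UnixOperatingSystemMXBean unix =
                    (com.sun.management.UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        return null;
    }

    public static void main(String[] args) {
        long[] usage = descriptorUsage();
        if (usage == null) {
            System.out.println("Descriptor counts are not available on this JVM.");
        } else {
            System.out.println("open=" + usage[0] + " max=" + usage[1]);
        }
    }
}
```

An open count close to the maximum would be consistent with the resource problem suspected above, though it would not prove that it caused the timeouts.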