Bug 858829 - Client fails to send measurements - ClientCommandSenderTask - always times out with error
Status: NEW
Product: RHQ Project
Classification: Other
Component: Agent
Version: 4.2
Hardware: Unspecified Unspecified
Priority: unspecified
Severity: unspecified
Assigned To: RHQ Project Maintainer
QA Contact: Mike Foley
Reported: 2012-09-19 14:51 EDT by Elias Ross
Modified: 2012-12-20 14:13 EST (History)
Doc Type: Bug Fix
Type: Bug

Description Elias Ross 2012-09-19 14:51:35 EDT
This is on RHQ 4.2, so it may already have been addressed in a later release.

Still, I see this on my agent:

2012-09-19 18:40:47,365 ERROR [ClientCommandSenderTask Thread #4] (enterprise.communications.command.client.ClientCommandSenderTask)- {ClientCommandSenderTask.send-failed}Failed to send command [Command: type=[remotepojo]; cmd-in-response=[false]; config=[{rhq.agent-name=app001, rhq.externalizable-strategy=AGENT, rhq.security-token=, rhq.guaranteed-delivery=true, rhq.send-throttle=true}]; params=[{invocation=NameBasedInvocation[mergeMeasurementReport], targetInterfaceName=org.rhq.core.clientapi.server.measurement.MeasurementServerService}]]. Cause: java.util.concurrent.TimeoutException:null. Cause: java.util.concurrent.TimeoutException
2012-09-19 18:40:47,365 WARN  [ClientCommandSenderTask Thread #4] (enterprise.communications.command.client.ClientCommandSenderTask)- {ClientCommandSenderTask.queuing-failed-command}The command that failed has its guaranteed-delivery flag set so it is being queued again

When the agent starts to report this, it gets into a state where measurements fail to be reported. Restarting the agent DOES NOT WORK.

The only fix I have found is to clean out the container by starting the agent with the

--purgeplugins --purgedata

options.

I need to gather more details.
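
As a starting point for gathering those details, the following is a minimal sketch that tallies agent log entries by level and message key (the `{...}` token right after the logger name). It assumes every entry follows the same layout as the two excerpts above; `LINE_RE` and `tally_message_keys` are hypothetical names made up for this sketch and are not part of RHQ.

```python
import re
from collections import Counter

# Assumed layout, taken from the excerpts in this report:
# "<date> <time> <LEVEL> [<thread>] (<logger>)- {<message-key>}<text>"
LINE_RE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"\[(?P<thread>[^\]]+)\]\s+"
    r"\((?P<logger>[^)]+)\)-\s*"
    r"\{(?P<key>[^}]+)\}"
)


def tally_message_keys(lines):
    """Count log entries per (level, message key); non-matching lines are skipped."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            counts[(match.group("level"), match.group("key"))] += 1
    return counts
```

Passing an open log file to it, for example `tally_message_keys(open(path))` where `path` points at the agent log, should show how often `ClientCommandSenderTask.send-failed` occurs relative to other messages.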
Comment 1 Elias Ross 2012-09-19 15:11:31 EDT
Note: Other commands (like 'get live data' etc.) work from the server side.

I do see this a lot:

2012-09-19 19:09:23,464 DEBUG [ClientCommandSenderTask Timer Thread #0] (enterprise.communications.command.client.JBossRemotingRemoteCommunicator)- {CommandService.remote-pojo-execute-not-permitted}Command not permitted - server reached its limit of concurrent invocations [Command: type=[remotepojo]; cmd-in-response=[false]; config=[{... rhq.guaranteed-delivery=true, rhq.send-throttle=true}]; params=[{invocation=NameBasedInvocation[mergeMeasurementReport], targetInterfaceName=org.rhq.core.clientapi.server.measurement.MeasurementServerService}]]. Retry in [6,000]ms

Might need to tune the server a bit.
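
To judge how much tuning is needed, it may help to see which invocations the server is rejecting and what retry delay it hands back. This is a rough sketch that only assumes the message format shown in the DEBUG line above; `summarize_rejections` and the regex names are hypothetical and are not part of RHQ.

```python
import re
from collections import Counter

# Message key and field formats copied from the DEBUG excerpt in this comment.
NOT_PERMITTED_KEY = "{CommandService.remote-pojo-execute-not-permitted}"
INVOCATION_RE = re.compile(r"NameBasedInvocation\[(?P<method>[^\]]+)\]")
RETRY_RE = re.compile(r"Retry in \[(?P<ms>[\d,]+)\]ms")


def summarize_rejections(lines):
    """Return (rejections per invoked method, set of retry delays in ms)."""
    per_method = Counter()
    delays_ms = set()
    for line in lines:
        if NOT_PERMITTED_KEY not in line:
            continue  # only look at "concurrency limit reached" entries
        invocation = INVOCATION_RE.search(line)
        per_method[invocation.group("method") if invocation else "unknown"] += 1
        retry = RETRY_RE.search(line)
        if retry:
            # The log prints the delay with a thousands separator, e.g. "6,000".
            delays_ms.add(int(retry.group("ms").replace(",", "")))
    return per_method, delays_ms
```

If nearly all rejections turn out to be `mergeMeasurementReport`, that would suggest the limit being hit is specific to measurement reports rather than a general one.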
Comment 2 Elias Ross 2012-12-20 14:13:56 EST
The cause of this appears to be a very busy database. You can increase the number of active threads writing metrics to the database, but this may or may not help.
