Bug 808648
Field | Value
---|---
Summary | Max retries should be configurable in org.rhq.enterprise.agent.AgentMain
Product | [JBoss] JBoss Operations Network
Component | Agent
Version | JON 3.0.0
Status | CLOSED NOTABUG
Severity | low
Priority | unspecified
Target Release | JON 3.2.0
Hardware | Unspecified
OS | Unspecified
Reporter | David van Balen <dvanbale>
Assignee | RHQ Project Maintainer <rhq-maint>
QA Contact | Mike Foley <mfoley>
CC | dvanbale, mazz, myarboro
Doc Type | Bug Fix
Last Closed | 2013-03-04 23:48:49 UTC
Description
David van Balen
2012-03-31 02:27:59 UTC
Looks like there was actually an error in the configuration file I was passing the installer (forgot to mention I was passing a config file by setting RHQ_AGENT_CMDLINE_OPTS="-c path-to-config-file"), although I'm not certain what the problem was. Now that I've corrected it, the agent continues to retry connecting to the server beyond the hardcoded 5 tries. Since the error I was seeing is likely to have been a non-recoverable error, I'm not sure this bug is still valid. I'll leave it up to the RHQ team to decide.

Set priority per BZ triage 4/2/2012 (crocuh, loleary, mfoley, asantos).

Mazz, is this configurable?

I don't think the agent completely gives up here. Only under rare conditions will the agent ever completely stop retrying (IIRC, only if it gets a registration error like "missing token, cannot register" or something fatal like that). The agent has been designed and implemented to stay running indefinitely - that is, it should wait indefinitely for a server to come online.

So, after your step #3 in the replication procedure, what happens when you DO start the server? In other words, I would only consider a problem to exist here if you added a step #4 "Start the JON Server" and then the agent NEVER registers and connects. So, what happens when you try that? And what does your agent log file say after starting the JON Server? You may have to wait an additional number of seconds after starting the JON Server before the agent actually registers/connects.

(In reply to comment #1)
> Looks like there was actually an error in the configuration file I was
> passing the installer (forgot to mention I was passing a config file by
> setting RHQ_AGENT_CMDLINE_OPTS="-c path-to-config-file"), although I'm not
> certain what the problem was. Now that I've corrected it, the agent
> continues to retry connecting to the server beyond the hardcoded 5 tries.
> Since the error I was seeing is likely to have been a non-recoverable error,
> I'm not sure this bug is still valid. I'll leave it up to the RHQ team to
> decide.

Ahh.. I didn't read this comment. Right - this is what I was saying in my last comment. The agent will always retry UNLESS there is some fatal error at startup that would simply prevent the agent from ever registering or connecting. A bad startup config might be such a case. I do not consider this bug to be valid - I would close it.
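
For readers following the thread, the behavior described in the comments (keep retrying while the server is unreachable, stop only on a fatal error) can be illustrated with a small, self-contained sketch. This is not the actual AgentMain code: the names RetryLoopSketch, FatalRegistrationException, and connectWithRetry are hypothetical, and the pause between attempts is an arbitrary value chosen for the example.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

public class RetryLoopSketch {

    /** Hypothetical marker for errors that retrying cannot fix (e.g. a missing registration token). */
    public static class FatalRegistrationException extends Exception {
        public FatalRegistrationException(String message) {
            super(message);
        }
    }

    /**
     * Calls connectAttempt until it returns true. Ordinary failures are retried
     * without any upper limit; only a fatal error (or an interrupt) ends the loop.
     * Returns the number of attempts that were made.
     */
    public static int connectWithRetry(Callable<Boolean> connectAttempt, long pauseMillis)
            throws FatalRegistrationException, InterruptedException {
        int attempts = 0;
        while (true) {
            attempts++;
            try {
                if (connectAttempt.call()) {
                    return attempts;
                }
            } catch (FatalRegistrationException | InterruptedException stop) {
                throw stop; // non-recoverable: give up instead of retrying
            } catch (Exception recoverable) {
                // For example, the server is not up yet: fall through and retry.
            }
            Thread.sleep(pauseMillis);
        }
    }

    public static void main(String[] args) throws Exception {
        // Simulated server that only becomes reachable on the 8th attempt,
        // that is, later than a fixed limit of 5 tries would allow.
        int[] calls = {0};
        int attempts = connectWithRetry(() -> {
            calls[0]++;
            if (calls[0] < 8) {
                throw new IOException("server not reachable");
            }
            return true;
        }, 10L);
        System.out.println("connected after " + attempts + " attempts");
    }
}
```

In this sketch, a bad startup configuration would correspond to the fatal branch, while a server that has simply not been started yet corresponds to the recoverable branch that keeps looping.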