Version 1.0 removed agent deletion (removal of the rhq_agent row from the db) in response to several issues around agent/server synchronization (see RHQ-124). This can be problematic in HA, since all agents are considered when distributing agent load across servers. It is only an issue when many platforms are uninventoried, because their agent entries remain in the db but are no longer used. Improvements in resource synchronization may allow us to make agent deletion part of the platform uninventory logic again; this would be the best solution. Another option would be explicit deletion of agents via HAAC. As a workaround, direct SQL can be executed to clean out dead agents (those with no child resources).
SQL to clean out agents with no resources:

    delete from RHQ_FAILOVER_DETAILS
      where id in (select failoverli0_.id
                   from RHQ_FAILOVER_DETAILS failoverli0_
                   inner join RHQ_FAILOVER_LIST failoverli1_
                     on failoverli0_.FAILOVER_LIST_ID=failoverli1_.ID
                   where failoverli1_.AGENT_ID not in (select resource2_.AGENT_ID from RHQ_RESOURCE resource2_)) ;

    delete from RHQ_FAILOVER_LIST failoverli0_
      where failoverli0_.AGENT_ID not in (select resource1_.AGENT_ID from RHQ_RESOURCE resource1_) ;

    delete from RHQ_AGENT a
      where a.id not in (select res.AGENT_ID from RHQ_RESOURCE res)

This can be used to clean up if you've got a lot of stranded agent records messing up HA. It can be run from /admin/test/sql.jsp
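For anyone who wants to rehearse the cleanup before touching a real database, here is a rough sketch in Python against an in-memory SQLite database. The table and column names come from the SQL above; everything else (the reduced column sets, the sample rows, and the function names make_toy_db, stranded_agents and purge_stranded_agents) is made up for illustration and is not RHQ code. It uses NOT EXISTS rather than NOT IN, because in standard SQL a NOT IN subquery that returns any NULL matches no rows. Whether RHQ_RESOURCE.AGENT_ID can actually be NULL in a real install is not confirmed here, so treat that part as a defensive assumption.

```python
import sqlite3


def make_toy_db():
    """Build an in-memory stand-in for the tables named in the cleanup SQL.

    Assumption: only the columns the cleanup touches are modelled.
    """
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE RHQ_AGENT (ID INTEGER PRIMARY KEY, NAME TEXT);
        CREATE TABLE RHQ_RESOURCE (ID INTEGER PRIMARY KEY, AGENT_ID INTEGER);
        CREATE TABLE RHQ_FAILOVER_LIST (ID INTEGER PRIMARY KEY, AGENT_ID INTEGER);
        CREATE TABLE RHQ_FAILOVER_DETAILS (ID INTEGER PRIMARY KEY,
                                           FAILOVER_LIST_ID INTEGER);
        -- Agent 1 still owns a resource; agent 2 is stranded.
        INSERT INTO RHQ_AGENT VALUES (1, 'live-agent'), (2, 'stranded-agent');
        -- Resource 11 has a NULL AGENT_ID (hypothetical) to show the NOT IN pitfall.
        INSERT INTO RHQ_RESOURCE VALUES (10, 1), (11, NULL);
        INSERT INTO RHQ_FAILOVER_LIST VALUES (100, 1), (200, 2);
        INSERT INTO RHQ_FAILOVER_DETAILS VALUES (1000, 100), (2000, 200), (2001, 200);
        """
    )
    return db


# Predicate "no resource row points at this agent id", NULL-safe via NOT EXISTS.
NO_RESOURCES = (
    "NOT EXISTS (SELECT 1 FROM RHQ_RESOURCE WHERE RHQ_RESOURCE.AGENT_ID = {col})"
)


def stranded_agents(db):
    """Dry run: list the names of agents that the purge would delete."""
    sql = (
        "SELECT NAME FROM RHQ_AGENT WHERE "
        + NO_RESOURCES.format(col="RHQ_AGENT.ID")
        + " ORDER BY ID"
    )
    return [row[0] for row in db.execute(sql)]


def purge_stranded_agents(db):
    """Delete failover details, failover lists, then agents, in one transaction."""
    statements = [
        ("details",
         "DELETE FROM RHQ_FAILOVER_DETAILS WHERE FAILOVER_LIST_ID IN "
         "(SELECT ID FROM RHQ_FAILOVER_LIST WHERE "
         + NO_RESOURCES.format(col="RHQ_FAILOVER_LIST.AGENT_ID") + ")"),
        ("lists",
         "DELETE FROM RHQ_FAILOVER_LIST WHERE "
         + NO_RESOURCES.format(col="RHQ_FAILOVER_LIST.AGENT_ID")),
        ("agents",
         "DELETE FROM RHQ_AGENT WHERE "
         + NO_RESOURCES.format(col="RHQ_AGENT.ID")),
    ]
    deleted = {}
    with db:  # commits if every statement succeeds, rolls back otherwise
        for label, sql in statements:
            deleted[label] = db.execute(sql).rowcount
    return deleted


if __name__ == "__main__":
    db = make_toy_db()
    print("would delete:", stranded_agents(db))
    print("rows deleted:", purge_stranded_agents(db))
    print("remaining agents:", db.execute("SELECT NAME FROM RHQ_AGENT").fetchall())
```

Running the dry-run listing first and comparing it against the agents you expect to be stranded is a cheap sanity check before issuing the deletes. The order of the deletes mirrors the SQL above: details first, then failover lists, then agents.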
Targeting fix due to support case
The original code that did this can be seen by doing an svn diff on ResourceManagerBean between svn revs 597 and 639. The change is to the method "public List<Integer> deleteResource(Subject user, Integer resourceId)".
Can we please fix this? We might also want a way to change an agent's name (RHQ_AGENT.name) in the case people want to/need to change the agent's internal name.
This is needed in order to change the agent's internal name. If we fix this issue, then RHQ-1245 might not be needed.
We must fix this in the next release. Too many people are running into issues when they need to purge agents.
The agent is now purged from rhq_agent and the failover tables.
To test:
1. Import an agent.
2. Shut down the agent.
3. Delete the agent's platform.
4. Run "select * from rhq_agent" and make sure the agent row is deleted.
QA Verified, after platform is removed, running the sql listed above shows that agent is gone.
This bug was previously known as http://jira.rhq-project.org/browse/RHQ-914
This bug relates to RHQ-1187
This bug incorporates RHQ-1245