A synchronization issue between JON Agents and Servers, triggered when a database remained DOWN for longer than normal, caused all JBoss ON Agents connected to the database to incorrectly show a DOWN status even though the agents were still running correctly. The only way to fix the issue was to restart all agents so that their availability status was refreshed. A fix to the JON UI now shows JON Agents as UP after communication with the database is restored. This fixes the originally reported issue.
Description of problem:
If the server backend database goes down for more than 15 minutes (e.g. during a database backup performed every day), all JBoss ON Agents are shown as DOWN and have to be restarted to update their availability status. The issue occurs because the inventory between the JBoss ON Agent and Server gets out of sync when the database remains DOWN for a while.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Start JBoss ON Server and Agents.
2. Once startup is complete, stop the server backend database (it should remain stopped for about 20 minutes or more).
3. After restarting the database, communication between server and agents is restored. However, the agents remain DOWN in the JON UI.
Actual results:
JON Agents are shown as DOWN in the JON UI even though they are still running.
Expected results:
After communication is restored, the JON UI should show JON Agents as UP, since they are still running.
Additional info:
Two workarounds are to restart all agents or to execute "inventory --sync" in the JON UI for each one.
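To illustrate the suspected failure mode and why a forced sync clears it, here is a minimal toy model in Python. It is only a sketch under assumptions: the class and method names (ToyServer, ToyAgent, mark_suspect_agents_down, send_changes_only, full_sync) are hypothetical and are not taken from the JBoss ON or RHQ code. It also assumes that the agent only reports availability changes and that the server marks silent agents as DOWN once the database is back.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class ToyServer:
    """Hypothetical stand-in for the server; not real JBoss ON code."""
    db_up: bool = True
    status: Dict[str, str] = field(default_factory=dict)  # agent name -> UP/DOWN

    def receive_report(self, agent: str, avail: str) -> bool:
        # A report can only be stored while the backend database is reachable.
        if not self.db_up:
            return False
        self.status[agent] = avail
        return True

    def mark_suspect_agents_down(self, agents: List[str]) -> None:
        # Assumption: agents that were silent for too long are set to DOWN.
        for name in agents:
            self.status[name] = "DOWN"


@dataclass
class ToyAgent:
    """Hypothetical agent that is always running (its real state is UP)."""
    name: str
    last_sent: Optional[str] = None

    def send_changes_only(self, server: ToyServer) -> None:
        # Assumption: only a change since the last successful send is reported.
        current = "UP"
        if current != self.last_sent and server.receive_report(self.name, current):
            self.last_sent = current

    def full_sync(self, server: ToyServer) -> None:
        # Forget what was sent before, so the current state is sent again.
        self.last_sent = None
        self.send_changes_only(server)


def run_scenario() -> Tuple[str, str]:
    server, agent = ToyServer(), ToyAgent("agent-1")
    agent.send_changes_only(server)            # initial report: UP is stored
    server.db_up = False                       # database outage begins
    agent.send_changes_only(server)            # nothing changed, nothing sent
    server.db_up = True                        # database is restarted
    server.mark_suspect_agents_down([agent.name])
    agent.send_changes_only(server)            # still nothing to report
    stale = server.status[agent.name]          # server view is now wrong
    agent.full_sync(server)                    # models the forced sync workaround
    return stale, server.status[agent.name]


if __name__ == "__main__":
    print(run_scenario())
```

In this model the agent never re-sends UP on its own because, from its point of view, nothing has changed. That would be consistent with the observation that only a restart or a forced sync refreshes the status.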
I think this is the same issue as reported in RHQ Bug 918205.
I would like this to be retested with 3.3. I didn't reproduce it and there has been significant sync work since 3.1.2. I am setting MODIFIED so it can be tested in ER1.
Moving to ON_QA, as this is available to test with the brew build of DR01: https://brewweb.devel.redhat.com//buildinfo?buildID=373993