Bug 1056562

Summary: Inventory between JON Agents and Server gets out of sync after the server backend database backup
Product: [JBoss] JBoss Operations Network Reporter: Amana <ajuricic>
Component: Inventory Assignee: Jay Shaughnessy <jshaughn>
Status: CLOSED CURRENTRELEASE QA Contact: Mike Foley <mfoley>
Severity: medium Docs Contact:
Priority: unspecified    
Version: JON 3.1.2 CC: jshaughn, loleary, mmahoney
Target Milestone: DR01   
Target Release: JON 3.3.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
An issue with synchronization between JON Agents and Servers when a database remained DOWN for longer than normal caused all JBoss ON Agents connected to the database to incorrectly show a DOWN status, when the agents were still running correctly. The only way the issue could be fixed was to restart all agents so their availability status was refreshed. A fix to the JON UI now shows JON Agents as UP after communication is restored to the database. This fixes the originally-reported issue.
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-12-11 14:00:59 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
Embargoed:

Description Amana 2014-01-22 13:24:49 UTC
Description of problem:

If the server backend database is down for more than 15 minutes (e.g. during a database backup performed every day), all JBoss ON Agents are shown as DOWN and must be restarted to update their availability status. The cause is that the inventory between the JBoss ON Agent and Server gets out of sync when the database remains DOWN for a while.

Version-Release number of selected component (if applicable):

JON 3.1.2

How reproducible:

Always

Steps to Reproduce:
1. Start JBoss ON Server and Agents.
2. Once startup has completed, stop the server backend database and leave it stopped for about 20 minutes or more.
3. Restart the database. Communication between server and agents is restored, but the agents remain DOWN in the JON UI.

Actual results:

JON Agents are shown as DOWN in the JON UI although they are still running.

Expected results:

After communication is restored, the JON UI should show the JON Agents as UP, since they are still running.

Additional info:
Workarounds: restart all agents, or execute "inventory --sync" in the JON UI for each one.
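
The behaviour described above can be illustrated with a toy model. This is only a sketch under assumptions that this report does not confirm: that the server marks an agent DOWN once it has not heard from it for longer than a threshold, that agents send availability only when it changes, and that a full sync resends the current state. The names ToyServer, ToyAgent and simulate are hypothetical and are not part of JBoss ON; only the 15-minute threshold and the 20-minute outage come from the report.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# 15 minutes comes from the report; everything else here is an assumed mechanism.
DOWN_AFTER_MINUTES = 15


@dataclass
class ToyServer:
    """Hypothetical server that can only record data while its database is up."""
    db_up: bool = True
    recorded: Dict[str, str] = field(default_factory=dict)   # agent -> "UP"/"DOWN"
    last_ping: Dict[str, int] = field(default_factory=dict)  # agent -> minute

    def ping(self, agent: str, now: int) -> bool:
        if not self.db_up:
            return False  # ping cannot be recorded during the outage
        self.last_ping[agent] = now
        return True

    def store_status(self, agent: str, status: str) -> bool:
        if not self.db_up:
            return False
        self.recorded[agent] = status
        return True

    def mark_silent_agents_down(self, now: int) -> None:
        # Assumed rule: an agent not heard from for too long is recorded as DOWN.
        for agent, seen in self.last_ping.items():
            if now - seen > DOWN_AFTER_MINUTES:
                self.recorded[agent] = "DOWN"


@dataclass
class ToyAgent:
    """Hypothetical agent that normally reports availability only on change."""
    name: str
    status: str = "UP"
    last_sent: Optional[str] = None

    def report_changes(self, server: ToyServer) -> None:
        if self.status != self.last_sent and server.store_status(self.name, self.status):
            self.last_sent = self.status

    def full_sync(self, server: ToyServer) -> None:
        # Resend the current state even though it has not changed locally.
        if server.store_status(self.name, self.status):
            self.last_sent = self.status


def simulate(outage_minutes: int = 20):
    """Return the recorded status before the outage, after it, and after a sync."""
    server, agent = ToyServer(), ToyAgent("agent-1")
    server.ping(agent.name, 0)
    agent.report_changes(server)
    before = server.recorded[agent.name]

    server.db_up = False
    for minute in range(1, outage_minutes + 1):
        server.ping(agent.name, minute)  # fails: database is down
        agent.report_changes(server)     # nothing to send: status unchanged
    server.db_up = True

    now = outage_minutes + 1
    server.mark_silent_agents_down(now)
    server.ping(agent.name, now)
    agent.report_changes(server)         # still nothing to send from the agent's view
    after_outage = server.recorded[agent.name]

    agent.full_sync(server)
    after_sync = server.recorded[agent.name]
    return before, after_outage, after_sync
```

In this model a 20-minute outage leaves the agent recorded as DOWN until the full sync runs, while an outage shorter than the threshold leaves it UP. That matches the symptom and the workarounds listed above, but it does not establish the actual cause in JBoss ON.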

Comment 1 Jay Shaughnessy 2014-04-04 17:31:12 UTC
I think this is the same issue as reported in RHQ Bug 918205.

Comment 2 Jay Shaughnessy 2014-04-10 16:34:39 UTC
I would like this to be retested with 3.3.  I didn't reproduce it and there has been significant sync work since 3.1.2.  I am setting MODIFIED so it can be tested in ER1.

Comment 3 Simeon Pinder 2014-07-31 15:51:46 UTC
Moving to ON_QA, as this is available to test with the brew build of DR01: https://brewweb.devel.redhat.com//buildinfo?buildID=373993