Bug 1473840

Summary: refresh items and relationships not working
Product: [JBoss] Middleware Manager
Reporter: Bill Hirsch <bhirsch>
Component: Inventory
Assignee: Edgar Hernández <ehernand>
Status: CLOSED NOTABUG
QA Contact: Mike Foley <mfoley>
Severity: high
Docs Contact: jstickle
Priority: unspecified
Version: 7.0.0
CC: abonas, bhirsch, ehernand, mmahoney
Target Milestone: ---
Keywords: Triaged
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-11-09 19:36:35 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Bill Hirsch 2017-07-21 20:23:49 UTC
Description of problem:
"Refresh items and Relationships" under CF->Middleware->Providers->Hawkular appears to take no action.  Have run this multiple times over several days, but "last refresh" still shows "11 days".



Version-Release number of selected component (if applicable): 7.0.0


How reproducible:  Always (sort of).  See additional info.


Steps to Reproduce:
1. Deploy an app
2. Select "Refresh items and Relationships" at the provider level
3. Review last refresh time and states of EAP

Actual results:

New deployments are not reflected in CF.  Change of server state in EAP server is not reflected in CF.

Expected results:
Change of state (deployments, server started, stopped, reloaded, etc) should be reflected in CF

Additional info:
While writing this bug, I ran a docker inspect of the middleware mgr container to find the exact version. When I returned to the CF interface about 2 minutes later, the "last refresh" status showed "2 minutes", my missing deployment now showed up, and "server state" showed "stopping", which I had initiated from CloudForms about 10 minutes prior. As I write this, I'm initiating another refresh. This one updated the "last refresh" field, but not the actual state of my EAP server: it has been offline for at least 20 minutes now, but CF still sees it as "Stopping".

Comment 1 Joel Takvorian 2017-07-24 06:55:01 UTC
@Edgar, can you have a look please? It seems that I'm automatically assigned for Middleware mngr/inventory although I'm not working on the hawkular provider.

Comment 2 Edgar Hernández 2017-07-24 23:32:19 UTC
If refresh is not working, it may mean that at some point CF failed to authenticate against middleware mngr (Hawkular). A "connection refused" condition may cause "authentication status" to go into fail state. When this happens, refreshes stop working until authentication becomes normal again. 

As far as I know, CF regularly does authentication checks. This may explain why refreshes started working suddenly. If you need to, you can force an authentication re-check using the "Authentication" button located in the summary page of middleware mngr. This should also trigger a refresh.

With regard to the EAP status being held at "Stopping", this may mean that the availability metric of the server is not going from "up" to "down". Please query Hawkular through its API. If Hawkular reports "up" in the availability metric, that explains why CF reports it in "stopping" state.
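
The availability check described above can be sketched in Python. This is only an illustration under assumptions: it works on data points already fetched from Hawkular and assumes each one is a dict with a numeric `timestamp` and a `value` of "up" or "down". The function names are hypothetical and are not part of CF or Hawkular.

```python
def latest_availability(points):
    """Return the value of the most recent data point, or None if there are none.

    Assumed shape of each point: {"timestamp": <number>, "value": "up" | "down"}.
    """
    if not points:
        return None
    newest = max(points, key=lambda point: point["timestamp"])
    return newest["value"]


def stale_state_explained(points):
    """Return True when the newest availability value is still "up".

    Per the comment above, that would explain CF continuing to show
    "Stopping" for a server that is actually offline.
    """
    return latest_availability(points) == "up"
```

If the newest value is "down" and CF still shows "Stopping", the stale state would have to come from somewhere other than the availability metric.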

Are you running the EAP server in standalone or domain mode?

Comment 3 Alissa 2017-07-25 12:44:00 UTC
Joel, you can reassign this to Edgar.
Bill, can you please specify which version of CFME and which version of Hawkular (services, metrics, inventory) you are running?
This is important because some changes in the Hawkular inventory implementation (which was cancelled as a separate component in the recent version) are not backwards compatible and won't work with all CFME versions.

Comment 4 Bill Hirsch 2017-07-25 14:36:51 UTC
Alissa, CFME 5.8.0.17. For Hawkular: Services 0.38.0, Metrics 0.26.1, Alerts 1.6.2. I'm not sure this is everything you need; if not, can you tell me where to find those versions?

Comment 5 Bill Hirsch 2017-07-25 14:38:22 UTC
Edgar,
I'm not receiving any errors when I reauthenticate.
I am running EAP in standalone mode.
Can you send some examples of how to query the API? I'm not familiar with it.

Comment 6 Edgar Hernández 2017-07-25 18:13:48 UTC
Bill,

If you have ruby, try this script to query Hawkular API: https://gist.github.com/israel-hdez/80a181414f09cf9fcea78dca0777eff6

Otherwise, line #15 of the script should give you an idea of how to use 'curl' to query the Hawkular API from a console (it's a one-line command).
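
For anyone without Ruby or curl at hand, a query of this kind could look roughly like the Python sketch below. It is not a transcription of the gist. The endpoint path `/hawkular/metrics/availability/<id>/raw`, the `Hawkular-Tenant` header, and basic authentication are assumptions about the Hawkular Metrics REST API and should be checked against the installed version; the function names are made up for illustration.

```python
import base64
import json
import urllib.parse
import urllib.request


def build_availability_request(base_url, tenant, metric_id, user, password):
    """Build (but do not send) a GET request for an availability metric.

    Assumptions, not confirmed in this thread: the endpoint path and the
    Hawkular-Tenant header name.
    """
    # Percent-encode the metric id so that spaces and slashes are safe in the path.
    path = "/hawkular/metrics/availability/{}/raw".format(
        urllib.parse.quote(metric_id, safe=""))
    token = base64.b64encode("{}:{}".format(user, password).encode()).decode()
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        headers={
            "Hawkular-Tenant": tenant,
            "Authorization": "Basic " + token,
            "Accept": "application/json",
        },
    )


def fetch_availability(request, timeout=10):
    """Send the request and return the decoded JSON body (empty list if no body)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = response.read().decode("utf-8")
    return json.loads(body) if body else []
```

Passing the result of `build_availability_request(...)` to `fetch_availability` with your own server URL, tenant, availability metric id, and credentials should return the raw data points, provided the assumptions above hold for your installation.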

Regarding reauthentication, if refresh is now working for you, it will be hard to say whether that was the problem. Next time it's not working, just try to reauthenticate and check if it starts working. Maybe we should display authentication status in the UI to make that fact much clearer to the user.

Can you also post hawkular-agent version?

Comment 7 Bill Hirsch 2017-07-26 11:58:54 UTC
(In reply to Edgar Hernández from comment #6)
> Bill,
> 
> If you have ruby, try this script to query Hawkular API:
> https://gist.github.com/israel-hdez/80a181414f09cf9fcea78dca0777eff6
> 
> Else, line #15 of the script should give you an idea on how to use 'curl' to
> query Hawkular API from a console (it's a one line command).
> 
> Regarding reauthentication, if refresh is now working for you, it will be
> hard to say if that was the problem. Next time it's not working, just try to
> reauthenticate and check if it starts working. May be we should display
> authentication status in the UI to make that fact much clearer to the user.
> 
> Can you also post hawkular-agent version?

Agent version is 1.0.0.CR5.
I'm traveling most of the day, but will try the curl query as soon as I can.
Agreed on authentication.
Another issue I noticed - Once the refresh finally worked, a Server disappeared.  Initially, I had 3 Servers - 1 for EAP and 2 for Hawkular.  I didn't realize it shouldn't be like that until the refresh worked and one Hawkular Server disappeared.  Now I'm wondering about my setup and if I had some issues I wasn't aware of during the deployment of the hawkular container.  I may need to wipe everything out and start fresh, documenting each step a bit better as I go.  Perhaps I can recreate that condition.

Comment 9 Edgar Hernández 2017-11-09 19:36:35 UTC
Closing because the issue cannot be replicated and, most likely, it was an issue with the setup.