Bug 1397114

Summary: OPS Tools | Availability Monitoring | OS Checks | The initial deployment of the current OS checks on each overcloud node is not effective and gives misleading information to the OpenStack administrator.
Product: Red Hat OpenStack Reporter: Leonid Natapov <lnatapov>
Component: sensu Assignee: Lars Kellogg-Stedman <lars>
Status: CLOSED CURRENTRELEASE QA Contact: Leonid Natapov <lnatapov>
Severity: urgent Docs Contact:
Priority: unspecified    
Version: 10.0 (Newton) CC: lars, mbracho, mmagr, mrunge, oblaut, sclewis
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-12-16 16:51:01 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:

Description Leonid Natapov 2016-11-21 16:17:58 UTC
OPS Tools | Availability Monitoring | OS Checks | The initial deployment of the current OS checks on each overcloud node is not effective and gives misleading information to the ops tools administrator.


The current oschecks check OpenStack as a unit rather than each overcloud node. The checks run against the virtual IP and test whether a given API responds. Each check therefore runs against only one controller or compute node: the one that holds the virtual IP. So if some controllers or compute nodes are down, or certain services on them are down (HA scenario), OpenStack as a unit is not affected and the oschecks still report an "ok" status.


The problem is that by deploying and running these checks on each overcloud node we give the user misleading information. It looks as if every check is executed on EACH overcloud node and verifies the API status of EACH overcloud node, when in fact the API is checked against only one node.

Moreover, if for some reason one of the controllers or compute nodes has a problem with its OpenStack services, it is still reported as "ok" in the Availability Monitoring UI (Uchiwa).
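
The masking effect described above can be illustrated with a toy model. This is only a sketch: the node names, the health map, and both helper functions are hypothetical and are not taken from oschecks or Sensu.

```python
# Toy model only: node names and the health map below are hypothetical.
from typing import Dict


def vip_check(api_up: Dict[str, bool], vip_holder: str) -> str:
    """Probe only the node that currently holds the virtual IP."""
    return "ok" if api_up[vip_holder] else "critical"


def per_node_check(api_up: Dict[str, bool]) -> Dict[str, str]:
    """Probe the API on every node individually."""
    return {node: ("ok" if up else "critical") for node, up in api_up.items()}


# controller-1 has a broken API, but controller-0 holds the virtual IP.
api_up = {"controller-0": True, "controller-1": False, "controller-2": True}
vip_holder = "controller-0"

# Every node runs the same VIP-based check, so every node reports the same result.
reported = {node: vip_check(api_up, vip_holder) for node in api_up}
actual = per_node_check(api_up)

print("reported per node:", reported)
print("actual per node:  ", actual)
```

In this model every node reports "ok" even though controller-1 is unhealthy, which matches the misleading status described in this report.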

Comment 1 Martin Magr 2016-11-21 16:37:52 UTC
Removing OSP flag as check configuration is server side.

Comment 2 Martin Magr 2016-11-22 12:00:23 UTC
Check configuration is performed on the server side, so this bug cannot be marked for OSP and cannot be a blocker.

Comment 6 Martin Magr 2016-12-08 08:40:03 UTC
Patch was merged to opstools-ansible.

Comment 8 Leonid Natapov 2016-12-12 10:17:20 UTC
The current opstools-ansible build includes systemd checks for OpenStack services on each overcloud node.
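
For illustration, a per-node systemd check could look roughly like the sketch below. This is not the code merged into opstools-ansible; the function names and the unit list are assumptions. It relies only on `systemctl is-active`, which exits with status 0 when a unit is active, and on the usual Sensu/Nagios plugin convention of returning 0 for OK and 2 for CRITICAL.

```python
import subprocess
from typing import Callable, List, Tuple


def is_active(unit: str) -> bool:
    """Return True if `systemctl is-active` reports the unit as active."""
    result = subprocess.run(
        ["systemctl", "is-active", "--quiet", unit],
        check=False,
    )
    return result.returncode == 0


def check_units(
    units: List[str], probe: Callable[[str], bool] = is_active
) -> Tuple[int, str]:
    """Return a (status code, message) pair: 0 = OK, 2 = CRITICAL."""
    failed = [unit for unit in units if not probe(unit)]
    if failed:
        return 2, "CRITICAL: inactive units: " + ", ".join(failed)
    return 0, "OK: all units active"


# Dry run with a stub probe so the sketch works without systemd.
# The unit names and their states are made up for the example.
fake_state = {
    "openstack-nova-compute.service": False,
    "openvswitch.service": True,
}
code, message = check_units(list(fake_state), probe=fake_state.get)
print(code, message)
```

Because such a check runs locally on each node and inspects that node's own services, a failure on one controller or compute node shows up against that node instead of being hidden behind the virtual IP. On a real node, `check_units` would be called with the default probe and its status code passed to `sys.exit`.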