Bug 1290251 - [RFE][OpsTools] We need an availability monitoring solution deployed by director.
Status: CLOSED ERRATA
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 10.0 (Newton)
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: 10.0 (Newton)
Assigned To: Martin Magr
QA Contact: Leonid Natapov
Keywords: FutureFeature, Triaged
Duplicates: 1290250
Depends On: 1379538
Blocks: 1398468
Reported: 2015-12-09 17:57 EST by Nick Barcet
Modified: 2017-07-10 05:37 EDT (History)
Fixed In Version: openstack-tripleo-heat-templates-5.0.0-0.20160907212643.90c852e.1.el7ost
Doc Type: Enhancement
Doc Text:
This update adds support for connecting the overcloud to a monitoring infrastructure: availability monitoring agents (sensu-client) can now be deployed on the overcloud nodes. To enable the agent deployment, use the environment file '/usr/share/openstack-tripleo-heat-templates/environments/monitoring-environment.yaml' and fill in the following parameters in the configuration YAML file:
* MonitoringRabbitHost: host where the RabbitMQ instance used for monitoring is running
* MonitoringRabbitPort: port on which the RabbitMQ instance used for monitoring is listening
* MonitoringRabbitUserName: username for connecting to the RabbitMQ instance
* MonitoringRabbitPassword: password for connecting to the RabbitMQ instance
* MonitoringRabbitVhost: RabbitMQ vhost used for monitoring
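The parameters listed in the Doc Text can be kept in a small site-specific environment file. The following is a minimal sketch, not taken from the bug report: it assumes the parameters are set under the standard Heat `parameter_defaults` section, and the file name and every value (host, port, credentials, vhost) are placeholders.

```shell
# Sketch: write a site-specific environment file with the monitoring parameters.
# ENV_FILE and all values below are illustrative placeholders (assumptions),
# not defaults shipped with openstack-tripleo-heat-templates.
ENV_FILE="${ENV_FILE:-./monitoring-params.yaml}"

cat > "$ENV_FILE" <<'EOF'
parameter_defaults:
  MonitoringRabbitHost: 192.0.2.10
  MonitoringRabbitPort: 5672
  MonitoringRabbitUserName: sensu
  MonitoringRabbitPassword: change-me
  MonitoringRabbitVhost: /sensu
EOF

echo "wrote $ENV_FILE"
```

A file like this would be passed to `openstack overcloud deploy` with an extra `-e` option after the monitoring-environment.yaml file, so that its values take precedence.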
Last Closed: 2016-12-14 10:19:40 EST
Type: Bug
External Trackers
Tracker                  ID              Priority  Status        Summary                                              Last Updated
OpenStack gerrit         254788          None      None          None                                                 2016-08-17 05:19 EDT
OpenStack gerrit         349690          None      None          None                                                 2016-08-17 05:20 EDT
Red Hat Product Errata   RHEA-2016:2948  normal    SHIPPED_LIVE  Red Hat OpenStack Platform 10 enhancement update     2016-12-14 14:55:27 EST
Description Nick Barcet 2015-12-09 17:57:34 EST
We need to deploy the availability monitoring solution that Ggillies has put together.
http://file.bne.redhat.com/~ggillies/optools_doc/

Use case:
As an operator, I need to be able to validate that OpenStack services are functioning correctly.

Satisfaction criteria:
* solution is automatically deployed when the appropriate option is activated
* solution is documented
Comment 2 Nick Barcet 2015-12-09 18:22:51 EST
*** Bug 1290250 has been marked as a duplicate of this bug. ***
Comment 3 Rob Young 2016-06-14 11:04:15 EDT
Under the new HA architecture that is planned for OSP 10, most/all of the A/A OpenStack services will be managed by systemd and will be able to start, stop, restart independently when needed. We will need to monitor and alert on services that are stopped, that will not start or that are in a constant state of restarting. Should this be added to this BZ or should there be another to track this work?
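As an illustration of the kind of check this comment asks about, here is a rough sketch that maps the state reported by `systemctl is-active` to the exit codes used by Sensu and Nagios style checks (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The function names and the choice of which states count as warning or critical are assumptions made for this example, not part of the delivered feature. A single sample cannot prove a restart loop; a unit caught in "activating" is only a hint.

```shell
# Map a systemd active-state string to a Sensu/Nagios-style exit status.
# The grouping of states is an assumption made for this sketch.
state_to_status() {
  case "$1" in
    active|reloading)        echo "OK: $1"; return 0 ;;
    activating|deactivating) echo "WARNING: $1"; return 1 ;;
    inactive|failed)         echo "CRITICAL: $1"; return 2 ;;
    *)                       echo "UNKNOWN: ${1:-no state}"; return 3 ;;
  esac
}

# check_unit UNIT: ask systemd for the unit's state and report it.
# "|| true" keeps a non-zero exit from systemctl from aborting the script.
check_unit() {
  state="$(systemctl is-active "$1" 2>/dev/null || true)"
  state_to_status "$state"
}
```

For example, `check_unit openstack-nova-compute` (unit name chosen only for illustration) could be registered as a check command on a compute node.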
Comment 4 Lars Kellogg-Stedman 2016-09-19 14:49:48 EDT
These changes have been merged upstream (https://review.openstack.org/#/c/254788/)
Comment 6 Leonid Natapov 2016-09-26 04:22:24 EDT
Tested with openstack-tripleo-heat-templates-5.0.0-0.20160907212643.90c852e.1.el7ost.noarch
Stack deployment failed.

Here is the deploy command I used:
openstack overcloud deploy --templates  -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e network-environment.yaml -e monitoring-environment.yaml --control-scale 3 --compute-scale 1 --ntp-server 10.11.160.238

Here is the error I got:
------------------------------------------------------------------------------
[stack@puma42 ~]$ openstack stack failures list overcloud
WARNING: openstackclient.common.utils is deprecated and will be removed after Jun 2017. Please use osc_lib.utils
overcloud.ControllerAllNodesValidationDeployment:
  resource_type: OS::Heat::StructuredDeployments
  physical_resource_id: 540c6ff6-ee1a-4303-b9da-114edf813654
  status: CREATE_FAILED
  status_reason: |
    CREATE aborted
overcloud.ControllerNodesPostDeployment.ControllerPrePuppet.ControllerPrePuppetMaintenanceModeDeployment:
  resource_type: OS::Heat::SoftwareDeployments
  physical_resource_id: 664bab56-185f-46f8-b62b-190fb897258a
  status: CREATE_FAILED
  status_reason: |
    CREATE aborted
overcloud.ControllerNodesPostDeployment.ControllerArtifactsDeploy:
  resource_type: OS::Heat::StructuredDeployments
  physical_resource_id: 53f77add-767c-46c0-b433-43d88783af5d
  status: CREATE_FAILED
  status_reason: |
    CREATE aborted
overcloud.ComputeNodesPostDeployment.ComputeOvercloudServicesDeployment_Step3.0:
  resource_type: OS::Heat::StructuredDeployment
  physical_resource_id: 563feda4-77b5-46fd-a866-307c444adcfa
  status: CREATE_FAILED
  status_reason: |
    Error: resources[0]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 6
  deploy_stdout: |
    ...
    Notice: /Stage[main]/Sensu::Enterprise::Dashboard/Anchor[sensu::enterprise::dashboard::end]: Dependency File[/etc/sensu/handlers] has failures: true
    Notice: /Stage[main]/Sensu::Enterprise::Dashboard/Anchor[sensu::enterprise::dashboard::end]: Dependency File[/etc/sensu/extensions] has failures: true
    Notice: /Stage[main]/Sensu::Enterprise::Dashboard/Anchor[sensu::enterprise::dashboard::end]: Dependency File[/etc/sensu/mutators] has failures: true
    Notice: /Stage[main]/Sensu::Enterprise::Dashboard/Anchor[sensu::enterprise::dashboard::end]: Dependency File[/etc/sensu/plugins] has failures: true
    Notice: /Stage[main]/Sensu/Anchor[sensu::end]: Dependency File[/etc/sensu/conf.d] has failures: true
    Notice: /Stage[main]/Sensu/Anchor[sensu::end]: Dependency File[/etc/sensu/handlers] has failures: true
    Notice: /Stage[main]/Sensu/Anchor[sensu::end]: Dependency File[/etc/sensu/extensions] has failures: true
    Notice: /Stage[main]/Sensu/Anchor[sensu::end]: Dependency File[/etc/sensu/mutators] has failures: true
    Notice: /Stage[main]/Sensu/Anchor[sensu::end]: Dependency File[/etc/sensu/plugins] has failures: true
    Notice: Finished catalog run in 3.64 seconds
    (truncated, view all with --long)
  deploy_stderr: |
    ...
    Warning: /Stage[main]/Sensu::Redis::Config/Sensu_redis_config[overcloud-novacompute-0.localdomain]: Skipping because of failed dependencies
    Warning: /Stage[main]/Sensu::Client::Config/Sensu_client_config[overcloud-novacompute-0.localdomain]: Skipping because of failed dependencies
    Warning: /Stage[main]/Sensu::Client::Config/File[/etc/sensu/conf.d/client.json]: Skipping because of failed dependencies
    Warning: /Stage[main]/Sensu::Client::Service/Service[sensu-client]: Skipping because of failed dependencies
    Warning: /Stage[main]/Sensu::Api::Service/Service[sensu-api]: Skipping because of failed dependencies
    Warning: /Stage[main]/Sensu::Server::Service/Service[sensu-server]: Skipping because of failed dependencies
    Warning: /Stage[main]/Sensu::Enterprise::Dashboard/Anchor[sensu::enterprise::dashboard::begin]: Skipping because of failed dependencies
    Warning: /Package[sensu-enterprise-dashboard]: Skipping because of failed dependencies
    Warning: /Stage[main]/Sensu::Enterprise::Dashboard/Anchor[sensu::enterprise::dashboard::end]: Skipping because of failed dependencies
    Warning: /Stage[main]/Sensu/Anchor[sensu::end]: Skipping because of failed dependencies
    (truncated, view all with --long)
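
The truncated output above repeats the same few failed dependencies many times. As a reading aid, a small filter can reduce it to the unique Puppet resources reported as failing. This is a sketch only; `failed_deps` is a hypothetical helper name, and it relies on the "Dependency ... has failures: true" wording shown in the log above.

```shell
# failed_deps: read deploy output on stdin and print each Puppet resource that
# is named in a "Dependency <resource> has failures: true" notice, once.
# Hypothetical helper, based only on the message format seen in this report.
failed_deps() {
  grep -o 'Dependency [^ ]* has failures: true' \
    | sed -e 's/^Dependency //' -e 's/ has failures: true$//' \
    | LC_ALL=C sort -u
}
```

It could be fed from the full output, for example `openstack stack failures list overcloud --long | failed_deps`.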
Comment 10 James Slagle 2016-10-06 13:06:53 EDT
Mike, can someone from the ReleaseDelivery DFG take this one and look into the downstream image builds?
Comment 23 Leonid Natapov 2016-10-31 05:56:10 EDT
openstack-tripleo-heat-templates-5.0.0-0.6.0rc3.el7ost.noarch

Availability monitoring was successfully deployed by TripleO using the monitoring-environment.yaml template. sensu-client started on all overcloud nodes and the configuration files were properly configured.
Comment 26 errata-xmlrpc 2016-12-14 10:19:40 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-2948.html