Bug 1290251
| Summary: | [RFE][OpsTools] We need an availability monitoring solution deployed by director. | ||
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Nick Barcet <nbarcet> |
| Component: | openstack-tripleo-heat-templates | Assignee: | Martin Magr <mmagr> |
| Status: | CLOSED ERRATA | QA Contact: | Leonid Natapov <lnatapov> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 10.0 (Newton) | CC: | dnavale, fzdarsky, jschluet, jslagle, lars, lhh, lnatapov, mburns, mmagr, oblaut, pmyers, racedoro, rhel-osp-director-maint, royoung, scohen, srevivo, tvvcox |
| Target Milestone: | rc | Keywords: | FutureFeature, Triaged |
| Target Release: | 10.0 (Newton) | ||
| Hardware: | All | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | openstack-tripleo-heat-templates-5.0.0-0.20160907212643.90c852e.1.el7ost | Doc Type: | Enhancement |
| Doc Text: |
This update adds a feature for connecting the overcloud to a monitoring infrastructure: availability monitoring agents (sensu-client) can now be deployed on the overcloud nodes.
To enable deployment of the monitoring agents, use the environment file '/usr/share/openstack-tripleo-heat-templates/environments/monitoring-environment.yaml' and fill in the following parameters in the configuration YAML file:
MonitoringRabbitHost: host where the RabbitMQ instance for monitoring purposes is running
MonitoringRabbitPort: port on which the RabbitMQ instance for monitoring purposes is running
MonitoringRabbitUserName: username to connect to RabbitMQ instance
MonitoringRabbitPassword: password to connect to RabbitMQ instance
MonitoringRabbitVhost: RabbitMQ vhost used for monitoring purposes
|
Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2016-12-14 15:19:40 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 1379538 | ||
| Bug Blocks: | 1398468 | ||
|
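The parameters listed in the Doc Text could be supplied through a small override file passed with an additional `-e` option. The following is a minimal sketch, not the shipped template: the function name `write_monitoring_params`, the `MONITORING_RABBIT_*` shell variables, the placeholder values, and the output file name are all hypothetical, and it assumes the values belong under `parameter_defaults`, as is usual for TripleO environment files.

```shell
#!/usr/bin/env bash
# Sketch only: generate a parameter override file for the monitoring agents.
# The function name, the MONITORING_RABBIT_* variables, and the output path
# are illustrative assumptions, not part of the shipped templates.

write_monitoring_params() {
  local out="$1"
  # ${VAR:?message} aborts with an error if a required value is missing.
  # Values are wrapped in double quotes, so they must not contain '"'.
  cat > "$out" <<EOF
parameter_defaults:
  MonitoringRabbitHost: "${MONITORING_RABBIT_HOST:?host is required}"
  MonitoringRabbitPort: ${MONITORING_RABBIT_PORT:?port is required}
  MonitoringRabbitUserName: "${MONITORING_RABBIT_USER:?user name is required}"
  MonitoringRabbitPassword: "${MONITORING_RABBIT_PASSWORD:?password is required}"
  MonitoringRabbitVhost: "${MONITORING_RABBIT_VHOST:?vhost is required}"
EOF
}

# Example with placeholder values:
#   export MONITORING_RABBIT_HOST=192.0.2.10 MONITORING_RABBIT_PORT=5672
#   export MONITORING_RABBIT_USER=sensu MONITORING_RABBIT_PASSWORD=secret
#   export MONITORING_RABBIT_VHOST=/sensu
#   write_monitoring_params monitoring-params.yaml
```

The generated file would then be listed after the monitoring environment file on the deploy command line (for example `-e /usr/share/openstack-tripleo-heat-templates/environments/monitoring-environment.yaml -e monitoring-params.yaml`), since later `-e` files take precedence over earlier ones.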
Description
Nick Barcet
2015-12-09 22:57:34 UTC
*** Bug 1290250 has been marked as a duplicate of this bug. ***

Under the new HA architecture that is planned for OSP 10, most/all of the A/A OpenStack services will be managed by systemd and will be able to start, stop, restart independently when needed. We will need to monitor and alert on services that are stopped, that will not start or that are in a constant state of restarting. Should this be added to this BZ or should there be another to track this work?

These changes have been merged upstream (https://review.openstack.org/#/c/254788/)

Tested with openstack-tripleo-heat-templates-5.0.0-0.20160907212643.90c852e.1.el7ost.noarch
Stack deployment failed.
Here is the deploy command I used:
openstack overcloud deploy --templates -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e network-environment.yaml -e monitoring-environment.yaml --control-scale 3 --compute-scale 1 --ntp-server 10.11.160.238
Here is the error I got:
------------------------------------------------------------------------------
[stack@puma42 ~]$ openstack stack failures list overcloud
WARNING: openstackclient.common.utils is deprecated and will be removed after Jun 2017. Please use osc_lib.utils
overcloud.ControllerAllNodesValidationDeployment:
  resource_type: OS::Heat::StructuredDeployments
  physical_resource_id: 540c6ff6-ee1a-4303-b9da-114edf813654
  status: CREATE_FAILED
  status_reason: |
    CREATE aborted
overcloud.ControllerNodesPostDeployment.ControllerPrePuppet.ControllerPrePuppetMaintenanceModeDeployment:
  resource_type: OS::Heat::SoftwareDeployments
  physical_resource_id: 664bab56-185f-46f8-b62b-190fb897258a
  status: CREATE_FAILED
  status_reason: |
    CREATE aborted
overcloud.ControllerNodesPostDeployment.ControllerArtifactsDeploy:
  resource_type: OS::Heat::StructuredDeployments
  physical_resource_id: 53f77add-767c-46c0-b433-43d88783af5d
  status: CREATE_FAILED
  status_reason: |
    CREATE aborted
overcloud.ComputeNodesPostDeployment.ComputeOvercloudServicesDeployment_Step3.0:
  resource_type: OS::Heat::StructuredDeployment
  physical_resource_id: 563feda4-77b5-46fd-a866-307c444adcfa
  status: CREATE_FAILED
  status_reason: |
    Error: resources[0]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 6
  deploy_stdout: |
    ...
    Notice: /Stage[main]/Sensu::Enterprise::Dashboard/Anchor[sensu::enterprise::dashboard::end]: Dependency File[/etc/sensu/handlers] has failures: true
    Notice: /Stage[main]/Sensu::Enterprise::Dashboard/Anchor[sensu::enterprise::dashboard::end]: Dependency File[/etc/sensu/extensions] has failures: true
    Notice: /Stage[main]/Sensu::Enterprise::Dashboard/Anchor[sensu::enterprise::dashboard::end]: Dependency File[/etc/sensu/mutators] has failures: true
    Notice: /Stage[main]/Sensu::Enterprise::Dashboard/Anchor[sensu::enterprise::dashboard::end]: Dependency File[/etc/sensu/plugins] has failures: true
    Notice: /Stage[main]/Sensu/Anchor[sensu::end]: Dependency File[/etc/sensu/conf.d] has failures: true
    Notice: /Stage[main]/Sensu/Anchor[sensu::end]: Dependency File[/etc/sensu/handlers] has failures: true
    Notice: /Stage[main]/Sensu/Anchor[sensu::end]: Dependency File[/etc/sensu/extensions] has failures: true
    Notice: /Stage[main]/Sensu/Anchor[sensu::end]: Dependency File[/etc/sensu/mutators] has failures: true
    Notice: /Stage[main]/Sensu/Anchor[sensu::end]: Dependency File[/etc/sensu/plugins] has failures: true
    Notice: Finished catalog run in 3.64 seconds
    (truncated, view all with --long)
  deploy_stderr: |
    ...
    Warning: /Stage[main]/Sensu::Redis::Config/Sensu_redis_config[overcloud-novacompute-0.localdomain]: Skipping because of failed dependencies
    Warning: /Stage[main]/Sensu::Client::Config/Sensu_client_config[overcloud-novacompute-0.localdomain]: Skipping because of failed dependencies
    Warning: /Stage[main]/Sensu::Client::Config/File[/etc/sensu/conf.d/client.json]: Skipping because of failed dependencies
    Warning: /Stage[main]/Sensu::Client::Service/Service[sensu-client]: Skipping because of failed dependencies
    Warning: /Stage[main]/Sensu::Api::Service/Service[sensu-api]: Skipping because of failed dependencies
    Warning: /Stage[main]/Sensu::Server::Service/Service[sensu-server]: Skipping because of failed dependencies
    Warning: /Stage[main]/Sensu::Enterprise::Dashboard/Anchor[sensu::enterprise::dashboard::begin]: Skipping because of failed dependencies
    Warning: /Package[sensu-enterprise-dashboard]: Skipping because of failed dependencies
    Warning: /Stage[main]/Sensu::Enterprise::Dashboard/Anchor[sensu::enterprise::dashboard::end]: Skipping because of failed dependencies
    Warning: /Stage[main]/Sensu/Anchor[sensu::end]: Skipping because of failed dependencies
    (truncated, view all with --long)
mike, can someone from ReleaseDelivery dfg take this one and look into the downstream image builds?

openstack-tripleo-heat-templates-5.0.0-0.6.0rc3.el7ost.noarch
Availability Monitoring was successfully deployed by tripleo using the monitoring-environment.yaml template. sensu-client started on all overcloud nodes, and the configuration files were properly configured.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://rhn.redhat.com/errata/RHEA-2016-2948.html |