Bug 1517500
Summary: | OPS Tools \| Availability Monitoring \| Octavia dockers monitoring support | | |
---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Alexander Stafeyev <astafeye> |
Component: | openstack-tripleo-common | Assignee: | Martin Magr <mmagr> |
Status: | CLOSED ERRATA | QA Contact: | Alexander Stafeyev <astafeye> |
Severity: | medium | Docs Contact: | |
Priority: | high | ||
Version: | 12.0 (Pike) | CC: | amuller, apannu, astafeye, bcafarel, bschmaus, cgoncalves, emacchi, ihrachys, jamsmith, jbadiapa, jlibosva, jschluet, lars, lpeer, majopela, mburns, mmagr, mrunge, nyechiel, rlopez, rmccabe, scorcora, slinaber |
Target Milestone: | z2 | Keywords: | Triaged, ZStream |
Target Release: | 13.0 (Queens) | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | openstack-tripleo-heat-templates-8.0.4-4.el7ost openstack-tripleo-common-8.6.3-3.el7ost | Doc Type: | If docs needed, set a value |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2018-08-29 16:34:51 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
Bug Depends On: | 1613662 | ||
Bug Blocks: | 1433523 |
Description
Alexander Stafeyev
2017-11-26 09:15:34 UTC
The health check is maintained by the DFG maintaining the component. A health check would probably look like the ones here:
https://review.openstack.org/#/q/If5b77481330fa697f1bab16696acb70075052d4f

I started adding health checks for containers that are missing them, so I can write patches for the Octavia containers. The only problem is that I have no idea how to correctly check whether each service is alive. So please answer the following questions:

1. Octavia API
   - On which port does the API server listen? Is there a special URL to get health status?
2. Octavia worker
   - Does the service connect to another service or listen on any port?
   - Is there any way to get health status from the service?
3. Octavia health manager
   - Does the service connect to another service or listen on any port?
   - Is there any way to get health status from the service?
4. Octavia HouseKeeper manager
   - Does the service connect to another service or listen on any port?
   - Is there any way to get health status from the service?

Moving this BZ under DFG. This does not mean I'm not willing to work on this task.

*under proper

1. Octavia API
   - listens on TCP 9876 (internal and public endpoints)
2. Octavia worker
   - connects to oslo messaging (AMQP internal:5672)
   - REST API calls to nova, neutron, glance
3. Octavia health manager
   - listens on UDP 5555 (get IP from 'o-hm0' host iface)
4. Octavia HouseKeeper manager
   - connects to DB server (MySQL)

None of the Octavia services provide a special URL to get health status.

I'm flipping status to POST as I believe all required patches have been merged and backported upstream to stable/queens. Martin, please ACK/NACK.

NACK. We still need https://review.openstack.org/#/c/555252/ to be backported to stable/queens. This went off my radar. Sorry for the delay.

*** Bug 1603240 has been marked as a duplicate of this bug. ***

Hi,
What would be the proper verification steps, please?
Thanks

Octavia containers report healthy or unhealthy status (depending on actual health of those containers) after deployment in output of command 'docker ps --all | grep octavia'.

    [root@overcloud-controller-1 ~]# docker ps | grep octa
    1b13b1974797  registry.access.redhat.com/rhosp13/openstack-octavia-health-manager:latest  "kolla_start"  20 hours ago  Up 20 hours (unhealthy)  octavia_health_manager
    2c6502229b83  registry.access.redhat.com/rhosp13/openstack-octavia-api:latest  "kolla_start"  20 hours ago  Up 20 hours (unhealthy)  octavia_api
    5a08540e9372  registry.access.redhat.com/rhosp13/openstack-octavia-housekeeping:latest  "kolla_start"  20 hours ago  Up 20 hours (unhealthy)  octavia_housekeeping
    d47a8ded4b82  registry.access.redhat.com/rhosp13/openstack-octavia-worker:latest  "kolla_start"  20 hours ago  Up 20 hours (healthy)  octavia_worker

    [root@overcloud-controller-1 ~]# docker exec octavia_health_manager /openstack/healthcheck
    There is no octavia-health- process with opened RabbitMQ ports (5671,5672) running in the container

    [root@overcloud-controller-1 ~]# docker exec octavia_api /openstack/healthcheck
    rpc error: code = 2 desc = oci runtime error: exec failed: container_linux.go:247: starting container process caused "exec: \"/openstack/healthcheck\": stat /openstack/healthcheck: no such file or directory"

    [root@overcloud-controller-1 ~]# docker exec octavia_housekeeping /openstack/healthcheck
    There is no octavia-houseke process with opened RabbitMQ ports (5671,5672) running in the container

    [root@overcloud-controller-1 ~]# docker exec octavia_worker /openstack/healthcheck
    172.17.1.17:5672 - users:(("octavia-worker:",pid=23,fd=8))

    (undercloud) [stack@undercloud-0 ~]$ rpm -qa | grep openstack | grep trip | grep temp
    openstack-tripleo-heat-templates-8.0.4-10.el7ost.noarch

This bug is marked for inclusion in the errata but does not currently contain draft documentation text.
To ensure the timely release of this advisory please provide draft documentation text for this bug as soon as possible. If you do not think this bug requires errata documentation, set the requires_doc_text flag to "-".

To add draft documentation text:
* Select the documentation type from the "Doc Type" drop down field.
* A template will be provided in the "Doc Text" field based on the "Doc Type" value selected. Enter draft text in the "Doc Text" field.

*** Bug 1613662 has been marked as a duplicate of this bug. ***

On my OSP13 environment, all 4 Octavia containers are healthy.

    [root@controller-0 heat-admin]# docker ps | grep octavia
    95db292d2efc  192.168.24.1:8787/rhosp13/openstack-octavia-health-manager:2018-08-08.2  "kolla_start"  47 hours ago  Up 47 hours (healthy)  octavia_health_manager
    838ede76313c  192.168.24.1:8787/rhosp13/openstack-octavia-api:2018-08-08.2  "kolla_start"  47 hours ago  Up 47 hours (healthy)  octavia_api
    66396fecbe7e  192.168.24.1:8787/rhosp13/openstack-octavia-housekeeping:2018-08-08.2  "kolla_start"  47 hours ago  Up 47 hours (healthy)  octavia_housekeeping
    37e8ee2bf056  192.168.24.1:8787/rhosp13/openstack-octavia-worker:2018-08-08.2  "kolla_start"  47 hours ago  Up 47 hours (healthy)  octavia_worker

In comment #36 the healthcheck output message (mentioning RabbitMq) is the old one (before the fix in openstack-tripleo-common-8.6.3-3.el7ost):
https://github.com/openstack/tripleo-common/commit/dc342858a74c5c89df22343b5931f821bd61e7b9#diff-437c9b0a7f17cb0002622a959732d7f6

So the container did not have the needed version apparently.
That plus comment #41 showing healthy containers, I am moving this bug back to verification step

    [heat-admin@controller-1 ~]$ sudo -i
    [root@controller-1 ~]# docker ps | grep octav
    af135e52359d  192.168.24.1:8787/rhosp13/openstack-octavia-health-manager:2018-08-22.2  "kolla_start"  33 minutes ago  Up 26 minutes (healthy)  octavia_health_manager
    a484685b53b9  192.168.24.1:8787/rhosp13/openstack-octavia-api:2018-08-22.2  "kolla_start"  33 minutes ago  Up 25 minutes (healthy)  octavia_api
    cb0f7ee7f8c5  192.168.24.1:8787/rhosp13/openstack-octavia-housekeeping:2018-08-22.2  "kolla_start"  33 minutes ago  Up 25 minutes (healthy)  octavia_housekeeping
    7c22d3c633a1  192.168.24.1:8787/rhosp13/openstack-octavia-worker:2018-08-22.2  "kolla_start"  33 minutes ago  Up 25 minutes (healthy)  octavia_worker
    [root@controller-1 ~]# cat /etc/yum.repos.d/latest-installed
    13 -p 2018-08-22.2
    [root@controller-1 ~]#

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2574
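
The verification in this bug was done by eye, reading the health marker in the STATUS column of 'docker ps'. For anyone repeating it across several controllers, the same check could be scripted. The following is only a rough sketch, not part of the fix: the function names `octavia_health` and `unhealthy` are hypothetical, and it assumes the default 'docker ps' column layout seen in the transcripts, where the container name is the last field on each line.

```python
import re

# Matches the health marker docker prints inside the STATUS column,
# e.g. "Up 20 hours (unhealthy)".
HEALTH_RE = re.compile(r"\((healthy|unhealthy|health: starting)\)")


def octavia_health(ps_output):
    """Map each Octavia container name to its reported health state.

    Assumes the default `docker ps` layout, where NAMES is the last column.
    """
    states = {}
    for line in ps_output.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and non-Octavia containers.
        if not fields or not fields[-1].startswith("octavia_"):
            continue
        match = HEALTH_RE.search(line)
        # A container without a health marker has no healthcheck configured.
        states[fields[-1]] = match.group(1) if match else "no healthcheck"
    return states


def unhealthy(states):
    """Return the sorted names of containers that are not 'healthy'."""
    return sorted(name for name, state in states.items() if state != "healthy")


# Two rows copied from the first verification attempt in this bug.
SAMPLE = """\
1b13b1974797 registry.access.redhat.com/rhosp13/openstack-octavia-health-manager:latest "kolla_start" 20 hours ago Up 20 hours (unhealthy) octavia_health_manager
d47a8ded4b82 registry.access.redhat.com/rhosp13/openstack-octavia-worker:latest "kolla_start" 20 hours ago Up 20 hours (healthy) octavia_worker
"""

if __name__ == "__main__":
    print(unhealthy(octavia_health(SAMPLE)))  # ['octavia_health_manager']
```

On a controller, the text passed to `octavia_health` would be the captured output of 'docker ps'. An empty list from `unhealthy` corresponds to the final state shown above, where all four Octavia containers report healthy.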