Bug 1718976

Summary: Keystone container missing health check
Product: Red Hat OpenStack
Reporter: Martin Magr <mmagr>
Component: openstack-tripleo-heat-templates
Assignee: Emilien Macchi <emacchi>
Status: CLOSED ERRATA
QA Contact: Pavan <pkesavar>
Severity: low
Docs Contact:
Priority: medium
Version: 15.0 (Stein)
CC: lnatapov, mburns, nkinder, pkesavar, pkilambi, sclewis, slinaber
Target Milestone: ga
Keywords: Triaged
Target Release: 15.0 (Stein)
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: openstack-tripleo-heat-templates-10.6.1-0.20190713150434.2871ce0.el8ost.noarch.rpm
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1488153
Environment:
Last Closed: 2019-09-21 11:23:00 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1488153
Bug Blocks: 1631705, 1631707, 1718962, 1718970, 1718971, 1718973

Description Martin Magr 2019-06-10 16:37:26 UTC
+++ This bug was initially created as a clone of Bug #1488153 +++

Some containers do not have a default container health check implemented. All containers should be covered.

--- Additional comment from Leonid Natapov on 2019-05-05 08:36:39 UTC ---

Moving this back to ASSIGNED.
Here is the list of containers that don't have a health check. There will probably be more, as I will test different TripleO templates that could add containers that don't appear now.

c79d409151c7  192.168.24.1:8787/rhosp15/openstack-keystone:20190426.1                 dumb-init --singl...  3 days ago  Up 3 days ago         keystone_cron

-----------------------------------------------------

The container mentioned above was covered by a health check in the following patch:

https://github.com/openstack/tripleo-common/blob/master/healthcheck/keystone

Please enable (or refactor and enable) the health checks, or close this bug, at your discretion. Examples of an enabled health check execution:
https://github.com/openstack/tripleo-common/blob/master/container-images/tripleo_kolla_template_overrides.j2#L325,L329
https://github.com/openstack/tripleo-heat-templates/blob/master/deployment/metrics/collectd-container-puppet.yaml#L477,L478

Comment 5 Nathan Kinder 2019-08-08 05:54:22 UTC
This appears to be related to the keystone_cron container based on the log message provided in the initial bug description.  Both of Keystone's long-running containers appear to have healthchecks implemented as seen in the following code from stable/stein:

----------------------------------------------------------------------------------------------------------------------------
keystone:

https://github.com/openstack/tripleo-heat-templates/blob/stable/stein/deployment/keystone/keystone-container-puppet.yaml#L757-L758
https://github.com/openstack/tripleo-common/blob/stable/stein/container-images/tripleo_kolla_template_overrides.j2#L436-L437
https://github.com/openstack/tripleo-common/blob/stable/stein/healthcheck/keystone

keystone_cron:

https://github.com/openstack/tripleo-heat-templates/blob/stable/stein/deployment/keystone/keystone-container-puppet.yaml#L775-L776
https://github.com/openstack/tripleo-common/blob/stable/stein/healthcheck/cron
----------------------------------------------------------------------------------------------------------------------------

The logs from the initial bug description that show keystone_cron is missing a healthcheck were obtained on May 5, 2019 (https://bugzilla.redhat.com/show_bug.cgi?id=1488153#c18).  The changes that implemented a healthcheck for keystone_cron were merged into stable/stein upstream on April 10, 2019:

    https://github.com/openstack/tripleo-common/commit/b6921c6e9645958015d3bb5c70a75c58d4ea5894
    https://github.com/openstack/tripleo-heat-templates/commit/0c31f04f4160a6422f3f8fbaa5fbcfb259f47c37

I suspect that older builds that did not contain these commits were used in the test when this issue was reported.  I have confirmed that the above commits are in the source code from the SRPMs of the following packages, which are the current versions that are available publicly in the OSP 15 Beta:

    openstack-tripleo-common-10.8.1-0.20190710191707.b6a2d65.el8ost.noarch.rpm
    openstack-tripleo-heat-templates-10.6.1-0.20190713150434.2871ce0.el8ost.noarch.rpm

I have also installed upstream Stein using TripleO, and confirmed that both keystone and keystone_cron containers are listed as "healthy" in the output of "docker ps" (I tested on CentOS 7, so docker is used instead of podman).
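
For anyone repeating this check across many containers, the manual inspection of "docker ps" output can be scripted. The following is only a minimal sketch, not what was run for this bug: it assumes the output was captured with a format string such as '{{.Names}}|{{.Status}}', and the function name classify_health and the sample data are hypothetical. It relies on Docker appending "(healthy)", "(unhealthy)" or "(health: starting)" to the Status column only when a health check is configured; podman output may differ.

```python
import re

# Health annotation that Docker appends to the Status column,
# e.g. "Up 3 days (healthy)" or "Up 5 seconds (health: starting)".
_HEALTH_RE = re.compile(r"\((healthy|unhealthy|health: starting)\)\s*$")


def classify_health(ps_output):
    """Map container name -> health state parsed from captured text.

    Assumes one "<name>|<status>" pair per line (hypothetical capture
    format, see the lead-in). Containers whose status carries no
    health annotation are reported as "none".
    """
    result = {}
    for line in ps_output.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, _, status = line.partition("|")
        match = _HEALTH_RE.search(status)
        result[name.strip()] = match.group(1) if match else "none"
    return result


# Hypothetical sample input, not taken from a real deployment.
SAMPLE = "keystone|Up 3 days (healthy)\nkeystone_cron|Up 3 days\n"

if __name__ == "__main__":
    for name, state in sorted(classify_health(SAMPLE).items()):
        print(name, state)
```

Any container reported as "none" has no health check attached and would be a candidate for a bug like this one.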

Marking this as MODIFIED so this can be officially verified by QE with downstream bits.

Comment 11 errata-xmlrpc 2019-09-21 11:23:00 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:2811