1637739 – nova_metadata container is in unhealthy state on undercloud and overcloud nodes

Summary: nova_metadata container is in unhealthy state on undercloud and overcloud nodes

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat OpenStack
Classification:	Red Hat
Component:	openstack-tripleo-common
Sub Component:
Version:	14.0 (Rocky)
Hardware:	Unspecified
OS:	Unspecified
Priority:	urgent
Severity:	urgent
Target Milestone:	rc
Target Release:	14.0 (Rocky)
Assignee:	Martin Schuppert
QA Contact:	Joe H. Rahme
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1579866

Reported:	2018-10-09 22:05 UTC by Marius Cornea
Modified:	2019-01-11 11:54 UTC
CC List:	8 users
Fixed In Version:	openstack-tripleo-common-9.4.1-0.20181012010872.67bab16.el7ost
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Clones:	1651931
Environment:
Last Closed:	2019-01-11 11:53:51 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Priority	Status	Summary	Last Updated
Launchpad	1797514	None	None	None	2018-10-12 06:52:19 UTC
OpenStack gerrit	610890	None	None	None	2018-10-16 06:49:28 UTC
OpenStack gerrit	613971	None	None	None	2018-11-06 10:29:53 UTC
Red Hat Product Errata	RHEA-2019:0045	None	None	None	2019-01-11 11:54:00 UTC

Description Marius Cornea 2018-10-09 22:05:19 UTC

Description of problem:
nova_metadata container is in unhealthy state on undercloud and overcloud nodes:

[root@undercloud-0 stack]# docker ps | grep nova_metadata
2ee724246dec        192.168.24.1:8787/rhosp14/openstack-nova-api:2018-10-08.4                    "kolla_start"            4 hours ago         Up 4 hours (unhealthy)                       nova_metadata

[root@controller-0 heat-admin]# docker ps | grep nova_metadata
0e3143263535        192.168.24.1:8787/rhosp14/openstack-nova-api:2018-10-08.4                    "kolla_start"            2 hours ago         Up 42 minutes (unhealthy)                       nova_metadata


Version-Release number of selected component (if applicable):
14   -p 2018-10-08.4

How reproducible:
100%

Steps to Reproduce:
1. Deploy OSP14 undercloud and overcloud
2. Check nova_metadata container status
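For step 2, a minimal sketch of pulling the health state out of `docker ps` output. The helper name `container_health` is hypothetical (it is not part of docker or TripleO), and it assumes the container name is the last column, as in the output quoted in this report.

```shell
#!/bin/sh
# Hypothetical helper: print the health state docker ps reports for one container.
# Reads `docker ps` output on stdin; assumes the container name is the last column.
container_health() {
  name="$1"
  awk -v name="$name" '
    $NF == name {
      # The STATUS column carries the state in parentheses, e.g. "Up 4 hours (unhealthy)".
      if (match($0, /\((healthy|unhealthy|health: starting)\)/)) {
        print substr($0, RSTART + 1, RLENGTH - 2)
      } else {
        print "no healthcheck"
      }
    }'
}
```

On a node this would be run as `docker ps | container_health nova_metadata`. `docker inspect --format '{{.State.Health.Status}}' nova_metadata` should report the same state directly.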

Actual results:
It reports unhealthy state

Expected results:
It reports healthy state.

Additional info:
Attaching job artifacts

Comment 2 Martin Schuppert 2018-10-11 13:30:35 UTC

The wrong healthcheck script is used in the nova_metadata container: the healthcheck symlink points to the nova-api script.

(undercloud) [stack@undercloud-0 ~]$ docker exec -it -u root nova_metadata /bin/bash
()[root@undercloud-0 /]# cd openstack/
()[root@undercloud-0 openstack]# ls -la
total 0
drwxr-xr-x. 2 root root 25 Oct 10 02:10 .
drwxr-xr-x. 1 root root 81 Oct 11 08:53 ..
lrwxrwxrwx. 1 root root 56 Oct 10 02:10 healthcheck -> /usr/share/openstack-tripleo-common/healthcheck/nova-api
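The same check can be done without an interactive shell. A minimal sketch, assuming the link lives at /openstack/healthcheck as in the listing above; the helper name `healthcheck_target` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the name of the script a healthcheck symlink points to.
# Defaults to /openstack/healthcheck, the path seen inside the container above.
healthcheck_target() {
  link="${1:-/openstack/healthcheck}"
  # readlink prints the link target; basename keeps only the script name.
  basename "$(readlink "$link")"
}
```

From the host, `docker exec nova_metadata readlink /openstack/healthcheck` should print the full target path. In the affected build the script name is nova-api rather than a metadata-specific script.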

Comment 3 Martin Schuppert 2018-10-12 06:52:20 UTC

While we have a dedicated nova_metadata healthcheck script, the nova_metadata and nova_api containers use the same image, and the current nova-api healthcheck script still checks the non-WSGI implementation.

Currently we have:
~~~
# cat healthcheck
#!/bin/sh

. ${HEALTHCHECK_SCRIPTS:-/usr/share/openstack-tripleo-common/healthcheck}/common.sh


if ps -ef | grep --quiet nova-metadata; then
  bind_host=$(get_config_val /etc/nova/nova.conf DEFAULT metadata_listen 127.0.0.1)
  bind_port=$(get_config_val /etc/nova/nova.conf DEFAULT metadata_listen_port 8775)
  check_url="http://${bind_host}:${bind_port}/"
else
  check_url=$(get_url_from_vhost /etc/httpd/conf.d/10-nova_api_wsgi.conf)
fi

healthcheck_curl ${check_url}
~~~

Proposed change: https://review.openstack.org/609927. It changes the nova_api healthcheck script to take the metadata endpoint details from the metadata WSGI vhost config instead of from nova.conf.
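To illustrate the idea of deriving the check URL from a vhost file, here is a rough sketch. `url_from_vhost` is a hypothetical stand-in, not the `get_url_from_vhost` implementation in common.sh, and it is not the code from the review. It assumes a plain-HTTP vhost that opens with a `<VirtualHost host:port>` line.

```shell
#!/bin/sh
# Hypothetical stand-in for get_url_from_vhost; NOT the tripleo-common implementation.
# Builds an http URL from the first <VirtualHost host:port> line of an Apache vhost file.
# Assumption: the vhost serves plain HTTP (no SSL handling here).
url_from_vhost() {
  vhost_file="$1"
  sed -n 's|^[[:space:]]*<VirtualHost[[:space:]]\{1,\}\([^>[:space:]]*\)>.*|http://\1/|p' \
    "$vhost_file" | head -n 1
}
```

With a metadata vhost file in place (its path, for example /etc/httpd/conf.d/10-nova_metadata_wsgi.conf, is an assumption by analogy with the nova_api one), the healthcheck would curl the address httpd actually listens on rather than the metadata_listen values from nova.conf.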

Comment 12 Joe H. Rahme 2018-11-23 15:08:24 UTC

Verified the status on both the undercloud and the controller node:

[stack@undercloud-0 ~]$ docker ps | grep nova_metadata
519f3f1040fc        192.168.24.1:8787/rhosp14/openstack-nova-api:2018-11-09.3                    "kolla_start"            7 days ago          Up 7 days (healthy)                       nova_metadata


[heat-admin@controller-0 ~]$ sudo docker ps | grep nova_metadata
bcd148c24426        192.168.24.1:8787/rhosp14/openstack-nova-api:2018-11-09.3                    "kolla_start"            7 days ago          Up 7 days (healthy)                       nova_metadata

Comment 15 errata-xmlrpc 2019-01-11 11:53:51 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:0045