Bug 1496485 - OSP11 -> OSP12 upgrade: heat-api-cloudwatch service running after upgrade
Summary: OSP11 -> OSP12 upgrade: heat-api-cloudwatch service running after upgrade
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 12.0 (Pike)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: beta
Target Release: 12.0 (Pike)
Assignee: Marios Andreou
QA Contact: Marius Cornea
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2017-09-27 14:37 UTC by Jose Luis Franco
Modified: 2018-02-05 19:15 UTC (History)

Fixed In Version: openstack-tripleo-heat-templates-7.0.3-0.20171023134947.8da5e1f.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-12-13 22:11:04 UTC
Target Upstream Version:


Links
System                  ID              Private  Priority  Status        Summary                                               Last Updated
Launchpad               1713531         0        None      None          None                                                  2017-09-27 14:37:06 UTC
Launchpad               1720865         0        None      None          None                                                  2017-10-06 12:19:30 UTC
OpenStack gerrit        511155          0        None      None          None                                                  2017-10-11 07:16:11 UTC
Red Hat Product Errata  RHEA-2017:3462  0        normal    SHIPPED_LIVE  Red Hat OpenStack Platform 12.0 Enhancement Advisory  2018-02-16 01:43:25 UTC

Description Jose Luis Franco 2017-09-27 14:37:07 UTC
Description of problem:
As described in https://bugs.launchpad.net/tripleo/+bug/1713531, when upgrading from Ocata to Pike (containerized), the upgraded Pike overcloud ends up with the heat-api-cloudwatch process running under httpd instead of being containerized.

Version-Release number of selected component (if applicable):
(undercloud) [stack@undercloud ~]$ rpm -qa | grep heat-templates
openstack-tripleo-heat-templates-7.0.1-0.20170927010252.a58332e.el7.centos.noarch


How reproducible:
100%

Steps to Reproduce:
1. Deploy an overcloud:

openstack overcloud deploy   --libvirt-type qemu   --ntp-server pool.ntp.org   --templates /home/stack/tht-ocata/   -e /home/stack/tht-ocata/overcloud-resource-registry-puppet.yaml

2. Check that the heat services run under systemd and that none is served by httpd:

[heat-admin@overcloud-controller-0 ~]$ sudo systemctl list-units | grep heat
  session-5.scope                                                   loaded active running   Session 5 of user heat-admin
  openstack-heat-api-cfn.service                                    loaded active running   Openstack Heat CFN-compatible API Service
  openstack-heat-api-cloudwatch.service                             loaded active running   OpenStack Heat CloudWatch API Service
  openstack-heat-api.service                                        loaded active running   OpenStack Heat API Service
  openstack-heat-engine.service                                     loaded active running   Openstack Heat Engine Service
  user-1000.slice                                                   loaded active active    User Slice of heat-admin

HTTP Heat Services: None
[heat-admin@overcloud-controller-0 ~]$ sudo httpd -t -D DUMP_VHOSTS | grep heat
[heat-admin@overcloud-controller-0 ~]$

3. Upgrade to Pike (containerized):
3.1. Download master tht to tht-master

3.2. Specify docker registry in docker_registry.yaml file:
cat > docker_registry.yaml <<EOF
parameter_defaults:
  DockerNamespace: 192.168.24.1:8787/tripleoupstream
  DockerNamespaceIsRegistry: true
EOF

3.3. Download container images:
openstack overcloud container image upload --config-file /usr/share/openstack-tripleo-common/container-images/overcloud_containers.yaml

3.4. Prepare the container image definition yaml file:
openstack overcloud container image prepare \
--namespace tripleoupstream \
--tag latest \
--env-file docker-centos-tripleoupstream.yaml

3.5. Upgrade via command:
export THT=/home/stack/tht-master
(undercloud) [stack@undercloud ~]$ openstack overcloud deploy --templates $THT \
 --libvirt-type qemu \
 --ntp-server pool.ntp.org \
 -e $THT/environments/docker.yaml \
 -e $THT/environments/major-upgrade-composable-steps-docker.yaml \
 -e docker-centos-tripleoupstream.yaml \
 -e docker_registry.yaml \
 -e upgrade_repos.yaml

4. Service heat_api_cloudwatch is running under apache:

[heat-admin@overcloud-controller-0 ~]$ sudo httpd -t -D DUMP_VHOSTS | grep heat
192.168.24.7:8003      overcloud-controller-0.internalapi.localdomain (/etc/httpd/conf.d/10-heat_api_cloudwatch_wsgi.conf:6)

[heat-admin@overcloud-controller-0 ~]$ sudo systemctl status httpd
● httpd.service - The Apache HTTP Server
   Loaded: loaded (/usr/lib/systemd/system/httpd.service; enabled; vendor preset: disabled)
  Drop-In: /usr/lib/systemd/system/httpd.service.d
           └─openstack-dashboard.conf
   Active: active (running) since Tue 2017-09-26 13:46:07 UTC; 1h 53min ago
     Docs: man:httpd(8)
           man:apachectl(8)
 Main PID: 115195 (httpd)
   Status: "Total requests: 0; Current requests/sec: 0; Current traffic:   0 B/sec"
   Memory: 355.0M
   CGroup: /system.slice/httpd.service
           ├─115195 /usr/sbin/httpd -DFOREGROUND
           ├─115243 cinder_wsgi     -DFOREGROUND
           ├─115244 cinder_wsgi     -DFOREGROUND
           ├─115245 heat_api_cloudw -DFOREGROUND   <<<<<<<<<<
           ├─115247 /usr/sbin/httpd -DFOREGROUND
           ├─115248 /usr/sbin/httpd -DFOREGROUND
           ├─115249 /usr/sbin/httpd -DFOREGROUND
           ├─115250 /usr/sbin/httpd -DFOREGROUND
           ├─115251 /usr/sbin/httpd -DFOREGROUND
           ├─115253 /usr/sbin/httpd -DFOREGROUND
           ├─115263 /usr/sbin/httpd -DFOREGROUND
           └─115264 /usr/sbin/httpd -DFOREGROUND

5. But no heat service is running under systemd:
[heat-admin@overcloud-controller-0 ~]$ sudo systemctl list-units | grep heat
  session-27.scope                                                              loaded active running   Session 27 of user heat-admin
  user-1000.slice                                                               loaded active active    User Slice of heat-admin

6. All heat services are running under containers except heat-api-cloudwatch:

[heat-admin@overcloud-controller-0 ~]$ sudo docker ps | grep heat
085051b5dfbb        tripleoupstream/centos-binary-heat-api:latest                    "kolla_start"            About an hour ago   Up About an hour (healthy)                          heat_api_cron
39e2b83a8bfc        tripleoupstream/centos-binary-heat-api-cfn:latest                "kolla_start"            About an hour ago   Up About an hour (healthy)                          heat_api_cfn
b80786761582        tripleoupstream/centos-binary-heat-engine:latest                 "kolla_start"            About an hour ago   Up About an hour (healthy)                          heat_engine
e9b2ddafcd7b        tripleoupstream/centos-binary-heat-api:latest                    "kolla_start"            About an hour ago   Up About an hour (healthy)                          heat_api

Actual results:


Expected results:


Additional info:

Comment 1 Jose Luis Franco 2017-09-27 15:36:45 UTC
These are the puppet logs from the moment heat-api-cloudwatch is re-configured as an httpd service during the upgrade:
http://pastebin.test.redhat.com/519887

Sep 26 13:53:23 localhost os-collect-config: "Notice: /Stage[main]/Heat::Api_cloudwatch/Service[heat-api-cloudwatch]: Triggered 'refresh' from 1 events",
Sep 26 13:53:23 localhost os-collect-config: "Notice: /Stage[main]/Heat::Deps/Anchor[heat::service::end]: Triggered 'refresh' from 1 events",
Sep 26 13:53:23 localhost os-collect-config: "Notice: /Stage[main]/Tripleo::Profile::Base::Kernel/Kmod::Load[nf_conntrack_proto_sctp]/Exec[modprobe nf_conntrack_proto_sctp]/returns: executed successfully",
Sep 26 13:53:23 localhost os-collect-config: "Notice: /Stage[main]/Apache/Apache::Vhost[default]/Concat[15-default.conf]/File[/etc/httpd/conf.d/15-default.conf]/ensure: removed",
Sep 26 13:53:23 localhost os-collect-config: "Notice: /Stage[main]/Heat::Wsgi::Apache_api_cloudwatch/Heat::Wsgi::Apache[api_cloudwatch]/Openstacklib::Wsgi::Apache[heat_api_cloudwatch_wsgi]/File[/var/www/cgi-bin/heat]/ensure: created",
Sep 26 13:53:23 localhost os-collect-config: "Notice: /Stage[main]/Heat::Wsgi::Apache_api_cloudwatch/Heat::Wsgi::Apache[api_cloudwatch]/Openstacklib::Wsgi::Apache[heat_api_cloudwatch_wsgi]/File[heat_api_cloudwatch_wsgi]/ensure: defined content as '{md5}2eb19266988f424046d53acfbcf01c2c'",
Sep 26 13:53:23 localhost os-collect-config: "Notice: /Stage[main]/Heat::Wsgi::Apache_api_cloudwatch/Heat::Wsgi::Apache[api_cloudwatch]/Openstacklib::Wsgi::Apache[heat_api_cloudwatch_wsgi]/Apache::Vhost[heat_api_cloudwatch_wsgi]/Concat[10-heat_api_cloudwatch_wsgi.conf]/File[/etc/httpd/conf.d/10-heat_api_cloudwatch_wsgi.conf]/ensure: defined content as '{md5}bd90943ed4380f5332bf94387bd4fe06'",
Sep 26 13:53:23 localhost os-collect-config: "Notice: /Stage[main]/Apache::Service/Service[httpd]/ensure: ensure changed 'stopped' to 'running'",
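
To pull only the cloudwatch-related puppet notices out of a longer log, a filter along these lines may help. This is a sketch under assumptions: `cloudwatch_puppet_events` is a hypothetical helper name, the class names are the ones visible in the excerpt above, and reading the log through `journalctl -t os-collect-config` assumes the messages are tagged with that syslog identifier on the node.

```shell
# Keep only puppet notices that touch the cloudwatch API classes.
# Reads log text on stdin and prints the matching lines.
cloudwatch_puppet_events() {
    grep -E 'Heat::(Api_cloudwatch|Wsgi::Apache_api_cloudwatch)'
}

# Possible usage on a controller (assumes the journal holds these messages):
#   sudo journalctl -t os-collect-config | cloudwatch_puppet_events
```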

Also, it might be worth noting that there is no heat template for the heat-api-cloudwatch service under /tripleo-heat-templates/docker/services:
https://github.com/openstack/tripleo-heat-templates/tree/master/docker/services

Comment 2 Marios Andreou 2017-09-28 12:12:31 UTC
Hi Jose, the behaviour you're describing in comment #0 sounds right and is what the tripleo-heat-templates currently specify: the resource registry points to puppet/services/heat-api-cloudwatch.yaml (https://github.com/openstack/tripleo-heat-templates/blob/e1a9638732290c247e5dac10392bc8702b531981/overcloud-resource-registry-puppet.j2.yaml#L136) and HeatApiCloudwatch is included in the default controller role services.

As you're pointing out in comment #1 there is no /docker/services/heat-api-cloudwatch.yaml i.e. the heat-api-cloudwatch.yaml service is not containerized.

I see that this service was converted to run under httpd with https://review.openstack.org/#/c/440977/4/puppet/services/heat-api-cloudwatch.yaml@78 on 10 March. That change is not in Ocata, so indeed, as per your comment #0, you have systemd openstack-heat-api-cloudwatch.service loaded active running before the upgrade; then on Pike that service is stopped and disabled here https://github.com/openstack/tripleo-heat-templates/blob/e1a9638732290c247e5dac10392bc8702b531981/puppet/services/heat-api-cloudwatch.yaml#L138-L141. Instead heat-api-cloudwatch is being served by httpd.

So to be clear, this BZ is for "why don't we have a containerized heat-api-cloudwatch, let's do it", right? If so we should reach out to the deployment/containers team and see what they think, whether it is feasible, how much work it is, etc.

Holding off on marking triaged in case we go to secondary not primary in tracking.

Comment 3 Marios Andreou 2017-09-29 12:26:45 UTC
we discussed this on upgrades scrum yesterday - Jose is going to find out more about why this service was not containerized as per comment #2. 

If it was 'just forgotten' then we can use this BZ to track the effort. 

If there is some legitimate reason we can either close this bz or track whatever the longer term effort is to overcome the problem, if that is possible. 

We can mark triaged once we get the answer.

Comment 4 Jose Luis Franco 2017-10-02 13:17:01 UTC
As discussed in today's scrum meeting there is not a clear idea on how to proceed with the service upon upgrade. The service seems to be deprecated: "This feature will be deprecated or removed during the Havana cycle as we move to using Ceilometer as a metric/alarm service instead [1]", however it is deployed by default in the roles data. 

Also, if it is deprecated, what should be the right way to proceed with it during upgrade? Should the configuration be migrated to the corresponding service (Ceilometer) and stop heat-api-cloudwatch? Or, should it be stopped? (currently the service goes from running as a systemd service to an apache service when upgrading from Ocata to Pike). Or, is the current behavior the right one?

Can someone from the CloudApps or Telemetry DFGs answer these questions?


[1] https://wiki.openstack.org/wiki/Heat/Using-CloudWatch

Comment 5 Mike Orazi 2017-10-05 15:44:03 UTC
Cleaning up the needinfos a bit.

Comment 6 Mike Orazi 2017-10-05 15:46:55 UTC
It looks like aschultz has a patch to remove the services from being deployed, but there should be a corresponding upgrade task to clean up as well.

Comment 7 Alex Schultz 2017-10-05 16:06:07 UTC
The patch to remove the services from the roles is https://review.openstack.org/#/c/508964/. This does not clean up the existing running services, which would need to be done via the upgrade processes. For that I'd have to defer to the Upgrades & CloudApp DFGs on those bits.
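
To illustrate what cleaning up the leftover httpd-served instance could involve, here is a hedged sketch. It is not the upgrade task that shipped with the fix; `remove_cloudwatch_vhost` is a hypothetical helper, and the only path it relies on is the vhost file reported by `httpd -t -D DUMP_VHOSTS` in the description.

```shell
# Hypothetical manual cleanup, not the shipped upgrade task.
# $1: httpd conf.d directory (defaults to the path seen in the DUMP_VHOSTS output).
remove_cloudwatch_vhost() {
    conf_dir="${1:-/etc/httpd/conf.d}"
    vhost="$conf_dir/10-heat_api_cloudwatch_wsgi.conf"
    if [ -f "$vhost" ]; then
        # Removing the vhost file alone does not stop the running worker;
        # httpd still has to be reloaded afterwards.
        rm -f -- "$vhost" && echo "removed $vhost; reload httpd to apply"
    else
        echo "nothing to do: $vhost not found"
    fi
}
```

On a controller this would be run as root and followed by `systemctl reload httpd`; passing a directory argument allows a dry run against a copy of the configuration first.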

Comment 8 Marios Andreou 2017-10-06 12:40:55 UTC
Thanks Alex. I'm going to post something to remove the cloudwatch API by default (but allow the operator to keep it if wanted). We can track both of those things here, so I added the other Launchpad bug to the trackers too.

Comment 9 Marios Andreou 2017-10-19 09:23:41 UTC
Moving this to POST... there are two reviews in trackers... the one at https://review.openstack.org/#/c/508964/ removes the service from the roles files on master. We didn't backport that one to Pike and I don't think we need to/should - adding needinfo Alex, what do you think? If you agree we can remove it from trackers so it's clearer for the release delivery folks what needs to go into the package build.

Comment 10 Alex Schultz 2017-10-19 14:04:16 UTC
Yes I think that makes sense. For pike we want to clean it up so we need to leave it in the roles so it gets properly disabled.

Comment 13 Marius Cornea 2017-11-08 16:17:21 UTC
After upgrade:

[root@controller-0 heat-admin]# docker ps | grep heat
171049f0028a        rhos-qe-mirror-tlv.usersys.redhat.com:5000/rhosp12/openstack-heat-api-docker:20171103.1                  "kolla_start"            37 minutes ago      Up 37 minutes (healthy)                       heat_api_cron
54d6481ca142        rhos-qe-mirror-tlv.usersys.redhat.com:5000/rhosp12/openstack-heat-api-cfn-docker:20171103.1              "kolla_start"            37 minutes ago      Up 37 minutes (healthy)                       heat_api_cfn
529c66916005        rhos-qe-mirror-tlv.usersys.redhat.com:5000/rhosp12/openstack-heat-engine-docker:20171103.1               "kolla_start"            37 minutes ago      Up 37 minutes (healthy)                       heat_engine
344195defd4f        rhos-qe-mirror-tlv.usersys.redhat.com:5000/rhosp12/openstack-heat-api-docker:20171103.1                  "kolla_start"            37 minutes ago      Up 37 minutes (healthy)                       heat_api
[root@controller-0 heat-admin]# sudo httpd -t -D DUMP_VHOSTS | grep heat
[root@controller-0 heat-admin]#

Comment 16 errata-xmlrpc 2017-12-13 22:11:04 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:3462

