Bug 1439615 - docker service disabled on compute node after upgrading overcloud to containerized services
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-containers
Version: 12.0 (Pike)
Hardware: All
OS: Linux
Priority: urgent
Severity: high
Target Milestone: beta
Target Release: 12.0 (Pike)
Assignee: Jiri Stransky
QA Contact: Marius Cornea
Docs Contact: Andrew Burden
Reported: 2017-04-06 10:24 UTC by Artem Hrechanychenko
Modified: 2017-12-13 19:14 UTC (History)
Last Closed: 2017-12-13 19:14:33 UTC
Links
System ID | Private | Priority | Status | Summary | Last Updated
Launchpad 1680395 | 0 | None | None | None | 2017-04-06 10:24:41 UTC
Red Hat Product Errata RHEA-2017:3457 | 0 | normal | SHIPPED_LIVE | Red Hat OpenStack Platform 12.0 Containers Enhancement Advisory | 2017-12-14 04:45:51 UTC

Description Artem Hrechanychenko 2017-04-06 10:24:41 UTC
Description of problem:
After upgrading the overcloud to containerized services using
overcloud deploy .... -e ~/containers-default-parameters.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/docker.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/major-upgrade-composable-steps-docker.yaml
the docker service on the compute node is in a dead state.

Version-Release number of selected component (if applicable):
OpenStack Pike

How reproducible:


Steps to Reproduce:
1) Deploy undercloud + 1 controller + 1 compute
  1.1) wget https://raw.githubusercontent.com/openstack/tripleo-quickstart/master/quickstart.sh
  1.2) bash quickstart.sh --install-deps
  1.3) bash quickstart.sh --working-dir /var/tmp/foo --teardown all --tags all --release master-tripleo-ci $HOST

2) Grab the overcloud deployment command from overcloud_deploy.log
openstack overcloud deploy --templates /usr/share/openstack-tripleo-heat-templates --libvirt-type qemu --control-flavor oooq_control --compute-flavor oooq_compute --ceph-storage-flavor oooq_ceph --block-storage-flavor oooq_blockstorage --swift-storage-flavor oooq_objectstorage --timeout 90 -e /home/stack/cloud-names.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/net-single-nic-with-vlans.yaml -e /home/stack/network-environment.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/low-memory-usage.yaml -e /home/stack/enable-tls.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/tls-endpoints-public-ip.yaml -e /home/stack/inject-trust-anchor.yaml --validation-warnings-fatal --ntp-server pool.ntp.org

3) on undercloud node:
   3.1) sudo chown :stack /var/run/docker.sock
   3.2) # download container images
   openstack overcloud container image upload --verbose --config-file /usr/share/tripleo-common/contrib/overcloud_containers.yaml
   3.2.1) Check docker images on local docker registry using "docker images"

   3.3) # create an environment file to make the overcloud fetch the images from the undercloud
# (192.168.24.1 is undercloud IP that must be pingable from the overcloud)
   echo > ~/containers-default-parameters.yaml 'parameter_defaults:
    DockerNamespace: 192.168.24.1:8787/tripleoupstream
    DockerNamespaceIsRegistry: true
   '
   3.4) Run the overcloud upgrade to containerized services
   openstack overcloud deploy --templates /usr/share/openstack-tripleo-heat-templates --libvirt-type qemu --control-flavor oooq_control --compute-flavor oooq_compute --ceph-storage-flavor oooq_ceph --block-storage-flavor oooq_blockstorage --swift-storage-flavor oooq_objectstorage --timeout 90 -e /home/stack/cloud-names.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/net-single-nic-with-vlans.yaml -e /home/stack/network-environment.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/low-memory-usage.yaml -e /home/stack/enable-tls.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/tls-endpoints-public-ip.yaml -e /home/stack/inject-trust-anchor.yaml --validation-warnings-fatal --ntp-server pool.ntp.org -e ~/containers-default-parameters.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/docker.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/major-upgrade-composable-steps-docker.yaml

  3.5) Check docker service, images, service containers on compute and controller node
  3.6) Run tempest smoke suite
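
A minimal sketch of the check in step 3.5, assuming the unit is named "docker" and the node uses systemd. The function name docker_service_report is hypothetical and only serves the illustration; it prints the active and enabled state of the unit on one line.

```shell
# Sketch only: report whether the docker unit is running and enabled.
# Assumes a systemd host and a unit called "docker".
docker_service_report() {
    local active enabled
    # Both subcommands exit non-zero for inactive or disabled units, so
    # "|| true" keeps the report going even under "set -e".
    active=$(systemctl is-active docker 2>/dev/null) || true
    enabled=$(systemctl is-enabled docker 2>/dev/null) || true
    echo "docker: active=${active:-unknown} enabled=${enabled:-unknown}"
}

# Example, run on the compute and controller nodes:
#   docker_service_report
#   sudo docker images
#   sudo docker ps
```

On a node showing the problem reported here, the first line would be expected to show a state other than "active".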

Actual results:
Docker service on compute node was dead.

Expected results:
 All services are moved to docker containers and the tempest tests pass.

Additional info:
Undercloud related info

http://pastebin.test.redhat.com/472515

Controller related info
http://pastebin.test.redhat.com/472516

Compute related info
http://pastebin.test.redhat.com/472518

Comment 1 Omri Hochman 2017-07-11 14:42:49 UTC
It might be that the compute nodes are not being upgraded entirely, and therefore the services are not switching from bare metal (BM) to containers.

It might eventually be for the Upgrades DFG to deal with.

Comment 2 Jiri Stransky 2017-07-11 14:44:20 UTC
Indeed the default roles_data excludes computes from the main upgrade step.

https://github.com/openstack/tripleo-heat-templates/blob/24a5fd643919bd3197d1ccc7f70273a9a70511e9/roles_data.yaml#L143

Excluding compute from the main step is probably correct and we should implement the compute part of the upgrade as a separate part of the workflow.
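
To see which roles a given roles_data.yaml leaves out of the main upgrade step, something like the sketch below could be used. It assumes the exclusion is expressed by a per-role "disable_upgrade_deployment: True" key, which is my reading of the linked file and should be checked against the actual template. The function name is hypothetical.

```shell
# Sketch only: print the names of roles flagged with
# "disable_upgrade_deployment: True" (assumed key name) in a roles_data file.
roles_excluded_from_upgrade() {
    awk '
        # Remember the most recent role name ("- name: Compute").
        /^- name:/ { role = $3 }
        # Print it when the role carries the flag set to true.
        /^[[:space:]]+disable_upgrade_deployment:[[:space:]]*[Tt]rue/ { print role }
    ' "$1"
}

# Example:
#   roles_excluded_from_upgrade /usr/share/openstack-tripleo-heat-templates/roles_data.yaml
```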

Comment 3 Martin André 2017-10-20 12:13:18 UTC
Moving this to the Upgrades DFG.

Comment 4 Jiri Stransky 2017-10-25 10:39:00 UTC
This was reported back in April, when I was prototyping the upgrade to containerized deployments. At that time the compute upgrade (via upgrade-non-controller.sh) wasn't run at all, so the computes simply were not upgraded.

I think with the way Upgrades DFG has been progressing on the upgrades implementation, the compute node upgrades should now be working via upgrade-non-controller.sh, including enablement of docker service on computes.

Most likely this doesn't need any action on dev side and we can just retest.
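
For the retest, the compute part of the workflow could be driven roughly as sketched below. The node name overcloud-compute-0 is a placeholder, and both the upgrade_computes wrapper and its DRY_RUN variable are hypothetical helpers added for illustration; only upgrade-non-controller.sh with its --upgrade option is the real tool referred to above.

```shell
# Sketch only: run upgrade-non-controller.sh for each compute node given.
# With DRY_RUN set (hypothetical helper variable), only print the commands.
upgrade_computes() {
    local node
    for node in "$@"; do
        if [ -n "${DRY_RUN:-}" ]; then
            echo "upgrade-non-controller.sh --upgrade $node"
        else
            upgrade-non-controller.sh --upgrade "$node"
        fi
    done
}

# Example, run on the undercloud (node name is a placeholder):
#   upgrade_computes overcloud-compute-0
```

After it finishes, the docker service and containers on each compute node can be checked again.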

Comment 5 Marius Cornea 2017-11-08 16:34:25 UTC
After upgrade:

[root@compute-0 heat-admin]# docker ps
CONTAINER ID        IMAGE                                                                                               COMMAND             CREATED             STATUS                     PORTS               NAMES
b9f97c08326f        rhos-qe-mirror-tlv.usersys.redhat.com:5000/rhosp12/openstack-cron-docker:20171103.1                 "kolla_start"       7 minutes ago       Up 7 minutes                                   logrotate_crond
14d575d1d464        rhos-qe-mirror-tlv.usersys.redhat.com:5000/rhosp12/openstack-nova-compute-docker:20171103.1         "kolla_start"       7 minutes ago       Up 7 minutes (unhealthy)                       nova_migration_target
2792baedf241        rhos-qe-mirror-tlv.usersys.redhat.com:5000/rhosp12/openstack-nova-compute-docker:20171103.1         "kolla_start"       7 minutes ago       Up 7 minutes (healthy)                         nova_compute
bfb10f54c32a        rhos-qe-mirror-tlv.usersys.redhat.com:5000/rhosp12/openstack-nova-libvirt-docker:20171103.1         "kolla_start"       10 minutes ago      Up 10 minutes                                  nova_libvirt
d35b17844688        rhos-qe-mirror-tlv.usersys.redhat.com:5000/rhosp12/openstack-nova-libvirt-docker:20171103.1         "kolla_start"       10 minutes ago      Up 10 minutes                                  nova_virtlogd
7188ed13743e        rhos-qe-mirror-tlv.usersys.redhat.com:5000/rhosp12/openstack-ceilometer-compute-docker:20171103.1   "kolla_start"       33 minutes ago      Up 32 minutes                                  ceilometer_agent_compute
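
Note that the listing above shows one container, nova_migration_target, with an (unhealthy) status. A small sketch for picking such containers out of "docker ps" output follows. It assumes the default table layout, where NAMES is the last column, and the function name is hypothetical.

```shell
# Sketch only: read "docker ps" table output on stdin and print the names
# of containers whose STATUS column contains "(unhealthy)".
unhealthy_containers() {
    # NAMES is the last whitespace-separated field in the default layout.
    awk '/\(unhealthy\)/ { print $NF }'
}

# Example:
#   sudo docker ps | unhealthy_containers
```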

Comment 10 errata-xmlrpc 2017-12-13 19:14:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:3457

