Bug 1568234

Summary: FFU: ceph upgrade fails with: Error response from daemon: Container $id is not running
Product: Red Hat OpenStack
Reporter: Marius Cornea <mcornea>
Component: rhosp-director
Assignee: Angus Thomas <athomas>
Status: CLOSED DUPLICATE
QA Contact: Amit Ugol <augol>
Severity: urgent
Docs Contact:
Priority: unspecified
Version: 13.0 (Queens)
CC: dbecker, johfulto, mburns, morazi, rhel-osp-director-maint
Target Milestone: beta   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Last Closed: 2018-04-17 13:30:21 UTC Type: Bug
Attachments:
Description Flags
ceph-install-workflow.log none

Description Marius Cornea 2018-04-17 02:53:52 UTC
Description of problem:
FFU: ceph upgrade fails with:

2018-04-16 21:34:45,589 p=17752 u=mistral |  failed: [192.168.24.19] (item={'caps': {'mds': u'', 'osd': u'allow class-read object_prefix rbd_children, allow rwx pool=volumes, allow rwx pool=backups, allow rwx pool=vms, allow rwx pool=images, allow rwx pool=metrics', 'mon': u'allow r', 'mgr': u'allow *'}, 'mode': u'0600', 'key': u'AQBDB9VaAAAAABAAbj8ZspoUOJqC/b5Xxef46Q==', 'name': u'client.openstack'}) => {"changed": true, "cmd": ["docker", "exec", "ceph-create-keys", "ceph-authtool", "--create-keyring", "/etc/ceph/ceph.client.openstack.keyring", "--name", "client.openstack", "--add-key", "AQBDB9VaAAAAABAAbj8ZspoUOJqC/b5Xxef46Q==", "--cap", "mds", "", "--cap", "osd", "allow class-read object_prefix rbd_children, allow rwx pool=volumes, allow rwx pool=backups, allow rwx pool=vms, allow rwx pool=images, allow rwx pool=metrics", "--cap", "mon", "allow r", "--cap", "mgr", "allow *"], "delta": "0:04:59.711107", "end": "2018-04-17 01:34:45.556518", "item": {"caps": {"mds": "", "mgr": "allow *", "mon": "allow r", "osd": "allow class-read object_prefix rbd_children, allow rwx pool=volumes, allow rwx pool=backups, allow rwx pool=vms, allow rwx pool=images, allow rwx pool=metrics"}, "key": "AQBDB9VaAAAAABAAbj8ZspoUOJqC/b5Xxef46Q==", "mode": "0600", "name": "client.openstack"}, "msg": "non-zero return code", "rc": 126, "start": "2018-04-17 01:29:45.845411", "stderr": "", "stderr_lines": [], "stdout": "rpc error: code = 2 desc = oci runtime error: exec failed: container \"c4195dd3e10d517202ac1816440dde6ef9ea06265c8a12425738373cf9500c43\" does not exist", "stdout_lines": ["rpc error: code = 2 desc = oci runtime error: exec failed: container \"c4195dd3e10d517202ac1816440dde6ef9ea06265c8a12425738373cf9500c43\" does not exist"]}
2018-04-16 21:34:45,908 p=17752 u=mistral |  failed: [192.168.24.19] (item={'caps': {'mds': u'allow *', 'osd': u'allow rw', 'mon': u'allow r, allow command \\\\\\"auth del\\\\\\", allow command \\\\\\"auth caps\\\\\\", allow command \\\\\\"auth get\\\\\\", allow command \\\\\\"auth get-or-create\\\\\\"', 'mgr': u'allow *'}, 'name': u'client.manila', 'key': u'AQBTJdVaAAAAABAAc795oVkA+TVmDwIKPPGEMQ==', 'mode': u'0600'}) => {"changed": true, "cmd": ["docker", "exec", "ceph-create-keys", "ceph-authtool", "--create-keyring", "/etc/ceph/ceph.client.manila.keyring", "--name", "client.manila", "--add-key", "AQBTJdVaAAAAABAAc795oVkA+TVmDwIKPPGEMQ==", "--cap", "mds", "allow *", "--cap", "osd", "allow rw", "--cap", "mon", "allow r, allow command \\\\\\\"auth del\\\\\\\", allow command \\\\\\\"auth caps\\\\\\\", allow command \\\\\\\"auth get\\\\\\\", allow command \\\\\\\"auth get-or-create\\\\\\\"", "--cap", "mgr", "allow *"], "delta": "0:00:00.058573", "end": "2018-04-17 01:34:45.878909", "item": {"caps": {"mds": "allow *", "mgr": "allow *", "mon": "allow r, allow command \\\\\\\"auth del\\\\\\\", allow command \\\\\\\"auth caps\\\\\\\", allow command \\\\\\\"auth get\\\\\\\", allow command \\\\\\\"auth get-or-create\\\\\\\"", "osd": "allow rw"}, "key": "AQBTJdVaAAAAABAAc795oVkA+TVmDwIKPPGEMQ==", "mode": "0600", "name": "client.manila"}, "msg": "non-zero return code", "rc": 1, "start": "2018-04-17 01:34:45.820336", "stderr": "Error response from daemon: Container c4195dd3e10d517202ac1816440dde6ef9ea06265c8a12425738373cf9500c43 is not running", "stderr_lines": ["Error response from daemon: Container c4195dd3e10d517202ac1816440dde6ef9ea06265c8a12425738373cf9500c43 is not running"], "stdout": "", "stdout_lines": []}
2018-04-16 21:34:46,345 p=17752 u=mistral |  failed: [192.168.24.19] (item={'caps': {'mds': u'', 'osd': u'allow rwx', 'mon': u'allow rw', 'mgr': u'allow *'}, 'mode': u'0600', 'key': u'AQBDB9VaAAAAABAAqIZ4gnzHPSMX8Gza0I60kg==', 'name': u'client.radosgw'}) => {"changed": true, "cmd": ["docker", "exec", "ceph-create-keys", "ceph-authtool", "--create-keyring", "/etc/ceph/ceph.client.radosgw.keyring", "--name", "client.radosgw", "--add-key", "AQBDB9VaAAAAABAAqIZ4gnzHPSMX8Gza0I60kg==", "--cap", "mds", "", "--cap", "osd", "allow rwx", "--cap", "mon", "allow rw", "--cap", "mgr", "allow *"], "delta": "0:00:00.058600", "end": "2018-04-17 01:34:46.318686", "item": {"caps": {"mds": "", "mgr": "allow *", "mon": "allow rw", "osd": "allow rwx"}, "key": "AQBDB9VaAAAAABAAqIZ4gnzHPSMX8Gza0I60kg==", "mode": "0600", "name": "client.radosgw"}, "msg": "non-zero return code", "rc": 1, "start": "2018-04-17 01:34:46.260086", "stderr": "Error response from daemon: Container c4195dd3e10d517202ac1816440dde6ef9ea06265c8a12425738373cf9500c43 is not running", "stderr_lines": ["Error response from daemon: Container c4195dd3e10d517202ac1816440dde6ef9ea06265c8a12425738373cf9500c43 is not running"], "stdout": "", "stdout_lines": []}
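
The failures above all reference the same container ID, first as "does not exist" (rc 126) and then as "is not running" (rc 1). A minimal sketch for summarizing such failures from an Ansible log is shown below. The function name summarize_container_failures is hypothetical and not part of ceph-ansible or TripleO; the sketch assumes the "failed: [host]" line format shown above and that grep supports the -o option (GNU and BSD grep do).

```shell
# Hypothetical helper: count container-related task failures in an Ansible log.
# Prints one line per distinct (short container ID, reason) pair, prefixed
# with the number of failed tasks that reported it.
summarize_container_failures() {
    grep 'failed: \[' "$1" | while IFS= read -r line; do
        # Classify the failure by the error text seen in this bug report.
        case "$line" in
            *"is not running"*) reason="not running" ;;
            *"does not exist"*) reason="does not exist" ;;
            *) continue ;;
        esac
        # Take the first 64-hex-digit container ID on the line, shortened to 12.
        id=$(printf '%s\n' "$line" | grep -oE '[0-9a-f]{64}' | head -n 1 | cut -c1-12)
        printf '%s %s\n' "$id" "$reason"
    done | sort | uniq -c
}
```

Usage, assuming the log has been copied locally: summarize_container_failures ceph-install-workflow.log. For the three log lines quoted above, this would report one "does not exist" failure and two "not running" failures for container c4195dd3e10d.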

Version-Release number of selected component (if applicable):
ceph-ansible-3.1.0-0.1.beta6.el7cp.noarch.rpm 

How reproducible:
100%

Steps to Reproduce:
1. Deploy OSP10 with 3 controllers + 2 computes + 3 ceph nodes
2. Upgrade to OSP13 by the FFU path
3. Run the ceph upgrade step:
#!/bin/bash
openstack overcloud deploy \
--timeout 100 \
--templates /usr/share/openstack-tripleo-heat-templates \
--stack overcloud \
--libvirt-type kvm \
--ntp-server clock.redhat.com \
--control-scale 3 \
--control-flavor controller \
--compute-scale 2 \
--compute-flavor compute \
--ceph-storage-scale 3 \
--ceph-storage-flavor ceph \
-e /usr/share/openstack-tripleo-heat-templates/environments/storage-environment.yaml \
-e /home/stack/virt/internal.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
-e /home/stack/virt/network/network-environment.yaml \
-e /home/stack/virt/enable-tls.yaml \
-e /home/stack/virt/inject-trust-anchor.yaml \
-e /home/stack/virt/public_vip.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/tls-endpoints-public-ip.yaml \
-e /home/stack/virt/hostnames.yml \
-e /home/stack/virt/debug.yaml \
-e /home/stack/virt/docker-images.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/updates/update-from-ceph-newton.yaml \
-e /home/stack/ceph-ansible-env.yaml

Actual results:
The ceph-ansible playbooks fail. Attaching /var/log/mistral/ceph-install-workflow.log

Expected results:
The ceph upgrade step completes without errors.

Additional info:

Comment 2 Marius Cornea 2018-04-17 02:56:20 UTC
Created attachment 1422873
ceph-install-workflow.log

Comment 3 John Fulton 2018-04-17 13:30:21 UTC

*** This bug has been marked as a duplicate of bug 1568157 ***