Bug 1613532

Summary: [OSP13z2] Manila-share container fails to start on controller, missing kolla src-ceph files
Product: Red Hat OpenStack
Component: openstack-tripleo-heat-templates
Version: 13.0 (Queens)
Status: CLOSED DUPLICATE
Severity: urgent
Priority: high
Target Milestone: z2
Hardware: Unspecified
OS: Unspecified
Type: Bug
Reporter: Vince Green <vgreen>
Assignee: Goutham Pacha Ravi <gouthamr>
QA Contact: Vince Green <vgreen>
CC: gouthamr, mburns, tbarron
Last Closed: 2018-08-07 21:48:01 UTC

Description Vince Green 2018-08-07 18:50:48 UTC
Description of problem:
After deploying the OSP13z2 puddle candidate 2018-08-03.3 from Jenkins, the manila-share docker container fails to start on any controller. Snippet of the failure:

Aug  6 10:40:28 controller-0 docker(openstack-manila-share-docker-0)[127388]: INFO: running container openstack-manila-share-docker-0 for the first time
Aug  6 10:40:28 controller-0 systemd: Started libcontainer container 1e1c4ed88dcbd455c0af7bb975fe9a09f19df574d7191e59f154e90fefd97d21.
Aug  6 10:40:28 controller-0 systemd: Starting libcontainer container 1e1c4ed88dcbd455c0af7bb975fe9a09f19df574d7191e59f154e90fefd97d21.
Aug  6 10:40:28 controller-0 haproxy[10]: Connect from 192.168.24.14:34846 to 192.168.24.14:35357 (keystone_admin/HTTP)
Aug  6 10:40:28 controller-0 journal: + sudo -E kolla_set_configs
Aug  6 10:40:29 controller-0 journal: INFO:__main__:Loading config file at /var/lib/kolla/config_files/config.json
Aug  6 10:40:29 controller-0 journal: INFO:__main__:Validating config file
Aug  6 10:40:29 controller-0 journal: INFO:__main__:Kolla config strategy set to: COPY_ALWAYS
Aug  6 10:40:29 controller-0 journal: INFO:__main__:Copying service configuration files
Aug  6 10:40:29 controller-0 journal: INFO:__main__:Deleting /etc/manila/manila.conf
Aug  6 10:40:29 controller-0 journal: INFO:__main__:Copying /var/lib/kolla/config_files/src/etc/manila/manila.conf to /etc/manila/manila.conf
Aug  6 10:40:29 controller-0 journal: INFO:__main__:Copying /var/lib/kolla/config_files/src/etc/my.cnf.d/tripleo.cnf to /etc/my.cnf.d/tripleo.cnf
Aug  6 10:40:29 controller-0 journal: ERROR:__main__:MissingRequiredSource: /var/lib/kolla/config_files/src-ceph/ file is not found
Aug  6 10:40:29 controller-0 docker(openstack-manila-share-docker-0)[127388]: INFO: monitor cmd exit code = 1
Aug  6 10:40:29 controller-0 docker(openstack-manila-share-docker-0)[127388]: INFO: stdout/stderr: nsenter: cannot open /proc/127513/ns/ipc: No such file or directory
Aug  6 10:40:29 controller-0 dockerd-current: time="2018-08-06T14:40:29.13570903Z" level=error msg="containerd: deleting container" error="exit status 1: \"container 1e1c4ed88dcbd455c0af7bb975fe9a09f19df574d7191e59f154e90fefd97d21 does not exist\\none or more of the container deletions failed\\n\""
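
The MissingRequiredSource error above is raised when kolla_set_configs cannot find a required source path listed in config.json. The following is a minimal sketch, not kolla's own code, of how such a check could be reproduced by hand before starting the container. It assumes the config.json layout has a "config_files" list whose entries carry "source" and an optional boolean "optional" key; the function name missing_required_sources is hypothetical.

```python
import glob
import json
import os
import tempfile


def missing_required_sources(config_path):
    """Return required 'source' entries that match nothing on disk.

    Assumes a kolla-style config.json with a "config_files" list.
    """
    with open(config_path) as f:
        config = json.load(f)
    missing = []
    for entry in config.get("config_files", []):
        source = entry["source"]
        # Sources may be glob patterns; an empty match means nothing to copy.
        if not glob.glob(source) and not entry.get("optional", False):
            missing.append(source)
    return missing


if __name__ == "__main__":
    # Self-contained demo: one source directory exists, the other does not.
    with tempfile.TemporaryDirectory() as tmp:
        present = os.path.join(tmp, "src")
        os.mkdir(present)
        absent = os.path.join(tmp, "src-ceph") + "/"
        config_path = os.path.join(tmp, "config.json")
        with open(config_path, "w") as f:
            json.dump({"config_files": [
                {"source": present, "dest": "/"},
                {"source": absent, "dest": "/etc/ceph/"},
            ]}, f)
        for source in missing_required_sources(config_path):
            print("missing required source: " + source)
```

On a controller, the same function could be pointed at the config.json bind-mounted into the container, with the caveat that the source paths are as seen from inside the container, not from the host.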

Version-Release number of selected component (if applicable):
puppet-manila-12.4.0-2.el7ost.noarch
python2-manilaclient-1.21.0-1.el7ost.noarch
openstack-manila-6.0.1-2.el7ost.noarch
python-manila-6.0.1-2.el7ost.noarch

How reproducible:
100% on OSP13 puddle 2018-08-03.3

Steps to Reproduce:
1. Deploy OSP13 puddle 2018-08-03.3 with Jenkins, using parameters similar to:
https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/OSPD-Customized-Deployment-virt/5614/parameters/
2. Log in to the controllers and run "sudo docker ps --all | grep manila" to confirm that openstack-manila-share-docker-0 has not started.

Actual results:
Docker manila-share container doesn't start on any controller.

Expected results:
Docker manila-share container starts on a controller.

Additional info:
[heat-admin@controller-0 ~]$ sudo pcs status
Cluster name: tripleo_cluster
Stack: corosync
Current DC: controller-1 (version 1.1.18-11.el7_5.3-2b07d5c5a9) - partition with quorum
Last updated: Tue Aug  7 17:41:35 2018
Last change: Tue Aug  7 16:26:31 2018 by root via cibadmin on controller-0

12 nodes configured
38 resources configured

Online: [ controller-0 controller-1 controller-2 ]
GuestOnline: [ galera-bundle-0@controller-0 galera-bundle-1@controller-1 galera-bundle-2@controller-2 rabbitmq-bundle-0@controller-0 rabbitmq-bundle-1@controller-1 rabbitmq-bundle-2@controller-2 redis-bundle-0@controller-0 redis-bundle-1@controller-1 redis-bundle-2@controller-2 ]

Full list of resources:

 Docker container set: rabbitmq-bundle [192.168.24.1:8787/rhosp13/openstack-rabbitmq:pcmklatest]
   rabbitmq-bundle-0	(ocf::heartbeat:rabbitmq-cluster):	Started controller-0
   rabbitmq-bundle-1	(ocf::heartbeat:rabbitmq-cluster):	Started controller-1
   rabbitmq-bundle-2	(ocf::heartbeat:rabbitmq-cluster):	Started controller-2
 Docker container set: galera-bundle [192.168.24.1:8787/rhosp13/openstack-mariadb:pcmklatest]
   galera-bundle-0	(ocf::heartbeat:galera):	Master controller-0
   galera-bundle-1	(ocf::heartbeat:galera):	Master controller-1
   galera-bundle-2	(ocf::heartbeat:galera):	Master controller-2
 Docker container set: redis-bundle [192.168.24.1:8787/rhosp13/openstack-redis:pcmklatest]
   redis-bundle-0	(ocf::heartbeat:redis):	Master controller-0
   redis-bundle-1	(ocf::heartbeat:redis):	Slave controller-1
   redis-bundle-2	(ocf::heartbeat:redis):	Slave controller-2
 ip-192.168.24.7	(ocf::heartbeat:IPaddr2):	Started controller-0
 ip-10.0.0.108	(ocf::heartbeat:IPaddr2):	Started controller-1
 ip-172.17.1.12	(ocf::heartbeat:IPaddr2):	Started controller-2
 ip-172.17.1.10	(ocf::heartbeat:IPaddr2):	Started controller-0
 ip-172.17.3.13	(ocf::heartbeat:IPaddr2):	Started controller-1
 ip-172.17.4.10	(ocf::heartbeat:IPaddr2):	Started controller-2
 Docker container set: haproxy-bundle [192.168.24.1:8787/rhosp13/openstack-haproxy:pcmklatest]
   haproxy-bundle-docker-0	(ocf::heartbeat:docker):	Started controller-0
   haproxy-bundle-docker-1	(ocf::heartbeat:docker):	Started controller-1
   haproxy-bundle-docker-2	(ocf::heartbeat:docker):	Started controller-2
 Docker container: openstack-cinder-volume [192.168.24.1:8787/rhosp13/openstack-cinder-volume:pcmklatest]
   openstack-cinder-volume-docker-0	(ocf::heartbeat:docker):	Started controller-0
 Docker container: openstack-manila-share [192.168.24.1:8787/rhosp13/openstack-manila-share:pcmklatest]
   openstack-manila-share-docker-0	(ocf::heartbeat:docker):	Stopped

Failed Actions:
* openstack-manila-share-docker-0_start_0 on controller-1 'unknown error' (1): call=115, status=complete, exitreason='Newly created docker container exited after start',
    last-rc-change='Tue Aug  7 16:26:31 2018', queued=0ms, exec=1510ms
* openstack-manila-share-docker-0_start_0 on controller-2 'unknown error' (1): call=115, status=complete, exitreason='Newly created docker container exited after start',
    last-rc-change='Tue Aug  7 16:26:32 2018', queued=0ms, exec=1664ms
* openstack-manila-share-docker-0_start_0 on controller-0 'unknown error' (1): call=117, status=complete, exitreason='Newly created docker container exited after start',
    last-rc-change='Tue Aug  7 16:26:34 2018', queued=0ms, exec=1670ms


Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
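
When the same start failure repeats on every controller, it can help to pull the "Failed Actions" entries out of the pcs status text programmatically. The sketch below is an illustration only: the regular expression is written against the line format shown in the output above, the helper name parse_failed_actions is hypothetical, and other Pacemaker versions may format these lines differently.

```python
import re

# Matches the first line of each entry under "Failed Actions:" as shown above.
FAILED_ACTION = re.compile(
    r"^\* (?P<action>\S+) on (?P<node>\S+) '(?P<error>[^']*)' \((?P<rc>\d+)\):"
    r".*exitreason='(?P<reason>[^']*)'"
)


def parse_failed_actions(status_text):
    """Return one dict per failed action found in pcs status output."""
    failures = []
    for line in status_text.splitlines():
        match = FAILED_ACTION.match(line.strip())
        if match:
            failures.append(match.groupdict())
    return failures


# Sample copied from the report; continuation lines are ignored by the parser.
SAMPLE = """\
Failed Actions:
* openstack-manila-share-docker-0_start_0 on controller-1 'unknown error' (1): call=115, status=complete, exitreason='Newly created docker container exited after start',
    last-rc-change='Tue Aug  7 16:26:31 2018', queued=0ms, exec=1510ms
"""

if __name__ == "__main__":
    for failure in parse_failed_actions(SAMPLE):
        print(failure["node"], failure["action"], failure["reason"])
```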

Comment 3 Goutham Pacha Ravi 2018-08-07 21:48:01 UTC
This BZ was raised on a downstream candidate build for OSP 13. The failure occurs because some dependencies of the fix for another bug (https://bugzilla.redhat.com/show_bug.cgi?id=1597379) were not fully included in the candidate build.

Marking this as a duplicate of that bug. The fix is available in openstack-tripleo-heat-templates-8.0.4-16.el7ost, which should be included in the next candidate build for OSP 13.
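
To check whether a given build carries the fixed package, the installed name-version-release string (for example from "rpm -q openstack-tripleo-heat-templates") can be compared against the one named above. The following is a rough sketch with a deliberately simplified comparison: it assumes purely numeric dotted versions and a release that starts with a number, and it does not implement rpm's full version comparison algorithm. The helper names version_key and has_fix are hypothetical, and the package strings in the usage lines are made-up examples, not builds named in this report.

```python
import re

FIXED_IN = "openstack-tripleo-heat-templates-8.0.4-16.el7ost"


def version_key(nvr):
    """Turn name-version-release into a comparable tuple of integers.

    Simplification: raises ValueError for non-numeric version components.
    """
    # Split from the right so hyphens inside the package name survive.
    _name, version, release = nvr.rsplit("-", 2)
    version_parts = tuple(int(part) for part in version.split("."))
    # Keep only the leading number of the release, e.g. "16" from "16.el7ost".
    release_number = int(re.match(r"\d+", release).group())
    return version_parts + (release_number,)


def has_fix(installed_nvr, fixed_nvr=FIXED_IN):
    """Return True if the installed package is at or past the fixed build."""
    return version_key(installed_nvr) >= version_key(fixed_nvr)


if __name__ == "__main__":
    # Hypothetical installed packages, for illustration only.
    print(has_fix("openstack-tripleo-heat-templates-8.0.4-15.el7ost.noarch"))
    print(has_fix("openstack-tripleo-heat-templates-8.0.4-16.el7ost.noarch"))
```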

*** This bug has been marked as a duplicate of bug 1597379 ***