Bug 1751245

Summary: [Scale Up] ERROR configuring crond
Product: Red Hat OpenStack
Component: openstack-tripleo-heat-templates
Status: CLOSED ERRATA
Severity: urgent
Priority: urgent
Version: 15.0 (Stein)
Target Milestone: rc
Target Release: 15.0 (Stein)
Hardware: Unspecified
OS: Unspecified
Reporter: mlammon
Assignee: Emilien Macchi <emacchi>
QA Contact: Sasha Smolyak <ssmolyak>
CC: apevec, aschultz, bfournie, emacchi, jcoufal, lhh, mburns, michele, ohochman, ramishra, scorcora, shdunne, ssmolyak
Keywords: Regression, TestBlocker, Triaged
Fixed In Version: openstack-tripleo-heat-templates-10.6.1-0.20190909163923.999c846.el8ost
Last Closed: 2019-09-21 11:24:31 UTC
Type: Bug
Attachments: overcloud_scale_up_log (flags: none)

Description mlammon 2019-09-11 13:55:11 UTC
Description of problem:


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Deploy 3 controllers, 1 compute, and 3 Ceph nodes using a whole-disk image
2. Scale up by 1 compute node

Actual results:
Overcloud scale-up fails to complete configuration.


Env:
python3-heat-agent-ansible-1.8.1-0.20190523210450.1e15344.el8ost.noarch
openstack-heat-engine-12.0.1-0.20190805120452.3476f1d.el8ost.noarch
puppet-heat-14.4.1-0.20190420110320.4425351.el8ost.noarch
python3-heat-agent-hiera-1.8.1-0.20190523210450.1e15344.el8ost.noarch
openstack-heat-common-12.0.1-0.20190805120452.3476f1d.el8ost.noarch
openstack-heat-monolith-12.0.1-0.20190805120452.3476f1d.el8ost.noarch
python3-tripleoclient-heat-installer-11.5.1-0.20190829110437.9b9b5aa.el8ost.noarch
python3-heatclient-1.17.0-0.20190312144725.8af5deb.el8ost.noarch
python3-heat-agent-1.8.1-0.20190523210450.1e15344.el8ost.noarch
python3-heat-agent-json-file-1.8.1-0.20190523210450.1e15344.el8ost.noarch
python3-heat-agent-docker-cmd-1.8.1-0.20190523210450.1e15344.el8ost.noarch
heat-cfntools-1.4.2-6.el8ost.noarch
python3-heat-agent-puppet-1.8.1-0.20190523210450.1e15344.el8ost.noarch
openstack-heat-agents-1.8.1-0.20190523210450.1e15344.el8ost.noarch
openstack-heat-api-12.0.1-0.20190805120452.3476f1d.el8ost.noarch
python3-heat-agent-apply-config-1.8.1-0.20190523210450.1e15344.el8ost.noarch
openstack-tripleo-heat-templates-10.6.1-0.20190905170437.b33b839.el8ost.noarch
python3-mistral-lib-1.1.0-0.20190312192103.bac92db.el8ost.noarch
puppet-mistral-14.4.1-0.20190420123026.2394250.el8ost.noarch
python3-mistralclient-3.8.1-0.20190516100359.1712bd4.el8ost.noarch

(undercloud) [stack@undercloud-0 ~]$ openstack overcloud status
+-----------+---------------------+---------------------+-------------------+
| Plan Name |       Created       |       Updated       | Deployment Status |
+-----------+---------------------+---------------------+-------------------+
| overcloud | 2019-09-10 23:11:46 | 2019-09-10 23:11:46 |   DEPLOY_FAILED   |
+-----------+---------------------+---------------------+-------------------+

(undercloud) [stack@undercloud-0 ~]$ openstack server list
+--------------------------------------+--------------+--------+------------------------+----------------+------------+
| ID                                   | Name         | Status | Networks               | Image          | Flavor     |
+--------------------------------------+--------------+--------+------------------------+----------------+------------+
| 16161332-98e7-4ff1-9268-4981d0c54480 | compute-1    | ACTIVE | ctlplane=192.168.24.39 | overcloud-full | compute    |
| a5d7f337-92d6-4ed3-89a3-ada65928be13 | controller-1 | ACTIVE | ctlplane=192.168.24.28 | overcloud-full | controller |
| ec5d7e02-3b19-4e16-8424-f751be3aa479 | ceph-2       | ACTIVE | ctlplane=192.168.24.35 | overcloud-full | ceph       |
| 6caed296-1c40-42c2-a35e-160c322ba0dd | controller-2 | ACTIVE | ctlplane=192.168.24.45 | overcloud-full | controller |
| be54b642-c7ee-4b56-8041-ca4e031f3a42 | controller-0 | ACTIVE | ctlplane=192.168.24.38 | overcloud-full | controller |
| f9de5176-2137-4e06-96eb-b7f0e4648ef2 | ceph-1       | ACTIVE | ctlplane=192.168.24.43 | overcloud-full | ceph       |
| 93b3c5a4-0c67-4498-a9e7-6f61665b7d25 | compute-0    | ACTIVE | ctlplane=192.168.24.33 | overcloud-full | compute    |
| 0ac0e02b-0dc2-4b18-a24c-ea526322e814 | ceph-0       | ACTIVE | ctlplane=192.168.24.12 | overcloud-full | ceph       |
+--------------------------------------+--------------+--------+------------------------+----------------+------------+

[root@undercloud-0 mistral]# grep -r ERROR /var/lib/mistral/overcloud/ansible.log
        "2019-09-10 22:35:16,390 ERROR: 48637 -- ['/usr/bin/podman', 'run', '--user', 'root', '--name', 'container-puppet-crond', '--env', 'PUPPET_TAGS=file,file_line,concat,augeas,cron', '--env', 'NAME=crond', '--env', 'HOSTNAME=compute-0', '--env', 'NO_ARCHIVE=', '--env', 'STEP=6', '--env', 'NET_HOST=true', '--log-driver', 'json-file', '--volume', '/etc/localtime:/etc/localtime:ro', '--volume', '/tmp/tmpcx8l5vjz:/etc/config.pp:ro', '--volume', '/etc/puppet/:/tmp/puppet-etc/:ro', '--volume', '/etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro', '--volume', '/etc/pki/tls/certs/ca-bundle.crt:/etc/pki/tls/certs/ca-bundle.crt:ro', '--volume', '/etc/pki/tls/certs/ca-bundle.trust.crt:/etc/pki/tls/certs/ca-bundle.trust.crt:ro', '--volume', '/etc/pki/tls/cert.pem:/etc/pki/tls/cert.pem:ro', '--volume', '/var/lib/config-data:/var/lib/config-data/:rw', '--volume', '/var/lib/container-puppet/puppetlabs/facter.conf:/etc/puppetlabs/facter/facter.conf:ro', '--volume', '/var/lib/container-puppet/puppetlabs/:/opt/puppetlabs/:ro', '--volume', '/dev/log:/dev/log:rw', '--log-opt', 'path=/var/log/containers/stdouts/container-puppet-crond.log', '--security-opt', 'label=disable', '--volume', '/usr/share/openstack-puppet/modules/:/usr/share/openstack-puppet/modules/:ro', '--entrypoint', '/var/lib/container-puppet/container-puppet.sh', '--net', 'host', '--volume', '/etc/hosts:/etc/hosts:ro', '--volume', '/var/lib/container-puppet/container-puppet.sh:/var/lib/container-puppet/container-puppet.sh:ro', '192.168.24.1:8787/rhosp15/openstack-cron:20190904.3'] run failed after error creating container storage: the container name \"container-puppet-crond\" is already in use by \"de08e0f99bc58991787b216c70ab2171bf56095dc92e6607ccef5002119ed6e9\". You have to remove that container to be able to reuse that name.: that name is already in use",
        "2019-09-10 22:35:19,662 ERROR: 48637 -- ['/usr/bin/podman', 'start', '-a', 'container-puppet-crond'] run failed after unable to find container container-puppet-crond: no container with name or ID container-puppet-crond found: no such container",
        "2019-09-10 22:35:22,840 ERROR: 48637 -- ['/usr/bin/podman', 'start', '-a', 'container-puppet-crond'] run failed after unable to find container container-puppet-crond: no container with name or ID container-puppet-crond found: no such container",
        "2019-09-10 22:35:22,840 ERROR: 48637 -- Failed running container for crond",
        "2019-09-10 22:35:39,754 ERROR: 48632 -- ERROR configuring crond

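The "name is already in use" line is the key one in the grep output above. As a minimal sketch (not part of the original report), the conflicting container name and the ID that holds it can be pulled out of ansible.log with a small Python helper. The function name find_name_conflicts and the regular expression are assumptions written against the message text shown above; other podman versions may word the message differently.

```python
import re

# Matches podman's "name already in use" message as it appears in the log above.
# The backslash before each quote is optional: ansible.log stores the message
# JSON-escaped (\"name\"), while raw podman output has plain quotes.
NAME_IN_USE = re.compile(
    r'the container name \\?"(?P<name>[^"\\]+)\\?" '
    r'is already in use by \\?"(?P<cid>[0-9a-f]+)\\?"'
)


def find_name_conflicts(log_text):
    """Return a sorted list of unique (container_name, container_id) pairs."""
    pairs = {(m.group("name"), m.group("cid")) for m in NAME_IN_USE.finditer(log_text)}
    return sorted(pairs)
```

To use it, read /var/lib/mistral/overcloud/ansible.log into a string and pass it to find_name_conflicts. For the log above, the expected pair is container-puppet-crond and the ID starting with de08e0f9.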


Expected results:
Successful scale up


Additional info:
I saw this late last week, but it has taken me this long to get back to this error. It may or may not be related to https://bugzilla.redhat.com/show_bug.cgi?id=1750481,
but at the time we were trying to see whether https://bugzilla.redhat.com/show_bug.cgi?id=1747885 would resolve the issue. This test environment has the Fixed In Version (FIV)
of bz 1747885: openstack-tripleo-heat-templates-10.6.1-0.20190905170437.b33b839.el8ost.noarch

Comment 1 mlammon 2019-09-11 14:14:07 UTC
sos report
http://rhos-release.virt.bos.redhat.com/log/bz1751245

Comment 2 mlammon 2019-09-11 14:17:54 UTC
Created attachment 1614110 [details]
overcloud_scale_up_log

Comment 4 Emilien Macchi 2019-09-11 15:31:12 UTC
What version of podman are you running on the overcloud?

Comment 5 mlammon 2019-09-11 15:39:51 UTC
[stack@undercloud-0 ~]$ sudo podman --version
podman version 1.0.5

Comment 7 Emilien Macchi 2019-09-11 16:12:10 UTC
I asked on the overcloud nodes, not the undercloud.

Comment 8 Emilien Macchi 2019-09-11 16:13:01 UTC
Please close the bug if you're not using podman version 1.0.5 on the overcloud nodes (again not the undercloud).

Comment 9 mlammon 2019-09-11 19:00:33 UTC
Apologies, I misread.
I checked a controller node and it's 1.0.5:
[root@controller-0 ~]# podman -version
podman version 1.0.5

Comment 12 Bob Fournier 2019-09-11 23:11:17 UTC
Seems that the log message "run failed after error creating container storage: the container name \"container-puppet-crond\" is already in use by \"de08e0f99bc58991787b216c70ab2171bf56095dc92e6607ccef5002119ed6e9\""
is showing the same issue with container storage leakage as https://bugzilla.redhat.com/show_bug.cgi?id=1747885#c8 (and https://github.com/containers/libpod/issues/3906 as Emilien indicated).

Even with podman 1.0.5, which adds the "podman rm --storage" command, don't you have to run that command to mitigate this issue?

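For illustration only, the manual mitigation asked about in this comment could be scripted roughly as follows. This is a sketch, not the fix that closed the bug. It assumes that "podman rm --storage" is available in the installed build, as the comment says for 1.0.5, and that podman lives at /usr/bin/podman as in the logged command line. The helper names storage_cleanup_command and cleanup_leaked_storage are hypothetical.

```python
import subprocess


def storage_cleanup_command(container_name, podman="/usr/bin/podman"):
    """Build the argv that removes a leaked container from container storage."""
    # Refuse empty names and anything that podman would parse as an option.
    if not container_name or container_name.startswith("-"):
        raise ValueError("invalid container name: %r" % (container_name,))
    return [podman, "rm", "--storage", container_name]


def cleanup_leaked_storage(container_name, dry_run=True):
    """Return the cleanup command; execute it only when dry_run is False."""
    cmd = storage_cleanup_command(container_name)
    if not dry_run:
        # Assumption: runs as root on the affected overcloud node.
        subprocess.run(cmd, check=True)
    return cmd
```

Calling cleanup_leaked_storage("container-puppet-crond") with the default dry_run=True only returns the command, so it can be reviewed before anything is removed.
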
Comment 20 Lon Hohberger 2019-09-12 13:21:58 UTC
What openstack-tripleo-heat-templates build were you using?

Comment 21 Emilien Macchi 2019-09-12 13:27:38 UTC
I tested OSP15 scale out with https://review.opendev.org/#/c/680623 and it passed.

Comment 22 mlammon 2019-09-12 19:20:55 UTC
I hit the SELinux bug while trying to test this one:
https://bugzilla.redhat.com/show_bug.cgi?id=1751300

I will need to re-test with a compose that fixes 1751300.

Comment 25 Sasha Smolyak 2019-09-15 14:01:34 UTC
The Mistral log is clean after scale-up: the scale-up passed with no crond errors or any others. Verified.

Comment 30 errata-xmlrpc 2019-09-21 11:24:31 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:2811