Description of problem:

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Deploy 3 controller, 1 compute, 3 ceph using the whole disk image
2. Scale up 1 compute

Actual results:
Overcloud scale up fails to complete configuration.

Env:
python3-heat-agent-ansible-1.8.1-0.20190523210450.1e15344.el8ost.noarch
openstack-heat-engine-12.0.1-0.20190805120452.3476f1d.el8ost.noarch
puppet-heat-14.4.1-0.20190420110320.4425351.el8ost.noarch
python3-heat-agent-hiera-1.8.1-0.20190523210450.1e15344.el8ost.noarch
openstack-heat-common-12.0.1-0.20190805120452.3476f1d.el8ost.noarch
openstack-heat-monolith-12.0.1-0.20190805120452.3476f1d.el8ost.noarch
python3-tripleoclient-heat-installer-11.5.1-0.20190829110437.9b9b5aa.el8ost.noarch
python3-heatclient-1.17.0-0.20190312144725.8af5deb.el8ost.noarch
python3-heat-agent-1.8.1-0.20190523210450.1e15344.el8ost.noarch
python3-heat-agent-json-file-1.8.1-0.20190523210450.1e15344.el8ost.noarch
python3-heat-agent-docker-cmd-1.8.1-0.20190523210450.1e15344.el8ost.noarch
heat-cfntools-1.4.2-6.el8ost.noarch
python3-heat-agent-puppet-1.8.1-0.20190523210450.1e15344.el8ost.noarch
openstack-heat-agents-1.8.1-0.20190523210450.1e15344.el8ost.noarch
openstack-heat-api-12.0.1-0.20190805120452.3476f1d.el8ost.noarch
python3-heat-agent-apply-config-1.8.1-0.20190523210450.1e15344.el8ost.noarch
openstack-tripleo-heat-templates-10.6.1-0.20190905170437.b33b839.el8ost.noarch
python3-mistral-lib-1.1.0-0.20190312192103.bac92db.el8ost.noarch
puppet-mistral-14.4.1-0.20190420123026.2394250.el8ost.noarch
python3-mistralclient-3.8.1-0.20190516100359.1712bd4.el8ost.noarch

(undercloud) [stack@undercloud-0 ~]$ openstack overcloud status
+-----------+---------------------+---------------------+-------------------+
| Plan Name | Created             | Updated             | Deployment Status |
+-----------+---------------------+---------------------+-------------------+
| overcloud | 2019-09-10 23:11:46 | 2019-09-10 23:11:46 | DEPLOY_FAILED     |
+-----------+---------------------+---------------------+-------------------+

(undercloud) [stack@undercloud-0 ~]$ openstack server list
+--------------------------------------+--------------+--------+------------------------+----------------+------------+
| ID                                   | Name         | Status | Networks               | Image          | Flavor     |
+--------------------------------------+--------------+--------+------------------------+----------------+------------+
| 16161332-98e7-4ff1-9268-4981d0c54480 | compute-1    | ACTIVE | ctlplane=192.168.24.39 | overcloud-full | compute    |
| a5d7f337-92d6-4ed3-89a3-ada65928be13 | controller-1 | ACTIVE | ctlplane=192.168.24.28 | overcloud-full | controller |
| ec5d7e02-3b19-4e16-8424-f751be3aa479 | ceph-2       | ACTIVE | ctlplane=192.168.24.35 | overcloud-full | ceph       |
| 6caed296-1c40-42c2-a35e-160c322ba0dd | controller-2 | ACTIVE | ctlplane=192.168.24.45 | overcloud-full | controller |
| be54b642-c7ee-4b56-8041-ca4e031f3a42 | controller-0 | ACTIVE | ctlplane=192.168.24.38 | overcloud-full | controller |
| f9de5176-2137-4e06-96eb-b7f0e4648ef2 | ceph-1       | ACTIVE | ctlplane=192.168.24.43 | overcloud-full | ceph       |
| 93b3c5a4-0c67-4498-a9e7-6f61665b7d25 | compute-0    | ACTIVE | ctlplane=192.168.24.33 | overcloud-full | compute    |
| 0ac0e02b-0dc2-4b18-a24c-ea526322e814 | ceph-0       | ACTIVE | ctlplane=192.168.24.12 | overcloud-full | ceph       |
+--------------------------------------+--------------+--------+------------------------+----------------+------------+

[root@undercloud-0 mistral]# grep -r ERROR /var/lib/mistral/overcloud/ansible.log
"2019-09-10 22:35:16,390 ERROR: 48637 -- ['/usr/bin/podman', 'run', '--user', 'root', '--name', 'container-puppet-crond', '--env', 'PUPPET_TAGS=file,file_line,concat,augeas,cron', '--env', 'NAME=crond', '--env', 'HOSTNAME=compute-0', '--env', 'NO_ARCHIVE=', '--env', 'STEP=6', '--env', 'NET_HOST=true', '--log-driver', 'json-file', '--volume', '/etc/localtime:/etc/localtime:ro', '--volume', '/tmp/tmpcx8l5vjz:/etc/config.pp:ro', '--volume', '/etc/puppet/:/tmp/puppet-etc/:ro', '--volume', '/etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro', '--volume', '/etc/pki/tls/certs/ca-bundle.crt:/etc/pki/tls/certs/ca-bundle.crt:ro', '--volume', '/etc/pki/tls/certs/ca-bundle.trust.crt:/etc/pki/tls/certs/ca-bundle.trust.crt:ro', '--volume', '/etc/pki/tls/cert.pem:/etc/pki/tls/cert.pem:ro', '--volume', '/var/lib/config-data:/var/lib/config-data/:rw', '--volume', '/var/lib/container-puppet/puppetlabs/facter.conf:/etc/puppetlabs/facter/facter.conf:ro', '--volume', '/var/lib/container-puppet/puppetlabs/:/opt/puppetlabs/:ro', '--volume', '/dev/log:/dev/log:rw', '--log-opt', 'path=/var/log/containers/stdouts/container-puppet-crond.log', '--security-opt', 'label=disable', '--volume', '/usr/share/openstack-puppet/modules/:/usr/share/openstack-puppet/modules/:ro', '--entrypoint', '/var/lib/container-puppet/container-puppet.sh', '--net', 'host', '--volume', '/etc/hosts:/etc/hosts:ro', '--volume', '/var/lib/container-puppet/container-puppet.sh:/var/lib/container-puppet/container-puppet.sh:ro', '192.168.24.1:8787/rhosp15/openstack-cron:20190904.3'] run failed after error creating container storage: the container name \"container-puppet-crond\" is already in use by \"de08e0f99bc58991787b216c70ab2171bf56095dc92e6607ccef5002119ed6e9\". You have to remove that container to be able to reuse that name.: that name is already in use"
"2019-09-10 22:35:19,662 ERROR: 48637 -- ['/usr/bin/podman', 'start', '-a', 'container-puppet-crond'] run failed after unable to find container container-puppet-crond: no container with name or ID container-puppet-crond found: no such container"
"2019-09-10 22:35:22,840 ERROR: 48637 -- ['/usr/bin/podman', 'start', '-a', 'container-puppet-crond'] run failed after unable to find container container-puppet-crond: no container with name or ID container-puppet-crond found: no such container"
"2019-09-10 22:35:22,840 ERROR: 48637 -- Failed running container for crond"
"2019-09-10 22:35:39,754 ERROR: 48632 -- ERROR configuring crond"

Expected results:
Successful scale up

Additional info:
I saw this late last week, but it has taken me this long to get back to this error. It may or may not be related to https://bugzilla.redhat.com/show_bug.cgi?id=1750481, but at the time we were trying to see if https://bugzilla.redhat.com/show_bug.cgi?id=1747885 would resolve the issue. This test environment has the FIV of bz 1747885.
openstack-tripleo-heat-templates-10.6.1-0.20190905170437.b33b839.el8ost.noarch
sos report http://rhos-release.virt.bos.redhat.com/log/bz1751245
Created attachment 1614110 [details] overcloud_scale_up_log
what version of podman are you running on the overcloud?
[stack@undercloud-0 ~]$ sudo podman --version
podman version 1.0.5
I asked on the overcloud nodes, not the undercloud.
Please close the bug if you're not using podman version 1.0.5 on the overcloud nodes (again not the undercloud).
Apologies, I misread. I checked a controller node and it's 1.0.5:
[root@controller-0 ~]# podman -version
podman version 1.0.5
It seems that the log message - "run failed after error creating container storage: the container name \"container-puppet-crond\" is already in use by \"de08e0f99bc58991787b216c70ab2171bf56095dc92e6607ccef5002119ed6e9\"" - is showing the same container storage leakage issue as https://bugzilla.redhat.com/show_bug.cgi?id=1747885#c8 (and https://github.com/containers/libpod/issues/3906, as Emilien indicated). Even with podman 1.0.5, which adds the "podman rm --storage" command, don't you still have to run that command to mitigate this issue?
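For reference, the failure mode above is recognizable purely from podman's error text. A minimal sketch of how a cleanup wrapper could detect it (this is NOT the actual container-puppet.py code; the helper name and regex are assumptions for illustration) - once the stale ID is extracted, "podman rm --storage <id>" could be run against it before retrying:

```python
import re

# Matches podman's "name is already in use" storage-conflict error, e.g.:
#   error creating container storage: the container name "container-puppet-crond"
#   is already in use by "de08e0f9...". You have to remove that container ...
NAME_IN_USE = re.compile(
    r'container name "(?P<name>[^"]+)" is already in use by "(?P<cid>[0-9a-f]+)"'
)

def stale_container(stderr: str):
    """Return (name, container_id) if stderr reports a name conflict, else None."""
    m = NAME_IN_USE.search(stderr)
    return (m.group("name"), m.group("cid")) if m else None
```

The "no container with name or ID ... found" errors that follow in the log would not match, which is consistent with the leaked storage being invisible to "podman ps"/"podman start" while still blocking the name.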
What openstack-tripleo-heat-templates build were you using?
I tested OSP15 scale out with https://review.opendev.org/#/c/680623 and it passed.
I hit the SELinux bug (https://bugzilla.redhat.com/show_bug.cgi?id=1751300) while trying to test this one. I will need to re-test with a compose which fixes bz 1751300.
The Mistral log is clean after scale up; scale up passed with no errors from crond or any other container. Verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2019:2811