Description (Damien Ciabrini, 2019-08-02 14:25:15 UTC)
Description of problem:
In OpenStack we're running podman containers in conjunction with systemd.
- we create our containers with "podman create"
- we have dedicated systemd service files to run the "podman start" and "podman stop" actions
- the systemd service files almost always have auto-restart configured
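The service files themselves aren't attached to this report; for context, the pattern is roughly the following (unit name, pidfile and options are illustrative, not copied from the actual deployment):

```ini
# /etc/systemd/system/tripleo_swift_object_expirer.service (illustrative sketch)
[Unit]
Description=swift_object_expirer container
After=network.target

[Service]
Restart=always
ExecStart=/usr/bin/podman start swift_object_expirer
ExecStop=/usr/bin/podman stop -t 10 swift_object_expirer
PIDFile=/var/run/swift_object_expirer.pid

[Install]
WantedBy=multi-user.target
```

Restart=always is what races with "podman rm --all": as soon as podman stops a container for removal, systemd may start it again before the removal completes.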
When I wanted to clean up my env with "podman rm --all", many containers got deleted, but some probably didn't, because systemd restarted them after podman stopped them for removal.
After that, I moved systemd control out of the way and redid a "podman rm --all".
Once I had verified that no containers were listed anymore, I tried to recreate
an OpenStack container by reusing an old name, and that failed:
sudo podman create --name swift_object_expirer --label config_id=tripleo_step4 --label container_name=swift_object_expirer --label managed_by=paunch --label config_data="{\"environment\": [\"KOLLA_CONFIG_STRATEGY=COPY_ALWAYS\", \"TRIPLEO_CONFIG_HASH=d5ca1b2ea06f89838383da462022f0bc\"], \"image\": \"192.168.122.8:8787/rhosp15/openstack-swift-proxy-server:latest\", \"net\": \"host\", \"restart\": \"always\", \"user\": \"swift\", \"volumes\": [\"/etc/hosts:/etc/hosts:ro\", \"/etc/localtime:/etc/localtime:ro\", \"/etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro\", \"/etc/pki/ca-trust/source/anchors:/etc/pki/ca-trust/source/anchors:ro\", \"/etc/pki/tls/certs/ca-bundle.crt:/etc/pki/tls/certs/ca-bundle.crt:ro\", \"/etc/pki/tls/certs/ca-bundle.trust.crt:/etc/pki/tls/certs/ca-bundle.trust.crt:ro\", \"/etc/pki/tls/cert.pem:/etc/pki/tls/cert.pem:ro\", \"/dev/log:/dev/log\", \"/etc/ssh/ssh_known_hosts:/etc/ssh/ssh_known_hosts:ro\", \"/etc/puppet:/etc/puppet:ro\", \"/var/lib/kolla/config_files/swift_object_expirer.json:/var/lib/kolla/config_files/config.json:ro\", \"/var/lib/config-data/puppet-generated/swift/:/var/lib/kolla/config_files/src:ro\", \"/srv/node:/srv/node\", \"/dev:/dev\", \"/var/cache/swift:/var/cache/swift\", \"/var/log/containers/swift:/var/log/swift:z\"]}" --conmon-pidfile=/var/run/swift_object_expirer.pid --detach=true --log-driver json-file --log-opt path=/var/log/containers/stdouts/swift_object_expirer.log --env=KOLLA_CONFIG_STRATEGY=COPY_ALWAYS --env=TRIPLEO_CONFIG_HASH=d5ca1b2ea06f89838383da462022f0bc --net=host --user=swift --volume=/etc/hosts:/etc/hosts:ro --volume=/etc/localtime:/etc/localtime:ro --volume=/etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro --volume=/etc/pki/ca-trust/source/anchors:/etc/pki/ca-trust/source/anchors:ro --volume=/etc/pki/tls/certs/ca-bundle.crt:/etc/pki/tls/certs/ca-bundle.crt:ro --volume=/etc/pki/tls/certs/ca-bundle.trust.crt:/etc/pki/tls/certs/ca-bundle.trust.crt:ro 
--volume=/etc/pki/tls/cert.pem:/etc/pki/tls/cert.pem:ro --volume=/dev/log:/dev/log --volume=/etc/ssh/ssh_known_hosts:/etc/ssh/ssh_known_hosts:ro --volume=/etc/puppet:/etc/puppet:ro --volume=/var/lib/kolla/config_files/swift_object_expirer.json:/var/lib/kolla/config_files/config.json:ro --volume=/var/lib/config-data/puppet-generated/swift/:/var/lib/kolla/config_files/src:ro --volume=/srv/node:/srv/node --volume=/dev:/dev --volume=/var/cache/swift:/var/cache/swift --volume=/var/log/containers/swift:/var/log/swift:z 192.168.122.8:8787/rhosp15/openstack-swift-proxy-server:latest
error creating container storage: the container name "swift_object_expirer" is already in use by "99116fcf5d846f3e31ab58790d8a373b6487f977e3de0ef02206503a12aeafea". You have to remove that container to be able to reuse that name.: that name is already in use
Oddly, podman complains that the name is already in use, but I can't see it in "podman ps":
[stack@standalone ~]$ sudo podman ps -a | grep swift_object_expirer
[stack@standalone ~]$ sudo podman ps -a | grep 99116fcf
[stack@standalone ~]$ sudo podman ps -a --sync | grep swift_object_expirer
[stack@standalone ~]$ sudo podman ps -a
I do see, however, a reference to this previously deleted container in podman's internal state:
[stack@standalone ~]$ sudo grep 99116fcf5d846f3e31ab58790d8a373b6487f977e3de0ef02206503a12aeafea /var/lib/containers/storage/overlay-containers/containers.json | dd bs=1 count=1000
[{"id":"99116fcf5d846f3e31ab58790d8a373b6487f977e3de0ef02206503a12aeafea","names":["swift_object_expirer"],"image":"7e61e00939d6a871c1bd6cf03cd629b93411ac62d88f227fca042dc178b97f9d","layer":"9a9ee0c7ac735551e2b7644343af266588a869817445ed63e7b05b3f29f96389","metadata":"{\"image-name\":\"192.168.122.8:8787/rhosp15/openstack-swift-proxy-server:latest\",\"image-id\":\"7e61e00939d6a871c1bd6cf03cd629b93411ac62d88f227fca042dc178b97f9d\",\"name\":\"swift_object_expirer\",\"created-at\":1564736475}","created":"2019-08-02T09:01:15.154547489Z","flags":{"MountLabel":"system_u:object_r:container_file_t:s0:c246,c518","ProcessLabel":"system_u:system_r:container_t:s0:c246,c518"}},{"id":"a29a5209d2b1e56954498211eb6cc7e3d961e4c766a3c67d7f7e893910679d77","names":["openstack-cinder-volume-podman-0"],"image":"61477c7e159872f9248cc6778fdcddd88c744eb819dc7db4404fd92ba8095597","layer":"ce83d4e6630ff1f8425b7eb836150a1fc4f43a4a8fce181af582495e5e0b2f6d","metadata":"{\"image-name\":\"192.168.122.8:8787/rhosp15/op
[...]
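For anyone stuck in the same state: the stale record can in principle be removed from containers.json by hand. This is unsupported surgery, not an official podman interface; stop podman (and any systemd units driving it) and keep a backup first. A sketch of the idea using jq, demonstrated here on a scratch copy of the file rather than the real /var/lib/containers/storage/overlay-containers/containers.json:

```shell
# Work on a scratch copy; on a real host the file is
# /var/lib/containers/storage/overlay-containers/containers.json.
store=$(mktemp -d)
cat > "$store/containers.json" <<'EOF'
[{"id":"99116fcf5d846f3e31ab58790d8a373b6487f977e3de0ef02206503a12aeafea","names":["swift_object_expirer"]},
 {"id":"a29a5209d2b1e56954498211eb6cc7e3d961e4c766a3c67d7f7e893910679d77","names":["openstack-cinder-volume-podman-0"]}]
EOF

# Keep a backup, then drop the entry for the stale container id
# while preserving every other record.
cid=99116fcf5d846f3e31ab58790d8a373b6487f977e3de0ef02206503a12aeafea
cp "$store/containers.json" "$store/containers.json.bak"
jq --arg id "$cid" 'map(select(.id != $id))' \
    "$store/containers.json.bak" > "$store/containers.json"
```

On the real system the leftover overlay-containers/<id>/ directory would presumably also need cleaning up, and podman should only be restarted afterwards.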
Forcing the deletion of that lingering container definition didn't work either.
[stack@standalone ~]$ sudo podman rm swift_object_expirer
unable to find container swift_object_expirer: no container with name or ID swift_object_expirer found: no such container
[stack@standalone ~]$ sudo podman rm --force swift_object_expirer
unable to find container swift_object_expirer: no container with name or ID swift_object_expirer found: no such container
When querying buildah for the existence of that container, I do get some results out of it (not sure why):
[root@standalone ~]# buildah containers --all | grep swift_object_expirer
99116fcf5d84 7e61e00939d6 192.168.122.8:8787/rhosp15/openstack-swift-proxy-server:latest swift_object_expirer
But I couldn't use buildah to delete that old container, because other internal state had already been deleted during "podman rm --all":
[root@standalone ~]# buildah rm swift_object_expirer
error removing container "swift_object_expirer": error reading build container: error reading "/var/lib/containers/storage/overlay-containers/99116fcf5d846f3e31ab58790d8a373b6487f977e3de0ef02206503a12aeafea/userdata/buildah.json": open /var/lib/containers/storage/overlay-containers/99116fcf5d846f3e31ab58790d8a373b6487f977e3de0ef02206503a12aeafea/userdata/buildah.json: no such file or directory
[root@standalone ~]# find /var/lib/containers/storage/overlay-containers/99116fcf5d846f3e31ab58790d8a373b6487f977e3de0ef02206503a12aeafea/userdata/
/var/lib/containers/storage/overlay-containers/99116fcf5d846f3e31ab58790d8a373b6487f977e3de0ef02206503a12aeafea/userdata/
/var/lib/containers/storage/overlay-containers/99116fcf5d846f3e31ab58790d8a373b6487f977e3de0ef02206503a12aeafea/userdata/artifacts
/var/lib/containers/storage/overlay-containers/99116fcf5d846f3e31ab58790d8a373b6487f977e3de0ef02206503a12aeafea/userdata/shm
buildah insists on parsing buildah.json, so even replacing it with an empty file is not enough to work around the issue.
Version-Release number of selected component (if applicable):
podman-1.0.3-1.git9d78c0c.module+el8.0.0.z+3717+fdd07b7c.x86_64
How reproducible:
Random
Steps to Reproduce:
1. sudo podman rm --all
2. concurrently restart some of the existing containers that are being stopped
3. podman may complain that it can't remove some containers because one of their layers
is in use by another container; if so, repeat from step 1
Actual results:
Sometimes all containers are deleted, but stale entries for old containers remain in /var/lib/containers/storage/overlay-containers/containers.json, which prevents recreating new containers with the same names.
Expected results:
After a successful "podman rm --all", one should be able to recreate a container with a previously used name.
Additional info:
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2019:3403