Description Michele Baldessari
2019-08-12 09:15:57 UTC
Created attachment 1602796
strace of podman run
Description of problem:
After some destructive testing involving many reboots of a controller node, some of them via hard reset, podman got into a completely borked state. Namely, every podman run command claims that there is no image:
# podman run -d --net=host --name=test 192.168.24.1:8787/rhosp15/openstack-haproxy:pcmklatest
must provide image ID and image name to use an image: invalid argument
This does not happen normally and it took a few reboots to get into this state, but in this state not a single run command works:
[root@controller-0 ~]# rpm -q podman kernel runc
podman-1.0.3-1.git9d78c0c.module+el8.0.0.z+3717+fdd07b7c.x86_64
kernel-4.18.0-80.7.1.el8_0.x86_64
runc-1.0.0-55.rc5.dev.git2abd837.module+el8.0.0+3049+59fd2bba.x86_64
Also note that we reproduced this with a testing version of runc as well:
runc-1.0.0-60.rc8.rhaos4.2.git3cbe540.el8.x86_64
[root@controller-0 ~]# podman image inspect 41bfdd5a7361
error parsing image data "41bfdd5a7361b1ecd6233d67bd163008cb407f9098c99fb5e625f9918b1558ef": readlink /var/lib/containers/storage/overlay/l/7G7QCIMC7D5MK7NQXQC4WXJTV7: no such file or directory
Notice the uppercase there which seems a bit suspicious?
This seems very similar to https://github.com/code-ready/crc/issues/325 ?
I am attaching the strace from the run command, the bolt db, and the ls -lR output from /var/lib/containers.
What springs to the eye is that on a working node we have:
[root@controller-1 ~]# ls -l /var/lib/containers/storage/overlay/l/ |wc -l
170
Whereas on a broken node we have:
[root@controller-0 ~]# ls /var/lib/containers/storage/overlay/l/
[root@controller-0 ~]#
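For anyone triaging a similar state, here is a rough sketch of a check for layers whose short-name symlink under l/ has gone missing. It assumes the usual overlay driver layout, where each layer directory holds a `link` file containing the short name and `l/<short-name>` is a symlink pointing at `../<layer-id>/diff`. The function name `check_overlay_links` is made up for this example, and it only reports; it does not repair anything.

```shell
#!/bin/sh
# Sketch only: list overlay layers whose short-name symlink in l/ is missing.
# Assumption: <root>/<layer-id>/link holds the short name, and
# <root>/l/<short-name> is a symlink to ../<layer-id>/diff.
# check_overlay_links is a hypothetical helper, not part of podman.
check_overlay_links() {
    root="${1:-/var/lib/containers/storage/overlay}"
    missing=0
    for linkfile in "$root"/*/link; do
        # Skip the unexpanded glob and anything that is not a regular file.
        [ -f "$linkfile" ] || continue
        layer=$(basename "$(dirname "$linkfile")")
        short=$(cat "$linkfile")
        # -L is true for a symlink even if its target does not exist.
        if [ ! -L "$root/l/$short" ]; then
            echo "missing: l/$short (layer $layer)"
            missing=$((missing + 1))
        fi
    done
    echo "missing links: $missing"
}
```

Calling `check_overlay_links` with no argument inspects /var/lib/containers/storage/overlay; a different storage root can be passed as the first argument. On the broken node above, with an empty l/ directory, I would expect every layer to be reported as missing.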
Comment 1 Michele Baldessari
2019-08-12 09:16:43 UTC
Thanks Michele for your feedback. Checked the code in vendor/github.com/containers/storage/drivers/overlay/overlay.go from podman-1.4.2-5.module+el8.1.0+4240+893c1ab8.src.rpm. The patches are already included, so setting this to verified.
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2019:3403