Bug 1921128
| Summary: | [gss][podman] Getting the error while starting container "Error: readlink /var/lib/containers/storage/overlay/l/XXX no such file or directory" | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | Geo Jose <gjose> |
| Component: | podman | Assignee: | Jindrich Novy <jnovy> |
| Status: | CLOSED ERRATA | QA Contact: | Alex Jia <ajia> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 8.2 | CC: | ajia, akupczyk, bbaude, bhubbard, bniver, cchen, ceph-eng-bugs, ddarrah, dornelas, dwalsh, dzafman, gabrioux, gsitlani, jligon, jnovy, kchai, lithomas, lmiksik, lsm5, mheon, nojha, pthomas, rzarzyns, sseshasa, tsweeney, umohnani, vrothber, vumrao, ypu |
| Target Milestone: | rc | | |
| Target Release: | 8.4 | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | podman-3.0.1-6.el8 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| : | 1940493 | Environment: | |
| Last Closed: | 2021-05-18 15:34:30 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1186913, 1823899, 1940493, 1952246 | | |

Description
Geo Jose
2021-01-27 14:55:49 UTC
Did the machine crash while committing an image? We have seen similar failures before when an image does not get fully written because of a crash or shutdown.

I'll ask Valentin to look at this as he seemed to have some thoughts during scrum today.

I think it's the same issue as https://github.com/containers/podman/issues/5986. Let's continue the conversation upstream and report the solution here.

I have been looking at this issue and had a conversation with Dan. For now, we suggest using the mentioned workaround and re-pulling the image. We will tackle the root cause soon, but this will take time. The problem is storage corruption when the pull process is killed. There is a certain time window in which this can lead to data corruption, as shown here and in the linked upstream issue.

The fix for this has made it into v3.0.1 for RHEL 8.4, moving to POST.

Confirming https://github.com/containers/storage/pull/822 is already applied in the current version of podman in 8.4.0. Can we get qa ack please?

I would say containers/storage needs to be updated in buildah/podman and skopeo.

I can't reproduce this bug by following the steps in https://github.com/containers/podman/issues/5986. Instead I got a different error: 'Error: error creating container storage: the container name "atomix-1" is already in use by "ed8ea5031a2111261fa70e56c9440aa27ada62c2a3495b2d944a14868d32bd32". You have to remove that container to be able to reuse that name.: that name is already in use'. podman ps -a shows nothing, and the container can't be removed with podman rm -f either. This issue is found on podman-1.9.3-2.module+el8.2.1+6867+366c07d6 and podman-3.0.1-3.module+el8.4.0+10198+36d1d0e3. As usual, you need to run the tests several times to hit it.

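The readlink error in the summary suggests that an entry under overlay/l/ cannot be found at all, since readlink reports "no such file or directory" when the link path itself is absent rather than when its target is gone. Before applying the re-pull workaround, a scan along the lines of the Python sketch below may help locate the affected layers. It assumes the overlay driver layout in which each layer directory holds a `link` file naming its short symlink under `l/`; that layout and the function name `find_missing_layer_links` are assumptions for illustration and are not taken from this report.

```python
import os
from pathlib import Path


def find_missing_layer_links(overlay_root):
    """Return (layer_id, link_name) pairs whose l/<link_name> entry is absent.

    Assumed layout (not confirmed by this bug report):
      <overlay_root>/<layer_id>/link   text file holding the short link name
      <overlay_root>/l/<link_name>     symlink pointing at ../<layer_id>/diff
    """
    root = Path(overlay_root)
    if not root.is_dir():
        return []
    link_dir = root / "l"
    missing = []
    for layer_dir in sorted(root.iterdir()):
        link_file = layer_dir / "link"
        # Skip the l/ directory itself and anything without a link file.
        if layer_dir.name == "l" or not link_file.is_file():
            continue
        link_name = link_file.read_text().strip()
        if not link_name:
            continue
        # lexists() is true for a symlink even if its target is gone, so
        # only entries that are missing outright are reported here.
        if not os.path.lexists(link_dir / link_name):
            missing.append((layer_dir.name, link_name))
    return missing
```

Called as root with `/var/lib/containers/storage/overlay` (the path from the error message), it would return the layers whose images are candidates for removal and re-pull. It only reads the storage directory and does not attempt any repair.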
Deploy rhel-guest-image-8.4-756.x86_64.qcow2 as a libvirt VM, then run the tests inside the VM:

    [root@atomic-host-test-4109 ~]# rpm -q podman
    podman-3.0.1-3.module+el8.4.0+10198+36d1d0e3.x86_64
    [root@atomic-host-test-4109 ~]# podman run --rm -d --name atomix-1 -p 5679:5679 -it -v /opt/onos/config:/etc/atomix/conf -v /var/lib/atomix-1/data:/var/lib/atomix/data:Z atomix/atomix:3.1.5 --config /etc/atomix/conf/atomix-1.conf --ignore-resources --data-dir /var/lib/atomix/data --log-level WARN
    ed8ea5031a2111261fa70e56c9440aa27ada62c2a3495b2d944a14868d32bd32
    [root@atomic-host-test-4109 ~]# podman ps
    CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
    ed8ea5031a21 docker.io/atomix/atomix:3.1.5 --config /etc/ato... 4 seconds ago Up 4 seconds ago 0.0.0.0:5679->5679/tcp atomix-1

Destroy the running VM above, then start it again:

    [root@hp-dl360g9-04 ~]# virsh destroy ajia-8.4.0
    Domain ajia-8.4.0 destroyed
    [root@hp-dl360g9-04 ~]# virsh start ajia-8.4.0
    Domain ajia-8.4.0 started
    [root@atomic-host-test-4109 ~]# podman ps -a
    CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
    [root@atomic-host-test-4109 ~]# podman run --rm -d --name atomix-1 -p 5679:5679 -it -v /opt/onos/config:/etc/atomix/conf -v /var/lib/atomix-1/data:/var/lib/atomix/data:Z atomix/atomix:3.1.5 --config /etc/atomix/conf/atomix-1.conf --ignore-resources --data-dir /var/lib/atomix/data --log-level WARN
    Error: error creating container storage: the container name "atomix-1" is already in use by "ed8ea5031a2111261fa70e56c9440aa27ada62c2a3495b2d944a14868d32bd32". You have to remove that container to be able to reuse that name.: that name is already in use
    [root@atomic-host-test-4109 ~]# podman ps -a
    CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES

NOTE: sometimes the atomix-1 container can be removed, even though podman ps -a shows nothing:

    [root@atomic-host-test-4109 ~]# podman rm atomix-1
    atomix-1
    [root@atomic-host-test-4109 ~]# podman run --rm -d --name atomix-1 -p 5679:5679 -it -v /opt/onos/config:/etc/atomix/conf -v /var/lib/atomix-1/data:/var/lib/atomix/data:Z atomix/atomix:3.1.5 --config /etc/atomix/conf/atomix-1.conf --ignore-resources --data-dir /var/lib/atomix/data --log-level WARN
    dd83b6ff7efd6769582349a27f57695d5caf6a37ad4cba81e6a2abbd9c2113ef
    [root@atomic-host-test-4109 ~]# podman ps
    CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
    [root@atomic-host-test-4109 ~]# podman ps -a
    CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
    [root@atomic-host-test-4109 ~]# podman run --rm -d --name atomix-1 -p 5679:5679 -it -v /opt/onos/config:/etc/atomix/conf -v /var/lib/atomix-1/data:/var/lib/atomix/data:Z atomix/atomix:3.1.5 --config /etc/atomix/conf/atomix-1.conf --ignore-resources --data-dir /var/lib/atomix/data --log-level WARN
    20f074d16b72a013f480b447cefa86b25436e5527d9b52773af8aafb4c41e02a
    [root@atomic-host-test-4109 ~]# podman ps -a
    CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
    20f074d16b72 docker.io/atomix/atomix:3.1.5 --config /etc/ato... 2 seconds ago Up 3 seconds ago 0.0.0.0:5679->5679/tcp atomix-1

NOTE: after repeating the above steps several times, the atomix-1 container is running again.

Alex, can you please re-test with the current versions attached to the advisory? podman-3.0.1-6.module+el8.4.0+10398+842aaf04 is the actual version which has all important bits vendored in to address this. Thanks.

https://errata.devel.redhat.com/advisory/65330/builds

(In reply to Jindrich Novy from comment #26)
> Alex, can you please re-test with the current versions attached to the
> advisory? podman-3.0.1-6.module+el8.4.0+10398+842aaf04 is the actual version
> which has all important bits vendored in to address this. Thanks.
>
> https://errata.devel.redhat.com/advisory/65330/builds

I ran the test three times with podman-3.0.1-6.module+el8.4.0+10398+842aaf04, following the steps in Comment 25, and I can successfully start the previous atomix-1 container again after destroying and starting the VM. Is this test enough for you? If so, I will move this bug to VERIFIED, thanks!

Yes Alex, unless Geo objects, I think this is sufficient to consider this one verified.

(In reply to Jindrich Novy from comment #28)
> Yes Alex, unless Geo objects, I think this is sufficient to consider this
> one verified.

Thank you Jindrich, moving this bug to VERIFIED state now.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: container-tools:rhel8 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:1796