Description of problem:
Originally filed as BZ1748682 against OpenShift 4.2.
When a `podman pull` command is interrupted (in the case of the original bug, by a reboot) while it is in certain critical sections, the containers/storage library will attempt to clean up the partially-completed operation if the pull is re-run. This code path can attempt to take a lock multiple times, resulting in a deadlock that will prevent almost all Podman commands from running. Any other command requiring c/storage (Buildah, CRI-O, Skopeo) will also be unable to run.
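The double-lock pattern described above can be illustrated with a small, generic Go sketch. This is not the containers/storage code: the `store` type, its `incomplete` flag, and the `pull` and `cleanupIncomplete` methods are hypothetical names invented for illustration. The sketch assumes a non-reentrant lock (Go's `sync.Mutex`) and uses `TryLock` (Go 1.18 or later) so that the example reports the problem instead of hanging. With a plain blocking `Lock()` in the cleanup path, the call would never return.

```go
package main

import (
	"fmt"
	"sync"
)

// store is a hypothetical stand-in for a storage handle guarded by one lock.
type store struct {
	mu         sync.Mutex
	incomplete bool // true if an earlier operation was interrupted midway
}

// cleanupIncomplete assumes it must acquire the lock itself.
// TryLock is used only so this demo returns an error; a blocking Lock()
// here would wait forever when the caller already holds mu.
func (s *store) cleanupIncomplete() error {
	if !s.mu.TryLock() {
		return fmt.Errorf("lock already held: a blocking Lock() here would deadlock")
	}
	defer s.mu.Unlock()
	s.incomplete = false
	return nil
}

// pull takes the lock and, if leftover state is found, runs the cleanup
// on the same code path while still holding the lock.
func (s *store) pull() error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.incomplete {
		return s.cleanupIncomplete()
	}
	return nil
}

func main() {
	s := &store{incomplete: true}
	fmt.Println("after interrupted pull:", s.pull())

	s.incomplete = false
	fmt.Println("with clean state:", s.pull())
}
```

The usual ways out of this pattern are to have the cleanup helper assume the caller already holds the lock, or to release the lock before calling it. Which approach the actual fix took is not stated in this report.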
Version-Release number of selected component (if applicable):
Podman 1.4.2-stable2 (should reproduce on any released Podman)
How reproducible:
Fairly reproducible when `podman pull` is run as a systemd service on bootup, but this is a race condition with a fairly slim window, so it will be difficult to hit in normal use.
Steps to Reproduce:
1. podman pull <image>
2. Interrupt previous `podman pull` command while pull is in progress - SIGKILL should work
3. Re-run `podman pull <image>`
Actual results:
Second `podman pull` command freezes. Until the frozen process is killed, Podman, Skopeo, and CRI-O cannot launch successfully. Killing the frozen process will restore operation, but running the same `podman pull` again can freeze again.
Expected results:
Second `podman pull` completes normally
Additional info:
Likelihood of triggering is low under normal circumstances, but OpenShift found a fairly reliable way of doing it.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory, and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2020:0348