Bug 1888988
Summary: | Error refreshing container XXX: error acquiring lock 0 for container | |
---|---|---|---
Product: | Red Hat Enterprise Linux 8 | Reporter: | Devon <dshumake>
Component: | podman | Assignee: | Matthew Heon <mheon>
Status: | CLOSED CURRENTRELEASE | QA Contact: | atomic-bugs <atomic-bugs>
Severity: | medium | Docs Contact: |
Priority: | unspecified | |
Version: | 8.4 | CC: | bbaude, dornelas, dwalsh, egolov, hasuzuki, ian, jligon, jnovy, lsm5, mheon, tsweeney, vrothber
Target Milestone: | rc | Flags: | pm-rhel: mirror+
Target Release: | 8.0 | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2021-03-22 20:29:42 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 1186913 | |
Description (Devon, 2020-10-16 17:11:24 UTC)
Please try the `podman system renumber` command (ideally when no containers are running, though it should be fine as long as no other Podman commands are run at the same time). This will reallocate locks to remove any potential conflicts that could be causing this. Assuming this does fix the issue, I'd be very interested to hear whether the customer encounters it again - if the lock allocator is making duplicate allocations, that's a serious problem. I've cc'd Valentin in case he has any thoughts on the locking, and I may move this his way depending on the customer's response.

Hello, it looks like the issue did reoccur on the system after `podman system renumber` was run. I got the customer to collect a sosreport, which I attached to this case. Let me know if you need any additional information from them and I can collect that as well. Thank you.

Matt, any further thoughts on this?

Is the customer seeing this issue after a reboot? Podman is detecting a system reboot and performing post-reboot logic; if this is not being seen after an actual system reboot, then something is wiping Podman's state. It could be the systemd tmpfiles issue that Dan tracked down earlier.

They are stating that this occurs completely at random and not around reboots. I am not seeing the systemd tmpfiles issue linked in this bug. Is there a workaround we can set up to test whether this still occurs? Alternatively, if you can attach that bug, I can take a closer look at the case and the info the customer attached to see if there is anything indicating that is what's happening.

The issue was originally reported upstream at https://github.com/containers/podman/issues/7852 - our solution was to add a single file [1] in `/usr/lib/tmpfiles.d/`. Newer Podman releases (I believe RHEL 8.3.1 and up - 8.4.0 and up for certain) will include this in the Podman package, but manually adding it prior to that could help identify if this is the issue.

[1] https://raw.githubusercontent.com/containers/podman/master/contrib/tmpfile/podman.conf

Looks like that fixed it. Before the workaround:

    [mysql@sgdevcdb01 ~]$ podman ps
    ERRO[0000] Error refreshing container 290c2c0d0036af629f37f09ad1e3824404015b83781e9ec30e4a10c794417032: error acquiring lock 0 for container 290c2c0d0036af629f37f09ad1e3824404015b83781e9ec30e4a10c794417032: file exists
    ERRO[0000] Error refreshing container 56d0386ec1658e9777d47c5b2df6f5d181f6352f63a414bc8d593c266915a52a: error acquiring lock 1 for container 56d0386ec1658e9777d47c5b2df6f5d181f6352f63a414bc8d593c266915a52a: file exists
    ERRO[0000] Error refreshing volume mysql-banavim: error acquiring lock 2 for volume mysql-banavim: file exists
    ERRO[0000] Error refreshing volume mysql-rhdev: error acquiring lock 3 for volume mysql-rhdev: file exists
    CONTAINER ID  IMAGE  COMMAND  CREATED  STATUS  PORTS  NAMES
    [mysql@sgdevcdb01 ~]$ podman ps

File creation:

    [fibanez@sgdevcdb01 ~]$ sudo cp -p /usr/lib/tmpfiles.d/tmp.conf /etc/tmpfiles.d/

Add the following exclusions to /etc/tmpfiles.d/tmp.conf:

    x /tmp/[0-9]*
    x /tmp/containers-user-[0-9]*
    x /tmp/run-[0-9]*

After applying the exclusions:

    [mysql@sgdevcdb01 ~]$ podman ps
    CONTAINER ID  IMAGE                                     COMMAND     CREATED       STATUS             PORTS                    NAMES
    56d0386ec165  registry.redhat.io/rhel8/mysql-80:latest  run-mysqld  5 months ago  Up 12 seconds ago  0.0.0.0:33061->3306/tcp  mysql-rhdev
    290c2c0d0036  registry.redhat.io/rhel8/mysql-80:latest  run-mysqld  5 months ago  Up 4 seconds ago   0.0.0.0:33060->3306/tcp  mysql-banavim

Thanks a ton, I think this looks to be a good workaround for the time being.
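For anyone landing here on a RHEL 8 release that does not yet ship the Podman tmpfiles snippet, a minimal consolidated sketch of the workaround above. The copy-then-append approach mirrors exactly what the customer did; the exclusion patterns are the ones from this report and are an assumption about which /tmp paths systemd's cleaner must leave alone on your system:

    # Override the packaged tmp.conf without touching files owned by systemd:
    # a same-named file in /etc/tmpfiles.d/ takes precedence over /usr/lib/tmpfiles.d/
    sudo cp -p /usr/lib/tmpfiles.d/tmp.conf /etc/tmpfiles.d/tmp.conf

    # Append the exclusions used in this report (patterns are illustrative)
    printf '%s\n' 'x /tmp/[0-9]*' 'x /tmp/containers-user-[0-9]*' 'x /tmp/run-[0-9]*' \
        | sudo tee -a /etc/tmpfiles.d/tmp.conf

    # If "error acquiring lock ... file exists" is already being reported,
    # reallocate lock numbers once, while no other podman commands are running
    podman system renumber

Placing the copy in /etc/tmpfiles.d/ keeps the change local to the host and survives updates of the systemd package; systemd-tmpfiles will pick it up on its next cleanup run.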
I believe the fix shipped is not sufficient, as it only excludes podman-run-* from the systemd tmp reaper, but Podman still creates things in /tmp/run-*. See BZ#1960948. Matt, Dan, Valentin, does the fix for this need to be adjusted?

That additional path is fixed upstream as of https://github.com/containers/podman/commit/9a02b50551d73c1427d158cca85d020fc71e27a7, which will ship in RHEL 8.5.0 (Podman 3.3.x). I think it's a separate BZ considering that the path in question is different. This can remain closed.

Just adding additional information for any googlers that wind up here. If you're using EL 8.5 or newer and are still having the issues described here, do a `podman system reset` (this will delete everything related to Podman!) and review any config files it warns you about for changes that differ from the defaults. I had a storage config file that I'm fairly certain I didn't create, but it had some bad values that were causing the same "error acquiring lock # for container [xxx]: file exists" errors. More info: https://github.com/containers/podman/issues/11539

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days.
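A minimal sketch of that last-resort recovery path, assuming a rootless user on EL 8.5 or newer; the storage.conf path shown is the common per-user default and is only an illustration of where a stray config with bad values might live:

    # WARNING: removes all Podman containers, images, volumes, and networks
    podman system reset

    # Review any config files the reset warns about for non-default values;
    # the per-user storage config (assumed default rootless location) is one
    # place where stale settings can trigger the lock errors described above
    cat ~/.config/containers/storage.conf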