Bug 1888988
Summary: | Error refreshing container XXX: error acquiring lock 0 for container | |
---|---|---|---
Product: | Red Hat Enterprise Linux 8 | Reporter: | Devon <dshumake>
Component: | podman | Assignee: | Matthew Heon <mheon>
Status: | CLOSED CURRENTRELEASE | QA Contact: | atomic-bugs <atomic-bugs>
Severity: | medium | Docs Contact: |
Priority: | unspecified | |
Version: | 8.4 | CC: | bbaude, dornelas, dwalsh, egolov, hasuzuki, ian, jligon, jnovy, lsm5, mheon, tsweeney, vrothber
Target Milestone: | rc | Flags: | pm-rhel: mirror+
Target Release: | 8.0 | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2021-03-22 20:29:42 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 1186913 | |
Description (Devon, 2020-10-16 17:11:24 UTC)
Please try the `podman system renumber` command (ideally when no containers are running, though it should be fine as long as no other Podman commands are run at the same time). This will reallocate locks to remove any potential conflicts that could be causing this. Assuming this does fix the issue, I'd be very interested to hear whether the customer encounters it again - if the lock allocator is making duplicate allocations, that's a serious problem. I've cc'd Valentin in case he has any thoughts on the locking, and I may move this his way depending on the customer's response.

Hello, it looks like the issue did reoccur on the system after `podman system renumber` was run. I got the customer to collect a sosreport, which I attached to this case. Let me know if you need any additional information from them and I can collect that as well. Thank you.

Matt, any further thoughts on this?

Is the customer seeing this issue after a reboot? Podman is detecting a system reboot and performing post-reboot logic; if this is not being seen after an actual system reboot, then something is wiping Podman's state. It could be the systemd tmpfiles issue that Dan tracked down earlier.

They are stating that this occurs completely at random and not around reboots. I am not seeing the systemd tmpfiles issue linked in this bug. Is there a workaround we can set up to test whether this still occurs? Alternatively, if you can attach that bug, I can take a closer look at the case and the info the customer attached to see if there is anything indicating that is what's happening.

The issue was originally reported upstream at https://github.com/containers/podman/issues/7852 - our solution was to add a single file [1] in `/usr/lib/tmpfiles.d/`. Newer Podman releases (I believe RHEL 8.3.1 and up - 8.4.0 and up for certain) will include this in the Podman package, but manually adding it prior to that could help identify if this is the issue.

[1] https://raw.githubusercontent.com/containers/podman/master/contrib/tmpfile/podman.conf

Looks like that fixed it. Before the workaround:

    [mysql@sgdevcdb01 ~]$ podman ps
    ERRO[0000] Error refreshing container 290c2c0d0036af629f37f09ad1e3824404015b83781e9ec30e4a10c794417032: error acquiring lock 0 for container 290c2c0d0036af629f37f09ad1e3824404015b83781e9ec30e4a10c794417032: file exists
    ERRO[0000] Error refreshing container 56d0386ec1658e9777d47c5b2df6f5d181f6352f63a414bc8d593c266915a52a: error acquiring lock 1 for container 56d0386ec1658e9777d47c5b2df6f5d181f6352f63a414bc8d593c266915a52a: file exists
    ERRO[0000] Error refreshing volume mysql-banavim: error acquiring lock 2 for volume mysql-banavim: file exists
    ERRO[0000] Error refreshing volume mysql-rhdev: error acquiring lock 3 for volume mysql-rhdev: file exists
    CONTAINER ID  IMAGE  COMMAND  CREATED  STATUS  PORTS  NAMES
    [mysql@sgdevcdb01 ~]$ podman ps

File creation:

    [fibanez@sgdevcdb01 ~]$ sudo cp -p /usr/lib/tmpfiles.d/tmp.conf /etc/tmpfiles.d/

Add the following exclusions to /etc/tmpfiles.d/tmp.conf:

    x /tmp/[0-9]*
    x /tmp/containers-user-[0-9]*
    x /tmp/run-[0-9]*

After applying the exclusions:

    [mysql@sgdevcdb01 ~]$ podman ps
    CONTAINER ID  IMAGE                                     COMMAND     CREATED       STATUS             PORTS                    NAMES
    56d0386ec165  registry.redhat.io/rhel8/mysql-80:latest  run-mysqld  5 months ago  Up 12 seconds ago  0.0.0.0:33061->3306/tcp  mysql-rhdev
    290c2c0d0036  registry.redhat.io/rhel8/mysql-80:latest  run-mysqld  5 months ago  Up 4 seconds ago   0.0.0.0:33060->3306/tcp  mysql-banavim

Thanks a ton, I think this looks to be a good workaround for the time being.
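For anyone landing here on a RHEL 8 release that does not yet ship the Podman tmpfiles snippet, a minimal consolidated sketch of the workaround above. The copy-then-append approach mirrors exactly what the customer did; the exclusion patterns are the ones from this report and are an assumption about which /tmp paths systemd's cleaner must leave alone on your system:

    # Override the packaged tmp.conf without touching files owned by systemd:
    # a same-named file in /etc/tmpfiles.d/ takes precedence over /usr/lib/tmpfiles.d/
    sudo cp -p /usr/lib/tmpfiles.d/tmp.conf /etc/tmpfiles.d/tmp.conf

    # Append the exclusions used in this report (patterns are illustrative)
    printf '%s\n' 'x /tmp/[0-9]*' 'x /tmp/containers-user-[0-9]*' 'x /tmp/run-[0-9]*' \
        | sudo tee -a /etc/tmpfiles.d/tmp.conf

    # If "error acquiring lock ... file exists" is already being reported,
    # reallocate lock numbers once, while no other podman commands are running
    podman system renumber

Placing the copy in /etc/tmpfiles.d/ keeps the change local to the host and survives updates of the systemd package; systemd-tmpfiles will pick it up on its next cleanup run.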
I believe the fix shipped is not sufficient, as it only excludes podman-run-* from the systemd tmp reaper, but Podman still creates things in /tmp/run-*. See BZ#1960948. Matt, Dan, Valentin, does the fix for this need to be adjusted?

That additional path is fixed upstream as of https://github.com/containers/podman/commit/9a02b50551d73c1427d158cca85d020fc71e27a7, which will ship in RHEL 8.5.0 (Podman 3.3.x). I think it's a separate BZ considering that the path in question is different. This can remain closed.

Just adding additional information for any googlers that wind up here. If you're using EL 8.5 or newer and are still having the issues described here, do a `podman system reset` (this will delete everything related to Podman!) and review any config files it warns you about for changes that differ from the defaults. I had a storage config file that I'm fairly certain I didn't create, but it had some bad values that were causing the same "error acquiring lock # for container [xxx]: file exists" errors. More info: https://github.com/containers/podman/issues/11539

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days.
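A minimal sketch of that last-resort recovery path, assuming a rootless user on EL 8.5 or newer; the storage.conf path shown is the common per-user default and is only an illustration of where a stray config with bad values might live:

    # WARNING: removes all Podman containers, images, volumes, and networks
    podman system reset

    # Review any config files the reset warns about for non-default values;
    # the per-user storage config (assumed default rootless location) is one
    # place where stale settings can trigger the lock errors described above
    cat ~/.config/containers/storage.conf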