Bug 2157930

Summary: podman exec fails with "Error: an exec session with ID already exists: exec session already exists"
Product: Red Hat Enterprise Linux 8
Reporter: David Hill <dhill>
Component: podman
Assignee: Jindrich Novy <jnovy>
Status: CLOSED ERRATA
QA Contact: Alex Jia <ajia>
Severity: high
Priority: unspecified
Version: 8.2
CC: ajia, bbaude, cpippin, dornelas, dwalsh, jligon, jnovy, kir, lsm5, mboddu, mheon, pjagtap, pthomas, romain.geissler, tsweeney, umohnani, ypu
Target Milestone: rc
Keywords: Triaged, ZStream
Flags: pm-rhel: mirror+
Hardware: x86_64
OS: Linux
Fixed In Version: podman-4.4.0-0.8.el8
Doc Type: If docs needed, set a value
Clones: 2165695, 2165697, 2165698, 2166091
Last Closed: 2023-05-16 08:22:23 UTC
Type: Bug
Bug Blocks: 2165695, 2165697, 2165698, 2166091, 2166429, 2166434

Description David Hill 2023-01-03 15:32:26 UTC
What problem/issue/behavior are you having trouble with?  What do you expect to see?
OpenStack controllers run podman-3.0.1-10.module+el8.4.0+16976+f10e9028.x86_64.

On rare occasions podman exec fails; running this loop for about 12 hours reproduces it:

# while podman exec openstack-cinder-volume-podman-0 /usr/bin/true ; do true ; done ; date
Error: an exec session with ID 76c9620ed4af0ff8210204a9c281b3bae1cdb184ff65807c0b9b96892aca591c already exists: exec session already exists
Mon Dec 19 04:23:39 UTC 2022


# while podman exec openstack-cinder-volume-podman-0 /usr/bin/true ; do true ; done ; date
Error: an exec session with ID 482b9a38261b19c18c5862d455a7d640d492d7c0eedd7e2fe732fdf43cb48067 already exists: exec session already exists
Sun Dec 18 23:15:50 UTC 2022

This causes the pacemaker heartbeat check to fail on our OpenStack controllers, which in turn restarts some services; see

https://github.com/ClusterLabs/resource-agents/blob/7bfc7bb27a160152a51ee9e2dea052567594daa0/heartbeat/podman#L202


What is the business impact? Please also provide timeframe information.
This impacts only our control plane, so there is no real business impact; roughly one service is killed per week.

Where are you experiencing the behavior? What environment?
R&D and test. Production runs podman-1.6.4-19.module+el8.2.0+11121+714aca16.x86_64, and we have never had such an issue there.

When does the behavior occur? Frequency? Repeatedly? At certain times?
Very low frequency.

https://github.com/containers/storage/pull/1337 solves the problem; we tested a scratch build with that commit and it resolved the issue.
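
For context, the 64-character session IDs in the errors above are hex-encoded random values, and the referenced PR changes how such random IDs are generated. Below is a minimal sketch of collision-resistant ID generation under that assumption; the generateID helper is hypothetical and not podman's actual code:

package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// generateID is a hypothetical helper producing a 64-character hex ID
// (32 random bytes), matching the format of the exec session IDs in
// the errors above. Reading from crypto/rand avoids the failure mode
// of a math/rand generator that two processes happen to seed
// identically, which makes them emit the same ID sequence.
func generateID() (string, error) {
	b := make([]byte, 32)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return hex.EncodeToString(b), nil
}

func main() {
	id, err := generateID()
	if err != nil {
		panic(err)
	}
	fmt.Println(id)
}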

Comment 7 David Hill 2023-01-09 15:51:23 UTC
I think it just breaks the customer's monitoring ... nothing is broken on our side.

Comment 31 Romain Geissler 2023-02-06 16:45:22 UTC
Hi,

Jumping in on this bugzilla, which doesn't affect me, but I do care about other currently unreleased commits on podman's v4.2.0-rhel branch. Do you know roughly when a patched podman 4.2 package fixing this issue (and thus also containing the other unreleased commits of the v4.2.0-rhel branch) will be released for RHEL 8? And do you know whether you will also release a patched RHEL 9 package, since this bugzilla is only about RHEL 8?

Thanks,
Romain

Comment 34 Alex Jia 2023-02-10 13:32:55 UTC
Sanity tests passed for podman-4.4.0-1.module+el8.8.0+18060+3f21f2cc.x86_64.

Comment 38 Kir Kolyshkin 2023-03-09 02:52:18 UTC
Alas, I have no resources to pursue https://github.com/containers/storage/pull/1337 and https://github.com/containers/common/pull/1155 further at this time, nor do I understand how https://github.com/containers/podman/pull/15788 would fix this issue.

The proper fix, I guess, is to check whether a newly generated ID is already in use, and to retry with a new ID if it is.
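
A minimal sketch of that check-and-retry approach follows; the inUse callback and the retry bound are illustrative assumptions, since podman's real exec-session bookkeeping and locking are more involved:

package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// generateUnusedID returns a 64-character hex ID that the caller's
// inUse check does not already know, retrying on collision instead of
// failing the exec with "exec session already exists".
func generateUnusedID(inUse func(string) bool) (string, error) {
	const maxRetries = 5
	for i := 0; i < maxRetries; i++ {
		b := make([]byte, 32)
		if _, err := rand.Read(b); err != nil {
			return "", err
		}
		id := hex.EncodeToString(b)
		if !inUse(id) {
			return id, nil
		}
	}
	return "", fmt.Errorf("no unused ID after %d attempts", maxRetries)
}

func main() {
	// Toy usage: track taken IDs in a map and reserve a new one.
	taken := map[string]bool{}
	id, err := generateUnusedID(func(s string) bool { return taken[s] })
	if err != nil {
		panic(err)
	}
	taken[id] = true
	fmt.Println("new exec session ID:", id)
}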

Comment 43 errata-xmlrpc 2023-05-16 08:22:23 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: container-tools:rhel8 security, bug fix, and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:2758

Comment 44 Red Hat Bugzilla 2023-09-19 04:32:04 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days.