Bug 1991528

Summary: podman pod rm --force fails with "device or resource busy" when --cgroup-manager is set to cgroupfs
Product: Red Hat Enterprise Linux 9
Component: podman
Version: 9.0
Status: CLOSED CURRENTRELEASE
Severity: medium
Priority: unspecified
Reporter: Joy Pu <ypu>
Assignee: Matthew Heon <mheon>
QA Contact: atomic-bugs <atomic-bugs>
Docs Contact:
CC: bbaude, dwalsh, jligon, jnovy, lsm5, mheon, pthomas, tsweeney, umohnani
Keywords: Reopened
Target Milestone: beta
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Last Closed: 2023-07-10 19:28:50 UTC
Type: Bug

Description Joy Pu 2021-08-09 11:10:26 UTC
Description of problem:
podman pod rm --force fails with the error message:
Error: error removing pod ff010ae9ed3136dc32caf113d232496164a085aeb058529296e3bbc146d2d089 conmon cgroup: remove /sys/fs/cgroup/libpod_parent/ff010ae9ed3136dc32caf113d232496164a085aeb058529296e3bbc146d2d089/conmon: remove /sys/fs/cgroup/libpod_parent/ff010ae9ed3136dc32caf113d232496164a085aeb058529296e3bbc146d2d089/conmon: device or resource busy
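
For context on the error: a cgroup directory can only be removed once it has no member processes and no child cgroups, so "device or resource busy" means something is still attached to the conmon cgroup at removal time. A minimal diagnostic sketch (assuming the cgroup v2 unified hierarchy used on RHEL 9; the pod ID below is the one from the error above):

# List any PIDs still attached to the conmon cgroup that Podman failed to remove
cat /sys/fs/cgroup/libpod_parent/ff010ae9ed3136dc32caf113d232496164a085aeb058529296e3bbc146d2d089/conmon/cgroup.procs
# And check for leftover child cgroups under it
ls /sys/fs/cgroup/libpod_parent/ff010ae9ed3136dc32caf113d232496164a085aeb058529296e3bbc146d2d089/conmon/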


Version-Release number of selected component (if applicable):
podman-3.3.0-0.15.module+el9beta+12090+32d0f3c8.x86_64

How reproducible:
around 60%

Steps to Reproduce:
Found this while running the e2e test case "podman pod container --infra=false doesn't share SELinux labels". You can clone upstream podman locally and run that test case with CGROUP_MANAGER="cgroupfs" and PODMAN_BINARY=`which podman`. The steps performed by the test are listed below (a condensed reproducer sketch follows step 3):
1. Create a pod with --infra=false
#  /usr/bin/podman --storage-opt vfs.imagestore=/tmp/podman/imagecachedir --root /tmp/podman_test431871916/crio --runroot /tmp/podman_test431871916/crio-run --runtime crun --conmon /usr/bin/conmon --cni-config-dir /etc/cni/net.d --cgroup-manager cgroupfs --tmpdir /tmp/podman_test431871916 --events-backend file --storage-driver vfs pod create --infra=false

2. Run two containers in the pod and check that their SELinux labels differ
/usr/bin/podman --storage-opt vfs.imagestore=/tmp/podman/imagecachedir --root /tmp/podman_test431871916/crio --runroot /tmp/podman_test431871916/crio-run --runtime crun --conmon /usr/bin/conmon --cni-config-dir /etc/cni/net.d --cgroup-manager cgroupfs --tmpdir /tmp/podman_test431871916 --events-backend file --storage-driver vfs run --pod 5a218fbc12c7ecc5d8cda5b2531a7420688eb56bfe1ee421dda75766f1e295ec quay.io/libpod/alpine:latest cat /proc/self/attr/current

 /usr/bin/podman --storage-opt vfs.imagestore=/tmp/podman/imagecachedir --root /tmp/podman_test431871916/crio --runroot /tmp/podman_test431871916/crio-run --runtime crun --conmon /usr/bin/conmon --cni-config-dir /etc/cni/net.d --cgroup-manager cgroupfs --tmpdir /tmp/podman_test431871916 --events-backend file --storage-driver vfs run --pod 5a218fbc12c7ecc5d8cda5b2531a7420688eb56bfe1ee421dda75766f1e295ec quay.io/libpod/alpine:latest cat /proc/self/attr/current

3. Remove the pod with --force
Running: /usr/bin/podman --storage-opt vfs.imagestore=/tmp/podman/imagecachedir --root /tmp/podman_test431871916/crio --runroot /tmp/podman_test431871916/crio-run --runtime crun --conmon /usr/bin/conmon --cni-config-dir /etc/cni/net.d --cgroup-manager cgroupfs --tmpdir /tmp/podman_test431871916 --events-backend file --storage-driver vfs pod rm 5a218fbc12c7ecc5d8cda5b2531a7420688eb56bfe1ee421dda75766f1e295ec --force
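
For convenience, the three steps above can be condensed into a rough shell sketch. This is a simplification, not the exact e2e test: the --root/--runroot/--storage-opt/--tmpdir flags used by the test are dropped on the assumption that only --cgroup-manager cgroupfs and --infra=false matter, and it should be run as root like the test:

# 1. Create a pod without an infra container (the pod ID is printed on stdout)
POD_ID=$(podman --cgroup-manager cgroupfs --events-backend file pod create --infra=false)

# 2. Run two containers in the pod and print their SELinux labels
podman --cgroup-manager cgroupfs --events-backend file run --pod "$POD_ID" quay.io/libpod/alpine:latest cat /proc/self/attr/current
podman --cgroup-manager cgroupfs --events-backend file run --pod "$POD_ID" quay.io/libpod/alpine:latest cat /proc/self/attr/current

# 3. Force-remove the pod; with cgroupfs this fails around 60% of the time with
#    "remove .../conmon: device or resource busy"
podman --cgroup-manager cgroupfs --events-backend file pod rm --force "$POD_ID"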


Actual results:
The pod rm --force command fails with the "device or resource busy" error shown above.

Expected results:
The pod is removed as expected.

Additional info:

The same test passes 100% of the time with --cgroup-manager systemd.

Comment 1 Matthew Heon 2021-08-09 13:22:05 UTC
I'll take this one; I was just digging around in the code in that area. We have code that prevents this on cgroups v1 systems. The issue we encountered before, and which I'm almost certain is happening here, is that the cleanup process launches into and occupies the conmon cgroup, preventing its deletion; the fix was to set a PID limit on that cgroup before stopping the pod's containers, so the cleanup process could not be launched. I presume cgroupfs has changed enough between v1 and v2 that this code does not work on RHEL 9 (and likely will not work on RHEL 8 with cgroups v2 and cgroupfs either).
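
To make the described cgroups v1 workaround concrete, here is a rough cgroupfs-level sketch of what "set a PID limit on the conmon cgroup before stopping the pod's containers" amounts to. This only illustrates the idea, not Podman's actual implementation; <POD_ID> is a placeholder and the v1 path layout is an assumption:

# Clamp pids.max so no new process (e.g. a container cleanup helper) can be
# forked into the conmon cgroup while the pod is being torn down.
# cgroups v1 (pids controller mounted under its own hierarchy; assumed layout):
echo 1 > /sys/fs/cgroup/pids/libpod_parent/<POD_ID>/conmon/pids.max
# cgroups v2 (unified hierarchy, as on RHEL 9):
echo 1 > /sys/fs/cgroup/libpod_parent/<POD_ID>/conmon/pids.max
# Once the cgroup has no members left, its directory can actually be removed:
rmdir /sys/fs/cgroup/libpod_parent/<POD_ID>/conmon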

Comment 5 RHEL Program Management 2023-02-09 07:27:47 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

Comment 8 Tom Sweeney 2023-05-04 20:53:11 UTC
@mheon any progress on this one?

Comment 10 Tom Sweeney 2023-07-10 17:59:56 UTC
reping @mheon

Comment 11 Matthew Heon 2023-07-10 19:28:50 UTC
The code in question appears to have been entirely removed while I was not working on this bug (was replaced as part of the effort to add resource limits to pods), so I think we can call this done. Wish I could say this was intentional and I was waiting for the code to be refactored out of existence, but this just fell lower in priority than other bugs long enough that the code changed around it.

Going to CLOSED CURRENTRELEASE given the complete removal of affected codepaths.