Bug 2231983

Summary: Starting a container via the socket causes all socket calls to hang after that (systemd 252-16 and podman 4.6.0-1)
Product: Red Hat Enterprise Linux 9 Reporter: Carlos Rodriguez-Fernandez <carlosrodrifernandez>
Component: systemd Assignee: systemd maint <systemd-maint>
Status: CLOSED DUPLICATE QA Contact: Frantisek Sumsal <fsumsal>
Severity: medium Docs Contact:
Priority: unspecified    
Version: CentOS Stream CC: bstinson, dtardon, jwboyer, systemd-maint
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2023-08-16 08:36:12 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Carlos Rodriguez-Fernandez 2023-08-14 22:42:00 UTC
Description of problem:
Starting a container via the socket causes all subsequent socket calls to hang. This happens specifically with systemd 252-16 and podman 4.6.0-1.

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:

The environment is a CentOS Stream 9 VM (e.g. [qcow2](https://cloud.centos.org/centos/9-stream/x86_64/images/CentOS-Stream-GenericCloud-9-20220829.0.x86_64.qcow2)). All steps are run as a non-root user.

1. Install this specific version (version 4.6.0-3 has a different issue, https://bugzilla.redhat.com/show_bug.cgi?id=2231975): `sudo dnf install podman-2:4.6.0-1.el9`
2. Make sure systemd 252-15 is the installed version: `sudo dnf install systemd-252-15.el9`
3. Enable the socket: `systemctl --user enable --now podman.socket`
4. Start nginx: `podman --url unix://run/user/$(id -u)/podman/podman.sock run --name nginx-test -p 8080:80 -d docker.io/nginx`
5. List containers: `podman --url unix://run/user/$(id -u)/podman/podman.sock ps`. Run it multiple times. Every call works.
6. Remove test container: `podman container rm -f nginx-test`
7. Update systemd to 252-16: `sudo dnf install systemd-252-16.el9`
8. Reboot
9. Start nginx: `podman --url unix://run/user/$(id -u)/podman/podman.sock run --name nginx-test -p 8080:80 -d docker.io/nginx`
10. List containers: `podman --url unix://run/user/$(id -u)/podman/podman.sock ps`. This call hangs.

Sometimes step 10 has to be run several times before it hangs for good.
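
Because the hang in step 10 can take a few attempts to appear, it may be convenient to repeat the listing call under a time limit instead of waiting on it by hand. The following is only a sketch, assuming the coreutils `timeout` command is available; the helper name `probe_hang`, the time limit, and the attempt count are illustrative and not part of the original reproduction.

```shell
# Sketch: run a command repeatedly under a time limit and stop at the first hang.
# probe_hang is a hypothetical helper: probe_hang LIMIT_SECONDS ATTEMPTS COMMAND...
probe_hang() {
    ph_limit="$1"
    ph_attempts="$2"
    shift 2
    ph_i=1
    while [ "$ph_i" -le "$ph_attempts" ]; do
        ph_rc=0
        timeout "$ph_limit" "$@" >/dev/null 2>&1 || ph_rc=$?
        # timeout exits with status 124 when the time limit was reached
        if [ "$ph_rc" -eq 124 ]; then
            echo "attempt $ph_i: timed out after ${ph_limit}s"
            return 1
        fi
        ph_i=$((ph_i + 1))
    done
    echo "all $ph_attempts attempts returned"
    return 0
}
```

For example, `probe_hang 10 5 podman --url "unix://run/user/$(id -u)/podman/podman.sock" ps` would try the listing up to five times and report the first attempt that does not return within 10 seconds.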

What I'm seeing is that the processes started in step 9 end up in the `podman.service` cgroup. Even though podman itself has already exited, `podman.service` still shows as "active (running)", which seems to prevent systemd from starting podman again to serve subsequent socket requests.

```
podman.service - Podman API Service
     Loaded: loaded (/usr/lib/systemd/user/podman.service; disabled; preset: disabled)
     Active: active (running) since Mon 2023-08-14 18:03:31 EDT; 9min ago
TriggeredBy: ● podman.socket
       Docs: man:podman-system-service(1)
    Process: 1516 ExecStart=/usr/bin/podman $LOGGING system service (code=exited, status=0/SUCCESS)
   Main PID: 1516 (code=exited, status=0/SUCCESS)
      Tasks: 13 (limit: 10774)
     Memory: 30.2M
        CPU: 75ms
     CGroup: /user.slice/user-1000.slice/user/app.slice/podman.service
             ├─1526 /usr/bin/slirp4netns --disable-host-loopback --mtu=65520 --enable-sandbox --enable-seccomp --enable-ipv6 -c -r 3 -e 4 --netns-type=path /run/user/1000/n>
             ├─1529 rootlessport
             └─1538 rootlessport-child

```
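
The state shown above (unit active and running, main process already exited) could also be checked in a script from `systemctl show` output. This is a rough sketch: the helper name `unit_looks_stuck` is hypothetical, and it assumes that `systemctl show` reports `MainPID=0` when the unit has no live main process.

```shell
# Sketch: read KEY=VALUE lines (as printed by `systemctl show`) on stdin and
# succeed only if the unit is active/running with no live main process.
# Assumption: MainPID=0 means the main process is gone.
unit_looks_stuck() {
    awk -F= '
        $1 == "ActiveState" { active = $2 }
        $1 == "SubState"    { substate = $2 }
        $1 == "MainPID"     { pid = $2 }
        END { exit !(active == "active" && substate == "running" && pid == "0") }
    '
}
```

A possible invocation: `systemctl --user show podman.service -p ActiveState -p SubState -p MainPID | unit_looks_stuck && echo "podman.service looks stuck"`.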

Actual results:
Listing containers through the socket hangs.

Expected results:
It lists containers.

Additional info:
I submitted the ticket upstream as well: https://github.com/containers/podman/issues/19625

Comment 1 Carlos Rodriguez-Fernandez 2023-08-15 14:34:27 UTC
A podman developer believes this is a systemd issue, not a podman issue.

I opened the issue in systemd upstream: https://github.com/systemd/systemd/issues/28843

The podman developer also thinks it is the same issue that was found in 253.5, reported here: https://github.com/systemd/systemd/issues/27953,
and fixed here: https://github.com/systemd/systemd/pull/28035

Comment 2 David Tardon 2023-08-16 08:36:12 UTC

*** This bug has been marked as a duplicate of bug 2225667 ***

Comment 3 Carlos Rodriguez-Fernandez 2023-08-17 04:57:29 UTC
I confirm this is working now with systemd 252-17 in CentOS Stream 9. Thank you!