Bug 1767663

Summary: Regression: rootless: podman run --rm hangs
Product: Red Hat Enterprise Linux 8 Reporter: Ed Santiago <santiago>
Component: podman Assignee: Giuseppe Scrivano <gscrivan>
Status: CLOSED CURRENTRELEASE QA Contact: Yuhui Jiang <yujiang>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 8.2 CC: bbaude, ddarrah, dwalsh, gscrivan, jligon, jnovy, lsm5, mheon, pehunt, pthomas, tsweeney, weshen, ypu, yujiang
Target Milestone: rc Flags: pm-rhel: mirror+
Target Release: 8.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-04-14 19:44:05 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
conmon-002da2522941fde456f5213c8a5a96c9836c2592 none

Description Ed Santiago 2019-10-31 23:23:17 UTC
This is almost, but not exactly, the same as bug 1743685

    $ podman run --rm alpine sh -c true   [hangs forever]

^C yields:

    ^CERRO[0041] Error forwarding signal 2 to container 5ccff5432565165150f72d4eb2ccbead2382ec928278a005838066ecd85aec9e: container has already been removed

    ERRO[0041] Error forwarding signal 15 to container 5ccff5432565165150f72d4eb2ccbead2382ec928278a005838066ecd85aec9e: container has already been removed

The big difference compared to bug 1743685 is the 'sh -c': plain 'true' works:

    $ podman run --rm alpine true   [works fine]

...so does 'date', even with 'sh -c':

    $ podman run --rm alpine sh -c date
    Thu Oct 31 23:21:24 UTC 2019

podman-1.6.2-1.module+el8.2.0+4541+ec984734.x86_64
conmon-2.0.2-0.1.dev.git422ce21.module+el8.1.1+4407+ac444e5d.x86_64
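
For scripted retesting, the hang above can be told apart from a normal exit by running the command under a time limit. This is only a minimal sketch, assuming GNU coreutils `timeout` is available; the `check_hang` helper name and the 30-second limit used below are illustrative and not part of podman or this report:

```shell
#!/bin/sh
# check_hang LIMIT CMD...: hypothetical helper, not part of podman.
# Runs CMD under coreutils `timeout`, discarding its output.
# `timeout` exits with status 124 when the time limit was reached.
check_hang() {
    limit=$1
    shift
    timeout "$limit" "$@" >/dev/null 2>&1
    status=$?
    if [ "$status" -eq 124 ]; then
        echo "HANG: $*"
    else
        echo "exit $status: $*"
    fi
}
```

With that helper defined, the two cases from the description would be checked as:

    $ check_hang 30 podman run --rm alpine sh -c true
    $ check_hang 30 podman run --rm alpine true

A slow image pull could also exceed the limit, so the image should already be present locally before relying on the result.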

Comment 1 Daniel Walsh 2019-11-01 12:41:20 UTC
These hangs almost always have something to do with conmon interaction or a leaked Notify-socket?

Comment 3 Giuseppe Scrivano 2019-11-02 19:16:37 UTC
I think it is solved by https://github.com/containers/conmon/commit/067c0a5ca47aa261e99521b32d6c74a8588b918c

To be sure it is the same issue, Ed could you try the reproducer specified in the commit message above (podman --runtime /bin/false run --rm  alpine true)?

Comment 4 Ed Santiago 2019-11-04 01:38:44 UTC
> Ed could you try the reproducer specified in the commit message above

With `--runtime /bin/false`, the above command hangs both as root and rootless. ^C behavior is different: it just shows ^C, without the 'Error forwarding signal'.

Unfortunately I don't have a rhel8 build environment and can't easily build a patched conmon to test with.

Thank you for recognizing this and for the pointer.

Comment 5 Giuseppe Scrivano 2019-11-06 12:03:11 UTC
Created attachment 1633253
conmon-002da2522941fde456f5213c8a5a96c9836c2592

I've attached a binary build of conmon based on 002da2522941fde456f5213c8a5a96c9836c2592

Could you verify if it solves the issue you are seeing?

Comment 6 Ed Santiago 2019-11-06 12:50:18 UTC
Giuseppe, that build gives me:

    $ podman run --rm alpine sh -c true
    Error: could not get runtime: please update to v1.0.0 or later: outdated conmon version

I tried copying it into /usr/libexec/podman/conmon and also /usr/bin/conmon; both with 'conmon' package installed and uninstalled. SELinux file context is correct. 

I will leave my virt running for a little while so I can retest more quickly.

Comment 7 Ed Santiago 2019-11-06 23:05:39 UTC
I cannot reproduce the hang with podman-1.6.2-5.module+el8.2.0+4584+0d586e68 (all other rpms being equal). I cannot explain this.

(Oh, and followup to comment 6: that was PEBKAC: I had not chmod'ed the new conmon. Still, that conmon did not fix the hang).
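
The missing chmod can be avoided by copying a test binary into place with `install`, which sets the mode in the same step. A minimal sketch; the `install_conmon` helper name is hypothetical, and the destination in the usage line is simply one of the paths tried in comment 6:

```shell
#!/bin/sh
# install_conmon SRC DEST: hypothetical helper for swapping in a test build.
install_conmon() {
    src=$1
    dest=$2
    # install(1) copies the file and sets mode 0755 in one step.
    install -m 0755 "$src" "$dest" || return 1
    # Fail loudly if the result is still not executable.
    if [ ! -x "$dest" ]; then
        echo "not executable: $dest" >&2
        return 1
    fi
    echo "installed $dest"
}
```

Usage, as root, with the downloaded attachment saved as ./conmon (an assumed file name):

    # install_conmon ./conmon /usr/libexec/podman/conmon

This only addresses the file mode; the SELinux file context still has to be checked separately.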

Comment 10 Giuseppe Scrivano 2019-11-19 13:44:30 UTC
I think the new hang is fixed with https://github.com/containers/libpod/pull/4461

Can we try a build with that fix included?