Bug 1980522
| Summary: | zombied processes due to failed readiness/liveness probes | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Albert Cardenas <acardena> |
| Component: | Containers | Assignee: | Tom Sweeney <tsweeney> |
| Status: | CLOSED DUPLICATE | QA Contact: | pmali |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 4.6 | CC: | aos-bugs, dwalsh, gwest, jokerman, pehunt |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2021-07-08 19:24:27 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description
Albert Cardenas
2021-07-08 18:04:32 UTC
We're tracking this in 1967808.

*** This bug has been marked as a duplicate of bug 1967808 ***

I think this bug is actually distinct from https://bugzilla.redhat.com/show_bug.cgi?id=1967808, and is actually fixed by the fix for https://bugzilla.redhat.com/show_bug.cgi?id=1952137 (in 4.6.36).

1967808 seems to be more about a bunk container: the PID 1 in the container is not reaping processes, and we're not yet being smart about handling the reparenting that's happening.

1952137 was about excessive processes being created when a node is under load. Before, CRI-O would call conmon, which would call runc, and those conmon processes would not be correctly reaped by CRI-O. The fixes for that (https://github.com/cri-o/cri-o/pull/4943 and https://github.com/cri-o/cri-o/pull/4999) cut conmon out of the middle. This results in fewer processes being created, and also in better handling of CRI-O's children to prevent zombies.

While investigating this bug in general, we created many containers with many exec probes (1000 deployments) to try to overwhelm the system. Without the fixes for 1952137, many zombies were created. After updating to 4.6.36, there were no detectable zombie processes.

Thus, I am updating the bug of which this one is marked a duplicate.

*** This bug has been marked as a duplicate of bug 1952137 ***
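
For anyone who wants to check a node for leftover zombies before and after the update, here is a minimal sketch. It is not the tooling used in the investigation above; it only assumes a Linux host where `/proc/<pid>/stat` is readable, and the helper names `parse_stat_state` and `find_zombies` are made up for illustration. A process in state `Z` has exited but has not yet been reaped by its parent, so the reported parent PID points at whichever process is failing to reap.

```python
from pathlib import Path
from typing import List, Tuple


def parse_stat_state(stat_line: str) -> Tuple[int, str, str, int]:
    """Return (pid, comm, state, ppid) from the text of /proc/<pid>/stat.

    The comm field is wrapped in parentheses and may itself contain spaces
    or parentheses, so split on the LAST ')' rather than on whitespace.
    """
    left, _, rest = stat_line.rpartition(")")
    pid_text, _, comm = left.partition("(")
    fields = rest.split()
    # After comm, the next two fields are the state letter and the parent PID.
    return int(pid_text), comm, fields[0], int(fields[1])


def find_zombies(proc_root: str = "/proc") -> List[Tuple[int, str, int]]:
    """List (pid, comm, ppid) for every process currently in state 'Z'."""
    zombies = []
    for entry in Path(proc_root).iterdir():
        if not entry.name.isdigit():
            continue  # skip non-process entries such as /proc/self
        try:
            text = (entry / "stat").read_text()
        except OSError:
            # The process exited between listing and reading.
            continue
        pid, comm, state, ppid = parse_stat_state(text)
        if state == "Z":
            zombies.append((pid, comm, ppid))
    return sorted(zombies)
```

Calling `find_zombies()` on the node and grouping the results by parent PID gives a rough idea of whether the unreaped children belong to CRI-O or to a container's PID 1.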