Bug 1802687

| Field | Value |
|---|---|
| Summary | A pod that gradually leaks memory causes node to become unreachable for 10 minutes |
| Product | OpenShift Container Platform |
| Reporter | Clayton Coleman <ccoleman> |
| Component | Node |
| Assignee | Ryan Phillips <rphillips> |
| Status | CLOSED ERRATA |
| QA Contact | Sunil Choudhary <schoudha> |
| Severity | high |
| Docs Contact | |
| Priority | unspecified |
| Version | 4.4 |
| CC | aos-bugs, cfillekes, fdeutsch, hannsj_uhl, Holger.Wolf, jokerman, jshepherd, lxia, rphillips, schoudha, surbania, tdale, vlaad, wabouham, walters, wking, wsun, zyu |
| Target Milestone | --- |
| Target Release | 4.4.0 |
| Hardware | Unspecified |
| OS | Unspecified |
| Whiteboard | |
| Fixed In Version | |
| Doc Type | Bug Fix |
| Doc Text | Cause: The kubepods.slice memory cgroup limit was not being set correctly; it defaulted to the maximum memory on the node. Consequence: The kubelet therefore did not reserve memory and CPU for system components (including the kubelet and CRI-O), leading to kernel pauses and other out-of-memory conditions on cloud nodes. Fix: The kubepods.slice memory limit is now set correctly. Result: Pods should be evicted once pod memory usage reaches [max-memory] - [system reservation]. (A hedged sketch of this calculation follows the table below.) |
| Story Points | --- |
| Clone Of | 1800319 |
| Clones | 1825989 (view as bug list) |
| Environment | |
| Last Closed | 2020-05-21 18:09:33 UTC |
| Type | --- |
| Regression | --- |
| Mount Type | --- |
| Documentation | --- |
| CRM | |
| Verified Versions | |
| Category | --- |
| oVirt Team | --- |
| RHEL 7.3 requirements from Atomic Host | |
| Cloudforms Team | --- |
| Target Upstream Version | |
| Embargoed | |
| Bug Depends On | 1800319, 1806786, 1808429, 1810136 |
| Bug Blocks | 1765215, 1766237, 1801826, 1801829, 1808444, 1814187, 1814804 |
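The Doc Text above says the kubepods.slice limit should end up at [max-memory] - [system reservation] rather than the node's full memory. Below is a minimal sketch of that calculation, assuming the limit is simply node capacity minus the system and kube reservations; it is not the kubelet's actual implementation, and the function name and figures are hypothetical.

```go
// A minimal sketch, not the kubelet source: how the kubepods.slice memory
// limit described in the Doc Text is expected to be derived. All values
// below are hypothetical.
package main

import "fmt"

// kubepodsLimit returns the memory limit (in bytes) that should be applied
// to the kubepods.slice cgroup: node capacity minus the reservations for
// system components, rather than the node's maximum memory.
func kubepodsLimit(capacity, systemReserved, kubeReserved int64) int64 {
	return capacity - systemReserved - kubeReserved
}

func main() {
	const gib = int64(1) << 30 // bytes per GiB

	capacity := 16 * gib      // hypothetical node memory
	systemReserved := 1 * gib // hypothetical reservation for sshd, NetworkManager, ...
	kubeReserved := 1 * gib   // hypothetical reservation for the kubelet and CRI-O

	limit := kubepodsLimit(capacity, systemReserved, kubeReserved)
	fmt.Printf("kubepods.slice memory limit: %d bytes (%d GiB)\n", limit, limit/gib)
}
```

With the bug present, the limit was effectively the full node capacity, so nothing was held back for the kubelet and CRI-O; with the fix, pod memory pressure triggers eviction before the node itself runs out of memory.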
Description
Clayton Coleman
2020-02-13 17:35:07 UTC
Today we don't take control over the OOM handling. To the best of my knowledge, if pods are configured without hard limits (which is common), the default kernel OOM killer is invoked and it can kill any process on the node. For most of our core processes, systemd will restart them if they're killed, but we don't regularly test that. Adding reservations makes it less likely we'll overcommit in a situation with hard limits.

The recent trend has been userspace, policy-driven OOM handling, e.g. https://source.android.com/devices/tech/perf/lmkd and, most recently for us, https://fedoraproject.org/wiki/Changes/EnableEarlyoom. That one is about swap, but it's certainly possible to hit these issues even without swap. https://github.com/facebookincubator/oomd is also relevant.

All that said, let's get a bit more data here about what's happening; in particular, which process is being killed.

*** Bug 1809606 has been marked as a duplicate of this bug. ***

*** Bug 1814187 has been marked as a duplicate of this bug. ***

*** Bug 1811924 has been marked as a duplicate of this bug. ***
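As a hedged illustration of the "pods without hard limits" point in the comment above, the sketch below builds a pod whose container carries an explicit memory request and limit using the standard k8s.io/api types; the pod name, image, and sizes are made up. A container with a hard memory limit is killed within its own cgroup when it leaks past that limit, rather than leaving the node-wide kernel OOM killer to pick an arbitrary victim.

```go
// A hedged illustration (name, image, and sizes are hypothetical): a pod
// whose container sets an explicit memory request and limit, so a leak is
// contained by the container's own cgroup rather than being left to the
// node-wide kernel OOM killer.
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
	pod := corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{Name: "memory-bounded"},
		Spec: corev1.PodSpec{
			Containers: []corev1.Container{{
				Name:  "app",
				Image: "registry.example.com/app:latest", // hypothetical image
				Resources: corev1.ResourceRequirements{
					// The request keeps the pod out of the BestEffort QoS
					// class; the limit caps the container's memory cgroup.
					Requests: corev1.ResourceList{
						corev1.ResourceMemory: resource.MustParse("512Mi"),
					},
					Limits: corev1.ResourceList{
						corev1.ResourceMemory: resource.MustParse("512Mi"),
					},
				},
			}},
		},
	}
	fmt.Println(pod.Name, "memory limit:", pod.Spec.Containers[0].Resources.Limits.Memory().String())
}
```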