Bug 1669311
| Field | Value |
|---|---|
| Summary | A ovs process gets killed when oom-killer is invoked, leaving it in bad state. |
| Product | OpenShift Container Platform |
| Component | Networking |
| Networking sub component | openshift-sdn |
| Reporter | Ryan Howe <rhowe> |
| Assignee | Casey Callendrello <cdc> |
| QA Contact | zhaozhanqi <zzhao> |
| Status | CLOSED ERRATA |
| Severity | high |
| Priority | medium |
| CC | anusaxen, aos-bugs, cdc, danw, glamb, hpolava, jolee, openshift-bugs-escalate |
| Version | 3.10.0 |
| Target Release | 3.10.z |
| Hardware | Unspecified |
| OS | Unspecified |
| Doc Type | Bug Fix |
| Clones | 1671820, 1671822 |
| Last Closed | 2019-03-14 02:15:34 UTC |
| Type | Bug |
| Bug Blocks | 1671820, 1671822 |
Description
Ryan Howe
2019-01-24 22:59:22 UTC
We should also configure `ovs-ctl status` as a liveness probe.

Assigning to phil.

In speaking with our ovs contact, this is operating as designed. Adding a liveness probe, while possible, is unlikely to get the desired result: when it restarts vswitchd, the same resource pressure will still exist and the OOM killer will likely be invoked again. Ultimately, either more resources are needed or the load must be reduced.

Relaxed resource limits and added a liveness probe. How useful this is will become apparent when it is tried on the problem cluster.
https://github.com/openshift/cluster-network-operator/pull/80

You'll need to fix this in 3.10 and 3.11, too.

(In reply to Ryan Howe from comment #0)
> Steps to Reproduce:
>
> Invoke oom-killer
>
> kernel: Out of memory: Kill process 6779 (ovs-vswitchd) score 992 or
> sacrifice child
> kernel: Killed process 6779 (ovs-vswitchd) total-vm:443008kB,
> anon-rss:46600kB, file-rss:13548kB, shmem-rss:0kB

In what context did you encounter this exactly? OVS normally runs a monitor process that should restart ovs-vswitchd if it dies or is killed. There was a bug at one point where ovs-vswitchd was being OOM-killed *at startup*, but that should be fixed with current openshift-ansible.

1671820 is for the fix in 3.11; the fix will be cherry-picked to fix the bug here.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0405
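
For anyone checking whether a node hit the same condition, the kernel messages quoted in this report can be scanned for OOM kills of ovs-vswitchd. The following is a minimal sketch, not part of the fix: it assumes each log entry has already been joined onto a single line and follows the "Killed process ..." format quoted above. The function name `find_oom_kills` and the result keys are hypothetical.

```python
import re

# Assumed format, taken from the kernel line quoted in this report:
#   kernel: Killed process 6779 (ovs-vswitchd) total-vm:443008kB, anon-rss:46600kB, ...
KILLED_RE = re.compile(
    r"Killed process (?P<pid>\d+) \((?P<name>[^)]+)\)"
    r".*?total-vm:(?P<total_vm_kb>\d+)kB"
    r".*?anon-rss:(?P<anon_rss_kb>\d+)kB"
)


def find_oom_kills(lines, process_name="ovs-vswitchd"):
    """Return one dict per OOM kill of `process_name` found in `lines`.

    Hypothetical helper: `lines` is any iterable of log lines, for
    example the contents of a saved kernel log file.
    """
    kills = []
    for line in lines:
        match = KILLED_RE.search(line)
        if match and match.group("name") == process_name:
            kills.append(
                {
                    "pid": int(match.group("pid")),
                    "total_vm_kb": int(match.group("total_vm_kb")),
                    "anon_rss_kb": int(match.group("anon_rss_kb")),
                }
            )
    return kills


if __name__ == "__main__":
    # The two kernel lines from this report, with the quoted line wraps joined.
    sample = [
        "kernel: Out of memory: Kill process 6779 (ovs-vswitchd) score 992 or sacrifice child",
        "kernel: Killed process 6779 (ovs-vswitchd) total-vm:443008kB, "
        "anon-rss:46600kB, file-rss:13548kB, shmem-rss:0kB",
    ]
    for kill in find_oom_kills(sample):
        print(kill)
```

Only the "Killed process" line is matched; the preceding "Out of memory: Kill process" line announces the decision and is skipped so that each kill is counted once.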