Bug 1762748 - [4.2] oauth-proxy container OOM killed
Summary: [4.2] oauth-proxy container OOM killed
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: apiserver-auth
Version: 3.11.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 4.2.z
Assignee: Standa Laznicka
QA Contact: Anping Li
URL:
Whiteboard:
Depends On: 1759169
Blocks: 1762750
 
Reported: 2019-10-17 11:45 UTC by Standa Laznicka
Modified: 2019-12-11 22:36 UTC
CC List: 3 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1759169
Clones: 1762750
Environment:
Last Closed: 2019-12-11 22:36:06 UTC
Target Upstream Version:
Embargoed:


Attachments: none


Links
System ID                                  Status   Summary                                            Last Updated
Github openshift oauth-proxy pull 139      closed   [release-4.2] Bug 1762748: Fix high memory usage   2020-07-09 01:42:42 UTC
Red Hat Product Errata RHBA-2019:4093      None     None                                               2019-12-11 22:36:16 UTC

Description Standa Laznicka 2019-10-17 11:45:14 UTC
This bug was initially created as a copy of Bug #1759169


Description of problem:
The kibana-proxy container in the kibana pods was OOM-killed by simple HTTP GET requests to the kibana route or to the kibana svc:

Sep 30 11:38:17 node-3 kernel: oauth-proxy invoked oom-killer: gfp_mask=0x50, order=0, oom_score_adj=968
Sep 30 11:38:17 node-3 kernel: oauth-proxy cpuset=docker-a65ce7857b306c3eef1d020828f8705adf9ef84c256d011ad0ea9725c881473c.scope mems_allowed=0
Sep 30 11:38:17 node-3 kernel: CPU: 3 PID: 43442 Comm: oauth-proxy Kdump: loaded Tainted: G        W      ------------ T 3.10.0-957.el7.x86_64 #1
Sep 30 11:38:17 node-3 kernel: Hardware name: Red Hat OpenStack Compute, BIOS 1.11.0-2.el7 04/01/2014
Sep 30 11:38:17 node-3 kernel: Call Trace:
Sep 30 11:38:17 node-3 kernel: [<ffffffff99d61dc1>] dump_stack+0x19/0x1b
Sep 30 11:38:17 node-3 kernel: [<ffffffff99d5c7ea>] dump_header+0x90/0x229
Sep 30 11:38:17 node-3 kernel: [<ffffffff997b9dc6>] ? find_lock_task_mm+0x56/0xc0
Sep 30 11:38:17 node-3 kernel: [<ffffffff997ba274>] oom_kill_process+0x254/0x3d0
Sep 30 11:38:17 node-3 kernel: [<ffffffff99900a2c>] ? selinux_capable+0x1c/0x40
Sep 30 11:38:17 node-3 kernel: [<ffffffff99834f16>] mem_cgroup_oom_synchronize+0x546/0x570
Sep 30 11:38:17 node-3 kernel: [<ffffffff99834390>] ? mem_cgroup_charge_common+0xc0/0xc0
Sep 30 11:38:17 node-3 kernel: [<ffffffff997bab04>] pagefault_out_of_memory+0x14/0x90
Sep 30 11:38:17 node-3 kernel: [<ffffffff99d5acf2>] mm_fault_error+0x6a/0x157
Sep 30 11:38:17 node-3 kernel: [<ffffffff99d6f7a8>] __do_page_fault+0x3c8/0x500
Sep 30 11:38:17 node-3 kernel: [<ffffffff99d6f9c6>] trace_do_page_fault+0x56/0x150
Sep 30 11:38:17 node-3 kernel: [<ffffffff99d6ef42>] do_async_page_fault+0x22/0xf0
Sep 30 11:38:17 node-3 kernel: [<ffffffff99d6b788>] async_page_fault+0x28/0x30
Sep 30 11:38:17 node-3 kernel: Task in /kubepods.slice/kubepods-burstable.slice/kubepods-burstable-poddc5d931a_dfa3_11e9_aa40_fa163ea64b17.slice/docker-a65ce7857b306c3eef1d020828f8705adf9ef84c256d011ad0ea9725c881473c.scope killed as a result of limit of /kubepods.slice/kubepods-burstable.slice/kubepods-burstable-poddc5d931a_dfa3_11e9_aa40_fa163ea64b17.slice/docker-a65ce7857b306c3eef1d020828f8705adf9ef84c256d011ad0ea9725c881473c.scope
Sep 30 11:38:17 node-3 kernel: memory: usage 524288kB, limit 524288kB, failcnt 1834
Sep 30 11:38:17 node-3 kernel: memory+swap: usage 524288kB, limit 524288kB, failcnt 0
Sep 30 11:38:17 node-3 kernel: kmem: usage 0kB, limit 9007199254740988kB, failcnt 0
Sep 30 11:38:17 node-3 kernel: Memory cgroup stats for /kubepods.slice/kubepods-burstable.slice/kubepods-burstable-poddc5d931a_dfa3_11e9_aa40_fa163ea64b17.slice/docker-a65ce7857b306c3eef1d020828f8705adf9ef84c256d011ad0ea9725c881473c.scope: cache:1584KB rss:522704KB rss_huge:0KB mapped_file:0KB swap:0KB inactive_anon:0KB active_anon:522704KB inactive_file:844KB active_file:740KB unevictable:0KB
Sep 30 11:38:17 node-3 kernel: [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
Sep 30 11:38:17 node-3 kernel: [43367] 1000080000 43367   246541   132083     299        0           968 oauth-proxy
Sep 30 11:38:17 node-3 kernel: Memory cgroup out of memory: Kill process 13784 (oauth-proxy) score 1977 or sacrifice child
Sep 30 11:38:17 node-3 kernel: Killed process 43367 (oauth-proxy) total-vm:986164kB, anon-rss:521632kB, file-rss:6700kB, shmem-rss:0kB
Sep 30 11:38:18 node-3 atomic-openshift-node: I0930 11:38:18.444147     691 kubelet.go:1865] SyncLoop (PLEG): "logging-kibana-6-bx8qz_openshift-logging(dc5d931a-dfa3-11e9-aa40-fa163ea64b17)", event: &pleg.PodLifecycleEvent{ID:"dc5d931a-dfa3-11e9-aa40-fa163ea64b17", Type:"ContainerDied", Data:"a65ce7857b306c3eef1d020828f8705adf9ef84c256d011ad0ea9725c881473c"}
Sep 30 11:38:18 node-3 atomic-openshift-node: I0930 11:38:18.853633     691 kuberuntime_manager.go:513] Container {Name:kibana-proxy Image:registry.redhat.io/openshift3/oauth-proxy:v3.11.117 Command:[] Args:[--upstream-ca=/var/run/secrets/kubernetes.io/serviceaccount/ca.crt --https-address=:3000 -provider=openshift -client-id=kibana-proxy -client-secret-file=/secret/oauth-secret -cookie-secret-file=/secret/session-secret -upstream=http://localhost:5601 -scope=user:info user:check-access user:list-projects --tls-cert=/secret/server-cert --tls-key=/secret/server-key -pass-access-token -skip-provider-button] WorkingDir: Ports:[{Name:oaproxy HostPort:0 ContainerPort:3000 Protocol:TCP HostIP:}] EnvFrom:[] Env:[{Name:OAP_DEBUG Value:true ValueFrom:nil} {Name:OCP_AUTH_PROXY_MEMORY_LIMIT Value: ValueFrom:&EnvVarSource{FieldRef:nil,ResourceFieldRef:&ResourceFieldSelector{ContainerName:kibana-proxy,Resource:limits.memory,Divisor:0,},ConfigMapKeyRef:nil,SecretKeyRef:nil,}}] Resources:{Limits:map[memory:{i:{value:536870912 scale:0} d:{Dec:<nil>} s: Format:BinarySI}] Requests:map[cpu:{i:{value:100 scale:-3} d:{Dec:<nil>} s:100m Format:DecimalSI} memory:{i:{value:268435456 scale:0} d:{Dec:<nil>} s: Format:BinarySI}]} VolumeMounts:[{Name:kibana-proxy ReadOnly:true MountPath:/secret SubPath: MountPropagation:<nil>} {Name:aggregated-logging-kibana-token-pc9zf ReadOnly:true MountPath:/var/run/secrets/kubernetes.io/serviceaccount SubPath: MountPropagation:<nil>}] VolumeDevices:[] LivenessProbe:nil ReadinessProbe:nil Lifecycle:nil TerminationMessagePath:/dev/termination-log TerminationMessagePolicy:File ImagePullPolicy:IfNotPresent SecurityContext:&SecurityContext{Capabilities:&Capabilities{Add:[],Drop:[KILL MKNOD SETGID SETUID],},Privileged:nil,SELinuxOptions:nil,RunAsUser:*1000080000,RunAsNonRoot:nil,ReadOnlyRootFilesystem:nil,AllowPrivilegeEscalation:nil,RunAsGroup:nil,} Stdin:false StdinOnce:false TTY:false} is dead, but RestartPolicy says that we should restart it.
Sep 30 11:38:18 node-3 atomic-openshift-node: I0930 11:38:18.854051     691 kuberuntime_manager.go:757] checking backoff for container "kibana-proxy" in pod "logging-kibana-6-bx8qz_openshift-logging(dc5d931a-dfa3-11e9-aa40-fa163ea64b17)"




Version-Release number of selected component (if applicable):
OCPv3.11.117 ; kibana oauth proxy image: registry.redhat.io/openshift3/oauth-proxy:v3.11.117


How reproducible:
always: while true; do curl <kibana route or kibana svc>; sleep 1; done


Steps to Reproduce:
1. while true; do curl <kibana route or kibana svc>; sleep 1; done
2.
3.
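
A minimal reproducer script along these lines drives the load while also sampling per-container memory; the route URL and namespace below are placeholders, and the "oc adm top pod --containers" sampling is only added here to watch the growth (it is not part of the original one-liner):

#!/bin/bash
# Hammer the kibana route with simple GETs while watching per-container memory.
# ROUTE and NAMESPACE are placeholders for the cluster under test.
ROUTE="https://kibana-openshift-logging.apps.example.com"
NAMESPACE="openshift-logging"

while true; do
    # A plain unauthenticated GET is enough to trigger the memory growth in oauth-proxy.
    curl -sk -o /dev/null "$ROUTE"
    # Sample memory of the kibana pod containers (kibana-proxy is the one that grows).
    oc adm top pod -n "$NAMESPACE" --containers 2>/dev/null | grep -E 'POD|kibana'
    sleep 1
done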


Actual results:
The kibana-proxy container was OOM-killed after a few minutes; increasing the requests/limits only delays this, as the RAM is still eventually exhausted.


Expected results:
kibana-proxy is not OOM-killed and its memory is freed

Additional info:

Comment 2 Anping Li 2019-12-02 22:09:30 UTC
Verified with openshift/ose-oauth-proxy@sha256:2e2af0408544a55cce1ba4dbb3d436518151d6a8d553c96e5c323b24c1d54510. After running for 6 hours, the kibana-proxy pod was not restarted.


[anli@preserve-docker-slave 42]$ head kibana_top.logs 
POD                       NAME           CPU(cores)   MEMORY(bytes)   
kibana-5c4555bb8b-zk8xw   kibana-proxy   0m           8Mi             
kibana-5c4555bb8b-zk8xw   kibana         5m           89Mi            
POD                       NAME           CPU(cores)   MEMORY(bytes)   
kibana-5c4555bb8b-zk8xw   kibana         5m           89Mi            
kibana-5c4555bb8b-zk8xw   kibana-proxy   0m           8Mi             
POD                       NAME           CPU(cores)   MEMORY(bytes)   
kibana-5c4555bb8b-zk8xw   kibana         5m           89Mi            
kibana-5c4555bb8b-zk8xw   kibana-proxy   0m           8Mi             
POD                       NAME           CPU(cores)   MEMORY(bytes)   

[anli@preserve-docker-slave 42]$ tail kibana_
kibana_access.sh  kibana_check.sh   kibana_top.logs   
[anli@preserve-docker-slave 42]$ tail kibana_top.logs 
kibana-5c4555bb8b-zk8xw   kibana         8m           94Mi            
POD                       NAME           CPU(cores)   MEMORY(bytes)   
kibana-5c4555bb8b-zk8xw   kibana         8m           94Mi            
kibana-5c4555bb8b-zk8xw   kibana-proxy   0m           13Mi            
POD                       NAME           CPU(cores)   MEMORY(bytes)   
kibana-5c4555bb8b-zk8xw   kibana         8m           94Mi            
kibana-5c4555bb8b-zk8xw   kibana-proxy   0m           13Mi            
POD                       NAME           CPU(cores)   MEMORY(bytes)   
kibana-5c4555bb8b-zk8xw   kibana         8m           94Mi            
kibana-5c4555bb8b-zk8xw   kibana-proxy   0m           13Mi
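
The kibana_top.logs samples above look like repeated output of "oc adm top pod --containers"; a sampling loop of the following shape would produce that format. This is only an illustrative sketch, not the actual kibana_check.sh or kibana_access.sh used during verification, and the interval is an assumption:

#!/bin/bash
# Hypothetical sampling loop: append per-container CPU/memory for the kibana pod
# to kibana_top.logs. The POD/NAME/CPU(cores)/MEMORY(bytes) columns come straight
# from oc adm top pod --containers; namespace and 30s interval are assumptions.
while true; do
    oc adm top pod -n openshift-logging --containers | grep -E 'POD|kibana' >> kibana_top.logs
    sleep 30
done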

Comment 4 errata-xmlrpc 2019-12-11 22:36:06 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:4093

