Bug 1572587 - prometheus pods getting oomkilled @ 100 node scale
Summary: prometheus pods getting oomkilled @ 100 node scale
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Monitoring
Version: 3.10.0
Hardware: All
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 3.10.0
Assignee: Dan Mace
QA Contact: Mike Fiedler
URL:
Whiteboard: aos-scalability-310
Depends On:
Blocks:
 
Reported: 2018-04-27 11:25 UTC by Jeremy Eder
Modified: 2018-12-20 21:46 UTC
CC: 8 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-12-20 21:12:30 UTC
Target Upstream Version:


Attachments


Links
System: Github
Summary: openshift/openshift-ansible pull 8514
URL: https://github.com/openshift/openshift-ansible/pull/8514
Last Updated: 2020-06-30 06:14:23 UTC

Comment 1 Simon Pasquier 2018-04-27 12:51:45 UTC
Depending on the actual memory limits, you may or may not be affected by this, but other people have reported memory leaks with Prometheus 2.2.1 [1]. There's an open PR [2] that seems to fix the problem, but it may have surfaced other issues.

[1] https://github.com/prometheus/prometheus/issues/4095
[2] https://github.com/prometheus/prometheus/pull/4013

Comment 2 Jeremy Eder 2018-04-27 16:09:08 UTC
# oc edit cm -n openshift-monitoring cluster-monitoring-config

Add the resources line shown below to the prometheusK8s section:

    prometheusK8s:                                                                                                                                                          
      baseImage: quay.io/prometheus/prometheus                                                                                                                              
      resources: {}

Wait patiently; in my case it took about 5 minutes for both prometheus-k8s-N pods to stop and restart.
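
For reference, here is a minimal sketch of what the edited ConfigMap might look like as a whole. The config.yaml data key and exact layout are my assumptions about how cluster-monitoring-operator reads its configuration in this release, so treat it as illustrative rather than authoritative:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: cluster-monitoring-config
      namespace: openshift-monitoring
    data:
      config.yaml: |
        prometheusK8s:
          baseImage: quay.io/prometheus/prometheus
          # An empty resources block drops the default requests/limits,
          # so the pods are no longer killed against a too-small memory limit.
          resources: {}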

# oc get pod -n openshift-monitoring prometheus-k8s-1 -o yaml

Now you will see:

    name: prometheus                                                                                                                                                        
    resources:                                                                                                                                                              
      requests:                                                                                                                                                             
        memory: 2Gi     

The RSS of these Prometheus processes is currently about 2.2G, with each using 0.2 cores (scale lab environment, 100-node cluster, 600 pods).
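
For a quick spot check of live usage, something like the following should work, assuming cluster metrics are available to oc adm top in this environment:

    # oc adm top pod -n openshift-monitoring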

I think we need to disable the limits in-product, or at least bump them to something like 30G (number taken from starter clusters, see attached image).
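
If we bump the limits rather than remove them, the override in cluster-monitoring-config would look roughly like this (the 2Gi request matches what the operator sets today; 30Gi is only the ballpark figure from the starter clusters, not a tested value):

    prometheusK8s:
      resources:
        requests:
          memory: 2Gi
        limits:
          memory: 30Gi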

Comment 5 Dan Mace 2018-05-21 17:16:31 UTC
(In reply to Dan Mace from comment #4)
> https://github.com/openshift/openshift-ansible/pull/8442

New upstream fix: https://github.com/openshift/cluster-monitoring-operator/pull/19

This will also require an openshift-ansible PR to pull in a new cluster-monitoring-operator release, which I'll link here.

Comment 6 Dan Mace 2018-05-23 14:48:59 UTC
The fix for this is ready; I'm still trying to get a new cluster-monitoring-operator release pushed so I can open a new openshift-ansible PR.

Comment 10 Mike Fiedler 2018-06-05 18:57:45 UTC
Moving back to ASSIGNED based on comment 9.

Comment 11 Dan Mace 2018-06-06 17:02:00 UTC
This can be tested with the release of https://github.com/openshift/openshift-ansible/pull/8591

Comment 12 Wei Sun 2018-06-08 02:01:29 UTC
The PR has been merged into openshift-ansible-3.10.0-0.63.0. Please check.

Comment 13 Mike Fiedler 2018-06-11 15:46:22 UTC
Verified on 3.10.0-0.64.0. prometheus-operator is now using a PV/PVC for persistence, and resource limits have been removed from the deployments/statefulsets/daemonsets.
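
For anyone re-verifying, a couple of spot checks along these lines should confirm it (object names assumed from the default openshift-monitoring deployment and may differ by release):

    # oc get statefulset prometheus-k8s -n openshift-monitoring -o jsonpath='{.spec.template.spec.containers[*].resources}'
    # oc get pvc -n openshift-monitoring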

