Bug 1576543
| Summary: | prometheus-operator pods getting OoM killed @ 750 nodes | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Jiří Mencák <jmencak> |
| Component: | Monitoring | Assignee: | Frederic Branczyk <fbranczy> |
| Status: | CLOSED ERRATA | QA Contact: | Mike Fiedler <mifiedle> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 3.10.0 | CC: | aos-bugs, byron.collins, dmace, jeder, juzhao, lcosic, mifiedle, spasquie |
| Target Milestone: | --- | | |
| Target Release: | 4.2.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | aos-scalability-310 | | |
| Fixed In Version: | | Doc Type: | No Doc Update |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2019-10-16 06:27:40 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Jiří Mencák
2018-05-09 17:19:11 UTC
We have made various scalability changes for 4.0; this needs to be re-assessed in the 4.0 scope.

@Mike Could you also help test this aos-scalability bug? I did not find this issue in one smaller cluster, and I am not sure whether it would happen in a larger cluster.

Marking verified on 4.2. There won't be another 750+ node cluster run until post-4.2, and a new bz can be opened then if there is an issue. In a 250-node cluster on GCP, prometheus-operator is using 80 MB VSZ and 2 MB RSS.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2922