Bug 1707488
| Summary: | containerized RGW default memory too high | ||
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Ceph Storage | Reporter: | Tim Wilkinson <twilkins> |
| Component: | Ceph-Ansible | Assignee: | Guillaume Abrioux <gabrioux> |
| Status: | CLOSED ERRATA | QA Contact: | Ameena Suhani S H <amsyedha> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 4.0 | CC: | amsyedha, aschoen, bengland, ceph-eng-bugs, gmeno, jharriga, karan, mbenjamin, mkogan, nthomas, tserlin, vereddy |
| Target Milestone: | z2 | Keywords: | Reopened |
| Target Release: | 4.1 | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | ceph-ansible-4.0.29-1.el8cp, ceph-ansible-4.0.29-1.el7cp | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2020-09-30 17:24:49 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Comment 7
Giridhar Ramaraju
2019-08-05 13:09:13 UTC
Updating the QA Contact to Hemant. Hemant will be rerouting them to the appropriate QE Associate. Regards, Giri

About comment 1: you said you set the cgroup memory limit to 2 GB, but the OOM kill happened at 6 GB. Why wasn't it killed at 2 GB? Also, why didn't the clients fail over to a different RGW server and continue running? Perhaps a load balancer wasn't used?

About comment 3: if "we cannot identify a reliable memory limit", then the proposed workaround does not really prevent the problem from occurring later, it just postpones it, right? We have to know ahead of time how much memory RGW requires, for a variety of reasons.

cc'ing Karan Singh, who has worked with RGW in some really large configurations (1 billion objects). https://docs.google.com/document/d/1uKq5TLZFDc5IWVCa5EekWQU6eoB5QOmBXE05FVpy6QU/edit

Karan, any sign of RGW daemon memory usage growth during your tests? Matt, what's the next step here?

Matt / Mkogan, I am in the middle of ingesting 10 billion objects (as I write this email, 800 million have been successfully ingested). If you want me to capture this data point, please provide me with instructions for capturing it. Currently, I do not get any metrics named for RGW memory in Prometheus. If you like, I can give you SSH access to the environment.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat Ceph Storage 4.1 Bug Fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:4144
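
Regarding the request above for a way to capture RGW memory usage when no Prometheus metric is available: one possible approach, offered only as a rough sketch and not as the method used in this bug, is to sample the daemon's resident set size from `/proc` on the RGW host. The process name `radosgw`, the 60-second interval, and all function names below are assumptions made for illustration. The script also assumes it runs somewhere that can see the daemon's PID (for example, on the container host).

```python
import re
import time
from pathlib import Path
from typing import List, Optional


def parse_vm_rss_kib(status_text: str) -> Optional[int]:
    """Extract the VmRSS value (in KiB) from the text of /proc/<pid>/status."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def find_pids(process_name: str) -> List[int]:
    """Return PIDs whose /proc/<pid>/comm matches process_name exactly."""
    pids = []
    for comm in Path("/proc").glob("[0-9]*/comm"):
        try:
            if comm.read_text().strip() == process_name:
                pids.append(int(comm.parent.name))
        except OSError:
            # The process exited between listing and reading; skip it.
            continue
    return pids


def read_rss_kib(pid: int) -> Optional[int]:
    """Read the current resident set size of a PID, or None if unavailable."""
    try:
        return parse_vm_rss_kib(Path(f"/proc/{pid}/status").read_text())
    except OSError:
        return None


if __name__ == "__main__":
    PROCESS_NAME = "radosgw"  # assumption: name of the RGW daemon process
    INTERVAL_SECONDS = 60     # assumption: arbitrary sampling interval
    while True:
        for pid in find_pids(PROCESS_NAME):
            rss = read_rss_kib(pid)
            if rss is not None:
                # One CSV line per sample: epoch seconds, PID, RSS in KiB.
                print(f"{int(time.time())},{pid},{rss}", flush=True)
        time.sleep(INTERVAL_SECONDS)
```

Redirecting the output to a file over the course of an ingest run would give a time series of RSS per daemon, which could then be compared against the configured container memory limit.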