Bug 1868773 - [libvirt]: OCP 4.6 installation fails due to OOM in the bootstrap node
Summary: [libvirt]: OCP 4.6 installation fails due to OOM in the bootstrap node
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Multi-Arch
Version: 4.6
Hardware: All
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.6.0
Assignee: Prashanth Sundararaman
QA Contact: Jeremy Poulin
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-08-13 19:09 UTC by Prashanth Sundararaman
Modified: 2020-10-27 16:28 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-10-27 16:28:34 UTC
Target Upstream Version:
Embargoed:


Attachments
journal logs (847.86 KB, text/plain)
2020-08-13 19:33 UTC, Prashanth Sundararaman
top output (69.73 KB, text/plain)
2020-08-13 19:33 UTC, Prashanth Sundararaman


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift installer pull 4080 | 0 | None | closed | Bug 1868773: Libvirt: Bump bootstrap memory to 4G | 2020-10-05 17:33:40 UTC
Red Hat Product Errata | RHBA-2020:4196 | 0 | None | None | None | 2020-10-27 16:28:52 UTC

Description Prashanth Sundararaman 2020-08-13 19:09:24 UTC
Description of problem:
With 4.6, libvirt installations consistently fail because bootkube never finishes: the OOM killer kicks in while bootkube is running. The top output shows the kube-apiserver process using 65% of memory.
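
For anyone trying to confirm the same symptom, a minimal sketch of scanning journal output for OOM kills follows. It assumes the kernel logs its usual "Killed process <pid> (<name>)" message; the function name and the sample lines are hypothetical illustrations, not taken from the attached journal logs.

```python
import re

# Matches the kernel's "Killed process <pid> (<name>)" OOM message.
OOM_KILL = re.compile(r"Killed process (\d+) \(([^)]+)\)")


def find_oom_kills(journal_text):
    """Return a (pid, process_name) tuple for each OOM kill found in the text."""
    return [(int(m.group(1)), m.group(2)) for m in OOM_KILL.finditer(journal_text)]


# Illustrative lines only; these are NOT from the attachments on this bug.
sample = """\
Aug 13 19:00:01 bootstrap kernel: kube-apiserver invoked oom-killer: gfp_mask=0x100cca, order=0
Aug 13 19:00:01 bootstrap kernel: Out of memory: Killed process 4242 (kube-apiserver) total-vm:2000000kB
Aug 13 19:00:05 bootstrap systemd[1]: Started Bootstrap a Kubernetes cluster.
"""

print(find_oom_kills(sample))
```

In practice the input would be the saved journal from the bootstrap node rather than an inline string.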

The bootstrap node has 2G of memory, which has been sufficient until now. Even with 4.5 deploys we come pretty close to hitting the limit, but it is just enough. In 4.6, the memory consumption of kube-apiserver appears to have increased just enough to push it over the edge.

This affects the e2e-libvirt CI job and multi-arch CI jobs which run on libvirt.

It would be pretty simple to just bump the memory for the bootstrap node, but we first want to make sure the increased memory usage does not point to an underlying problem.

Version-Release number of selected component (if applicable):
4.6

How reproducible:
Always on libvirt deploys

Comment 1 Prashanth Sundararaman 2020-08-13 19:33:06 UTC
Created attachment 1711375
journal logs

Comment 2 Prashanth Sundararaman 2020-08-13 19:33:32 UTC
Created attachment 1711376
top output

Comment 7 errata-xmlrpc 2020-10-27 16:28:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196

