Bug 1498213 - Increase ARP cache size on loadbalancers
Summary: Increase ARP cache size on loadbalancers
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.7.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 3.7.0
Assignee: Jiří Mencák
QA Contact: Johnny Liu
URL:
Whiteboard: aos-scalability-37
Depends On:
Blocks:
Reported: 2017-10-03 18:21 UTC by Jiří Mencák
Modified: 2017-12-12 06:03 UTC

Fixed In Version:
Doc Type: Known Issue
Doc Text:
This is a known issue: the OpenShift tuned profiles were never applied on RHEL Atomic Host, only on RHEL.
Clone Of:
Environment:
Last Closed: 2017-11-28 22:14:33 UTC
Target Upstream Version:
Embargoed:


Links
System: Red Hat Product Errata
ID: RHSA-2017:3188
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Moderate: Red Hat OpenShift Container Platform 3.7 security, bug, and enhancement update
Last Updated: 2017-11-29 02:34:54 UTC

Description Jiří Mencák 2017-10-03 18:21:16 UTC
Description of problem:
On RHEL Atomic Host, the ARP garbage collection thresholds are too low, which causes problems in OCP HA deployments with 1k+ nodes.

Version-Release number of selected component (if applicable):
All

How reproducible:
Always

Steps to Reproduce:
1. Install an OCP HA cluster with a loadbalancer on RHEL Atomic Host, then query the sysctl values for net.ipv[46].neigh.default.gc_thresh[1-3].

Actual results:
net.ipv4.neigh.default.gc_thresh1 = 128
net.ipv4.neigh.default.gc_thresh2 = 512
net.ipv4.neigh.default.gc_thresh3 = 1024
net.ipv6.neigh.default.gc_thresh1 = 128
net.ipv6.neigh.default.gc_thresh2 = 512
net.ipv6.neigh.default.gc_thresh3 = 1024

Expected results:
net.ipv4.neigh.default.gc_thresh1 = 8192
net.ipv4.neigh.default.gc_thresh2 = 32768
net.ipv4.neigh.default.gc_thresh3 = 65536
net.ipv6.neigh.default.gc_thresh1 = 8192
net.ipv6.neigh.default.gc_thresh2 = 32768
net.ipv6.neigh.default.gc_thresh3 = 65536
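
For anyone who needs these values on a host before picking up the fix, here is a minimal sketch of one manual way to produce them. It is not necessarily how the linked pull request applies them (the Doc Text refers to tuned profiles). The function name print_gc_thresh_conf is hypothetical; the keys and values are the ones listed under "Expected results".

```shell
# Hypothetical helper: print a sysctl.d-style fragment containing the
# expected neighbour-table GC thresholds for both IPv4 and IPv6.
print_gc_thresh_conf() {
    for fam in ipv4 ipv6; do
        printf 'net.%s.neigh.default.gc_thresh1 = 8192\n' "$fam"
        printf 'net.%s.neigh.default.gc_thresh2 = 32768\n' "$fam"
        printf 'net.%s.neigh.default.gc_thresh3 = 65536\n' "$fam"
    done
}
```

As root, the output could be redirected to a drop-in file (for example /etc/sysctl.d/99-neigh-gc-thresh.conf, a file name assumed here for illustration) and loaded with "sysctl --system". Settings made this way may be overridden by an active tuned profile that sets the same keys.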

Additional info:
https://github.com/openshift/openshift-ansible/pull/5645

Comment 1 Johnny Liu 2017-10-13 05:41:43 UTC
Verified this bug with openshift-ansible-3.7.0-0.148.0.git.0.b35eb14.el7.noarch. Result: PASS.

After installation, check the values.
On the LB host:
# sysctl -a |grep "neigh.default.gc_thresh"
net.ipv4.neigh.default.gc_thresh1 = 8192
net.ipv4.neigh.default.gc_thresh2 = 32768
net.ipv4.neigh.default.gc_thresh3 = 65536
net.ipv6.neigh.default.gc_thresh1 = 8192
net.ipv6.neigh.default.gc_thresh2 = 32768
net.ipv6.neigh.default.gc_thresh3 = 65536


On a node host:
# sysctl -a |grep "neigh.default.gc_thresh"
net.ipv4.neigh.default.gc_thresh1 = 8192
net.ipv4.neigh.default.gc_thresh2 = 32768
net.ipv4.neigh.default.gc_thresh3 = 65536
net.ipv6.neigh.default.gc_thresh1 = 8192
net.ipv6.neigh.default.gc_thresh2 = 32768
net.ipv6.neigh.default.gc_thresh3 = 65536
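
The manual comparison above could be automated across many hosts. The following is a sketch only, not part of the original verification: the function name check_gc_thresh is hypothetical, and it assumes the "key = value" format that sysctl prints in the transcripts above. It reads sysctl output on stdin, reports each threshold that is missing or differs from the expected value, and returns non-zero if it finds any.

```shell
# Hypothetical helper: validate gc_thresh values read from stdin.
check_gc_thresh() {
    awk -F'[ =]+' '
        BEGIN {
            # Expected values taken from this bug report.
            want[1] = 8192; want[2] = 32768; want[3] = 65536
            bad = 0
        }
        # Match lines such as "net.ipv4.neigh.default.gc_thresh1 = 8192".
        /^net\.ipv[46]\.neigh\.default\.gc_thresh[1-3] *= *[0-9]+ *$/ {
            key = $1
            val = $2 + 0
            n = substr(key, length(key), 1)   # trailing digit: 1, 2 or 3
            seen[key] = 1
            if (val != want[n]) {
                printf "MISMATCH %s: got %d, want %d\n", key, val, want[n]
                bad = 1
            }
        }
        END {
            # Report any of the six expected keys that never appeared.
            for (fam = 4; fam <= 6; fam += 2)
                for (n = 1; n <= 3; n++) {
                    key = "net.ipv" fam ".neigh.default.gc_thresh" n
                    if (!(key in seen)) {
                        printf "MISSING %s\n", key
                        bad = 1
                    }
                }
            exit bad
        }
    '
}
```

On a host it could be run as: sysctl -a 2>/dev/null | check_gc_thresh && echo PASS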

Comment 4 errata-xmlrpc 2017-11-28 22:14:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:3188

