Bug 1620556 - [3.10.14] ovs Pods OOMKilled on baremetal nodes [NEEDINFO]
Summary: [3.10.14] ovs Pods OOMKilled on baremetal nodes
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 3.10.0
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 3.10.z
Assignee: Dan Williams
QA Contact: Meng Bo
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-08-23 07:32 UTC by Jaspreet Kaur
Modified: 2019-06-11 09:31 UTC
CC: 10 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-06-11 09:30:48 UTC
Target Upstream Version:
cdc: needinfo? (jkaur)
dcbw: needinfo? (jkaur)


Attachments
ovs pid maps (cat /proc/`pidof ovs-vswitchd`/maps) (15.48 KB, text/plain)
2018-08-27 08:17 UTC, jtudelag
ovs pid smaps (cat /proc/<pidof ovs-vswitchd>/smaps) (99.40 KB, text/plain)
2018-09-21 12:43 UTC, jtudelag


Links
Red Hat Product Errata RHBA-2019:0786, last updated 2019-06-11 09:30:59 UTC

Description Jaspreet Kaur 2018-08-23 07:32:52 UTC
Description of problem: After deploying OCP 3.10.14, the ovs Pods are being killed by the OOM killer, especially on first start right after the cluster is deployed.

We are hitting issue [1], which is marked as resolved but clearly is not.

The limits of the DaemonSet that runs the OVS Pods are very small; they can be found here [2]:

limits:
  cpu: 200m
  memory: 400Mi

Clearly these limits are too low when running OCP on large baremetal nodes.


[1] https://bugzilla.redhat.com/show_bug.cgi?id=1571379
[2] https://github.com/openshift/openshift-ansible/blob/release-3.10/roles/openshift_sdn/files/sdn-ovs.yaml#L96
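
For reference, a quick way to check the limits actually applied on a running cluster. This is only a sketch and assumes the DaemonSet deployed by [2] is named ovs in the openshift-sdn namespace:

$ oc -n openshift-sdn get daemonset ovs \
    -o jsonpath='{.spec.template.spec.containers[0].resources.limits}'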


Version-Release number of the following components:
rpm -q openshift-ansible
rpm -q ansible
ansible --version

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results: Even with the fix for [1], the issue is still seen.

Expected results: The issue should not be seen.

Additional info:

Three possible options:

1. Set bigger limits, e.g. 1000m and 1Gi.
2. Make the installer intelligent enough to calculate/tune these values based on the node size.
3. Allow setting them explicitly in the inventory (see the sketch below).
* e.g. openshift_node_ovs_cpu_limit: 1000m, openshift_node_ovs_memory_limit: 1Gi
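
A hypothetical usage sketch for option 3, assuming variables with these names were added to openshift-ansible (they are only proposed here and do not exist at the time of this report), and assuming the standard 3.10 deploy playbook path:

$ ansible-playbook -i hosts playbooks/deploy_cluster.yml \
    -e openshift_node_ovs_cpu_limit=1000m \
    -e openshift_node_ovs_memory_limit=1Gi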

Related OVS bug:

https://bugzilla.redhat.com/show_bug.cgi?id=1572797

Comment 1 jtudelag 2018-08-23 08:14:45 UTC
> rpm -q openshift-ansible
openshift-ansible-3.10.21-1.git.0.6446011.el7.noarch

> rpm -q ansible
ansible-2.4.4.0-1.el7ae.noarch

> ansible --version
ansible 2.4.4.0
  config file = /etc/ansible/ansible.cfg
  configured module search path = [u'/home/jorget/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python2.7/site-packages/ansible
  executable location = /usr/bin/ansible
  python version = 2.7.5 (default, May 31 2018, 09:41:32) [GCC 4.8.5 20150623 (Red Hat 4.8.5-28)]

> rpm -qa | grep openshift
openshift-ansible-playbooks-3.10.21-1.git.0.6446011.el7.noarch
atomic-openshift-hyperkube-3.10.14-1.git.0.ba8ae6d.el7.x86_64
atomic-openshift-excluder-3.10.14-1.git.0.ba8ae6d.el7.noarch
openshift-ansible-roles-3.10.21-1.git.0.6446011.el7.noarch
atomic-openshift-docker-excluder-3.10.14-1.git.0.ba8ae6d.el7.noarch
atomic-openshift-3.10.14-1.git.0.ba8ae6d.el7.x86_64
atomic-openshift-clients-3.10.14-1.git.0.ba8ae6d.el7.x86_64
openshift-ansible-3.10.21-1.git.0.6446011.el7.noarch
openshift-ansible-docs-3.10.21-1.git.0.6446011.el7.noarch
atomic-openshift-node-3.10.14-1.git.0.ba8ae6d.el7.x86_64

> oc version
oc v3.10.14
kubernetes v1.10.0+b81c8f8
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://XXXXXXXXX:8443
openshift v3.10.14
kubernetes v1.10.0+b81c8f8

Comment 2 Casey Callendrello 2018-08-23 12:57:29 UTC
We had a similar issue in https://bugzilla.redhat.com/show_bug.cgi?id=1571379 -- we fixed a bug where ovs-vswitchd was using 8 MiB per core.

This was merged here - https://github.com/openshift/openshift-ansible/pull/8166/commits/6d9ad9d1ac4c95ea38a8b1aa7d94ac698724c755

How many cores and how much RAM does the node have?

Ideally we can actually clamp the memory usage, but if we're not able to, we can add an override to openshift-ansible and give some guidelines.

Comment 3 jtudelag 2018-08-23 14:19:43 UTC
In reply to comment #2 (https://bugzilla.redhat.com/show_bug.cgi?id=1620556#c2), here you are:

> free -h
              total        used        free      shared  buff/cache   available
Mem:           377G        8.9G        360G         27M        7.8G        367G
Swap:            0B          0B          0B

> lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                64
On-line CPU(s) list:   0-63
Thread(s) per core:    2
Core(s) per socket:    16
Socket(s):             2
NUMA node(s):          4
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 85
Model name:            Intel(R) Xeon(R) Gold 6142 CPU @ 2.60GHz
Stepping:              4
CPU MHz:               2600.000
BogoMIPS:              5200.00
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              1024K
L3 cache:              22528K
NUMA node0 CPU(s):     0-7,32-39
NUMA node1 CPU(s):     8-15,40-47
NUMA node2 CPU(s):     16-23,48-55
NUMA node3 CPU(s):     24-31,56-63

Comment 4 Dan Williams 2018-08-24 16:02:06 UTC
Could I get:

rpm -qv openvswitch

?

Comment 5 Dan Williams 2018-08-24 16:09:26 UTC
Do the OVS pods get killed immediately, or does the OOM take some time?  If you are able, could you grab:

/proc/`pidof ovs-vswitchd`/maps
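
A minimal sketch of how this can be gathered from the OVS pod and saved locally (the pod name is a placeholder; this assumes pidof is available inside the pod image):

$ oc -n openshift-sdn exec <ovs-pod-name> -- sh -c 'cat /proc/`pidof ovs-vswitchd`/maps' > ovs-vswitchd-maps.txt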

Comment 6 jtudelag 2018-08-27 08:08:15 UTC
(In reply to Dan Williams from comment #4)
> Could I get:
> 
> rpm -qv openvswitch
> 
> ?

As I already said, we are using OCP 3.10.14:

$ oc -n openshift-sdn rsh ovs-drtgw
sh-4.2# rpm -qv openvswitch
openvswitch-2.9.0-47.el7fdp.2.x86_64

Comment 7 jtudelag 2018-08-27 08:17:24 UTC
Created attachment 1478888 [details]
ovs pid maps (cat /proc/`pidof ovs-vswitchd`/maps)

Comment 8 jtudelag 2018-08-27 08:23:27 UTC
(In reply to Dan Williams from comment #5)
> Do the OVS pods get killed immediately, or does the OOM take some time?  If
> you are able, could you grab:

I would say it takes some time. On a cluster with 8 nodes (all exactly the same hardware, big baremetal), after the installer finished OK, some of the OVS Pods started fine while others were killed by the OOM killer repeatedly.

I first tried manually deleting, in order, the OVS pod and then the SDN pod on each of the impacted nodes, with no luck. I then increased the limits to 1 CPU and 1Gi of memory, deleted all OVS and SDN Pods manually, and they all started fine.

> 
> /proc/`pidof ovs-vswitchd`/maps

I just attached the output you requested, gathered inside one of the OVS Pods. I don't know if it is relevant, but keep in mind that this output is from an OVS Pod running with the 1 CPU and 1Gi limits.
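
For reference, a minimal sketch of the workaround described above (raising the limits and letting the DaemonSet recreate the pods). The DaemonSet name and pod label are assumptions, and openshift-ansible may revert manual edits on a subsequent run:

$ oc -n openshift-sdn set resources daemonset/ovs --limits=cpu=1,memory=1Gi
$ oc -n openshift-sdn delete pods -l app=ovs   # label assumed; verify with 'oc get pods --show-labels'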

Comment 9 Dan Williams 2018-09-07 16:19:07 UTC
Could I get /proc/<pidof ovs-vswitchd>/smaps from the system?

Comment 10 Dan Williams 2018-09-20 19:55:54 UTC
Ping on this issue?  smaps from ovs-vswitchd will help debug the issue further.

Comment 11 jtudelag 2018-09-21 12:43:22 UTC
Created attachment 1485488 [details]
ovs pid smaps (cat /proc/<pidof ovs-vswitchd>/smaps)

Comment 22 errata-xmlrpc 2019-06-11 09:30:48 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0786

