1564944 – Ensure pods from default, openshift-infra, and logging namespaces are spread evenly across infra structure nodes

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Logging
Sub Component:
Version:	3.9.0
Hardware:	All
OS:	Linux
Priority:	high
Severity:	high
Target Milestone:	---
Target Release:	3.9.z
Assignee:	ewolinet
QA Contact:	Anping Li
Docs Contact:
URL:
Whiteboard:
Duplicates (1):	1563852
Depends On:
Blocks:

Reported:	2018-04-09 01:50 UTC by Peter Portante
Modified:	2021-01-18 05:25 UTC
CC List:	7 users
Fixed In Version:
Doc Type:	No Doc Update
Clone Of:
Environment:
Last Closed:	2018-05-17 06:43:35 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2018:1566	0	None	None	None	2018-05-17 06:44:33 UTC

Description Peter Portante 2018-04-09 01:50:10 UTC

We need to ensure pods from the default, openshift-infra, and logging namespaces are spread evenly across infrastructure nodes. For example, on a 3-node infrastructure setup, 2 of 3 ES pods can land on one node, while 3 registry pods land on another and 2 of 3 Cassandra pods land on yet another.

These infrastructure pods expect NOT to compete for resources on a single infra node with other pods from their scale group.

Consider using a host port mapping, like what the router does in the default namespace, to ensure each pod from a group lands on a separate node.
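
A minimal sketch of why a host port mapping forces this spread, assuming the scheduler only places a pod on a node where the requested host port is still free. The port number 9200, the pod names, and the node names are illustrative assumptions, and `place_pods` is a toy first-fit model, not the real scheduler.

```python
def host_port_entry(container_port, host_port):
    """Build a container `ports` entry that also claims `host_port` on the node."""
    return {"containerPort": container_port, "hostPort": host_port, "protocol": "TCP"}


def place_pods(pods, nodes, host_port):
    """Toy first-fit placement: a node accepts a pod only if host_port is free there."""
    ports_in_use = {node: set() for node in nodes}
    placement = {}
    for pod in pods:
        # Pick the first node on which nobody has claimed the host port yet.
        target = next((n for n in nodes if host_port not in ports_in_use[n]), None)
        if target is not None:
            ports_in_use[target].add(host_port)
        placement[pod] = target  # None models a pod that stays Pending
    return placement


if __name__ == "__main__":
    # Hypothetical names: three ES pods on three infra nodes.
    print(host_port_entry(9200, 9200))
    print(place_pods(["es-0", "es-1", "es-2"], ["infra-1", "infra-2", "infra-3"], 9200))
```

The trade-off visible in the toy model is that a group with more pods than infra nodes leaves the extra pods unscheduled, since the host port acts as a hard constraint.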

Comment 1 Jeff Cantrill 2018-04-09 13:01:14 UTC

You mean logging pods or all infra pods?  The latter implies we may wish to clone this issue to ensure other teams make the requested changes.

Comment 2 ewolinet 2018-04-09 14:55:41 UTC

Could we potentially overlap the two? 

We could configure pod anti-affinity in our DCs and use a default label that matches the other components of the same type (e.g. ES has anti-affinity with other ES pods, Kibana with other Kibana pods). We could use the preferred rule so that scheduling doesn't break on clusters that are too small [1].

We could provide a means to specify additional match expressions so that admins could balance out with other infra node pods.

[1] https://docs.openshift.com/container-platform/3.9/admin_guide/scheduling/pod_affinity.html#admin-guide-sched-affinity-examples2-pods
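
A rough sketch of the kind of stanza this would produce, written as a Python helper that builds the `affinity` dictionary for a pod template. The label key/value (`component` / `es`), the weight, and the helper name are assumptions for illustration, not what the installer necessarily emits. Because match expressions inside a single label selector are ANDed, the sketch puts each admin-supplied selector into its own preferred term rather than appending it to the default one.

```python
def preferred_anti_affinity(label_key, label_value, extra_selectors=None, weight=100):
    """Build a soft (preferred) podAntiAffinity stanza keyed on the node hostname.

    extra_selectors: optional list of matchExpressions lists; each becomes its
    own weighted term so that it is scored independently of the default term.
    """
    def term(expressions):
        return {
            "weight": weight,  # valid weights are 1-100
            "podAffinityTerm": {
                "labelSelector": {"matchExpressions": expressions},
                "topologyKey": "kubernetes.io/hostname",
            },
        }

    default = [{"key": label_key, "operator": "In", "values": [label_value]}]
    terms = [term(default)]
    for expressions in extra_selectors or []:
        terms.append(term(expressions))
    return {"podAntiAffinity": {"preferredDuringSchedulingIgnoredDuringExecution": terms}}


if __name__ == "__main__":
    import json

    # Hypothetical labels: ES avoids other ES pods, and also prefers to avoid registry pods.
    extra = [[{"key": "docker-registry", "operator": "Exists"}]]
    print(json.dumps(preferred_anti_affinity("component", "es", extra), indent=2))
```

Since the rule is preferred rather than required, a cluster with fewer nodes than replicas still schedules every pod; the scheduler only scores co-located placements lower.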

Comment 3 Peter Portante 2018-04-09 15:26:57 UTC

(In reply to Jeff Cantrill from comment #1)
> You mean logging pods or all infra pods?

All infra pods.

(In reply to ewolinet from comment #2)
> Could we potentially overlap the two? 

It is not clear that pod affinity/anti-affinity actually helps us, but using host port mappings will do what we need.

Comment 4 ewolinet 2018-04-09 19:00:37 UTC

https://github.com/openshift/openshift-ansible/pull/7864

Comment 5 ewolinet 2018-04-18 21:08:00 UTC

3.9 Cherrypick https://github.com/openshift/openshift-ansible/pull/8031

Comment 6 Jeff Cantrill 2018-04-19 17:46:25 UTC

*** Bug 1563852 has been marked as a duplicate of this bug. ***

Comment 10 Anping Li 2018-05-07 03:29:21 UTC

Verified with ose-ansible:v3.9.27: podAntiAffinity has been added to the ES and Kibana pods.

Comment 13 errata-xmlrpc 2018-05-17 06:43:35 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1566
