Bug 1680062 - Two openshift-ingress router-default pods running on same worker node after install
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Routing
Version: 4.1.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.1.0
Assignee: Dan Mace
QA Contact: Hongan Li
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-02-22 16:18 UTC by Mike Fiedler
Modified: 2019-06-04 10:44 UTC (History)
1 user

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-06-04 10:44:26 UTC
Target Upstream Version:




Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2019:0758 None None None 2019-06-04 10:44:33 UTC
Github openshift cluster-ingress-operator pull 154 None None None 2019-03-04 21:24:13 UTC

Description Mike Fiedler 2019-02-22 16:18:36 UTC
Description of problem:

After a default install of 4.0, the router-default deployment has 2 replicas:

NAME             READY   UP-TO-DATE   AVAILABLE   AGE
router-default   2/2     2            2           59m

However, both pods are running on the same worker which seems to defeat the purpose of having 2 replicas (HA, performance):

NAME                              READY   STATUS    RESTARTS   AGE   IP           NODE                                           NOMINATED NODE                                                                                                                                          
router-default-6659fd47cc-9htpz   1/1     Running   0          55m   10.131.0.6   ip-172-31-140-190.us-east-2.compute.internal   <none>
router-default-6659fd47cc-ttwvw   1/1     Running   0          55m   10.131.0.4   ip-172-31-140-190.us-east-2.compute.internal   <none>

Anti-affinity should be used to keep the pods off of the same node.
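As a sketch of what that could look like, the deployment's pod template could carry a podAntiAffinity stanza keyed on the node hostname. The label selector below is an assumption for illustration, not the operator's actual labels:

```yaml
# Hypothetical excerpt from the router-default deployment's pod template.
# spec.template.spec.affinity — the app label shown is an assumed example.
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      podAffinityTerm:
        topologyKey: kubernetes.io/hostname   # spread replicas across nodes
        labelSelector:
          matchLabels:
            app: router-default               # assumed pod label
```

Using "preferred" rather than "required" anti-affinity still allows both replicas to schedule on a single node when the cluster has no other eligible worker, instead of leaving one pod Pending.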


Version-Release number of selected component (if applicable): 4.0.0-0.nightly-2019-02-22-074434


How reproducible: Often


Steps to Reproduce:
1.  Default AWS install  of 4.0 with next gen installer
2.  oc get pods -n openshift-ingress -o wide


Actual results:

Router pods are often on the same worker

Expected results:

Router pods on different workers for HA and performance reasons

Comment 3 Hongan Li 2019-03-13 05:45:07 UTC
Verified with 4.0.0-0.ci-2019-03-12-223432; the issue has been fixed.


$ oc get pod -n openshift-ingress -o wide
NAME                              READY   STATUS    RESTARTS   AGE    IP           NODE                                           NOMINATED NODE
router-default-7844db4447-9rf2l   1/1     Running   0          138m   10.128.2.5   ip-172-31-171-121.us-east-2.compute.internal   <none>
router-default-7844db4447-vnml8   1/1     Running   0          138m   10.129.2.6   ip-172-31-155-59.us-east-2.compute.internal    <none>

$ oc get clusterversions.config.openshift.io 
NAME      VERSION                        AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.0.0-0.ci-2019-03-12-223432   True        False         126m    Cluster version is 4.0.0-0.ci-2019-03-12-223432

Comment 5 errata-xmlrpc 2019-06-04 10:44:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758

