Bug 1703943 - router pods are always running on same node in fresh install AWS env
Summary: router pods are always running on same node in fresh install AWS env
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.1.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: ---
Target Release: 4.1.0
Assignee: Dan Mace
QA Contact: Hongan Li
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-04-29 07:45 UTC by Hongan Li
Modified: 2022-08-04 22:24 UTC
CC: 3 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-06-04 10:48:10 UTC
Target Upstream Version:
Embargoed:


Links:
Github openshift cluster-ingress-operator pull 222 (last updated 2019-04-30 20:27:42 UTC)
Red Hat Product Errata RHBA-2019:0758 (last updated 2019-06-04 10:48:17 UTC)

Description Hongan Li 2019-04-29 07:45:43 UTC
Description of problem:
Checked several fresh-install environments on AWS (both IPI and UPI) and found that the two router pods are always running on the same node, although there is no functional impact.
After scaling the deployment down and back up, the two router pods run on different nodes.

Version-Release number of selected component (if applicable):
4.1.0-0.nightly-2019-04-28-064010

How reproducible:
100%

Steps to Reproduce:
1. fresh install on AWS
2. check the router pods
   $ oc get pod -o wide -n openshift-ingress
3. scale down
   $ oc -n openshift-ingress-operator patch ingresscontroller/default -p '{"spec":{"replicas": 0}}' --type=merge
4. scale up
   $ oc -n openshift-ingress-operator patch ingresscontroller/default -p '{"spec":{"replicas": 2}}' --type=merge
5. check the router pods again
   $ oc get pod -o wide -n openshift-ingress
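
A quicker check for steps 2 and 5 is to print each router pod next to the node it was scheduled to (a convenience sketch assuming only standard oc jsonpath output formatting):

   # Two identical node names in the output reproduce the co-location shown below.
   $ oc get pods -n openshift-ingress -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.nodeName}{"\n"}{end}'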

Actual results:
step 2:
$ oc get pod -o wide -n openshift-ingress
NAME                              READY   STATUS    RESTARTS   AGE     IP           NODE                                                NOMINATED NODE   READINESS GATES
router-default-84c7f9d456-fgxqp   1/1     Running   0          4h35m   10.131.0.4   ip-172-31-134-171.ap-northeast-2.compute.internal   <none>           <none>
router-default-84c7f9d456-sl7k2   1/1     Running   0          4h35m   10.131.0.3   ip-172-31-134-171.ap-northeast-2.compute.internal   <none>           <none>

step 5:
$ oc get pod -o wide -n openshift-ingress
NAME                              READY   STATUS    RESTARTS   AGE   IP           NODE                                                NOMINATED NODE   READINESS GATES
router-default-84c7f9d456-2wvzv   1/1     Running   0          50s   10.128.2.9   ip-172-31-141-240.ap-northeast-2.compute.internal   <none>           <none>
router-default-84c7f9d456-d9ndb   1/1     Running   0          50s   10.129.2.9   ip-172-31-150-49.ap-northeast-2.compute.internal    <none>           <none>


Expected results:
The two router pods run on different nodes.

Additional info:

Comment 1 W. Trevor King 2019-04-29 21:53:10 UTC
Because we don't redistribute pods unless they are evicted, maybe this is just:

1. Ingress requests pods, but we have no compute nodes yet.
2. The Machine API operator creates the first compute node.
3. The scheduler rejoices and drops both router pods on that node.
4. The Machine API operator creates additional nodes, but since the router pods are already scheduled, it's too late for the scheduler to point an ingress pod at them.

This seems like something that should have a generic Kubernetes rebalancing solution. I don't know whether one exists, but if not, a short-term fix might be having your operator monitor for this condition and kill one of the pods when it notices it.
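
For illustration, spreading the replicas can be expressed with the standard Kubernetes podAntiAffinity API on the router Deployment. A minimal sketch as a one-off experiment, assuming the router pods carry an app=router label (check with oc get pods -n openshift-ingress --show-labels); the ingress operator may revert direct edits to the Deployment it manages, so a durable fix belongs in the operator itself (see the linked cluster-ingress-operator pull 222):

   # Soft preference: spread replicas across distinct hostnames.
   # The app=router selector is an assumption; substitute the labels
   # the router pods actually carry.
   $ oc -n openshift-ingress patch deployment/router-default --type=strategic -p '{
       "spec": {"template": {"spec": {"affinity": {"podAntiAffinity": {
         "preferredDuringSchedulingIgnoredDuringExecution": [{
           "weight": 100,
           "podAffinityTerm": {
             "topologyKey": "kubernetes.io/hostname",
             "labelSelector": {"matchLabels": {"app": "router"}}
           }
         }]
       }}}}}
     }'

With the preferred (rather than required) form, scheduling still succeeds when only one compute node exists, which matters given the install-time race described above.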

Comment 3 Hongan Li 2019-05-05 06:03:55 UTC
Verified with 4.1.0-0.nightly-2019-05-04-210601 on AWS; the issue has been fixed.

$ oc get pod -o wide -n openshift-ingress
NAME                              READY   STATUS    RESTARTS   AGE     IP           NODE                                               NOMINATED NODE   READINESS GATES
router-default-75956b9c8d-6bmf2   1/1     Running   0          4h56m   10.128.2.3   ip-172-31-159-55.ap-southeast-1.compute.internal   <none>           <none>
router-default-75956b9c8d-sg4z7   1/1     Running   0          4h56m   10.131.0.4   ip-172-31-169-47.ap-southeast-1.compute.internal   <none>           <none>

Comment 5 errata-xmlrpc 2019-06-04 10:48:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758

