1975379 – Console pods are scheduled on single master node

Summary: Console pods are scheduled on single master node

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Management Console
Sub Component:
Version:	4.7
Hardware:	Unspecified
OS:	Unspecified
Priority:	medium
Severity:	medium
Target Milestone:	---
Target Release:	4.9.0
Assignee:	Jakub Hadvig
QA Contact:	Yanping Zhang
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	2003639

Reported:	2021-06-23 14:34 UTC by Apurva Nisal
Modified:	2024-12-20 20:19 UTC
CC List:	9 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:	Cause: Both console deployments used a soft requirement for their anti-affinity rules. Consequence: Console pods could be scheduled on a single master node. Fix: Use a hard requirement for the anti-affinity rules on both console deployments, with the hostname as the topology key when scheduling the pods. Result: Console pods are scheduled on different master nodes.
Clone Of:
Environment:
Last Closed:	2021-10-18 17:36:28 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Priority	Status	Summary	Last Updated
Github	openshift console-operator pull 560	None	open	Bug 1975379: Use hard requirement for anti-affinity rules on both console's deployments	2021-06-24 16:47:29 UTC
Github	openshift console-operator pull 566	None	open	Bug 1975379: Have timezone as soft requirement for pod antiaffinity	2021-07-15 16:15:45 UTC
Red Hat Product Errata	RHSA-2021:3759	None	None	None	2021-10-18 17:36:42 UTC

Description Apurva Nisal 2021-06-23 14:34:25 UTC

Description of problem:
Console pods are scheduled on a single master node.

oc get pods -owide
NAME                        READY   STATUS    RESTARTS   AGE    IP            NODE                                                   NOMINATED NODE   READINESS GATES
console-6558bcb9f9-7cnjk    1/1     Running   1          3d1h   10.129.0.38   master-1.abc.com   <none>           <none>
console-6558bcb9f9-fwpzf    1/1     Running   0          3d1h   10.129.0.46   master-1.abc.com   <none>           <none>
downloads-84f554976-9nwr2   1/1     Running   0          3d1h   10.131.0.11   worker-2.abc.com   <none>           <none>
downloads-84f554976-wl655   1/1     Running   0          3d1h   10.129.2.7    worker-0.abc.com   <none>           <none>


oc get nodes
NAME               STATUS   ROLES    AGE    VERSION
master-0.abc.com   Ready    master   3d1h   v1.20.0+df9c838
master-1.abc.com   Ready    master   3d1h   v1.20.0+df9c838
master-2.abc.com   Ready    master   3d1h   v1.20.0+df9c838
worker-0.abc.com   Ready    worker   3d1h   v1.20.0+df9c838
worker-1.abc.com   Ready    worker   3d1h   v1.20.0+df9c838
worker-2.abc.com   Ready    worker   3d1h   v1.20.0+df9c838


Actual results:
Console pods are scheduled on a single master node.

Expected results:
Console pods should be scheduled on different master nodes.

Comment 2 Samuel Padgett 2021-06-23 14:59:11 UTC

We have anti-affinity rules set, but we're using `preferredDuringSchedulingIgnoredDuringExecution`, which is the soft requirement, rather than `requiredDuringSchedulingIgnoredDuringExecution`, which is the hard requirement.
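
To make the difference concrete, here is a minimal Python sketch of the two `podAntiAffinity` shapes as plain dictionaries. It is not the console-operator's code: the helper name `anti_affinity` and the `app: console` label are assumptions for illustration, and the `kubernetes.io/hostname` topology key is inferred from the bug's Doc Text, which says the hostname is used as the topology key.

```python
# Well-known node label identifying a single host (assumed to be the
# "hostname" topology key that the Doc Text refers to).
HOSTNAME_TOPOLOGY_KEY = "kubernetes.io/hostname"


def anti_affinity(labels, hard):
    """Return a podAntiAffinity stanza as a plain dict.

    `labels` is an illustrative selector; the real console pod labels
    are not stated in this report.
    """
    term = {
        "labelSelector": {"matchLabels": dict(labels)},
        "topologyKey": HOSTNAME_TOPOLOGY_KEY,
    }
    if hard:
        # Hard rule: the scheduler will not place a pod on a node that
        # already runs a pod matching the selector.
        return {"requiredDuringSchedulingIgnoredDuringExecution": [term]}
    # Soft rule: a weighted preference that the scheduler may override.
    return {
        "preferredDuringSchedulingIgnoredDuringExecution": [
            {"weight": 100, "podAffinityTerm": term}
        ]
    }


if __name__ == "__main__":
    import json

    labels = {"app": "console"}  # hypothetical label
    print(json.dumps({"podAntiAffinity": anti_affinity(labels, hard=True)}, indent=2))
```

Note that the hard form wraps each term directly in the list, while the soft form nests it under `podAffinityTerm` with a `weight`. With a hard rule, a pod that cannot find a free topology domain stays Pending instead of doubling up on a node.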

Comment 3 Jakub Hadvig 2021-07-02 16:57:19 UTC

Still valid. PR up and in merge process.

Comment 5 Ronald 2021-07-09 12:21:53 UTC

Please backport the PR to release 4.7 as well.

Thx,
Ronald

Comment 7 Samuel Padgett 2021-07-15 13:06:45 UTC

Reopening as this breaks OpenStack deployments.

Comment 9 Yanping Zhang 2021-07-20 09:35:21 UTC

Checked on an OCP 4.9 cluster with payload 4.9.0-0.nightly-2021-07-19-140945.
In the console and downloads deployment YAML, the anti-affinity rule is "requiredDuringSchedulingIgnoredDuringExecution", and the console pods are scheduled on different master nodes.
# oc get node |grep master
ip-10-0-158-130.us-east-2.compute.internal   Ready    master   9h    v1.21.1+8268f88
ip-10-0-160-245.us-east-2.compute.internal   Ready    master   9h    v1.21.1+8268f88
ip-10-0-196-96.us-east-2.compute.internal    Ready    master   9h    v1.21.1+8268f88
# oc get pod -n openshift-console -o wide
NAME                         READY   STATUS    RESTARTS   AGE   IP            NODE                                         NOMINATED NODE   READINESS GATES
console-66946dc647-2vp26     1/1     Running   0          9h    10.128.0.36   ip-10-0-158-130.us-east-2.compute.internal   <none>           <none>
console-66946dc647-wpg64     1/1     Running   0          9h    10.130.0.35   ip-10-0-160-245.us-east-2.compute.internal   <none>           <none>
downloads-7d9df5cb76-5fsmr   1/1     Running   0          9h    10.130.0.28   ip-10-0-160-245.us-east-2.compute.internal   <none>           <none>
downloads-7d9df5cb76-x44nc   1/1     Running   0          9h    10.128.0.23   ip-10-0-158-130.us-east-2.compute.internal   <none>           <none>
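
The manual check above could be scripted along these lines. This is a rough sketch rather than part of the QA procedure: the function names are hypothetical, it assumes the default `-o wide` column order shown in the listing (NODE as the seventh whitespace-separated column), and it assumes the workload name is the pod name minus its last two dash-separated suffixes. Output where the RESTARTS column contains spaces would need different parsing.

```python
from collections import defaultdict


def nodes_by_workload(wide_output):
    """Map a workload prefix (e.g. "console") to the nodes its pods run on.

    Expects the text printed by `oc get pod -o wide`, header included.
    """
    placement = defaultdict(list)
    for line in wide_output.strip().splitlines():
        fields = line.split()
        if not fields or fields[0] == "NAME":
            continue  # skip blank lines and the header row
        # "console-66946dc647-2vp26" -> "console"
        workload = fields[0].rsplit("-", 2)[0]
        placement[workload].append(fields[6])  # NODE column
    return dict(placement)


def is_spread(nodes):
    """True when no two pods of the workload share a node."""
    return len(set(nodes)) == len(nodes)
```

Applied to the listing in the original description, `is_spread` would return False for the `console` entry, because both console pods run on master-1.abc.com.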

Comment 10 Ronald 2021-07-28 11:08:35 UTC

Hi Guys,

Can it be backported to release 4.7 as well?

Thx,
Ronald

Comment 14 errata-xmlrpc 2021-10-18 17:36:28 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3759

Comment 15 Red Hat Bugzilla 2023-09-15 01:10:28 UTC

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days
