Bug 1814605 - [OSP] Minimal compute node requirements are too small to stand up a cluster
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: ---
Target Release: 4.5.0
Assignee: Pierre Prinetti
QA Contact: David Sanz
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-03-18 11:08 UTC by Martin André
Modified: 2020-07-13 17:22 UTC (History)
3 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Documentation change. Clearly state that 3 workers are needed for the cluster to be functional.
Clone Of:
Environment:
Last Closed: 2020-07-13 17:22:24 UTC
Target Upstream Version:




Links
System ID Private Priority Status Summary Last Updated
Github openshift installer pull 3542 0 None closed Bug 1814605: openstack: Require three workers 2020-06-05 13:45:39 UTC
Red Hat Product Errata RHBA-2020:2409 0 None None None 2020-07-13 17:22:45 UTC

Description Martin André 2020-03-18 11:08:36 UTC
We currently document a minimum of 2 vCPUs, 8 GB RAM, and 25 GB disk for the compute nodes. The CPU allocation, at least, is insufficient to stand up a cluster.
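For reference, a flavor can be compared against the documented minimums; a minimal sketch (the 2 vCPU / 8192 MB / 25 GB figures are the documented minimums from this report, the flavor values mirror the `openstack flavor show v1-standard-2` output below, and the helper function name is illustrative). Note that the flavor passes this check, which is exactly the problem: meeting the documented minimums is not enough to stand up the cluster.

```python
# Check an OpenStack flavor against the documented compute-node minimums
# (2 vCPUs, 8192 MB RAM, 25 GB disk). Flavor values mirror the
# `openstack flavor show v1-standard-2` output in this report.
MIN_VCPUS, MIN_RAM_MB, MIN_DISK_GB = 2, 8192, 25

def meets_minimums(vcpus: int, ram_mb: int, disk_gb: int) -> bool:
    """Return True if the flavor satisfies the documented minimums."""
    return vcpus >= MIN_VCPUS and ram_mb >= MIN_RAM_MB and disk_gb >= MIN_DISK_GB

# v1-standard-2: passes the documented check, yet the install fails below.
print(meets_minimums(vcpus=2, ram_mb=8192, disk_gb=100))  # -> True
```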

ERROR Cluster operator monitoring Degraded is True with UpdatingPrometheusK8SFailed: Failed to rollout the stack. Error: running task Updating Prometheus-k8s failed: waiting for Prometheus object changes failed: waiting for Prometheus: expected 2 replicas, updated 1 and available 1
FATAL failed to initialize the cluster: Cluster operator monitoring is still updating

❯ oc get pods -A | grep -v Running | grep -v Completed           
NAMESPACE                                               NAME                                                              READY   STATUS      RESTARTS   AGE
openshift-monitoring                                    prometheus-k8s-1                                                  0/7     Pending     0          12m

❯ oc describe pod -n openshift-monitoring prometheus-k8s-1 | tail
QoS Class:       Burstable
Node-Selectors:  kubernetes.io/os=linux
Tolerations:     node.kubernetes.io/memory-pressure:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason            Age        From               Message
  ----     ------            ----       ----               -------
  Warning  FailedScheduling  <unknown>  default-scheduler  0/6 nodes are available: 3 Insufficient cpu, 3 node(s) had taints that the pod didn't tolerate.
  Warning  FailedScheduling  <unknown>  default-scheduler  0/6 nodes are available: 3 Insufficient cpu, 3 node(s) had taints that the pod didn't tolerate.

❯ oc describe node mandre-vex-pdldk-worker-d5nws | tail -n 7
  Resource                   Requests      Limits
  --------                   --------      ------
  cpu                        1432m (95%)   100m (6%)
  memory                     3331Mi (48%)  537Mi (7%)
  ephemeral-storage          0 (0%)        0 (0%)
  attachable-volumes-cinder  0             0
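The node output above can be checked with a quick calculation: 1432m of requested CPU at 95% implies roughly 1507m allocatable on the 2-vCPU (2000m) worker, the rest being reserved by the kubelet and system components. That leaves about 75m of headroom, which is too little for the pending prometheus-k8s pod. A rough sketch (the allocatable figure is inferred from the 95% shown above; the 200m pod request is an assumption for illustration, not taken from this report):

```python
# Rough scheduling-capacity check for the worker node described above.
# Assumption (not from the report): the prometheus-k8s pod's total CPU
# request is taken as 200m purely for illustration.
allocatable_m = round(1432 / 0.95)   # ~1507m allocatable on a 2-vCPU node
requested_m = 1432                   # current CPU requests (95% of allocatable)
headroom_m = allocatable_m - requested_m

prometheus_request_m = 200           # hypothetical pod CPU request
print(f"headroom: {headroom_m}m, pod needs: {prometheus_request_m}m")
print("schedulable:", headroom_m >= prometheus_request_m)
```

This matches the "Insufficient cpu" FailedScheduling events: the pod stays Pending because no worker has enough unrequested CPU left.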

❯ openstack flavor show v1-standard-2
+----------------------------+--------------------------------------+
| Field                      | Value                                |
+----------------------------+--------------------------------------+
| OS-FLV-DISABLED:disabled   | False                                |
| OS-FLV-EXT-DATA:ephemeral  | 0                                    |
| access_project_ids         | None                                 |
| disk                       | 100                                  |
| id                         | d68a5cb5-f133-460d-bc5b-81fd140325c9 |
| name                       | v1-standard-2                        |
| os-flavor-access:is_public | True                                 |
| properties                 |                                      |
| ram                        | 8192                                 |
| rxtx_factor                | 1.0                                  |
| swap                       |                                      |
| vcpus                      | 2                                    |
+----------------------------+--------------------------------------+

Comment 1 Pierre Prinetti 2020-05-04 17:56:37 UTC
As of 4.5, I was able to deploy a cluster with 3 master nodes and 3 worker nodes, with the minimum required resources described in the docs.

However, a cluster with only 2 workers was not healthy and had multiple failed pods.

The proposed change removes the claim that a cluster can be healthy with only 2 workers.
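In install-config.yaml terms, the documented requirement amounts to a compute pool with at least 3 replicas; a hedged sketch of the relevant fragment (the field layout is the standard OpenShift install-config compute stanza; the flavor name is the `v1-standard-2` flavor from this report):

```yaml
# Fragment of install-config.yaml: at least 3 compute replicas are
# needed for cluster operators such as monitoring to become healthy.
compute:
- name: worker
  replicas: 3               # 2 workers leave the cluster unhealthy
  platform:
    openstack:
      type: v1-standard-2   # flavor from this report; must meet the minimums
```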

Comment 4 David Sanz 2020-05-11 12:32:28 UTC
Verified

Comment 6 errata-xmlrpc 2020-07-13 17:22:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409

