Bug 1814605 - [OSP] Minimal compute node requirements are too small to stand up a cluster
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: ---
Target Release: 4.5.0
Assignee: Pierre Prinetti
QA Contact: David Sanz
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-03-18 11:08 UTC by Martin André
Modified: 2020-07-13 17:22 UTC (History)
3 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Documentation change. Clearly state that 3 workers are needed for the cluster to be functional.
Clone Of:
Environment:
Last Closed: 2020-07-13 17:22:24 UTC
Target Upstream Version:




Links
System ID Private Priority Status Summary Last Updated
Github openshift installer pull 3542 0 None closed Bug 1814605: openstack: Require three workers 2020-06-05 13:45:39 UTC
Red Hat Product Errata RHBA-2020:2409 0 None None None 2020-07-13 17:22:45 UTC

Description Martin André 2020-03-18 11:08:36 UTC
We currently document a minimum of 2 vCPUs, 8 GB RAM, and 25 GB disk for the compute nodes. The CPU allocation, at least, is insufficient to stand up a cluster.
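For reference, a flavor can be compared against the documented minimums; a minimal sketch (the 2 vCPU / 8192 MB / 25 GB figures are the documented minimums from this report, the flavor values mirror the `openstack flavor show v1-standard-2` output below, and the helper function name is illustrative). Note that the flavor passes this check, which is exactly the problem: meeting the documented minimums is not enough to stand up the cluster.

```python
# Check an OpenStack flavor against the documented compute-node minimums
# (2 vCPUs, 8192 MB RAM, 25 GB disk). Flavor values mirror the
# `openstack flavor show v1-standard-2` output in this report.
MIN_VCPUS, MIN_RAM_MB, MIN_DISK_GB = 2, 8192, 25

def meets_minimums(vcpus: int, ram_mb: int, disk_gb: int) -> bool:
    """Return True if the flavor satisfies the documented minimums."""
    return vcpus >= MIN_VCPUS and ram_mb >= MIN_RAM_MB and disk_gb >= MIN_DISK_GB

# v1-standard-2: passes the documented check, yet the install fails below.
print(meets_minimums(vcpus=2, ram_mb=8192, disk_gb=100))  # -> True
```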

ERROR Cluster operator monitoring Degraded is True with UpdatingPrometheusK8SFailed: Failed to rollout the stack. Error: running task Updating Prometheus-k8s failed: waiting for Prometheus object changes failed: waiting for Prometheus: expected 2 replicas, updated 1 and available 1
FATAL failed to initialize the cluster: Cluster operator monitoring is still updating

❯ oc get pods -A | grep -v Running | grep -v Completed           
NAMESPACE                                               NAME                                                              READY   STATUS      RESTARTS   AGE
openshift-monitoring                                    prometheus-k8s-1                                                  0/7     Pending     0          12m

❯ oc describe pod -n openshift-monitoring prometheus-k8s-1 | tail
QoS Class:       Burstable
Node-Selectors:  kubernetes.io/os=linux
Tolerations:     node.kubernetes.io/memory-pressure:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason            Age        From               Message
  ----     ------            ----       ----               -------
  Warning  FailedScheduling  <unknown>  default-scheduler  0/6 nodes are available: 3 Insufficient cpu, 3 node(s) had taints that the pod didn't tolerate.
  Warning  FailedScheduling  <unknown>  default-scheduler  0/6 nodes are available: 3 Insufficient cpu, 3 node(s) had taints that the pod didn't tolerate.

❯ oc describe node mandre-vex-pdldk-worker-d5nws | tail -n 7
  Resource                   Requests      Limits
  --------                   --------      ------
  cpu                        1432m (95%)   100m (6%)
  memory                     3331Mi (48%)  537Mi (7%)
  ephemeral-storage          0 (0%)        0 (0%)
  attachable-volumes-cinder  0             0
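The node output above can be checked with a quick calculation: 1432m of requested CPU at 95% implies roughly 1507m allocatable on the 2-vCPU (2000m) worker, the rest being reserved by the kubelet and system components. That leaves about 75m of headroom, which is too little for the pending prometheus-k8s pod. A rough sketch (the allocatable figure is inferred from the 95% shown above; the 200m pod request is an assumption for illustration, not taken from this report):

```python
# Rough scheduling-capacity check for the worker node described above.
# Assumption (not from the report): the prometheus-k8s pod's total CPU
# request is taken as 200m purely for illustration.
allocatable_m = round(1432 / 0.95)   # ~1507m allocatable on a 2-vCPU node
requested_m = 1432                   # current CPU requests (95% of allocatable)
headroom_m = allocatable_m - requested_m

prometheus_request_m = 200           # hypothetical pod CPU request
print(f"headroom: {headroom_m}m, pod needs: {prometheus_request_m}m")
print("schedulable:", headroom_m >= prometheus_request_m)
```

This matches the "Insufficient cpu" FailedScheduling events: the pod stays Pending because no worker has enough unrequested CPU left.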

❯ openstack flavor show v1-standard-2
+----------------------------+--------------------------------------+
| Field                      | Value                                |
+----------------------------+--------------------------------------+
| OS-FLV-DISABLED:disabled   | False                                |
| OS-FLV-EXT-DATA:ephemeral  | 0                                    |
| access_project_ids         | None                                 |
| disk                       | 100                                  |
| id                         | d68a5cb5-f133-460d-bc5b-81fd140325c9 |
| name                       | v1-standard-2                        |
| os-flavor-access:is_public | True                                 |
| properties                 |                                      |
| ram                        | 8192                                 |
| rxtx_factor                | 1.0                                  |
| swap                       |                                      |
| vcpus                      | 2                                    |
+----------------------------+--------------------------------------+

Comment 1 Pierre Prinetti 2020-05-04 17:56:37 UTC
As of 4.5, I was able to deploy a cluster with 3 master nodes and 3 worker nodes, with the minimum required resources described in the docs.

However, a cluster with only 2 workers was not healthy and had multiple failed pods.

The proposed change removes the claim that a cluster can be healthy with only 2 workers.
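In install-config.yaml terms, the documented requirement amounts to a compute pool with at least 3 replicas; a hedged sketch of the relevant fragment (the field layout is the standard OpenShift install-config compute stanza; the flavor name is the `v1-standard-2` flavor from this report):

```yaml
# Fragment of install-config.yaml: at least 3 compute replicas are
# needed for cluster operators such as monitoring to become healthy.
compute:
- name: worker
  replicas: 3               # 2 workers leave the cluster unhealthy
  platform:
    openstack:
      type: v1-standard-2   # flavor from this report; must meet the minimums
```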

Comment 4 David Sanz 2020-05-11 12:32:28 UTC
Verified

Comment 6 errata-xmlrpc 2020-07-13 17:22:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409

