
Bug 1814605

Summary: [OSP] Minimal compute node requirements are too small to stand up a cluster
Product: OpenShift Container Platform
Component: Installer
Installer sub component: OpenShift on OpenStack
Reporter: Martin André <m.andre>
Assignee: Pierre Prinetti <pprinett>
QA Contact: David Sanz <dsanzmor>
CC: mbridges, mschuppe, pprinett
Status: CLOSED ERRATA
Severity: medium
Priority: high
Version: 4.4
Target Release: 4.5.0
Hardware: Unspecified
OS: Unspecified
Type: Bug
Doc Type: Bug Fix
Doc Text: Documentation change. Clearly state that 3 workers are needed for the cluster to be functional.
Last Closed: 2020-07-13 17:22:24 UTC

Description Martin André 2020-03-18 11:08:36 UTC
We currently document a minimum of 2 vCPUs, 8 GB RAM, and 25 GB disk for the compute nodes. The CPU allocation, at least, is not enough to stand up a cluster: the installer fails and one Prometheus replica stays unschedulable.

ERROR Cluster operator monitoring Degraded is True with UpdatingPrometheusK8SFailed: Failed to rollout the stack. Error: running task Updating Prometheus-k8s failed: waiting for Prometheus object changes failed: waiting for Prometheus: expected 2 replicas, updated 1 and available 1
FATAL failed to initialize the cluster: Cluster operator monitoring is still updating

❯ oc get pods -A | grep -v Running | grep -v Completed           
NAMESPACE                                               NAME                                                              READY   STATUS      RESTARTS   AGE
openshift-monitoring                                    prometheus-k8s-1                                                  0/7     Pending     0          12m

❯ oc describe pod -n openshift-monitoring prometheus-k8s-1 | tail
QoS Class:       Burstable
Node-Selectors:  kubernetes.io/os=linux
Tolerations:     node.kubernetes.io/memory-pressure:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason            Age        From               Message
  ----     ------            ----       ----               -------
  Warning  FailedScheduling  <unknown>  default-scheduler  0/6 nodes are available: 3 Insufficient cpu, 3 node(s) had taints that the pod didn't tolerate.
  Warning  FailedScheduling  <unknown>  default-scheduler  0/6 nodes are available: 3 Insufficient cpu, 3 node(s) had taints that the pod didn't tolerate.
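
For readers who want to tally the scheduler's reasons programmatically, here is a minimal Python sketch that splits the FailedScheduling message above into per-reason node counts. The helper name `parse_scheduler_message` is hypothetical, and the parsing assumes the reasons are separated by ", " exactly as in this message; it is not a general parser for scheduler events.

```python
import re
from typing import Dict, Tuple


def parse_scheduler_message(message: str) -> Tuple[int, int, Dict[str, int]]:
    """Split a FailedScheduling message into (available, total, reasons).

    Assumes the layout seen in this report:
    "<available>/<total> nodes are available: <n> <reason>, <n> <reason>."
    """
    head, _, tail = message.partition(":")
    match = re.match(r"\s*(\d+)/(\d+) nodes are available", head)
    if match is None:
        raise ValueError("unexpected scheduler message: %r" % message)
    available, total = int(match.group(1)), int(match.group(2))

    reasons: Dict[str, int] = {}
    for part in tail.strip().rstrip(".").split(", "):
        # Each part starts with a node count followed by the reason text.
        count, _, reason = part.partition(" ")
        reasons[reason] = int(count)
    return available, total, reasons


# Message copied verbatim from the pod events above.
MESSAGE = (
    "0/6 nodes are available: 3 Insufficient cpu, "
    "3 node(s) had taints that the pod didn't tolerate."
)

available, total, reasons = parse_scheduler_message(MESSAGE)
for reason, count in reasons.items():
    print("%d of %d nodes: %s" % (count, total, reason))
```

The two counts add up to all 6 nodes: 3 nodes reject the pod for lack of CPU and 3 nodes are tainted, so no node is left for prometheus-k8s-1.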

❯ oc describe node mandre-vex-pdldk-worker-d5nws | tail -n 7
  Resource                   Requests      Limits
  --------                   --------      ------
  cpu                        1432m (95%)   100m (6%)
  memory                     3331Mi (48%)  537Mi (7%)
  ephemeral-storage          0 (0%)        0 (0%)
  attachable-volumes-cinder  0             0
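
The cpu row above can be turned into a rough estimate of how much CPU is still schedulable on this worker. The sketch below is an illustration, not part of the original report: the function names are hypothetical, and it assumes the percentage printed by `oc describe node` is the requested amount divided by the node's allocatable CPU, truncated to an integer.

```python
import re
from typing import Tuple


def parse_cpu_row(row: str) -> Tuple[int, int]:
    """Return (requested_millicores, requested_percent) from the cpu row."""
    match = re.search(r"cpu\s+(\d+)m\s+\((\d+)%\)", row)
    if match is None:
        raise ValueError("unexpected cpu row: %r" % row)
    return int(match.group(1)), int(match.group(2))


def allocatable_bounds(requested_m: int, percent: int) -> Tuple[float, float]:
    """Bound the allocatable CPU (millicores) implied by a truncated percentage.

    Assumption: percent <= 100 * requested / allocatable < percent + 1.
    """
    low = requested_m * 100 / (percent + 1)   # exclusive lower bound
    high = requested_m * 100 / percent        # inclusive upper bound
    return low, high


# Row copied from the `oc describe node` output above.
ROW = "  cpu                        1432m (95%)   100m (6%)"

requested_m, percent = parse_cpu_row(ROW)
low, high = allocatable_bounds(requested_m, percent)

print("allocatable CPU between %.0fm and %.0fm" % (low, high))
print("free CPU between %.0fm and %.0fm" % (low - requested_m, high - requested_m))
```

Under that assumption the worker has roughly 60 to 75 millicores left to hand out, so any pod whose CPU requests exceed that cannot be placed on it, which is consistent with the "Insufficient cpu" events.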

❯ openstack flavor show v1-standard-2
+----------------------------+--------------------------------------+
| Field                      | Value                                |
+----------------------------+--------------------------------------+
| OS-FLV-DISABLED:disabled   | False                                |
| OS-FLV-EXT-DATA:ephemeral  | 0                                    |
| access_project_ids         | None                                 |
| disk                       | 100                                  |
| id                         | d68a5cb5-f133-460d-bc5b-81fd140325c9 |
| name                       | v1-standard-2                        |
| os-flavor-access:is_public | True                                 |
| properties                 |                                      |
| ram                        | 8192                                 |
| rxtx_factor                | 1.0                                  |
| swap                       |                                      |
| vcpus                      | 2                                    |
+----------------------------+--------------------------------------+
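
As a quick cross-check, the flavor can be compared against the documented minimum quoted at the top of this report. This is a small sketch under stated assumptions: the dictionaries and the `shortfalls` and `margins` helpers are hypothetical, and 8 GB RAM is taken to mean 8192 MB as reported by OpenStack.

```python
from typing import Dict, Tuple

# Documented compute-node minimum quoted in the description
# (ram in MB, assuming 8 GB = 8192 MB; disk in GB).
DOCUMENTED_MINIMUM: Dict[str, int] = {"vcpus": 2, "ram": 8 * 1024, "disk": 25}

# Values copied from `openstack flavor show v1-standard-2` above.
FLAVOR: Dict[str, int] = {"vcpus": 2, "ram": 8192, "disk": 100}


def shortfalls(flavor: Dict[str, int], minimum: Dict[str, int]) -> Dict[str, Tuple[int, int]]:
    """Return {field: (actual, required)} for every field below the minimum."""
    return {
        field: (flavor[field], required)
        for field, required in minimum.items()
        if flavor[field] < required
    }


def margins(flavor: Dict[str, int], minimum: Dict[str, int]) -> Dict[str, int]:
    """Return how far each field sits above the documented minimum."""
    return {field: flavor[field] - required for field, required in minimum.items()}


print("shortfalls:", shortfalls(FLAVOR, DOCUMENTED_MINIMUM))
print("margins:", margins(FLAVOR, DOCUMENTED_MINIMUM))
```

The flavor meets the documented minimum with no spare vCPU or RAM, so the failure reproduces on nodes sized exactly as the documentation recommends.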

Comment 1 Pierre Prinetti 2020-05-04 17:56:37 UTC
As of 4.5, I was able to deploy a cluster with 3 master nodes and 3 worker nodes, with the minimum required resources described in the docs.

However, a cluster with only 2 workers was not healthy and had multiple failed pods.

The proposed documentation change removes the reference to healthy clusters with only 2 workers.

Comment 4 David Sanz 2020-05-11 12:32:28 UTC
Verified

Comment 6 errata-xmlrpc 2020-07-13 17:22:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409