Bug 1802544 - The default number of workers cannot get all monitoring pods to Running with the default IPI installation settings
Summary: The default number of workers cannot get all monitoring pods to Running with the default IPI installation settings
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Severity: high
Priority: high
Target Milestone: ---
Target Release: 4.4.0
Assignee: Ryan Phillips
QA Contact: Sunil Choudhary
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-02-13 11:47 UTC by Junqi Zhao
Modified: 2020-05-04 11:36 UTC
CC: 5 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-05-04 11:36:23 UTC
Target Upstream Version:
Embargoed:




Links:
Red Hat Product Errata RHBA-2020:0581 (last updated 2020-05-04 11:36:42 UTC)

Description Junqi Zhao 2020-02-13 11:47:45 UTC
Description of problem:
use "openshift-install create install-config" to create a 4.4 AWS cluster, the default masters' number is 3(instance type: m4.xlarge(4 CPU/16Gi Memory)),
default workers' number is 3(instance type: m4.large(2 CPU/8Gi Memory)

install-config see below
*******************************************
apiVersion: v1
baseDomain: qe.devcluster.openshift.com
compute:
- architecture: amd64
  hyperthreading: Enabled
  name: worker
  platform: {}
  replicas: 3
controlPlane:
  architecture: amd64
  hyperthreading: Enabled
  name: master
  platform: {}
  replicas: 3
metadata:
  creationTimestamp: null
  name: xx
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  machineNetwork:
  - cidr: 10.0.0.0/16
  networkType: OpenShiftSDN
  serviceNetwork:
  - 172.30.0.0/16
platform:
  aws:
    region: us-east-2
publish: External
*******************************************
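
For illustration only (this is not the fix tracked in this bug, just one way to sidestep the CPU shortfall): the worker instance type can be requested explicitly in install-config, e.g. m4.xlarge workers instead of the default m4.large. The snippet below assumes the standard compute.platform.aws.type field:

compute:
- architecture: amd64
  hyperthreading: Enabled
  name: worker
  platform:
    aws:
      type: m4.xlarge
  replicas: 3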

Because we have increased the systemReserved CPU/memory, an m4.large worker has only 2 CPU (2000m) - 768m = 1232m allocatable CPU,
see https://github.com/openshift/machine-config-operator/commit/b811616049d7990c70fcfd56ff1d5b746b1a1121
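
A quick way to see the effect of that reservation is to compare capacity with allocatable CPU on the workers (a sketch using standard oc commands, not captured output from this cluster):

# for i in $(oc get node -l node-role.kubernetes.io/worker -o name); do oc get $i -o jsonpath='{.metadata.name}: {.status.capacity.cpu} capacity, {.status.allocatable.cpu} allocatable{"\n"}'; done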

Each prometheus-k8s pod requests 480m CPU in total. If a worker has less than 480m CPU left, the second prometheus-k8s pod cannot be scheduled,
and the installation reports the following error:
level=info msg="Cluster operator monitoring Progressing is True with RollOutInProgress: Rolling out the stack."
level=error msg="Cluster operator monitoring Degraded is True with UpdatingPrometheusK8SFailed: Failed to rollout the stack. Error: running task Updating Prometheus-k8s failed: waiting for Prometheus object changes failed: waiting for Prometheus: expected 2 replicas, updated 1 and available 1"

# oc -n openshift-monitoring get pod | grep prometheus-k8s
prometheus-k8s-0                               7/7     Running   1          165m   10.129.2.7    ip-10-0-60-1.us-east-2.compute.internal     <none>           <none>
prometheus-k8s-1                               0/7     Pending   0          159m   <none>        <none>                                      <none>           <none>

# oc -n openshift-monitoring describe pod prometheus-k8s-1
  Type     Reason            Age        From               Message
  ----     ------            ----       ----               -------
  Warning  FailedScheduling  <unknown>  default-scheduler  0/6 nodes are available: 3 Insufficient cpu, 3 node(s) had taints that the pod didn't tolerate.
  Warning  FailedScheduling  <unknown>  default-scheduler  0/6 nodes are available: 3 Insufficient cpu, 3 node(s) had taints that the pod didn't tolerate.

The allocatable CPU on each worker is 1232m, but after subtracting the CPU already requested by other pods there is not enough left for the prometheus-k8s-1 pod:
932m + 480m, 1192m + 480m and 1212m + 480m all exceed 1232m.
# for i in $(oc get node | grep worker | awk '{print $1}'); do echo $i; oc describe node $i | tail; done
ip-10-0-60-1.us-east-2.compute.internal
  openshift-sdn                           sdn-nc8zv                            100m (8%)     0 (0%)      200Mi (2%)       0 (0%)         3h16m
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource                    Requests      Limits
  --------                    --------      ------
  cpu                         1192m (96%)   300m (24%)
  memory                      2679Mi (39%)  587Mi (8%)
  ephemeral-storage           0 (0%)        0 (0%)
  attachable-volumes-aws-ebs  0             0
Events:                       <none>

ip-10-0-64-143.us-east-2.compute.internal
  openshift-sdn                               sdn-pxk6r                                            100m (8%)     0 (0%)      200Mi (2%)       0 (0%)         3h16m
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource                    Requests      Limits
  --------                    --------      ------
  cpu                         1212m (98%)   100m (8%)
  memory                      2837Mi (41%)  537Mi (7%)
  ephemeral-storage           0 (0%)        0 (0%)
  attachable-volumes-aws-ebs  0             0
Events:                       <none>

ip-10-0-75-97.us-east-2.compute.internal
  openshift-sdn                           sdn-l4bw8                                  100m (8%)     0 (0%)      200Mi (2%)       0 (0%)         3h16m
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource                    Requests      Limits
  --------                    --------      ------
  cpu                         932m (75%)    100m (8%)
  memory                      1951Mi (28%)  537Mi (7%)
  ephemeral-storage           0 (0%)        0 (0%)
  attachable-volumes-aws-ebs  0             0
Events:                       <none>

# for i in $(oc get node | grep worker | awk '{print $1}'); do echo $i;oc get node $i -o jsonpath="{.status.allocatable.cpu}"; echo -e "\n";done
ip-10-0-60-1.us-east-2.compute.internal
1232m
ip-10-0-64-143.us-east-2.compute.internal
1232m
ip-10-0-75-97.us-east-2.compute.internal
1232m
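
Sanity-checking the headroom with the numbers above (plain arithmetic, not cluster output): 1232m allocatable minus the per-node requests leaves at most 300m, 40m and 20m respectively, all below the 480m one prometheus-k8s pod needs.
# echo "$((1232 - 932))m $((1232 - 1192))m $((1232 - 1212))m"
300m 40m 20m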

# kubectl -n openshift-monitoring get pod prometheus-k8s-1 -o go-template='{{range.spec.containers}}{{"Container Name: "}}{{.name}}{{"\r\nresources: "}}{{.resources}}{{"\n"}}{{end}}'
Container Name: prometheus
resources: map[requests:map[cpu:200m memory:1Gi]]
Container Name: prometheus-config-reloader
resources: map[limits:map[cpu:100m memory:25Mi] requests:map[cpu:100m memory:25Mi]]
Container Name: rules-configmap-reloader
resources: map[limits:map[cpu:100m memory:25Mi] requests:map[cpu:100m memory:25Mi]]
Container Name: thanos-sidecar
resources: map[requests:map[cpu:50m memory:100Mi]]
Container Name: prometheus-proxy
resources: map[requests:map[cpu:10m memory:20Mi]]
Container Name: kube-rbac-proxy
resources: map[requests:map[cpu:10m memory:20Mi]]
Container Name: prom-label-proxy
resources: map[requests:map[cpu:10m memory:20Mi]]
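
For reference, summing the per-container CPU requests listed above gives the 480m figure mentioned in the description (plain arithmetic, not cluster output):
# echo $((200 + 100 + 100 + 50 + 10 + 10 + 10))m
480m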


Version-Release number of the following components:
4.4.0-0.nightly-2020-02-12-211301

How reproducible:
Always

Steps to Reproduce:
1. "openshift-install create install-config" with the default setting
2.
3.

Actual results:
Cluster monitoring is degraded

Expected results:
Installation should succeed with all monitoring pods Running and the monitoring cluster operator not degraded

Additional info:

Comment 4 Scott Dodson 2020-02-14 21:21:47 UTC
Seems reasonable to mark this as a dupe of 1803239 then?

*** This bug has been marked as a duplicate of bug 1803239 ***

Comment 5 Ryan Phillips 2020-02-14 21:23:37 UTC
Yep. Thanks!

Comment 6 Johnny Liu 2020-02-18 11:28:31 UTC
QE caught this issue first from a black-box testing perspective. The PR has now been reverted, so I consider this a valid bug and am moving it to ON_QA.

Comment 7 Johnny Liu 2020-02-18 11:29:09 UTC
Verified this bug with 4.4.0-0.nightly-2020-02-17-211020; it passed.

Comment 10 errata-xmlrpc 2020-05-04 11:36:23 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0581

