Bug 1816803
| Summary: | Failed to rollout the stack error when upgrading to 4.3.8 | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Sam Yangsao <syangsao> |
| Component: | Monitoring | Assignee: | Simon Pasquier <spasquie> |
| Status: | CLOSED DUPLICATE | QA Contact: | Junqi Zhao <juzhao> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 4.3.z | CC: | alegrand, anpicker, erooth, kakkoyun, lcosic, mloibl, pkrupa, spasquie, surbania |
| Target Milestone: | --- | ||
| Target Release: | 4.5.0 | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2020-03-30 12:46:00 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description
Sam Yangsao
2020-03-24 19:04:08 UTC
Output from the problematic pod:
# oc -n openshift-monitoring describe pod/prometheus-k8s-1
Name: prometheus-k8s-1
Namespace: openshift-monitoring
Priority: 2000000000
Priority Class Name: system-cluster-critical
Node: <none>
Labels: app=prometheus
controller-revision-hash=prometheus-k8s-6d5975d66
prometheus=k8s
statefulset.kubernetes.io/pod-name=prometheus-k8s-1
Annotations: openshift.io/scc: restricted
Status: Pending
IP:
IPs: <none>
Controlled By: StatefulSet/prometheus-k8s
Containers:
prometheus:
Image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:62f7a5e596028e36518adec7d5228c062fb515bb8395eaef3aa3914891fbee21
Port: <none>
Host Port: <none>
Args:
--web.console.templates=/etc/prometheus/consoles
--web.console.libraries=/etc/prometheus/console_libraries
--config.file=/etc/prometheus/config_out/prometheus.env.yaml
--storage.tsdb.path=/prometheus
--storage.tsdb.retention.time=15d
--web.enable-lifecycle
--storage.tsdb.no-lockfile
--web.external-url=https://prometheus-k8s-openshift-monitoring.apps.devocp4.lab.msp.redhat.com/
--web.route-prefix=/
--web.listen-address=127.0.0.1:9090
Requests:
cpu: 200m
memory: 1Gi
Liveness: exec [sh -c if [ -x "$(command -v curl)" ]; then curl http://localhost:9090/-/healthy; elif [ -x "$(command -v wget)" ]; then wget -q http://localhost:9090/-/healthy; else exit 1; fi] delay=0s timeout=3s period=5s #success=1 #failure=6
Readiness: exec [sh -c if [ -x "$(command -v curl)" ]; then curl http://localhost:9090/-/ready; elif [ -x "$(command -v wget)" ]; then wget -q http://localhost:9090/-/ready; else exit 1; fi] delay=0s timeout=3s period=5s #success=1 #failure=120
Environment: <none>
Mounts:
/etc/pki/ca-trust/extracted/pem/ from prometheus-trusted-ca-bundle (ro)
/etc/prometheus/certs from tls-assets (ro)
/etc/prometheus/config_out from config-out (ro)
/etc/prometheus/configmaps/kubelet-serving-ca-bundle from configmap-kubelet-serving-ca-bundle (ro)
/etc/prometheus/configmaps/serving-certs-ca-bundle from configmap-serving-certs-ca-bundle (ro)
/etc/prometheus/rules/prometheus-k8s-rulefiles-0 from prometheus-k8s-rulefiles-0 (rw)
/etc/prometheus/secrets/kube-etcd-client-certs from secret-kube-etcd-client-certs (ro)
/etc/prometheus/secrets/kube-rbac-proxy from secret-kube-rbac-proxy (ro)
/etc/prometheus/secrets/prometheus-k8s-htpasswd from secret-prometheus-k8s-htpasswd (ro)
/etc/prometheus/secrets/prometheus-k8s-proxy from secret-prometheus-k8s-proxy (ro)
/etc/prometheus/secrets/prometheus-k8s-tls from secret-prometheus-k8s-tls (ro)
/prometheus from prometheus-k8s-db (rw)
/var/run/secrets/kubernetes.io/serviceaccount from prometheus-k8s-token-s8psx (ro)
prometheus-config-reloader:
Image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:083e5d6e7379d4043488a6093009cd2e4f19f7bdaf1a197b3daf45e0603f68ce
Port: <none>
Host Port: <none>
Command:
/bin/prometheus-config-reloader
Args:
--log-format=logfmt
--reload-url=http://localhost:9090/-/reload
--config-file=/etc/prometheus/config/prometheus.yaml.gz
--config-envsubst-file=/etc/prometheus/config_out/prometheus.env.yaml
Limits:
cpu: 100m
memory: 25Mi
Requests:
cpu: 100m
memory: 25Mi
Environment:
POD_NAME: prometheus-k8s-1 (v1:metadata.name)
Mounts:
/etc/prometheus/config from config (rw)
/etc/prometheus/config_out from config-out (rw)
/var/run/secrets/kubernetes.io/serviceaccount from prometheus-k8s-token-s8psx (ro)
rules-configmap-reloader:
Image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:46276473e250109b4521803aedd54946cc40498b99fcbedce34a1931f0fb6ebf
Port: <none>
Host Port: <none>
Args:
--webhook-url=http://localhost:9090/-/reload
--volume-dir=/etc/prometheus/rules/prometheus-k8s-rulefiles-0
Limits:
cpu: 100m
memory: 25Mi
Requests:
cpu: 100m
memory: 25Mi
Environment: <none>
Mounts:
/etc/prometheus/rules/prometheus-k8s-rulefiles-0 from prometheus-k8s-rulefiles-0 (rw)
/var/run/secrets/kubernetes.io/serviceaccount from prometheus-k8s-token-s8psx (ro)
thanos-sidecar:
Image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:e697e0ba41815ade9ab7cd05c275b9f6730761355435ace189b9119f89d81238
Ports: 10902/TCP, 10901/TCP
Host Ports: 0/TCP, 0/TCP
Args:
sidecar
--prometheus.url=http://localhost:9090/
--tsdb.path=/prometheus
--grpc-address=[$(POD_IP)]:10901
--http-address=127.0.0.1:10902
--grpc-server-tls-cert=/etc/tls/grpc/server.crt
--grpc-server-tls-key=/etc/tls/grpc/server.key
--grpc-server-tls-client-ca=/etc/tls/grpc/ca.crt
Requests:
cpu: 50m
memory: 100Mi
Environment:
POD_IP: (v1:status.podIP)
Mounts:
/etc/tls/grpc from secret-grpc-tls (rw)
/prometheus from prometheus-k8s-db (rw)
/var/run/secrets/kubernetes.io/serviceaccount from prometheus-k8s-token-s8psx (ro)
prometheus-proxy:
Image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:443087dca8e7ca16a06771c4e711145b56ddb1eacf0fdf3f0193bf9d7dc30a37
Port: 9091/TCP
Host Port: 0/TCP
Args:
-provider=openshift
-https-address=:9091
-http-address=
-email-domain=*
-upstream=http://localhost:9090
-htpasswd-file=/etc/proxy/htpasswd/auth
-openshift-service-account=prometheus-k8s
-openshift-sar={"resource": "namespaces", "verb": "get"}
-openshift-delegate-urls={"/": {"resource": "namespaces", "verb": "get"}}
-tls-cert=/etc/tls/private/tls.crt
-tls-key=/etc/tls/private/tls.key
-client-secret-file=/var/run/secrets/kubernetes.io/serviceaccount/token
-cookie-secret-file=/etc/proxy/secrets/session_secret
-openshift-ca=/etc/pki/tls/cert.pem
-openshift-ca=/var/run/secrets/kubernetes.io/serviceaccount/ca.crt
-skip-auth-regex=^/metrics
Requests:
cpu: 10m
memory: 20Mi
Environment:
HTTP_PROXY:
HTTPS_PROXY:
NO_PROXY:
Mounts:
/etc/pki/ca-trust/extracted/pem/ from prometheus-trusted-ca-bundle (ro)
/etc/proxy/htpasswd from secret-prometheus-k8s-htpasswd (rw)
/etc/proxy/secrets from secret-prometheus-k8s-proxy (rw)
/etc/tls/private from secret-prometheus-k8s-tls (rw)
/var/run/secrets/kubernetes.io/serviceaccount from prometheus-k8s-token-s8psx (ro)
kube-rbac-proxy:
Image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:edbacf69d159af2598f9c5f626493c35a29d65f7f2d2c212f168001a719d8c31
Port: 9092/TCP
Host Port: 0/TCP
Args:
--secure-listen-address=0.0.0.0:9092
--upstream=http://127.0.0.1:9095
--config-file=/etc/kube-rbac-proxy/config.yaml
--tls-cert-file=/etc/tls/private/tls.crt
--tls-private-key-file=/etc/tls/private/tls.key
--tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_RSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256
--logtostderr=true
--v=10
Requests:
cpu: 10m
memory: 20Mi
Environment: <none>
Mounts:
/etc/kube-rbac-proxy from secret-kube-rbac-proxy (rw)
/etc/tls/private from secret-prometheus-k8s-tls (rw)
/var/run/secrets/kubernetes.io/serviceaccount from prometheus-k8s-token-s8psx (ro)
prom-label-proxy:
Image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:431d605858b2b0e03ddad79854dded6176ae93a89c53b7a92f599b797cb01634
Port: <none>
Host Port: <none>
Args:
--insecure-listen-address=127.0.0.1:9095
--upstream=http://127.0.0.1:9090
--label=namespace
Requests:
cpu: 10m
memory: 20Mi
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from prometheus-k8s-token-s8psx (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
config:
Type: Secret (a volume populated by a Secret)
SecretName: prometheus-k8s
Optional: false
tls-assets:
Type: Secret (a volume populated by a Secret)
SecretName: prometheus-k8s-tls-assets
Optional: false
config-out:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
prometheus-k8s-rulefiles-0:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: prometheus-k8s-rulefiles-0
Optional: false
secret-kube-etcd-client-certs:
Type: Secret (a volume populated by a Secret)
SecretName: kube-etcd-client-certs
Optional: false
secret-prometheus-k8s-tls:
Type: Secret (a volume populated by a Secret)
SecretName: prometheus-k8s-tls
Optional: false
secret-prometheus-k8s-proxy:
Type: Secret (a volume populated by a Secret)
SecretName: prometheus-k8s-proxy
Optional: false
secret-prometheus-k8s-htpasswd:
Type: Secret (a volume populated by a Secret)
SecretName: prometheus-k8s-htpasswd
Optional: false
secret-kube-rbac-proxy:
Type: Secret (a volume populated by a Secret)
SecretName: kube-rbac-proxy
Optional: false
configmap-serving-certs-ca-bundle:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: serving-certs-ca-bundle
Optional: false
configmap-kubelet-serving-ca-bundle:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: kubelet-serving-ca-bundle
Optional: false
prometheus-k8s-db:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
secret-grpc-tls:
Type: Secret (a volume populated by a Secret)
SecretName: prometheus-k8s-grpc-tls-8r79ud0kfmcr2
Optional: false
prometheus-trusted-ca-bundle:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: prometheus-trusted-ca-bundle-39man1pbaa8jq
Optional: true
prometheus-k8s-token-s8psx:
Type: Secret (a volume populated by a Secret)
SecretName: prometheus-k8s-token-s8psx
Optional: false
QoS Class: Burstable
Node-Selectors: kubernetes.io/os=linux
Tolerations: node.kubernetes.io/memory-pressure:NoSchedule
node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling <unknown> default-scheduler 0/5 nodes are available: 2 Insufficient cpu, 3 node(s) had taints that the pod didn't tolerate.
<snip: the same FailedScheduling warning repeats 89 more times>
One workaround that I validated (thanks to Engineering) is to add a worker node to the cluster while the upgrade is stalled on this error.
The cluster was originally configured with the default of:
3 master nodes (control plane) with 4 vCPU & 16 GB memory each
2 worker nodes with 2 vCPU & 8 GB memory each
This minimum configuration [1] fails during an upgrade from 4.3.1 because the resource requirements have changed; a backport [2] has been created to reduce the CPU requirement.
Here is an excerpt from the problematic pod's YAML (`prometheus-k8s-1.yaml`):
<snip>
status:
conditions:
- lastProbeTime: null
lastTransitionTime: "2020-03-24T18:19:32Z"
message: '0/5 nodes are available: 2 Insufficient cpu, 3 node(s) had taints that
the pod didn''t tolerate.'
reason: Unschedulable
status: "False"
type: PodScheduled
phase: Pending
qosClass: Burstable
</snip>
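To make the scheduler message easier to read, here is a minimal Python sketch that splits it into per-reason node counts. It assumes the message has exactly the shape shown above (an "N/M nodes are available:" prefix followed by comma-separated "count reason" parts); `parse_unschedulable` is a hypothetical helper name, not part of `oc` or Kubernetes.

```python
import re

# Message copied from the pod status above.
message = ("0/5 nodes are available: 2 Insufficient cpu, "
           "3 node(s) had taints that the pod didn't tolerate.")


def parse_unschedulable(msg):
    """Return (available, total, {reason: node_count}) for a scheduler message.

    Assumes the exact format shown in this report; other formats raise ValueError.
    """
    head, _, tail = msg.partition(": ")
    match = re.match(r"(\d+)/(\d+) nodes are available", head)
    if match is None:
        raise ValueError("unexpected scheduler message format")
    available, total = int(match.group(1)), int(match.group(2))

    reasons = {}
    for part in tail.rstrip(".").split(", "):
        count, _, reason = part.partition(" ")
        reasons[reason] = int(count)
    return available, total, reasons


available, total, reasons = parse_unschedulable(message)
for reason, count in reasons.items():
    print(f"{count} of {total} nodes: {reason}")
```

For this message, the two reasons account for all 5 nodes: 2 nodes are short on CPU and 3 nodes carry taints the pod does not tolerate.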
The same message also appears in the Events section of the following command's output:
# oc -n openshift-monitoring describe pod/prometheus-k8s-1
NOTE: The prometheus-k8s pods require 480m CPU across all containers, so the default minimums [1] will need to be adjusted.
Once I added an additional worker with more vCPU & memory, the Prometheus pods started and the upgrade to 4.3.8 continued and completed.
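The 480m figure can be checked by adding up the per-container CPU requests from the `oc describe` output above. A small Python sketch of that arithmetic follows; the values are copied from the Requests sections of the seven containers, and the variable names are only illustrative.

```python
# CPU requests in millicores, one entry per container,
# copied from the `oc describe pod/prometheus-k8s-1` output above.
cpu_requests_m = {
    "prometheus": 200,
    "prometheus-config-reloader": 100,
    "rules-configmap-reloader": 100,
    "thanos-sidecar": 50,
    "prometheus-proxy": 10,
    "kube-rbac-proxy": 10,
    "prom-label-proxy": 10,
}

# The scheduler needs a node with at least this much unreserved CPU.
total_m = sum(cpu_requests_m.values())
print(f"total CPU request: {total_m}m")  # prints "total CPU request: 480m"
```

The scheduler places a pod based on the sum of its containers' requests, so a worker needs at least 480m of unreserved CPU to run one prometheus-k8s replica.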
[1] https://docs.openshift.com/container-platform/4.3/installing/installing_vsphere/installing-vsphere.html#minimum-resource-requirements_installing-vsphere
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1812719