Description of problem:
IPI AWS cluster with only 2 workers: reconciling the PrometheusAdapter Deployment failed. No such issue with 3 workers on the same payload.

# oc get no | grep worker
ip-10-0-150-214.ap-south-1.compute.internal   Ready   worker   152m   v1.21.0-rc.0+2993be8
ip-10-0-163-79.ap-south-1.compute.internal    Ready   worker   152m   v1.21.0-rc.0+2993be8

# oc get co monitoring -oyaml
...
status:
  conditions:
  - lastTransitionTime: "2021-04-19T05:03:40Z"
    message: 'Failed to rollout the stack. Error: running task Updating prometheus-adapter
      failed: reconciling PrometheusAdapter Deployment failed: updating Deployment object
      failed: waiting for DeploymentRollout of openshift-monitoring/prometheus-adapter:
      expected 3 replicas, got 1 updated replicas'
    reason: UpdatingprometheusAdapterFailed
    status: "True"
    type: Degraded

NOTE: the prometheus-adapter deployment requires only 2 prometheus-adapter pods.

# oc -n openshift-monitoring get po | grep prometheus-adapter
prometheus-adapter-6d9fc84f4c-2jq9s   0/1   ContainerCreating   0   92m
prometheus-adapter-6d9fc84f4c-pwhb4   0/1   ContainerCreating   0   92m
prometheus-adapter-7785cf7594-df8bf   0/1   Pending             0   87m

# oc -n openshift-monitoring get deploy prometheus-adapter
NAME                 READY   UP-TO-DATE   AVAILABLE   AGE
prometheus-adapter   0/2     1            0           93m

# oc -n openshift-monitoring get rs
NAME                            DESIRED   CURRENT   READY   AGE
..
prometheus-adapter-6d9fc84f4c   2         2         0       93m
prometheus-adapter-7785cf7594   1         1         0       88m

Describe the Pending pod:
# oc -n openshift-monitoring describe pod prometheus-adapter-7785cf7594-df8bf
Name:                 prometheus-adapter-7785cf7594-df8bf
Namespace:            openshift-monitoring
Priority:             2000000000
Priority Class Name:  system-cluster-critical
Node:                 <none>
Labels:               app.kubernetes.io/component=metrics-adapter
                      app.kubernetes.io/managed-by=cluster-monitoring-operator
                      app.kubernetes.io/name=prometheus-adapter
                      app.kubernetes.io/part-of=openshift-monitoring
                      app.kubernetes.io/version=0.8.4
                      pod-template-hash=7785cf7594
Annotations:          openshift.io/scc: restricted
                      workload.openshift.io/management: {"effect": "PreferredDuringScheduling"}
Status:               Pending
IP:
IPs:                  <none>
Controlled By:        ReplicaSet/prometheus-adapter-7785cf7594
Containers:
  prometheus-adapter:
    Image:      quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:6400b1199456905c17cfdf73b72b606b6f081d533cbcc823d4ae050e4ef63390
    Port:       6443/TCP
    Host Port:  0/TCP
    Args:
      --prometheus-auth-config=/etc/prometheus-config/prometheus-config.yaml
      --config=/etc/adapter/config.yaml
      --logtostderr=true
      --metrics-relist-interval=1m
      --prometheus-url=https://prometheus-k8s.openshift-monitoring.svc:9091
      --secure-port=6443
      --client-ca-file=/etc/tls/private/client-ca-file
      --requestheader-client-ca-file=/etc/tls/private/requestheader-client-ca-file
      --requestheader-allowed-names=kube-apiserver-proxy,system:kube-apiserver-proxy,system:openshift-aggregator
      --requestheader-extra-headers-prefix=X-Remote-Extra-
      --requestheader-group-headers=X-Remote-Group
      --requestheader-username-headers=X-Remote-User
      --tls-cert-file=/etc/tls/private/tls.crt
      --tls-private-key-file=/etc/tls/private/tls.key
    Requests:
      cpu:     1m
      memory:  25Mi
    Environment:  <none>
    Mounts:
      /etc/adapter from config (rw)
      /etc/prometheus-config from prometheus-adapter-prometheus-config (rw)
      /etc/ssl/certs from serving-certs-ca-bundle (rw)
      /etc/tls/private from tls (ro)
      /tmp from tmpfs (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from prometheus-adapter-token-r8pjh (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  tmpfs:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      adapter-config
    Optional:  false
  prometheus-adapter-prometheus-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      prometheus-adapter-prometheus-config
    Optional:  false
  serving-certs-ca-bundle:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      serving-certs-ca-bundle
    Optional:  false
  tls:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  prometheus-adapter-12shoolsvvf93
    Optional:    false
  prometheus-adapter-token-r8pjh:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  prometheus-adapter-token-r8pjh
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  kubernetes.io/os=linux
Tolerations:     node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  67m   default-scheduler  0/3 nodes are available: 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.
  Warning  FailedScheduling  67m   default-scheduler  0/3 nodes are available: 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.
  Warning  FailedScheduling  66m   default-scheduler  0/5 nodes are available: 2 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate, 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.
  Warning  FailedScheduling  65m   default-scheduler  0/5 nodes are available: 2 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate, 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.
  Warning  FailedScheduling  65m   default-scheduler  0/5 nodes are available: 2 node(s) didn't match pod affinity/anti-affinity rules, 2 node(s) didn't match pod anti-affinity rules, 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.
  Warning  FailedScheduling  65m   default-scheduler  0/5 nodes are available: 2 node(s) didn't match pod affinity/anti-affinity rules, 2 node(s) didn't match pod anti-affinity rules, 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.
  Warning  FailedScheduling  64m   default-scheduler  0/5 nodes are available: 2 node(s) didn't match pod affinity/anti-affinity rules, 2 node(s) didn't match pod anti-affinity rules, 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.

Describe the ContainerCreating pod:
# oc -n openshift-monitoring describe pod prometheus-adapter-6d9fc84f4c-2jq9s
Name:                 prometheus-adapter-6d9fc84f4c-2jq9s
Namespace:            openshift-monitoring
Priority:             2000000000
Priority Class Name:  system-cluster-critical
Node:                 ip-10-0-150-214.ap-south-1.compute.internal/10.0.150.214
Start Time:           Mon, 19 Apr 2021 00:56:35 -0400
Labels:               app.kubernetes.io/component=metrics-adapter
                      app.kubernetes.io/managed-by=cluster-monitoring-operator
                      app.kubernetes.io/name=prometheus-adapter
                      app.kubernetes.io/part-of=openshift-monitoring
                      app.kubernetes.io/version=0.8.4
                      pod-template-hash=6d9fc84f4c
Annotations:          openshift.io/scc: restricted
                      workload.openshift.io/management: {"effect": "PreferredDuringScheduling"}
Status:               Pending
IP:
IPs:                  <none>
Controlled By:        ReplicaSet/prometheus-adapter-6d9fc84f4c
Containers:
  prometheus-adapter:
    Container ID:
    Image:         quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:6400b1199456905c17cfdf73b72b606b6f081d533cbcc823d4ae050e4ef63390
    Image ID:
    Port:          6443/TCP
    Host Port:     0/TCP
    Args:
      --prometheus-auth-config=/etc/prometheus-config/prometheus-config.yaml
      --config=/etc/adapter/config.yaml
      --logtostderr=true
      --metrics-relist-interval=1m
      --prometheus-url=https://prometheus-k8s.openshift-monitoring.svc:9091
      --secure-port=6443
      --client-ca-file=/etc/tls/private/client-ca-file
      --requestheader-client-ca-file=/etc/tls/private/requestheader-client-ca-file
      --requestheader-allowed-names=kube-apiserver-proxy,system:kube-apiserver-proxy,system:openshift-aggregator
      --requestheader-extra-headers-prefix=X-Remote-Extra-
      --requestheader-group-headers=X-Remote-Group
      --requestheader-username-headers=X-Remote-User
      --tls-cert-file=/etc/tls/private/tls.crt
      --tls-private-key-file=/etc/tls/private/tls.key
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Requests:
      cpu:     1m
      memory:  25Mi
    Environment:  <none>
    Mounts:
      /etc/adapter from config (rw)
      /etc/prometheus-config from prometheus-adapter-prometheus-config (rw)
      /etc/ssl/certs from serving-certs-ca-bundle (rw)
      /etc/tls/private from tls (ro)
      /tmp from tmpfs (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from prometheus-adapter-token-r8pjh (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  tmpfs:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      adapter-config
    Optional:  false
  prometheus-adapter-prometheus-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      prometheus-adapter-prometheus-config
    Optional:  false
  serving-certs-ca-bundle:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      serving-certs-ca-bundle
    Optional:  false
  tls:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  prometheus-adapter-7gdh4nvu6vtep
    Optional:    false
  prometheus-adapter-token-r8pjh:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  prometheus-adapter-token-r8pjh
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  kubernetes.io/os=linux
Tolerations:     node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason            Age                From               Message
  ----     ------            ----               ----               -------
  Normal   Scheduled         66m                default-scheduler  Successfully assigned openshift-monitoring/prometheus-adapter-6d9fc84f4c-2jq9s to ip-10-0-150-214.ap-south-1.compute.internal
  Warning  FailedScheduling  74m                default-scheduler  0/3 nodes are available: 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.
  Warning  FailedScheduling  74m                default-scheduler  0/3 nodes are available: 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.
  Warning  FailedScheduling  72m                default-scheduler  0/3 nodes are available: 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.
  Warning  FailedScheduling  71m                default-scheduler  0/3 nodes are available: 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.
  Warning  FailedScheduling  68m                default-scheduler  0/5 nodes are available: 2 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate, 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.
  Warning  FailedScheduling  67m                default-scheduler  0/5 nodes are available: 2 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate, 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.
  Warning  FailedScheduling  74m                default-scheduler  0/3 nodes are available: 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.
  Warning  FailedMount       51m (x2 over 55m)  kubelet            Unable to attach or mount volumes: unmounted volumes=[tls], unattached volumes=[serving-certs-ca-bundle tls prometheus-adapter-token-r8pjh tmpfs config prometheus-adapter-prometheus-config]: timed out waiting for the condition
  Warning  FailedMount       48m (x4 over 64m)  kubelet            Unable to attach or mount volumes: unmounted volumes=[tls], unattached volumes=[prometheus-adapter-prometheus-config serving-certs-ca-bundle tls prometheus-adapter-token-r8pjh tmpfs config]: timed out waiting for the condition
  Warning  FailedMount       30m (x5 over 53m)  kubelet            Unable to attach or mount volumes: unmounted volumes=[tls], unattached volumes=[prometheus-adapter-token-r8pjh tmpfs config prometheus-adapter-prometheus-config serving-certs-ca-bundle tls]: timed out waiting for the condition
  Warning  FailedMount       26m                kubelet            Unable to attach or mount volumes: unmounted volumes=[tls], unattached volumes=[config prometheus-adapter-prometheus-config serving-certs-ca-bundle tls prometheus-adapter-token-r8pjh tmpfs]: timed out waiting for the condition
  Warning  FailedMount       21m                kubelet            Unable to attach or mount volumes: unmounted volumes=[tls], unattached volumes=[tls prometheus-adapter-token-r8pjh tmpfs config prometheus-adapter-prometheus-config serving-certs-ca-bundle]: timed out waiting for the condition
  Warning  FailedMount       <invalid> (x5 over 60m)   kubelet     Unable to attach or mount volumes: unmounted volumes=[tls], unattached volumes=[tmpfs config prometheus-adapter-prometheus-config serving-certs-ca-bundle tls prometheus-adapter-token-r8pjh]: timed out waiting for the condition
  Warning  FailedMount       <invalid> (x50 over 66m)  kubelet     MountVolume.SetUp failed for volume "tls" : secret "prometheus-adapter-7gdh4nvu6vtep" not found

# oc -n openshift-monitoring get rs prometheus-adapter-7785cf7594 -oyaml | grep "prometheus-adapter-7gdh4nvu6vtep"
no result

# oc -n openshift-monitoring get rs prometheus-adapter-6d9fc84f4c -oyaml | grep "prometheus-adapter-7gdh4nvu6vtep" -C3
      - name: tls
        secret:
          defaultMode: 420
          secretName: prometheus-adapter-7gdh4nvu6vtep
status:
  fullyLabeledReplicas: 2
  observedGeneration: 1

The secret should be prometheus-adapter-12shoolsvvf93:
# oc -n openshift-monitoring get secret | grep prometheus-adapter
prometheus-adapter-12shoolsvvf93     Opaque                                4   93m
prometheus-adapter-dockercfg-wp4bv   kubernetes.io/dockercfg               1   97m
prometheus-adapter-tls               kubernetes.io/tls                     2   99m
prometheus-adapter-token-gqp6k       kubernetes.io/service-account-token   4   97m
prometheus-adapter-token-r8pjh       kubernetes.io/service-account-token   4   100m

The deployment also uses the prometheus-adapter-12shoolsvvf93 secret:
# oc -n openshift-monitoring get deploy prometheus-adapter -oyaml | grep "prometheus-adapter-12shoolsvvf93" -C3
      - name: tls
        secret:
          defaultMode: 420
          secretName: prometheus-adapter-12shoolsvvf93
status:
  conditions:
  - lastTransitionTime: "2021-04-19T04:48:39Z"

ReplicaSet prometheus-adapter-7785cf7594's desired count is 1, and it uses the prometheus-adapter-12shoolsvvf93 secret, the same one as the prometheus-adapter deployment:
# oc -n openshift-monitoring get rs prometheus-adapter-7785cf7594 -oyaml | grep "prometheus-adapter-12shoolsvvf93" -C3
      - name: tls
        secret:
          defaultMode: 420
          secretName: prometheus-adapter-12shoolsvvf93
status:
  fullyLabeledReplicas: 1
  observedGeneration: 1

ReplicaSet prometheus-adapter-6d9fc84f4c's desired count is 2, and it does not use the prometheus-adapter-12shoolsvvf93 secret; it uses the prometheus-adapter-7gdh4nvu6vtep secret, which is not found, as shown above:
# oc -n openshift-monitoring get rs prometheus-adapter-6d9fc84f4c -oyaml | grep "prometheus-adapter-12shoolsvvf93" -C3
no result

Version-Release number of selected component (if applicable):
4.8.0-0.nightly-2021-04-18-101412
prometheus-adapter 0.8.4
Kubernetes Version: v1.21.0-rc.0+2993be8

How reproducible:
in a cluster with 2 worker nodes

Steps to Reproduce:
1. See the description above.

Actual results:
Reconciling the PrometheusAdapter Deployment failed in a cluster with 2 worker nodes.

Expected results:
No issue.

Additional info:
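For anyone triaging a similar report, here is a quick way to compare the tls secret referenced by the Deployment against each ReplicaSet in one pass. This is a sketch using oc's jsonpath support; the label and volume names are taken from the describe output above:

```
# Secret name referenced by the deployment's current pod template:
oc -n openshift-monitoring get deploy prometheus-adapter \
  -o jsonpath='{.spec.template.spec.volumes[?(@.name=="tls")].secret.secretName}{"\n"}'

# Secret name referenced by each ReplicaSet owned by the deployment:
oc -n openshift-monitoring get rs -l app.kubernetes.io/name=prometheus-adapter \
  -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.template.spec.volumes[?(@.name=="tls")].secret.secretName}{"\n"}{end}'
```

Any ReplicaSet whose tls secret differs from the deployment's (and does not exist in the namespace) is the stale one.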
Adding more info which we have observed while doing the RCA:
================================================================
Initially the deployment tried to roll out, but the worker nodes were not ready, so the pods went into ContainerCreating state. Once the workers became ready, the deployment did not recognize this and started to roll out another replicaset, causing a race it could not resolve on its own.
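One way to watch the two competing replicasets while this race is happening, sketched with standard oc commands (the label comes from the pod describe output above; not re-verified on the affected cluster):

```
# List the competing ReplicaSets owned by the deployment:
oc -n openshift-monitoring get rs -l app.kubernetes.io/name=prometheus-adapter

# Show the deployment's rollout revisions (one per ReplicaSet):
oc -n openshift-monitoring rollout history deploy/prometheus-adapter
```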
Running the following command recovers the environment:

# oc -n openshift-monitoring delete rs prometheus-adapter-7785cf7594
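To confirm the recovery afterwards (a sketch; it assumes the deployment controller re-creates a ReplicaSet from the current pod template once the deleted one is gone):

```
# Wait for the deployment to converge to 2/2 ready replicas:
oc -n openshift-monitoring rollout status deploy/prometheus-adapter

# Both prometheus-adapter pods should now be Running:
oc -n openshift-monitoring get po -l app.kubernetes.io/name=prometheus-adapter
```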
(In reply to RamaKasturi from comment #2)
> Adding more info which we have observed while doing the RCA:
> ================================================================
> Initially the deployment tried to roll out, but the worker nodes were not
> ready, so the pods went into ContainerCreating state. Once the workers
> became ready, the deployment did not recognize this and started to roll out
> another replicaset, causing a race it could not resolve on its own.

The workers are all Ready; there are no NotReady nodes. I recreated a cluster to file this bug.
This issue is very similar to bug 1950761, which was filed for SNO. That bug was a regression from applying the HA conventions to prometheus-adapter. Part of the conventions we added are hard anti-affinity on hostname and maxUnavailable set to 25%, as defined for operands with 2 replicas.

So far, we've noticed the same status reported for the prometheus-adapter deployment in both cases:

```
status:
  conditions:
  - lastTransitionTime: "2021-04-19T04:48:39Z"
    lastUpdateTime: "2021-04-19T04:48:39Z"
    message: Deployment does not have minimum availability.
    reason: MinimumReplicasUnavailable
    status: "False"
    type: Available
  - lastTransitionTime: "2021-04-19T05:03:53Z"
    lastUpdateTime: "2021-04-19T05:03:53Z"
    message: ReplicaSet "prometheus-adapter-7785cf7594" has timed out progressing.
    reason: ProgressDeadlineExceeded
    status: "False"
    type: Progressing
  observedGeneration: 2
  replicas: 3
  unavailableReplicas: 3
  updatedReplicas: 1
```

What's weird is that the number of replicas reported in the status is 3 even though we set it to 2 in the deployment's spec. This seems to be caused by the creation of a second replicaset for prometheus-adapter. During the rollout, instead of moving all the replicas to the new replicaset, the first replicaset was scaled up to 2 and the second one to 1, which breaks our anti-affinity rule since there are only 2 nodes for 3 pods. It may also be worth noting that the `ProgressDeadlineExceeded` status is only observed for the second replicaset, perhaps because the hard anti-affinity rule prevents its pod from being scheduled.

Hence, my question would be: why do we end up with 3 replicas even though we asked for 2? Is this normal behavior during a rolling update, or is there an issue with our configuration?
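For illustration, this is roughly the shape the conventions described above take on the deployment spec. A sketch only; the exact manifest produced by cluster-monitoring-operator may differ:

```
spec:
  replicas: 2
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 25%  # per the HA convention mentioned above
  template:
    spec:
      affinity:
        podAntiAffinity:
          # "hard" anti-affinity: the pods must land on distinct hostnames
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app.kubernetes.io/name: prometheus-adapter
            topologyKey: kubernetes.io/hostname
```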
Hi,

We have seen failures in multi-arch (MA) CI which may be related. I have linked the failing jobs below. We have seen consistent failures for 2 days now. Can this be marked as a blocker?

https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-remote-libvirt-s390x-4.8/1384114697518714880
https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-remote-libvirt-ppc64le-4.8/1383933519910146048
The prometheus-adapter deployment has 2 replicasets:

[root@localhost roottest]# oc get rs | grep prometheus-adapter
prometheus-adapter-68c598744f   2   2   0   100m
prometheus-adapter-6d847c6c8d   1   1   0   95m

The first rs's pods failed with the error:

3m25s   Warning   FailedMount   pod/prometheus-adapter-68c598744f-wfkdn   MountVolume.SetUp failed for volume "tls" : secret "prometheus-adapter-58qc3mf19bli0" not found

The second rs's pod failed with the error:

[root@localhost roottest]# oc get po | grep prometheus-adapter
prometheus-adapter-68c598744f-jm8mq   0/1   ContainerCreating   0   102m
prometheus-adapter-68c598744f-wfkdn   0/1   ContainerCreating   0   102m
prometheus-adapter-6d847c6c8d-nrbn2   0/1   Pending             0   97m

[root@localhost roottest]# oc get events | grep prometheus-adapter-6d847c6c8d-nrbn2
....
6m8s   Warning   FailedScheduling   pod/prometheus-adapter-6d847c6c8d-nrbn2   0/5 nodes are available: 2 node(s) didn't match pod affinity/anti-affinity rules, 2 node(s) didn't match pod anti-affinity rules, 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.
(In reply to Damien Grisonnet from comment #5)
> [...]
> Hence, my question would be: why do we end up with 3 replicas even though
> we asked for 2? Is this normal behavior during a rolling update, or is
> there an issue with our configuration?

The "3 replicas" comes from the "Rollover" behavior of Deployments; it is expected.
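To spell out the rollover arithmetic (assuming the Kubernetes default maxSurge of 25% alongside the maxUnavailable of 25% stated in comment #5):

```
# With replicas: 2, the percentages resolve as follows
# (Kubernetes rounds maxUnavailable down and maxSurge up):
rollingUpdate:
  maxUnavailable: 25%  # floor(2 * 0.25) = 0 -> no old pod may be removed first
  maxSurge: 25%        # ceil(2 * 0.25)  = 1 -> one extra pod allowed during rollout
# Old ReplicaSet keeps 2 pods + new ReplicaSet surges 1 pod = 3 pods total,
# matching status.replicas: 3. With hard anti-affinity on
# kubernetes.io/hostname and only 2 worker nodes, the third pod can never
# schedule, so the rollout deadlocks.
```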
With the fix for bug 1950761, no such issue now.

# oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.0-0.nightly-2021-04-19-225513   True        False         43m     Cluster version is 4.8.0-0.nightly-2021-04-19-225513

# oc get no | grep worker
ip-10-0-132-81.ap-south-1.compute.internal   Ready   worker   57m   v1.21.0-rc.0+98d91ef
ip-10-0-160-93.ap-south-1.compute.internal   Ready   worker   57m   v1.21.0-rc.0+98d91ef

# oc get co monitoring
NAME         VERSION                             AVAILABLE   PROGRESSING   DEGRADED   SINCE
monitoring   4.8.0-0.nightly-2021-04-19-225513   True        False         False      39m

# oc -n openshift-monitoring get po | grep prometheus-adapter
prometheus-adapter-69876c9996-26wg9   1/1   Running   0   42m
prometheus-adapter-69876c9996-6b4pk   1/1   Running   0   42m

# oc -n openshift-monitoring get deploy prometheus-adapter
NAME                 READY   UP-TO-DATE   AVAILABLE   AGE
prometheus-adapter   2/2     2            2           64m

# oc -n openshift-monitoring get rs | grep prometheus-adapter
prometheus-adapter-69876c9996   2   2   2   64m

Closing this bug.
*** This bug has been marked as a duplicate of bug 1950761 ***