Description  mchebbi@redhat.com  2020-06-18 11:34:56 UTC
The customer upgraded his cluster from 4.2.25 to 4.3.18, but several operators were not updated. I have fixed the machine-config operator, but the prometheus-operator is still not ready and the monitoring operator is degraded:
NAME         VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE
dns          4.2.25    True        True          False      160d
monitoring   4.2.25    False       True          True       14d
network      4.2.25    True        True          False      160d
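For reference, the full status conditions of the degraded operator (reproduced further below) can be pulled with a command along these lines; the exact invocation is assumed, not taken from the report:

oc get clusteroperator monitoring -o yaml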
sl-uosbast1t:~ # oc project openshift-monitoring
Now using project "openshift-monitoring" on server "https://api.ocp.corp.wan:6443".
sl-uosbast1t:~ # oc get pods | grep operator
cluster-monitoring-operator-bc89787ff-fptbl   1/1   Running   0   82d
prometheus-operator-f8fc5b975-fh8vg           1/1   Running   0   82d
sl-uosbast1t:~ # oc get deployments
NAME                          READY   UP-TO-DATE   AVAILABLE   AGE
cluster-monitoring-operator   1/1     1            1           160d
grafana                       1/1     1            1           160d
kube-state-metrics            1/1     1            1           160d
openshift-state-metrics       1/1     1            1           160d
prometheus-adapter            2/2     2            2           160d
prometheus-operator           0/1     1            0           160d
telemeter-client              0/1     1            0           61d
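The prometheus-operator deployment is 0/1 ready. A minimal sketch of how to see why its replica is not coming up (commands assumed; the pod name is a placeholder):

oc -n openshift-monitoring describe deployment prometheus-operator
oc -n openshift-monitoring get pods | grep prometheus-operator
oc -n openshift-monitoring describe pod <prometheus-operator-pod-name>

The Events section of the pod description usually shows the scheduling or image-pull failure behind the unready replica.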
conditions:
- lastTransitionTime: 2020-06-16T13:50:21Z
  message: Rolling out the stack.
  reason: RollOutInProgress
  status: "True"
  type: Progressing
- lastTransitionTime: 2020-06-16T09:04:25Z
  message: 'Failed to rollout the stack. Error: running task Updating Prometheus Operator failed: reconciling Prometheus Operator Deployment failed: updating deployment object failed: waiting for DeploymentRollout of prometheus-operator: deployment prometheus-operator is not ready. status: (replicas: 1, updated: 1, ready: 0, unavailable: 1)'
  reason: UpdatingPrometheusOperatorFailed
  status: "True"
  type: Degraded
- lastTransitionTime: 2020-06-16T13:50:21Z
  message: Rollout of the monitoring stack is in progress. Please wait until it
    finishes.
  reason: RollOutInProgress
  status: "True"
  type: Upgradeable
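As a quicker check than reading the full YAML, the Degraded message alone could be extracted with a jsonpath query along these lines (command assumed):

oc get clusteroperator monitoring -o jsonpath='{.status.conditions[?(@.type=="Degraded")].message}'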
All needed information is available through this link: https://bit.ly/2YdDGZi
Setting target release to the active development branch (4.6.0). For fixes, if any, requested/required on previous versions, cloned BZs targeting those release z-streams will be created.
You have 6 nodes:
3 of them are master nodes, which are tainted by default, so pods are not allowed to be scheduled there.
3 of them are worker nodes that do not have enough free CPU for the new pods.
You need to increase the worker nodes' resources or add new nodes.
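A minimal sketch of how to verify the taints and the CPU headroom (commands assumed; <master-node> and <worker-node> are placeholders):

oc describe node <master-node> | grep -A1 Taints
# masters normally carry the node-role.kubernetes.io/master:NoSchedule taint
oc describe node <worker-node> | grep -A8 'Allocated resources'
# compare the CPU requests against the node's allocatable CPU
oc -n openshift-monitoring get events --field-selector reason=FailedScheduling
# lists the scheduling failures for the pending monitoring pods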