Seen in a 4.5.11 cluster's Insights tarball:

$ tar -xOz config/imageregistry.json < "$(ls | tail -n1)" | jq '{spec: (.spec | {managementState, replicas}), status: (.status | {readyReplicas})}'
{
  "spec": {
    "managementState": "Managed",
    "replicas": 2
  },
  "status": {
    "readyReplicas": 0
  }
}

But despite having no ready replicas, the registry is claiming Available=True:

$ tar -xOz config/imageregistry.json < "$(ls | tail -n1)" | jq -r '.status.conditions[] | .lastTransitionTime + " " + .type + "=" + .status + " " + (.reason // "-") + ": " + (.message // "-")' | sort
2020-09-30T03:51:17Z ImageConfigControllerDegraded=False AsExpected: -
2020-09-30T03:51:17Z NodeCADaemonControllerDegraded=False AsExpected: -
2020-09-30T03:51:23Z Degraded=False -: -
2020-09-30T03:51:23Z Removed=False -: -
2020-09-30T04:26:53Z Available=True Ready: The registry is ready
2020-12-09T13:46:05Z ImageRegistryCertificatesControllerDegraded=False AsExpected: -
2020-12-14T12:32:26Z Progressing=False Ready: The registry is ready
2020-12-14T12:32:26Z StorageExists=True GCS Bucket Exists: -

I would have expected Available=False.
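For reference, the same spot check can be run against a live cluster instead of an Insights tarball. This is only a sketch assuming cluster access and the oc client; it is not part of the original report:

# replicas vs. ready replicas on the operator config
$ oc get config.imageregistry.operator.openshift.io/cluster -o json \
    | jq '{spec: (.spec | {managementState, replicas}), status: (.status | {readyReplicas})}'

# operator conditions, sorted by last transition time
$ oc get config.imageregistry.operator.openshift.io/cluster -o json \
    | jq -r '.status.conditions[] | .lastTransitionTime + " " + .type + "=" + .status' | sort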
readyReplicas is always 0; the operator is not aware of this field. As we cannot remove the field from the API, perhaps the operator should report the number of ready registry replicas, not counting node-ca and the cron jobs.
readyReplicas looks like a non-deprecated v1 property [1]. Can't you pass through the status value from the Deployment? Picking a random CI job:

$ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade/1356101157344251904/artifacts/e2e-aws-upgrade/deployments.json | jq '.items[] | select(.metadata.name == "image-registry").status | {replicas, availableReplicas, updatedReplicas, readyReplicas}'
{
  "replicas": 2,
  "availableReplicas": 2,
  "updatedReplicas": 2,
  "readyReplicas": 2
}

[1]: https://github.com/openshift/api/blob/a9e731090f5ed361e5ab887d0ccd55c1db7fc633/imageregistry/v1/00-crd.yaml#L1111-L1113
It came from OperatorStatus [1]. Yes, most likely we'll just pass through the value from the Deployment, as nobody cares about the node-ca daemonset. Hopefully one day we'll find a new home for node-ca.

[1]: https://github.com/openshift/api/blob/a9e731090f5ed361e5ab887d0ccd55c1db7fc633/operator/v1/types.go#L120
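To make the intended behavior concrete: with the pass-through in place, readyReplicas on the operator config should simply mirror the registry Deployment's status. A rough way to compare the two (a sketch assuming cluster access; the openshift-image-registry namespace is the usual location of the image-registry Deployment and is not spelled out above):

# readyReplicas as reported by the registry Deployment itself
$ oc -n openshift-image-registry get deployment/image-registry -o json \
    | jq '.status.readyReplicas'

# readyReplicas as reported on the operator config; after the pass-through
# change the two values should match
$ oc get config.imageregistry.operator.openshift.io/cluster -o json \
    | jq '.status.readyReplicas'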
When replicas is set to 0, Available=False, as below:

NAME             VERSION                                           AVAILABLE   PROGRESSING   DEGRADED   SINCE
image-registry   4.8.0-0.ci.test-2021-04-07-073739-ci-ln-g8qkn42   False       False         True       5m6s
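For context, a scenario like that can be reproduced by scaling the registry down through the operator config. This is only a sketch assuming cluster-admin access; the exact commands used for the test above were not pasted:

# scale the registry to zero replicas via the operator config
$ oc patch config.imageregistry.operator.openshift.io/cluster \
    --type merge -p '{"spec":{"replicas":0}}'

# the image-registry ClusterOperator should then report Available=False
$ oc get clusteroperator image-registry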
Wenjing, this is a slightly different problem.

A registry with live replicas used to report .status.readyReplicas == 0 on config.imageregistry.operator.openshift.io/cluster. This should be fixed, and now it should be equal to .spec.replicas when everything works fine.
(In reply to Oleg Bulatov from comment #8)
> Wenjing, this is a slightly different problem.
>
> A registry with live replicas used to report .status.readyReplicas == 0
> on config.imageregistry.operator.openshift.io/cluster. This should be fixed,
> and now it should be equal to .spec.replicas when everything works fine.

Thanks for the reminder! Yes, I can see that .status.readyReplicas now reflects the same value as .spec.replicas.
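For the record, that check can be done with a one-liner against the operator config (a sketch assuming cluster access; the exact command used for verification was not pasted):

# on a healthy cluster, spec.replicas and status.readyReplicas should agree
$ oc get config.imageregistry.operator.openshift.io/cluster -o json \
    | jq '{specReplicas: .spec.replicas, readyReplicas: .status.readyReplicas}'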
Verified on 4.8.0-0.nightly-2021-04-09-222447.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438