Description of problem:
image registry is degraded after upgrading with error in pod as below:

$ oc logs pods/image-registry-6f4c7b9569-mgtgs
mkdir: cannot create directory '/etc/pki/ca-trust/extracted/edk2': File exists
mkdir: cannot create directory '/etc/pki/ca-trust/extracted/java': File exists
mkdir: cannot create directory '/etc/pki/ca-trust/extracted/openssl': File exists
mkdir: cannot create directory '/etc/pki/ca-trust/extracted/pem': File exists

$ oc get co | grep image-registry
image-registry   4.6.0-0.nightly-2021-04-09-145812   False   True   False   55m

$ oc get pods
NAME                                               READY   STATUS             RESTARTS   AGE
cluster-image-registry-operator-65968cd5c9-n4cph   1/1     Running            1          57m
image-registry-6f4c7b9569-jphzm                    0/1     CrashLoopBackOff   15         59m
image-registry-6f4c7b9569-mgtgs                    0/1     CrashLoopBackOff   15         59m
node-ca-228kt                                      1/1     Running            0          79m
node-ca-4r4sv                                      1/1     Running            0          79m
node-ca-785bv                                      1/1     Running            0          80m
node-ca-hbj6x                                      1/1     Running            0          80m
node-ca-scnrg                                      1/1     Running            0          79m

Version-Release number of selected component (if applicable):
4.6.24 to 4.6.0-0.nightly-2021-04-09-145812

How reproducible:
always

Steps to Reproduce:
1. Upgrade from 4.6.24 to 4.6.0-0.nightly-2021-04-09-145812

Actual results:
image registry is degraded

Expected results:
image registry should be available after upgrade.

Additional info:
*** Bug 1949086 has been marked as a duplicate of this bug. ***
I'm a bit confused. This bug is now a child of bug 1897520, and is backporting a fix that landed in 4.7 in November. How is it only impacting 4.6 now? Has this been an issue with all 4.6->4.6 updates, and we only noticed now? Or is this a corner case that only impacts some fraction of 4.6->4.6 updates? Or...?
Who is impacted?
Anyone running 4.6.24 or a later 4.6 release without the fix, if the registry processes crash or restart for any reason after the pod is created.

What is the impact?
The registry does not survive restarts. Once the process is restarted, it enters a crash loop, and manual intervention is needed.

How involved is remediation?
Deleting the image-registry pods should bring the registry back to the normal state. Updating to a fixed release will also recover the registry.

Is this a regression?
Yes, we regressed in 4.6.24 while fixing bug 1936984.
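To illustrate why a restart inside an existing pod can fail while a freshly created pod does not, here is a minimal sketch. It assumes, based only on the log in the description, that a startup step runs a plain mkdir against a directory that persists across container restarts; the actual entrypoint is not shown in this bug, and the temporary path below is a hypothetical stand-in for /etc/pki/ca-trust/extracted/pem.

```shell
#!/bin/sh
# Hypothetical stand-in for the trust-store directory named in the log.
workdir=$(mktemp -d)
target="$workdir/extracted/pem"
mkdir -p "$workdir/extracted"

# First start in a fresh pod: the directory does not exist yet, so mkdir succeeds.
mkdir "$target"
first_status=$?

# Simulated restart in the same pod: the directory is still there, so a plain
# mkdir fails with "File exists" and returns a non-zero status.
mkdir "$target" 2>/dev/null
restart_status=$?

# Idempotent variant: mkdir -p succeeds whether or not the directory exists.
mkdir -p "$target"
idempotent_status=$?

echo "first=$first_status restart=$restart_status idempotent=$idempotent_status"
rm -rf "$workdir"
```

Under that assumption, deleting the pods helps because the replacement pods start with an empty filesystem, so the first mkdir succeeds again.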
I am not clear on why QE has been able to consistently reproduce this, since comment 4 claims the need for some kind of initial crash inside the pod to get to the broken state. And [1] shows the cluster-bot update from 4.6.24 to 4.6.0-0.nightly-2021-04-09-145812 that I launched today, which succeeded without hitting this issue [2]. I dunno what could be different between QE's updates and the cluster-bot update run... [1]: https://amd64.ocp.releases.ci.openshift.org/releasestream/4-stable/release/4.6.24#upgrades-to [2]: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-origin-installer-launch-gcp/1382007608419815424
This bug can also sometimes be reproduced when upgrading from 4.5.37-x86_64 to 4.6.24-x86_64: https://mastern-jenkins-csb-openshift-qe.apps.ocp4.prod.psi.redhat.com/job/upgrade_CI/13262/console
Verified with several upgrade paths from/to 4.6.0-0.nightly-2021-04-14-161003: https://docs.google.com/spreadsheets/d/1T-tmF1tjNmuNTgMvve9ZkeiUvFLXl1Y3-t55Kfj8egQ/edit#gid=0
(In reply to Wenjing Zheng from comment #10)
> This bug can also be reproduced sometime from 4.5.37-x86_64 to
> 4.6.24-x86_64:
> https://mastern-jenkins-csb-openshift-qe.apps.ocp4.prod.psi.redhat.com/job/
> upgrade_CI/13262/console

Sorry, should be this job:
https://mastern-jenkins-csb-openshift-qe.apps.ocp4.prod.psi.redhat.com/job/upgrade_CI/13164/consoleFull
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6.25 bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2021:1153
Removing UpgradeBlocker, because I don't think we blocked any update recommendations based on this bug.