Bug 1942552
Field | Value
---|---
Summary | OCP update from 4.7.1 to 4.7.2 hangs during openshift-apiserver CO update
Product | OpenShift Container Platform
Component | openshift-apiserver
Status | CLOSED DUPLICATE
Severity | medium
Priority | unspecified
Version | 4.7
Hardware | s390x
OS | Unspecified
Reporter | Romain P <romain.pochard>
Assignee | Standa Laznicka <slaznick>
QA Contact | Xingxing Xia <xxia>
CC | aos-bugs, mfojtik
Target Milestone | ---
Target Release | ---
Doc Type | If docs needed, set a value
Story Points | ---
Last Closed | 2021-04-15 11:45:50 UTC
Type | Bug
Regression | ---
Mount Type | ---
Documentation | ---
Category | ---
oVirt Team | ---
Cloudforms Team | ---
Is this connected to https://bugzilla.redhat.com/show_bug.cgi?id=1942725, i.e. StackRox installation?

(In reply to Stefan Schimanski from comment #1)
> Is this connected to https://bugzilla.redhat.com/show_bug.cgi?id=1942725,
> i.e. StackRox installation?

Hi Stefan,

I didn't install StackRox, but yes, it was related to a wrong SCC in openshift-apiserver, as described in your link.

The failing pod returned this SCC:
openshift.io/scc: logging-elk-filebeat-ds

whereas the working pods returned:
openshift.io/scc: node-exporter

I really don't know why this SCC was applied to this pod. We were able to work around this manually and complete the upgrade to 4.7.2.

Unfortunately, I don't have another cluster on which to repeat the 4.7.1 => 4.7.2 update and try to reproduce the issue.

I can see this has the same symptoms as the referenced StackRox BZ (read-only root FS for the openshift-apiserver pod). Closing as a duplicate, as the fix is the same in both cases.

*** This bug has been marked as a duplicate of bug 1942725 ***
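The SCC comparison described in the comment above (one pod admitted under a different `openshift.io/scc` than its siblings) could be automated along the following lines. This is only a minimal sketch, assuming the pod list has the JSON shape produced by `oc get pods -n openshift-apiserver -o json`; the function names `scc_by_pod` and `find_scc_outliers` are hypothetical, and the sample data is hand-built from the pod names and SCC values quoted in this bug, not captured from a cluster.

```python
import json
from collections import Counter

# Annotation that records which SCC admitted the pod (as quoted in this bug).
SCC_ANNOTATION = "openshift.io/scc"


def scc_by_pod(pod_list):
    """Map pod name -> value of the SCC annotation (None when it is absent)."""
    result = {}
    for pod in pod_list.get("items", []):
        meta = pod.get("metadata", {})
        annotations = meta.get("annotations") or {}
        result[meta.get("name")] = annotations.get(SCC_ANNOTATION)
    return result


def find_scc_outliers(pod_list):
    """Return the pods whose SCC differs from the most common SCC in the list."""
    sccs = scc_by_pod(pod_list)
    if not sccs:
        return {}
    majority, _count = Counter(sccs.values()).most_common(1)[0]
    return {name: scc for name, scc in sccs.items() if scc != majority}


# Illustrative input only: names and SCC values are taken from this report.
SAMPLE_JSON = """
{"items": [
  {"metadata": {"name": "apiserver-575f7f5c59-l4jbr",
                "annotations": {"openshift.io/scc": "node-exporter"}}},
  {"metadata": {"name": "apiserver-575f7f5c59-qz6wc",
                "annotations": {"openshift.io/scc": "node-exporter"}}},
  {"metadata": {"name": "apiserver-658846ccbd-8pqrb",
                "annotations": {"openshift.io/scc": "logging-elk-filebeat-ds"}}}
]}
"""

if __name__ == "__main__":
    for name, scc in find_scc_outliers(json.loads(SAMPLE_JSON)).items():
        print(f"{name}: unexpected SCC {scc}")
```

The majority vote is just a heuristic for spotting the odd pod out; it does not say which SCC is the correct one for openshift-apiserver.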
Created attachment 1765948 [details]
describe pod apiserver-658846ccbd-8pqrb

Description of problem:
I started an OCP update from 4.7.1 to 4.7.2. The update hangs during the openshift-apiserver ClusterOperator upgrade (7 of 31 / 23%):

"APIServerDeploymentDegraded: 1 of 3 requested instances are unavailable for apiserver.openshift-apiserver (2 crashlooping containers are waiting in apiserver-658846ccbd-8pqrb pod)"

One openshift-apiserver pod is in "CrashLoopBackOff". See the output below.

Version-Release number of selected component (if applicable):

How reproducible:
I don't have another cluster on which to run the same test.

Steps to Reproduce:
1. Start OCP update from 4.7.1 to 4.7.2

Actual results:
The update hangs forever. The openshift-apiserver CO cannot be updated properly and is in a degraded state.

Expected results:
Update successful, OCP at version 4.7.2

Additional info:

oc get clusteroperators
NAME                                       VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.7.1     True        False         False      8d
baremetal                                  4.7.1     True        False         False      15d
cloud-credential                           4.7.1     True        False         False      15d
cluster-autoscaler                         4.7.1     True        False         False      15d
config-operator                            4.7.2     True        False         False      15d
console                                    4.7.1     True        False         False      13d
csi-snapshot-controller                    4.7.1     True        False         False      13d
dns                                        4.7.1     True        False         False      15d
etcd                                       4.7.2     True        False         False      15d
image-registry                             4.7.1     True        False         False      13d
ingress                                    4.7.1     True        False         False      15d
insights                                   4.7.1     True        False         False      15d
kube-apiserver                             4.7.2     True        False         False      15d
kube-controller-manager                    4.7.2     True        False         False      15d
kube-scheduler                             4.7.2     True        False         False      15d
kube-storage-version-migrator              4.7.1     True        False         False      13d
machine-api                                4.7.2     True        False         False      15d
machine-approver                           4.7.1     True        False         False      15d
machine-config                             4.7.1     True        False         False      13d
marketplace                                4.7.1     True        False         False      13d
monitoring                                 4.7.1     True        False         False      15d
network                                    4.7.1     True        False         False      15d
node-tuning                                4.7.1     True        False         False      13d
openshift-apiserver                        4.7.2     True        False         True       13d
openshift-controller-manager               4.7.1     True        False         False      13d
openshift-samples                          4.7.1     True        False         False      13d
operator-lifecycle-manager                 4.7.1     True        False         False      15d
operator-lifecycle-manager-catalog         4.7.1     True        False         False      15d
operator-lifecycle-manager-packageserver   4.7.1     True        False         False      13d
service-ca                                 4.7.1     True        False         False      15d
storage                                    4.7.1     True        False         False      15d

oc get pods -n openshift-apiserver
NAME                         READY   STATUS             RESTARTS   AGE
apiserver-575f7f5c59-l4jbr   2/2     Running            0          13d
apiserver-575f7f5c59-qz6wc   2/2     Running            0          13d
apiserver-658846ccbd-8pqrb   0/2     CrashLoopBackOff   106        4h6m

oc logs apiserver-658846ccbd-8pqrb -n openshift-apiserver openshift-apiserver
Copying system trust bundle
cp: cannot remove '/etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem': Read-only file system

I attached the output of "oc describe pod apiserver-658846ccbd-8pqrb -n openshift-apiserver openshift-apiserver".
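For triage, the `oc get clusteroperators` listing above can be filtered for operators that are degraded or have not yet reached the target version. The following is a rough sketch, assuming the six-column text layout shown in this report (NAME, VERSION, AVAILABLE, PROGRESSING, DEGRADED, SINCE) with no whitespace inside any field; the helper names `parse_clusteroperators`, `degraded_operators`, and `operators_not_at` are hypothetical, and the sample input is a three-row excerpt of the listing in this bug.

```python
def parse_clusteroperators(text):
    """Parse whitespace-separated `oc get clusteroperators` output into dicts.

    Assumes the first non-empty line is the header and that no field
    contains spaces (true for the listing in this report).
    """
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


def degraded_operators(rows):
    """Names of operators whose DEGRADED column is True."""
    return [row["NAME"] for row in rows if row.get("DEGRADED") == "True"]


def operators_not_at(rows, target_version):
    """Names of operators that have not yet reached target_version."""
    return [row["NAME"] for row in rows if row.get("VERSION") != target_version]


# Excerpt of the listing in this report, used purely as illustrative input.
SAMPLE_OUTPUT = """\
NAME                  VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE
dns                   4.7.1     True        False         False      15d
etcd                  4.7.2     True        False         False      15d
openshift-apiserver   4.7.2     True        False         True       13d
"""

if __name__ == "__main__":
    rows = parse_clusteroperators(SAMPLE_OUTPUT)
    print("degraded:", degraded_operators(rows))
    print("pending 4.7.2:", operators_not_at(rows, "4.7.2"))
```

Parsing the human-readable table is fragile; if JSON output is available, reading the operator conditions from `oc get clusteroperators -o json` would likely be more robust, though the field layout there is not shown in this report.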