Bug 1813512
Summary: | 4.3 -> 4.4 update hangs on ClusterOperatorDegraded, NodeInstallerDegraded, RevisionControllerDegraded, TargetConfigControllerDegraded | |||
---|---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | W. Trevor King <wking> | |
Component: | kube-controller-manager | Assignee: | Tomáš Nožička <tnozicka> | |
Status: | CLOSED ERRATA | QA Contact: | zhou ying <yinzhou> | |
Severity: | urgent | Docs Contact: | ||
Priority: | urgent | |||
Version: | 4.4 | CC: | aos-bugs, maszulik, mfojtik, sdodson, tnozicka, vrutkovs | |
Target Milestone: | --- | Keywords: | Upgrades | |
Target Release: | 4.4.0 | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | Doc Type: | Bug Fix | ||
Doc Text: |
Cause:
Upgrade race when snaphotting KCM localhost-recovery-client-token
Consequence:
Failed upgrade
Fix:
Fixed secret snaphotting to wait for token to be injected before making a snaphot
Result: Upgrade works
|
Story Points: | --- | |
Clone Of: | ||||
: | 1819256 (view as bug list) | Environment: | ||
Last Closed: | 2020-05-04 11:46:21 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | 1819256 | |||
Bug Blocks: |
Description
W. Trevor King
2020-03-14 04:58:22 UTC
The number of installer revisions is outstanding. Mar 13 22:16:57.566 E clusterversion/version changed Failing to True: ClusterOperatorDegraded: Cluster operator kube-controller-manager is reporting a failure: NodeInstallerDegraded: 1 nodes are failing on revision 111:\nNodeInstallerDegraded: \nTargetConfigControllerDegraded: "serviceaccount/localhost-recovery-client": secret openshift-kube-controller-manager/localhost-recovery-client-token hasn't been populated with SA token yet: missing SA UID Mar 13 22:26:12.601 E clusterversion/version changed Failing to True: ClusterOperatorDegraded: Cluster operator kube-controller-manager is reporting a failure: NodeInstallerDegraded: 1 nodes are failing on revision 155:\nNodeInstallerDegraded: \nTargetConfigControllerDegraded: "serviceaccount/localhost-recovery-client": secret openshift-kube-controller-manager/localhost-recovery-client-token hasn't been populated with SA token yet: missing SA UID Mar 13 22:44:12.670 E clusterversion/version changed Failing to True: ClusterOperatorDegraded: Cluster operator kube-controller-manager is reporting a failure: NodeInstallerDegraded: 1 nodes are failing on revision 250:\nNodeInstallerDegraded: \nTargetConfigControllerDegraded: "serviceaccount/localhost-recovery-client": secret openshift-kube-controller-manager/localhost-recovery-client-token hasn't been populated with SA token yet: missing SA UID Mar 13 22:52:57.703 E clusterversion/version changed Failing to True: ClusterOperatorDegraded: Cluster operator kube-controller-manager is reporting a failure: NodeInstallerDegraded: 1 nodes are failing on revision 298:\nNodeInstallerDegraded: \nRevisionControllerDegraded: conflicting latestAvailableRevision 303\nTargetConfigControllerDegraded: "serviceaccount/localhost-recovery-client": secret openshift-kube-controller-manager/localhost-recovery-client-token hasn't been populated with SA token yet: missing SA UID Also this looks pretty fatal: Mar 13 23:05:31.216 W ns/openshift-kube-controller-manager pod/kube-controller-manager-ci-op-vf25zxdf-7f637-cb7p6-master-0 Unable to attach or mount volumes: unmounted volumes=[resource-dir cert-dir], unattached volumes=[resource-dir cert-dir]: timed out waiting for the condition I didn't find the exact cause yet but I'll continue in the morning. Moving it to 4.4.0 as it looks fairly bad so far. ``` { "apiVersion": "v1", "kind": "Secret", "metadata": { "annotations": { "kubernetes.io/service-account.name": "localhost-recovery-client" }, "creationTimestamp": "2020-03-13T21:57:13Z", "name": "localhost-recovery-client-token", "namespace": "openshift-kube-controller-manager", "resourceVersion": "24994", "selfLink": "/api/v1/namespaces/openshift-kube-controller-manager/secrets/localhost-recovery-client-token", "uid": "853d9788-b36f-459c-b26b-ee8e5e70b45f" }, "type": "kubernetes.io/service-account-token" } ``` is missing the token that should have been injected by kube-controller-manager. I was considering it could be fighting with apply but the kube-controller-manager-operator logs are missing SecretUpdated events. Unfortunately, the audit logs that could prove either way are corrupted and contain no mention of secret openshift-kube-controller-manager/localhost-recovery-client-token, not even the create audit event. ``` { "apiVersion": "v1", "data": { "ca.crt": "LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURNakNDQWhxZ0F3SUJBZ0lJSCs1L0tCZ21DdWt3RFFZSktvWklodmNOQVFFTEJRQXdOekVTTUJBR0ExVUUKQ3hNSmIzQmxibk5vYVdaME1TRXdId1lEVlFRREV4aHJkV0psTFdGd2FYTmxjblpsY2kxc1lpMXphV2R1WlhJdwpIaGNOTWpBd016RXpNakExT0RRMVdoY05NekF3TXpFeE1qQTFPRFExV2pBM01SSXdFQVlEVlFRTEV3bHZjR1Z1CmMyaHBablF4SVRBZkJnTlZCQU1UR0d0MVltVXRZWEJwYzJWeWRtVnlMV3hpTFhOcFoyNWxjakNDQVNJd0RRWUoKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBS2VlTWZTUC9jUld4N1hXTGhqZ3hKV1F0WTNlVzY3SgptL1FBSHY1SGRtaGttdlZRMHplWitIRHVtSktWYTFjNmpyL240NXh6OUs2R2VhSFZoVTlxVUZubzhDZXFpcStVCkZ3dzNXNERNaWJoRWpVUmVQbWNGZys1TnZxcjJLaVQ0U3J0SUZtMTRmMWRINzcyNElaUUZ6ckxjRUF2QUVCTnUKK05ieG9MbnU2WVNhR1BYZTFDbTNiNkViUTdLeTMraGczZGxMcVRTTzlQNFhPRmtrdDdKRHhsdkd5K3Vydnd2OApNcGlKa1NMSDVITEppY2VFMHo2czlZbnAvd0lHNTFqZERUKzEzRDZ3VVZKYUEwd0gzSXdINy9lejNEZTZEZ0JRCldLdlNtMXJtNkl0aTJublN5ZE51MXdUNDVrSCtscHhMdEJ3NldWL1BMVk5kR0Frb3d3dk84MTBDQXdFQUFhTkMKTUVBd0RnWURWUjBQQVFIL0JBUURBZ0trTUE4R0ExVWRFd0VCL3dRRk1BTUJBZjh3SFFZRFZSME9CQllFRk53awpwKzhrZjJtVTBaU00wRFZYaHkveFc3Mi9NQTBHQ1NxR1NJYjNEUUVCQ3dVQUE0SUJBUUNmQ2JiakoxamtjU2xECjFqVnN5VE5EZmg1a3RLV0t2TENZNHJ1ZnhhN3ZIaE9waVk0UHIrdnB2akRDUzlrZWkrU3VJWjA5YU5sRkIvOFcKM1krckloVUZmME1tOVlqTTgwRmFWR1ZRYUFTNUhJaGw2aUFGYXNIK24yRUkycWNNdldwR3JIcHhudTFMdkJkeApHTTh6elNSeFFycURzaWRoRFJZaVlxb1A2MW9PL01Qalptb0x5UkZRdmpiVTBPaEZCaUNmUWFYUi9VZEJqQWJsCmNQSnFmdlpxenZDRDBmOGN1dncyR1M5endDWXNtR2pkWnZxNi9nQnBNY3FSUERCZmtNQjNUdlpPdzQ5T3VnT1kKNHZza3ZlcHR5eEdMZWFoRVdyb3djOStlWGdhUFQrZjIzU3pJZzNqRklIMldveGFSQ2ZhekhMRkw5WmVSQ2haVgpVN0ZldTFSRAotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCi0tLS0tQkVHSU4gQ0VSVElGSUNBVEUtLS0tLQpNSUlEUURDQ0FpaWdBd0lCQWdJSWRjQkRJZ0RXMFp3d0RRWUpLb1pJaHZjTkFRRUxCUUF3UGpFU01CQUdBMVVFCkN4TUpiM0JsYm5Ob2FXWjBNU2d3SmdZRFZRUURFeDlyZFdKbExXRndhWE5sY25abGNpMXNiMk5oYkdodmMzUXQKYzJsbmJtVnlNQjRYRFRJd01ETXhNekl3TlRnME5Gb1hEVE13TURNeE1USXdOVGcwTkZvd1BqRVNNQkFHQTFVRQpDeE1KYjNCbGJuTm9hV1owTVNnd0pnWURWUVFERXg5cmRXSmxMV0Z3YVhObGNuWmxjaTFzYjJOaGJHaHZjM1F0CmMybG5ibVZ5TUlJQklqQU5CZ2txaGtpRzl3MEJBUUVGQUFPQ0FROEFNSUlCQ2dLQ0FRRUEzY1ZWQWlDbTN5T2gKN3ZJSE9DeTRFSlpPUDVlbHBIVkJDNEdwdWVRMEVlek9HbHV6Mnk4TmphMHIxZWxRZUJGeFpzK3dBODZwSEEwSQphMUJwdXVWRVNoNHN4UytkSXhuSmtDeXhrTVBhU1NKc3VvUlphUk11ZkFQMHRweUMxb2pqYnhWOXFFQUREdlVtCmRjbVlHbVd1S292WC9kR0Z5VEgxWG1TSzBZb3RZamNuWUhjWnVlOTVrRHloOUJUQXUvbktSVjZoZWMxbVV4eFQKUlRSYnlJc3l6MEJFZkQwcmtaM2QxOGU4MkRGWjNYQk8vNCtFUi94UERBcm91czJpNHBtSHA1Vm10OWVPTE5wUApnaUpxMFZ1ZHcrMTJKUkl4QXNGOVFvemVId2s0K0JlNVZieVdGWjRNSkdiZHlEd09hOUJVM3FxUEJsSE9TcXJFCmZUbHRGVVliTXdJREFRQUJvMEl3UURBT0JnTlZIUThCQWY4RUJBTUNBcVF3RHdZRFZSMFRBUUgvQkFVd0F3RUIKL3pBZEJnTlZIUTRFRmdRVTJIbWgzUWx5Z1RLRmlza3BXNmZVMTlUYkN3VXdEUVlKS29aSWh2Y05BUUVMQlFBRApnZ0VCQUQ4WEZOZlRtQmt0Ni9ET0hWYUUvRW0wY3BCR3NqMmdpRExGaS83ZlRZVWltSXZ4NXJFUzFwQ2YrZHNWCmxYQ2NaUEt0dzR1aGJOdnlGM04yRUxqdWRWdWlVaEd4cVFvamZMc25WQi8zbnVqT2VhNWo1a1dvKzc4clc1RE0KNXBnY3hsTVM4MElTcGdVV1BKN1ovSzhuV0hVT3NoR2I5bjNwTEgwaWxLOGR4WlRsa0hpekx5WVVuWE80N0FLTQpEMi82eXF6ZWZjSHkyVU84ZnluNkVRRDF2NWtDbGFHUHNIQ2E0WXNRL3ltYUd2UGtpL2RTaG1VN05mT1prc1FkClVUTGxyclFuTGJuT3doN3Y3dGZZZWFPMU9ITDVrZ0VlVFZqVWFBYklDZTE1eEJpdVJ3RnM1NTd6U045bTE1M0IKakQ0VW03bk9FckUvNHNCNUJUdk5DMHdNVFFFPQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCi0tLS0tQkVHSU4gQ0VSVElGSUNBVEUtLS0tLQpNSUlEVERDQ0FqU2dBd0lCQWdJSUpMQmJUY2lUeWxnd0RRWUpLb1pJaHZjTkFRRUxCUUF3UkRFU01CQUdBMVVFCkN4TUpiM0JsYm5Ob2FXWjBNUzR3TEFZRFZRUURFeVZyZFdKbExXRndhWE5sY25abGNpMXpaWEoyYVdObExXNWwKZEhkdmNtc3RjMmxuYm1WeU1CNFhEVEl3TURNeE16SXdOVGcwTlZvWERUTXdNRE14TVRJd05UZzBOVm93UkRFUwpNQkFHQTFVRUN4TUpiM0JsYm5Ob2FXWjBNUzR3TEFZRFZRUURFeVZyZFdKbExXRndhWE5sY25abGNpMXpaWEoyCmFXTmxMVzVsZEhkdmNtc3RjMmxuYm1WeU1JSUJJakFOQmdrcWhraUc5dzBCQVFFRkFBT0NBUThBTUlJQkNnS0MKQVFFQTJLcFFXemRnS0lvUEtiR2hVWDN4TkVtdkEwN1J6UFJDWUk1dFViUTJhZ1ZDY2VqaHg1UkVSL1dyTTJGVQp1S0FIY0M2cnVzQnFVUnhNVnJQQ05lKzJ2S3BJV0h4ZnZKa3JOcXkxb2FmdDJWelpqd3JLQk01MHZqQUsxdDF4ClFpNU5aZTFQNHgrTlpRRDdQYnpaMFlvb0l6eEorVzQveTljdFM2Q1hWc0tmaVh5SUw0Z1o1Q0NudHE1bnZiZnQKcHNPTUdiWEV6dTdiT3pXN29wRGdSUHAvai94TzYyTm4wRWs0emNVZ29FZDdpMHRwd3l0ZFJRaXMzbFpJbHJwRApnRjVqK21SSjNBR1REYjY3bmNnVS9nUFFKak5sV1ZOS0drSVdaOVQzRUx2RmgxUHg2MWZyZGdJVXQ1Z2ExVkVFClpGM3NCL0VaTEpMbGRRNis4VVBKNGpucmxRSURBUUFCbzBJd1FEQU9CZ05WSFE4QkFmOEVCQU1DQXFRd0R3WUQKVlIwVEFRSC9CQVV3QXdFQi96QWRCZ05WSFE0RUZnUVVWcWJyTTJkdDJDZlA0bVdnZkt4TWNpZHdpVkV3RFFZSgpLb1pJaHZjTkFRRUxCUUFEZ2dFQkFMcnJaU2FpYW9xRXhBRWl2d1AzMWlQaXVVQVlFQWZKcnlIcnU4NkM3TFpHCi9uMWx1bTQxWTAvek9Pc3VXdTJhWldTM0kzU3BlZUxpK2dLZUJIVlNMMjMveHZCbWgwNVIrSHhLS2Izd25jYkQKYkNDSElxSkd4NUNZMXhRY0xyNVpQcTYxRUx3Uk1Zam9OTG9pdUtTc2x5Qm4zZFovRUtjY2VmY20zY2E2d2ZkUwpma1FlRmY1QkhuM3E5clNNZ093U1lBdkM2SU1HYjRjSTRDckZlNEJQcWVIZnhudGJRdGFXd1JPSytGUTF3alNRClVZdnlSNjVZT2dxaDY2bzFRSVFtT2tJYjh3b0dLTEd5NFdQYnE5eHJ6eUROMVNxNXN1dUZkMm54TUh1dStWajQKK3ZzWC8xMFFMdWxpN214ZGdvMHdDc0M5a09iR0RmYkJRT0t4S2pSVlgvND0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQotLS0tLUJFR0lOIENFUlRJRklDQVRFLS0tLS0KTUlJRGtEQ0NBbmlnQXdJQkFnSUJBVEFOQmdrcWhraUc5dzBCQVFzRkFEQlpNVmN3VlFZRFZRUURERTV2Y0dWdQpjMmhwWm5RdGEzVmlaUzFoY0dselpYSjJaWEl0YjNCbGNtRjBiM0pmYkc5allXeG9iM04wTFhKbFkyOTJaWEo1CkxYTmxjblpwYm1jdGMybG5ibVZ5UURFMU9EUXhNelEwTnpNd0hoY05NakF3TXpFek1qRXlNVEV6V2hjTk16QXcKTXpFeE1qRXlNVEUwV2pCWk1WY3dWUVlEVlFRRERFNXZjR1Z1YzJocFpuUXRhM1ZpWlMxaGNHbHpaWEoyWlhJdApiM0JsY21GMGIzSmZiRzlqWVd4b2IzTjBMWEpsWTI5MlpYSjVMWE5sY25acGJtY3RjMmxuYm1WeVFERTFPRFF4Ck16UTBOek13Z2dFaU1BMEdDU3FHU0liM0RRRUJBUVVBQTRJQkR3QXdnZ0VLQW9JQkFRREd3N0JVR3NqQ1Q3dVQKSGdzaHoyLzRuNUpEbXVKNmVOQzdCV3RIcDZDUEZwSElXNUkzY3Evc1p5MG4rQVNhZXdycTNHelc2WmR0U2ZIWApyaElLZnBhOHEwL1dmcVJqYVVxbENpbEFteU9XY25IYW5pQnJUMzRxY3pONFh2cGk5VkhjSGFxTnM5K0JvVGNaCmJkZ1prT0hqN3VuTmtMa1RoRFNIcXR5TDlyMjc2NFN4ekxiaGZlaXpIbHM3R3VjN0xyVm1KUUN3VUE2SHhISUoKUWgrOXZPZHJ1TFBhWHM0dCtkL1hJWURIbXlzem0rUzZyYnN0SzFGa1pKUzE0TkYwVG9odllYVUdITGcxUEF2Sgo2TUt0N3htOCtNaG53MjFvYjYyQ1ZENGRGM1ozV0JvQmlHY1haalYyNHBXRWw4d0VSUW96K3diYWNEQVVDY1hkCndaNUhTOWRyQWdNQkFBR2pZekJoTUE0R0ExVWREd0VCL3dRRUF3SUNwREFQQmdOVkhSTUJBZjhFQlRBREFRSC8KTUIwR0ExVWREZ1FXQkJSbWdkVXlpTzc5RzA4LytVdk82QTNKUEh6UnVqQWZCZ05WSFNNRUdEQVdnQlJtZ2RVeQppTzc5RzA4LytVdk82QTNKUEh6UnVqQU5CZ2txaGtpRzl3MEJBUXNGQUFPQ0FRRUFyQkRxWm9hY3FNSHpvSDRKCnVZSHovRjZmNVdsVUlNVU9NZG9ta1VmaW1teUQxNFdRWkR0T0d1cUpWMGFLVGo2QllKSmUwMC9jOGNFRnZrcy8KSVVYSklac2taWjM5ZWlCb3JKcXNqcnYzN2EvSzNYUDVQbmRZWWRNazBId1M2SGsyZmVNSFN1aGV3QzUwaFVseAovaHU0REI3Sk1ieW0zVldxK3oyaGRtSHUyUmZZQ0EvYldnT005aVhQUHJ5ZDdia3A5SmVPeGFRZ2pDdzVzem1NClRQNExMWHRpWmhUa2pIdlBsMG90YUg5VGMrMFFIS3FmYm5EcVBkVnk5Yy9kV3ZxaWw5cUUrL3hGR2tUVFg2clcKMzB0WHhmbklDTlU2MC9zMEk1ckdRdytnVkVJckd5bjhUVEJLS29xU3hmOHhvVHJxbmtnNTZhNDFsV00zTU9Bdwp0RmN6OUE9PQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCi0tLS0tQkVHSU4gQ0VSVElGSUNBVEUtLS0tLQpNSUlDN1RDQ0FkV2dBd0lCQWdJQkFUQU5CZ2txaGtpRzl3MEJBUXNGQURBbU1TUXdJZ1lEVlFRRERCdHBibWR5ClpYTnpMVzl3WlhKaGRHOXlRREUxT0RReE16UTNOVEl3SGhjTk1qQXdNekV6TWpFeU5UVXlXaGNOTWpJd016RXoKTWpFeU5UVXpXakFtTVNRd0lnWURWUVFEREJ0cGJtZHlaWE56TFc5d1pYSmhkRzl5UURFMU9EUXhNelEzTlRJdwpnZ0VpTUEwR0NTcUdTSWIzRFFFQkFRVUFBNElCRHdBd2dnRUtBb0lCQVFDN1FGVGkwRUhKNEkvcEZHazVZeFlMCllSUVRUUEYxcHBjRllUMEc4WnlGTHA5QzdVcVBSc091VW5hWDVVMUdWUXI5VmUySXJYaVBxaXhHK0ZabC9zeHcKUjlkdkh5Snpqd21KMEg0NnR3TmNDY0FodTRzcTV5dnhjZm9Hbit1UEpWTGFEVHJEcEpiaFlYUTBoeDQ1Z3lUUQpPS2MralhnTTJ5Sys1TzlwSWNSZlB6Ly9RWTNSRVJPTlZpY0d3eW52UXZmdXhuMGJsZDIzZ0YxQkl1bG84aFhLCm1sWDNyaG11em1RUFNoL2s0YXdyY09qN2hEZWNteitSRW5La0VUQTIxclhZUnJSTmFxZGZEQ2MyYTZKdmgvQUEKNkpZVE5VMGh4OWdidFNnYWJSMHQ5QnJxZHFmT2ZVdlV4N0ZjcDY2VDV2MHlvcndTaTNHZTlmVUE5VW1EQkVXUApBZ01CQUFHakpqQWtNQTRHQTFVZER3RUIvd1FFQXdJQ3BEQVNCZ05WSFJNQkFmOEVDREFHQVFIL0FnRUFNQTBHCkNTcUdTSWIzRFFFQkN3VUFBNElCQVFCY0NRc3E3ekgrbmpmUDFPTWI1Ym9haUtrWmRrMG40dU5wZWhlSEtuenoKWWhSbGUzTm4rcG9qb0xIU1NXVUJoT3Znb3JmZFMxN2RTQ1dkbU5URFB3V3V3blBJTmtyMWRUcmx5emZLZUlobAp0TjYxQS9haHRJNDNkR2hpUzBnZCtQTnpOQ0RnVjFxNzRRU0ZIRzEzdUc5dXV5dEJQL1dYa0hoUjJFaE54SzRxClZGU0pJNVVQOHdaMEZ2UUZnZ05ZMWk3S3lNTUZRTVV6UkhYUUt1MThIbmFiRTdjVUV1clArbUgzM054Q1VFRUgKU21BY29reU9oU013WnpRdjgxbGxUNGd5OHdUbmVzUkN4aXpmNGJ3MnJ5QmQyTnExZGdjZkJXclF1SjBiRy9MMgplRjVWRTlpNzEybmRBNXJDZDZ2MkU0UjJLNDNuQ3V3R0cvSkdUZGFUYkNQRAotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg==", "namespace": "", "service-ca.crt": "LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURNakNDQWhxZ0F3SUJBZ0lJSCs1L0tCZ21DdWt3RFFZSktvWklodmNOQVFFTEJRQXdOekVTTUJBR0ExVUUKQ3hNSmIzQmxibk5vYVdaME1TRXdId1lEVlFRREV4aHJkV0psTFdGd2FYTmxjblpsY2kxc1lpMXphV2R1WlhJdwpIaGNOTWpBd016RXpNakExT0RRMVdoY05NekF3TXpFeE1qQTFPRFExV2pBM01SSXdFQVlEVlFRTEV3bHZjR1Z1CmMyaHBablF4SVRBZkJnTlZCQU1UR0d0MVltVXRZWEJwYzJWeWRtVnlMV3hpTFhOcFoyNWxjakNDQVNJd0RRWUoKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBS2VlTWZTUC9jUld4N1hXTGhqZ3hKV1F0WTNlVzY3SgptL1FBSHY1SGRtaGttdlZRMHplWitIRHVtSktWYTFjNmpyL240NXh6OUs2R2VhSFZoVTlxVUZubzhDZXFpcStVCkZ3dzNXNERNaWJoRWpVUmVQbWNGZys1TnZxcjJLaVQ0U3J0SUZtMTRmMWRINzcyNElaUUZ6ckxjRUF2QUVCTnUKK05ieG9MbnU2WVNhR1BYZTFDbTNiNkViUTdLeTMraGczZGxMcVRTTzlQNFhPRmtrdDdKRHhsdkd5K3Vydnd2OApNcGlKa1NMSDVITEppY2VFMHo2czlZbnAvd0lHNTFqZERUKzEzRDZ3VVZKYUEwd0gzSXdINy9lejNEZTZEZ0JRCldLdlNtMXJtNkl0aTJublN5ZE51MXdUNDVrSCtscHhMdEJ3NldWL1BMVk5kR0Frb3d3dk84MTBDQXdFQUFhTkMKTUVBd0RnWURWUjBQQVFIL0JBUURBZ0trTUE4R0ExVWRFd0VCL3dRRk1BTUJBZjh3SFFZRFZSME9CQllFRk53awpwKzhrZjJtVTBaU00wRFZYaHkveFc3Mi9NQTBHQ1NxR1NJYjNEUUVCQ3dVQUE0SUJBUUNmQ2JiakoxamtjU2xECjFqVnN5VE5EZmg1a3RLV0t2TENZNHJ1ZnhhN3ZIaE9waVk0UHIrdnB2akRDUzlrZWkrU3VJWjA5YU5sRkIvOFcKM1krckloVUZmME1tOVlqTTgwRmFWR1ZRYUFTNUhJaGw2aUFGYXNIK24yRUkycWNNdldwR3JIcHhudTFMdkJkeApHTTh6elNSeFFycURzaWRoRFJZaVlxb1A2MW9PL01Qalptb0x5UkZRdmpiVTBPaEZCaUNmUWFYUi9VZEJqQWJsCmNQSnFmdlpxenZDRDBmOGN1dncyR1M5endDWXNtR2pkWnZxNi9nQnBNY3FSUERCZmtNQjNUdlpPdzQ5T3VnT1kKNHZza3ZlcHR5eEdMZWFoRVdyb3djOStlWGdhUFQrZjIzU3pJZzNqRklIMldveGFSQ2ZhekhMRkw5WmVSQ2haVgpVN0ZldTFSRAotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCi0tLS0tQkVHSU4gQ0VSVElGSUNBVEUtLS0tLQpNSUlEUURDQ0FpaWdBd0lCQWdJSWRjQkRJZ0RXMFp3d0RRWUpLb1pJaHZjTkFRRUxCUUF3UGpFU01CQUdBMVVFCkN4TUpiM0JsYm5Ob2FXWjBNU2d3SmdZRFZRUURFeDlyZFdKbExXRndhWE5sY25abGNpMXNiMk5oYkdodmMzUXQKYzJsbmJtVnlNQjRYRFRJd01ETXhNekl3TlRnME5Gb1hEVE13TURNeE1USXdOVGcwTkZvd1BqRVNNQkFHQTFVRQpDeE1KYjNCbGJuTm9hV1owTVNnd0pnWURWUVFERXg5cmRXSmxMV0Z3YVhObGNuWmxjaTFzYjJOaGJHaHZjM1F0CmMybG5ibVZ5TUlJQklqQU5CZ2txaGtpRzl3MEJBUUVGQUFPQ0FROEFNSUlCQ2dLQ0FRRUEzY1ZWQWlDbTN5T2gKN3ZJSE9DeTRFSlpPUDVlbHBIVkJDNEdwdWVRMEVlek9HbHV6Mnk4TmphMHIxZWxRZUJGeFpzK3dBODZwSEEwSQphMUJwdXVWRVNoNHN4UytkSXhuSmtDeXhrTVBhU1NKc3VvUlphUk11ZkFQMHRweUMxb2pqYnhWOXFFQUREdlVtCmRjbVlHbVd1S292WC9kR0Z5VEgxWG1TSzBZb3RZamNuWUhjWnVlOTVrRHloOUJUQXUvbktSVjZoZWMxbVV4eFQKUlRSYnlJc3l6MEJFZkQwcmtaM2QxOGU4MkRGWjNYQk8vNCtFUi94UERBcm91czJpNHBtSHA1Vm10OWVPTE5wUApnaUpxMFZ1ZHcrMTJKUkl4QXNGOVFvemVId2s0K0JlNVZieVdGWjRNSkdiZHlEd09hOUJVM3FxUEJsSE9TcXJFCmZUbHRGVVliTXdJREFRQUJvMEl3UURBT0JnTlZIUThCQWY4RUJBTUNBcVF3RHdZRFZSMFRBUUgvQkFVd0F3RUIKL3pBZEJnTlZIUTRFRmdRVTJIbWgzUWx5Z1RLRmlza3BXNmZVMTlUYkN3VXdEUVlKS29aSWh2Y05BUUVMQlFBRApnZ0VCQUQ4WEZOZlRtQmt0Ni9ET0hWYUUvRW0wY3BCR3NqMmdpRExGaS83ZlRZVWltSXZ4NXJFUzFwQ2YrZHNWCmxYQ2NaUEt0dzR1aGJOdnlGM04yRUxqdWRWdWlVaEd4cVFvamZMc25WQi8zbnVqT2VhNWo1a1dvKzc4clc1RE0KNXBnY3hsTVM4MElTcGdVV1BKN1ovSzhuV0hVT3NoR2I5bjNwTEgwaWxLOGR4WlRsa0hpekx5WVVuWE80N0FLTQpEMi82eXF6ZWZjSHkyVU84ZnluNkVRRDF2NWtDbGFHUHNIQ2E0WXNRL3ltYUd2UGtpL2RTaG1VN05mT1prc1FkClVUTGxyclFuTGJuT3doN3Y3dGZZZWFPMU9ITDVrZ0VlVFZqVWFBYklDZTE1eEJpdVJ3RnM1NTd6U045bTE1M0IKakQ0VW03bk9FckUvNHNCNUJUdk5DMHdNVFFFPQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCi0tLS0tQkVHSU4gQ0VSVElGSUNBVEUtLS0tLQpNSUlEVERDQ0FqU2dBd0lCQWdJSUpMQmJUY2lUeWxnd0RRWUpLb1pJaHZjTkFRRUxCUUF3UkRFU01CQUdBMVVFCkN4TUpiM0JsYm5Ob2FXWjBNUzR3TEFZRFZRUURFeVZyZFdKbExXRndhWE5sY25abGNpMXpaWEoyYVdObExXNWwKZEhkdmNtc3RjMmxuYm1WeU1CNFhEVEl3TURNeE16SXdOVGcwTlZvWERUTXdNRE14TVRJd05UZzBOVm93UkRFUwpNQkFHQTFVRUN4TUpiM0JsYm5Ob2FXWjBNUzR3TEFZRFZRUURFeVZyZFdKbExXRndhWE5sY25abGNpMXpaWEoyCmFXTmxMVzVsZEhkdmNtc3RjMmxuYm1WeU1JSUJJakFOQmdrcWhraUc5dzBCQVFFRkFBT0NBUThBTUlJQkNnS0MKQVFFQTJLcFFXemRnS0lvUEtiR2hVWDN4TkVtdkEwN1J6UFJDWUk1dFViUTJhZ1ZDY2VqaHg1UkVSL1dyTTJGVQp1S0FIY0M2cnVzQnFVUnhNVnJQQ05lKzJ2S3BJV0h4ZnZKa3JOcXkxb2FmdDJWelpqd3JLQk01MHZqQUsxdDF4ClFpNU5aZTFQNHgrTlpRRDdQYnpaMFlvb0l6eEorVzQveTljdFM2Q1hWc0tmaVh5SUw0Z1o1Q0NudHE1bnZiZnQKcHNPTUdiWEV6dTdiT3pXN29wRGdSUHAvai94TzYyTm4wRWs0emNVZ29FZDdpMHRwd3l0ZFJRaXMzbFpJbHJwRApnRjVqK21SSjNBR1REYjY3bmNnVS9nUFFKak5sV1ZOS0drSVdaOVQzRUx2RmgxUHg2MWZyZGdJVXQ1Z2ExVkVFClpGM3NCL0VaTEpMbGRRNis4VVBKNGpucmxRSURBUUFCbzBJd1FEQU9CZ05WSFE4QkFmOEVCQU1DQXFRd0R3WUQKVlIwVEFRSC9CQVV3QXdFQi96QWRCZ05WSFE0RUZnUVVWcWJyTTJkdDJDZlA0bVdnZkt4TWNpZHdpVkV3RFFZSgpLb1pJaHZjTkFRRUxCUUFEZ2dFQkFMcnJaU2FpYW9xRXhBRWl2d1AzMWlQaXVVQVlFQWZKcnlIcnU4NkM3TFpHCi9uMWx1bTQxWTAvek9Pc3VXdTJhWldTM0kzU3BlZUxpK2dLZUJIVlNMMjMveHZCbWgwNVIrSHhLS2Izd25jYkQKYkNDSElxSkd4NUNZMXhRY0xyNVpQcTYxRUx3Uk1Zam9OTG9pdUtTc2x5Qm4zZFovRUtjY2VmY20zY2E2d2ZkUwpma1FlRmY1QkhuM3E5clNNZ093U1lBdkM2SU1HYjRjSTRDckZlNEJQcWVIZnhudGJRdGFXd1JPSytGUTF3alNRClVZdnlSNjVZT2dxaDY2bzFRSVFtT2tJYjh3b0dLTEd5NFdQYnE5eHJ6eUROMVNxNXN1dUZkMm54TUh1dStWajQKK3ZzWC8xMFFMdWxpN214ZGdvMHdDc0M5a09iR0RmYkJRT0t4S2pSVlgvND0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQotLS0tLUJFR0lOIENFUlRJRklDQVRFLS0tLS0KTUlJRGtEQ0NBbmlnQXdJQkFnSUJBVEFOQmdrcWhraUc5dzBCQVFzRkFEQlpNVmN3VlFZRFZRUURERTV2Y0dWdQpjMmhwWm5RdGEzVmlaUzFoY0dselpYSjJaWEl0YjNCbGNtRjBiM0pmYkc5allXeG9iM04wTFhKbFkyOTJaWEo1CkxYTmxjblpwYm1jdGMybG5ibVZ5UURFMU9EUXhNelEwTnpNd0hoY05NakF3TXpFek1qRXlNVEV6V2hjTk16QXcKTXpFeE1qRXlNVEUwV2pCWk1WY3dWUVlEVlFRRERFNXZjR1Z1YzJocFpuUXRhM1ZpWlMxaGNHbHpaWEoyWlhJdApiM0JsY21GMGIzSmZiRzlqWVd4b2IzTjBMWEpsWTI5MlpYSjVMWE5sY25acGJtY3RjMmxuYm1WeVFERTFPRFF4Ck16UTBOek13Z2dFaU1BMEdDU3FHU0liM0RRRUJBUVVBQTRJQkR3QXdnZ0VLQW9JQkFRREd3N0JVR3NqQ1Q3dVQKSGdzaHoyLzRuNUpEbXVKNmVOQzdCV3RIcDZDUEZwSElXNUkzY3Evc1p5MG4rQVNhZXdycTNHelc2WmR0U2ZIWApyaElLZnBhOHEwL1dmcVJqYVVxbENpbEFteU9XY25IYW5pQnJUMzRxY3pONFh2cGk5VkhjSGFxTnM5K0JvVGNaCmJkZ1prT0hqN3VuTmtMa1RoRFNIcXR5TDlyMjc2NFN4ekxiaGZlaXpIbHM3R3VjN0xyVm1KUUN3VUE2SHhISUoKUWgrOXZPZHJ1TFBhWHM0dCtkL1hJWURIbXlzem0rUzZyYnN0SzFGa1pKUzE0TkYwVG9odllYVUdITGcxUEF2Sgo2TUt0N3htOCtNaG53MjFvYjYyQ1ZENGRGM1ozV0JvQmlHY1haalYyNHBXRWw4d0VSUW96K3diYWNEQVVDY1hkCndaNUhTOWRyQWdNQkFBR2pZekJoTUE0R0ExVWREd0VCL3dRRUF3SUNwREFQQmdOVkhSTUJBZjhFQlRBREFRSC8KTUIwR0ExVWREZ1FXQkJSbWdkVXlpTzc5RzA4LytVdk82QTNKUEh6UnVqQWZCZ05WSFNNRUdEQVdnQlJtZ2RVeQppTzc5RzA4LytVdk82QTNKUEh6UnVqQU5CZ2txaGtpRzl3MEJBUXNGQUFPQ0FRRUFyQkRxWm9hY3FNSHpvSDRKCnVZSHovRjZmNVdsVUlNVU9NZG9ta1VmaW1teUQxNFdRWkR0T0d1cUpWMGFLVGo2QllKSmUwMC9jOGNFRnZrcy8KSVVYSklac2taWjM5ZWlCb3JKcXNqcnYzN2EvSzNYUDVQbmRZWWRNazBId1M2SGsyZmVNSFN1aGV3QzUwaFVseAovaHU0REI3Sk1ieW0zVldxK3oyaGRtSHUyUmZZQ0EvYldnT005aVhQUHJ5ZDdia3A5SmVPeGFRZ2pDdzVzem1NClRQNExMWHRpWmhUa2pIdlBsMG90YUg5VGMrMFFIS3FmYm5EcVBkVnk5Yy9kV3ZxaWw5cUUrL3hGR2tUVFg2clcKMzB0WHhmbklDTlU2MC9zMEk1ckdRdytnVkVJckd5bjhUVEJLS29xU3hmOHhvVHJxbmtnNTZhNDFsV00zTU9Bdwp0RmN6OUE9PQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCi0tLS0tQkVHSU4gQ0VSVElGSUNBVEUtLS0tLQpNSUlDN1RDQ0FkV2dBd0lCQWdJQkFUQU5CZ2txaGtpRzl3MEJBUXNGQURBbU1TUXdJZ1lEVlFRRERCdHBibWR5ClpYTnpMVzl3WlhKaGRHOXlRREUxT0RReE16UTNOVEl3SGhjTk1qQXdNekV6TWpFeU5UVXlXaGNOTWpJd016RXoKTWpFeU5UVXpXakFtTVNRd0lnWURWUVFEREJ0cGJtZHlaWE56TFc5d1pYSmhkRzl5UURFMU9EUXhNelEzTlRJdwpnZ0VpTUEwR0NTcUdTSWIzRFFFQkFRVUFBNElCRHdBd2dnRUtBb0lCQVFDN1FGVGkwRUhKNEkvcEZHazVZeFlMCllSUVRUUEYxcHBjRllUMEc4WnlGTHA5QzdVcVBSc091VW5hWDVVMUdWUXI5VmUySXJYaVBxaXhHK0ZabC9zeHcKUjlkdkh5Snpqd21KMEg0NnR3TmNDY0FodTRzcTV5dnhjZm9Hbit1UEpWTGFEVHJEcEpiaFlYUTBoeDQ1Z3lUUQpPS2MralhnTTJ5Sys1TzlwSWNSZlB6Ly9RWTNSRVJPTlZpY0d3eW52UXZmdXhuMGJsZDIzZ0YxQkl1bG84aFhLCm1sWDNyaG11em1RUFNoL2s0YXdyY09qN2hEZWNteitSRW5La0VUQTIxclhZUnJSTmFxZGZEQ2MyYTZKdmgvQUEKNkpZVE5VMGh4OWdidFNnYWJSMHQ5QnJxZHFmT2ZVdlV4N0ZjcDY2VDV2MHlvcndTaTNHZTlmVUE5VW1EQkVXUApBZ01CQUFHakpqQWtNQTRHQTFVZER3RUIvd1FFQXdJQ3BEQVNCZ05WSFJNQkFmOEVDREFHQVFIL0FnRUFNQTBHCkNTcUdTSWIzRFFFQkN3VUFBNElCQVFCY0NRc3E3ekgrbmpmUDFPTWI1Ym9haUtrWmRrMG40dU5wZWhlSEtuenoKWWhSbGUzTm4rcG9qb0xIU1NXVUJoT3Znb3JmZFMxN2RTQ1dkbU5URFB3V3V3blBJTmtyMWRUcmx5emZLZUlobAp0TjYxQS9haHRJNDNkR2hpUzBnZCtQTnpOQ0RnVjFxNzRRU0ZIRzEzdUc5dXV5dEJQL1dYa0hoUjJFaE54SzRxClZGU0pJNVVQOHdaMEZ2UUZnZ05ZMWk3S3lNTUZRTVV6UkhYUUt1MThIbmFiRTdjVUV1clArbUgzM054Q1VFRUgKU21BY29reU9oU013WnpRdjgxbGxUNGd5OHdUbmVzUkN4aXpmNGJ3MnJ5QmQyTnExZGdjZkJXclF1SjBiRy9MMgplRjVWRTlpNzEybmRBNXJDZDZ2MkU0UjJLNDNuQ3V3R0cvSkdUZGFUYkNQRAotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCgotLS0tLUJFR0lOIENFUlRJRklDQVRFLS0tLS0KTUlJRFNqQ0NBaktnQXdJQkFnSUJBVEFOQmdrcWhraUc5dzBCQVFzRkFEQTJNVFF3TWdZRFZRUUREQ3R2Y0dWdQpjMmhwWm5RdGMyVnlkbWxqWlMxelpYSjJhVzVuTFhOcFoyNWxja0F4TlRnME1UTTBORGMyTUI0WERUSXdNRE14Ck16SXhNakV4TlZvWERUSXlNRFV4TWpJeE1qRXhObG93TmpFME1ESUdBMVVFQXd3cmIzQmxibk5vYVdaMExYTmwKY25acFkyVXRjMlZ5ZG1sdVp5MXphV2R1WlhKQU1UVTROREV6TkRRM05qQ0NBU0l3RFFZSktvWklodmNOQVFFQgpCUUFEZ2dFUEFEQ0NBUW9DZ2dFQkFMQ3ZKa2VDUjl3VThYdXlhcThFUjQ3QmUwdktQWC9wemliQWYrWVdyaTlJCjVmTjFhalZMeUxJSEU3aHhrLzMyZ01RMnUwUlNWRkJkd0kzSnI1YTVMaTdOM2hHSlV0WmRKbUk0bzVDOXkzTWkKT3c4MlpNZGdpUisrWGxaVjJINnY0VW9jNlVCZExuTzRjMUwrR3UrR2NmVS9oSVV6eG9FU2cwYVBnWXpuWVJwago0YVQxZXBvWHdIblRJZG8vS251aWpnMG1KNzBIRXhXOVVnV1N2L0hKM1B6VzBGeEwva0FNVWRJdUR6Znd1MVJaCjF1UnFuNXF2M1cwQmRjYXNBTzJMTUo0VFlGSHdWalhwNTFzNitOMUN5K1RGVmZhbmlxcU10UjVadVpSSlYraTEKNHcxN3JCc3hJL01XNzdweG00c0FwWC9TT1UvMWJ6cnhPczdXRzJ0NEw2OENBd0VBQWFOak1HRXdEZ1lEVlIwUApBUUgvQkFRREFnS2tNQThHQTFVZEV3RUIvd1FGTUFNQkFmOHdIUVlEVlIwT0JCWUVGSWR1RjcyZmtzeVJuK2pjClVnM1ltc2ErdFYyVU1COEdBMVVkSXdRWU1CYUFGSWR1RjcyZmtzeVJuK2pjVWczWW1zYSt0VjJVTUEwR0NTcUcKU0liM0RRRUJDd1VBQTRJQkFRQWF6OUp6NUtFY25sSU0zV1NUR2xIaUpQMTA5NWl2Vnlob1JqbkROcThUb1pocApGM1ZieXp5TjcwdzJWNzVkTVBvUk9Bc0doc2F2eXlyLytpSVpNRmRwWkQ2UlJqUjBPWVlVRitoL01KWnZSd3NkCnlsOSt6c2lyeTNENFg5VDlwcVpUSEU2QktuL3ZOcWFBMDRZdmdkNE9va0ZkTSt6cld4U1VkYUlaUkpFVnVSZ2IKMVJsdm9uNjVDVEJmNTBjdXBRaURlRzE5dERacEl6ZERkZitPaHFlSDdnbVRid2w2dkN4ZVR0TmdjSkU5L01kMAp6UUh6eWZuVkd5MGVIQWFzUG9LRndROTQvNy9SRnprZkp1bmIrby95WDJ3M1VMa3VZZkxVcTFiWE9nZUNHd2JJClhXZ3lqWTNmVlpMM0VEeHNOMkEyTnc0a2t6VWRIelNrdDdRMGkwOFcKLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo=", "token": "" }, "kind": "Secret", "metadata": { "annotations": { "kubernetes.io/service-account.name": "localhost-recovery-client", "kubernetes.io/service-account.uid": "5ec083e7-7f42-45a6-b3d0-f3d03eea45de" }, "creationTimestamp": "2020-03-13T23:10:59Z", "name": "localhost-recovery-client-token-400", "namespace": "openshift-kube-controller-manager", "ownerReferences": [ { "apiVersion": "v1", "kind": "ConfigMap", "name": "revision-status-400", "uid": "8bda187a-6ee8-4851-b2f4-58500631d220" } ], "resourceVersion": "65608", "selfLink": "/api/v1/namespaces/openshift-kube-controller-manager/secrets/localhost-recovery-client-token-400", "uid": "ce6980c0-4f45-49a3-aae1-454ad783a948" }, "type": "kubernetes.io/service-account-token" } ``` The revisioned versions contain "ca.crt" and "service-ca.crt" values injected (possibly some admission for service-ca - might worth to have a look as it isn't present on localhost-recovery-client-token). This is what triggers the 400 rollouts as it is hot looping on DeepEqual between those two. ACTION ITEM #1 Do not rollout a revision if createTargetConfigController failed maybe blocking the managePod would prevent the install before the token is ready https://github.com/openshift/cluster-kube-controller-manager-operator/blob/e976dbc5f71d9b8c988a802a96073a929b7bdae1/pkg/operator/targetconfigcontroller/targetconfigcontroller.go#L234-L241 The lack of injected token is likely a case where no kube-controller-manager is running. === kube-controller-manager-ci-op-vf25zxdf-7f637-cb7p6-master-0 - is the new version deploying but waiting for the token to be injected - pending - it shouldn't install before the revisioned resource is ready === kube-controller-manager-ci-op-vf25zxdf-7f637-cb7p6-master-1 2020-03-13T21:34:29.9862257Z E0313 21:34:29.986173 1 leaderelection.go:330] error retrieving resource lock kube-system/kube-controller-manager: Get https://localhost:6443/api/v1/namespaces/kube-system/configmaps/kube-controller-manager?timeout=10s: dial tcp [::1]:6443: connect: connection refused 2020-03-13T21:34:36.5130096Z E0313 21:34:36.512916 1 leaderelection.go:330] error retrieving resource lock kube-system/kube-controller-manager: configmaps "kube-controller-manager" is forbidden: User "system:kube-controller-manager" cannot get resource "configmaps" in API group "" in the namespace "kube-system": RBAC: [clusterrole.rbac.authorization.k8s.io "system:webhook" not found, clusterrole.rbac.authorization.k8s.io "system:discovery" not found, clusterrole.rbac.authorization.k8s.io "system:public-info-viewer" not found, clusterrole.rbac.authorization.k8s.io "system:basic-user" not found, clusterrole.rbac.authorization.k8s.io "system:oauth-token-deleter" not found, clusterrole.rbac.authorization.k8s.io "system:build-strategy-jenkinspipeline" not found, clusterrole.rbac.authorization.k8s.io "system:openshift:discovery" not found, clusterrole.rbac.authorization.k8s.io "self-access-reviewer" not found, clusterrole.rbac.authorization.k8s.io "basic-user" not found, clusterrole.rbac.authorization.k8s.io "system:build-strategy-docker" not found, clusterrole.rbac.authorization.k8s.io "system:scope-impersonation" not found, clusterrole.rbac.authorization.k8s.io "cluster-status" not found, clusterrole.rbac.authorization.k8s.io "system:openshift:public-info-viewer" not found, clusterrole.rbac.authorization.k8s.io "system:kube-controller-manager" not found, clusterrole.rbac.authorization.k8s.io "system:build-strategy-source" not found, clusterrole.rbac.authorization.k8s.io "console-extensions-reader" not found] 2020-03-13T22:01:14.6880472Z E0313 22:01:14.687968 1 leaderelection.go:330] error retrieving resource lock kube-system/kube-controller-manager: Get https://localhost:6443/api/v1/namespaces/kube-system/configmaps/kube-controller-manager?timeout=10s: dial tcp [::1]:6443: connect: connection refused 2020-03-13T22:01:19.4268737Z E0313 22:01:19.426825 1 leaderelection.go:330] error retrieving resource lock kube-system/kube-controller-manager: configmaps "kube-controller-manager" is forbidden: User "system:kube-controller-manager" cannot get resource "configmaps" in API group "" in the namespace "kube-system" === kube-controller-manager-ci-op-vf25zxdf-7f637-cb7p6-master-2 2020-03-13T23:12:18.3357629Z E0313 23:12:18.335746 1 azure_standard.go:714] error: az.EnsureHostInPool(ci-op-vf25zxdf-7f637-cb7p6-master-1), az.vmSet.GetPrimaryInterface.Get(ci-op-vf25zxdf-7f637-cb7p6-master-1, ), err=azure - cloud provider rate limited(read) for operation:NicGet 2020-03-13T23:12:33.8048022Z E0313 23:12:33.804725 1 reflector.go:123] k8s.io/client-go/informers/factory.go:134: Failed to list *v1beta1.PodSecurityPolicy: the server could not find the requested resource 2020-03-13T23:12:33.8058897Z E0313 23:12:33.805858 1 reflector.go:123] k8s.io/client-go/informers/factory.go:134: Failed to list *v1beta1.ReplicaSet: the server could not find the requested resource 2020-03-13T23:12:33.8073024Z E0313 23:12:33.807277 1 reflector.go:123] k8s.io/client-go/informers/factory.go:134: Failed to list *v1beta1.NetworkPolicy: the server could not find the requested resource ... ACTION ITEM #2 - we should never have all kube-controller-managers down during upgrade - some of the errors might be connected to kube-apiservers Maciej, would you mind looking into this further while I am out? Thanks. I looked into this and confirmed most of that Tomas wrote in the previous comment, that kcm-o is hot-looping while waiting for localhost-recovery-client-token SA token. The problem is that this is a new resource which is additionally dependant on kcm to inject additional data. This makes this mechanism very racy. I spoke with Stefan about it and we both agree that blocking managePod would only hide the issue here. The best option and recommended approach for the future cases like this one is to create a "blueprint" resource and allow the controller to inject that token. Only then sync the blueprint into its actual location. This will lengthen the process but only the first time, at the same time preventing any races like the one described above. Re-instating the needsinfo. Comments above make it sound like folks have a good handle on the underlying issue, but the impact-statement questions from comment 0 ("Who is impacted?" through "Is this a regression?" remain unanswered, so it's not clear if this should remain an UpgradeBlocker or, if so, which edges should be blocked by it. (In reply to W. Trevor King from comment #4) > Re-instating the needsinfo. Comments above make it sound like folks have a > good handle on the underlying issue, but the impact-statement questions from > comment 0 ("Who is impacted?" through "Is this a regression?" remain > unanswered, so it's not clear if this should remain an UpgradeBlocker or, if > so, which edges should be blocked by it. Sorry, I missed that. No this is not a regression, in 4.4 we've introduced new mechanism for auto recovery, which was not present previously in the product. Specifically the operator dependence on a new secret, namely localhost-recovery-client-token which additionally needs to be reacted on by the operand introduce a race which was identified and I'm working on a fix. So everyone updating from 4.3.z to 4.4 (until your fix lands) is impacted? Or just some subset of those folks? Also, can you answer the "What is the impact?" and "How involved is remediation?", or do we not have a handle on those yet? (In reply to W. Trevor King from comment #6) > So everyone updating from 4.3.z to 4.4 (until your fix lands) is impacted? > Or just some subset of those folks? Also, can you answer the "What is the > impact?" and "How involved is remediation?", or do we not have a handle on > those yet? So this is time dependant iow. unpredictable, it may or may not happen. The ordering here is important. We have it all identified and actively working on a fix. > We have it all identified and actively working on a fix.
But you cannot retroactively fix existing releases like 4.4.0-rc.1. My current understanding is that this issue impacts some racy fraction of 4.3.z -> 4.4 updates, and that it does not impact other updates like 4.3.z -> 4.3.z. I am still not clear on whether it impacts 4.4 -> 4.4 updates. Or, when a given cluster is impacted, what the impact is, or how involved remediation is. Or are we saying something like "only 4.4 RCs are impacted, we don't really care about managing update edges for RCs, we'll have this fixed before 4.4.0 GAs, and we don't want to bother working up remediation details for the folks who do happen to hit this on a 4.4 RC"?
> My current understanding is that this issue impacts some racy fraction of 4.3.z -> 4.4 updates, and that it does not impact other updates like 4.3.z -> 4.3.z. correct > I am still not clear on whether it impacts 4.4 -> 4.4 updates. I don't think it does. The SA would already be created and provided with the long lived token. The issue comes from a race when creating the long lived SA the first time (4.3->4.4) and also hitting other issues. The impact is likely fatal (manual intervention) as you can't update the operator to newer version afterwards to remedy the situation. My hope is we fix this before 4.4 ships. Confirmed with payload: 4.4.0-0.nightly-2020-04-09-220855, the issue has fixed: 1) Run this command in a loop to make sure that data.token is deleted from openshift-kube-controller-manager/localhost-recovery-client-token # for i in {1..100} ; do oc patch secret localhost-recovery-client-token --type='json' -p='[{"op": "replace", "path": "/data/token", "value": ""}]'; done 2) In second terminal ran the command below to see that revisions have not made the token empty(token with name "localhost-recovery-client-token-<number>" when run the command on second terminal). # for i in {1..100}; do oc get secrets -o json | jq '.items[] | select(.metadata.name | test("localhost-recovery-client-token-*")) | select(.data.token == null or .data.token == "")'; done No revisions had the token empty. Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:0581 Removing UpgradeBlocker from this older bug, to remove it from the suspect queue described in [1]. If you feel like this bug still needs to be a suspect, please add keyword again. [1]: https://github.com/openshift/enhancements/pull/475 |