Bug 2049907 - SNO: cluster-policy-controller failed to start due to missing serving-cert/tls.crt
Summary: SNO: cluster-policy-controller failed to start due to missing serving-cert/tl...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: kube-controller-manager
Version: 4.9
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.9.z
Assignee: Filip Krepinsky
QA Contact: RamaKasturi
URL:
Whiteboard:
Depends On: 2048484
Blocks:
Reported: 2022-02-02 21:49 UTC by Filip Krepinsky
Modified: 2022-02-14 12:01 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 2048484
Environment:
Last Closed: 2022-02-14 12:00:57 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-kube-controller-manager-operator pull 600 | 0 | None | open | Bug 2049907: allow cluster-policy-controller to fallback to default cert | 2022-02-02 22:25:01 UTC
Red Hat Product Errata | RHBA-2022:0488 | 0 | None | None | None | 2022-02-14 12:01:26 UTC

Comment 4 RamaKasturi 2022-02-10 12:29:16 UTC
Checked the logs from the recent jobs below and could not find the reported error.

https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.9-e2e-metal-single-node-live-iso/1488560014497943552/artifacts/e2e-metal-single-node-live-iso/baremetalds-sno-gather/artifacts/post-tests-must-gather/quay-io-openshift-release-dev-ocp-v4-0-art-dev-sha256-1f3994a75464c01f1953aaeda23c2a02c477e1b5ea36eb3434123ecccd141b0c/namespaces/openshift-kube-controller-manager/pods/kube-controller-manager-test-infra-cluster-master-0/cluster-policy-controller/cluster-policy-controller/logs/current.log -> Feb1st

https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.9-e2e-metal-single-node-live-iso/1489736870999887872/artifacts/e2e-metal-single-node-live-iso/baremetalds-sno-gather/artifacts/post-tests-must-gather/quay-io-openshift-release-dev-ocp-v4-0-art-dev-sha256-1f3994a75464c01f1953aaeda23c2a02c477e1b5ea36eb3434123ecccd141b0c/namespaces/openshift-kube-controller-manager/pods/kube-controller-manager-test-infra-cluster-master-0/cluster-policy-controller/cluster-policy-controller/logs/current.log -> Feb2nd

https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.9-e2e-metal-single-node-live-iso/1490526977445072896/artifacts/e2e-metal-single-node-live-iso/baremetalds-sno-gather/artifacts/post-tests-must-gather/quay-io-openshift-release-dev-ocp-v4-0-art-dev-sha256-1f3994a75464c01f1953aaeda23c2a02c477e1b5ea36eb3434123ecccd141b0c/namespaces/openshift-kube-controller-manager/pods/kube-controller-manager-test-infra-cluster-master-0/cluster-policy-controller/cluster-policy-controller/logs/current.log -> Feb7th

https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.9-e2e-metal-single-node-live-iso/1491426754189856768/artifacts/e2e-metal-single-node-live-iso/baremetalds-sno-gather/artifacts/post-tests-must-gather/quay-io-openshift-release-dev-ocp-v4-0-art-dev-sha256-023fba035f3807a225d262ef0f32e726cbdf6aadc3959fe82283778fecb33954/namespaces/openshift-kube-controller-manager/pods/kube-controller-manager-test-infra-cluster-master-0/cluster-policy-controller/cluster-policy-controller/logs/current.log -> Feb9th

Built the SNO cluster and it was successful, so moving the bug to verified state.

https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/ocp-common/job/Flexy-install/75425/

Also checked with Filip, since I did not see 4.9 encounter the issue either before or after the BZ was reported. What I learnt is: "I am not sure how often this issue manifests itself in 4.9. They reported that it started hurting them only in 4.10 so the issue can be pretty rare in 4.9. But looking at the code, the issue still could potentially appear in 4.9.
quote from https://bugzilla.redhat.com/show_bug.cgi?id=2045872, After analyzing last week CI, this issue causes 23% installations to fail (4/17) - which basically leads to 77% success rate (assuming this is the main reason for SNOs to fail). Just to compare, 4.8 we had 99.7% success rate, 4.9 97%. "

Based on the above, moving the bug to verified state.

Comment 6 errata-xmlrpc 2022-02-14 12:00:57 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.9.21 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:0488
