Bug 2046094

Summary: SNO: cluster version operator failed to start due to missing secret "cluster-version-operator-serving-cert"
Product: OpenShift Container Platform Reporter: Igal Tsoiref <itsoiref>
Component: Cluster Version Operator Assignee: Over the Air Updates <aos-team-ota>
Status: CLOSED DUPLICATE QA Contact: liujia <jiajliu>
Severity: high Docs Contact:
Priority: unspecified    
Version: 4.10 CC: aos-bugs, rfreiman, vrutkovs
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2022-01-26 11:23:53 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Igal Tsoiref 2022-01-26 09:23:53 UTC
Description of problem:
We started to see failures in our single-node CI. After some investigation, we saw that the cluster version operator fails to start due to a missing secret.
We already have a couple of failed CI jobs; I will link only the latest one.

In https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.10-e2e-metal-single-node-live-iso/1484898355296342016/artifacts/e2e-metal-single-node-live-iso/baremetalds-sno-gather/artifacts/post-tests-must-gather/registry-redhat-io-openshift4-ose-must-gather-sha256-8c0b3bc10756c463f1aa6b622e396ae244079dd8f7f2f3c5d8695a777c95eec6/host_service_logs/masters/kubelet_service.log you can see many lines like:
Jan 22 15:01:33.839270 test-infra-cluster-master-0 hyperkube[1803]: E0122 15:01:33.839245    1803 secret.go:195] Couldn't get secret openshift-cluster-version/cluster-version-operator-serving-cert: secret "cluster-version-operator-serving-cert" not found
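
To gauge how often this error occurs in a downloaded kubelet_service.log, a small script along these lines may help. It is only a sketch: the regular expression is inferred from the single log line quoted above, and the function name count_missing_secrets is made up for illustration, so other kubelet versions may format the message differently.

```python
import re
from collections import Counter

# Pattern inferred from the kubelet line quoted in this report (an assumption,
# not an official log format): "Couldn't get secret <ns>/<name>: secret "<name>" not found"
MISSING_SECRET = re.compile(
    r"Couldn't get secret (?P<namespace>[^/\s]+)/(?P<name>[^:\s]+): "
    r'secret "[^"]+" not found'
)


def count_missing_secrets(lines):
    """Count 'secret not found' errors per (namespace, secret name)."""
    counts = Counter()
    for line in lines:
        match = MISSING_SECRET.search(line)
        if match:
            counts[(match.group("namespace"), match.group("name"))] += 1
    return counts


# The sample is the line from this report; for a real log, pass an open file instead.
sample = [
    "Jan 22 15:01:33.839270 test-infra-cluster-master-0 hyperkube[1803]: "
    "E0122 15:01:33.839245    1803 secret.go:195] Couldn't get secret "
    "openshift-cluster-version/cluster-version-operator-serving-cert: "
    'secret "cluster-version-operator-serving-cert" not found',
    "Jan 22 15:01:34.000000 test-infra-cluster-master-0 hyperkube[1803]: unrelated line",
]

for (namespace, name), hits in count_missing_secrets(sample).items():
    print(f"{namespace}/{name}: {hits} occurrence(s)")
```

Passing a file object opened on the downloaded log works as well, since the function only iterates over lines.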
 



Version-Release number of the following components:
rpm -q openshift-ansible
rpm -q ansible
ansible --version

How reproducible:
SNO CI

Steps to Reproduce:
1.
2.
3.

Actual results:
Jan 22 15:01:33.839270 test-infra-cluster-master-0 hyperkube[1803]: E0122 15:01:33.839245    1803 secret.go:195] Couldn't get secret openshift-cluster-version/cluster-version-operator-serving-cert: secret "cluster-version-operator-serving-cert" not found

Expected results:
The secret is found and the CVO starts running.


Additional info:

May be related to https://bugzilla.redhat.com/show_bug.cgi?id=2045872

Full must-gather logs can be found here:
https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.10-e2e-metal-single-node-live-iso/1484898355296342016/artifacts/e2e-metal-single-node-live-iso/baremetalds-sno-gather/artifacts/post-tests-must-gather

Comment 1 Rom Freiman 2022-01-26 11:08:12 UTC
Might be related to https://bugzilla.redhat.com/show_bug.cgi?id=2045872

Comment 2 Vadim Rutkovsky 2022-01-26 11:23:53 UTC
The Service CA pod creates those secrets, and it won't start:
Error creating: pods "service-ca-6dc7f77f4d-" is forbidden: error fetching namespace "openshift-service-ca": unable to find annotation openshift.io/sa.scc.uid-range

See https://bugzilla.redhat.com/show_bug.cgi?id=1961204#c4
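
One way to confirm the missing annotation on an affected cluster is to inspect the namespace object, for example the JSON printed by `oc get namespace openshift-service-ca -o json`. The sketch below assumes that JSON has already been saved or captured; the helper name has_uid_range and the sample annotation value are illustrative only and are not taken from this bug's must-gather.

```python
import json

# Annotation named in the "Error creating" message above.
UID_RANGE_ANNOTATION = "openshift.io/sa.scc.uid-range"


def has_uid_range(namespace_obj):
    """Return True if the namespace object carries the SCC UID range annotation."""
    annotations = namespace_obj.get("metadata", {}).get("annotations") or {}
    return UID_RANGE_ANNOTATION in annotations


# Hypothetical namespace objects: one as seen in the failing case (annotation
# absent), one with an illustrative annotation value.
broken = json.loads('{"metadata": {"name": "openshift-service-ca", "annotations": {}}}')
healthy = json.loads(
    '{"metadata": {"name": "openshift-service-ca", '
    '"annotations": {"openshift.io/sa.scc.uid-range": "1000600000/10000"}}}'
)

for label, obj in (("broken", broken), ("healthy", healthy)):
    state = "present" if has_uid_range(obj) else "missing"
    print(f"{label}: {UID_RANGE_ANNOTATION} is {state}")
```

If the annotation is missing, pods in that namespace are rejected with the error quoted above, so the Service CA pod never gets to create the serving-cert secret.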

*** This bug has been marked as a duplicate of bug 1961204 ***