Bug 1878015 - KCM cert-syncer panic when caches don't sync
Summary: KCM cert-syncer panic when caches don't sync
Keywords:
Status: VERIFIED
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: kube-controller-manager
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.6.0
Assignee: Tomáš Nožička
QA Contact: zhou ying
URL:
Whiteboard:
Depends On:
Blocks: 1879637
Reported: 2020-09-11 05:11 UTC by Tomáš Nožička
Modified: 2020-09-16 17:21 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1879637
Environment:
Last Closed:
Target Upstream Version:



Links
System | ID | Priority | Status | Summary | Last Updated
Github | openshift cluster-kube-controller-manager-operator pull 447 | None | closed | Bug 1878015: Remove panic on cache sync | 2020-09-16 17:19:51 UTC

Description Tomáš Nožička 2020-09-11 05:11:25 UTC
When the caches fail to sync, the cert-syncer should not panic; it should only exit with code 1.

https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-origin-installer-e2e-gcp-4.6/1303842838693285888

Comment 2 zhou ying 2020-09-14 13:15:47 UTC
Confirmed with the latest payload, 4.6.0-0.nightly-2020-09-12-230035; the issue has been fixed:

1) Turn off kube-apiserver on node1.
2) Delete the kube-controller-manager-cert-syncer container on the same node.
3) Wait 10 minutes, then check the kube-controller-manager-cert-syncer container again.

The container only exits with code 1; there is no panic:

  kube-controller-manager-cert-syncer:
....
F0914 13:05:32.055616       1 base_controller.go:95] unable to sync caches for CertSyncController

      Exit Code:    1

