Bug 1831045

Summary: kube-apiserver fails to live reload client CA and front proxy CA
Product: OpenShift Container Platform
Reporter: Tomáš Nožička <tnozicka>
Component: kube-apiserver
Assignee: Tomáš Nožička <tnozicka>
Status: CLOSED ERRATA
QA Contact: Ke Wang <kewang>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 4.4
CC: aos-bugs, maszulik, mfojtik, sttts, xxia
Target Milestone: ---
Target Release: 4.4.z
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1831042
Environment:
Last Closed: 2020-05-26 16:50:33 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1831042
Bug Blocks:

Description Tomáš Nožička 2020-05-04 14:13:40 UTC
+++ This bug was initially created as a clone of Bug #1831042 +++

kube-apiserver fails to live reload client CA and front proxy CA

Comment 3 Xingxing Xia 2020-05-18 02:00:48 UTC
Ke, please help verify this bug. Bug 1831042, the 4.5 bug this one was cloned from, has steps for reference.

Comment 4 Ke Wang 2020-05-18 16:15:07 UTC
Hi Tomáš Nožička, while verifying this bug I ran into the following problem; see the last step below. Could you give me some advice? Thanks.

$ cd ose

$ git pull

$ oc adm release info --commits registry.svc.ci.openshift.org/ocp/release:4.4.0-0.nightly-2020-05-17-221856 | grep hyperkube
hyperkube    https://github.com/openshift/origin  b23e21a...

$ git checkout -b 4.4.0-0.nightly-2020-05-17-221856 b23e21a

$ export GO111MODULE=on

$ go mod init
go: creating new go.mod: module github.com/openshift/origin
go: copying requirements from Godeps/Godeps.json
go: converting Godeps/Godeps.json: stat k8s.io/kubernetes/pkg/api@2f054b7646dc9e98f6dea458d2fb65e1d2c1f731: unknown revision 2f054b7646dc9e98f6dea458d2fb65e1d2c1f731

$ echo $?
0

$ export KUBECONFIG=/path/to/kubeconfig

$ go test -race ./vendor/k8s.io/kubernetes/test/integration/apiserver/certreload/ -run='TestClientCA' -v
go: finding k8s.io/kubernetes/test latest
go: finding k8s.io/kubernetes/test/integration/apiserver/certreload latest
go: finding k8s.io/kubernetes/test/integration/apiserver latest
go: finding k8s.io/kubernetes/test/integration latest
go: k8s.io/kubernetes.2 requires
	k8s.io/api.0: reading k8s.io/api/go.mod at revision v0.0.0: unknown revision v0.0.0

Comment 5 Ke Wang 2020-05-18 16:19:39 UTC
I searched Google for similar problems and found this issue: https://github.com/kubernetes/kubernetes/issues/79384.

Comment 6 Tomáš Nožička 2020-05-18 18:22:08 UTC
Don't run `go mod init`; dependencies are vendored with glide. See https://bugzilla.redhat.com/show_bug.cgi?id=1831042#c8
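
A minimal sketch of the idea, not the procedure from the linked comment: passing `-mod=vendor` tells the go tool to build from the checked-in vendor/ tree instead of resolving modules over the network, and the later verification run in this bug uses that flag. The helper name `vendor_mod_flag` is hypothetical, and whether a go.mod file is still needed depends on the Go version and the GO111MODULE setting.

```shell
# Hypothetical helper (not part of origin or the go tool): print the go flag
# that forces use of the vendor/ tree, or print nothing when the given
# checkout has no vendor/ directory.
vendor_mod_flag() {
  repo_root="${1:-.}"
  if [ -d "$repo_root/vendor" ]; then
    printf '%s\n' "-mod=vendor"
  fi
}

# Example use from the repository root (not executed here):
# go test -race $(vendor_mod_flag .) ./vendor/k8s.io/kubernetes/test/integration/apiserver/certreload/ -run='TestClientCA' -v
```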

Comment 7 Ke Wang 2020-05-19 02:57:41 UTC
Hi Tomáš, it seems the 'go mod init' step is still required. In any case, I could not have gotten the test to pass without your comments. Verified with OCP 4.4.0-0.nightly-2020-05-18-164758; details below:

$ cd ose

$ git pull

$ oc adm release info --commits registry.svc.ci.openshift.org/ocp/release:4.4.0-0.nightly-2020-05-18-164758 | grep hyperkube
hyperkube    https://github.com/openshift/origin  b23e21a...

$ git checkout -b 4.4.0-0.nightly-2020-05-18-164758 b23e21a

$ export GO111MODULE=on

$ go mod init
go: creating new go.mod: module github.com/openshift/origin
go: copying requirements from Godeps/Godeps.json
go: converting Godeps/Godeps.json: stat k8s.io/kubernetes/pkg/api@2f054b7646dc9e98f6dea458d2fb65e1d2c1f731: unknown revision 2f054b7646dc9e98f6dea458d2fb65e1d2c1f731

$ echo $?
0

$ export KUBECONFIG=/path/to/kubeconfig

Running the test requires an etcd server, so we copied the etcd binary to the test box from an etcd pod of an OCP 4.4 cluster:
$ sudo -E oc cp -n openshift-etcd etcd-ip-xx-xx-xxx-35.ap-south-1.compute.internal:/bin/etcd /usr/local/bin/etcd

$ sudo chmod a+x /usr/local/bin/etcd
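
Since the integration test starts its own etcd, a pre-flight check can catch a missing binary before the long test run. This is only a sketch; the function name `require_binary` is hypothetical and it assumes the test locates etcd through PATH.

```shell
# Hypothetical pre-flight check: succeed only if an executable with the given
# name (default: etcd) can be found on PATH; otherwise report it on stderr.
require_binary() {
  name="${1:-etcd}"
  if command -v "$name" >/dev/null 2>&1; then
    return 0
  fi
  echo "missing required binary: $name" >&2
  return 1
}

# Example use before running the test (not executed here):
# require_binary etcd && go test -race -mod=vendor ./vendor/k8s.io/kubernetes/test/integration/apiserver/certreload/ -run='TestClientCA' -v
```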

$ go test -race -mod=vendor ./vendor/k8s.io/kubernetes/test/integration/apiserver/certreload/ -run='TestClientCA' -v

I0519 10:42:29.706734  149662 etcd.go:99] starting etcd on http://127.0.0.1:46217
I0519 10:42:29.706864  149662 etcd.go:105] storing etcd data in: /tmp/integration_test_etcd_data079020049
2020-05-19 10:42:29.728723 W | auth: simple token is not cryptographically signed
=== RUN   TestClientCA
2020-05-19 10:42:30.428412 E | etcdmain: forgot to set Type=notify in systemd service file?
2020-05-19 10:42:30.428435 N | etcdserver/membership: set the initial cluster version to 3.3
2020-05-19 10:42:30.428811 N | embed: serving insecure client requests on 127.0.0.1:46217, this is strongly discouraged!
I0519 10:42:44.439058  149662 serving.go:308] Generated self-signed cert (/tmp/test-integration-TestClientCA991027772/apiserver.crt, /tmp/test-integration-TestClientCA991027772/apiserver.key)
...
I0519 10:42:45.306565  149662 plugins.go:158] Loaded 10 mutating admission controller(s) successfully in the following order: NamespaceLifecycle,LimitRanger,ServiceAccount,TaintNodesByCondition,Priority,DefaultTolerationSeconds,DefaultStorageClass,StorageObjectInUseProtection,MutatingAdmissionWebhook,RuntimeClass.
I0519 10:42:45.306624  149662 plugins.go:161] Loaded 7 validating admission controller(s) successfully in the following order: LimitRanger,ServiceAccount,Priority,PersistentVolumeClaimResize,ValidatingAdmissionWebhook,RuntimeClass,ResourceQuota.
I0519 10:42:45.315891  149662 master.go:267] Using reconciler: lease
I0519 10:42:45.420228  149662 rest.go:115] the default service ipfamily for this cluster is: IPv4
...
I0519 10:42:51.354686  149662 dynamic_cafile_content.go:167] Starting client-ca-bundle::/tmp/test-integration-TestClientCA991027772/client-ca.crt300032206
I0519 10:42:51.354832  149662 dynamic_cafile_content.go:167] Starting request-header::/tmp/test-integration-TestClientCA991027772/proxy-ca.crt160415851
I0519 10:42:51.355892  149662 dynamic_serving_content.go:130] Starting serving-cert::/tmp/test-integration-TestClientCA991027772/apiserver.crt::/tmp/test-integration-TestClientCA991027772/apiserver.key
I0519 10:42:51.358876  149662 secure_serving.go:178] Serving securely on 127.0.0.1:45885
I0519 10:42:51.358956  149662 tlsconfig.go:241] Starting DynamicServingCertificateController
E0519 10:42:51.361956  149662 controller.go:151] Unable to remove old endpoints from kubernetes service: StorageError: key not found, Code: 1, Key: /fc3660e0-8c94-4315-b6f4-23f98005076f/registry/masterleases/xx.xx.xxx.9, ResourceVersion: 0, AdditionalErrorMsg: 
I0519 10:42:51.366864  149662 dynamic_cafile_content.go:167] Starting client-ca-bundle::/tmp/test-integration-TestClientCA991027772/client-ca.crt300032206
I0519 10:42:51.368179  149662 dynamic_cafile_content.go:167] Starting request-header::/tmp/test-integration-TestClientCA991027772/proxy-ca.crt160415851
I0519 10:42:51.368785  149662 cluster_authentication_trust_controller.go:440] Starting cluster_authentication_trust_controller controller
I0519 10:42:51.369035  149662 shared_informer.go:197] Waiting for caches to sync for cluster_authentication_trust_controller
I0519 10:42:51.469843  149662 shared_informer.go:204] Caches are synced for cluster_authentication_trust_controller 
I0519 10:42:52.389802  149662 storage_scheduling.go:133] created PriorityClass system-node-critical with value 2000001000
I0519 10:42:52.404188  149662 storage_scheduling.go:133] created PriorityClass system-cluster-critical with value 2000000000
I0519 10:42:52.404277  149662 storage_scheduling.go:142] all system priority classes are created successfully or already exist.
I0519 10:42:53.856556  149662 controller.go:606] quota admission added evaluator for: roles.rbac.authorization.k8s.io
I0519 10:42:54.027141  149662 controller.go:606] quota admission added evaluator for: rolebindings.rbac.authorization.k8s.io
W0519 10:42:54.295292  149662 lease.go:224] Resetting endpoints for master service "kubernetes" to [xx.xx.xxx.9]
I0519 10:42:54.299816  149662 controller.go:606] quota admission added evaluator for: endpoints
--- PASS: TestClientCA (28.71s)
I0519 10:42:58.518720  149662 controller.go:180] Shutting down kubernetes service endpoint reconciler
PASS

The tests related to the PR passed, so I am moving the bug to VERIFIED.
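
The pass criterion used above can also be checked mechanically when the log is long. The sketch below assumes the `go test -v` output has been captured and is fed on stdin; the function name `go_test_passed` is hypothetical, and it only looks at top-level `--- PASS:` and `--- FAIL:` lines like the one in the log above.

```shell
# Hypothetical helper: read `go test -v` output on stdin and succeed only if
# it contains a top-level "--- PASS: <name> " line and no "--- FAIL: <name> " line.
go_test_passed() {
  name="$1"
  log="$(cat)"
  if printf '%s\n' "$log" | grep -- "^--- FAIL: $name " >/dev/null; then
    return 1
  fi
  printf '%s\n' "$log" | grep -- "^--- PASS: $name " >/dev/null
}

# Example use (not executed here), with the test output saved to test.log:
# go_test_passed TestClientCA < test.log && echo "TestClientCA verified"
```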

Comment 9 errata-xmlrpc 2020-05-26 16:50:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2180