Bug 1770970 - [MSTR-485] When operatorLogLevel set to Debug, etcd encryption migration is stuck
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: kube-apiserver
Version: 4.3.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.3.0
Assignee: Lukasz Szaszkiewicz
QA Contact: Xingxing Xia
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-11-11 15:40 UTC by Xingxing Xia
Modified: 2020-01-23 11:12 UTC (History)
2 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-01-23 11:12:18 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github openshift cluster-kube-apiserver-operator pull 659 0 'None' closed bump(*): library-go to pickup observer log fix 2020-12-24 13:50:28 UTC
Github openshift library-go pull 591 0 'None' closed verifies input params of JSONPatchSecret function to prevent npe 2020-12-24 13:49:55 UTC
Red Hat Product Errata RHBA-2020:0062 0 None None None 2020-01-23 11:12:36 UTC

Description Xingxing Xia 2019-11-11 15:40:00 UTC
Description of problem:
When operatorLogLevel is set to Debug, the etcd encryption migration gets stuck.

Version-Release number of selected component (if applicable):
4.3.0-0.nightly-2019-11-11-115927

How reproducible:
Always

Steps to Reproduce:
1. After verifying bug 1758954#c2 with:
oc edit kubeapiserver cluster
...
  spec:
...
    operatorLogLevel: Debug
...

Enable etcd encryption:
oc edit apiserver cluster
spec:
  encryption:
    type: aescbc

2. Check that the OAS pods restart immediately:
oc get po -n openshift-apiserver -l apiserver --show-labels -w
...
apiserver-cwnfp   1/1     Running   0          87s   apiserver=true,app=openshift-apiserver,controller-revision-hash=57957f676c,pod-template-generation=6,revision=2

3. Check the KAS pods; the rollout is stuck, no restart:
oc get po -n openshift-kube-apiserver -l apiserver --show-labels -w
NAME                                          READY   STATUS    RESTARTS   AGE   LABELS
kube-apiserver-ip-10-0-142-221.ec2.internal   3/3     Running   0          95m   apiserver=true,app=openshift-kube-apiserver,revision=6
kube-apiserver-ip-10-0-147-197.ec2.internal   3/3     Running   0          93m   apiserver=true,app=openshift-kube-apiserver,revision=6
kube-apiserver-ip-10-0-161-221.ec2.internal   3/3     Running   0          91m   apiserver=true,app=openshift-kube-apiserver,revision=6
^C

oc get po -n openshift-kube-apiserver -l apiserver --show-labels -w
NAME                                          READY   STATUS    RESTARTS   AGE    LABELS
kube-apiserver-ip-10-0-142-221.ec2.internal   3/3     Running   0          105m   apiserver=true,app=openshift-kube-apiserver,revision=6
kube-apiserver-ip-10-0-147-197.ec2.internal   3/3     Running   0          103m   apiserver=true,app=openshift-kube-apiserver,revision=6
kube-apiserver-ip-10-0-161-221.ec2.internal   3/3     Running   0          101m   apiserver=true,app=openshift-kube-apiserver,revision=6

Check the openshift-kube-apiserver secrets; there is no encryption-config-$REVISION:
oc get secret -n openshift-kube-apiserver | grep enc
encryption-config                              Opaque                                1      12m

Actual results:
3. KAS pods do not restart and there is no encryption-config-$REVISION secret.
Check the KASO pod logs:
oc logs -f kube-apiserver-operator-7d9c55db4c-gmztn -n openshift-kube-apiserver-operator | tee kaso.log
...
I1111 14:42:25.401069       1 request.go:538] Throttling request took 1.19312161s, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-kube-apiserver/secrets/encryption-config-6
I1111 14:42:25.601087       1 request.go:538] Throttling request took 1.193587976s, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-kube-apiserver/secrets/encryption-config-6
I1111 14:42:25.801091       1 request.go:538] Throttling request took 1.194025874s, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-kube-apiserver/secrets/encryption-config-6
I1111 14:42:26.001019       1 request.go:538] Throttling request took 1.196919776s, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-config-managed/secrets?labelSelector=encryption.apiserver.operator.openshift.io%2Fcomponent%3Dopenshift-kube-apiserver
I1111 14:42:26.005703       1 transition.go:113] no encryption secrets found
I1111 14:42:26.201085       1 request.go:538] Throttling request took 1.19183128s, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-kube-apiserver/configmaps/config
I1111 14:42:26.401075       1 request.go:538] Throttling request took 1.195426237s, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-kube-apiserver/pods/kube-apiserver-ip-10-0-142-221.ec2.internal
I1111 14:42:26.601094       1 request.go:538] Throttling request took 1.196496812s, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-config-managed/secrets?labelSelector=encryption.apiserver.operator.openshift.io%2Fcomponent%3Dopenshift-kube-apiserver
I1111 14:42:26.605209       1 transition.go:113] no encryption secrets found
I1111 14:42:26.801087       1 request.go:538] Throttling request took 1.196851819s, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-config-managed/secrets?labelSelector=encryption.apiserver.operator.openshift.io%2Fcomponent%3Dopenshift-kube-apiserver
I1111 14:42:26.805426       1 transition.go:113] no encryption secrets found
I1111 14:42:27.001084       1 request.go:538] Throttling request took 1.196707889s, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-config-managed/secrets?labelSelector=encryption.apiserver.operator.openshift.io%2Fcomponent%3Dopenshift-kube-apiserver
I1111 14:42:27.005359       1 transition.go:113] no encryption secrets found
...
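The repeated "Throttling request took ~1.19s" lines are client-go's client-side rate limiter at work: the extra GET requests issued at the Debug log level queue up behind a fixed per-second token budget, so the operator's sync loop makes almost no forward progress. A minimal, self-contained sketch of that token-bucket queueing effect (illustrative only; newLimiter and wait are hypothetical names, not the actual client-go flowcontrol API):

```go
package main

import (
	"fmt"
	"time"
)

// limiter hands out one token per tick, like a client-side QPS cap.
type limiter struct {
	tokens chan struct{}
}

// newLimiter refills a one-token bucket qps times per second.
func newLimiter(qps int) *limiter {
	l := &limiter{tokens: make(chan struct{}, 1)}
	go func() {
		t := time.NewTicker(time.Second / time.Duration(qps))
		defer t.Stop()
		for range t.C {
			select {
			case l.tokens <- struct{}{}:
			default: // bucket already full, drop the token
			}
		}
	}()
	return l
}

// wait blocks until a token is available and reports how long the caller waited.
func (l *limiter) wait() time.Duration {
	start := time.Now()
	<-l.tokens
	return time.Since(start)
}

func main() {
	l := newLimiter(50) // e.g. a 50 req/s client-side cap
	for i := 0; i < 5; i++ {
		d := l.wait()
		fmt.Printf("Throttling request took %v\n", d)
	}
}
```

Every caller past the budget waits roughly one tick per queued request, which matches the steady ~1.19s waits in the KASO log once the Debug-level observers flood the client.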

4. oc edit kubeapiserver cluster # set back to ""
...
  spec:
...
    operatorLogLevel: ""

The above oc get po -n openshift-kube-apiserver -l apiserver --show-labels -w watch immediately becomes unstuck and shows the new revision:
kube-apiserver-ip-10-0-147-197.ec2.internal   0/3     Pending       0          0s     apiserver=true,app=openshift-kube-apiserver,revision=7   
kube-apiserver-ip-10-0-147-197.ec2.internal   0/3     Init:0/1      0          5s     apiserver=true,app=openshift-kube-apiserver,revision=7
...

Finally, the KAS pods restart successfully and the openshift-kube-apiserver/encryption-config-7 secret appears.

Expected results:
3. The KAS pods should restart and the encryption-config-$REVISION secret should be created.

Additional info:

Comment 3 Lukasz Szaszkiewicz 2019-11-22 13:34:26 UTC
I provided a fix and verified it on a cluster created from the 4.3.0-0.ci-2019-11-20-022156 image.
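The linked library-go pull 591 "verifies input params of JSONPatchSecret function to prevent npe". A minimal sketch of that kind of defensive guard (the secret type and the simplified signature here are hypothetical stand-ins, not the real library-go code):

```go
package main

import (
	"errors"
	"fmt"
)

// secret is a stand-in for corev1.Secret, reduced to what the sketch needs.
type secret struct {
	Name string
	Data map[string][]byte
}

// jsonPatchSecret illustrates the guard: validate the input and return an
// error for a nil secret instead of dereferencing it and panicking (the
// nil-pointer panic that stalled the operator's sync loop).
func jsonPatchSecret(s *secret) (string, error) {
	if s == nil {
		return "", errors.New("invalid input: nil secret")
	}
	return fmt.Sprintf(`[{"op":"replace","path":"/metadata/name","value":%q}]`, s.Name), nil
}

func main() {
	if _, err := jsonPatchSecret(nil); err != nil {
		fmt.Println("guarded:", err)
	}
	patch, _ := jsonPatchSecret(&secret{Name: "encryption-config-7"})
	fmt.Println(patch)
}
```

With the guard in place a bad input surfaces as a degraded condition the operator can report and retry, rather than a crash that leaves the migration stuck.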

Comment 5 Xingxing Xia 2019-11-28 10:21:39 UTC
Verified this bug in a 4.3.0-0.nightly-2019-11-28-004553 env; it is fixed. BTW, tested OAS-O similarly and found bug 1777776.

Comment 7 errata-xmlrpc 2020-01-23 11:12:18 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0062

