Description of problem:
kube-apiserver etcd encryption is too slow on a freshly installed env, always taking nearly 15 mins.

Version-Release number of selected component (if applicable):
4.3.0-0.nightly-2019-11-13-004353

How reproducible:
Always

Steps to Reproduce:
1. Install a fresh env and log in with some users. Check the resource counts:
$ oc get secret -A | wc -l
1004
$ oc get cm -A | wc -l
282
$ oc get oauthaccesstoken | wc -l
58
$ oc get route -A | wc -l
10

2. Enable encryption:
$ export PS1='[\u \D{%F %T} \W]\$ '   # add a timestamp to the shell prompt; my local time is CST (UTC+8)
[xxia 2019-11-13 17:40:48 enc-story]$ oc edit apiserver cluster   # add the following under spec, then save and exit
spec:
  encryption:
    type: aescbc
apiserver.config.openshift.io/cluster edited
[xxia 2019-11-13 17:42:02 my]$ oc get secret -n openshift-config-managed | grep enc
# Take the time in the shell prompt as the approximate encryption start time: Time1 = 2019-11-13 17:42:02 (UTC+8)
encryption-config-openshift-apiserver        Opaque   1   30s
encryption-config-openshift-kube-apiserver   Opaque   1   29s
encryption-key-openshift-apiserver-1         Opaque   1   32s
encryption-key-openshift-kube-apiserver-1    Opaque   1   32s

3.
In terminal A, repeatedly check:
[xxia 2019-11-13 17:46:10 enc-story]$ oc get secret -n openshift-config-managed encryption-key-openshift-kube-apiserver-1 encryption-key-openshift-apiserver-1 -o yaml | grep -A 5 annotations | grep encryption.apiserver.operator.openshift.io/migrated
      encryption.apiserver.operator.openshift.io/migrated-resources: '{"resources":[{"Group":"oauth.openshift.io","Resource":"oauthauthorizetokens"},{"Group":"route.openshift.io","Resource":"routes"},{"Group":"oauth.openshift.io","Resource":"oauthaccesstokens"}]}'
      encryption.apiserver.operator.openshift.io/migrated-timestamp: "2019-11-13T09:46:25Z"
[xxia 2019-11-13 17:57:58 enc-story]$ oc get secret -n openshift-config-managed encryption-key-openshift-kube-apiserver-1 encryption-key-openshift-apiserver-1 -o yaml | grep -A 5 annotations | grep encryption.apiserver.operator.openshift.io/migrated
      encryption.apiserver.operator.openshift.io/migrated-resources: '{"resources":[{"Group":"","Resource":"configmaps"}]}'
      encryption.apiserver.operator.openshift.io/migrated-timestamp: "2019-11-13T09:57:33Z"
      encryption.apiserver.operator.openshift.io/migrated-resources: '{"resources":[{"Group":"oauth.openshift.io","Resource":"oauthauthorizetokens"},{"Group":"route.openshift.io","Resource":"routes"},{"Group":"oauth.openshift.io","Resource":"oauthaccesstokens"}]}'
      encryption.apiserver.operator.openshift.io/migrated-timestamp: "2019-11-13T09:46:25Z"

4.
In terminal B, repeatedly check:
[xxia 2019-11-13 17:42:28 my]$ oc get po -n openshift-kube-apiserver -l apiserver --show-labels -w
NAME                                                             READY   STATUS     RESTARTS   AGE     LABELS
kube-apiserver-ip-10-0-137-119.ap-northeast-1.compute.internal   3/3     Running    0          6h34m   apiserver=true,app=openshift-kube-apiserver,revision=6
kube-apiserver-ip-10-0-149-168.ap-northeast-1.compute.internal   0/3     Init:0/1   0          12s     apiserver=true,app=openshift-kube-apiserver,revision=7
kube-apiserver-ip-10-0-170-230.ap-northeast-1.compute.internal   3/3     Running    0          6h32m   apiserver=true,app=openshift-kube-apiserver,revision=6
...
kube-apiserver-ip-10-0-137-119.ap-northeast-1.compute.internal   0/3     Terminating   0   6h39m   apiserver=true,app=openshift-kube-apiserver,revision=6
kube-apiserver-ip-10-0-137-119.ap-northeast-1.compute.internal   0/3     Pending       0   0s      apiserver=true,app=openshift-kube-apiserver,revision=8
kube-apiserver-ip-10-0-137-119.ap-northeast-1.compute.internal   0/3     Init:0/1      0   8s      apiserver=true,app=openshift-kube-apiserver,revision=8
[xxia 2019-11-13 17:48:53 my]$
[xxia 2019-11-13 17:52:55 my]$ oc get po -n openshift-kube-apiserver -l apiserver --show-labels -w
NAME                                                             READY   STATUS    RESTARTS   AGE     LABELS
kube-apiserver-ip-10-0-137-119.ap-northeast-1.compute.internal   3/3     Running   0          5m44s   apiserver=true,app=openshift-kube-apiserver,revision=8
...
kube-apiserver-ip-10-0-170-230.ap-northeast-1.compute.internal   3/3     Running           0   80s   apiserver=true,app=openshift-kube-apiserver,revision=9
...
kube-apiserver-ip-10-0-137-119.ap-northeast-1.compute.internal   0/3     PodInitializing   0   71s   apiserver=true,app=openshift-kube-apiserver,revision=9
kube-apiserver-ip-10-0-137-119.ap-northeast-1.compute.internal   2/3     Running           0   73s   apiserver=true,app=openshift-kube-apiserver,revision=9
kube-apiserver-ip-10-0-137-119.ap-northeast-1.compute.internal   3/3     Running           0   81s   apiserver=true,app=openshift-kube-apiserver,revision=9

Actual results:
3.
For KAS-O encryption, calculate the time gap:
Time1 (see above) = 2019-11-13 17:42:02 (UTC+8) = 2019-11-13T09:42:02Z
Time2 = encryption.apiserver.operator.openshift.io/migrated-timestamp = 2019-11-13T09:57:33Z
The time used is nearly 15 mins, which is too slow given the small resource counts from step 1.
In step 3, for the kube-apiserver secret, the annotations finally show up after repeatedly checking the yaml.
4. The KAS pods always roll through 3 new revisions (from revision 6 to revision 9) before all pods are finally Running.

Expected results:
4. Etcd encryption should not be this slow on a freshly installed env with few resources.

Additional info:
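The Time1 → Time2 gap above can be computed directly from the two UTC timestamps; a minimal sketch assuming GNU date:

```shell
# Time1: approximate encryption start time (shell prompt time, converted to UTC).
# Time2: the migrated-timestamp annotation on encryption-key-openshift-kube-apiserver-1.
t1=$(date -u -d '2019-11-13T09:42:02Z' +%s)
t2=$(date -u -d '2019-11-13T09:57:33Z' +%s)
gap=$((t2 - t1))
echo "migration took $((gap / 60))m$((gap % 60))s"   # prints "migration took 15m31s"
```

This confirms the roughly 15-minute figure for only ~1300 secrets and configmaps.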
Verified in the following ENV:
$ oc version
Client Version: v4.4.0
Server Version: 4.4.0-0.nightly-2020-01-05-221122
Kubernetes Version: v1.17.0

Prepare resources (note the quoted 'EOF' delimiter, so that the variables and command substitutions are written into the script literally instead of being expanded by the current shell):
cat > create-resources.sh << 'EOF'
#!/bin/bash
N1=$1
N2=$2
cd ~/my
for i in `seq -w $N1 $N2`
do
  oc new-project ke-test-proj-$i --skip-config-write
  for j in `seq -w 01 09`
  do
    oc create secret generic mysecret-$j --from-literal abcdefg='12345^&*()' -n ke-test-proj-$i
  done
  for j in `seq -w 1 20`
  do
    sed "s/my/my-$j/" route-test.yaml | oc create -f - -n ke-test-proj-$i
  done
done
EOF
bash create-resources.sh 1 100 &    # may cost quite some time
bash create-resources.sh 101 200 &  # may cost quite some time

cat > watch-migration.sh << 'EOF'
#!/bin/bash
OAS_Y=
KAS_Y=
while true
do
  if [ "$OAS_Y" != "Y" ]; then
    date
    echo "Checking OAS"
    oc get po -n openshift-apiserver -l apiserver --show-labels
    oc get openshiftapiserver cluster -o json | jq -r '.status.conditions[] | select(.type == "EncryptionMigrationControllerProgressing")'
    oc get secret -n openshift-config-managed encryption-key-openshift-apiserver-1 -o json | jq -r '.metadata.annotations'
    if oc get secret -n openshift-config-managed encryption-key-openshift-apiserver-1 -o json | jq -r '.metadata.annotations' | grep "migrated-resources.*routes" | grep oauthaccesstokens | grep oauthauthorizetokens > /dev/null; then
      date
      echo "OAS-O migration completed"
      OAS_Y=Y
    fi
  fi
  if [ "$KAS_Y" != "Y" ]; then
    date
    echo "Checking KAS"
    oc get po -n openshift-kube-apiserver -l apiserver --show-labels
    oc get kubeapiserver cluster -o json | jq -r '.status.conditions[] | select(.type == "EncryptionMigrationControllerProgressing")'
    oc get secret -n openshift-config-managed encryption-key-openshift-kube-apiserver-1 -o json | jq -r '.metadata.annotations'
    if oc get secret -n openshift-config-managed encryption-key-openshift-kube-apiserver-1 -o json | jq -r '.metadata.annotations' | grep "migrated-resources.*secrets" | grep configmaps > /dev/null; then
      date
      echo "KAS-O migration completed"
      KAS_Y=Y
    fi
  fi
  [ "$OAS_Y" == "Y" ] && [ "$KAS_Y" == "Y" ] && break
  sleep 20
  echo "===================="
done
EOF

Then enable etcd encryption, and watch the migration:
bash watch-migration.sh | tee watch-migration.log
Mon Jan  6 17:08:39 CST 2020
Checking OAS
...
Checking KAS
...

Check the time gaps from watch-migration.log:
Mon Jan  6 17:13:23 CST 2020
...
    "lastTransitionTime": "2020-01-06T09:13:18Z",
    "message": "migrating resources to a new write key: [route.openshift.io/routes]",
...
Mon Jan  6 17:15:47 CST 2020
OAS-O migration completed
...
Mon Jan  6 17:20:56 CST 2020
...
    "lastTransitionTime": "2020-01-06T09:20:43Z",
    "message": "migrating resources to a new write key: [core/secrets]",
...
Mon Jan  6 17:23:19 CST 2020
KAS-O migration completed

Get the number of migrated resources:
$ oc get route --no-headers -A | wc -l
4007
$ oc get secret,cm --no-headers -A | wc -l
4912

For KAS, 4912 resources in 2m23s (17:20:56 → 17:23:19); for OAS, 4007 resources in 2m24s (17:13:23 → 17:15:47). The speed is about 1800 resources/min, which is fast, as expected.
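The quoted rate can be sanity-checked with shell arithmetic from the log timestamps above (KAS: 17:20:56 → 17:23:19, i.e. 143 s; OAS: 17:13:23 → 17:15:47, i.e. 144 s); a rough sketch, limited by the 20 s polling interval of the watch script:

```shell
# Resources migrated per minute, using integer arithmetic.
kas=$((4912 * 60 / 143))   # secrets + configmaps
oas=$((4007 * 60 / 144))   # routes (plus the oauth tokens)
echo "KAS: ~${kas}/min, OAS: ~${oas}/min"
```

Both values are in the neighborhood of the ~1800 resources/min figure, versus roughly 80/min implied by the original report (about 1300 secrets and configmaps in 15.5 minutes).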
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0581