Bug 1982398 - OCP 4.9 etcd-encryption leads to constantly progressing kube-apiserver
Summary: OCP 4.9 etcd-encryption leads to constantly progressing kube-apiserver
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: kube-apiserver
Version: 4.9
Hardware: s390x
OS: Unspecified
Priority: low
Severity: low
Target Milestone: ---
Target Release: ---
Assignee: Lukasz Szaszkiewicz
QA Contact: Ke Wang
URL:
Whiteboard:
Depends On:
Blocks: ocp-49-z-tracker
Reported: 2021-07-14 18:51 UTC by Tom Dale
Modified: 2022-07-02 16:08 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-04-20 16:47:18 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Github openshift cluster-authentication-operator pull 466 0 None closed encryption condition controller doesn't reset previously set condition 2021-07-30 07:33:00 UTC
Github openshift cluster-kube-apiserver-operator pull 1178 0 None closed encryption condition controller doesn't reset previously set condition 2021-07-30 07:28:19 UTC
Github openshift cluster-openshift-apiserver-operator pull 460 0 None closed clear encryption conditions when there is no work to be done 2021-07-30 07:33:01 UTC
Github openshift cluster-openshift-apiserver-operator pull 462 0 None closed encryption condition controller doesn't reset previously set conditon 2021-07-30 07:32:54 UTC

Description Tom Dale 2021-07-14 18:51:30 UTC
Description of problem: 
After encrypting etcd in OpenShift 4.8 on IBM Z, the kube-apiserver operator is constantly progressing and no encryption status is reported.


Version-Release number of selected component (if applicable):
Server Version: 4.9.0-0.nightly-s390x-2021-07-14-151720


How reproducible:
Every time (Tried twice)

Steps to Reproduce:
1. Install a new OpenShift 4.8 cluster on z/VM
2. Run `oc patch apiserver cluster --type='merge' --patch '{ "spec": { "encryption": { "type": "aescbc" } } }'`
3. Run `oc get openshiftapiserver -o=jsonpath='{range .items[0].status.conditions[?(@.type=="Encrypted")]}{.reason}{"\n"}{.message}{"\n"}'` to get the encryption status. None is present.
4. Observe the kube-apiserver cluster operator: it stays constantly in the NodeInstallerProgressing state.
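
Step 4 can also be checked from the command line. The following is a minimal sketch, assuming the standard `Progressing` condition on the ClusterOperator object (the same condition shown under "Additional info"). The function name `progressing_status` is hypothetical and not part of `oc`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the status ("True"/"False") and message of the
# Progressing condition of a cluster operator (defaults to kube-apiserver).
progressing_status() {
    local operator="${1:-kube-apiserver}"
    oc get clusteroperator "$operator" \
        -o jsonpath='{range .status.conditions[?(@.type=="Progressing")]}{.status}{"\t"}{.message}{"\n"}{end}'
}

# Example (commented out): sample the condition three times, 30 seconds apart.
# for i in 1 2 3; do progressing_status; sleep 30; done
```

If the status stays "True" across many samples long after the encryption patch was applied, that matches the behavior described in this report.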

Expected results:
I would expect to see the encryption status in the output of `oc get openshiftapiserver`.

Additional info:


must-gather logs -> https://drive.google.com/file/d/18NN7PgOo-bl9q0LKmW0Z2aQTVfnPgbmt/view?usp=sharing

-> % oc get --raw=/healthz/etcd
ok
-> % oc get apiserver -o yaml 
...
spec:
    audit:
      profile: Default
    encryption:
      type: aescbc
...
-> % oc get co kube-apiserver -o yaml
spec: {}
status:
  conditions:
  - lastTransitionTime: "2021-07-14T17:31:47Z"
    message: 'NodeControllerDegraded: All master nodes are ready'
    reason: AsExpected
    status: "False"
    type: Degraded
  - lastTransitionTime: "2021-07-14T18:27:05Z"
    message: 'NodeInstallerProgressing: 3 nodes are at revision 10'
    reason: AsExpected
    status: "False"
    type: Progressing
  - lastTransitionTime: "2021-07-14T17:30:23Z"
    message: 'StaticPodsAvailable: 3 nodes are active; 3 nodes are at revision 10'
    reason: AsExpected
    status: "True"
    type: Available
  - lastTransitionTime: "2021-07-14T17:24:43Z"
    message: All is well
    reason: AsExpected
    status: "True"
    type: Upgradeable
  extension: null
  relatedObjects:
  - group: operator.openshift.io
    name: cluster
    resource: kubeapiservers
  - group: apiextensions.k8s.io
    name: ""
    resource: customresourcedefinitions
  - group: security.openshift.io
    name: ""
    resource: securitycontextconstraints
  - group: ""
    name: openshift-config
    resource: namespaces
  - group: ""
    name: openshift-config-managed
    resource: namespaces
  - group: ""
    name: openshift-kube-apiserver-operator
    resource: namespaces
  - group: ""
    name: openshift-kube-apiserver
    resource: namespaces
  - group: admissionregistration.k8s.io
    name: ""
    resource: mutatingwebhookconfigurations
  - group: admissionregistration.k8s.io
    name: ""
    resource: validatingwebhookconfigurations
  - group: controlplane.operator.openshift.io
    name: ""
    namespace: openshift-kube-apiserver
    resource: podnetworkconnectivitychecks
  - group: apiserver.openshift.io
    name: ""
    resource: apirequestcounts

Comment 1 Tom Dale 2021-07-14 18:52:48 UTC
Mistyped "4.8", to clarify this is all on a 4.9 nightly build.

Comment 2 Lukasz Szaszkiewicz 2021-07-15 12:15:48 UTC
Thanks for reporting. I have already opened https://github.com/openshift/library-go/pull/1136 to address the issue.

Comment 3 Ke Wang 2021-07-30 09:50:59 UTC
Verification steps:

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.9.0-0.nightly-2021-07-29-103526   True        False         70m     Cluster version is 4.9.0-0.nightly-2021-07-29-103526

- Enable etcd encryption and check the results with the following script:

#!/usr/bin/env bash
> encryption.result
oc patch apiserver cluster --type='merge' --patch '{ "spec": { "encryption": { "type": "aescbc" } } }'

while true
do
    oc get openshiftapiserver -o=jsonpath='{range .items[0].status.conditions[?(@.type=="Encrypted")]}{.reason}{"\n"}{.message}{"\n"}' >> encryption.result
    sleep 10
done

Output written to encryption.result:
EncryptionInProgress
Resource routes.route.openshift.io is not encrypted
...
EncryptionInProgress
Resource routes.route.openshift.io is not encrypted
 
 
EncryptionInProgress
Resource routes.route.openshift.io is being encrypted
 
 
EncryptionInProgress
Resource routes.route.openshift.io is being encrypted

The status doesn't flip to an empty message.
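
The polling loop above runs until it is interrupted. As a hedged alternative, the sketch below polls the same `Encrypted` condition but stops once a given reason is seen or an attempt budget is used up. The function name `wait_for_encryption_reason` and its defaults (60 attempts, 10 seconds apart) are assumptions chosen for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: poll the Encrypted condition of the openshiftapiserver
# resource until its reason equals the expected value, or give up.
# Usage: wait_for_encryption_reason <reason> [attempts] [interval_seconds]
wait_for_encryption_reason() {
    local expected="$1"
    local attempts="${2:-60}"
    local interval="${3:-10}"
    local reason
    local i=1
    while [ "$i" -le "$attempts" ]; do
        reason="$(oc get openshiftapiserver \
            -o=jsonpath='{.items[0].status.conditions[?(@.type=="Encrypted")].reason}')"
        echo "attempt ${i}: reason=${reason:-<none>}"
        if [ "$reason" = "$expected" ]; then
            return 0    # expected reason observed
        fi
        sleep "$interval"
        i=$((i + 1))
    done
    return 1            # attempt budget exhausted
}
```

For example, `wait_for_encryption_reason EncryptionCompleted 90 10` would wait up to roughly 15 minutes for encryption to finish. The same function could be called with `DecryptionCompleted` after switching the encryption type back to `identity`.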


- Switch the encryption type back to identity (decryption) and check the results with the following script:
#!/usr/bin/env bash
> decryption.result
oc patch apiserver/cluster -p '{"spec":{"encryption": {"type":"identity"}}}' --type merge
while true
do
    oc get openshiftapiserver -o=jsonpath='{range .items[0].status.conditions[?(@.type=="Encrypted")]}{.reason}{"\n"}{.message}{"\n"}' >> decryption.result
    sleep 10
done

Output written to decryption.result:
EncryptionCompleted
All resources encrypted: routes.route.openshift.io
 
 
EncryptionCompleted
All resources encrypted: routes.route.openshift.io
 
 
DecryptionInProgress
Encryption mode set to identity and decryption is not finished
 
 
DecryptionInProgress
Encryption mode set to identity and decryption is not finished
...
DecryptionCompleted
Encryption mode set to identity and everything is decrypted
 
 
DecryptionCompleted
Encryption mode set to identity and everything is decrypted

The status doesn't flip to an empty message.
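
Long result files like the ones above are easier to review as a sequence of reasons. The sketch below collapses a results file into the ordered list of reasons and how many consecutive samples showed each one. It assumes that reasons are single tokens (such as `DecryptionInProgress`) and that messages always contain spaces, as in the output above. The function name `reason_transitions` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the sequence of condition reasons recorded in a
# results file, with the number of consecutive samples for each reason.
# Lines with exactly one whitespace-separated token are treated as reasons;
# message lines and blank lines are skipped.
reason_transitions() {
    awk 'NF == 1 { print $1 }' "$1" | uniq -c | awk '{ print $2 " x" $1 }'
}

# Example (commented out):
# reason_transitions decryption.result
```

Each output line has the form `<reason> x<count>`, so a run that moves cleanly from one reason to the next yields one line per phase.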

Based on the above results, the PR fixes the bug, so I am moving the bug to VERIFIED.

Comment 4 Tom Dale 2021-07-30 13:55:19 UTC
Thanks! Working for me as well now on 4.9.0-0.nightly-s390x-2021-07-29-103644

