
Bug 1777776

Summary: [MSTR-485] When OAS-O operatorLogLevel set to Debug, its etcd encryption migration is stuck with pod CrashLoopBackOff
Product: OpenShift Container Platform
Component: openshift-apiserver
Version: 4.3.0
Hardware: Unspecified
OS: Unspecified
Status: CLOSED ERRATA
Severity: high
Priority: high
Target Milestone: ---
Target Release: 4.3.0
Reporter: Xingxing Xia <xxia>
Assignee: Lukasz Szaszkiewicz <lszaszki>
QA Contact: Xingxing Xia <xxia>
CC: aos-bugs, mfojtik, yinzhou
Type: Bug
Last Closed: 2020-01-23 11:14:45 UTC

Description Xingxing Xia 2019-11-28 10:20:08 UTC
Description of problem:
When the openshift-apiserver operator (OAS-O) has operatorLogLevel set to Debug, its etcd encryption migration gets stuck and the new apiserver pod goes into CrashLoopBackOff.

Version-Release number of selected component (if applicable):
4.3.0-0.nightly-2019-11-28-004553

How reproducible:
Always

Steps to Reproduce:
Found this issue while verifying bug 1770970:
1. In a terminal, make the following two edits:
[xxia 2019-11-28 17:27:27 my]$ oc edit openshiftapiserver cluster # edit below and save
...
    operatorLogLevel: Debug
...
openshiftapiserver.operator.openshift.io/cluster edited
[xxia 2019-11-28 17:29:51 my]$ oc edit apiserver cluster # edit below and save
...
spec:
  encryption:
    type: aescbc
apiserver.config.openshift.io/cluster edited

2. In another terminal, watch OAS pods:
[xxia 2019-11-28 17:28:23 my]$ oc get po -n openshift-apiserver -l apiserver --show-labels -w
NAME              READY   STATUS     RESTARTS   AGE     LABELS
apiserver-5cswg   0/1     Init:0/1   0          6s      apiserver=true,app=openshift-apiserver,controller-revision-hash=679fc94c77,pod-template-generation=5,revision=0
apiserver-xbcfg   1/1     Running    1          7h18m   apiserver=true,app=openshift-apiserver,controller-revision-hash=6574b8f858,pod-template-generation=4,revision=0
...
apiserver-xbcfg   1/1     Terminating       1          7h18m   apiserver=true,app=openshift-apiserver,controller-revision-hash=6574b8f858,pod-template-generation=4,revision=0
^C[xxia 2019-11-28 17:28:56 my]$

[xxia 2019-11-28 17:31:43 my]$ oc get po -n openshift-apiserver --show-labels -w
NAME              READY   STATUS    RESTARTS   AGE     LABELS
apiserver-5cswg   1/1     Running   0          3m17s   apiserver=true,app=openshift-apiserver,controller-revision-hash=679fc94c77,pod-template-generation=5,revision=0
apiserver-82gzl   1/1     Running   0          3m45s   apiserver=true,app=openshift-apiserver,controller-revision-hash=679fc94c77,pod-template-generation=5,revision=0
apiserver-r6jxg   1/1     Running   0          2m45s   apiserver=true,app=openshift-apiserver,controller-revision-hash=679fc94c77,pod-template-generation=5,revision=0
apiserver-82gzl   1/1     Terminating   0          6m3s    apiserver=true,app=openshift-apiserver,controller-revision-hash=679fc94c77,pod-template-generation=5,revision=0
...
apiserver-lg97t   0/1     Init:0/1      0          0s      apiserver=true,app=openshift-apiserver,controller-revision-hash=fc867857,pod-template-generation=6,revision=0
apiserver-lg97t   0/1     PodInitializing   0          12s     apiserver=true,app=openshift-apiserver,controller-revision-hash=fc867857,pod-template-generation=6,revision=0
apiserver-lg97t   0/1     CrashLoopBackOff   1          15s     apiserver=true,app=openshift-apiserver,controller-revision-hash=fc867857,pod-template-generation=6,revision=0
...
apiserver-lg97t   0/1     CrashLoopBackOff   8          16m     apiserver=true,app=openshift-apiserver,controller-revision-hash=fc867857,pod-template-generation=6,revision=0

3. In another terminal, check the pod:
[xxia 2019-11-28 17:38:16 my]$ oc get po apiserver-lg97t -n openshift-apiserver -o yaml
...
  containerStatuses:
...
        exitCode: 255
        finishedAt: "2019-11-28T09:37:27Z"
        message: |
          Copying system trust bundle
          F1128 09:37:27.044495       1 cmd.go:62] error opening encryption provider configuration file "/var/run/secrets/encryption-config/encryption-config": open /var/run/secrets/encryption-config/encryption-config: no such file or directory
        reason: Error
        startedAt: "2019-11-28T09:37:26Z"
    name: openshift-apiserver
    ready: false
    restartCount: 5
    started: false
    state:
      waiting:
        message: back-off 2m40s restarting failed container=openshift-apiserver pod=apiserver-lg97t_openshift-apiserver(8a7989cf-1855-45c6-a688-d58cfdb5b06f)
        reason: CrashLoopBackOff
...

4. Check the secrets:
[xxia 2019-11-28 17:38:53 my]$ oc get secret -n openshift-apiserver | grep enc # no encryption-config-$REVISION secret
encryption-config                        Opaque                                1      9m
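
Step 1 can also be applied without an interactive editor. The following is a minimal sketch, not part of the original report. It assumes operatorLogLevel sits under spec of the openshiftapiserver resource (the transcript elides the parent key). The OC variable and both function names are hypothetical helpers introduced here so the commands can be previewed with OC=echo before running them against a cluster.

```shell
# OC is a hypothetical override: set OC=echo to print the commands
# instead of running them against a cluster.
OC="${OC:-oc}"

# Assumption: operatorLogLevel lives under .spec of the operator resource.
set_oas_operator_debug() {
  "$OC" patch openshiftapiserver cluster --type=merge \
    -p '{"spec":{"operatorLogLevel":"Debug"}}'
}

# Mirrors the spec.encryption.type: aescbc edit shown in step 1.
enable_aescbc_encryption() {
  "$OC" patch apiserver cluster --type=merge \
    -p '{"spec":{"encryption":{"type":"aescbc"}}}'
}
```

Calling set_oas_operator_debug followed by enable_aescbc_encryption reproduces the order of the two edits in step 1.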

Actual results:
Steps 2 and 3: apiserver-lg97t is in CrashLoopBackOff.
Step 4: There is no encryption-config-$REVISION secret.
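
The CrashLoopBackOff state seen in steps 2 and 3 can be picked out of the pod listing with a small filter. This is only a sketch: crashlooping_pods is a hypothetical helper name, and it assumes the default column order NAME READY STATUS RESTARTS AGE shown in the transcripts.

```shell
# Print the unique names of pods whose STATUS column is CrashLoopBackOff.
# Reads `oc get po` table output on stdin; assumes STATUS is column 3.
crashlooping_pods() {
  awk '$3 == "CrashLoopBackOff" { print $1 }' | sort -u
}

# Example (requires cluster access):
#   oc get po -n openshift-apiserver | crashlooping_pods
```

Deduplicating with sort -u matters when the input comes from a watch (-w), where the same pod is printed once per status change.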

Expected results:
Steps 2 and 3: No pod goes into CrashLoopBackOff.
Step 4: An encryption-config-$REVISION secret should exist.
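
The step 4 expectation can be checked mechanically. The sketch below is an illustration only: has_revisioned_encryption_secret is a hypothetical helper, and it assumes $REVISION is rendered as a plain number at the end of the secret name.

```shell
# Succeed (exit 0) if the secret list on stdin contains a name of the
# form encryption-config-<number>; fail (exit 1) otherwise.
# Assumption: the revision suffix is purely numeric.
has_revisioned_encryption_secret() {
  awk '$1 ~ /^encryption-config-[0-9]+$/ { found = 1 } END { exit !found }'
}

# Example (requires cluster access):
#   oc get secret -n openshift-apiserver | has_revisioned_encryption_secret \
#     && echo "revisioned secret present" || echo "revisioned secret missing"
```

Against the listing in step 4, which contains only the unrevisioned encryption-config secret, the check fails, matching the actual result.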

Comment 2 Xingxing Xia 2019-12-04 09:27:56 UTC
Verified in 4.3.0-0.nightly-2019-12-04-004448; the issue is fixed.

Comment 4 errata-xmlrpc 2020-01-23 11:14:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0062