Bug 1772759

Summary: console operator panic when RouterCAValidationDegraded
Product: OpenShift Container Platform Reporter: Yadan Pei <yapei>
Component: Management Console Assignee: bpeterse
Status: CLOSED ERRATA QA Contact: Yadan Pei <yapei>
Severity: high Docs Contact:
Priority: medium    
Version: 4.3.0 CC: aos-bugs, jokerman, spadgett, yapei
Target Milestone: ---   
Target Release: 4.4.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1778930 (view as bug list) Environment:
Last Closed: 2020-05-04 11:15:35 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1778930    

Description Yadan Pei 2019-11-15 05:57:02 UTC
Description of problem:
When a RouterCAValidationDegraded failure is detected with the error `FailedGet router-ca configmap not found`, the console operator panics.

Version-Release number of selected component (if applicable):
4.3.0-0.nightly-2019-11-13-233341

How reproducible:
Always

Steps to Reproduce:
1. Generate a CA and certificate (for testing, if you do not already have a CA and certificate):
    BASE_DOMAIN="$(oc get dns.config/cluster -o 'jsonpath={.spec.baseDomain}')"
    INGRESS_DOMAIN="$(oc get ingress.config/cluster -o 'jsonpath={.spec.domain}')"
    openssl genrsa -out example-ca.key 2048
    openssl req -x509 -new -key example-ca.key -out example-ca.crt -days 1 -subj "/C=US/ST=NC/L=Chocowinity/O=OS3/OU=Eng/CN=$BASE_DOMAIN"
    openssl genrsa -out example.key 2048
    openssl req -new -key example.key -out example.csr -subj "/C=US/ST=NC/L=Chocowinity/O=OS3/OU=Eng/CN=*.$INGRESS_DOMAIN"
    openssl x509 -req -in example.csr -CA example-ca.crt -CAkey example-ca.key -CAcreateserial -out example.crt -days 1
2. Patch the ingress router to use the custom certificate:
$ oc -n openshift-ingress create secret tls custom-default-cert --cert=example.crt --key=example.key
secret/custom-default-cert created
$ oc -n openshift-ingress-operator patch ingresscontrollers/default --type=merge --patch='{"spec":{"defaultCertificate":{"name":"custom-default-cert"}}}'
ingresscontroller.operator.openshift.io/default patched
3. After the ingress router is patched with the custom certificate, visit the console; it reports an x509 error.
4. Check the console operator log:
$ oc logs -f console-operator-55d4b4455f-hnbvn -n openshift-console-operator

Actual results:
4. The console operator panics:
E1115 03:34:52.872929       1 status.go:73] RouterCAValidationDegraded FailedGet router-ca configmap not found
E1115 03:34:52.885804       1 runtime.go:78] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
goroutine 549 [running]:
github.com/openshift/console-operator/vendor/k8s.io/apimachinery/pkg/util/runtime.logPanic(0x1e0fe00, 0x3cdf310)
    /go/src/github.com/openshift/console-operator/_output/local/go/src/github.com/openshift/console-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:74 +0xa3
github.com/openshift/console-operator/vendor/k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
    /go/src/github.com/openshift/console-operator/_output/local/go/src/github.com/openshift/console-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:48 +0x82
panic(0x1e0fe00, 0x3cdf310)
    /opt/rh/go-toolset-1.12/root/usr/lib/go-toolset-1.12-golang/src/runtime/panic.go:522 +0x1b5
github.com/openshift/console-operator/vendor/k8s.io/apimachinery/pkg/apis/meta/v1.(*ObjectMeta).GetResourceVersion(...)
    /go/src/github.com/openshift/console-operator/_output/local/go/src/github.com/openshift/console-operator/vendor/k8s.io/apimachinery/pkg/apis/meta/v1/meta.go:145
github.com/openshift/console-operator/pkg/console/subresource/deployment.DefaultDeployment(0xc00049c000, 0xc000cc37c0, 0xc000cc3b80, 0x0, 0xc000b1eb40, 0xc000b1f540, 0xc000d06e00, 0xc000e2b1e0, 0x0, 0x2)
    /go/src/github.com/openshift/console-operator/_output/local/go/src/github.com/openshift/console-operator/pkg/console/subresource/deployment/deployment.go:73 +0x1b4
github.com/openshift/console-operator/pkg/console/operator.(*consoleOperator).SyncDeployment(0xc0003e8000, 0xc00049c000, 0xc000cc37c0, 0xc000cc3b80, 0x0, 0xc000b1eb40, 0xc000b1f540, 0xc000d06e00, 0xc000e2b1e0, 0x0, ...)
    /go/src/github.com/openshift/console-operator/_output/local/go/src/github.com/openshift/console-operator/pkg/console/operator/sync_v400.go:228 +0xa7
github.com/openshift/console-operator/pkg/console/operator.(*consoleOperator).sync_v400(0xc0003e8000, 0xc000508b40, 0xc0005f8dc0, 0xc00049c000, 0xc000e2b040, 0xc000e2b1e0, 0x0, 0x0)
    /go/src/github.com/openshift/console-operator/_output/local/go/src/github.com/openshift/console-operator/pkg/console/operator/sync_v400.go:100 +0x664
github.com/openshift/console-operator/pkg/console/operator.(*consoleOperator).handleSync(0xc0003e8000, 0xc0005f8dc0, 0xc00049c000, 0xc000e2b040, 0xc000e2b1e0, 0x0, 0x0)
    /go/src/github.com/openshift/console-operator/_output/local/go/src/github.com/openshift/console-operator/pkg/console/operator/operator.go:239 +0x516
github.com/openshift/console-operator/pkg/console/operator.(*consoleOperator).Sync(0xc0003e8000, 0x2563580, 0xc00049c000, 0x0, 0x0)
    /go/src/github.com/openshift/console-operator/_output/local/go/src/github.com/openshift/console-operator/pkg/console/operator/operator.go:206 +0x696
github.com/openshift/console-operator/vendor/monis.app/go/openshift/controller.(*controller).handleSync(0xc0002937a0, 0x20e1b1a, 0x7, 0x20e1b1a, 0x7, 0xc0000a4a80, 0x1)
    /go/src/github.com/openshift/console-operator/_output/local/go/src/github.com/openshift/console-operator/vendor/monis.app/go/openshift/controller/controller.go:118 +0x11c
github.com/openshift/console-operator/vendor/monis.app/go/openshift/controller.(*controller).processNextWorkItem(0xc0002937a0, 0xc000600700)
    /go/src/github.com/openshift/console-operator/_output/local/go/src/github.com/openshift/console-operator/vendor/monis.app/go/openshift/controller/controller.go:104 +0x18d
github.com/openshift/console-operator/vendor/monis.app/go/openshift/controller.(*controller).runWorker(0xc0002937a0)
    /go/src/github.com/openshift/console-operator/_output/local/go/src/github.com/openshift/console-operator/vendor/monis.app/go/openshift/controller/controller.go:91 +0x2b
github.com/openshift/console-operator/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc0001a20a0)
    /go/src/github.com/openshift/console-operator/_output/local/go/src/github.com/openshift/console-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152 +0x54
github.com/openshift/console-operator/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc0001a20a0, 0x3b9aca00, 0x0, 0x1, 0xc000444780)
    /go/src/github.com/openshift/console-operator/_output/local/go/src/github.com/openshift/console-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153 +0xf8
github.com/openshift/console-operator/vendor/k8s.io/apimachinery/pkg/util/wait.Until(0xc0001a20a0, 0x3b9aca00, 0xc000444780)
    /go/src/github.com/openshift/console-operator/_output/local/go/src/github.com/openshift/console-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88 +0x4d
created by github.com/openshift/console-operator/vendor/monis.app/go/openshift/controller.(*controller).Run
    /go/src/github.com/openshift/console-operator/_output/local/go/src/github.com/openshift/console-operator/vendor/monis.app/go/openshift/controller/controller.go:66 +0x267
^C


Expected results:
4. Even if cm/router-ca is not found, the console operator should not panic.
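
For illustration only, here is a minimal Go sketch of the kind of nil guard that would avoid this class of panic. The trace above suggests that GetResourceVersion was reached through a nil pointer inside DefaultDeployment. The types below are simplified, hypothetical stand-ins for the Kubernetes API types, and resourceVersionOf is a made-up helper; this is not the console-operator code and not necessarily how the bug was resolved.

```go
package main

import "fmt"

// ObjectMeta and ConfigMap are simplified, hypothetical stand-ins for the
// Kubernetes API types; only the field needed for this sketch is modeled.
type ObjectMeta struct {
	ResourceVersion string
}

type ConfigMap struct {
	ObjectMeta
}

// GetResourceVersion has the same shape as the accessor named in the trace.
// Reaching it through a nil *ConfigMap dereferences nil and panics.
func (m *ObjectMeta) GetResourceVersion() string {
	return m.ResourceVersion
}

// resourceVersionOf is a hypothetical helper: it returns an empty string
// for a missing (nil) config map instead of dereferencing it.
func resourceVersionOf(cm *ConfigMap) string {
	if cm == nil {
		return ""
	}
	return cm.GetResourceVersion()
}

func main() {
	// A nil pointer models the config map that could not be fetched.
	var missing *ConfigMap
	fmt.Printf("missing: %q\n", resourceVersionOf(missing))

	present := &ConfigMap{ObjectMeta: ObjectMeta{ResourceVersion: "12345"}}
	fmt.Printf("present: %q\n", resourceVersionOf(present))
}
```

Calling missing.GetResourceVersion() directly, without the guard, would panic with the same "invalid memory address or nil pointer dereference" runtime error seen in the log.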

Additional info:

Comment 1 Samuel Padgett 2019-11-19 14:39:24 UTC
We should revert the change in https://github.com/openshift/console-operator/pull/328. The router-ca won't work for the case we are trying to handle. It's only present when using system-generated certificates, which don't cause a problem. See https://bugzilla.redhat.com/show_bug.cgi?id=1772775#c1

Comment 2 bpeterse 2019-11-26 20:38:23 UTC
Note that if we revert it, we will have to add a commit that updates starter.go: RemoveStaleConditionsController() will need RouterCAValidationDegraded and RouterCAValidationProgressing added. Otherwise we could have clusters running with wedged operators.
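
As a rough sketch of the idea behind removing stale conditions (not the actual RemoveStaleConditionsController implementation, whose signature is not shown in this report), the following Go example filters named condition types out of a status list. The Condition type, the removeStaleConditions function, and the "DeploymentAvailable" condition name are all hypothetical and exist only for this illustration.

```go
package main

import "fmt"

// Condition is a hypothetical, simplified operator status condition.
type Condition struct {
	Type   string
	Status string
}

// removeStaleConditions returns the conditions whose Type is not listed in
// stale, preserving the original order. The input slice is not modified.
func removeStaleConditions(conds []Condition, stale []string) []Condition {
	staleSet := make(map[string]struct{}, len(stale))
	for _, t := range stale {
		staleSet[t] = struct{}{}
	}
	kept := make([]Condition, 0, len(conds))
	for _, c := range conds {
		if _, isStale := staleSet[c.Type]; !isStale {
			kept = append(kept, c)
		}
	}
	return kept
}

func main() {
	conds := []Condition{
		{Type: "RouterCAValidationDegraded", Status: "True"},
		{Type: "RouterCAValidationProgressing", Status: "False"},
		{Type: "DeploymentAvailable", Status: "True"}, // hypothetical name
	}
	// The two condition types that would no longer be set after the revert.
	stale := []string{"RouterCAValidationDegraded", "RouterCAValidationProgressing"}

	for _, c := range removeStaleConditions(conds, stale) {
		fmt.Println(c.Type, c.Status)
	}
}
```

Without a cleanup step of this kind, a RouterCAValidationDegraded condition left at True by the reverted code would never be updated again, which is the wedged-operator scenario described above.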

Comment 4 Yadan Pei 2019-12-16 09:17:16 UTC
Using the same steps as above, the console operator no longer panics.


Verified on 4.4.0-0.nightly-2019-12-15-184910

Comment 6 errata-xmlrpc 2020-05-04 11:15:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0581