Bug 1887441 - ingress misconfiguration may break authentication but ingress operator keeps reporting "degraded: False"
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Routing
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.7.0
Assignee: Stephen Greene
QA Contact: Arvind iyengar
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-10-12 13:36 UTC by Standa Laznicka
Modified: 2021-02-24 15:25 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: An Ingress Controller's Spec.DefaultCertificate field was set to reference a secret that does not exist.
Consequence: The operator-generated default certificate for that ingress controller was deleted.
Fix: The ingress operator now verifies that the secret referenced by an Ingress Controller's Spec.DefaultCertificate exists, if one is specified, before deleting the operator-generated default certificate.
Result: The operator-generated default certificate for an ingress controller is no longer deleted when the Ingress Controller's Spec.DefaultCertificate is updated to point to a secret that does not exist.
Clone Of:
Environment:
Last Closed: 2021-02-24 15:24:44 UTC
Target Upstream Version:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-ingress-operator pull 505 | 0 | None | closed | Bug 1887441: Conditionally delete generated default cert | 2021-02-15 16:05:17 UTC
Red Hat Product Errata | RHSA-2020:5633 | 0 | None | None | None | 2021-02-24 15:25:28 UTC

Description Standa Laznicka 2020-10-12 13:36:55 UTC
Description of problem:
Configuring `defaultCertificate` to refer to a non-existent secret causes the router-certs secret to be wiped of its data, which breaks the authentication operator.

Version-Release number of selected component (if applicable):
master

How reproducible:
100%

Steps to Reproduce:
1. oc edit ingresscontroller -n openshift-ingress-operator default
2. set defaultCertificate.name to something you're sure does not exist
3. watch oc get co ingress

Actual results:
ingress operator never goes degraded

Expected results:
ingress operator reports degraded because it is unable to find its input resource
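
The reproduction steps above can also be run non-interactively. This is a minimal sketch, assuming `oc patch` with a JSON merge patch in place of the interactive `oc edit`. The helper names and the secret name `no-such-secret` are hypothetical and not part of the original report:

```shell
# Build a JSON merge patch that points spec.defaultCertificate at the given secret name.
build_patch() {
  printf '{"spec":{"defaultCertificate":{"name":"%s"}}}' "$1"
}

# Apply the patch to the default ingress controller, then show the operator status.
# Assumes a logged-in `oc` session with sufficient privileges.
reproduce() {
  oc -n openshift-ingress-operator patch ingresscontroller default \
    --type=merge -p "$(build_patch "$1")"
  oc get co ingress
}

# Usage (hypothetical name of a secret that must not exist):
#   reproduce no-such-secret
```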

Comment 1 Stephen Greene 2020-10-23 14:08:17 UTC
Adding upcoming sprint.

Comment 2 Stephen Greene 2020-11-13 17:49:07 UTC
I looked through the ingress controller status code and have a better idea of what a fix for this looks like. Adding upcoming sprint.

Comment 4 Stephen Greene 2020-12-07 19:44:26 UTC
Note that, as is, the ingress controller would go degraded after a period of 60 minutes in this scenario.

https://github.com/openshift/cluster-ingress-operator/blob/master/pkg/operator/controller/ingress/status.go#L440-L442

Creating a PR to address the main problem: the router-certs secret is deleted when the secret referenced by the user-specified Spec.DefaultCertificate in the default ingress controller does not exist.
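
The kind of existence check described here can be illustrated from the client side. This is only a sketch, not the operator's implementation. It assumes the referenced secret is expected in the `openshift-ingress` namespace, and the function names are hypothetical:

```shell
# Return success only if the named secret exists in the openshift-ingress namespace.
secret_exists() {
  oc -n openshift-ingress get secret "$1" >/dev/null 2>&1
}

# Refuse to point defaultCertificate at a secret that cannot be found.
set_default_certificate() {
  if ! secret_exists "$1"; then
    echo "secret $1 not found in openshift-ingress; not patching" >&2
    return 1
  fi
  oc -n openshift-ingress-operator patch ingresscontroller default \
    --type=merge -p "{\"spec\":{\"defaultCertificate\":{\"name\":\"$1\"}}}"
}
```

Checking before patching avoids the misconfiguration in the first place, while the operator-side fix covers the case where the field is set to a missing secret anyway.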

Comment 5 Arvind iyengar 2020-12-23 05:07:33 UTC
Tested in the "4.7.0-0.ci.test-2020-12-23-033722-ci-ln-2lsq5dt" release. With this payload, and with reference to C#4, the default certificate remains available when the router is misconfigured to reference a certificate secret that does not exist:
------
$ oc get clusterversion
NAME      VERSION                                           AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.7.0-0.ci.test-2020-12-23-033722-ci-ln-2lsq5dt   True        False         49m     Cluster version is 4.7.0-0.ci.test-2020-12-23-033722-ci-ln-2lsq5dt

$ oc -n openshift-ingress get secret
NAME                           TYPE                                  DATA   AGE
builder-dockercfg-7ts8x        kubernetes.io/dockercfg               1      56m
builder-token-7z7rx            kubernetes.io/service-account-token   4      56m
builder-token-mvz2h            kubernetes.io/service-account-token   4      56m
default-dockercfg-t7kvj        kubernetes.io/dockercfg               1      56m
default-token-mddzf            kubernetes.io/service-account-token   4      56m
default-token-w67cs            kubernetes.io/service-account-token   4      56m
deployer-dockercfg-b4sjn       kubernetes.io/dockercfg               1      56m
deployer-token-4zpcc           kubernetes.io/service-account-token   4      56m
deployer-token-5hs8g           kubernetes.io/service-account-token   4      56m
router-certs-default           kubernetes.io/tls                     2      56m
router-dockercfg-nbvck         kubernetes.io/dockercfg               1      56m
router-metrics-certs-default   kubernetes.io/tls                     2      55m
router-stats-default           Opaque                                2      56m
router-token-dfh72             kubernetes.io/service-account-token   4      56m
router-token-svglw             kubernetes.io/service-account-token   4      56m

$ oc -n openshift-ingress get secret router-certs-default
NAME                   TYPE                DATA   AGE
router-certs-default   kubernetes.io/tls   2      56m


After adding a reference to a non-existent cert secret:
~~
  defaultCertificate:
    name: router-certs-test
~~

$ oc -n openshift-ingress-operator edit ingresscontroller default 
ingresscontroller.operator.openshift.io/default edited

$ oc -n openshift-ingress get pods -o wide
NAME                              READY   STATUS              RESTARTS   AGE   IP            NODE                                       NOMINATED NODE   READINESS GATES
router-default-649b9cb8cb-cbgns   0/1     ContainerCreating   0          11s   <none>        ci-ln-2lsq5dt-f76d1-7gb45-worker-c-9vxjg   <none>           <none>
router-default-649b9cb8cb-m57j9   0/1     ContainerCreating   0          11s   <none>        ci-ln-2lsq5dt-f76d1-7gb45-worker-d-l76tl   <none>           <none>
router-default-76b758ff8b-fg6v4   1/1     Terminating         0          57m   10.128.2.10   ci-ln-2lsq5dt-f76d1-7gb45-worker-d-l76tl   <none>           <none>
router-default-76b758ff8b-n7dzc   1/1     Running             0          57m   10.131.0.20   ci-ln-2lsq5dt-f76d1-7gb45-worker-c-9vxjg   <none>           <none>

$ oc -n openshift-ingress get secret
NAME                           TYPE                                  DATA   AGE
builder-dockercfg-7ts8x        kubernetes.io/dockercfg               1      57m
builder-token-7z7rx            kubernetes.io/service-account-token   4      57m
builder-token-mvz2h            kubernetes.io/service-account-token   4      57m
default-dockercfg-t7kvj        kubernetes.io/dockercfg               1      57m
default-token-mddzf            kubernetes.io/service-account-token   4      57m
default-token-w67cs            kubernetes.io/service-account-token   4      57m
deployer-dockercfg-b4sjn       kubernetes.io/dockercfg               1      57m
deployer-token-4zpcc           kubernetes.io/service-account-token   4      57m
deployer-token-5hs8g           kubernetes.io/service-account-token   4      57m
router-certs-default           kubernetes.io/tls                     2      57m
router-dockercfg-nbvck         kubernetes.io/dockercfg               1      57m
router-metrics-certs-default   kubernetes.io/tls                     2      57m
router-stats-default           Opaque                                2      57m
router-token-dfh72             kubernetes.io/service-account-token   4      57m
router-token-svglw             kubernetes.io/service-account-token   4      57m

$ oc -n openshift-ingress get secret router-certs-default
NAME                   TYPE                DATA   AGE
router-certs-default   kubernetes.io/tls   2      57m
------
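
The manual check above can be scripted. This is a sketch, assuming the `router-certs-default` secret carries the standard `tls.crt` and `tls.key` keys of a `kubernetes.io/tls` secret. The function name is hypothetical:

```shell
# Succeed only if both TLS keys of router-certs-default hold non-empty data.
router_cert_intact() {
  crt="$(oc -n openshift-ingress get secret router-certs-default \
    -o jsonpath='{.data.tls\.crt}')"
  key="$(oc -n openshift-ingress get secret router-certs-default \
    -o jsonpath='{.data.tls\.key}')"
  [ -n "$crt" ] && [ -n "$key" ]
}

# Usage:
#   router_cert_intact && echo "default certificate still present"
```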

Comment 9 errata-xmlrpc 2021-02-24 15:24:44 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633

