Description of problem:
Configuring mTLS on the default IngressController breaks the ingress canary check and the console health check, which in turn puts the ingress and console cluster operators into a degraded state.

OpenShift release version:
OCP-4.9.5

Cluster Platform:
UPI on Baremetal (Disconnected cluster)

How reproducible:
Configure mutual TLS (mTLS) on the default IngressController as described in the documentation (https://docs.openshift.com/container-platform/4.9/networking/ingress-operator.html#nw-mutual-tls-auth_configuring-ingress).

Steps to Reproduce (in detail):
1. Create a config map in the openshift-config namespace.
2. Edit the IngressController resource in the openshift-ingress-operator project.
3. Add the spec.clientTLS field and subfields to configure mutual TLS:
~~~
apiVersion: operator.openshift.io/v1
kind: IngressController
metadata:
  name: default
  namespace: openshift-ingress-operator
spec:
  clientTLS:
    clientCertificatePolicy: Required
    clientCA:
      name: router-ca-certs-default
    allowedSubjectPatterns:
    - "^/CN=example.com/ST=NC/C=US/O=Security/OU=OpenShift$"
~~~

Actual results:
Setting up mTLS using the documented steps breaks the canary and console health checks. Because clientCertificatePolicy is set to Required, a client certificate is demanded on every connection; the health checks do not present one, so they fail, and the Ingress and Console operators go into a degraded state.

Expected results:
mTLS setup should work properly without degrading the Ingress and Console operators.

Impact of the problem:
Unstable cluster, with the Ingress and Console operators in a Degraded state.

Additional info:
The following error messages are included for reference:

The "default" ingress controller reports Degraded=True: DegradedConditions: One or more other status conditions indicate a degraded state: CanaryChecksSucceeding=False (CanaryChecksRepetitiveFailures: Canary route checks for the default ingress controller are failing)

// Canary checks failing because a TLS client certificate is required:
2021-11-19T17:17:58.237Z ERROR operator.canary_controller wait/wait.go:155 error performing canary route check {"error": "error sending canary HTTP request to \"canary-openshift-ingress-canary.apps.bruce.openshift.local\": Get \"https://canary-openshift-ingress-canary.apps.bruce.openshift.local\": remote error: tls: certificate required"}

// Console operator:
RouteHealthDegraded: failed to GET route (https://console-openshift-console.apps.bruce.openshift.local): Get "https://console-openshift-console.apps.bruce.openshift.local": remote error: tls: certificate required
Using the OpenShift documentation and creating a configmap for the clientCA (i.e., "router-ca-certs-default") specified here:

    apiVersion: operator.openshift.io/v1
    kind: IngressController
    metadata:
      name: default
      namespace: openshift-ingress-operator
    spec:
      clientTLS:
        clientCertificatePolicy: Required
        clientCA:
          name: router-ca-certs-default

I see the same issue; both console and ingress go degraded. Excerpt of the cluster operator status (columns are NAME, VERSION, AVAILABLE, PROGRESSING, DEGRADED, SINCE, MESSAGE):

    console                  4.9.0-0.nightly-2021-12-01-185844  False  False  False  22m    RouteHealthAvailable: failed to GET route (https://console-openshift-console.apps.ci-ln-l2gfksk-72292.origin-ci-int-gce.dev.rhcloud.com): Get "https://console-openshift-console.apps.ci-ln-l2gfksk-72292.origin-ci-int-gce.dev.rhcloud.com": remote error: tls: certificate required
    csi-snapshot-controller  4.9.0-0.nightly-2021-12-01-185844  True   False  False  94m
    dns                      4.9.0-0.nightly-2021-12-01-185844  True   False  False  93m
    etcd                     4.9.0-0.nightly-2021-12-01-185844  True   False  False  93m
    image-registry           4.9.0-0.nightly-2021-12-01-185844  True   False  False  88m
    ingress                  4.9.0-0.nightly-2021-12-01-185844  True   False  True   4m13s  The "default" ingress controller reports Degraded=True: DegradedConditions: One or more other status conditions indicate a degraded state: CanaryChecksSucceeding=False (CanaryChecksRepetitiveFailures: Canary route checks for the default ingress controller are failing)
    insights                 4.9.0-0.nightly-2021-12-01-185844  True   False  False  87m

If I revert the change so that clientTLS has:

    spec:
      clientTLS:
        clientCertificatePolicy: ""
        clientCA:
          name: ""

then the console and ingress are no longer degraded.

Marking this as blocker+ and we will investigate the must-gather to see if this is a configuration issue.
Setting blocker- as this is not a regression or upgrade issue but rather a caveat in certain configurations involving a new but already shipped feature.

We can make this configuration work by doing the following:
* Add an additional canary route that uses passthrough, and use this route when the default ingresscontroller requires client certificates.
* Add logic in the console operator's health check to report healthy if the health probe gets a "tls: certificate required" error.
Created attachment 1847178
KCS link: https://access.redhat.com/solutions/6551251
Moving off of 4.10.0. We'll work on this in the next release. Meanwhile, users should not configure the default ingresscontroller to require client certificates.
Hello Team, Can we get some traction on this, please? The customer is looking for an update. Can you please tell us which release this fix is targeted for? Regards, Nirupma
Hi Team, The customer mentioned that this bug has been open for over a year and wants to know whether there is a timeline for when it will be fixed.
OpenShift has moved to Jira for its defect tracking! This bug can now be found in the OCPBUGS project in Jira. https://issues.redhat.com/browse/OCPBUGS-9037