Bug 1723400 - EXTENDED_VALIDATION doesn't capture certificate / key mismatch, causing the router to misbehave
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Routing
Version: 3.9.0
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 4.3.0
Assignee: Dan Mace
QA Contact: Hongan Li
URL:
Whiteboard:
Duplicates: 1629624 1749653
Depends On:
Blocks: 1758373 1758375 1794487
 
Reported: 2019-06-24 12:56 UTC by Simon Reber
Modified: 2020-01-23 20:11 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1758373 1758374 1758375 1794487 (view as bug list)
Environment:
Last Closed: 2020-01-23 11:04:11 UTC
Target Upstream Version:



Links
System ID Private Priority Status Summary Last Updated
Github openshift router pull 39 0 None closed Bug 1723400: fix haproxy reload crash when processing ECDSA keys 2021-02-15 04:37:07 UTC
Red Hat Knowledge Base (Solution) 4767321 0 None None None 2020-01-23 20:11:21 UTC
Red Hat Product Errata RHBA-2020:0062 0 None None None 2020-01-23 11:04:41 UTC

Description Simon Reber 2019-06-24 12:56:23 UTC
Description of problem:

Even with EXTENDED_VALIDATION set to `true`, adding a certificate with a non-matching key to a `route` causes the `router` to fail on reload. This can impact production services, because changes to the services are then no longer reflected in the router configuration.

With EXTENDED_VALIDATION enabled, the `router` is expected to reject such a `route` instead of failing.

Version-Release number of selected component (if applicable):

> oc v3.9.74
> kubernetes v1.9.1+a0ce1bc657
> features: Basic-Auth GSSAPI Kerberos SPNEGO
> 
> Server https://openshift.example.com:443
> openshift v3.9.74
> kubernetes v1.9.1+a0ce1bc657

How reproducible:
Always


Steps to Reproduce:
1. Make sure EXTENDED_VALIDATION is set to `true` on the `router`.
2. Create a route with edge termination and apply a custom certificate.
3. Supply a key that does not match the certificate and create the route.
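
For anyone reproducing step 3, one possible way to obtain a certificate and a key that do not belong together is sketched below. It is only an illustration: the helper name `make_mismatched_pair`, the file names, and the host name are made up, and ECDSA keys are used only because comment 7 points at ECDSA handling (the report itself does not say which key type was involved).

```shell
# Hypothetical helper: write a certificate and an unrelated private key
# into the directory given as $1.
make_mismatched_pair() {
    out_dir=$1
    # Key the certificate is actually issued for.
    openssl ecparam -name prime256v1 -genkey -noout -out "$out_dir/right.key" || return 1
    # Unrelated key that gets paired with the certificate by mistake.
    openssl ecparam -name prime256v1 -genkey -noout -out "$out_dir/wrong.key" || return 1
    # Self-signed certificate for right.key, valid for one day.
    openssl req -new -x509 -key "$out_dir/right.key" \
        -subj "/CN=www.example.com" -days 1 -out "$out_dir/route.crt" || return 1
}

# Usage against a test cluster (not executed here; route and service names are placeholders):
#   make_mismatched_pair /tmp/mismatch
#   oc create route edge example-route --service=example-svc \
#       --hostname=www.example.com \
#       --cert=/tmp/mismatch/route.crt --key=/tmp/mismatch/wrong.key
```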

Actual results:

The `router` fails to reload and therefore cannot apply changes to its configuration. The error reported by the `router` is as follows.

E0620 09:53:43.202882       1 limiter.go:137] error reloading router: exit status 1
[ALERT] 170/095343 (13510) : parsing [/var/lib/haproxy/conf/haproxy.config:116] : 'bind 127.0.0.1:10444' : 'crt-list' : error processing line 1 in file '/var/lib/haproxy/conf/cert_config.map' : unable to load SSL private key from PEM file '/var/lib/haproxy/router/certs/example-route:wildcard.pem'.
[ALERT] 170/095343 (13510) : Error(s) found in configuration file : /var/lib/haproxy/conf/haproxy.config
[ALERT] 170/095343 (13510) : Fatal errors found in configuration.
E0620 09:54:08.868115       1 limiter.go:137] error reloading router: exit status 1
[ALERT] 170/095408 (13581) : parsing [/var/lib/haproxy/conf/haproxy.config:116] : 'bind 127.0.0.1:10444' : 'crt-list' : error processing line 1 in file '/var/lib/haproxy/conf/cert_config.map' : unable to load SSL private key from PEM file '/var/lib/haproxy/router/certs/example-route:wildcard.pem'.
[ALERT] 170/095408 (13581) : Error(s) found in configuration file : /var/lib/haproxy/conf/haproxy.config
[ALERT] 170/095408 (13581) : Fatal errors found in configuration.

Expected results:

The `router` should reject the `route` so that it continues to function properly, and simply notify the creator of the `route` that the `route` could not be created because of a validation error.
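
As a rough illustration of the kind of check meant here, the sketch below compares the public key embedded in a certificate with the public key derived from a private key, using the `openssl` CLI. This is an assumption about what a pre-flight check could look like on the route creator's side, not the validation the router actually performs; the function name `cert_matches_key` is hypothetical.

```shell
# Hypothetical pre-flight check: succeeds (exit 0) only when the private key
# in $2 belongs to the certificate in $1. Works for RSA and ECDSA keys alike,
# because both commands emit the public key in the same PEM form.
cert_matches_key() {
    cert_file=$1
    key_file=$2
    # Public key embedded in the certificate.
    cert_pub=$(openssl x509 -in "$cert_file" -noout -pubkey 2>/dev/null) || return 2
    # Public key derived from the private key.
    key_pub=$(openssl pkey -in "$key_file" -pubout 2>/dev/null) || return 2
    [ -n "$cert_pub" ] && [ "$cert_pub" = "$key_pub" ]
}
```

Running `cert_matches_key route.crt route.key || echo "certificate and key do not match"` before creating the route would catch the mismatch described in this report without involving the router at all.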

Additional info:

Comment 5 Dan Mace 2019-10-02 20:20:19 UTC
I'll be looking into this one.

Comment 7 Dan Mace 2019-10-02 21:46:19 UTC
We've identified what looks to be an issue with ECDSA key handling. More details to come, but this is definitely a bug we need to fix, and soon. I'm going to increase the priority and severity of this bug given the DoS potential.

Comment 10 Dan Mace 2019-10-07 15:13:15 UTC
*** Bug 1749653 has been marked as a duplicate of this bug. ***

Comment 11 Hongan Li 2019-10-10 07:01:11 UTC
Verified with 4.3.0-0.ci-2019-10-09-222432; the issue has been fixed.

Note: first disable the CVO, then the ingress operator (disabling only the ingress operator doesn't work, since the CVO restores it).
1. oc scale deployment/cluster-version-operator --replicas=0 -n openshift-cluster-version
2. oc scale deployment/ingress-operator --replicas=0 -n openshift-ingress-operator
3. oc set env deployment/router-default EXTENDED_VALIDATION=true -n openshift-ingress
4. create project, pod, svc and customer route
5. ensure no error with router reloading.

Comment 12 Dan Mace 2019-10-11 16:02:42 UTC
*** Bug 1629624 has been marked as a duplicate of this bug. ***

Comment 14 errata-xmlrpc 2020-01-23 11:04:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0062

