Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1723400

Summary: EXTENDED_VALIDATION doesn't capture certificate / key mismatch, causing the router to misbehave
Product: OpenShift Container Platform
Component: Networking
Networking sub component: router
Version: 3.9.0
Hardware: x86_64
OS: Linux
Status: CLOSED ERRATA
Severity: urgent
Priority: urgent
Target Release: 4.3.0
Reporter: Simon Reber <sreber>
Assignee: Dan Mace <dmace>
QA Contact: Hongan Li <hongli>
CC: alwyn, aos-bugs, dmace, farandac, freark+1, nbhatt, pkanthal, stwalter, wgordon
Keywords: NeedsTestCase
Doc Type: If docs needed, set a value
Clones: 1758373, 1758374, 1758375, 1794487
Bug Blocks: 1758373, 1758375, 1794487
Last Closed: 2020-01-23 11:04:11 UTC
Type: Bug

Description Simon Reber 2019-06-24 12:56:23 UTC
Description of problem:

Even though EXTENDED_VALIDATION is set to true, adding a certificate with a non-matching key to a `route` causes the `router` to fail on reload. This can impact production services, because changes within the service are no longer reflected in the router configuration.

With EXTENDED_VALIDATION on, the `router` is expected to reject such a route instead of failing.

Version-Release number of selected component (if applicable):

> oc v3.9.74
> kubernetes v1.9.1+a0ce1bc657
> features: Basic-Auth GSSAPI Kerberos SPNEGO
> 
> Server https://openshift.example.com:443
> openshift v3.9.74
> kubernetes v1.9.1+a0ce1bc657

How reproducible:
Always


Steps to Reproduce:
1. Make sure EXTENDED_VALIDATION is set to `true` on the `router`
2. Create a route with Edge termination set and apply a custom certificate.
3. Add a wrong key for the certificate (not matching) and create the route

Actual results:

The `router` fails to reload and therefore does not apply changes to its configuration. The error reported by the `router` is as follows:

E0620 09:53:43.202882       1 limiter.go:137] error reloading router: exit status 1
[ALERT] 170/095343 (13510) : parsing [/var/lib/haproxy/conf/haproxy.config:116] : 'bind 127.0.0.1:10444' : 'crt-list' : error processing line 1 in file '/var/lib/haproxy/conf/cert_config.map' : unable to load SSL private key from PEM file '/var/lib/haproxy/router/certs/example-route:wildcard.pem'.
[ALERT] 170/095343 (13510) : Error(s) found in configuration file : /var/lib/haproxy/conf/haproxy.config
[ALERT] 170/095343 (13510) : Fatal errors found in configuration.
E0620 09:54:08.868115       1 limiter.go:137] error reloading router: exit status 1
[ALERT] 170/095408 (13581) : parsing [/var/lib/haproxy/conf/haproxy.config:116] : 'bind 127.0.0.1:10444' : 'crt-list' : error processing line 1 in file '/var/lib/haproxy/conf/cert_config.map' : unable to load SSL private key from PEM file '/var/lib/haproxy/router/certs/example-route:wildcard.pem'.
[ALERT] 170/095408 (13581) : Error(s) found in configuration file : /var/lib/haproxy/conf/haproxy.config
[ALERT] 170/095408 (13581) : Fatal errors found in configuration.

Expected results:

The `router` rejects the `route` so that it continues to function properly, and notifies the creator of the `route` that the `route` could not be created because of a validation error.
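
To illustrate the kind of check expected here, below is a minimal Go sketch of a certificate / key match test. It is not the router's actual validation code. `selfSigned` and `checkKeyPair` are hypothetical names introduced for this example, and the sketch assumes the certificate and key are PEM-encoded, as they are in a route. It relies on `tls.X509KeyPair` from the Go standard library, which returns an error when the private key does not belong to the certificate's public key.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/tls"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"fmt"
	"math/big"
	"time"
)

// selfSigned is a hypothetical helper: it creates a throwaway self-signed
// ECDSA certificate for host and returns the certificate and key as PEM.
func selfSigned(host string) (certPEM, keyPEM []byte, err error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: host},
		DNSNames:     []string{host},
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(24 * time.Hour),
	}
	// Self-signed: the template is also the parent certificate.
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, nil, err
	}
	keyDER, err := x509.MarshalECPrivateKey(key)
	if err != nil {
		return nil, nil, err
	}
	certPEM = pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der})
	keyPEM = pem.EncodeToMemory(&pem.Block{Type: "EC PRIVATE KEY", Bytes: keyDER})
	return certPEM, keyPEM, nil
}

// checkKeyPair (hypothetical name) returns a non-nil error when the PEM data
// cannot be parsed or when the key does not match the certificate.
func checkKeyPair(certPEM, keyPEM []byte) error {
	_, err := tls.X509KeyPair(certPEM, keyPEM)
	return err
}

func main() {
	certA, keyA, err := selfSigned("a.example.com")
	if err != nil {
		panic(err)
	}
	// A second, unrelated key plays the role of the "wrong key" from the
	// reproduction steps.
	_, keyB, err := selfSigned("b.example.com")
	if err != nil {
		panic(err)
	}
	fmt.Println("matching pair:", checkKeyPair(certA, keyA))
	fmt.Println("mismatched pair:", checkKeyPair(certA, keyB))
}
```

A check along these lines, run when the route is admitted, would let the mismatch surface as a rejected route rather than as a failed HAProxy reload.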

Additional info:

Comment 5 Dan Mace 2019-10-02 20:20:19 UTC
I'll be looking into this one.

Comment 7 Dan Mace 2019-10-02 21:46:19 UTC
We've identified what looks to be an issue with ECDSA key handling. More details to come, but this is definitely a bug we need to fix, and soon. I'm going to increase the priority and severity of this bug given the DoS potential.

Comment 10 Dan Mace 2019-10-07 15:13:15 UTC
*** Bug 1749653 has been marked as a duplicate of this bug. ***

Comment 11 Hongan Li 2019-10-10 07:01:11 UTC
Verified with 4.3.0-0.ci-2019-10-09-222432; the issue has been fixed.

Note: first disable the CVO, then disable the ingress operator (disabling only the ingress operator doesn't work, since the CVO will restore it).
1. oc scale deployment/cluster-version-operator --replicas=0 -n openshift-cluster-version
2. oc scale deployment/ingress-operator --replicas=0 -n openshift-ingress-operator
3. oc set env deployment/router-default EXTENDED_VALIDATION=true -n openshift-ingress
4. create project, pod, svc and customer route
5. ensure no error with router reloading.

Comment 12 Dan Mace 2019-10-11 16:02:42 UTC
*** Bug 1629624 has been marked as a duplicate of this bug. ***

Comment 14 errata-xmlrpc 2020-01-23 11:04:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0062