Bug 1424484 - Single bad SSL certificate causes cascading failure
Summary: Single bad SSL certificate causes cascading failure
Keywords:
Status: CLOSED DUPLICATE of bug 1389165
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 3.4.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Ben Bennett
QA Contact: zhaozhanqi
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-02-17 17:17 UTC by Ed Seymour
Modified: 2022-08-04 22:20 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-02-20 11:53:08 UTC
Target Upstream Version:
Embargoed:


Description Ed Seymour 2017-02-17 17:17:29 UTC
Description of problem:
Project A creates a new route for their application and chooses edge termination, supplying their own certificates. However, the certificate is faulty: for example, the text is corrupt or otherwise invalid.

The HAProxy configuration fails to load due to the error in the certificate. Attempting to access the application via the configured route results in an HTTP 503 error, even though the application itself is running successfully in the background.

Project B creates a new application, and exposes a route. The application operates correctly, and can be accessed within the OpenShift SDN (for example, curl from one Pod to another). 

However, Project B's application route does not work, and returns an HTTP 503 error. Project B is unable to determine the cause of the problem, as all checks from a project perspective show that the service is available. 


Version-Release number of selected component (if applicable): 3.4


How reproducible: 5 minutes


Steps to Reproduce:
1. Create an App, edit route, select edge termination, and insert corrupt certificates
2. Create another independent app (could be a different project), expose an operational service via a route
3. Attempt to access the 2nd application URL
4. As a platform admin, view the OpenShift router logs

Actual results:

Once the bad route exists, new routes in any project that are handled by the same router(s) as the first project's route are not served; requests to the new applications return HTTP 503.

Expected results:

Bad configuration is rejected, and the route blacklisted in a way that does not impact other users of the platform. 

Additional info:

Example error in HAProxy log:
E0217 17:16:13.245529       1 controller.go:84] error reloading router: exit status 1
---
+ config_file=/var/lib/haproxy/conf/haproxy.config
+ pid_file=/var/lib/haproxy/run/haproxy.pid
+ old_pid=
+ haproxy_conf_dir=/var/lib/haproxy/conf
+ for mapfile in '"$haproxy_conf_dir"/*.map'
+ sort -r /var/lib/haproxy/conf/os_edge_http_be.map -o /var/lib/haproxy/conf/os_edge_http_be.map
+ for mapfile in '"$haproxy_conf_dir"/*.map'
+ sort -r /var/lib/haproxy/conf/os_edge_http_expose.map -o /var/lib/haproxy/conf/os_edge_http_expose.map
+ for mapfile in '"$haproxy_conf_dir"/*.map'
+ sort -r /var/lib/haproxy/conf/os_edge_http_redirect.map -o /var/lib/haproxy/conf/os_edge_http_redirect.map
+ for mapfile in '"$haproxy_conf_dir"/*.map'
+ sort -r /var/lib/haproxy/conf/os_http_be.map -o /var/lib/haproxy/conf/os_http_be.map
+ for mapfile in '"$haproxy_conf_dir"/*.map'
+ sort -r /var/lib/haproxy/conf/os_reencrypt.map -o /var/lib/haproxy/conf/os_reencrypt.map
+ for mapfile in '"$haproxy_conf_dir"/*.map'
+ sort -r /var/lib/haproxy/conf/os_sni_passthrough.map -o /var/lib/haproxy/conf/os_sni_passthrough.map
+ for mapfile in '"$haproxy_conf_dir"/*.map'
+ sort -r /var/lib/haproxy/conf/os_tcp_be.map -o /var/lib/haproxy/conf/os_tcp_be.map
+ '[' -f /var/lib/haproxy/run/haproxy.pid ']'
+ old_pid=137
+ '[' -n 137 ']'
+ /usr/sbin/haproxy -f /var/lib/haproxy/conf/haproxy.config -p /var/lib/haproxy/run/haproxy.pid -sf 137
[ALERT] 047/171613 (147) : parsing [/var/lib/haproxy/conf/haproxy.config:111] : 'bind 127.0.0.1:10444' : unable to load SSL private key from PEM file '/var/lib/containers/router/certs/myproject_badcert.pem'.
[ALERT] 047/171613 (147) : Error(s) found in configuration file : /var/lib/haproxy/conf/haproxy.config
[ALERT] 047/171613 (147) : Fatal errors found in configuration.
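
For anyone triaging a similar reload failure, the offending PEM file can be pulled out of the router log mechanically. The following is only a rough sketch: the pattern is modelled on the single ALERT line quoted above, the function name find_bad_pem_files is made up for illustration, and other HAProxy versions may word the message differently.

```python
import re

# Pattern modelled on the one ALERT line quoted in this report (assumption:
# the wording is stable for the HAProxy build in use).
ALERT_RE = re.compile(
    r"parsing \[(?P<config>[^\]:]+):(?P<line>\d+)\].*"
    r"unable to load SSL private key from PEM file '(?P<pem>[^']+)'"
)


def find_bad_pem_files(log_text):
    """Return (config_file, config_line, pem_path) for each matching ALERT line."""
    hits = []
    for line in log_text.splitlines():
        match = ALERT_RE.search(line)
        if match:
            hits.append(
                (match.group("config"), int(match.group("line")), match.group("pem"))
            )
    return hits


# The ALERT line from the log above, joined into one string.
sample = (
    "[ALERT] 047/171613 (147) : parsing "
    "[/var/lib/haproxy/conf/haproxy.config:111] : 'bind 127.0.0.1:10444' : "
    "unable to load SSL private key from PEM file "
    "'/var/lib/containers/router/certs/myproject_badcert.pem'."
)

for config, line_no, pem in find_bad_pem_files(sample):
    print(f"{config} line {line_no}: {pem}")
```

In this log the PEM file name (myproject_badcert.pem) appears to combine the project and route names, which is what points back to the route that needs fixing.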

Comment 1 Eric Paris 2017-02-17 20:13:25 UTC
Thank you so much for the report. Can you supply an example of the bad certificate? We have added extensive verification to try to prevent bad certificates. Your example can help us fix our validation.

Comment 2 Ed Seymour 2017-02-18 11:58:27 UTC
Any file will do; I used the following:

$ cat garbage.pem

-----BEGIN CERTIFICATE-----
This
is
utter
garbage
-----END CERTIFICATE-----


In my test I used this file for all three entries (cert, key, and ca).
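
To illustrate why this file cannot be loaded, here is a minimal structural check, offered as a sketch rather than as what the router actually does. It only verifies that the text between the BEGIN/END markers is valid base64; it does not parse the certificate or key, and it does not handle PEM blocks that carry headers (such as encrypted keys). The name pem_problems is hypothetical.

```python
import base64
import binascii
import re

# Matches one BEGIN/END pair and captures the label and the body between them.
PEM_BLOCK_RE = re.compile(
    r"-----BEGIN ([A-Z0-9 ]+)-----\s*(.*?)\s*-----END \1-----", re.DOTALL
)


def pem_problems(pem_text):
    """Return a list of structural problems; an empty list means the body decodes."""
    blocks = PEM_BLOCK_RE.findall(pem_text)
    if not blocks:
        return ["no PEM block found"]
    problems = []
    for label, body in blocks:
        compact = "".join(body.split())  # drop line breaks and other whitespace
        try:
            decoded = base64.b64decode(compact, validate=True)
        except binascii.Error as exc:
            problems.append(f"{label}: body is not valid base64 ({exc})")
            continue
        if not decoded:
            problems.append(f"{label}: body is empty")
    return problems


# Contents of garbage.pem as given in this comment.
garbage = """-----BEGIN CERTIFICATE-----
This
is
utter
garbage
-----END CERTIFICATE-----
"""

print(pem_problems(garbage))
```

The body "Thisisuttergarbage" is 18 characters long, which is not a whole number of 4-character base64 groups, so decoding fails before any certificate parsing could even start.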

Comment 3 Ed Seymour 2017-02-20 10:24:22 UTC
I've retested using openshift3/ose:latest and openshift3/ose-haproxy-router:latest and this issue has been fixed. The router now identifies the bad certs and does not load this route.

E0220 10:20:48.867752       1 extended_validator.go:67] Skipping route myproject/badcert due to invalid configuration: 
  - spec.tls.caCertificate: Invalid value: "\u003cca certificate data\u003e": failed to parse CA certificate: Could not read any certificates
  - spec.tls.certificate: Invalid value: "\u003ccertificate data\u003e": Could not read any certificates
  - spec.tls.key: Invalid value: "\u003ckey data\u003e": tls: failed to find any PEM data in certificate input


Other routes remain unaffected, and new routes are accepted and loaded. This suggests the issue has been resolved in OpenShift 3.4.
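
The behaviour observed here (skip the invalid route, keep loading the rest) can be sketched roughly as below. This is not the router's code: load_routes and stub_validate are hypothetical names, the route dictionaries are a simplified stand-in for real route objects, "otherproject/app" is an invented second route, and the stub validator merely imitates a certificate check.

```python
def load_routes(routes, validate):
    """Validate each route independently.

    Returns (accepted, skipped): accepted is a list of route names and
    skipped maps a route name to the reason it was rejected.
    """
    accepted = []
    skipped = {}
    for route in routes:
        try:
            validate(route)
        except ValueError as exc:
            # A bad route is recorded and skipped; it never reaches the
            # generated proxy configuration, so it cannot break a reload.
            skipped[route["name"]] = str(exc)
            continue
        accepted.append(route["name"])
    return accepted, skipped


def stub_validate(route):
    # Stub standing in for real certificate parsing (assumption for the demo).
    if route.get("tls") == "garbage":
        raise ValueError("Could not read any certificates")


routes = [
    {"name": "myproject/badcert", "tls": "garbage"},
    {"name": "otherproject/app", "tls": None},  # hypothetical healthy route
]

accepted, skipped = load_routes(routes, stub_validate)
print("accepted:", accepted)
print("skipped:", skipped)
```

The point of the design is that validation happens per route, before the shared configuration is written, so a failure is confined to the route that caused it.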

Comment 4 Ed Seymour 2017-02-20 11:53:08 UTC

*** This bug has been marked as a duplicate of bug 1389165 ***
