Bug 1379701

Summary: Customer cert for route caused fatal error in haproxy router
Product: OpenShift Container Platform Reporter: Steven Walter <stwalter>
Component: Networking Assignee: Ram Ranganathan <ramr>
Networking sub component: router QA Contact: zhaozhanqi <zzhao>
Status: CLOSED ERRATA Docs Contact:
Severity: medium    
Priority: medium CC: aos-bugs, bbennett, bperkins, jliggitt, sreber, stwalter, tdawson
Version: 3.2.0   
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Cause: Extended certificate validation was not enabled by default.
Consequence: A bad certificate in a route could crash the router.
Fix: The default in 'oadm router' was changed to turn on extended validation when a router is created.
Result: Bad certificates are caught, the route they are associated with is not used, and an appropriate status is set on that route.
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-01-18 12:41:47 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Steven Walter 2016-09-27 13:24:01 UTC
Description of problem:

A route was configured with an improperly formatted cert whose key had a passphrase on it. This caused an outage in the router for all customers.


It appears the offending cert causes issues when re-encrypting the route on the F5. The key was created with a passphrase, which needs to be removed for it to work. Apparently this caused a fatal error that could only be recovered from by deleting the route. One customer using an improperly formatted certificate caused a router outage for the whole environment.



Version-Release number of selected component (if applicable):
# openshift version
openshift v3.2.1.4-1-g1864c8f
kubernetes v1.2.0-36-g4a3f9c5
etcd 2.2.5


How reproducible:
Unverified


Actual results:

The application which had the bad cert showed these logs:

[ALERT] 255/130843 (51214) : parsing [/var/lib/haproxy/conf/haproxy.config:112] : 'bind 127.0.0.1:10444' :
  unable to load SSL private key from PEM file '/var/lib/containers/router/certs/example.pem'.
  unable to load SSL private key from PEM file '/var/lib/containers/router/certs/example.pem'.
  unable to load SSL private key from PEM file '/var/lib/containers/router/certs/example.pem'.
[ALERT] 255/130843 (51214) : Error(s) found in configuration file : /var/lib/haproxy/conf/haproxy.config
[ALERT] 255/130843 (51214) : Fatal errors found in configuration.

The fatal error from the logs in the router was:

[ALERT] 255/130843 (51214) : Error(s) found in configuration file : /var/lib/haproxy/conf/haproxy.config
[ALERT] 255/130843 (51214) : Fatal errors found in configuration.

Expected results:

Either log a warning and refuse to serve the route, or serve the route successfully.

Additional info:
Is there any way to automatically sanity check certificates before they are used in routes? The problem can be rectified by deleting the route.
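
As a rough illustration of the kind of sanity check asked about above, the following sketch flags a PEM private key that is passphrase-protected before it is placed in a route. It is not the router's extended validation and says nothing about how the router implements its checks. The function name check_route_key is hypothetical, and the check only looks at PEM headers (the PKCS#8 "ENCRYPTED PRIVATE KEY" label and the legacy "Proc-Type: 4,ENCRYPTED" header); it does not parse the key or verify that it matches the certificate.

```python
import re

# PEM markers that indicate a passphrase-protected private key.
# PKCS#8 encrypted keys use a dedicated block label.
_PKCS8_ENCRYPTED = "-----BEGIN ENCRYPTED PRIVATE KEY-----"
# Legacy (traditional OpenSSL) encrypted keys carry this header line
# inside an otherwise ordinary "RSA PRIVATE KEY" or "EC PRIVATE KEY" block.
_LEGACY_ENCRYPTED = re.compile(r"^Proc-Type:\s*4,ENCRYPTED\s*$", re.MULTILINE)
# Any private key block: "PRIVATE KEY", "RSA PRIVATE KEY", "EC PRIVATE KEY", ...
_ANY_PRIVATE_KEY = re.compile(r"-----BEGIN (?:[A-Z0-9]+ )*PRIVATE KEY-----")


def check_route_key(pem_text):
    """Return a list of problems found in PEM text (hypothetical helper).

    An empty list means the header-level checks passed; it does not
    guarantee that haproxy can load the key.
    """
    problems = []
    if not _ANY_PRIVATE_KEY.search(pem_text):
        problems.append("no private key block found")
    if _PKCS8_ENCRYPTED in pem_text or _LEGACY_ENCRYPTED.search(pem_text):
        problems.append("private key is passphrase-protected")
    return problems
```

A caller could run this on the key text before creating the route and refuse to proceed if the returned list is non-empty.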

Comment 2 Ben Bennett 2016-09-27 14:35:40 UTC
Is the cert in the route broken?  If so, can they include the route yaml?

Obviously, if it contains sensitive keys, they shouldn't give it to us.

Comment 18 Ram Ranganathan 2016-10-26 17:44:12 UTC
Defaults set to true with PR: https://github.com/openshift/origin/pull/11218

Comment 19 Troy Dawson 2016-10-27 16:14:07 UTC
This has been merged into ose and is in OSE v3.4.0.16 or newer.

Comment 21 zhaozhanqi 2016-10-28 03:03:18 UTC
Verified this bug on the haproxy image (id: 227ebcf6c7d8). EXTENDED_VALIDATION now defaults to true, and the invalid route is skipped.

Comment 24 errata-xmlrpc 2017-01-18 12:41:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:0066