Bug 1389133

Summary: Using Wildcard Certificates
Product: OpenShift Container Platform Reporter: Babak Mozaffari <bmozaffa>
Component: Networking Assignee: Phil Cameron <pcameron>
Networking sub component: router QA Contact: zhaozhanqi <zzhao>
Status: CLOSED NOTABUG Docs Contact:
Severity: high    
Priority: high CC: aloughla, aos-bugs, bbennett, bmozaffa
Version: 3.3.0 Keywords: UpcomingRelease
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-11-17 15:24:13 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Attachments:
Description                Flags
Router certificate PEM     none
Router dc                  none
Further diagnostic info    none
Router secret yaml         none

Description Babak Mozaffari 2016-10-27 00:38:10 UTC
Created attachment 1214454 [details]
Router certificate PEM

Description of problem:
Unable to configure router with custom wildcard certificate

Version-Release number of selected component (if applicable):
3.3.0.35

How reproducible:
Configure new router service with a wildcard certificate as per published documentation. The router pod fails to run.


Steps to Reproduce:
1. Create a certificate:
oadm ca create-server-cert --signer-cert=$CA/ca.crt --signer-key=$CA/ca.key --signer-serial=$CA/ca.serial.txt --hostnames='*.cloudapps.example.com' --cert=cloudapps.crt --key=cloudapps.key
2. Concatenate into pem file:
cat cloudapps.crt cloudapps.key $CA/ca.crt > cloudapps.router.pem
3. Create router:
oadm router --default-cert=cloudapps.router.pem --service-account=router

Actual results:
Router pod fails with the following log:

oc logs -f router-1-9x71t 

I1023 17:03:21.213317       1 router.go:161] Router is including routes in all namespaces
E1023 17:03:21.625055       1 ratelimiter.go:52] error reloading router: exit status 1
[ALERT] 296/170321 (28) : parsing [/var/lib/haproxy/conf/haproxy.config:129] : 'bind 127.0.0.1:10444' : unable to load SSL private key from PEM file '/etc/pki/tls/private/tls.crt'.
[ALERT] 296/170321 (28) : parsing [/var/lib/haproxy/conf/haproxy.config:166] : 'bind 127.0.0.1:10443' : unable to load SSL private key from PEM file '/etc/pki/tls/private/tls.crt'.
[ALERT] 296/170321 (28) : Error(s) found in configuration file : /var/lib/haproxy/conf/haproxy.config
[ALERT] 296/170321 (28) : Proxy 'fe_sni': no SSL certificate specified for bind '127.0.0.1:10444' at [/var/lib/haproxy/conf/haproxy.config:129] (use 'crt').
[ALERT] 296/170321 (28) : Proxy 'fe_no_sni': no SSL certificate specified for bind '127.0.0.1:10443' at [/var/lib/haproxy/conf/haproxy.config:166] (use 'crt').
[ALERT] 296/170321 (28) : Fatal errors found in configuration.

Expected results:
Router pod is successfully created and in running state with the same PEM file used in version 3.2

Additional info:
The PEM file is attached. This file was created in an OCP 3.3 environment where it failed to create a router, but was used successfully in v3.2

Comment 2 Babak Mozaffari 2016-10-31 21:46:52 UTC
Created attachment 1215931 [details]
Router dc

Comment 3 Babak Mozaffari 2016-10-31 21:47:22 UTC
Created attachment 1215932 [details]
Further diagnostic info

Comment 4 Babak Mozaffari 2016-10-31 21:47:53 UTC
Created attachment 1215933 [details]
Router secret yaml

Comment 13 Phil Cameron 2016-11-17 15:24:13 UTC
Avoiding this problem:

When a router is created, a service and a secret are created along with it. When you delete the router, you must also delete both the service and the secret before recreating the router. If you create a new router with the same name as a previously deleted router and the old secret still exists, this problem can occur.

The workaround is to make sure the router, service and secret are deleted when deleting a router.

oc delete dc myroutername
oc delete svc myroutername
oc delete secret myroutername-certs

Before creating the router, verify that the secret is not present.

oc get secret myroutername-certs
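
The cleanup and the check above can be combined into one small script. This is only a sketch, not part of the original workaround as posted: the function name cleanup_router and the OC variable (which lets the oc binary be swapped out for a dry run) are illustrative assumptions, and it relies on "oc get" returning a non-zero exit status when the secret does not exist.

```shell
#!/bin/sh
# Sketch: remove everything a router leaves behind, then confirm the
# certs secret is gone before the router is recreated.
# OC is a hypothetical override so the commands can be dry-run tested.
OC="${OC:-oc}"

cleanup_router() {
    name="$1"
    # Delete the deployment config, the service, and the secret.
    # Failures for objects that are already gone are ignored.
    $OC delete dc "$name" || true
    $OC delete svc "$name" || true
    $OC delete secret "${name}-certs" || true

    # "oc get" is expected to fail once the secret no longer exists.
    if $OC get secret "${name}-certs" >/dev/null 2>&1; then
        echo "secret ${name}-certs still exists; do not recreate the router yet" >&2
        return 1
    fi
    echo "secret ${name}-certs is absent; safe to recreate ${name}"
}

# Usage: cleanup_router myroutername
```

Running the check as the last step means a leftover secret is reported before "oadm router" is run again, rather than showing up later as an HAProxy reload failure.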

See openshift/origin/issues/11927, which reports that a secret created by a service annotation should be deleted when the service is deleted. This is the root cause of this problem.

=========

Background context:

When you create a router and provide --default-cert, your default cert is used to create a secret (routername-certs).

When you create a router and do not provide --default-cert, a secret is created for you automatically (also named routername-certs). This is done by adding an annotation to the router's service.

This secret should be deleted when the service is deleted (see issues/11927 above)

You can run into this because the default secret (created when --default-cert is not provided) is not in PEM format; it holds a separate crt and key. When the router is started, the crt and key are concatenated. This is not done when the router is expecting a PEM.

If you create the default router, it generates the crt/key version of the secret. The next time you create a router of the same name with --default-cert, the secret is not updated, and that causes the problem.
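
To see which form an existing secret is in, its certificate data can be decoded and checked for a private key block, which is what the HAProxy "unable to load SSL private key" alert is about. The following is a minimal sketch under assumptions: the function names are illustrative, and it assumes the decoded tls.crt value from the secret has already been saved to a local file. If the explanation above applies, a stale generated secret would show a certificate without a key.

```shell
#!/bin/sh
# Sketch: report whether a PEM file holds both a certificate and a private key.

pem_has_private_key() {
    # Matches "BEGIN PRIVATE KEY", "BEGIN RSA PRIVATE KEY", and similar.
    grep -q -e '-----BEGIN .*PRIVATE KEY-----' "$1"
}

pem_has_certificate() {
    grep -q -e '-----BEGIN CERTIFICATE-----' "$1"
}

check_router_pem() {
    if pem_has_certificate "$1" && pem_has_private_key "$1"; then
        echo "combined PEM (certificate and key)"
    elif pem_has_certificate "$1"; then
        echo "certificate only (no private key)"
    else
        echo "no certificate found"
    fi
}

# Example (assumed workflow): view the secret with
#   oc get secret myroutername-certs -o yaml
# decode the base64 tls.crt value into a file, then run
#   check_router_pem /tmp/tls.crt
```

The same check can be run on cloudapps.router.pem before passing it to --default-cert, to confirm that the file itself contains both parts.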