Bug 1894431 - Router pods fail to boot if the SSL certificate applied is missing an empty line at the bottom
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.5
Hardware: All
OS: All
Priority: low
Severity: low
Target Milestone: ---
Target Release: 4.10.0
Assignee: Miciah Dashiel Butler Masters
QA Contact: Melvin Joseph
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-11-04 09:26 UTC by rdomnu
Modified: 2022-08-04 22:30 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: If the cluster administrator provided a default ingress certificate that was missing the newline character for the last line, OpenShift router would write out a corrupt PEM file for HAProxy.
Consequence: Providing a default ingress certificate that was missing the final newline character caused HAProxy to fail to start, which would break all ingress traffic.
Fix: OpenShift router was changed so that it adds the missing newline character to an incomplete line when writing out the PEM file.
Result: OpenShift router now writes out a valid PEM file so that HAProxy can start and ingress works properly even if the input is missing a newline character.
Clone Of:
Environment:
Last Closed: 2022-03-10 16:02:33 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github openshift router pull 359 0 None open Bug 1894431: Add missing newlines to default certificate and key 2021-11-12 01:15:23 UTC
Red Hat Product Errata RHSA-2022:0056 0 None None None 2022-03-10 16:02:50 UTC

Description rdomnu 2020-11-04 09:26:39 UTC
Description of problem: When a custom SSL certificate is applied to the IngressController and the certificate file is missing the empty line (trailing newline) at its end, the router pods fail to boot. The haproxy.config is empty and the certificates are not present at the expected path.

Version-Release number of selected component (if applicable):

Steps to Reproduce:
1. Create a new TLS secret that includes a certificate without an empty line at the bottom.
2. Patch the IngressController to use this certificate.
3. The router pods fail to start.

Actual results:
Router pods are stuck as they are missing both the config and certificates

Expected results:
Router pods configure new certificates

Additional info:
Router logs:
[ALERT] 308/092253 (34) : parsing [/var/lib/haproxy/conf/haproxy.config:119] : 'bind 127.0.0.1:10444' : unable to load SSL private key from PEM file '/var/lib/haproxy/router/certs/default.pem'.
[ALERT] 308/092253 (34) : parsing [/var/lib/haproxy/conf/haproxy.config:156] : 'bind 127.0.0.1:10443' : unable to load SSL private key from PEM file '/var/lib/haproxy/router/certs/default.pem'.
[ALERT] 308/092253 (34) : Error(s) found in configuration file : /var/lib/haproxy/conf/haproxy.config
[ALERT] 308/092253 (34) : Fatal errors found in configuration.
I1104 09:22:53.466419       1 healthz.go:200] [+]controller ok
[-]backend-http failed: reason withheld
healthz check failed

From a debug pod:
sh-4.2$ cat /var/lib/haproxy/conf/haproxy.config
sh-4.2$ ls -l /var/lib/haproxy/router/certs/
total 0
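
A minimal shell sketch of the failure mode and of the kind of fix described in the Doc Text. It assumes, as the default.pem path in the logs suggests, that the router joins the certificate and the key into a single PEM file; if the certificate lacks its final newline, the END line of the certificate and the BEGIN line of the key would then end up on the same line. The function names below are hypothetical and this is not the router's actual code.

```shell
# Hypothetical helper: print one PEM part, adding a final newline only when
# the part does not already end with one.
append_pem_part() {
    cat "$1"
    # tail -c 1 prints the last byte of the file. Command substitution strips
    # a trailing newline, so the result is non-empty only when the last byte
    # is something other than a newline.
    if [ -n "$(tail -c 1 "$1")" ]; then
        printf '\n'
    fi
}

# Hypothetical helper: write a combined PEM (certificate, then key) to stdout.
build_combined_pem() {
    append_pem_part "$1"
    append_pem_part "$2"
}

# Example use (file names are placeholders):
#   build_combined_pem tls.crt tls.key > default.pem
```

With a plain `cat tls.crt tls.key`, a certificate without the final newline would produce a fused `-----END CERTIFICATE----------BEGIN ...` line; the helper above keeps the two markers on separate lines without touching the input files.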

Comment 1 Andrew McDermott 2020-11-05 17:34:13 UTC
As the workaround is to add a newline, dropping the priority and severity to low.
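
One possible way to apply that workaround before creating the secret, sketched in shell. The function name is hypothetical; it appends a newline to the file in place only if the last byte is not already a newline, so running it twice is harmless.

```shell
# Hypothetical helper: make sure a non-empty file ends with a newline.
ensure_trailing_newline() {
    # -s: the file exists and is not empty.
    # "$(tail -c 1 FILE)" is empty when the last byte is a newline, because
    # command substitution strips trailing newlines.
    if [ -s "$1" ] && [ -n "$(tail -c 1 "$1")" ]; then
        printf '\n' >> "$1"
    fi
}

# Example use (file names are placeholders), run before creating the secret:
#   ensure_trailing_newline tls.crt
#   ensure_trailing_newline tls.key
```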

Comment 2 Miciah Dashiel Butler Masters 2020-11-14 00:58:16 UTC
We'll consider this for the upcoming sprint.

Comment 3 Miciah Dashiel Butler Masters 2020-12-07 03:16:45 UTC
We'll look into this in the upcoming sprint.

Comment 4 Miciah Dashiel Butler Masters 2021-02-06 00:12:31 UTC
Haven't had time to work on this one.

Comment 5 Melvin Joseph 2021-11-16 08:05:30 UTC
The same issue reproduces in a 4.9 build.


1) Create a key and crt for the secret:
 openssl req -x509 -nodes -days 3650 -newkey rsa:2048 -keyout tls.key -out tls.crt
2) Delete the empty line at the bottom of the crt file, then create a secret from the key and crt:
 oc -n openshift-ingress create secret generic custom-default-cert3 --from-file=tls.crt --from-file=tls.key=tls.key
3) Patch the ingress controller to use the secret:
 oc patch --type=merge --namespace openshift-ingress-operator ingresscontrollers/default --patch '{"spec":{"defaultCertificate":{"name":"custom-default-cert3"}}}'
4) oc -n openshift-ingress get pods -o wide
NAME                              READY   STATUS    RESTARTS      AGE     IP            NODE                                         NOMINATED NODE   READINESS GATES
router-default-6946bc6bfd-q78br   1/1     Running   0             9m12s   10.131.0.32   ip-10-0-153-197.us-east-2.compute.internal   <none>           <none>
router-default-7d6d9d75fc-669wq   0/1     Running   3 (32s ago)   7m5s    10.131.0.33   ip-10-0-153-197.us-east-2.compute.internal   <none>           <none>
router-default-7d6d9d75fc-lfl8b   0/1     Running   3 (32s ago)   7m5s    10.128.2.22   ip-10-0-191-49.us-east-2.compute.internal    <none>           <none>
5) oc -n openshift-ingress logs router-default-7d6d9d75fc-lfl8b   -c router --tail 50  
I1116 07:36:01.860015       1 template.go:437] router "msg"="starting router"  "version"="majorFromGit: \nminorFromGit: \ncommitFromGit: 2d1e1f4b\nversionFromGit: v0.0.0-unknown\ngitTreeState: dirty\nbuildDate: 2021-08-02T17:33:06Z\n"
I1116 07:36:01.861605       1 metrics.go:155] metrics "msg"="router health and metrics port listening on HTTP and HTTPS"  "address"="0.0.0.0:1936"
I1116 07:36:01.867639       1 router.go:191] template "msg"="creating a new template router"  "writeDir"="/var/lib/haproxy"
I1116 07:36:01.867698       1 router.go:273] template "msg"="router will coalesce reloads within an interval of each other"  "interval"="5s"
I1116 07:36:01.867943       1 router.go:337] template "msg"="watching for changes"  "path"="/etc/pki/tls/private"
I1116 07:36:01.868000       1 router.go:262] router "msg"="router is including routes in all namespaces"  
E1116 07:36:01.978350       1 haproxy.go:418] can't scrape HAProxy: dial unix /var/lib/haproxy/run/haproxy.sock: connect: no such file or directory
E1116 07:36:01.988628       1 limiter.go:165] error reloading router: exit status 1
[NOTICE] 319/073601 (15) : haproxy version is 2.2.15-5e8f49d
[NOTICE] 319/073601 (15) : path to executable is /usr/sbin/haproxy
[ALERT] 319/073601 (15) : parsing [/var/lib/haproxy/conf/haproxy.config:120] : 'bind 127.0.0.1:10444' : unable to load certificate from file '/var/lib/haproxy/router/certs/default.pem'.
[ALERT] 319/073601 (15) : parsing [/var/lib/haproxy/conf/haproxy.config:157] : 'bind 127.0.0.1:10443' : unable to load certificate from file '/var/lib/haproxy/router/certs/default.pem'.
[ALERT] 319/073601 (15) : Error(s) found in configuration file : /var/lib/haproxy/conf/haproxy.config
[ALERT] 319/073601 (15) : Fatal errors found in configuration.
E1116 07:36:06.983435       1 haproxy.go:418] can't scrape HAProxy: dial unix /var/lib/haproxy/run/haproxy.sock: connect: no such file or directory
E1116 07:36:06.992908       1 limiter.go:165] error reloading router: exit status 1
[NOTICE] 319/073606 (19) : haproxy version is 2.2.15-5e8f49d
[NOTICE] 319/073606 (19) : path to executable is /usr/sbin/haproxy
[ALERT] 319/073606 (19) : parsing [/var/lib/haproxy/conf/haproxy.config:120] : 'bind 127.0.0.1:10444' : unable to load certificate from file '/var/lib/haproxy/router/certs/default.pem'.
[ALERT] 319/073606 (19) : parsing [/var/lib/haproxy/conf/haproxy.config:157] : 'bind 127.0.0.1:10443' : unable to load certificate from file '/var/lib/haproxy/router/certs/default.pem'.
[ALERT] 319/073606 (19) : Error(s) found in configuration file : /var/lib/haproxy/conf/haproxy.config
[ALERT] 319/073606 (19) : Fatal errors found in configuration.
E1116 07:36:11.498035       1 haproxy.go:418] can't scrape HAProxy: dial unix /var/lib/haproxy/run/haproxy.sock: connect: no such file or directory

Comment 6 Melvin Joseph 2021-11-16 08:51:49 UTC
melvinjoseph@mjoseph-mac Downloads % oc get clusterversion 
NAME      VERSION                                               AVAILABLE   PROGRESSING   SINCE   STATUS
version   0.0.1-0.test-2021-11-16-074504-ci-ln-0w93j0t-latest   True        False         26m     Cluster version is 0.0.1-0.test-2021-11-16-074504-ci-ln-0w93j0t-latest
melvinjoseph@mjoseph-mac Downloads % oc -n openshift-ingress get pods -o wide
NAME                              READY   STATUS    RESTARTS   AGE   IP            NODE                                         NOMINATED NODE   READINESS GATES
router-default-54c95fdb78-dnpf4   1/1     Running   0          34m   10.128.2.6    ip-10-0-140-241.us-west-1.compute.internal   <none>           <none>
router-default-54c95fdb78-mnlgp   1/1     Running   0          34m   10.131.0.18   ip-10-0-140-58.us-west-1.compute.internal    <none>           <none>
melvinjoseph@mjoseph-mac Downloads % openssl req -x509 -nodes -days 3650 -newkey rsa:2048 -keyout tls.key -out tls.crt
Generating a 2048 bit RSA private key
.................+++
...+++
writing new private key to 'tls.key'
-----
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) []:IN
State or Province Name (full name) []:KERALA
Locality Name (eg, city) []:COCHIN
Organization Name (eg, company) []:REDHAT
Organizational Unit Name (eg, section) []:NE
Common Name (eg, fully qualified host name) []:OPENSHIFT  
melvinjoseph@mjoseph-mac Downloads %

Remove the empty line at the bottom of the crt file.
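
The removal step can also be scripted rather than done in an editor. A minimal sketch, with a hypothetical function name; it rewrites the file in place without its trailing newline(s).

```shell
# Hypothetical helper: rewrite a file without any trailing newlines.
strip_trailing_newlines() {
    # Command substitution drops all trailing newlines from the content;
    # printf '%s' writes it back without adding one.
    # Note: "content" is a global variable in POSIX sh.
    content=$(cat "$1")
    printf '%s' "$content" > "$1"
}

# Example use (file name is a placeholder):
#   strip_trailing_newlines tls.crt
```

Afterwards, `tail -c 1 tls.crt` should print the final `-` of the END CERTIFICATE line rather than a newline.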

melvinjoseph@mjoseph-mac Downloads % oc -n openshift-ingress create secret generic custom-default-cert3 --from-file=tls.crt --from-file=tls.key=tls.key
secret/custom-default-cert3 created
melvinjoseph@mjoseph-mac Downloads % oc patch --type=merge --namespace openshift-ingress-operator ingresscontrollers/default --patch '{"spec":{"defaultCertificate":{"name":"custom-default-cert3"}}}'
ingresscontroller.operator.openshift.io/default patched
melvinjoseph@mjoseph-mac Downloads % oc -n openshift-ingress get pods -o wide                                                                                                                         
NAME                              READY   STATUS        RESTARTS   AGE   IP            NODE                                         NOMINATED NODE   READINESS GATES
router-default-54c95fdb78-dnpf4   1/1     Running       0          35m   10.128.2.6    ip-10-0-140-241.us-west-1.compute.internal   <none>           <none>
router-default-54c95fdb78-mnlgp   1/1     Terminating   0          35m   10.131.0.18   ip-10-0-140-58.us-west-1.compute.internal    <none>           <none>
router-default-8644bd544b-vzxz6   1/1     Running       0          10s   10.129.2.24   ip-10-0-237-212.us-west-1.compute.internal   <none>           <none>
router-default-8644bd544b-w64wc   1/1     Running       0          10s   10.128.2.13   ip-10-0-140-241.us-west-1.compute.internal   <none>           <none>
melvinjoseph@mjoseph-mac Downloads % oc -n openshift-ingress get pods -o wide                                           
NAME                              READY   STATUS    RESTARTS   AGE   IP            NODE                                         NOMINATED NODE   READINESS GATES
router-default-8644bd544b-vzxz6   1/1     Running   0          92s   10.129.2.24   ip-10-0-237-212.us-west-1.compute.internal   <none>           <none>
router-default-8644bd544b-w64wc   1/1     Running   0          92s   10.128.2.13   ip-10-0-140-241.us-west-1.compute.internal   <none>           <none>
melvinjoseph@mjoseph-mac Downloads % 
melvinjoseph@mjoseph-mac Downloads % oc -n openshift-ingress logs router-default-8644bd544b-vzxz6   -c router --tail 50                                                
I1116 08:42:29.570145       1 template.go:437] router "msg"="starting router"  "version"="majorFromGit: \nminorFromGit: \ncommitFromGit: a5d2f3e2\nversionFromGit: v0.0.0-unknown\ngitTreeState: dirty\nbuildDate: 2021-11-16T07:42:20Z\n"
I1116 08:42:29.571854       1 metrics.go:155] metrics "msg"="router health and metrics port listening on HTTP and HTTPS"  "address"="0.0.0.0:1936"
I1116 08:42:29.577207       1 router.go:191] template "msg"="creating a new template router"  "writeDir"="/var/lib/haproxy"
I1116 08:42:29.577264       1 router.go:273] template "msg"="router will coalesce reloads within an interval of each other"  "interval"="5s"
I1116 08:42:29.577493       1 router.go:343] template "msg"="watching for changes"  "path"="/etc/pki/tls/private"
I1116 08:42:29.577569       1 router.go:262] router "msg"="router is including routes in all namespaces"  
E1116 08:42:29.683109       1 haproxy.go:418] can't scrape HAProxy: dial unix /var/lib/haproxy/run/haproxy.sock: connect: no such file or directory
I1116 08:42:29.734068       1 router.go:618] template "msg"="router reloaded"  "output"=" - Checking http://localhost:80 using PROXY protocol ...\n - Health check ok : 0 retry attempt(s).\n"
I1116 08:43:06.153419       1 router.go:618] template "msg"="router reloaded"  "output"=" - Checking http://localhost:80 using PROXY protocol ...\n - Health check ok : 0 retry attempt(s).\n"
melvinjoseph@mjoseph-mac Downloads %

Comment 12 Brandi Munilla 2022-02-10 20:05:48 UTC
Hi, if there is anything that customers should know about this bug or if there are any important workarounds that should be outlined in the bug fixes section of the OpenShift Container Platform 4.10 release notes, please update the Doc Type and Doc Text fields. If not, can you please mark it as "no doc update"? Thanks!

Comment 15 errata-xmlrpc 2022-03-10 16:02:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056

