Bug 1874278

Summary: HAProxy reloads fail on "unable to load SSL certificate from PEM file '/var/lib/haproxy/conf/default_pub_keys.pem'".
Product: OpenShift Container Platform
Component: Networking
Networking sub component: router
Version: 4.6
Target Release: 4.6.0
Status: CLOSED ERRATA
Severity: urgent
Priority: unspecified
Type: Bug
Hardware: Unspecified
OS: Unspecified
Reporter: Stephen Greene <sgreene>
Assignee: Stephen Greene <sgreene>
QA Contact: Arvind iyengar <aiyengar>
CC: aiyengar, aos-bugs, bperkins
Last Closed: 2020-10-27 16:36:14 UTC

Description Stephen Greene 2020-08-31 21:20:53 UTC
Router CI is currently broken on the router e2e job.

A subset of the router e2e tests is failing. The logs suggest that the e2e router pods cannot load the default certificate baked into the router image:


[ALERT] 243/205744 (21) : parsing [/var/lib/haproxy/conf/haproxy.config:116] : 'bind 127.0.0.1:10444' : unable to load SSL certificate from PEM file '/var/lib/haproxy/conf/default_pub_keys.pem'.
[ALERT] 243/205744 (21) : parsing [/var/lib/haproxy/conf/haproxy.config:153] : 'bind 127.0.0.1:10443' : unable to load SSL certificate from PEM file '/var/lib/haproxy/conf/default_pub_keys.pem'.
[ALERT] 243/205744 (21) : Error(s) found in configuration file : /var/lib/haproxy/conf/haproxy.config
[ALERT] 243/205744 (21) : Fatal errors found in configuration.

The router e2e jobs depend on the default cert in the image to run properly.

This may be related to recent base image changes through ART. 
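
When triaging failures like the ones above across many CI runs, it can help to pull the affected config line, bind address, and PEM path out of each ALERT line. The following is a minimal sketch, assuming the log lines keep the exact shape shown in this report; the names `CertLoadFailure` and `parse_cert_load_failure` are hypothetical helpers, not part of HAProxy or the router.

```python
import re
from typing import NamedTuple, Optional


class CertLoadFailure(NamedTuple):
    """One 'unable to load SSL certificate' alert (hypothetical helper type)."""
    config_file: str
    config_line: int
    bind: str
    pem_path: str


# Pattern derived only from the ALERT lines quoted in this report.
_ALERT_RE = re.compile(
    r"parsing \[(?P<file>[^:\]]+):(?P<line>\d+)\] : "
    r"'bind (?P<bind>[^']+)' : "
    r"unable to load SSL certificate from PEM file '(?P<pem>[^']+)'"
)


def parse_cert_load_failure(log_line: str) -> Optional[CertLoadFailure]:
    """Return the parsed alert, or None if the line is not a cert-load alert."""
    match = _ALERT_RE.search(log_line)
    if match is None:
        return None
    return CertLoadFailure(
        config_file=match["file"],
        config_line=int(match["line"]),
        bind=match["bind"],
        pem_path=match["pem"],
    )
```

Lines that do not match, such as the trailing "Fatal errors found in configuration." alert, yield `None`, so the function can be mapped over a whole log and the results filtered.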


How reproducible: 
100%, see router CI jobs.
https://prow.ci.openshift.org/pr-history?org=openshift&repo=router&pr=170

Reproduce outside of CI:

Launch a 4.6 cluster through cluster bot with a reference to a router PR, for example:

launch openshift/router#170 gcp

Comment 3 Arvind iyengar 2020-09-08 07:51:49 UTC
Verified in the latest "4.6.0-0.ci-2020-09-04-224216" payload. The OpenSSL crypto-policy configuration in the environment now includes "SECLEVEL=1":
----
$ oc get clusterversion
NAME      VERSION                        AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.0-0.ci-2020-09-04-224216   True        False         39m     Cluster version is 4.6.0-0.ci-2020-09-04-224216

sh-4.4$ cat /etc/crypto-policies/back-ends/openssl
openssl.config     opensslcnf.config  

sh-4.4$ cat /etc/crypto-policies/back-ends/opensslcnf.config 
CipherString = @SECLEVEL=1:kEECDH:kRSA:kEDH:kPSK:kDHEPSK:kECDHEPSK:-aDSS:-3DES:!DES:!RC4:!RC2:!IDEA:-SEED:!eNULL:!aNULL:!MD5:-SHA384:-CAMELLIA:-ARIA:-AESCCM8
Ciphersuites = TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256:TLS_AES_128_GCM_SHA256:TLS_AES_128_CCM_SHA256
----
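
The manual check above can be scripted. Below is a minimal sketch, assuming the file keeps the `CipherString = @SECLEVEL=N:...` layout shown in the output; `seclevel_from_opensslcnf` is a hypothetical helper name, and the function only inspects text that has already been read from the file.

```python
import re
from typing import Optional


def seclevel_from_opensslcnf(text: str) -> Optional[int]:
    """Return the @SECLEVEL value from the CipherString line, if present."""
    for raw_line in text.splitlines():
        # Split "Key = value" on the first "=" only; the value itself
        # contains further "=" characters (for example "@SECLEVEL=1").
        key, sep, value = raw_line.partition("=")
        if sep and key.strip() == "CipherString":
            match = re.search(r"@SECLEVEL=(\d+)", value)
            return int(match.group(1)) if match else None
    return None
```

For example, passing the contents of `/etc/crypto-policies/back-ends/opensslcnf.config` as shown above should yield the security level configured in `CipherString`, and a file without a `CipherString` line or without an `@SECLEVEL` token yields `None`.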

Comment 5 errata-xmlrpc 2020-10-27 16:36:14 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196