Bug 1763605

Summary: Router pod keeps restarting when tlsSecurityProfile is set to Modern
Product: OpenShift Container Platform
Reporter: Hongan Li <hongli>
Component: Networking
Assignee: Daneyon Hansen <dhansen>
Networking sub component: router
QA Contact: Hongan Li <hongli>
Status: CLOSED ERRATA
Docs Contact:
Severity: high
Priority: high
CC: aos-bugs, bbennett, dhansen, knewcome
Version: 4.3.0
Target Milestone: ---
Target Release: 4.3.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-01-23 11:08:26 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Hongan Li 2019-10-21 07:43:59 UTC
Description of problem:
Creating a test ingresscontroller with tlsSecurityProfile set to "Modern" causes the router pod to restart repeatedly.

Version-Release number of selected component (if applicable):
4.3.0-0.nightly-2019-10-20-140322

How reproducible:
100%

Steps to Reproduce:
1. oc create -f ingresscontroller-tlspolicy.yaml, where the file contains the following spec:
spec:
  defaultCertificate:
    name: router-certs-default
  domain: tlspolicy.qe-hongli322.qe.devcluster.openshift.com
  replicas: 1
  tlsSecurityProfile:
    type: Modern

2. Check the router pods in the openshift-ingress namespace.

Actual results:
$ oc -n openshift-ingress get pod
NAME                                READY   STATUS    RESTARTS   AGE
router-default-5976b747bd-btqvg     1/1     Running   0          4h17m
router-default-5976b747bd-gbbq4     1/1     Running   0          4h17m
router-tlspolicy-55f8796c8f-rxnvv   0/1     Running   3          104s

$ oc -n openshift-ingress logs router-tlspolicy-55f8796c8f-rxnvv
2019-10-21T05:52:59.798Z	INFO	router.router	router/template.go:293	starting router	{"version": "v0.0.0-master+$Format:%h$"}
2019-10-21T05:52:59.801Z	INFO	router.metrics	metrics/metrics.go:153	router health and metrics port listening on HTTP and HTTPS	{"address": "0.0.0.0:1936"}
2019-10-21T05:52:59.806Z	INFO	router.template	template/router.go:294	watching for changes	{"path": "/etc/pki/tls/private"}
E1021 05:52:59.809115       1 haproxy.go:395] can't scrape HAProxy: dial unix /var/lib/haproxy/run/haproxy.sock: connect: no such file or directory
2019-10-21T05:52:59.824Z	INFO	router.router	router/router.go:257	router is including routes in all namespaces
E1021 05:53:00.031890       1 haproxy.go:395] can't scrape HAProxy: dial unix /var/lib/haproxy/run/haproxy.sock: connect: no such file or directory
E1021 05:53:00.049139       1 limiter.go:140] error reloading router: exit status 1
[ALERT] 293/055300 (27) : Proxy 'fe_sni': all SSL/TLS versions are disabled for bind '127.0.0.1:10444' at [/var/lib/haproxy/conf/haproxy.config:117].
[ALERT] 293/055300 (27) : Proxy 'fe_sni': unable to set SSL cipher list to 'TLS_AES_128_GCM_SHA256:TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256' for bind '127.0.0.1:10444' at [/var/lib/haproxy/conf/haproxy.config:117].
[ALERT] 293/055300 (27) : Proxy 'fe_sni': unable to set SSL cipher list to 'TLS_AES_128_GCM_SHA256:TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256' for bind '127.0.0.1:10444' at [/var/lib/haproxy/conf/haproxy.config:117].
[ALERT] 293/055300 (27) : Proxy 'fe_no_sni': all SSL/TLS versions are disabled for bind '127.0.0.1:10443' at [/var/lib/haproxy/conf/haproxy.config:154].
[ALERT] 293/055300 (27) : Proxy 'fe_no_sni': unable to set SSL cipher list to 'TLS_AES_128_GCM_SHA256:TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256' for bind '127.0.0.1:10443' at [/var/lib/haproxy/conf/haproxy.config:154].
[ALERT] 293/055300 (27) : Proxy 'fe_no_sni': unable to set SSL cipher list to 'TLS_AES_128_GCM_SHA256:TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256' for bind '127.0.0.1:10443' at [/var/lib/haproxy/conf/haproxy.config:154].
[ALERT] 293/055300 (27) : Fatal errors found in configuration.
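
For triage, the alerts above can be condensed to one entry per affected bind. The following is a minimal Python sketch, assuming the alert line format shown in this log; the regular expression and the helper name `summarize_alerts` are illustrative and not part of HAProxy or the router.

```python
import re

# Pattern inferred from the alert lines in this report (an assumption, not an
# official HAProxy log grammar).
ALERT_RE = re.compile(
    r"\[ALERT\] \S+ \(\d+\) : Proxy '(?P<proxy>[^']+)': (?P<msg>.+?)"
    r" for bind '(?P<bind>[^']+)' at \[(?P<file>[^\]:]+):(?P<line>\d+)\]\."
)

def summarize_alerts(log_text):
    """Return {(proxy, bind, config_line): sorted unique messages}."""
    found = {}
    for m in ALERT_RE.finditer(log_text):
        key = (m["proxy"], m["bind"], int(m["line"]))
        found.setdefault(key, set()).add(m["msg"])
    return {key: sorted(msgs) for key, msgs in found.items()}

sample = "\n".join([
    "[ALERT] 293/055300 (27) : Proxy 'fe_sni': all SSL/TLS versions are disabled "
    "for bind '127.0.0.1:10444' at [/var/lib/haproxy/conf/haproxy.config:117].",
    "[ALERT] 293/055300 (27) : Proxy 'fe_sni': unable to set SSL cipher list to "
    "'TLS_AES_128_GCM_SHA256:TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256' "
    "for bind '127.0.0.1:10444' at [/var/lib/haproxy/conf/haproxy.config:117].",
    "[ALERT] 293/055300 (27) : Fatal errors found in configuration.",
])

summary = summarize_alerts(sample)
for (proxy, bind, line), messages in summary.items():
    print(proxy, bind, line, messages)
```

Lines without a proxy and bind, such as the final "Fatal errors" line, are ignored by this pattern.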


Expected results:
The HAProxy router pod should start and run without restarting.

Additional info:
$ oc -n openshift-ingress-operator get ingresscontrollers.operator.openshift.io tlspolicy -o yaml
<---snip--->
  tlsSecurityProfile:
    type: Modern
status:
  tlsProfile:
    ciphers:
    - TLS_AES_128_GCM_SHA256
    - TLS_AES_256_GCM_SHA384
    - TLS_CHACHA20_POLY1305_SHA256
    minTLSVersion: VersionTLSv13

$ oc -n openshift-ingress get deployment router-tlspolicy -o yaml
<---snip--->
        - name: ROUTER_CIPHERS
          value: TLS_AES_128_GCM_SHA256:TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256
        - name: SSL_MIN_VERSION
          value: TLSv1.3
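
Judging from the two outputs above, the deployment values look like a direct rendering of status.tlsProfile: the ciphers joined with colons and the minimum version mapped to an HAProxy-style string. A minimal Python sketch of that mapping follows. The helper name `router_env` is hypothetical, and the version table contains only the value observed in this report; this is not the operator's actual code.

```python
# Mapping inferred from the outputs in this report (assumption); other
# versions are omitted because their rendered values are not shown here.
MIN_VERSION_NAMES = {
    "VersionTLSv13": "TLSv1.3",
}

def router_env(tls_profile):
    """Build router env vars from a status.tlsProfile-style dict."""
    return {
        "ROUTER_CIPHERS": ":".join(tls_profile["ciphers"]),
        "SSL_MIN_VERSION": MIN_VERSION_NAMES[tls_profile["minTLSVersion"]],
    }

modern_status = {
    "ciphers": [
        "TLS_AES_128_GCM_SHA256",
        "TLS_AES_256_GCM_SHA384",
        "TLS_CHACHA20_POLY1305_SHA256",
    ],
    "minTLSVersion": "VersionTLSv13",
}

env = router_env(modern_status)
for name, value in env.items():
    print(f"{name}={value}")
```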

Comment 1 Daneyon Hansen 2019-10-23 04:29:59 UTC
The TLSSecurityProfile ciphers [1] are based on newer versions of OpenSSL [2] than the one used by the ingress router:

$ oc exec -n openshift-ingress router-default-696f5bcb57-mdjxs -- haproxy -vv
HA-Proxy version 1.8.17 2019/01/08
<SNIP>
Built with OpenSSL version : OpenSSL 1.0.2k-fips  26 Jan 2017
Running on OpenSSL version : OpenSSL 1.0.2k-fips  26 Jan 2017
<SNIP>

Therefore, the Modern and Intermediate profiles are incompatible with the current versions of HAProxy and OpenSSL.
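
To illustrate the incompatibility for the Modern profile: all three names in ROUTER_CIPHERS are TLS 1.3 cipher suites, which OpenSSL 1.0.2 does not recognize, so no usable cipher remains. A rough Python sketch, assuming the TLS 1.3 suite names from RFC 8446; the helper name `usable_without_tls13` is hypothetical.

```python
# The five cipher suites defined for TLS 1.3 in RFC 8446.
TLS13_SUITES = frozenset({
    "TLS_AES_128_GCM_SHA256",
    "TLS_AES_256_GCM_SHA384",
    "TLS_CHACHA20_POLY1305_SHA256",
    "TLS_AES_128_CCM_SHA256",
    "TLS_AES_128_CCM_8_SHA256",
})

def usable_without_tls13(router_ciphers):
    """Return the ciphers from a colon-separated list that are not TLS 1.3 suites."""
    return [c for c in router_ciphers.split(":") if c and c not in TLS13_SUITES]

modern = "TLS_AES_128_GCM_SHA256:TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256"
remaining = usable_without_tls13(modern)
print(remaining)  # prints []
```

An empty result is consistent with the "unable to set SSL cipher list" alerts in the description.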

Possible solutions to fix this issue are:

1. Rebuild HAProxy 1.8.17 with OpenSSL v1.1.1. Update `haproxy-config.template` to use `ssl-default-bind-ciphersuites` to specify the TLS v1.3 ciphers for all applicable TLS security profiles.
2. Have ingress operator mutate the cipher suites of TLS security profiles to conform to OpenSSL v1.0.2.
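
As a sketch of what option 1 implies for the template, the profile's ciphers would need to be split across two HAProxy global directives: `ssl-default-bind-ciphers` for TLS 1.2 and older, and `ssl-default-bind-ciphersuites` for TLS 1.3. The Python below is illustrative only. The helper name `bind_cipher_directives` is hypothetical, and treating every name that starts with `TLS_` as a TLS 1.3 suite is a heuristic based on OpenSSL naming, not something taken from the router code.

```python
def bind_cipher_directives(ciphers):
    """Render HAProxy global directives for a mixed cipher list."""
    # Heuristic (assumption): OpenSSL names TLS 1.3 suites with a "TLS_" prefix.
    tls13 = [c for c in ciphers if c.startswith("TLS_")]
    older = [c for c in ciphers if not c.startswith("TLS_")]
    lines = []
    if older:
        lines.append("ssl-default-bind-ciphers " + ":".join(older))
    if tls13:
        lines.append("ssl-default-bind-ciphersuites " + ":".join(tls13))
    return lines

for line in bind_cipher_directives([
    "TLS_AES_128_GCM_SHA256",
    "ECDHE-RSA-AES128-GCM-SHA256",
    "TLS_AES_256_GCM_SHA384",
]):
    print(line)
```

Option 2 would amount to keeping only the first directive and dropping the TLS 1.3 names.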

Note: The current version of OpenSSL is FIPS certified. OpenSSL 1.1.1 (required for TLS v1.3 support) is not FIPS certified.

TLS v1.3 support should be removed from the ingress operator TLS security profile implementation until HAProxy is built with a version of OpenSSL that supports TLS v1.3. FIPS support is planned for OpenSSL v3 (under development) [3].

Additional details can be found at [4].

[1] https://github.com/openshift/api/blob/master/config/v1/types_tlssecurityprofile.go#L179-L233
[2] https://wiki.mozilla.org/Security/Server_Side_TLS#Recommended_configurations
[3] https://wiki.openssl.org/index.php/FIPS_modules
[4] https://gist.github.com/danehans/6ba50f05eb6e7d9f684c1cbf3db3d867

Comment 3 Ben Bennett 2019-10-28 13:47:05 UTC
The enhancement spec says that things using the cipher policies should make their best effort to support the profiles.

Since TLS 1.2 has no security problems and TLS 1.3 isn't supported in the router image yet, we should select the TLS 1.2 ciphers equivalent to the recommended intermediate ones and use TLS 1.2 for now.

Comment 5 Daneyon Hansen 2019-10-28 15:40:03 UTC
Kirsten,

https://github.com/openshift/cluster-ingress-operator/pull/315 has been submitted to convert Modern profiles to Intermediate until HAProxy is built against a version of OpenSSL that supports TLS 1.3.
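
The fallback can be pictured roughly as follows. This is a Python sketch of the idea only, not the code in the pull request: the function name `effective_profile` and the `tls13_supported` flag are hypothetical, and the Intermediate values are copied from the verification output in Comment 7.

```python
MODERN = {
    "ciphers": [
        "TLS_AES_128_GCM_SHA256",
        "TLS_AES_256_GCM_SHA384",
        "TLS_CHACHA20_POLY1305_SHA256",
    ],
    "minTLSVersion": "VersionTLSv13",  # spelling as shown in the description
}

INTERMEDIATE = {
    "ciphers": MODERN["ciphers"] + [
        "ECDHE-ECDSA-AES128-GCM-SHA256",
        "ECDHE-RSA-AES128-GCM-SHA256",
        "ECDHE-ECDSA-AES256-GCM-SHA384",
        "ECDHE-RSA-AES256-GCM-SHA384",
        "ECDHE-ECDSA-CHACHA20-POLY1305",
        "ECDHE-RSA-CHACHA20-POLY1305",
        "DHE-RSA-AES128-GCM-SHA256",
        "DHE-RSA-AES256-GCM-SHA384",
    ],
    "minTLSVersion": "VersionTLS12",  # spelling as shown in Comment 7
}

PROFILES = {"Modern": MODERN, "Intermediate": INTERMEDIATE}

def effective_profile(profile_type, tls13_supported=False):
    """Return the profile to apply, downgrading Modern when TLS 1.3 is unavailable."""
    if profile_type == "Modern" and not tls13_supported:
        return INTERMEDIATE
    return PROFILES[profile_type]

print(effective_profile("Modern")["minTLSVersion"])
```

With the flag left at its default, a Modern request yields the Intermediate cipher list and a TLS 1.2 minimum.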

Comment 7 Hongan Li 2019-11-07 06:39:21 UTC
Verified with 4.3.0-0.nightly-2019-11-06-184828; the issue has been fixed. No errors in the router pod.

$ oc get ingresscontroller/default -n openshift-ingress-operator -o yaml
<---snip--->
spec:
  replicas: 2
  tlsSecurityProfile:
    type: Modern
status:
  tlsProfile:
    ciphers:
    - TLS_AES_128_GCM_SHA256
    - TLS_AES_256_GCM_SHA384
    - TLS_CHACHA20_POLY1305_SHA256
    - ECDHE-ECDSA-AES128-GCM-SHA256
    - ECDHE-RSA-AES128-GCM-SHA256
    - ECDHE-ECDSA-AES256-GCM-SHA384
    - ECDHE-RSA-AES256-GCM-SHA384
    - ECDHE-ECDSA-CHACHA20-POLY1305
    - ECDHE-RSA-CHACHA20-POLY1305
    - DHE-RSA-AES128-GCM-SHA256
    - DHE-RSA-AES256-GCM-SHA384
    minTLSVersion: VersionTLS12

Comment 9 errata-xmlrpc 2020-01-23 11:08:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0062