Bug 1973770 - [4.6.z] 4.5 -> 4.6 upgrade failed with ovn pod error: SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
Summary: [4.6.z] 4.5 -> 4.6 upgrade failed with ovn pod error: SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.6.z
Assignee: Jaime Caamaño Ruiz
QA Contact: Arti Sood
URL:
Whiteboard:
Duplicates: 1974424
Depends On: 1973768
Blocks:
 
Reported: 2021-06-18 16:40 UTC by Jaime Caamaño Ruiz
Modified: 2021-11-16 22:12 UTC
CC List: 13 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: During the 4.5 to 4.6 upgrade, the stricter security requirements of the OpenSSL versions included in the 4.6 ovn-kubernetes components prevented the upgrade from completing successfully. Specifically, the use of 1024-bit DH parameters was disallowed by those OpenSSL versions. Consequence: The upgrade of ovn-kubernetes, and thus of the cluster-network-operator, does not progress to completion and the upgrade is stuck. Fix: Soften the OpenSSL security requirements to allow the use of 1024-bit DH parameters in the ovn-kubernetes components (see the illustrative openssl commands after this field list). Result: The use of 1024-bit DH parameters with OpenSSL no longer prevents the 4.5 to 4.6 upgrade from completing.
Clone Of: 1961528
Environment:
Last Closed: 2021-09-09 01:52:52 UTC
Target Upstream Version:
Embargoed:
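
To illustrate the Doc Text above: the following is a minimal local sketch of the OpenSSL security-level behaviour behind this bug, using only the openssl CLI. The file names, port, and self-signed certificate are made up for the demonstration and are not the actual ovn-kubernetes configuration; the point is only that security level 2 (the RHEL 8 default used by the 4.6 components) rejects 1024-bit DH parameters with the exact error seen in this bug, while level 1 accepts them.

# Throwaway self-signed certificate plus 1024-bit DH parameters (the key size the 4.5 ovsdb servers offer)
openssl req -x509 -newkey rsa:2048 -nodes -keyout server.key -out server.crt -subj /CN=localhost -days 1
openssl dhparam -out dh1024.pem 1024

# Serve TLS with a DHE-only cipher so the 1024-bit DH parameters are actually used;
# @SECLEVEL=1 on the server side mimics the older OpenSSL that still tolerates them
openssl s_server -accept 9642 -cert server.crt -key server.key \
    -dhparam dh1024.pem -cipher 'DHE-RSA-AES256-GCM-SHA384:@SECLEVEL=1' &

# Security level 2 (4.6 client behaviour): handshake aborts with
# "SSL routines:tls_process_ske_dhe:dh key too small"
openssl s_client -connect 127.0.0.1:9642 -tls1_2 -cipher 'DEFAULT:@SECLEVEL=2' </dev/null

# Security level 1 (what the fix allows for the ovsdb connections): handshake completes
openssl s_client -connect 127.0.0.1:9642 -tls1_2 -cipher 'DEFAULT:@SECLEVEL=1' </dev/null

# Clean up the background server afterwards
kill %1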




Links
System                  ID                                            Private  Priority  Status  Summary                                                  Last Updated
Github                  openshift cluster-network-operator pull 1131  0        None      open    Bug 1973770: Reduce SSL sec level for ovsdb connections  2021-06-24 14:36:12 UTC
Red Hat Product Errata  RHBA-2021:3395                                0        None      None    None                                                     2021-09-09 01:53:14 UTC

Comment 1 Jack Ottofaro 2021-06-21 14:21:31 UTC
We're still looking for an impact assessment for this bug and its parent since both are marked UpgradeBlocker.

Please answer the following questions to evaluate whether or not this bug warrants blocking an upgrade edge from either the previous X.Y or X.Y.Z.  The ultimate goal is to avoid delivering an update which introduces new risk or reduces cluster functionality in any way.  Sample answers are provided to give more context and the ImpactStatementRequested label has been added to this bug.  When responding, please remove ImpactStatementRequested and set the ImpactStatementProposed label.
The expectation is that the assignee answers these questions.

Who is impacted?  If we have to block upgrade edges based on this issue, which edges would need blocking?
* example: Customers upgrading from 4.y.Z to 4.y+1.z running on GCP with thousands of namespaces, approximately 5% of the subscribed fleet
* example: All customers upgrading from 4.y.z to 4.y+1.z fail approximately 10% of the time

What is the impact?  Is it serious enough to warrant blocking edges?
* example: Up to 2 minute disruption in edge routing
* example: Up to 90 seconds of API downtime
* example: etcd loses quorum and you have to restore from backup

How involved is remediation (even moderately serious impacts might be acceptable if they are easy to mitigate)?
* example: Issue resolves itself after five minutes
* example: Admin uses oc to fix things
* example: Admin must SSH to hosts, restore from backups, or other non-standard admin activities

Is this a regression (if all previous versions were also vulnerable, updating to the new, vulnerable version does not increase exposure)?
* example: No, it has always been like this we just never noticed
* example: Yes, from 4.y.z to 4.y+1.z or from 4.y.z to 4.y.z+1

Comment 2 Xingxing Xia 2021-06-22 02:45:00 UTC
*** Bug 1974424 has been marked as a duplicate of this bug. ***

Comment 3 Jaime Caamaño Ruiz 2021-06-24 08:52:47 UTC
Who is impacted? 
* All customers upgrading from 4.5 to 4.6 using ovn-kubernetes networking.

What is the impact?  Is it serious enough to warrant blocking edges?
* Upgrade does not complete successfully

How involved is remediation (even moderately serious impacts might be acceptable if they are easy to mitigate)?
* Unknown remediation

Is this a regression?
* Not that I know of

Comment 8 Ross Brattain 2021-07-20 13:39:33 UTC
Closing.

It seems no one is using 4.5 OVN.

Comment 10 W. Trevor King 2021-08-18 22:18:24 UTC
Only one bug in the series needs UpgradeBlocker, so I'm removing it here.  If folks think this series deserves blocking edges, please follow up after [1].

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1961528#c28

Comment 14 Arti Sood 2021-08-24 00:25:05 UTC
Upgrade from 4.5.41 -> 4.6.43 still hits the issue referred to in this bug. I see the PR is merged.

oc get pods -n openshift-ovn-kubernetes
NAME                   READY   STATUS             RESTARTS   AGE
ovnkube-master-4mlvt   4/4     Running            1          3h19m
ovnkube-master-8mjg7   4/4     Running            0          3h19m
ovnkube-master-qtgx2   4/4     Running            0          3h19m
ovnkube-node-4sz6r     2/2     Running            0          3h8m
ovnkube-node-clb4n     2/2     Running            0          3h19m
ovnkube-node-k2vfv     2/2     Running            0          3h19m
ovnkube-node-nzw5v     2/3     CrashLoopBackOff   24         127m
ovnkube-node-shff4     2/2     Running            0          3h8m
ovnkube-node-vjtpk     2/2     Running            0          3h19m
ovs-node-5h6hq         1/1     Running            0          126m
ovs-node-822rv         1/1     Running            0          127m
ovs-node-cpbz4         1/1     Running            0          124m
ovs-node-g982r         1/1     Running            0          127m
ovs-node-gvvwm         1/1     Running            0          124m
ovs-node-q25nk         1/1     Running            0          125m
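
For reference, the ovn-controller log below from the crashlooping pod was presumably gathered with a command along these lines (the container name is an assumption based on the ovnkube-node pod layout in 4.6):

oc -n openshift-ovn-kubernetes logs ovnkube-node-nzw5v -c ovn-controller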

2021-08-23T22:10:55+00:00 - starting ovn-controller
2021-08-23T22:10:55Z|00001|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connecting...
2021-08-23T22:10:55Z|00002|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connected
2021-08-23T22:10:55Z|00003|main|INFO|OVN internal version is : [20.12.0-20.16.1-56.0]
2021-08-23T22:10:55Z|00004|main|INFO|OVS IDL reconnected, force recompute.
2021-08-23T22:10:55Z|00005|reconnect|INFO|ssl:10.0.160.221:9642: connecting...
2021-08-23T22:10:55Z|00006|main|INFO|OVNSB IDL reconnected, force recompute.
2021-08-23T22:10:55Z|00007|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:10:55Z|00008|reconnect|INFO|ssl:10.0.160.221:9642: connection attempt failed (Protocol error)
2021-08-23T22:10:55Z|00009|reconnect|INFO|ssl:10.0.212.213:9642: connecting...
2021-08-23T22:10:55Z|00010|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:10:55Z|00011|reconnect|INFO|ssl:10.0.212.213:9642: connection attempt failed (Protocol error)
2021-08-23T22:10:55Z|00012|reconnect|INFO|ssl:10.0.145.82:9642: connecting...
2021-08-23T22:10:55Z|00013|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:10:55Z|00014|reconnect|INFO|ssl:10.0.145.82:9642: connection attempt failed (Protocol error)
2021-08-23T22:10:56Z|00015|reconnect|INFO|ssl:10.0.160.221:9642: connecting...
2021-08-23T22:10:56Z|00016|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:10:56Z|00017|reconnect|INFO|ssl:10.0.160.221:9642: connection attempt failed (Protocol error)
2021-08-23T22:10:56Z|00018|reconnect|INFO|ssl:10.0.160.221:9642: waiting 2 seconds before reconnect
2021-08-23T22:10:58Z|00019|reconnect|INFO|ssl:10.0.212.213:9642: connecting...
2021-08-23T22:10:58Z|00020|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:10:58Z|00021|reconnect|INFO|ssl:10.0.212.213:9642: connection attempt failed (Protocol error)
2021-08-23T22:10:58Z|00022|reconnect|INFO|ssl:10.0.212.213:9642: waiting 4 seconds before reconnect
2021-08-23T22:11:02Z|00023|reconnect|INFO|ssl:10.0.145.82:9642: connecting...
2021-08-23T22:11:02Z|00024|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:11:02Z|00025|reconnect|INFO|ssl:10.0.145.82:9642: connection attempt failed (Protocol error)
2021-08-23T22:11:02Z|00026|reconnect|INFO|ssl:10.0.145.82:9642: continuing to reconnect in the background but suppressing further logging
2021-08-23T22:11:10Z|00027|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:11:18Z|00028|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:11:26Z|00029|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:11:34Z|00030|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:11:42Z|00031|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:11:50Z|00032|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:11:58Z|00033|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:12:06Z|00034|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:12:14Z|00035|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:12:22Z|00036|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:12:30Z|00037|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:12:38Z|00038|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:12:46Z|00039|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:12:54Z|00040|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:13:02Z|00041|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:13:10Z|00042|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:13:18Z|00043|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:13:26Z|00044|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:13:32Z|00045|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connection closed by peer
2021-08-23T22:13:33Z|00046|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connecting...
2021-08-23T22:13:33Z|00047|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connection attempt failed (No such file or directory)
2021-08-23T22:13:33Z|00048|reconnect|INFO|unix:/var/run/openvswitch/db.sock: waiting 2 seconds before reconnect
2021-08-23T22:13:34Z|00049|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:13:35Z|00050|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connecting...
2021-08-23T22:13:35Z|00051|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connection attempt failed (No such file or directory)
2021-08-23T22:13:35Z|00052|reconnect|INFO|unix:/var/run/openvswitch/db.sock: waiting 4 seconds before reconnect
2021-08-23T22:13:39Z|00053|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connecting...
2021-08-23T22:13:39Z|00054|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connection attempt failed (No such file or directory)
2021-08-23T22:13:39Z|00055|reconnect|INFO|unix:/var/run/openvswitch/db.sock: continuing to reconnect in the background but suppressing further logging
2021-08-23T22:13:42Z|00056|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:13:47Z|00057|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connected
2021-08-23T22:13:50Z|00058|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:13:58Z|00059|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
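
A quick way to confirm from a host with network access to the masters that the not-yet-upgraded southbound ovsdb servers still offer 1024-bit DH parameters is an openssl s_client probe against one of the endpoints from the log above. This is a hypothetical diagnostic, not output captured from this cluster, and the SB DB may additionally require the OVN client certificate (-cert/-key):

# IP and port taken from the ovn-controller log above
openssl s_client -connect 10.0.160.221:9642 -tls1_2 -cipher 'DHE:@SECLEVEL=1' </dev/null 2>/dev/null | grep 'Server Temp Key'
# Expected on an unfixed server: "Server Temp Key: DH, 1024 bits"
# At the default RHEL 8 security level (SECLEVEL=2) the same probe instead fails with
# "tls_process_ske_dhe:dh key too small", matching the errors logged by ovn-controller.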


oc get co
NAME                                       VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.6.43    True        False         False      129m
cloud-credential                           4.6.43    True        False         False      3h34m
cluster-autoscaler                         4.6.43    True        False         False      3h18m
config-operator                            4.6.43    True        False         False      3h18m
console                                    4.6.43    True        False         False      142m
csi-snapshot-controller                    4.6.43    True        False         False      3h19m
dns                                        4.5.41    True        True          False      3h21m
etcd                                       4.6.43    True        False         False      3h22m
image-registry                             4.6.43    True        True          False      3h12m
ingress                                    4.6.43    True        False         False      143m
insights                                   4.6.43    True        False         False      3h19m
kube-apiserver                             4.6.43    True        False         False      3h21m
kube-controller-manager                    4.6.43    True        False         False      3h20m
kube-scheduler                             4.6.43    True        False         False      3h21m
kube-storage-version-migrator              4.6.43    True        False         False      3h12m
machine-api                                4.6.43    True        False         False      3h16m
machine-approver                           4.6.43    True        False         False      3h21m
machine-config                             4.5.41    True        False         False      3h22m
marketplace                                4.6.43    True        False         False      142m
monitoring                                 4.6.43    False       True          True       123m
network                                    4.5.41    True        True          True       3h24m
node-tuning                                4.6.43    True        False         False      142m
openshift-apiserver                        4.6.43    True        False         False      131m
openshift-controller-manager               4.6.43    True        False         False      143m
openshift-samples                          4.6.43    True        False         False      141m
operator-lifecycle-manager                 4.6.43    True        False         False      3h23m
operator-lifecycle-manager-catalog         4.6.43    True        False         False      3h22m
operator-lifecycle-manager-packageserver   4.6.43    True        False         False      131m
service-ca                                 4.6.43    True        False         False      3h24m
storage                                    4.6.43    True        False         False      142m

oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.5.41    True        True          169m    Unable to apply 4.6.43: the cluster operator monitoring is degraded

Comment 19 Arti Sood 2021-08-25 16:01:35 UTC
Verified that the issue referred to in this bug is fixed, but the overall 4.5.z to 4.6 upgrade fails.

Comment 22 errata-xmlrpc 2021-09-09 01:52:52 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6.44 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:3395

