We're still looking for an impact assessment for this bug and its parent, since both are marked UpgradeBlocker. Please answer the following questions to evaluate whether or not this bug warrants blocking an upgrade edge from either the previous X.Y or X.Y.Z. The ultimate goal is to avoid delivering an update which introduces new risk or reduces cluster functionality in any way. Sample answers are provided to give more context, and the ImpactStatementRequested label has been added to this bug. When responding, please remove ImpactStatementRequested and set the ImpactStatementProposed label. The expectation is that the assignee answers these questions.

Who is impacted? If we have to block upgrade edges based on this issue, which edges would need blocking?
* example: Customers upgrading from 4.y.z to 4.y+1.z running on GCP with thousands of namespaces, approximately 5% of the subscribed fleet
* example: All customers upgrading from 4.y.z to 4.y+1.z fail approximately 10% of the time

What is the impact? Is it serious enough to warrant blocking edges?
* example: Up to 2 minute disruption in edge routing
* example: Up to 90 seconds of API downtime
* example: etcd loses quorum and you have to restore from backup

How involved is remediation (even moderately serious impacts might be acceptable if they are easy to mitigate)?
* example: Issue resolves itself after five minutes
* example: Admin uses oc to fix things
* example: Admin must SSH to hosts, restore from backups, or other non-standard admin activities

Is this a regression (if all previous versions were also vulnerable, updating to the new, vulnerable version does not increase exposure)?
* example: No, it has always been like this; we just never noticed
* example: Yes, from 4.y.z to 4.y+1.z, or from 4.y.z to 4.y.z+1
*** Bug 1974424 has been marked as a duplicate of this bug. ***
Who is impacted?
* All customers upgrading from 4.5 to 4.6 using ovn-kubernetes networking.

What is the impact? Is it serious enough to warrant blocking edges?
* The upgrade does not complete successfully.

How involved is remediation (even moderately serious impacts might be acceptable if they are easy to mitigate)?
* Unknown remediation.

Is this a regression?
* Not that I know of.
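For anyone assessing exposure, two quick checks should tell whether a given cluster is in the impacted set. These use the standard network.config and clusterversion resources; the jsonpath fields are the usual ones, but worth double-checking on your version:

  # Affected only if this prints OVNKubernetes
  oc get network.config/cluster -o jsonpath='{.status.networkType}'

  # Affected only when taking the 4.5 -> 4.6 edge from a 4.5.z release
  oc get clusterversion version -o jsonpath='{.status.desired.version}'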
Closing. It seems no one is using 4.5 OVN.
Only one bug in the series needs UpgradeBlocker, so I'm removing it here. If folks think this series deserves blocking edges, please follow up after [1]. [1]: https://bugzilla.redhat.com/show_bug.cgi?id=1961528#c28
Upgrade from 4.5.41 -> 4.6.43 still has the issue referred to in this bug. I see the PR is merged.

oc get pods -n openshift-ovn-kubernetes
NAME                   READY   STATUS             RESTARTS   AGE
ovnkube-master-4mlvt   4/4     Running            1          3h19m
ovnkube-master-8mjg7   4/4     Running            0          3h19m
ovnkube-master-qtgx2   4/4     Running            0          3h19m
ovnkube-node-4sz6r     2/2     Running            0          3h8m
ovnkube-node-clb4n     2/2     Running            0          3h19m
ovnkube-node-k2vfv     2/2     Running            0          3h19m
ovnkube-node-nzw5v     2/3     CrashLoopBackOff   24         127m
ovnkube-node-shff4     2/2     Running            0          3h8m
ovnkube-node-vjtpk     2/2     Running            0          3h19m
ovs-node-5h6hq         1/1     Running            0          126m
ovs-node-822rv         1/1     Running            0          127m
ovs-node-cpbz4         1/1     Running            0          124m
ovs-node-g982r         1/1     Running            0          127m
ovs-node-gvvwm         1/1     Running            0          124m
ovs-node-q25nk         1/1     Running            0          125m

ovn-controller log from the CrashLoopBackOff ovnkube-node pod:

:55+00:00 - starting ovn-controller
2021-08-23T22:10:55Z|00001|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connecting...
2021-08-23T22:10:55Z|00002|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connected
2021-08-23T22:10:55Z|00003|main|INFO|OVN internal version is : [20.12.0-20.16.1-56.0]
2021-08-23T22:10:55Z|00004|main|INFO|OVS IDL reconnected, force recompute.
2021-08-23T22:10:55Z|00005|reconnect|INFO|ssl:10.0.160.221:9642: connecting...
2021-08-23T22:10:55Z|00006|main|INFO|OVNSB IDL reconnected, force recompute.
2021-08-23T22:10:55Z|00007|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:10:55Z|00008|reconnect|INFO|ssl:10.0.160.221:9642: connection attempt failed (Protocol error)
2021-08-23T22:10:55Z|00009|reconnect|INFO|ssl:10.0.212.213:9642: connecting...
2021-08-23T22:10:55Z|00010|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:10:55Z|00011|reconnect|INFO|ssl:10.0.212.213:9642: connection attempt failed (Protocol error)
2021-08-23T22:10:55Z|00012|reconnect|INFO|ssl:10.0.145.82:9642: connecting...
2021-08-23T22:10:55Z|00013|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:10:55Z|00014|reconnect|INFO|ssl:10.0.145.82:9642: connection attempt failed (Protocol error)
2021-08-23T22:10:56Z|00015|reconnect|INFO|ssl:10.0.160.221:9642: connecting...
2021-08-23T22:10:56Z|00016|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:10:56Z|00017|reconnect|INFO|ssl:10.0.160.221:9642: connection attempt failed (Protocol error)
2021-08-23T22:10:56Z|00018|reconnect|INFO|ssl:10.0.160.221:9642: waiting 2 seconds before reconnect
2021-08-23T22:10:58Z|00019|reconnect|INFO|ssl:10.0.212.213:9642: connecting...
2021-08-23T22:10:58Z|00020|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:10:58Z|00021|reconnect|INFO|ssl:10.0.212.213:9642: connection attempt failed (Protocol error)
2021-08-23T22:10:58Z|00022|reconnect|INFO|ssl:10.0.212.213:9642: waiting 4 seconds before reconnect
2021-08-23T22:11:02Z|00023|reconnect|INFO|ssl:10.0.145.82:9642: connecting...
2021-08-23T22:11:02Z|00024|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:11:02Z|00025|reconnect|INFO|ssl:10.0.145.82:9642: connection attempt failed (Protocol error)
2021-08-23T22:11:02Z|00026|reconnect|INFO|ssl:10.0.145.82:9642: continuing to reconnect in the background but suppressing further logging
2021-08-23T22:11:10Z|00027|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:11:18Z|00028|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:11:26Z|00029|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:11:34Z|00030|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:11:42Z|00031|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:11:50Z|00032|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:11:58Z|00033|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:12:06Z|00034|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:12:14Z|00035|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:12:22Z|00036|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:12:30Z|00037|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:12:38Z|00038|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:12:46Z|00039|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:12:54Z|00040|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:13:02Z|00041|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:13:10Z|00042|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:13:18Z|00043|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:13:26Z|00044|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:13:32Z|00045|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connection closed by peer
2021-08-23T22:13:33Z|00046|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connecting...
2021-08-23T22:13:33Z|00047|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connection attempt failed (No such file or directory)
2021-08-23T22:13:33Z|00048|reconnect|INFO|unix:/var/run/openvswitch/db.sock: waiting 2 seconds before reconnect
2021-08-23T22:13:34Z|00049|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:13:35Z|00050|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connecting...
2021-08-23T22:13:35Z|00051|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connection attempt failed (No such file or directory)
2021-08-23T22:13:35Z|00052|reconnect|INFO|unix:/var/run/openvswitch/db.sock: waiting 4 seconds before reconnect
2021-08-23T22:13:39Z|00053|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connecting...
2021-08-23T22:13:39Z|00054|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connection attempt failed (No such file or directory)
2021-08-23T22:13:39Z|00055|reconnect|INFO|unix:/var/run/openvswitch/db.sock: continuing to reconnect in the background but suppressing further logging
2021-08-23T22:13:42Z|00056|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:13:47Z|00057|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connected
2021-08-23T22:13:50Z|00058|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small
2021-08-23T22:13:58Z|00059|stream_ssl|WARN|SSL_connect: error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small

oc get co
NAME                                       VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.6.43    True        False         False      129m
cloud-credential                           4.6.43    True        False         False      3h34m
cluster-autoscaler                         4.6.43    True        False         False      3h18m
config-operator                            4.6.43    True        False         False      3h18m
console                                    4.6.43    True        False         False      142m
csi-snapshot-controller                    4.6.43    True        False         False      3h19m
dns                                        4.5.41    True        True          False      3h21m
etcd                                       4.6.43    True        False         False      3h22m
image-registry                             4.6.43    True        True          False      3h12m
ingress                                    4.6.43    True        False         False      143m
insights                                   4.6.43    True        False         False      3h19m
kube-apiserver                             4.6.43    True        False         False      3h21m
kube-controller-manager                    4.6.43    True        False         False      3h20m
kube-scheduler                             4.6.43    True        False         False      3h21m
kube-storage-version-migrator              4.6.43    True        False         False      3h12m
machine-api                                4.6.43    True        False         False      3h16m
machine-approver                           4.6.43    True        False         False      3h21m
machine-config                             4.5.41    True        False         False      3h22m
marketplace                                4.6.43    True        False         False      142m
monitoring                                 4.6.43    False       True          True       123m
network                                    4.5.41    True        True          True       3h24m
node-tuning                                4.6.43    True        False         False      142m
openshift-apiserver                        4.6.43    True        False         False      131m
openshift-controller-manager               4.6.43    True        False         False      143m
openshift-samples                          4.6.43    True        False         False      141m
operator-lifecycle-manager                 4.6.43    True        False         False      3h23m
operator-lifecycle-manager-catalog         4.6.43    True        False         False      3h22m
operator-lifecycle-manager-packageserver   4.6.43    True        False         False      131m
service-ca                                 4.6.43    True        False         False      3h24m
storage                                    4.6.43    True        False         False      142m

oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.5.41    True        True          169m    Unable to apply 4.6.43: the cluster operator monitoring is degraded
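The repeated "dh key too small" warnings above appear to be the newer OpenSSL in the 4.6 ovn-controller rejecting the short ephemeral Diffie-Hellman key still offered by the not-yet-upgraded southbound DB on port 9642. A rough way to confirm from a host that can reach the SB DB (the IP comes from the log above; forcing DHE ciphers is my assumption, to steer the server toward the weak key exchange):

  # Expect the handshake to fail with "dh key too small" while the server
  # still serves weak DH parameters; a different failure (e.g. certificate
  # verification) would suggest the DH problem itself is gone.
  openssl s_client -connect 10.0.160.221:9642 -cipher 'DHE'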
Verified that the issue referred to in this bug is fixed, but the overall upgrade from 4.5.z to 4.6 still fails.
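For reference, a minimal spot check that the SSL piece specifically is gone (assuming the container is named ovn-controller, as in the log above; no output means no more weak-DH handshake failures):

  oc logs -n openshift-ovn-kubernetes ds/ovnkube-node -c ovn-controller | grep 'dh key too small'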
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6.44 bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2021:3395