Created attachment 1857210 [details]
ovs-configure service journal

Description of problem:

We're mis-detecting the IPv6 status and setting ipv6.may-fail to false, which causes the connection to fail to come up, since no IPv6 address is available here:

Jan 27 16:42:28 master-0-0 configure-ovs.sh[1837]: ++ nmcli -m multiline --get-values ip6.address conn show 84a523ff-ee8a-4a29-94ca-47590eb0cb76
Jan 27 16:42:28 master-0-0 configure-ovs.sh[1837]: ++ wc -l
Jan 27 16:42:28 master-0-0 configure-ovs.sh[1837]: + num_ip6_addrs=2
Jan 27 16:42:28 master-0-0 configure-ovs.sh[1837]: + '[' 2 -gt 1 ']'
Jan 27 16:42:28 master-0-0 configure-ovs.sh[1837]: + extra_if_brex_args+='ipv6.may-fail no '

Version-Release number of selected component (if applicable):
4.10.0-0.nightly-2022-01-27-104747

How reproducible:
Deploy an IPv4 cluster.

Steps to Reproduce:
1. Deploy an IPv4-only cluster with OVN-Kubernetes.

Actual results:
Bootstrap deployment fails: masters are NotReady, the network operator is degraded, and the ovnkube-node pods are in CrashLoopBackOff.

oc logs network-operator-78ccc94f66-mww95 -n openshift-network-operator
I0127 19:57:34.823444       1 log.go:184] Set ClusterOperator conditions:
- lastTransitionTime: "2022-01-27T16:54:02Z"
  message: |-
    DaemonSet "openshift-ovn-kubernetes/ovnkube-node" rollout is not making progress - pod ovnkube-node-d2hpt is in CrashLoopBackOff State
    DaemonSet "openshift-ovn-kubernetes/ovnkube-node" rollout is not making progress - pod ovnkube-node-5djsx is in CrashLoopBackOff State
    DaemonSet "openshift-ovn-kubernetes/ovnkube-node" rollout is not making progress - pod ovnkube-node-smpm6 is in CrashLoopBackOff State
    DaemonSet "openshift-ovn-kubernetes/ovnkube-node" rollout is not making progress - last change 2022-01-27T16:51:32Z
  reason: RolloutHung
  status: "True"
  type: Degraded
- lastTransitionTime: "2022-01-27T16:51:05Z"
  status: "False"
  type: ManagementStateDegraded
- lastTransitionTime: "2022-01-27T16:51:05Z"
  status: "True"
  type: Upgradeable

Expected results:
Deployment succeeds.

Additional info:

[kni@provisionhost-0-0 ~]$ oc get pods -n openshift-ovn-kubernetes
NAME                   READY   STATUS             RESTARTS         AGE
ovnkube-master-l8x2m   6/6     Running            6 (3h19m ago)    3h20m
ovnkube-master-mmqs7   6/6     Running            6 (3h19m ago)    3h20m
ovnkube-master-xjqfx   6/6     Running            1 (76m ago)      3h20m
ovnkube-node-5djsx     4/5     CrashLoopBackOff   43 (2m56s ago)   3h20m
ovnkube-node-d2hpt     4/5     CrashLoopBackOff   43 (2m49s ago)   3h20m
ovnkube-node-smpm6     4/5     CrashLoopBackOff   43 (2m56s ago)   3h20m
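For context, the detection logic in configure-ovs.sh reduces to the following (a minimal sketch reconstructed from the xtrace above; $uuid and the corrected variant are illustrative, not the shipped fix):

    # Flawed check, as seen in the trace: every line of nmcli's multiline
    # ip6.address output is counted, including link-local fe80:: entries, so
    # an IPv4-only host can cross the threshold and have IPv6 made mandatory
    # on br-ex.
    num_ip6_addrs=$(nmcli -m multiline --get-values ip6.address conn show "$uuid" | wc -l)
    if [ "$num_ip6_addrs" -gt 1 ]; then
        extra_if_brex_args+="ipv6.may-fail no "
    fi

    # One possible correction (an assumption, not the merged patch): ignore
    # link-local addresses when counting. awk always exits 0, so the pipeline
    # also behaves under set -e / -o pipefail.
    num_ip6_addrs=$(nmcli -m multiline --get-values ip6.address conn show "$uuid" \
        | awk '!/:fe80:/' | wc -l)

With ipv6.may-fail forced to "no", NetworkManager does not consider the connection activated until IPv6 configuration completes, which never happens on a host whose only IPv6 address is link-local.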
master-0-0:

[core@master-0-0 ~]$ nmcli con show
NAME              UUID                                  TYPE      DEVICE
Wired Connection  9d5c7c3b-9130-4a40-b31f-c99cad4da283  ethernet  enp0s3
Wired Connection  84a523ff-ee8a-4a29-94ca-47590eb0cb76  ethernet  enp0s4

[core@master-0-0 ~]$ nmcli -m multiline --get-values ip6.address conn show 84a523ff-ee8a-4a29-94ca-47590eb0cb76
IP6.ADDRESS[1]:fe80::5054:ff:fe6e:6923/64

[core@master-0-0 ~]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:72:6c:90 brd ff:ff:ff:ff:ff:ff
3: enp0s4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:6e:69:23 brd ff:ff:ff:ff:ff:ff
    inet 192.168.123.109/24 brd 192.168.123.255 scope global dynamic noprefixroute enp0s4
       valid_lft 2621sec preferred_lft 2621sec
    inet6 fe80::5054:ff:fe6e:6923/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
22: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 82:8d:01:2c:b3:41 brd ff:ff:ff:ff:ff:ff
23: br-int: <BROADCAST,MULTICAST> mtu 1400 qdisc noop state DOWN group default qlen 1000
    link/ether 9a:6d:68:a6:d5:36 brd ff:ff:ff:ff:ff:ff
24: ovn-k8s-mp0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 5a:e5:2b:53:5c:5e brd ff:ff:ff:ff:ff:ff
    inet 10.130.0.2/23 brd 10.130.1.255 scope global ovn-k8s-mp0
       valid_lft forever preferred_lft forever
    inet6 fe80::58e5:2bff:fe53:5c5e/64 scope link
       valid_lft forever preferred_lft forever
25: genev_sys_6081: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65000 qdisc noqueue master ovs-system state UNKNOWN group default qlen 1000
    link/ether 3e:b2:5b:bc:a6:7f brd ff:ff:ff:ff:ff:ff
    inet6 fe80::3cb2:5bff:febc:a67f/64 scope link
       valid_lft forever preferred_lft forever

[core@master-0-0 ~]$ journalctl -u NetworkManager-wait-online.service
-- Logs begin at Thu 2022-01-27 16:39:23 UTC, end at Thu 2022-01-27 21:15:28 UTC. --
Jan 27 16:39:37 master-0-0 systemd[1]: Starting Network Manager Wait Online...
Jan 27 16:39:37 master-0-0 systemd[1]: Started Network Manager Wait Online.
Jan 27 16:41:05 master-0-0 systemd[1]: NetworkManager-wait-online.service: Succeeded.
Jan 27 16:41:05 master-0-0 systemd[1]: Stopped Network Manager Wait Online.
Jan 27 16:41:05 master-0-0 systemd[1]: NetworkManager-wait-online.service: Consumed 0 CPU time
-- Reboot --
Jan 27 16:41:26 localhost systemd[1]: Starting Network Manager Wait Online...
Jan 27 16:42:27 master-0-0 systemd[1]: NetworkManager-wait-online.service: Main process exited, code=exited, status=1/FAILURE
Jan 27 16:42:27 master-0-0 systemd[1]: NetworkManager-wait-online.service: Failed with result 'exit-code'.
Jan 27 16:42:27 master-0-0 systemd[1]: Failed to start Network Manager Wait Online.
Jan 27 16:42:27 master-0-0 systemd[1]: NetworkManager-wait-online.service: Consumed 43ms CPU time
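A note on reading the output above (not part of the original comment): the connection's single IP6.ADDRESS entry is a link-local fe80:: address, which provides no routable IPv6 connectivity but still contributes to the script's line count. Two standard nmcli/ip checks to confirm a host is effectively IPv4-only:

    # List the connection's IPv6 addresses and drop link-local entries;
    # empty output means no routable IPv6 address is configured.
    nmcli -m multiline --get-values ip6.address \
        conn show 84a523ff-ee8a-4a29-94ca-47590eb0cb76 | awk '!/:fe80:/'

    # Cross-check against the kernel's view: show only global-scope
    # IPv6 addresses on the interface.
    ip -6 addr show dev enp0s4 scope global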
*** Bug 2048535 has been marked as a duplicate of this bug. ***
*** Bug 2048966 has been marked as a duplicate of this bug. ***
Verified on an IPv4 cluster with OVN at build 4.11.0-0.nightly-2022-02-01-062253.

[kni@provisionhost-0-0 ~]$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.0-0.nightly-2022-02-01-062253   True        False         134m    Cluster version is 4.11.0-0.nightly-2022-02-01-062253

[kni@provisionhost-0-0 ~]$ oc get nodes -o wide
NAME         STATUS   ROLES    AGE    VERSION           INTERNAL-IP       EXTERNAL-IP   OS-IMAGE                                                        KERNEL-VERSION                 CONTAINER-RUNTIME
master-0-0   Ready    master   162m   v1.23.3+b63be7f   192.168.123.53    <none>        Red Hat Enterprise Linux CoreOS 410.84.202201312356-0 (Ootpa)   4.18.0-305.34.2.el8_4.x86_64   cri-o://1.23.0-108.rhaos4.10.gitb15fee5.el8
master-0-1   Ready    master   162m   v1.23.3+b63be7f   192.168.123.74    <none>        Red Hat Enterprise Linux CoreOS 410.84.202201312356-0 (Ootpa)   4.18.0-305.34.2.el8_4.x86_64   cri-o://1.23.0-108.rhaos4.10.gitb15fee5.el8
master-0-2   Ready    master   162m   v1.23.3+b63be7f   192.168.123.117   <none>        Red Hat Enterprise Linux CoreOS 410.84.202201312356-0 (Ootpa)   4.18.0-305.34.2.el8_4.x86_64   cri-o://1.23.0-108.rhaos4.10.gitb15fee5.el8
worker-0-0   Ready    worker   144m   v1.23.3+b63be7f   192.168.123.72    <none>        Red Hat Enterprise Linux CoreOS 410.84.202201312356-0 (Ootpa)   4.18.0-305.34.2.el8_4.x86_64   cri-o://1.23.0-108.rhaos4.10.gitb15fee5.el8
worker-0-1   Ready    worker   144m   v1.23.3+b63be7f   192.168.123.62    <none>        Red Hat Enterprise Linux CoreOS 410.84.202201312356-0 (Ootpa)   4.18.0-305.34.2.el8_4.x86_64   cri-o://1.23.0-108.rhaos4.10.gitb15fee5.el8

[kni@provisionhost-0-0 ~]$ oc get pods -n openshift-ovn-kubernetes
NAME                   READY   STATUS    RESTARTS       AGE
ovnkube-master-f8pgx   6/6     Running   6 (162m ago)   163m
ovnkube-master-wrh7c   6/6     Running   6 (162m ago)   163m
ovnkube-master-zt2cw   6/6     Running   1 (155m ago)   163m
ovnkube-node-25zqv     5/5     Running   0              163m
ovnkube-node-6m9z7     5/5     Running   0              146m
ovnkube-node-8f6ql     5/5     Running   0              163m
ovnkube-node-ftlhc     5/5     Running   0              145m
ovnkube-node-n2xqq     5/5     Running   0              163m

[core@master-0-0 ~]$ nmcli con show
NAME              UUID                                  TYPE           DEVICE
ovs-if-br-ex      c106d856-c8bb-453b-8209-f0b3db2c832f  ovs-interface  br-ex
Wired Connection  cbae6a1a-a769-479d-9f75-4977c21c3d62  ethernet       enp0s3
br-ex             752a8af0-a3c8-470c-aec4-7202781a5ffe  ovs-bridge     br-ex
ovs-if-phys0      11c7f8eb-add1-4043-96c8-53a657e3dc36  ethernet       enp0s4
ovs-port-br-ex    407bb2d3-8fee-4022-b660-7452d4d65a8b  ovs-port       br-ex
ovs-port-phys0    4ccdb004-d433-4ccf-b34e-4dc3e531c09d  ovs-port       enp0s4
Wired Connection  dcc8f7c6-e9fa-456b-b52d-36747fc9d24e  ethernet       --
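One additional spot check worth doing (an assumption about what to inspect, not part of the original verification steps): read ipv6.may-fail on the br-ex interface connection created by configure-ovs. If the fix simply stops forcing the property on IPv4-only hosts, this should report NetworkManager's default "yes" rather than "no":

    # -g is shorthand for --get-values; prints just the property value.
    nmcli -g ipv6.may-fail conn show ovs-if-br-ex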
*** Bug 2048776 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069