Bug 1914250
Summary: ovnkube-node fails on master nodes when both DHCPv6 and SLAAC addresses are configured on nodes

Product: OpenShift Container Platform
Component: Networking
Sub component: ovn-kubernetes
Version: 4.7
Target Release: 4.7.0
Status: CLOSED ERRATA
Severity: high
Priority: high
Reporter: Victor Voronkov <vvoronko>
Assignee: Antonio Ojea <aojeagar>
QA Contact: Victor Voronkov <vvoronko>
CC: aconstan, anbhat, kquinn
Hardware: Unspecified
OS: Unspecified
Type: Bug
Doc Type: Bug Fix
Doc Text:
  Cause: The code in ovn-kube that detects the default gateway did not take multipath environments into consideration.
  Consequence: OVN-Kubernetes nodes failed to start because they could not find the default gateway.
  Fix: The logic has been modified to use the first available gateway when multipath is present.
  Result: OVN-Kubernetes works in environments with multipath and multiple default gateways.
Bug Blocks: 1910165
Last Closed: 2021-02-24 15:51:25 UTC
Description
Victor Voronkov 2021-01-08 12:59:29 UTC
Created attachment 1745589 [details]
installer gather logs
I can't reproduce the behavior locally. I have 2 default routes like in the description:

> default proto ra metric 20100 pref medium
>     nexthop via fe80::c24a:ff:fe2c:ec60 dev enp2s0 weight 1
>     nexthop via fe80::4969:2cb2:f186:5c13 dev enp2s0 weight 1

but the function returns the default gw correctly:

> {Ifindex: 2 Dst: <nil> Src: <nil> Gw: fe80::4969:2cb2:f186:5c13 Flags: [] Table: 254}
> enp2s0 [fe80::4969:2cb2:f186:5c13] <nil>

I also can't see the error mentioned in the attached logs:

> grep -r "failed to get default" log-bundle-20210108121911

Is it possible to access the environment?

I will prepare and give you such ASAP.

OK, one step at a time. I don't know if this is the root cause, but I've found a bug in the scripts used by NetworkManager that didn't work for interfaces with multiple IP addresses. We can see that br-ex has two global IP addresses:

[root@master-0-0 ~]# ip -6 a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 state UNKNOWN qlen 1000
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enp4s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 state UP qlen 1000
    inet6 fd00:1101::5715:49f9:21c1:a594/128 scope global dynamic noprefixroute
       valid_lft 2853sec preferred_lft 2853sec
    inet6 fe80::5054:ff:fea8:e75/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
5: br-ex: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 state UNKNOWN qlen 1000
    inet6 fd2e:6f44:5dd8::127/128 scope global dynamic noprefixroute
       valid_lft 3223sec preferred_lft 3223sec
    inet6 fd2e:6f44:5dd8:0:ef82:12af:7151:bae6/64 scope global dynamic noprefixroute
       valid_lft 86384sec preferred_lft 14384sec
    inet6 fe80::b89c:c288:ac4e:1265/64 scope link noprefixroute
       valid_lft forever preferred_lft forever

The script doesn't discriminate by IP, so it obtains 2 leases, one per IP, and fails the check:

Jan 14 08:46:53.777560 master-0-1.ocp-edge-cluster-0.qe.lab.redhat.com nm-dispatcher[1580]: + '[' -z fd2e:6f44:5dd8::14a ']'
Jan 14 08:46:53.778626 master-0-1.ocp-edge-cluster-0.qe.lab.redhat.com nm-dispatcher[1580]: ++ ip -j -6 a show br-ex
Jan 14 08:46:53.778796 master-0-1.ocp-edge-cluster-0.qe.lab.redhat.com nm-dispatcher[1580]: ++ jq -r '.[].addr_info[] | select(.scope=="global") | select(.deprecated!=true) | .preferred_life_time'
Jan 14 08:46:53.780413 master-0-1.ocp-edge-cluster-0.qe.lab.redhat.com hyperkube[2369]: E0114 08:46:53.780369 2369 kubelet.go:2250] node "master-0-1.ocp-edge-cluster-0.qe.lab.redhat.com" not found
Jan 14 08:46:53.825754 master-0-1.ocp-edge-cluster-0.qe.lab.redhat.com nm-dispatcher[1580]: + LEASE_TIME='3548
Jan 14 08:46:53.825754 master-0-1.ocp-edge-cluster-0.qe.lab.redhat.com nm-dispatcher[1580]: 14348'
Jan 14 08:46:53.825754 master-0-1.ocp-edge-cluster-0.qe.lab.redhat.com nm-dispatcher[1580]: + '[' 3548 14348 -lt 4294967295 ']'
Jan 14 08:46:53.825754 master-0-1.ocp-edge-cluster-0.qe.lab.redhat.com nm-dispatcher[1580]: /etc/NetworkManager/dispatcher.d/30-static-dhcpv6: line 12: [: too many arguments
Jan 14 08:46:53.825963 master-0-1.ocp-edge-cluster-0.qe.lab.redhat.com nm-dispatcher[1580]: + '[' ovs-if-br-ex == 'Wired Connection' ']'
Jan 14 08:46:53.825963 master-0-1.ocp-edge-cluster-0.qe.lab.redhat.com nm-dispatcher[1580]: + IPS=($IP6_ADDRESS_0)
Jan 14 08:46:53.825963 master-0-1.ocp-edge-cluster-0.qe.lab.redhat.com nm-dispatcher[1580]: + CHECK_STR='^fd2e:6f44:5dd8::14a/'
Jan 14 08:46:53.825963 master-0-1.ocp-edge-cluster-0.qe.lab.redhat.com nm-dispatcher[1580]: + [[ fd2e:6f44:5dd8::14a/128 fe80::7072:275:6b56:66d =~ ^fd2e:6f44:5dd8::14a/ ]]
Jan 14 08:46:53.826074 master-0-1.ocp-edge-cluster-0.qe.lab.redhat.com nm-dispatcher[1580]: + IPS=($IP6_ADDRESS_1)
Jan 14 08:46:53.826074 master-0-1.ocp-edge-cluster-0.qe.lab.redhat.com nm-dispatcher[1580]: + CIDR=fd2e:6f44:5dd8::14a/128
Jan 14 08:46:53.826074 master-0-1.ocp-edge-cluster-0.qe.lab.redhat.com nm-dispatcher[1580]: + nmcli con mod ovs-if-br-ex ipv6.addresses fd2e:6f44:5dd8::14a/128

The query should discriminate by the IP address received as a parameter:

ip -j -6 a show br-ex | jq -r --arg IPADDRESS "$IPADDRESS" '.[].addr_info[] | select(.local==$IPADDRESS) | select(.scope=="global") | select(.deprecated!=true) | .preferred_life_time'

The NetworkManager script wasn't discriminating by IP address; hence, interfaces with multiple addresses broke the scripts. This was partially fixed by https://github.com/openshift/machine-config-operator/pull/2312. However, the parsing of the interface output still returns multiple fields, and those should be discriminated by the IP address received as a parameter: https://github.com/openshift/machine-config-operator/pull/2341

OK, I totally misread the script and it has already been fixed. We need to test again with a new version of the Machine Config Operator that includes https://github.com/openshift/machine-config-operator/pull/2312.

I've created an executable with the ovn code to detect the gateway:

package main

import (
	"fmt"
	"net"
	"syscall"

	"github.com/vishvananda/netlink"
	utilnet "k8s.io/utils/net"
)

// getDefaultGatewayInterfaceDetails returns the interface name on
// which the default gateway (for route to 0.0.0.0) is configured.
// It also returns the default gateways themselves.
func getDefaultGatewayInterfaceDetails() (string, []net.IP, error) {
	var intfName string
	var gatewayIPs []net.IP

	needIPv4 := false
	needIPv6 := true
	routes, err := netlink.RouteList(nil, syscall.AF_UNSPEC)
	if err != nil {
		return "", nil, fmt.Errorf("failed to get routing table in node")
	}
	for _, route := range routes {
		if route.Dst == nil && route.Gw != nil && route.LinkIndex > 0 {
			fmt.Println(route)
			intfLink, err := netlink.LinkByIndex(route.LinkIndex)
			if err != nil {
				continue
			}
			if utilnet.IsIPv6(route.Gw) {
				if !needIPv6 {
					continue
				}
				needIPv6 = false
			} else {
				if !needIPv4 {
					continue
				}
				needIPv4 = false
			}
			if intfName == "" {
				intfName = intfLink.Attrs().Name
			} else if intfName != intfLink.Attrs().Name {
				return "", nil, fmt.Errorf("multiple gateway interfaces detected: %s %s", intfName, intfLink.Attrs().Name)
			}
			gatewayIPs = append(gatewayIPs, route.Gw)
		}
	}
	if len(gatewayIPs) == 0 {
		return "", nil, fmt.Errorf("failed to get default gateway interface")
	}
	return intfName, gatewayIPs, nil
}

func main() {
	fmt.Println(getDefaultGatewayInterfaceDetails())
}

The executable works with multiple routes in my local environment, as you can see in comment#2. However, it fails (as ovnkube-node fails) in the environment:

[core@master-0-2 ~]$ sudo ./test
 [] failed to get default gateway interface

Created attachment 1747499 [details]
strace default gateway detection code in ovn
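For illustration, the per-address selection that the corrected jq query performs (select(.local==$IPADDRESS)) can also be sketched in Go. The JSON sample and the leaseTime helper below are hypothetical, modelled loosely on the br-ex output quoted earlier; they are not part of the actual dispatcher script.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// addrInfo models only the fields of `ip -j -6 a show` output that the
// dispatcher check cares about.
type addrInfo struct {
	Local             string `json:"local"`
	Scope             string `json:"scope"`
	Deprecated        bool   `json:"deprecated"`
	PreferredLifeTime uint64 `json:"preferred_life_time"`
}

type iface struct {
	AddrInfo []addrInfo `json:"addr_info"`
}

// leaseTime mirrors the fixed jq filter: return the preferred lifetime of
// the single global, non-deprecated entry whose local address matches,
// instead of collecting one lease per configured address.
func leaseTime(raw []byte, ipAddress string) (uint64, bool) {
	var ifaces []iface
	if err := json.Unmarshal(raw, &ifaces); err != nil {
		return 0, false
	}
	for _, inf := range ifaces {
		for _, a := range inf.AddrInfo {
			if a.Local == ipAddress && a.Scope == "global" && !a.Deprecated {
				return a.PreferredLifeTime, true
			}
		}
	}
	return 0, false
}

func main() {
	// Hypothetical sample with two global addresses, as on br-ex above.
	sample := []byte(`[{"addr_info":[
		{"local":"fd2e:6f44:5dd8::127","scope":"global","preferred_life_time":3548},
		{"local":"fd2e:6f44:5dd8:0:ef82:12af:7151:bae6","scope":"global","preferred_life_time":14348}
	]}]`)
	lt, ok := leaseTime(sample, "fd2e:6f44:5dd8::127")
	fmt.Println(lt, ok) // 3548 true
}
```

With the address filter in place, the script gets exactly one LEASE_TIME value, so the `[ "$LEASE_TIME" -lt 4294967295 ]` comparison no longer fails with "too many arguments".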
It seems that the problem is that for multipath routes netlink returns the next hops as an array in the MultiPath field and leaves Gw nil, while we were only checking that Gw != nil.
> Debug route {Dst: <nil> Src: <nil> Gw: [{Ifindex: 5 Weight: 1 Gw: fe80::1b8b:3a78:f38c:bfdb Flags: []} {Ifindex: 5 Weight: 1 Gw: fe80::5054:ff:fe48:86d9 Flags: []}] Flags: [] Table: 254}
// Route represents a netlink route.
type Route struct {
	LinkIndex  int
	ILinkIndex int
	Scope      Scope
	Dst        *net.IPNet
	Src        net.IP
	Gw         net.IP
	MultiPath  []*NexthopInfo
	// ... (remaining fields omitted)
}
I have updated the PR and now the code detects the right gateway in the failing environment.
[core@master-0-2 ~]$ ./test
Found default gateway br-ex fe80::1b8b:3a78:f38c:bfdb
br-ex [fe80::1b8b:3a78:f38c:bfdb] <nil>
We verified in the environment that ovnkube-node now detects the gateway correctly:

[kni@provisionhost-0-0 ~]$ oc get pods -A | grep ovn
openshift-ovn-kubernetes   ovnkube-master-5lxdf   6/6   Running   21   19h
openshift-ovn-kubernetes   ovnkube-master-pwk5n   6/6   Running   25   19h
openshift-ovn-kubernetes   ovnkube-master-tx9fl   6/6   Running   18   19h
openshift-ovn-kubernetes   ovnkube-node-pg7fn     3/3   Running   0    11m
openshift-ovn-kubernetes   ovnkube-node-wt7ns     3/3   Running   4    13m
openshift-ovn-kubernetes   ovnkube-node-x67jc     3/3   Running   0    13m
openshift-ovn-kubernetes   ovs-node-67l6l         1/1   Running   0    19h
openshift-ovn-kubernetes   ovs-node-gx6zx         1/1   Running   0    19h
openshift-ovn-kubernetes   ovs-node-lz22m         1/1   Running   0    19h

Verified on 4.7.0-0.nightly-2021-01-22-063949. Bootstrap stage passed OK; OVN pods are up and running:

[kni@provisionhost-0-0 ~]$ oc get pods -A | grep ovn
openshift-ovn-kubernetes   ovnkube-master-fmzhw   6/6   Running   1    33m
openshift-ovn-kubernetes   ovnkube-master-pq55l   6/6   Running   3    33m
openshift-ovn-kubernetes   ovnkube-master-xsnc5   6/6   Running   3    33m
openshift-ovn-kubernetes   ovnkube-node-89vlv     3/3   Running   0    33m
openshift-ovn-kubernetes   ovnkube-node-bm8pm     3/3   Running   0    33m
openshift-ovn-kubernetes   ovnkube-node-k657b     3/3   Running   0    14m
openshift-ovn-kubernetes   ovnkube-node-zgtsj     3/3   Running   0    33m
openshift-ovn-kubernetes   ovnkube-node-zsqqt     3/3   Running   0    14m
openshift-ovn-kubernetes   ovs-node-dqhrl         1/1   Running   0    14m
openshift-ovn-kubernetes   ovs-node-lmzfs         1/1   Running   0    33m
openshift-ovn-kubernetes   ovs-node-sqt2t         1/1   Running   0    14m
openshift-ovn-kubernetes   ovs-node-vb64d         1/1   Running   0    33m
openshift-ovn-kubernetes   ovs-node-zwvcx         1/1   Running   0    33m

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633