Bug 1872632 - [UPI] "none" platform does not correctly detect node's NIC IP and pass to kubelet with --node-ip
Summary: [UPI] "none" platform does not correctly detect node's NIC IP and pass to kub...
Keywords:
Status: ASSIGNED
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 4.6.0
Assignee: Dan Winship
QA Contact: Michael Nguyen
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-08-26 09:20 UTC by huirwang
Modified: 2020-09-25 00:33 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:


Attachments
AWS console with cluster nodes (93.98 KB, image/png)
2020-08-27 11:02 UTC, Alexander Constantinescu


Links
System ID Priority Status Summary Last Updated
Github openshift machine-config-operator pull 2088 None closed Bug 1872632: allow overriding kubelet --node-ip in the _base templates 2020-09-25 11:56:00 UTC
Github openshift machine-config-operator pull 2100 None open Bug 1872632: templates: fix --node-ip override 2020-09-22 20:19:06 UTC

Description huirwang 2020-08-26 09:20:54 UTC
Description of problem:
On the baremetal platform, the cluster uses the ovn-k8s-mp0 IP as the node InternalIP, and the cluster does not work correctly in this condition.

Version-Release number of selected component (if applicable):
4.6.0-0.nightly-2020-08-25-222652

How reproducible:
Always


Steps to Reproduce:
1. Launched an OVN baremetal cluster.

Actual Result:


oc get nodes -o wide
NAME                              STATUS                     ROLES    AGE     VERSION                      INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                                                       KERNEL-VERSION          CONTAINER-RUNTIME
huir-0826-6q5g9-compute-0         Ready,SchedulingDisabled   worker   4h10m   v1.19.0-rc.2+aaf4ce1-dirty   10.0.99.77    <none>        Red Hat Enterprise Linux CoreOS 46.82.202008251840-0 (Ootpa)   4.18.0-211.el8.x86_64   cri-o://1.19.0-90.rhaos4.6.git4a0ac05.el8-rc.1
huir-0826-6q5g9-compute-1         Ready                      worker   4h10m   v1.19.0-rc.2+aaf4ce1-dirty   10.128.2.2    <none>        Red Hat Enterprise Linux CoreOS 46.82.202008251840-0 (Ootpa)   4.18.0-211.el8.x86_64   cri-o://1.19.0-90.rhaos4.6.git4a0ac05.el8-rc.1
huir-0826-6q5g9-compute-2         Ready                      worker   4h11m   v1.19.0-rc.2+aaf4ce1-dirty   10.131.0.2    <none>        Red Hat Enterprise Linux CoreOS 46.82.202008251840-0 (Ootpa)   4.18.0-211.el8.x86_64   cri-o://1.19.0-90.rhaos4.6.git4a0ac05.el8-rc.1
huir-0826-6q5g9-control-plane-0   Ready                      master   4h24m   v1.19.0-rc.2+aaf4ce1-dirty   10.129.0.2    <none>        Red Hat Enterprise Linux CoreOS 46.82.202008251840-0 (Ootpa)   4.18.0-211.el8.x86_64   cri-o://1.19.0-90.rhaos4.6.git4a0ac05.el8-rc.1
huir-0826-6q5g9-control-plane-1   Ready                      master   4h24m   v1.19.0-rc.2+aaf4ce1-dirty   10.130.0.2    <none>        Red Hat Enterprise Linux CoreOS 46.82.202008251840-0 (Ootpa)   4.18.0-211.el8.x86_64   cri-o://1.19.0-90.rhaos4.6.git4a0ac05.el8-rc.1
huir-0826-6q5g9-control-plane-2   Ready                      master   4h24m   v1.19.0-rc.2+aaf4ce1-dirty   10.128.0.2    <none>        Red Hat Enterprise Linux CoreOS 46.82.202008251840-0 (Ootpa)   4.18.0-211.el8.x86_64   cri-o://1.19.0-90.rhaos4.6.git4a0ac05.el8-rc.1




For node huir-0826-6q5g9-compute-1, the INTERNAL-IP is 10.128.2.2 (the ovn-k8s-mp0 address), not 10.0.96.111 (the br-ex address).

[core@huir-0826-6q5g9-compute-1 ~]$ ip a show ovn-k8s-mp0
6: ovn-k8s-mp0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 06:95:73:d4:8b:99 brd ff:ff:ff:ff:ff:ff
    inet 10.128.2.2/23 brd 10.128.3.255 scope global ovn-k8s-mp0
       valid_lft forever preferred_lft forever
[core@huir-0826-6q5g9-compute-1 ~]$ ip a show br-ex
9: br-ex: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether fa:16:3e:7a:67:d1 brd ff:ff:ff:ff:ff:ff
    inet 10.0.96.111/22 brd 10.0.99.255 scope global dynamic noprefixroute br-ex
       valid_lft 64060sec preferred_lft 64060sec
    inet6 2620:52:0:60:520a:2628:349e:2e36/64 scope global dynamic noprefixroute 
       valid_lft 2591919sec preferred_lft 604719sec
    inet6 fe80::9ee7:8f6e:a96f:a26b/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever


Some OVN pods also use the ovn-k8s-mp0 IP rather than the br-ex IP.


oc get pods -n openshift-ovn-kubernetes -o wide
NAME                           READY   STATUS             RESTARTS   AGE     IP            NODE                              NOMINATED NODE   READINESS GATES
ovnkube-master-metrics-859sn   1/1     Running            0          6h49m   10.0.98.220   huir-0826-6q5g9-control-plane-1   <none>           <none>
ovnkube-master-metrics-csjck   1/1     Running            0          6h49m   10.0.97.89    huir-0826-6q5g9-control-plane-2   <none>           <none>
ovnkube-master-metrics-p8vjr   1/1     Running            0          6h49m   10.0.98.14    huir-0826-6q5g9-control-plane-0   <none>           <none>
ovnkube-master-rsvm7           2/4     Running            0          6h11m   10.129.0.2    huir-0826-6q5g9-control-plane-0   <none>           <none>
ovnkube-master-t8h47           4/4     Running            0          6h14m   10.0.97.89    huir-0826-6q5g9-control-plane-2   <none>           <none>
ovnkube-master-xjs95           4/4     Running            0          6h17m   10.0.98.220   huir-0826-6q5g9-control-plane-1   <none>           <none>
ovnkube-node-fbp4v             2/2     Running            0          6h12m   10.129.0.2    huir-0826-6q5g9-control-plane-0   <none>           <none>
ovnkube-node-ghq5l             1/2     CrashLoopBackOff   61         6h11m   10.0.99.77    huir-0826-6q5g9-compute-0         <none>           <none>
ovnkube-node-lrc95             2/2     Running            0          6h12m   10.131.0.2    huir-0826-6q5g9-compute-2         <none>           <none>
ovnkube-node-metrics-4gxgn     1/1     Running            0          6h34m   10.0.96.111   huir-0826-6q5g9-compute-1         <none>           <none>
ovnkube-node-metrics-4n7bd     1/1     Running            0          6h49m   10.0.97.89    huir-0826-6q5g9-control-plane-2   <none>           <none>
ovnkube-node-metrics-6d8jk     1/1     Running            0          6h49m   10.0.98.14    huir-0826-6q5g9-control-plane-0   <none>           <none>
ovnkube-node-metrics-dfp7q     1/1     Running            0          6h49m   10.0.98.220   huir-0826-6q5g9-control-plane-1   <none>           <none>
ovnkube-node-metrics-v4qzz     1/1     Running            0          6h35m   10.0.99.77    huir-0826-6q5g9-compute-0         <none>           <none>
ovnkube-node-metrics-x5g6k     1/1     Running            0          6h35m   10.0.97.10    huir-0826-6q5g9-compute-2         <none>           <none>
ovnkube-node-qws9s             2/2     Running            0          125m    10.128.2.2    huir-0826-6q5g9-compute-1         <none>           <none>
ovnkube-node-w596b             2/2     Running            0          6h11m   10.130.0.2    huir-0826-6q5g9-control-plane-1   <none>           <none>
ovnkube-node-xr9p5             2/2     Running            1          6h14m   10.0.97.89    huir-0826-6q5g9-control-plane-2   <none>           <none>
ovs-node-9z7nz                 1/1     Running            0          6h35m   10.0.99.77    huir-0826-6q5g9-compute-0         <none>           <none>
ovs-node-c4p9p                 1/1     Running            0          6h49m   10.0.98.220   huir-0826-6q5g9-control-plane-1   <none>           <none>
ovs-node-h7snd                 1/1     Running            0          6h34m   10.0.96.111   huir-0826-6q5g9-compute-1         <none>           <none>
ovs-node-jbwtm                 1/1     Running            0          6h35m   10.0.97.10    huir-0826-6q5g9-compute-2         <none>           <none>
ovs-node-rn77c                 1/1     Running            0          6h49m   10.0.98.14    huir-0826-6q5g9-control-plane-0   <none>           <none>
ovs-node-vcn4s                 1/1     Running            0          6h49m   10.0.97.89    huir-0826-6q5g9-control-plane-2   <none>           <none>


Expected Result:

The InternalIP and the OVN pods' IPs should be the br-ex IP of the nodes.
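
A quick way to spot the affected nodes is to check each reported InternalIP against the machine network. The sketch below is only an illustration: it assumes the machine network is 10.0.96.0/22 (derived from the br-ex address 10.0.96.111/22 shown above), copies the INTERNAL-IP column from the `oc get nodes -o wide` output by hand, and uses a hypothetical helper name, `misreported_nodes`.

```python
import ipaddress

# Assumption: machine network derived from the br-ex address 10.0.96.111/22.
MACHINE_NETWORK = ipaddress.ip_network("10.0.96.0/22")

# INTERNAL-IP column copied from the `oc get nodes -o wide` output above.
NODE_INTERNAL_IPS = {
    "huir-0826-6q5g9-compute-0": "10.0.99.77",
    "huir-0826-6q5g9-compute-1": "10.128.2.2",
    "huir-0826-6q5g9-compute-2": "10.131.0.2",
    "huir-0826-6q5g9-control-plane-0": "10.129.0.2",
    "huir-0826-6q5g9-control-plane-1": "10.130.0.2",
    "huir-0826-6q5g9-control-plane-2": "10.128.0.2",
}


def misreported_nodes(node_ips, machine_network=MACHINE_NETWORK):
    """Return the sorted names of nodes whose InternalIP is outside the machine network."""
    return sorted(
        name
        for name, ip in node_ips.items()
        if ipaddress.ip_address(ip) not in machine_network
    )


if __name__ == "__main__":
    for name in misreported_nodes(NODE_INTERNAL_IPS):
        print(f"{name}: InternalIP {NODE_INTERNAL_IPS[name]} is not on {MACHINE_NETWORK}")
```

With the data above, only huir-0826-6q5g9-compute-0 reports an address on the machine network; the other five nodes report an address from the OVN management port.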

Comment 4 Alexander Constantinescu 2020-08-27 11:01:18 UTC
It seems we have a big problem

In shared gateway mode we attach br-ex to the primary NIC, for example:

9: br-ex: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether fa:16:3e:5e:5a:d7 brd ff:ff:ff:ff:ff:ff
    inet 10.0.97.89/22 brd 10.0.99.255 scope global noprefixroute dynamic br-ex
       valid_lft 52609sec preferred_lft 52609sec
    inet6 2620:52:0:60:1932:bc68:2020:af5d/64 scope global noprefixroute dynamic 
       valid_lft 2591995sec preferred_lft 604795sec
    inet6 fe80::5064:6d88:5776:2c54/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever

The IP 10.0.97.89 is not the InternalIP address of the node; it is considered the ExternalIP, as seen in the AWS console (see attached picture).

The problem is that the CNO bootstraps OVN with the list of master nodes according to their InternalIP address representation (as retrieved from the API server), here:

https://github.com/openshift/cluster-network-operator/blob/92e466db53cc9e741084c3697fe893e7496ba61d/pkg/network/ovn_kubernetes.go#L260

This would not be a problem in itself; it would be a mere change in the CNO to bootstrap with the ExternalIP instead. However, those fields are not set in the API server's node representation:

oc get node -o yaml  huir-0827-mnfkm-control-plane-0
apiVersion: v1
kind: Node
metadata:
  annotations:
    k8s.ovn.org/l3-gateway-config: '{"default":{"mode":"shared","interface-id":"br-ex_huir-0827-mnfkm-control-plane-0","mac-address":"fa:16:3e:5e:5a:d7","ip-addresses":["10.0.97.89/22"],"ip-address":"10.0.97.89/22","next-hops":["10.0.99.254"],"next-hop":"10.0.99.254","node-port-enable":"true","vlan-id":"0"}}'
    k8s.ovn.org/node-chassis-id: 273c77f9-6f1f-4747-8f3d-542e1a8724f6
    k8s.ovn.org/node-join-subnets: '{"default":"100.64.2.0/29"}'
    k8s.ovn.org/node-local-nat-ip: '{"default":["169.254.12.13"]}'
    k8s.ovn.org/node-mgmt-port-mac-address: ce:fe:f3:93:76:22
    k8s.ovn.org/node-primary-ifaddr: '{"ipv4":"10.0.97.89/22","ipv6":"2620:52:0:60:1932:bc68:2020:af5d/64"}'
    k8s.ovn.org/node-subnets: '{"default":"10.129.0.0/23"}'
    machineconfiguration.openshift.io/currentConfig: rendered-master-bda270c04531b48aef1e5493c3b78844
    machineconfiguration.openshift.io/desiredConfig: rendered-master-bda270c04531b48aef1e5493c3b78844
    machineconfiguration.openshift.io/reason: ""
    machineconfiguration.openshift.io/state: Done
    volumes.kubernetes.io/controller-managed-attach-detach: "true"
  creationTimestamp: "2020-08-27T00:50:55Z"
  labels:
    beta.kubernetes.io/arch: amd64
    beta.kubernetes.io/os: linux
    kubernetes.io/arch: amd64
    kubernetes.io/hostname: huir-0827-mnfkm-control-plane-0
    kubernetes.io/os: linux
    node-role.kubernetes.io/master: ""
    node.openshift.io/os_id: rhcos
  name: huir-0827-mnfkm-control-plane-0
  resourceVersion: "770952"
  selfLink: /api/v1/nodes/huir-0827-mnfkm-control-plane-0
  uid: 1b11603e-b37a-481c-8705-d05f22b29f55
spec:
  podCIDR: 10.128.1.0/24
  podCIDRs:
  - 10.128.1.0/24
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
status:
  addresses:
  - address: 10.129.0.2
    type: InternalIP
  - address: huir-0827-mnfkm-control-plane-0
    type: Hostname

I need to discuss this with Tim to check what he thinks we should do about this. 

We could have a look at changing the API server to set the ExternalIP field on all nodes, but I am afraid that it might be too complex/close to the final freeze.
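
To make the lookup concrete, here is a minimal sketch of selecting an address by type from a node's `status.addresses` list. It is not the CNO's actual Go code: the helper name `node_address` is hypothetical, and the node is represented as a plain dictionary holding only the two address entries from the YAML above.

```python
def node_address(node, addr_type="InternalIP"):
    """Return the first address of the given type from status.addresses, or None."""
    for entry in node.get("status", {}).get("addresses", []):
        if entry.get("type") == addr_type:
            return entry.get("address")
    return None


# The status.addresses entries from the node YAML above.
NODE = {
    "status": {
        "addresses": [
            {"address": "10.129.0.2", "type": "InternalIP"},
            {"address": "huir-0827-mnfkm-control-plane-0", "type": "Hostname"},
        ]
    }
}

if __name__ == "__main__":
    print("InternalIP:", node_address(NODE))
    print("ExternalIP:", node_address(NODE, "ExternalIP"))
```

For this node the InternalIP lookup yields the ovn-k8s-mp0 address 10.129.0.2, and the ExternalIP lookup finds nothing, which is why switching the bootstrap to ExternalIP alone would not help.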

Comment 5 Alexander Constantinescu 2020-08-27 11:02:31 UTC
Created attachment 1712808 [details]
AWS console with cluster nodes

Comment 6 Alexander Constantinescu 2020-08-27 11:11:12 UTC
Excuse me, that's not the AWS console. It's Openstack

Comment 7 Alexander Constantinescu 2020-08-27 13:15:52 UTC
FYI:

I just created a cluster on GCP (which works fine). It seems we attach the InternalIP to br-ex on that platform. I have included the output of ovs-configuration.service for both cases below for comparison. There is no error in the Openstack case, so nothing is evident to me as to what causes the difference.

GCP

journalctl -u ovs-configuration.service > tmp
sh-4.4# cat tmp
-- Logs begin at Thu 2020-08-27 12:00:24 UTC, end at Thu 2020-08-27 13:13:40 UTC. --
Aug 27 12:06:23 ci-ln-crxcz5b-f76d1-wmk6z-master-2 systemd[1]: Starting Configures OVS with proper host networking configuration...
Aug 27 12:06:23 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: + iface=
Aug 27 12:06:23 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: + counter=0
Aug 27 12:06:23 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: + '[' 0 -lt 12 ']'
Aug 27 12:06:23 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: ++ ip -j route show default
Aug 27 12:06:23 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: ++ jq -r '.[0].dev'
Aug 27 12:06:23 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: + iface=ens4
Aug 27 12:06:23 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: + [[ -n ens4 ]]
Aug 27 12:06:23 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: + [[ ens4 != \n\u\l\l ]]
Aug 27 12:06:23 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: + echo 'IPv4 Default gateway interface found: ens4'
Aug 27 12:06:23 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: IPv4 Default gateway interface found: ens4
Aug 27 12:06:23 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: + break
Aug 27 12:06:23 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: + '[' ens4 = br-ex ']'
Aug 27 12:06:23 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: + '[' -z ens4 ']'
Aug 27 12:06:23 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: + iface_mac=42:01:0a:00:00:03
Aug 27 12:06:23 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: + echo 'MAC address found for iface: ens4: 42:01:0a:00:00:03'
Aug 27 12:06:23 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: MAC address found for iface: ens4: 42:01:0a:00:00:03
Aug 27 12:06:23 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: ++ ip -j link show ens4
Aug 27 12:06:23 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: ++ jq -r '.[0].mtu'
Aug 27 12:06:23 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: + iface_mtu=1460
Aug 27 12:06:23 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: + [[ -z 1460 ]]
Aug 27 12:06:23 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: + [[ 1460 == \n\u\l\l ]]
Aug 27 12:06:23 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: + echo 'MTU found for iface: ens4: 1460'
Aug 27 12:06:23 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: MTU found for iface: ens4: 1460
Aug 27 12:06:23 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: + nmcli connection show br-ex
Aug 27 12:06:23 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: + nmcli c add type ovs-bridge conn.interface br-ex con-name br-ex 802-3-ethernet.mtu 1460 802-3-ethernet.cloned-mac-address 42:01:0a:00:00:03
Aug 27 12:06:24 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: Connection 'br-ex' (4ed3ad4b-17d5-4e6b-87ad-eddd8d3eaccb) successfully added.
Aug 27 12:06:24 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: ++ nmcli --fields UUID,DEVICE conn show --active
Aug 27 12:06:24 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: ++ grep ens4
Aug 27 12:06:24 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: ++ awk '{print $1}'
Aug 27 12:06:24 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: + old_conn=67de8da7-74d3-4af6-b30d-659ed36212d0
Aug 27 12:06:24 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: + nmcli connection show ovs-port-phys0
Aug 27 12:06:24 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: + nmcli c add type ovs-port conn.interface ens4 master br-ex con-name ovs-port-phys0
Aug 27 12:06:24 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: Connection 'ovs-port-phys0' (e3f63661-f500-4885-83af-a2401fa8613f) successfully added.
Aug 27 12:06:24 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: + nmcli connection show ovs-port-br-ex
Aug 27 12:06:24 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: + nmcli c add type ovs-port conn.interface br-ex master br-ex con-name ovs-port-br-ex
Aug 27 12:06:24 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: Connection 'ovs-port-br-ex' (155140bc-2a94-4058-94bc-a5abf78d5b1f) successfully added.
Aug 27 12:06:24 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: + nmcli device disconnect ens4
Aug 27 12:06:24 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: Device 'ens4' successfully disconnected.
Aug 27 12:06:24 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: + nmcli connection show ovs-if-phys0
Aug 27 12:06:24 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: + nmcli c add type 802-3-ethernet conn.interface ens4 master ovs-port-phys0 con-name ovs-if-phys0 connection.autoconnect-priority 100 802-3-ethernet.mtu 1460
Aug 27 12:06:24 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: Connection 'ovs-if-phys0' (a91fafd9-c4cf-4730-b2e0-15ff81a71723) successfully added.
Aug 27 12:06:24 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: + nmcli conn up ovs-if-phys0
Aug 27 12:06:24 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: Connection successfully activated (D-Bus active path: /org/freedesktop/NetworkManager/ActiveConnection/5)
Aug 27 12:06:24 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: + nmcli connection show ovs-if-br-ex
Aug 27 12:06:24 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: + nmcli c add type ovs-interface slave-type ovs-port conn.interface br-ex master ovs-port-br-ex con-name ovs-if-br-ex 802-3-ethernet.mtu 1460 802-3-ethernet.cloned-mac-address 42:01:0a:00:00:03
Aug 27 12:06:24 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: Connection 'ovs-if-br-ex' (acff4245-05dc-444d-8d9f-b9755bbcdd3f) successfully added.
Aug 27 12:06:24 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: + counter=0
Aug 27 12:06:24 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: + '[' 0 -lt 5 ']'
Aug 27 12:06:24 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: + sleep 5
Aug 27 12:06:29 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: + nmcli --fields GENERAL.STATE conn show ovs-if-br-ex
Aug 27 12:06:29 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: + grep -i activated
Aug 27 12:06:29 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: GENERAL.STATE:                          activated
Aug 27 12:06:29 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: + echo 'OVS successfully configured'
Aug 27 12:06:29 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: OVS successfully configured
Aug 27 12:06:29 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: + ip a show br-ex
Aug 27 12:06:29 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: 4: br-ex: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1460 qdisc noqueue state UNKNOWN group default qlen 1000
Aug 27 12:06:29 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]:     link/ether 42:01:0a:00:00:03 brd ff:ff:ff:ff:ff:ff
Aug 27 12:06:29 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]:     inet 10.0.0.3/32 scope global dynamic noprefixroute br-ex
Aug 27 12:06:29 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]:        valid_lft 86396sec preferred_lft 86396sec
Aug 27 12:06:29 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]:     inet6 fe80::8839:f9c0:a4ac:3964/64 scope link noprefixroute
Aug 27 12:06:29 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]:        valid_lft forever preferred_lft forever
Aug 27 12:06:29 ci-ln-crxcz5b-f76d1-wmk6z-master-2 configure-ovs.sh[1419]: + exit 0
Aug 27 12:06:29 ci-ln-crxcz5b-f76d1-wmk6z-master-2 systemd[1]: Started Configures OVS with proper host networking configuration.
Aug 27 12:06:29 ci-ln-crxcz5b-f76d1-wmk6z-master-2 systemd[1]: ovs-configuration.service: Consumed 378ms CPU time


Openstack

cat tmp
-- Logs begin at Thu 2020-08-27 00:46:53 UTC, end at Thu 2020-08-27 12:52:15 UTC. --
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 systemd[1]: Starting Configures OVS with proper host networking configuration...
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: + iface=
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: + counter=0
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: + '[' 0 -lt 12 ']'
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: ++ ip -j route show default
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: ++ jq -r '.[0].dev'
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: + iface=ens3
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: + [[ -n ens3 ]]
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: + [[ ens3 != \n\u\l\l ]]
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: + echo 'IPv4 Default gateway interface found: ens3'
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: IPv4 Default gateway interface found: ens3
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: + break
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: + '[' ens3 = br-ex ']'
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: + '[' -z ens3 ']'
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: + iface_mac=fa:16:3e:5e:5a:d7
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: + echo 'MAC address found for iface: ens3: fa:16:3e:5e:5a:d7'
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: MAC address found for iface: ens3: fa:16:3e:5e:5a:d7
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: ++ ip -j link show ens3
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: ++ jq -r '.[0].mtu'
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: + iface_mtu=1500
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: + [[ -z 1500 ]]
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: + [[ 1500 == \n\u\l\l ]]
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: + echo 'MTU found for iface: ens3: 1500'
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: MTU found for iface: ens3: 1500
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: + nmcli connection show br-ex
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: + nmcli c add type ovs-bridge conn.interface br-ex con-name br-ex 802-3-ethernet.mtu 1500 802-3-ethernet.cloned-mac-address fa:16:3e:5e:5a:d7
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: Connection 'br-ex' (385fc9ac-c2a2-45fe-8e79-cd042153ae1d) successfully added.
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: ++ nmcli --fields UUID,DEVICE conn show --active
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: ++ grep ens3
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: ++ awk '{print $1}'
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: + old_conn=21d47e65-8523-1a06-af22-6f121086f085
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: + nmcli connection show ovs-port-phys0
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: + nmcli c add type ovs-port conn.interface ens3 master br-ex con-name ovs-port-phys0
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: Connection 'ovs-port-phys0' (c78224c9-4357-41fc-9627-7b44fecc3a87) successfully added.
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: + nmcli connection show ovs-port-br-ex
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: + nmcli c add type ovs-port conn.interface br-ex master br-ex con-name ovs-port-br-ex
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: Connection 'ovs-port-br-ex' (d8b99bc6-d7f1-4cc5-8a04-3ebb271c5f83) successfully added.
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: + nmcli device disconnect ens3
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: Device 'ens3' successfully disconnected.
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: + nmcli connection show ovs-if-phys0
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: + nmcli c add type 802-3-ethernet conn.interface ens3 master ovs-port-phys0 con-name ovs-if-phys0 connection.autoconnect-priority 100 802-3-ethernet.mtu 1500
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: Connection 'ovs-if-phys0' (86644bf1-7ee0-4370-aea7-20a60688f63f) successfully added.
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: + nmcli conn up ovs-if-phys0
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: Connection successfully activated (D-Bus active path: /org/freedesktop/NetworkManager/ActiveConnection/5)
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: + nmcli connection show ovs-if-br-ex
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: + nmcli c add type ovs-interface slave-type ovs-port conn.interface br-ex master ovs-port-br-ex con-name ovs-if-br-ex 802-3-ethernet.mtu 1500 802-3-ethernet.cloned-mac-address fa:16:3e:5e:5a:d7
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: Connection 'ovs-if-br-ex' (3521dd05-3d8a-4251-8466-f88d7db84209) successfully added.
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: + counter=0
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: + '[' 0 -lt 5 ']'
Aug 27 00:50:21 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: + sleep 5
Aug 27 00:50:28 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: + nmcli --fields GENERAL.STATE conn show ovs-if-br-ex
Aug 27 00:50:28 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: + grep -i activated
Aug 27 00:50:28 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: GENERAL.STATE:                          activated
Aug 27 00:50:28 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: + echo 'OVS successfully configured'
Aug 27 00:50:28 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: OVS successfully configured
Aug 27 00:50:28 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: + ip a show br-ex
Aug 27 00:50:28 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: 4: br-ex: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
Aug 27 00:50:28 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]:     link/ether fa:16:3e:5e:5a:d7 brd ff:ff:ff:ff:ff:ff
Aug 27 00:50:28 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]:     inet 10.0.97.89/22 brd 10.0.99.255 scope global dynamic noprefixroute br-ex
Aug 27 00:50:28 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]:        valid_lft 86396sec preferred_lft 86396sec
Aug 27 00:50:28 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]:     inet6 2620:52:0:60:1932:bc68:2020:af5d/64 scope global tentative dynamic noprefixroute
Aug 27 00:50:28 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]:        valid_lft 2592000sec preferred_lft 604800sec
Aug 27 00:50:28 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]:     inet6 fe80::5064:6d88:5776:2c54/64 scope link noprefixroute
Aug 27 00:50:28 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]:        valid_lft forever preferred_lft forever
Aug 27 00:50:28 huir-0827-mnfkm-control-plane-0 systemd[1]: Started Configures OVS with proper host networking configuration.
Aug 27 00:50:28 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: + configure_driver_options ens3
Aug 27 00:50:28 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: + intf=ens3
Aug 27 00:50:28 huir-0827-mnfkm-control-plane-0 systemd[1]: ovs-configuration.service: Consumed 309ms CPU time
Aug 27 00:50:28 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: ++ cat /sys/class/net/ens3/device/uevent
Aug 27 00:50:28 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: ++ grep DRIVER
Aug 27 00:50:28 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: ++ awk -F = '{print $2}'
Aug 27 00:50:28 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: + driver=virtio_net
Aug 27 00:50:28 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: + echo 'Driver name is' virtio_net
Aug 27 00:50:28 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: Driver name is virtio_net
Aug 27 00:50:28 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: + '[' virtio_net = vmxnet3 ']'
Aug 27 00:50:28 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1448]: + exit 0
-- Reboot --
Aug 27 01:19:09 huir-0827-mnfkm-control-plane-0 systemd[1]: Starting Configures OVS with proper host networking configuration...
Aug 27 01:19:09 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1573]: + iface=
Aug 27 01:19:09 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1573]: + counter=0
Aug 27 01:19:09 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1573]: + '[' 0 -lt 12 ']'
Aug 27 01:19:09 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1573]: ++ ip -j route show default
Aug 27 01:19:09 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1573]: ++ jq -r '.[0].dev'
Aug 27 01:19:09 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1573]: + iface=br-ex
Aug 27 01:19:09 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1573]: + [[ -n br-ex ]]
Aug 27 01:19:09 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1573]: + [[ br-ex != \n\u\l\l ]]
Aug 27 01:19:09 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1573]: + echo 'IPv4 Default gateway interface found: br-ex'
Aug 27 01:19:09 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1573]: IPv4 Default gateway interface found: br-ex
Aug 27 01:19:09 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1573]: + break
Aug 27 01:19:09 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1573]: + '[' br-ex = br-ex ']'
Aug 27 01:19:09 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1573]: ++ ovs-vsctl list-ifaces br-ex
Aug 27 01:19:09 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1573]: + ifaces='ens3
Aug 27 01:19:09 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1573]: patch-br-ex_huir-0827-mnfkm-control-plane-0-to-br-int'
Aug 27 01:19:09 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1573]: + for intf in $ifaces
Aug 27 01:19:09 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1573]: + configure_driver_options ens3
Aug 27 01:19:09 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1573]: + intf=ens3
Aug 27 01:19:10 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1573]: ++ cat /sys/class/net/ens3/device/uevent
Aug 27 01:19:09 huir-0827-mnfkm-control-plane-0 systemd[1]: Started Configures OVS with proper host networking configuration.
Aug 27 01:19:10 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1573]: ++ grep DRIVER
Aug 27 01:19:09 huir-0827-mnfkm-control-plane-0 systemd[1]: ovs-configuration.service: Consumed 69ms CPU time
Aug 27 01:19:10 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1573]: ++ awk -F = '{print $2}'
Aug 27 01:19:10 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1573]: + driver=virtio_net
Aug 27 01:19:10 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1573]: + echo 'Driver name is' virtio_net
Aug 27 01:19:10 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1573]: Driver name is virtio_net
Aug 27 01:19:10 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1573]: + '[' virtio_net = vmxnet3 ']'
Aug 27 01:19:10 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1573]: + for intf in $ifaces
Aug 27 01:19:10 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1573]: + configure_driver_options patch-br-ex_huir-0827-mnfkm-control-plane-0-to-br-int
Aug 27 01:19:10 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1573]: + intf=patch-br-ex_huir-0827-mnfkm-control-plane-0-to-br-int
Aug 27 01:19:10 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1573]: ++ cat /sys/class/net/patch-br-ex_huir-0827-mnfkm-control-plane-0-to-br-int/device/uevent
Aug 27 01:19:10 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1573]: ++ grep DRIVER
Aug 27 01:19:10 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1573]: ++ awk -F = '{print $2}'
Aug 27 01:19:10 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1573]: cat: /sys/class/net/patch-br-ex_huir-0827-mnfkm-control-plane-0-to-br-int/device/uevent: No such file or directory
Aug 27 01:19:10 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1573]: + driver=
Aug 27 01:19:10 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1573]: + echo 'Driver name is'
Aug 27 01:19:10 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1573]: Driver name is
Aug 27 01:19:10 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1573]: + '[' '' = vmxnet3 ']'
Aug 27 01:19:10 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1573]: + echo 'Networking already configured and up for br-ex!'
Aug 27 01:19:10 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1573]: Networking already configured and up for br-ex!
Aug 27 01:19:10 huir-0827-mnfkm-control-plane-0 configure-ovs.sh[1573]: + exit 0
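
For readers following the trace, the two probes it performs (picking the device of the first default route, and reading the NIC driver from the uevent file) can be sketched in Go roughly as below. This is only an illustration of the logic visible in the trace, not the script itself; `defaultRouteDev`, `driverFromUevent` and the sample inputs in `main` are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// route mirrors only the field of `ip -j route show default` that the trace reads.
type route struct {
	Dev string `json:"dev"`
}

// defaultRouteDev returns the device of the first default route, like
// `jq -r '.[0].dev'`. ok is false when there is no usable entry, the case
// the script guards against with its "-n" and "!= null" tests.
func defaultRouteDev(ipJSON []byte) (dev string, ok bool) {
	var routes []route
	if err := json.Unmarshal(ipJSON, &routes); err != nil || len(routes) == 0 {
		return "", false
	}
	return routes[0].Dev, routes[0].Dev != ""
}

// driverFromUevent extracts the DRIVER= value from the text of a
// /sys/class/net/<iface>/device/uevent file; it returns "" when absent,
// which is what the trace shows for the patch port.
func driverFromUevent(uevent string) string {
	for _, line := range strings.Split(uevent, "\n") {
		if strings.HasPrefix(line, "DRIVER=") {
			return strings.TrimPrefix(line, "DRIVER=")
		}
	}
	return ""
}

func main() {
	// Sample inputs are illustrative, not captured from the cluster.
	sample := []byte(`[{"dst":"default","dev":"br-ex"}]`)
	if dev, ok := defaultRouteDev(sample); ok {
		fmt.Println("IPv4 Default gateway interface found:", dev)
	}
	fmt.Println("Driver name is", driverFromUevent("DRIVER=virtio_net\n"))
}
```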

Comment 8 Alexander Constantinescu 2020-08-27 13:39:44 UTC
Going further, the NetworkManager logs seem to point to DHCP being done differently between the providers.

OpenStack

Aug 27 00:49:08 huir-0827-mnfkm-control-plane-0 systemd[1]: Starting Network Manager...
Aug 27 00:49:08 huir-0827-mnfkm-control-plane-0 NetworkManager[1706]: <info>  [1598489348.2051] NetworkManager (version 1.22.8-6.el8_2) is starting... (for the first time)
Aug 27 00:49:08 huir-0827-mnfkm-control-plane-0 NetworkManager[1706]: <info>  [1598489348.2055] Read config: /etc/NetworkManager/NetworkManager.conf (lib: 10-disable-default-plugins.conf, 20-client-id-from-mac.conf) (etc: sdn.conf)
Aug 27 00:49:08 huir-0827-mnfkm-control-plane-0 systemd[1]: Started Network Manager.
Aug 27 00:49:08 huir-0827-mnfkm-control-plane-0 NetworkManager[1706]: <info>  [1598489348.2084] bus-manager: acquired D-Bus service "org.freedesktop.NetworkManager"
Aug 27 00:49:08 huir-0827-mnfkm-control-plane-0 NetworkManager[1706]: <info>  [1598489348.2204] manager[0x563b86bcb090]: monitoring kernel firmware directory '/lib/firmware'.
Aug 27 00:49:08 huir-0827-mnfkm-control-plane-0 NetworkManager[1706]: <info>  [1598489348.4862] hostname: hostname: using hostnamed
Aug 27 00:49:08 huir-0827-mnfkm-control-plane-0 NetworkManager[1706]: <info>  [1598489348.4865] hostname: hostname changed from (none) to "huir-0827-mnfkm-control-plane-0"
Aug 27 00:49:08 huir-0827-mnfkm-control-plane-0 NetworkManager[1706]: <info>  [1598489348.4874] dns-mgr[0x563b86baf250]: init: dns=default,systemd-resolved rc-manager=symlink
Aug 27 00:49:08 huir-0827-mnfkm-control-plane-0 NetworkManager[1706]: <info>  [1598489348.4989] Loaded device plugin: NMOvsFactory (/usr/lib64/NetworkManager/1.22.8-6.el8_2/libnm-device-plugin-ovs.so)
Aug 27 00:49:08 huir-0827-mnfkm-control-plane-0 NetworkManager[1706]: <info>  [1598489348.5031] Loaded device plugin: NMTeamFactory (/usr/lib64/NetworkManager/1.22.8-6.el8_2/libnm-device-plugin-team.so)
Aug 27 00:49:08 huir-0827-mnfkm-control-plane-0 NetworkManager[1706]: <info>  [1598489348.5032] manager: rfkill: Wi-Fi enabled by radio killswitch; enabled by state file
Aug 27 00:49:08 huir-0827-mnfkm-control-plane-0 NetworkManager[1706]: <info>  [1598489348.5033] manager: rfkill: WWAN enabled by radio killswitch; enabled by state file
Aug 27 00:49:08 huir-0827-mnfkm-control-plane-0 NetworkManager[1706]: <info>  [1598489348.5034] manager: Networking is enabled by state file
Aug 27 00:49:08 huir-0827-mnfkm-control-plane-0 NetworkManager[1706]: <info>  [1598489348.5035] dhcp-init: Using DHCP client 'internal'
Aug 27 00:49:08 huir-0827-mnfkm-control-plane-0 NetworkManager[1706]: <info>  [1598489348.5036] settings: Loaded settings plugin: keyfile (internal)
Aug 27 00:49:08 huir-0827-mnfkm-control-plane-0 NetworkManager[1706]: <info>  [1598489348.5099] settings: Loaded settings plugin: ifcfg-rh ("/usr/lib64/NetworkManager/1.22.8-6.el8_2/libnm-settings-plugin-ifcfg-rh.so")
Aug 27 00:49:08 huir-0827-mnfkm-control-plane-0 NetworkManager[1706]: <info>  [1598489348.5185] device (lo): carrier: link connected
Aug 27 00:49:08 huir-0827-mnfkm-control-plane-0 NetworkManager[1706]: <info>  [1598489348.5189] manager: (lo): new Generic device (/org/freedesktop/NetworkManager/Devices/1)
Aug 27 00:49:08 huir-0827-mnfkm-control-plane-0 NetworkManager[1706]: <info>  [1598489348.5202] manager: (ens3): new Ethernet device (/org/freedesktop/NetworkManager/Devices/2)
Aug 27 00:49:08 huir-0827-mnfkm-control-plane-0 NetworkManager[1706]: <info>  [1598489348.5210] device (ens3): state change: unmanaged -> unavailable (reason 'managed', sys-iface-state: 'external')
Aug 27 00:49:08 huir-0827-mnfkm-control-plane-0 NetworkManager[1706]: <info>  [1598489348.5248] device (ens3): carrier: link connected
Aug 27 00:49:08 huir-0827-mnfkm-control-plane-0 NetworkManager[1706]: <info>  [1598489348.5312] device (ens3): state change: unavailable -> disconnected (reason 'none', sys-iface-state: 'managed')
Aug 27 00:49:08 huir-0827-mnfkm-control-plane-0 NetworkManager[1706]: <info>  [1598489348.5321] policy: auto-activating connection 'ens3' (21d47e65-8523-1a06-af22-6f121086f085)
Aug 27 00:49:08 huir-0827-mnfkm-control-plane-0 NetworkManager[1706]: <info>  [1598489348.5328] device (ens3): Activation: starting connection 'ens3' (21d47e65-8523-1a06-af22-6f121086f085)
Aug 27 00:49:08 huir-0827-mnfkm-control-plane-0 NetworkManager[1706]: <info>  [1598489348.5330] device (ens3): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'managed')
Aug 27 00:49:08 huir-0827-mnfkm-control-plane-0 NetworkManager[1706]: <info>  [1598489348.5333] manager: NetworkManager state is now CONNECTING
Aug 27 00:49:08 huir-0827-mnfkm-control-plane-0 NetworkManager[1706]: <info>  [1598489348.5336] device (ens3): state change: prepare -> config (reason 'none', sys-iface-state: 'managed')
Aug 27 00:49:08 huir-0827-mnfkm-control-plane-0 NetworkManager[1706]: <info>  [1598489348.5342] device (ens3): state change: config -> ip-config (reason 'none', sys-iface-state: 'managed')
Aug 27 00:49:08 huir-0827-mnfkm-control-plane-0 NetworkManager[1706]: <info>  [1598489348.5346] dhcp4 (ens3): activation: beginning transaction (timeout in 45 seconds)
Aug 27 00:49:08 huir-0827-mnfkm-control-plane-0 NetworkManager[1706]: <info>  [1598489348.5419] dhcp4 (ens3): option dhcp_lease_time      => '86400'
Aug 27 00:49:08 huir-0827-mnfkm-control-plane-0 NetworkManager[1706]: <info>  [1598489348.5419] dhcp4 (ens3): option domain_name          => 'openstacklocal'
Aug 27 00:49:08 huir-0827-mnfkm-control-plane-0 NetworkManager[1706]: <info>  [1598489348.5419] dhcp4 (ens3): option domain_name_servers  => '10.11.5.19 10.5.30.45'
Aug 27 00:49:08 huir-0827-mnfkm-control-plane-0 NetworkManager[1706]: <info>  [1598489348.5419] dhcp4 (ens3): option expiry               => '1598575748'
Aug 27 00:49:08 huir-0827-mnfkm-control-plane-0 NetworkManager[1706]: <info>  [1598489348.5419] dhcp4 (ens3): option host_name            => 'host-10-0-97-89'
Aug 27 00:49:08 huir-0827-mnfkm-control-plane-0 NetworkManager[1706]: <info>  [1598489348.5420] dhcp4 (ens3): option interface_mtu        => '1500'
Aug 27 00:49:08 huir-0827-mnfkm-control-plane-0 NetworkManager[1706]: <info>  [1598489348.5420] dhcp4 (ens3): option ip_address           => '10.0.97.89'
Aug 27 00:49:08 huir-0827-mnfkm-control-plane-0 NetworkManager[1706]: <info>  [1598489348.5420] dhcp4 (ens3): option next_server          => '10.0.96.161'

GCP

Aug 27 12:05:12 localhost systemd[1]: Starting Network Manager...
Aug 27 12:05:12 localhost NetworkManager[1652]: <info>  [1598529912.3449] NetworkManager (version 1.22.8-6.el8_2) is starting... (for the first time)
Aug 27 12:05:12 localhost NetworkManager[1652]: <info>  [1598529912.3453] Read config: /etc/NetworkManager/NetworkManager.conf (lib: 10-disable-default-plugins.conf, 20-client-id-from-mac.conf) (etc: sdn.conf)
Aug 27 12:05:12 localhost systemd[1]: Started Network Manager.
Aug 27 12:05:12 localhost NetworkManager[1652]: <info>  [1598529912.3491] bus-manager: acquired D-Bus service "org.freedesktop.NetworkManager"
Aug 27 12:05:12 localhost NetworkManager[1652]: <info>  [1598529912.3552] manager[0x56205677d090]: monitoring kernel firmware directory '/lib/firmware'.
Aug 27 12:05:12 localhost NetworkManager[1652]: <info>  [1598529912.6767] hostname: hostname: using hostnamed
Aug 27 12:05:12 localhost NetworkManager[1652]: <info>  [1598529912.6771] dns-mgr[0x562056761250]: init: dns=default,systemd-resolved rc-manager=symlink
Aug 27 12:05:12 localhost NetworkManager[1652]: <info>  [1598529912.6833] Loaded device plugin: NMOvsFactory (/usr/lib64/NetworkManager/1.22.8-6.el8_2/libnm-device-plugin-ovs.so)
Aug 27 12:05:12 localhost NetworkManager[1652]: <info>  [1598529912.6862] Loaded device plugin: NMTeamFactory (/usr/lib64/NetworkManager/1.22.8-6.el8_2/libnm-device-plugin-team.so)
Aug 27 12:05:12 localhost NetworkManager[1652]: <info>  [1598529912.6863] manager: rfkill: Wi-Fi enabled by radio killswitch; enabled by state file
Aug 27 12:05:12 localhost NetworkManager[1652]: <info>  [1598529912.6864] manager: rfkill: WWAN enabled by radio killswitch; enabled by state file
Aug 27 12:05:12 localhost NetworkManager[1652]: <info>  [1598529912.6866] manager: Networking is enabled by state file
Aug 27 12:05:12 localhost NetworkManager[1652]: <info>  [1598529912.6867] dhcp-init: Using DHCP client 'internal'
Aug 27 12:05:12 localhost NetworkManager[1652]: <info>  [1598529912.6869] settings: Loaded settings plugin: keyfile (internal)
Aug 27 12:05:12 localhost NetworkManager[1652]: <info>  [1598529912.6905] settings: Loaded settings plugin: ifcfg-rh ("/usr/lib64/NetworkManager/1.22.8-6.el8_2/libnm-settings-plugin-ifcfg-rh.so")
Aug 27 12:05:12 localhost NetworkManager[1652]: <info>  [1598529912.6932] device (lo): carrier: link connected
Aug 27 12:05:12 localhost NetworkManager[1652]: <info>  [1598529912.6935] manager: (lo): new Generic device (/org/freedesktop/NetworkManager/Devices/1)
Aug 27 12:05:12 localhost NetworkManager[1652]: <info>  [1598529912.6949] manager: (ens4): new Ethernet device (/org/freedesktop/NetworkManager/Devices/2)
Aug 27 12:05:12 localhost NetworkManager[1652]: <info>  [1598529912.6963] device (ens4): state change: unmanaged -> unavailable (reason 'managed', sys-iface-state: 'external')
Aug 27 12:05:12 localhost NetworkManager[1652]: <info>  [1598529912.7036] device (ens4): carrier: link connected
Aug 27 12:05:12 localhost NetworkManager[1652]: <info>  [1598529912.7090] device (ens4): state change: unavailable -> disconnected (reason 'none', sys-iface-state: 'managed')
Aug 27 12:05:12 localhost NetworkManager[1652]: <info>  [1598529912.7135] policy: auto-activating connection 'Wired Connection' (67de8da7-74d3-4af6-b30d-659ed36212d0)
Aug 27 12:05:12 localhost NetworkManager[1652]: <info>  [1598529912.7145] device (ens4): Activation: starting connection 'Wired Connection' (67de8da7-74d3-4af6-b30d-659ed36212d0)
Aug 27 12:05:12 localhost NetworkManager[1652]: <info>  [1598529912.7147] device (ens4): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'managed')
Aug 27 12:05:12 localhost NetworkManager[1652]: <info>  [1598529912.7152] manager: NetworkManager state is now CONNECTING
Aug 27 12:05:12 localhost NetworkManager[1652]: <info>  [1598529912.7155] device (ens4): state change: prepare -> config (reason 'none', sys-iface-state: 'managed')
Aug 27 12:05:12 localhost NetworkManager[1652]: <info>  [1598529912.7163] device (ens4): state change: config -> ip-config (reason 'none', sys-iface-state: 'managed')
Aug 27 12:05:12 localhost NetworkManager[1652]: <info>  [1598529912.7168] dhcp4 (ens4): activation: beginning transaction (timeout in 45 seconds)
Aug 27 12:05:12 localhost NetworkManager[1652]: <info>  [1598529912.7201] dhcp4 (ens4): option dhcp_lease_time      => '86400'
Aug 27 12:05:12 localhost NetworkManager[1652]: <info>  [1598529912.7201] dhcp4 (ens4): option domain_name          => 'c.openshift-gce-devel-ci.internal'
Aug 27 12:05:12 localhost NetworkManager[1652]: <info>  [1598529912.7201] dhcp4 (ens4): option domain_name_servers  => '169.254.169.254'
Aug 27 12:05:12 localhost NetworkManager[1652]: <info>  [1598529912.7201] dhcp4 (ens4): option domain_search        => 'c.openshift-gce-devel-ci.internal google.internal'
Aug 27 12:05:12 localhost NetworkManager[1652]: <info>  [1598529912.7201] dhcp4 (ens4): option expiry               => '1598616312'
Aug 27 12:05:12 localhost NetworkManager[1652]: <info>  [1598529912.7201] dhcp4 (ens4): option host_name            => 'ci-ln-crxcz5b-f76d1-wmk6z-master-2.c.openshift-gce-devel-ci.internal'
Aug 27 12:05:12 localhost NetworkManager[1652]: <info>  [1598529912.7201] dhcp4 (ens4): option interface_mtu        => '1460'
Aug 27 12:05:12 localhost NetworkManager[1652]: <info>  [1598529912.7201] dhcp4 (ens4): option ip_address           => '10.0.0.3'
Aug 27 12:05:12 localhost NetworkManager[1652]: <info>  [1598529912.7201] dhcp4 (ens4): option next_server          => '10.0.0.1'
Aug 27 12:05:12 localhost NetworkManager[1652]: <info>  [1598529912.7201] dhcp4 (ens4): option ntp_servers          => '169.254.169.254'

But I am not sure whether that's normal.
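
To compare the leases side by side rather than by eye, a small sketch that pulls the `dhcp4` option lines out of the journal output. The regular expression and the `parseDHCPOption` helper are my own assumptions about the line format, based only on the lines pasted above.

```go
package main

import (
	"fmt"
	"regexp"
)

// dhcpOption matches NetworkManager journal lines such as
//   dhcp4 (ens3): option ip_address           => '10.0.97.89'
var dhcpOption = regexp.MustCompile(`dhcp4 \(([^)]+)\): option (\S+)\s+=> '([^']*)'`)

// parseDHCPOption returns the interface, option name and value from one
// journal line; ok is false for lines that are not dhcp4 option lines.
func parseDHCPOption(line string) (iface, name, value string, ok bool) {
	m := dhcpOption.FindStringSubmatch(line)
	if m == nil {
		return "", "", "", false
	}
	return m[1], m[2], m[3], true
}

func main() {
	// Two lines copied from the logs above: one from OpenStack, one from GCP.
	lines := []string{
		"Aug 27 00:49:08 huir-0827-mnfkm-control-plane-0 NetworkManager[1706]: <info>  [1598489348.5420] dhcp4 (ens3): option ip_address           => '10.0.97.89'",
		"Aug 27 12:05:12 localhost NetworkManager[1652]: <info>  [1598529912.7201] dhcp4 (ens4): option ip_address           => '10.0.0.3'",
	}
	for _, l := range lines {
		if iface, name, value, ok := parseDHCPOption(l); ok {
			fmt.Printf("%s %s=%s\n", iface, name, value)
		}
	}
}
```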

Comment 9 Alexander Constantinescu 2020-08-27 14:43:27 UTC
So I kind of get the feeling here that the problem is not how our ovs-configuration script does things. I think the problem is how the Node API object sets `node.status.addresses` on OpenStack. There, the InternalIP address does not correspond to the primary NIC address received from DHCP, as it does on GCP/AWS, which I *think* it should...?

Comment 10 Alexander Constantinescu 2020-08-27 16:01:36 UTC
FYI, follow up on #comment 4

Here's the output from master node: huir-0827-mnfkm-control-plane-0

[root@huir-0827-mnfkm-control-plane-0 core]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master ovs-system state UP group default qlen 1000
    link/ether fa:16:3e:5e:5a:d7 brd ff:ff:ff:ff:ff:ff
3: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 5e:28:ab:21:c2:56 brd ff:ff:ff:ff:ff:ff
4: genev_sys_6081: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65000 qdisc noqueue master ovs-system state UNKNOWN group default qlen 1000
    link/ether 6a:ea:c5:40:c2:f7 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::68ea:c5ff:fe40:c2f7/64 scope link 
       valid_lft forever preferred_lft forever
5: ovn-k8s-mp0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether ce:fe:f3:93:76:22 brd ff:ff:ff:ff:ff:ff
    inet 10.129.0.2/23 brd 10.129.1.255 scope global ovn-k8s-mp0
       valid_lft forever preferred_lft forever
6: br-int: <BROADCAST,MULTICAST> mtu 1400 qdisc noop state DOWN group default qlen 1000
    link/ether 6e:ca:4c:9d:55:43 brd ff:ff:ff:ff:ff:ff
7: ovn-k8s-gw0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 0a:58:a9:fe:00:01 brd ff:ff:ff:ff:ff:ff
    inet 169.254.0.1/20 brd 169.254.15.255 scope global ovn-k8s-gw0
       valid_lft forever preferred_lft forever
8: br-local: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 72:e6:39:92:e6:4b brd ff:ff:ff:ff:ff:ff
    inet6 fe80::70e6:39ff:fe92:e64b/64 scope link 
       valid_lft forever preferred_lft forever
9: br-ex: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether fa:16:3e:5e:5a:d7 brd ff:ff:ff:ff:ff:ff
    inet 10.0.97.89/22 brd 10.0.99.255 scope global dynamic noprefixroute br-ex
       valid_lft 80618sec preferred_lft 80618sec
    inet6 2620:52:0:60:1932:bc68:2020:af5d/64 scope global dynamic noprefixroute 
       valid_lft 2591897sec preferred_lft 604697sec
    inet6 fe80::5064:6d88:5776:2c54/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever

So the node object's InternalIP: 10.129.0.2 has been wired to ovn-k8s-mp0 not br-ex. So when the CNO bootstraps OVN, it passes 10.129.0.2 to ovnkube-master, ovn-controller, etc., and thus no database connection happens (because the database is accessible on 10.0.97.89).

Comment 11 Dan Winship 2020-08-27 16:30:52 UTC
> So the node object's InternalIP: 10.129.0.2 has been wired to ovn-k8s-mp0 not br-ex.

err... no, 10.129.0.2 is the correct IP for ovn-k8s-mp0; it's the gateway IP address of the pod network subnet assigned to that node.

The question is why kubelet is taking the IP from ovn-k8s-mp0 and thinking it should declare that as its InternalIP...
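
To make the mismatch concrete, here is a minimal sketch that maps an address back to the interface holding it. The address table is hand-copied from the `ip a` output in comment 10 (IPv4 global addresses only), and `ownerOf` is a hypothetical helper, not anything the kubelet or OVN provides.

```go
package main

import (
	"fmt"
	"net"
)

// ifaceAddrs is a hand-copied subset of the `ip a` output in comment 10.
var ifaceAddrs = map[string]string{
	"ovn-k8s-mp0": "10.129.0.2/23",
	"ovn-k8s-gw0": "169.254.0.1/20",
	"br-ex":       "10.0.97.89/22",
}

// ownerOf returns the interface whose configured address equals ip,
// or "" if no interface in addrs holds it.
func ownerOf(ip string, addrs map[string]string) string {
	target := net.ParseIP(ip)
	if target == nil {
		return ""
	}
	for name, cidr := range addrs {
		addr, _, err := net.ParseCIDR(cidr)
		if err != nil {
			continue
		}
		if addr.Equal(target) {
			return name
		}
	}
	return ""
}

func main() {
	fmt.Println("10.129.0.2 is on:", ownerOf("10.129.0.2", ifaceAddrs))
	fmt.Println("10.0.97.89 is on:", ownerOf("10.0.97.89", ifaceAddrs))
}
```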

Comment 12 Alexander Constantinescu 2020-08-28 10:23:26 UTC
OK, I think I've narrowed the problem down. 

On OpenStack we run the kubelet with the flag `--cloud-provider=`, which means it is up to the kubelet itself to determine the node's IP address, without looking it up from an external cloud provider. This is done here:

https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/nodestatus/setters.go#L205

I've written a small program replicating that part (as the kubelet has no logging of this)

```
package main

import (
	"fmt"
	"net"
	"os"
)

func validateNodeIP(nodeIP net.IP) error {
	// Honor IP limitations set in setNodeStatus()
	if nodeIP.To4() == nil && nodeIP.To16() == nil {
		return fmt.Errorf("nodeIP must be a valid IP address")
	}
	if nodeIP.IsLoopback() {
		return fmt.Errorf("nodeIP can't be loopback address")
	}
	if nodeIP.IsMulticast() {
		return fmt.Errorf("nodeIP can't be a multicast address")
	}
	if nodeIP.IsLinkLocalUnicast() {
		return fmt.Errorf("nodeIP can't be a link-local unicast address")
	}
	if nodeIP.IsUnspecified() {
		return fmt.Errorf("nodeIP can't be an all zeros address")
	}

	addrs, err := net.InterfaceAddrs()
	if err != nil {
		return err
	}
	for _, addr := range addrs {
		var ip net.IP
		switch v := addr.(type) {
		case *net.IPNet:
			ip = v.IP
		case *net.IPAddr:
			ip = v.IP
		}
		if ip != nil && ip.Equal(nodeIP) {
			return nil
		}
	}
	return fmt.Errorf("node IP: %q not found in the host's network interfaces", nodeIP.String())
}

func main() {
	hostname, err := os.Hostname()
	if err != nil {
		fmt.Printf("unable to get hostname, err: %v\n", err)
		return
	}
	// Resolve the hostname to candidate addresses, as in the kubelet code linked above.
	ips, err := net.LookupIP(hostname)
	if err != nil {
		fmt.Printf("An error occurred, err: %v\n", err)
		return
	}
	// Report every resolved address and whether it passes validation.
	for _, ip := range ips {
		if err := validateNodeIP(ip); err == nil {
			fmt.Printf("IP is: %s\n", ip.String())
		} else {
			fmt.Printf("IP: %s is skipped because: %v\n", ip.String(), err)
		}
	}
}
```

On OpenStack that program returns:

$ ./tmp
IP: fe80::c06d:abff:fe70:9a09 is skipped because: nodeIP can't be a link-local unicast address
IP: fe80::58c3:b9ff:fe32:3348 is skipped because: nodeIP can't be a link-local unicast address
IP: fe80::f64e:de02:c198:b6db is skipped because: nodeIP can't be a link-local unicast address
IP: fe80::f4d3:1aff:fe0b:5765 is skipped because: nodeIP can't be a link-local unicast address
IP: fe80::78a9:45ff:fe87:487 is skipped because: nodeIP can't be a link-local unicast address
IP: fe80::4839:9dff:fe04:25d3 is skipped because: nodeIP can't be a link-local unicast address
IP: fe80::7cb5:37ff:fe42:b1e5 is skipped because: nodeIP can't be a link-local unicast address
IP: fe80::e81d:e3ff:fe9e:f894 is skipped because: nodeIP can't be a link-local unicast address
IP: fe80::9087:99ff:fe3c:3d3 is skipped because: nodeIP can't be a link-local unicast address
IP: fe80::dcfa:a5ff:feb1:3fdf is skipped because: nodeIP can't be a link-local unicast address
IP: fe80::c4d5:f2ff:fec1:1acf is skipped because: nodeIP can't be a link-local unicast address
IP: fe80::9828:d0ff:feca:7068 is skipped because: nodeIP can't be a link-local unicast address
IP is: 2620:52:0:60:946a:c6c1:950f:c7aa
IP: 169.254.0.1 is skipped because: nodeIP can't be a link-local unicast address
IP is: 10.128.2.2
IP is: 10.0.97.10

As seen in the code referenced above, the kubelet takes the first IPv4 address it finds and uses it as the InternalIP. In this example that is 10.128.2.2, which is the ovn-k8s-mp0 address. This is presumably because the interface index of ovn-k8s-mp0 (5) is lower than that of br-ex (9).

$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master ovs-system state UP group default qlen 1000
    link/ether fa:16:3e:f9:2a:63 brd ff:ff:ff:ff:ff:ff
3: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 6e:46:b8:7e:14:ab brd ff:ff:ff:ff:ff:ff
4: genev_sys_6081: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65000 qdisc noqueue master ovs-system state UNKNOWN group default qlen 1000
    link/ether c2:6d:ab:70:9a:09 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::c06d:abff:fe70:9a09/64 scope link 
       valid_lft forever preferred_lft forever
5: ovn-k8s-mp0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 3a:3d:46:4b:79:3e brd ff:ff:ff:ff:ff:ff
    inet 10.128.2.2/23 brd 10.128.3.255 scope global ovn-k8s-mp0
       valid_lft forever preferred_lft forever
6: br-int: <BROADCAST,MULTICAST> mtu 1400 qdisc noop state DOWN group default qlen 1000
    link/ether 12:87:98:62:3f:42 brd ff:ff:ff:ff:ff:ff
7: ovn-k8s-gw0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 0a:58:a9:fe:00:01 brd ff:ff:ff:ff:ff:ff
    inet 169.254.0.1/20 brd 169.254.15.255 scope global ovn-k8s-gw0
       valid_lft forever preferred_lft forever
8: br-local: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 5a:c3:b9:32:33:48 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::58c3:b9ff:fe32:3348/64 scope link 
       valid_lft forever preferred_lft forever
9: br-ex: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether fa:16:3e:f9:2a:63 brd ff:ff:ff:ff:ff:ff
    inet 10.0.97.10/22 brd 10.0.99.255 scope global dynamic noprefixroute br-ex
       valid_lft 81365sec preferred_lft 81365sec
    inet6 2620:52:0:60:946a:c6c1:950f:c7aa/64 scope global dynamic noprefixroute 
       valid_lft 2592000sec preferred_lft 604800sec
    inet6 fe80::f64e:de02:c198:b6db/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
10: 13746d52afb719d@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue master ovs-system state UP group default 
    link/ether f6:d3:1a:0b:57:65 brd ff:ff:ff:ff:ff:ff link-netns 2a7a011b-a946-466b-9db5-b09c024804ad
    inet6 fe80::f4d3:1aff:fe0b:5765/64 scope link 
       valid_lft forever preferred_lft forever
11: e6bcf62aaec7a73@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue master ovs-system state UP group default 
    link/ether 7a:a9:45:87:04:87 brd ff:ff:ff:ff:ff:ff link-netns a6aba99b-d33f-47e8-8aca-8a44dd170bde
    inet6 fe80::78a9:45ff:fe87:487/64 scope link 
       valid_lft forever preferred_lft forever
19: 80a85b68987e3c0@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue master ovs-system state UP group default 
    link/ether 4a:39:9d:04:25:d3 brd ff:ff:ff:ff:ff:ff link-netns 0687789c-38f1-48aa-b3cf-fb889943f620
    inet6 fe80::4839:9dff:fe04:25d3/64 scope link 
       valid_lft forever preferred_lft forever
20: 447a23cff658fb1@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue master ovs-system state UP group default 
    link/ether 7e:b5:37:42:b1:e5 brd ff:ff:ff:ff:ff:ff link-netns c60591ef-1255-441d-8b61-927ced05baf4
    inet6 fe80::7cb5:37ff:fe42:b1e5/64 scope link 
       valid_lft forever preferred_lft forever
21: 3bcfb45b7e5b54b@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue master ovs-system state UP group default 
    link/ether ea:1d:e3:9e:f8:94 brd ff:ff:ff:ff:ff:ff link-netns 1687712b-b961-4811-a930-2a55e8e8a0d1
    inet6 fe80::e81d:e3ff:fe9e:f894/64 scope link 
       valid_lft forever preferred_lft forever
22: 3f4708a4fbcd6fc@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue master ovs-system state UP group default 
    link/ether 92:87:99:3c:03:d3 brd ff:ff:ff:ff:ff:ff link-netns c6a1907f-8dc6-4eec-9fc3-e5db3ffce3d0
    inet6 fe80::9087:99ff:fe3c:3d3/64 scope link 
       valid_lft forever preferred_lft forever
23: c76d2f65086ccba@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue master ovs-system state UP group default 
    link/ether de:fa:a5:b1:3f:df brd ff:ff:ff:ff:ff:ff link-netns 82450605-64fa-4d0d-875f-5baeb9c53ac9
    inet6 fe80::dcfa:a5ff:feb1:3fdf/64 scope link 
       valid_lft forever preferred_lft forever
24: 7b1302aa21edfa4@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue master ovs-system state UP group default 
    link/ether c6:d5:f2:c1:1a:cf brd ff:ff:ff:ff:ff:ff link-netns 265e51ac-b4e0-4e8c-9d90-43e53a6d753c
    inet6 fe80::c4d5:f2ff:fec1:1acf/64 scope link 
       valid_lft forever preferred_lft forever
25: 94579c8c8debc8c@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue master ovs-system state UP group default 
    link/ether 9a:28:d0:ca:70:68 brd ff:ff:ff:ff:ff:ff link-netns 889bb976-d6ca-4d3e-8f8b-f4c75dca4e28
    inet6 fe80::9828:d0ff:feca:7068/64 scope link 
       valid_lft forever preferred_lft forever


The question, however, is why net.LookupIP(hostname) returns ALL IP addresses on the host on OpenStack. On GCP we have the following:

$ ./tmpok 
IP is: 10.0.0.5

sh-4.4# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: ens4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1460 qdisc mq master ovs-system state UP group default qlen 1000
    link/ether 42:01:0a:00:00:05 brd ff:ff:ff:ff:ff:ff
3: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 2e:b7:61:1c:26:2d brd ff:ff:ff:ff:ff:ff
4: br-ex: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1460 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 42:01:0a:00:00:05 brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.5/32 scope global dynamic noprefixroute br-ex
       valid_lft 79526sec preferred_lft 79526sec
    inet6 fe80::ea51:ba1f:982a:a7ea/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
5: br-int: <BROADCAST,MULTICAST> mtu 1360 qdisc noop state DOWN group default qlen 1000
    link/ether 9e:9c:c0:e9:9d:49 brd ff:ff:ff:ff:ff:ff
6: genev_sys_6081: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65000 qdisc noqueue master ovs-system state UNKNOWN group default qlen 1000
    link/ether da:68:16:3b:14:6f brd ff:ff:ff:ff:ff:ff
    inet6 fe80::d868:16ff:fe3b:146f/64 scope link 
       valid_lft forever preferred_lft forever
7: br-local: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1360 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 96:a8:02:8f:9d:48 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::94a8:2ff:fe8f:9d48/64 scope link 
       valid_lft forever preferred_lft forever
8: ovn-k8s-gw0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1360 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 0a:58:a9:fe:00:01 brd ff:ff:ff:ff:ff:ff
    inet 169.254.0.1/20 brd 169.254.15.255 scope global ovn-k8s-gw0
       valid_lft forever preferred_lft forever
9: ovn-k8s-mp0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1360 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 42:cd:38:c8:aa:6f brd ff:ff:ff:ff:ff:ff
    inet 10.128.4.2/23 brd 10.128.5.255 scope global ovn-k8s-mp0
       valid_lft forever preferred_lft forever
10: aa23b699c337113@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1360 qdisc noqueue master ovs-system state UP group default 
    link/ether 1e:a3:71:a5:5a:06 brd ff:ff:ff:ff:ff:ff link-netns 3f0d30d7-a923-4432-9ae1-054b5c05fc5b
    inet6 fe80::1ca3:71ff:fea5:5a06/64 scope link 
       valid_lft forever preferred_lft forever
11: d0d67f8fb285180@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1360 qdisc noqueue master ovs-system state UP group default 
    link/ether 76:f4:d9:6e:02:da brd ff:ff:ff:ff:ff:ff link-netns e061ecad-dbcc-4219-bcc3-6cb0c5f0cf00
    inet6 fe80::74f4:d9ff:fe6e:2da/64 scope link 
       valid_lft forever preferred_lft forever
12: f38aafbace1a8dd@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1360 qdisc noqueue master ovs-system state UP group default 
    link/ether ce:5f:f1:e8:a7:9a brd ff:ff:ff:ff:ff:ff link-netns 1f2b9538-1dde-4dea-81e9-6fda3765b5a3
    inet6 fe80::cc5f:f1ff:fee8:a79a/64 scope link 
       valid_lft forever preferred_lft forever
13: f3558d80694bda7@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1360 qdisc noqueue master ovs-system state UP group default 
    link/ether ee:96:09:9a:14:f0 brd ff:ff:ff:ff:ff:ff link-netns 0f6ff284-b395-4733-9070-d8eda907e852
    inet6 fe80::ec96:9ff:fe9a:14f0/64 scope link 
       valid_lft forever preferred_lft forever
14: 8d42766cc828067@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1360 qdisc noqueue master ovs-system state UP group default 
    link/ether d6:3f:1c:2f:22:88 brd ff:ff:ff:ff:ff:ff link-netns 79dc4dca-b32d-439d-9ec6-e3f2e98160a5
    inet6 fe80::d43f:1cff:fe2f:2288/64 scope link 
       valid_lft forever preferred_lft forever
15: 1f32cf96774d45d@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1360 qdisc noqueue master ovs-system state UP group default 
    link/ether 66:42:d7:66:35:ca brd ff:ff:ff:ff:ff:ff link-netns 11b2dd97-f4ed-44b0-b8c2-b4797a111284
    inet6 fe80::6442:d7ff:fe66:35ca/64 scope link 
       valid_lft forever preferred_lft forever
16: 533e310648dc506@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1360 qdisc noqueue master ovs-system state UP group default 
    link/ether 9e:cf:08:b9:30:3b brd ff:ff:ff:ff:ff:ff link-netns 4b74495f-5acd-4a00-b7c9-29d542cdfd0d
    inet6 fe80::9ccf:8ff:feb9:303b/64 scope link 
       valid_lft forever preferred_lft forever
18: d59ee44a51524f9@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1360 qdisc noqueue state UP group default 
    link/ether 66:8c:2b:6d:18:ed brd ff:ff:ff:ff:ff:ff link-netns d14f65b8-5910-4d76-b518-758da0409bff
    inet6 fe80::648c:2bff:fe6d:18ed/64 scope link 
       valid_lft forever preferred_lft forever
19: 2c5a37360eff43b@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1360 qdisc noqueue master ovs-system state UP group default 
    link/ether 6e:5f:f8:b2:5c:47 brd ff:ff:ff:ff:ff:ff link-netns cd6d1582-b97f-4c35-b38f-6e1ee54ab89c
    inet6 fe80::6c5f:f8ff:feb2:5c47/64 scope link 
       valid_lft forever preferred_lft forever
20: 896c515ba98661e@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1360 qdisc noqueue state UP group default 
    link/ether ae:be:3b:07:26:5d brd ff:ff:ff:ff:ff:ff link-netns cdade3b6-5250-483f-8c85-a97f1a96604d
    inet6 fe80::acbe:3bff:fe07:265d/64 scope link 
       valid_lft forever preferred_lft forever
21: c97564e5f2c9c1c@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1360 qdisc noqueue state UP group default 
    link/ether 2a:79:0f:60:f0:2e brd ff:ff:ff:ff:ff:ff link-netns 1cc96cee-b56b-4c50-8745-f99442961c23
    inet6 fe80::2879:fff:fe60:f02e/64 scope link 
       valid_lft forever preferred_lft forever
22: ca9b67cf20f9d46@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1360 qdisc noqueue state UP group default 
    link/ether 8e:43:5b:ac:a9:33 brd ff:ff:ff:ff:ff:ff link-netns 897e930f-3ab8-47fd-8a6f-8bc04cafdf03
    inet6 fe80::8c43:5bff:feac:a933/64 scope link 
       valid_lft forever preferred_lft forever
23: 5e59097cbb9f962@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1360 qdisc noqueue master ovs-system state UP group default 
    link/ether 2a:92:a0:e7:b1:a1 brd ff:ff:ff:ff:ff:ff link-netns 90dcd4e2-40a7-453f-9ba4-5ee9fb73ed22
    inet6 fe80::2892:a0ff:fee7:b1a1/64 scope link 
       valid_lft forever preferred_lft forever
25: 94fd7748b53a0be@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1360 qdisc noqueue state UP group default 
    link/ether c2:02:51:9e:a4:bb brd ff:ff:ff:ff:ff:ff link-netns 5b123311-03db-4c8c-b0c3-49e4435e1724
    inet6 fe80::c002:51ff:fe9e:a4bb/64 scope link 
       valid_lft forever preferred_lft forever
26: e302b091c47156a@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1360 qdisc noqueue state UP group default 
    link/ether 8e:2f:d5:ad:dd:6f brd ff:ff:ff:ff:ff:ff link-netns 964fba77-5d9c-4599-84ce-1e6640c0c09d
    inet6 fe80::8c2f:d5ff:fead:dd6f/64 scope link 
       valid_lft forever preferred_lft forever
27: 2121f4f3184ad97@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1360 qdisc noqueue state UP group default 
    link/ether 2e:b9:5c:93:54:54 brd ff:ff:ff:ff:ff:ff link-netns 59fe64f4-39d0-407d-a653-47eafa67ca06
    inet6 fe80::2cb9:5cff:fe93:5454/64 scope link 
       valid_lft forever preferred_lft forever
28: 991de499c7f0e90@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1360 qdisc noqueue state UP group default 
    link/ether fa:51:72:bb:2d:1d brd ff:ff:ff:ff:ff:ff link-netns d9815926-571d-4b30-baac-8726635d25b5
    inet6 fe80::f851:72ff:febb:2d1d/64 scope link 
       valid_lft forever preferred_lft forever
29: db6e32a83ad3361@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1360 qdisc noqueue state UP group default 
    link/ether 36:8c:9e:53:6f:e6 brd ff:ff:ff:ff:ff:ff link-netns 21b22dbf-289c-4339-8a9e-80a04c43c208
    inet6 fe80::348c:9eff:fe53:6fe6/64 scope link 
       valid_lft forever preferred_lft forever
38: 02fcf25e68d242b@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1360 qdisc noqueue state UP group default 
    link/ether 2e:f3:df:12:3e:9b brd ff:ff:ff:ff:ff:ff link-netns 6bedfea4-2cc7-4973-8418-db3a4a496fc3
    inet6 fe80::2cf3:dfff:fe12:3e9b/64 scope link 
       valid_lft forever preferred_lft forever
39: 87de57854192a25@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1360 qdisc noqueue master ovs-system state UP group default 
    link/ether 66:a5:1a:5c:ad:f1 brd ff:ff:ff:ff:ff:ff link-netns bce20730-3a3c-467e-921d-a03e6509a6d2
    inet6 fe80::64a5:1aff:fe5c:adf1/64 scope link 
       valid_lft forever preferred_lft forever
40: 29509152075b4ca@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1360 qdisc noqueue state UP group default 
    link/ether 92:f6:99:1c:44:c9 brd ff:ff:ff:ff:ff:ff link-netns a0ec0d4a-bdfb-439c-ba6b-b3048749f76b
    inet6 fe80::90f6:99ff:fe1c:44c9/64 scope link 
       valid_lft forever preferred_lft forever
41: 87f66223949b7ab@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1360 qdisc noqueue state UP group default 
    link/ether f2:ba:0b:26:62:cd brd ff:ff:ff:ff:ff:ff link-netns ce5daba5-9c48-46ba-98cd-721f578862b2
    inet6 fe80::f0ba:bff:fe26:62cd/64 scope link 
       valid_lft forever preferred_lft forever
42: 2a6c94645da32d8@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1360 qdisc noqueue master ovs-system state UP group default 
    link/ether ce:46:e3:3b:89:57 brd ff:ff:ff:ff:ff:ff link-netns 7bed194a-332f-4753-a548-9c0aabfecb0d
    inet6 fe80::cc46:e3ff:fe3b:8957/64 scope link 
       valid_lft forever preferred_lft forever
44: 4f2b3e2d36fd42c@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1360 qdisc noqueue state UP group default 
    link/ether fa:28:90:60:cf:ee brd ff:ff:ff:ff:ff:ff link-netns 99aa630d-6741-495a-88dd-38276cf2e6a4
    inet6 fe80::f828:90ff:fe60:cfee/64 scope link 
       valid_lft forever preferred_lft forever
45: 23a9306132b5a77@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1360 qdisc noqueue state UP group default 
    link/ether 6e:4c:b4:28:99:b8 brd ff:ff:ff:ff:ff:ff link-netns 92d67cea-87ac-4e71-b14c-7c48629f0f7d
    inet6 fe80::6c4c:b4ff:fe28:99b8/64 scope link 
       valid_lft forever preferred_lft forever
46: 76ef5a1cdc02ec9@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1360 qdisc noqueue state UP group default 
    link/ether 72:60:02:ef:c5:3e brd ff:ff:ff:ff:ff:ff link-netns 5ab2bd94-f5d0-41e2-9686-710843f7b798
    inet6 fe80::7060:2ff:feef:c53e/64 scope link 
       valid_lft forever preferred_lft forever
47: b288995804c5fd5@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1360 qdisc noqueue state UP group default 
    link/ether ee:38:41:fe:ba:55 brd ff:ff:ff:ff:ff:ff link-netns dace91c9-4c34-4480-b6eb-cc548d98a755
    inet6 fe80::ec38:41ff:fefe:ba55/64 scope link 
       valid_lft forever preferred_lft forever
48: 3d2df86362b4ef7@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1360 qdisc noqueue state UP group default 
    link/ether 06:1e:89:5f:df:f3 brd ff:ff:ff:ff:ff:ff link-netns 4b9dba63-12e1-456e-8f48-9776d9108880
    inet6 fe80::41e:89ff:fe5f:dff3/64 scope link 
       valid_lft forever preferred_lft forever
49: e487ec1aa953704@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1360 qdisc noqueue state UP group default 
    link/ether 22:d0:5a:bb:f0:6c brd ff:ff:ff:ff:ff:ff link-netns 015edebb-b16b-4bc1-9ab4-ca3a8843e0dd
    inet6 fe80::20d0:5aff:febb:f06c/64 scope link 
       valid_lft forever preferred_lft forever
50: ee0e00a84e719a5@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1360 qdisc noqueue state UP group default 
    link/ether f2:3b:c0:12:8d:84 brd ff:ff:ff:ff:ff:ff link-netns f4ed6666-d714-4571-9bd4-a14bf0133bb1
    inet6 fe80::f03b:c0ff:fe12:8d84/64 scope link 
       valid_lft forever preferred_lft forever
51: 4613fd5c6234eed@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1360 qdisc noqueue state UP group default 
    link/ether 92:64:38:bd:21:df brd ff:ff:ff:ff:ff:ff link-netns c28f35d5-7d36-4408-83ff-1dd98b045892
    inet6 fe80::9064:38ff:febd:21df/64 scope link 
       valid_lft forever preferred_lft forever
52: c1a27a9b6235c7a@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1360 qdisc noqueue state UP group default 
    link/ether 6a:2a:23:7d:d6:42 brd ff:ff:ff:ff:ff:ff link-netns f5dbe6db-1cca-46a4-8a37-01f7ca266f8d
    inet6 fe80::682a:23ff:fe7d:d642/64 scope link 
       valid_lft forever preferred_lft forever
53: 44a44a06dcca16e@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1360 qdisc noqueue state UP group default 
    link/ether 02:b1:5a:cb:80:91 brd ff:ff:ff:ff:ff:ff link-netns 555a7550-f8ee-47e0-bb79-ca15df343ec5
    inet6 fe80::b1:5aff:fecb:8091/64 scope link 
       valid_lft forever preferred_lft forever
54: 40a4c03bf129ad8@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1360 qdisc noqueue master ovs-system state UP group default 
    link/ether da:13:7c:b0:55:39 brd ff:ff:ff:ff:ff:ff link-netns 58027774-aeb3-46c8-a950-f4191b576ee1
    inet6 fe80::d813:7cff:feb0:5539/64 scope link 
       valid_lft forever preferred_lft forever
55: 1c395cddee58390@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1360 qdisc noqueue state UP group default 
    link/ether 8e:4c:56:cc:a9:5d brd ff:ff:ff:ff:ff:ff link-netns f16ec94d-0da9-47d6-bcb0-543def0d4af9
    inet6 fe80::8c4c:56ff:fecc:a95d/64 scope link 
       valid_lft forever preferred_lft forever
56: 256eb51f3699d73@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1360 qdisc noqueue state UP group default 
    link/ether e2:5f:2b:b3:26:8c brd ff:ff:ff:ff:ff:ff link-netns ba1b6923-cca7-41d2-ab96-a94763111aa6
    inet6 fe80::e05f:2bff:feb3:268c/64 scope link 
       valid_lft forever preferred_lft forever
57: 039dffb3684cb8e@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1360 qdisc noqueue state UP group default 
    link/ether 9e:29:cb:4e:d8:f6 brd ff:ff:ff:ff:ff:ff link-netns b7fd9b7d-4fa5-446c-a81f-36d214b46b83
    inet6 fe80::9c29:cbff:fe4e:d8f6/64 scope link 
       valid_lft forever preferred_lft forever
58: 7dc52a9a2fef44a@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1360 qdisc noqueue state UP group default 
    link/ether 76:ad:7d:2c:54:e1 brd ff:ff:ff:ff:ff:ff link-netns 27e9e3fb-ff2b-4968-b31f-f665d8e60e5a
    inet6 fe80::74ad:7dff:fe2c:54e1/64 scope link 
       valid_lft forever preferred_lft forever
59: 004c6d102a681cb@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1360 qdisc noqueue state UP group default 
    link/ether 6a:47:c5:f4:eb:0e brd ff:ff:ff:ff:ff:ff link-netns 5eaa270f-b791-4b10-b284-57ed19b17a74
    inet6 fe80::6847:c5ff:fef4:eb0e/64 scope link 
       valid_lft forever preferred_lft forever
60: 54f239882489211@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1360 qdisc noqueue state UP group default 
    link/ether 06:6c:09:11:c7:44 brd ff:ff:ff:ff:ff:ff link-netns ab8bca34-284a-4672-ad21-92ed414d504f
    inet6 fe80::46c:9ff:fe11:c744/64 scope link 
       valid_lft forever preferred_lft forever


The answer can be found in /etc/nsswitch.conf

$ cat /etc/nsswitch.conf
...
hosts:      files dns myhostname

And specifically myhostname

myhostname is a glibc NSS (Name Service Switch) plugin, see: http://0pointer.de/lennart/projects/nss-myhostname/

The important part in that link is:

>  nss-myhostname simply returns all locally configure public IP addresses

Which means that on an OpenStack machine every configured IP address is apparently treated as "public".

For example:

$ getent ahosts `hostname`
fe80::c06d:abff:fe70:9a09 STREAM huir-0828-cf5hd-compute-0
fe80::c06d:abff:fe70:9a09 DGRAM  
fe80::c06d:abff:fe70:9a09 RAW    
fe80::58c3:b9ff:fe32:3348 STREAM 
fe80::58c3:b9ff:fe32:3348 DGRAM  
fe80::58c3:b9ff:fe32:3348 RAW    
fe80::f64e:de02:c198:b6db STREAM 
fe80::f64e:de02:c198:b6db DGRAM  
fe80::f64e:de02:c198:b6db RAW    
fe80::f4d3:1aff:fe0b:5765 STREAM 
fe80::f4d3:1aff:fe0b:5765 DGRAM  
fe80::f4d3:1aff:fe0b:5765 RAW    
fe80::78a9:45ff:fe87:487 STREAM 
fe80::78a9:45ff:fe87:487 DGRAM  
fe80::78a9:45ff:fe87:487 RAW    
fe80::4839:9dff:fe04:25d3 STREAM 
fe80::4839:9dff:fe04:25d3 DGRAM  
fe80::4839:9dff:fe04:25d3 RAW    
fe80::7cb5:37ff:fe42:b1e5 STREAM 
fe80::7cb5:37ff:fe42:b1e5 DGRAM  
fe80::7cb5:37ff:fe42:b1e5 RAW    
fe80::e81d:e3ff:fe9e:f894 STREAM 
fe80::e81d:e3ff:fe9e:f894 DGRAM  
fe80::e81d:e3ff:fe9e:f894 RAW    
fe80::9087:99ff:fe3c:3d3 STREAM 
fe80::9087:99ff:fe3c:3d3 DGRAM  
fe80::9087:99ff:fe3c:3d3 RAW    
fe80::dcfa:a5ff:feb1:3fdf STREAM 
fe80::dcfa:a5ff:feb1:3fdf DGRAM  
fe80::dcfa:a5ff:feb1:3fdf RAW    
fe80::c4d5:f2ff:fec1:1acf STREAM 
fe80::c4d5:f2ff:fec1:1acf DGRAM  
fe80::c4d5:f2ff:fec1:1acf RAW    
fe80::9828:d0ff:feca:7068 STREAM 
fe80::9828:d0ff:feca:7068 DGRAM  
fe80::9828:d0ff:feca:7068 RAW    
2620:52:0:60:946a:c6c1:950f:c7aa STREAM 
2620:52:0:60:946a:c6c1:950f:c7aa DGRAM  
2620:52:0:60:946a:c6c1:950f:c7aa RAW    
169.254.0.1     STREAM 
169.254.0.1     DGRAM  
169.254.0.1     RAW    
10.128.2.2      STREAM 
10.128.2.2      DGRAM  
10.128.2.2      RAW    
10.0.97.10      STREAM 
10.0.97.10      DGRAM  
10.0.97.10      RAW    


I need to find out why that is though and what configures that.
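
To make the list above easier to reason about, here is a minimal Python sketch that deduplicates `getent ahosts` output and labels each address. The `first_usable_ipv4` rule is an assumption modelled on the behaviour described in this bug (take the first IPv4 that is neither loopback nor link-local); it is not kubelet's actual code. `SAMPLE` is a shortened copy of the output above.

```python
import ipaddress


def unique_addresses(getent_output):
    """Return the distinct addresses from `getent ahosts` output, in first-seen order."""
    seen = []
    for line in getent_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        addr = ipaddress.ip_address(fields[0])
        if addr not in seen:
            seen.append(addr)
    return seen


def first_usable_ipv4(addresses):
    """Hypothetical first-match rule: first IPv4 that is not loopback or link-local."""
    for addr in addresses:
        if addr.version == 4 and not (addr.is_loopback or addr.is_link_local):
            return addr
    return None


# Shortened copy of the getent output from the OpenStack node.
SAMPLE = """\
fe80::9828:d0ff:feca:7068 STREAM
2620:52:0:60:946a:c6c1:950f:c7aa STREAM
169.254.0.1     STREAM
169.254.0.1     DGRAM
10.128.2.2      STREAM
10.128.2.2      DGRAM
10.0.97.10      STREAM
"""

if __name__ == "__main__":
    addrs = unique_addresses(SAMPLE)
    for addr in addrs:
        print(addr, "link-local" if addr.is_link_local else "not link-local")
    print("first usable IPv4:", first_usable_ipv4(addrs))
```

Under that assumed rule the pick is 10.128.2.2 rather than 10.0.97.10, which would be consistent with an overlay address ending up as the node's Internal IP.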

Comment 13 Dan Winship 2020-08-28 12:04:34 UTC
> The question is however why that net.LookupIP(hostname) returns ALL IP addresses on the host. On GCP we have the following:
> 
> $ ./tmpok 
> IP is: 10.0.0.5

So you're implying that we don't have "myhostname" in /etc/nsswitch.conf on GCP but we do on OpenStack?

Either way, it sounds like kubelet's autodetection behavior and "hosts myhostname" are not compatible... In particular, it seems like kubelet is more to blame here, since it's not even checking that the IP it picks is routable off the host. It needs to intersect the return value of `LookupIP` with the set of IPs that could theoretically have been returned from `utilnet.ChooseHostInterface` or something...
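
A rough Python sketch of the intersection idea, purely for illustration. It is not kubelet code: `pick_node_ip` and its two arguments are hypothetical names, and the caller is assumed to have already collected the resolver results and the addresses on the default-route interface(s).

```python
import ipaddress


def pick_node_ip(resolved, default_route_addrs):
    """Return the first resolved address that is also on a default-route interface.

    resolved: addresses returned by the hostname lookup, in resolver order.
    default_route_addrs: addresses assigned to interfaces that carry a default route.
    Both are lists of strings; both names are assumptions made for this sketch.
    """
    allowed = {ipaddress.ip_address(a) for a in default_route_addrs}
    for candidate in resolved:
        ip = ipaddress.ip_address(candidate)
        if ip in allowed:
            return ip
    return None


if __name__ == "__main__":
    # IPv4 addresses from the getent output earlier in this bug; 10.0.97.10 is
    # assumed here to be the address on the default-route interface.
    resolved = ["169.254.0.1", "10.128.2.2", "10.0.97.10"]
    print(pick_node_ip(resolved, ["10.0.97.10"]))
```

With the intersection in place, resolver ordering no longer matters: the overlay addresses are dropped because they are not on an interface with a default route.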

> This is presumably because the interface index of ovn-k8s-mp0 is lower than br-ex.

That doesn't make sense though... configure-ovs-network runs well before ovnkube-node starts, so br-ex should have a lower interface index than any of the other ovn-kube-related interfaces... Did it get deleted and recreated? Can you check dmesg and/or NetworkManager journals to see how/when the various interfaces were created?


So:

1. If we can fix the index ordering, the bug will probably go away; this may be the easiest fix.
2. kubelet's baremetal IP-finding code is wrong and we should fix it upstream, and that would fix the problem if we can't fix the index ordering. (There are other problems with that code too, like the fact that it uses the node *name* where it should be using the node *hostname*, so we have to figure out how much fixing we want to do...)
3. If the OCP default is to *not* use "myhostname" and that's something that's being added for OpenStack, then removing that might fix the problem, but presumably that would have other side effects and is probably not an option.

Comment 14 Alexander Constantinescu 2020-08-28 12:40:05 UTC
> So you're implying that we don't have "myhostname" in /etc/nsswitch.conf on GCP but we do on OpenStack?

No, I am saying the output of that DNS lookup (which is equal to getent ahosts `hostname`) is different. Specifically, on GCP the output only contains the eth0 IPv4, while on OpenStack it contains all IPs for all interfaces on the node. That is strange given that http://0pointer.de/lennart/projects/nss-myhostname/ specifies that it should return "all locally configure public IP addresses", which the ovn-k8s-mp0 address, for example, should not be.

> Did it get deleted and recreated? Can you check dmesg and/or NetworkManager journals to see how/when the various interfaces were created?

I will check the interface ordering and update this BZ with my findings.

Comment 15 Dan Winship 2020-08-28 14:19:35 UTC
(In reply to Alexander Constantinescu from comment #14)
> > So you're implying that we don't have "myhostname" in /etc/nsswitch.conf on GCP but we do on OpenStack?
> 
> No, I am saying the output of that DNS lookup (which is equal to getent
> ahosts `hostname`) is different. 

Ah, ok; so (presumably) nsswitch.conf says "hosts: files dns myhostname" on both, but on GCP systems, either "files" or "dns" succeeds so it never gets to "myhostname".

Which makes sense; in GCP/AWS, the cloud makes sure there are DNS records for the nodes, whereas in OpenStack that wouldn't happen. (Perhaps this could also be fixed in OpenStack by adding an appropriate alias to /etc/hosts.)
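
The fall-through behaviour can be modelled in a few lines of Python. This is a simplified sketch: real NSS also supports per-source action rules, which are ignored here, and the lookup tables are made-up stand-ins for /etc/hosts, DNS and nss-myhostname (the hostnames and the `make_source` helper are hypothetical).

```python
def resolve(hostname, sources):
    """Query sources in nsswitch order; the first source with an answer wins."""
    for name, lookup in sources:
        answer = lookup(hostname)
        if answer:
            return name, answer
    return None, []


def make_source(table):
    """Build a lookup function from a dict of hostname -> list of addresses."""
    return lambda hostname: table.get(hostname, [])


# Cloud with DNS records for nodes: "dns" answers, "myhostname" is never consulted.
gcp_sources = [
    ("files", make_source({})),
    ("dns", make_source({"node-a": ["10.0.0.5"]})),
    ("myhostname", make_source({"node-a": ["10.128.0.2", "10.0.0.5"]})),
]

# No DNS record and no /etc/hosts entry: the lookup falls through to "myhostname".
openstack_sources = [
    ("files", make_source({})),
    ("dns", make_source({})),
    ("myhostname", make_source({"node-b": ["169.254.0.1", "10.128.2.2", "10.0.97.10"]})),
]

if __name__ == "__main__":
    print(resolve("node-a", gcp_sources))
    print(resolve("node-b", openstack_sources))
```

Adding an entry to the "files" table for node-b plays the role of the /etc/hosts alias suggested above: the lookup would then stop at "files" and never reach "myhostname".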

Comment 16 Alexander Constantinescu 2020-08-28 16:13:26 UTC
> either "files" or "dns" succeeds so it never gets to "myhostname".

It's true, but it doesn't matter, because as I mentioned, resolving the hostname (equivalent to "getent ahosts `hostname`") returns only the eth0 IPv4 on GCP. Only OpenStack returns that screwy list of all IPs. Even my local libvirt cluster returns the same kind of result as on GCP:

From my libvirt cluster:

$ getent ahosts `hostname`
192.168.126.12  STREAM test-gz9pv-master-1.test.alexander
192.168.126.12  DGRAM  
192.168.126.12  RAW    

$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master ovs-system state UP group default qlen 1000
    link/ether 52:54:00:56:f9:70 brd ff:ff:ff:ff:ff:ff
3: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether fa:14:4b:fb:b1:a1 brd ff:ff:ff:ff:ff:ff
4: ovn-k8s-mp0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether a6:85:9f:a1:9d:2c brd ff:ff:ff:ff:ff:ff
    inet 10.128.0.2/23 brd 10.128.1.255 scope global ovn-k8s-mp0
       valid_lft forever preferred_lft forever
5: br-int: <BROADCAST,MULTICAST> mtu 1400 qdisc noop state DOWN group default qlen 1000
    link/ether 16:30:f2:b5:0b:47 brd ff:ff:ff:ff:ff:ff
6: genev_sys_6081: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65000 qdisc noqueue master ovs-system state UNKNOWN group default qlen 1000
    link/ether e2:41:79:73:b0:a2 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::e041:79ff:fe73:b0a2/64 scope link 
       valid_lft forever preferred_lft forever
7: ovn-k8s-gw0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 0a:58:a9:fe:00:01 brd ff:ff:ff:ff:ff:ff
    inet 169.254.0.1/20 brd 169.254.15.255 scope global ovn-k8s-gw0
       valid_lft forever preferred_lft forever
8: br-local: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether ea:3b:2d:a2:a1:4f brd ff:ff:ff:ff:ff:ff
    inet6 fe80::e83b:2dff:fea2:a14f/64 scope link 
       valid_lft forever preferred_lft forever
9: br-ex: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 52:54:00:56:f9:70 brd ff:ff:ff:ff:ff:ff
    inet 192.168.126.12/24 brd 192.168.126.255 scope global dynamic noprefixroute br-ex
       valid_lft 3257sec preferred_lft 3257sec
    inet6 fe80::bc88:75ad:4faa:b92/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
10: a95c973292b9d65@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue master ovs-system state UP group default 
    link/ether c6:71:2a:2b:e5:2a brd ff:ff:ff:ff:ff:ff link-netns bd441c49-dc3e-4995-a844-45e0198b8e44
    inet6 fe80::c471:2aff:fe2b:e52a/64 scope link 
       valid_lft forever preferred_lft forever
11: 2fa27ccad01c4ba@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue master ovs-system state UP group default 
    link/ether 1a:66:cf:f9:a3:4c brd ff:ff:ff:ff:ff:ff link-netns a2994b34-c922-4a45-a394-388a4553f041
    inet6 fe80::1866:cfff:fef9:a34c/64 scope link 
       valid_lft forever preferred_lft forever
13: d2b1fde3a4ccc64@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue master ovs-system state UP group default 
    link/ether ba:30:82:40:67:d6 brd ff:ff:ff:ff:ff:ff link-netns 68b57816-d380-4243-8ec8-a6207d254caa
    inet6 fe80::b830:82ff:fe40:67d6/64 scope link 
       valid_lft forever preferred_lft forever
14: b039a8d6bc12d71@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue master ovs-system state UP group default 
    link/ether 52:69:92:ca:f7:77 brd ff:ff:ff:ff:ff:ff link-netns 03f6dc24-5a54-4df7-8e0a-fdf630e4a92d
    inet6 fe80::5069:92ff:feca:f777/64 scope link 
       valid_lft forever preferred_lft forever
15: c8d6b4d871e6a9c@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue master ovs-system state UP group default 
    link/ether 6e:6a:49:a8:93:7c brd ff:ff:ff:ff:ff:ff link-netns e11d2197-2705-4755-b4ad-7753defeafa4
    inet6 fe80::6c6a:49ff:fea8:937c/64 scope link 
       valid_lft forever preferred_lft forever
16: a71a7a45cac93ea@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue master ovs-system state UP group default 
    link/ether 0a:ef:b5:8e:61:e1 brd ff:ff:ff:ff:ff:ff link-netns 85a4b12f-a490-419e-a4e0-9e2badd0cf5e
    inet6 fe80::8ef:b5ff:fe8e:61e1/64 scope link 
       valid_lft forever preferred_lft forever
18: c5a270a46c08812@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue master ovs-system state UP group default 
    link/ether 02:e4:71:c9:05:d6 brd ff:ff:ff:ff:ff:ff link-netns 99015917-8450-4b86-9b34-2c69a36d23e9
    inet6 fe80::e4:71ff:fec9:5d6/64 scope link 
       valid_lft forever preferred_lft forever

Anyways, progress has been made again: I am able to reproduce this on my libvirt cluster locally on my computer. And I've modified the configure-ovs.sh script to output `ip a` at its exit. 

There are several problems here, as you mentioned:

> 2. kubelet's baremetal IP-finding code is wrong and we should fix it upstream, and that would fix the problem if we can't fix the index ordering. (There are other problems with that code too, like the fact that it uses the node *name* where it should be using the node *hostname*, so we have to figure out how much fixing we want to do...)

Then 

3. Whatever VM configuration/cloud setting/kernel setting/fairy dust or unicorn makes OpenStack return ALL IPs across all interfaces for "getent ahosts `hostname`" needs to be stopped. Only publicly exposed IPs should be returned by that lookup.

Now, concerning 1., i.e. the re-ordering of interfaces: I've reproduced this on libvirt, and again, I don't know what causes it, but once the node reboots the br-ex interface is placed last in the list. This happens BEFORE ovnkube-node starts and it is not caused by the ovs-configuration script. I know this because when the node reboots and br-ex is already properly configured, the script does nothing but exit; see the output below:

-- Reboot --
Aug 28 15:51:31 test-gz9pv-master-1 systemd[1]: Starting Configures OVS with proper host networking configuration...
Aug 28 15:51:31 test-gz9pv-master-1 configure-ovs.sh[1592]: + iface=
Aug 28 15:51:31 test-gz9pv-master-1 configure-ovs.sh[1592]: + counter=0
Aug 28 15:51:31 test-gz9pv-master-1 configure-ovs.sh[1592]: + '[' 0 -lt 12 ']'
Aug 28 15:51:31 test-gz9pv-master-1 configure-ovs.sh[1592]: ++ jq -r '.[0].dev'
Aug 28 15:51:31 test-gz9pv-master-1 configure-ovs.sh[1592]: ++ ip -j route show default
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]: + iface=br-ex
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]: + [[ -n br-ex ]]
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]: + [[ br-ex != \n\u\l\l ]]
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]: + echo 'IPv4 Default gateway interface found: br-ex'
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]: IPv4 Default gateway interface found: br-ex
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]: + break
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]: + '[' br-ex = br-ex ']'
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]: + echo 'Networking already configured and up for br-ex!'
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]: Networking already configured and up for br-ex!
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]: + ip a
Aug 28 15:51:32 test-gz9pv-master-1 systemd[1]: Started Configures OVS with proper host networking configuration.
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]: 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]:     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]:     inet 127.0.0.1/8 scope host lo
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]:        valid_lft forever preferred_lft forever
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]:     inet6 ::1/128 scope host
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]:        valid_lft forever preferred_lft forever
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]: 2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master ovs-system state UP group default qlen 1000
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]:     link/ether 52:54:00:56:f9:70 brd ff:ff:ff:ff:ff:ff
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]: 3: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]:     link/ether fa:14:4b:fb:b1:a1 brd ff:ff:ff:ff:ff:ff
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]: 4: ovn-k8s-mp0: <BROADCAST,MULTICAST> mtu 1400 qdisc noop state DOWN group default qlen 1000
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]:     link/ether a6:85:9f:a1:9d:2c brd ff:ff:ff:ff:ff:ff
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]: 5: br-int: <BROADCAST,MULTICAST> mtu 1400 qdisc noop state DOWN group default qlen 1000
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]:     link/ether 16:30:f2:b5:0b:47 brd ff:ff:ff:ff:ff:ff
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]: 6: genev_sys_6081: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65000 qdisc noqueue master ovs-system state UNKNOWN group default qlen 1000
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]:     link/ether e2:41:79:73:b0:a2 brd ff:ff:ff:ff:ff:ff
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]:     inet6 fe80::e041:79ff:fe73:b0a2/64 scope link
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]:        valid_lft forever preferred_lft forever
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]: 7: ovn-k8s-gw0: <BROADCAST,MULTICAST> mtu 1400 qdisc noop state DOWN group default qlen 1000
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]:     link/ether 0a:58:a9:fe:00:01 brd ff:ff:ff:ff:ff:ff
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]: 8: br-local: <BROADCAST,MULTICAST> mtu 1400 qdisc noop state DOWN group default qlen 1000
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]:     link/ether ea:3b:2d:a2:a1:4f brd ff:ff:ff:ff:ff:ff
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]: 9: br-ex: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]:     link/ether 52:54:00:56:f9:70 brd ff:ff:ff:ff:ff:ff
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]:     inet 192.168.126.12/24 brd 192.168.126.255 scope global dynamic noprefixroute br-ex
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]:        valid_lft 3600sec preferred_lft 3600sec
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]:     inet6 fe80::bc88:75ad:4faa:b92/64 scope link tentative noprefixroute
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]:        valid_lft forever preferred_lft forever
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]: + exit 0
Aug 28 15:51:32 test-gz9pv-master-1 systemd[1]: ovs-configuration.service: Consumed 57ms CPU time

br-ex is ninth, ovnkube-node has not started, and the configuration script did nothing except echo a few messages. So either this is left over from the previous ovnkube-node configuration of br-ex, which triggers the re-ordering on restart, or NetworkManager does this (but I am unable to tell from reading its logs, see below).
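
For readers who do not want to decode the shell trace, the check the script performs can be approximated in Python as below. This is a sketch reconstructed from the trace only, not the script itself: the function names are hypothetical, the sample JSON is a made-up minimal stand-in for `ip -j route show default` output, and the pause between retries is omitted because the trace does not show it.

```python
import json


def default_route_iface(ip_route_json):
    """Rough equivalent of `ip -j route show default | jq -r '.[0].dev'`."""
    routes = json.loads(ip_route_json)
    if not routes:
        return None
    return routes[0].get("dev")


def wait_for_default_iface(fetch_json, attempts=12):
    """Poll up to `attempts` times (the trace shows a counter limit of 12)."""
    for _ in range(attempts):
        iface = default_route_iface(fetch_json())
        if iface:
            return iface
    return None


def already_configured(iface):
    """The script exits early when the default route already goes via br-ex."""
    return iface == "br-ex"


if __name__ == "__main__":
    # Hypothetical sample; a real node would return more fields per route.
    sample = '[{"dst": "default", "dev": "br-ex"}]'
    iface = wait_for_default_iface(lambda: sample)
    print("default gateway interface:", iface)
    print("already configured:", already_configured(iface))
```

On the rebooted node the default route already points at br-ex, so the early-exit branch is taken and the script never touches the interfaces.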

Here are NetworkManager logs from the reboot:

Aug 28 15:51:31 localhost systemd[1]: Started Hostname Service.
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0169] hostname: hostname: using hostnamed
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0172] dns-mgr[0x563d62234250]: init: dns=default,systemd-resolved rc-manager=symlink
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0231] Loaded device plugin: NMOvsFactory (/usr/lib64/NetworkManager/1.22.8-6.el8_2/libnm-device-plugin-ovs.so)
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0249] Loaded device plugin: NMTeamFactory (/usr/lib64/NetworkManager/1.22.8-6.el8_2/libnm-device-plugin-team.so)
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0250] manager: rfkill: Wi-Fi enabled by radio killswitch; enabled by state file
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0251] manager: rfkill: WWAN enabled by radio killswitch; enabled by state file
Aug 28 15:51:31 localhost dbus-daemon[1146]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service' requested by ':1.6' (uid=0 pid=1420 comm="/usr/sbin/NetworkManag>
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0252] manager: Networking is enabled by state file
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0253] dhcp-init: Using DHCP client 'internal'
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0256] settings: Loaded settings plugin: keyfile (internal)
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0274] settings: Loaded settings plugin: ifcfg-rh ("/usr/lib64/NetworkManager/1.22.8-6.el8_2/libnm-settings-plugin-ifcfg-rh.so")
Aug 28 15:51:31 localhost systemd[1]: Starting Network Manager Script Dispatcher Service...
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0334] device (lo): carrier: link connected
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0336] manager: (lo): new Generic device (/org/freedesktop/NetworkManager/Devices/1)
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0345] manager: (br-int): new Open vSwitch Interface device (/org/freedesktop/NetworkManager/Devices/2)
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0351] manager: (br-local): new Open vSwitch Interface device (/org/freedesktop/NetworkManager/Devices/3)
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0358] manager: (ovn-k8s-gw0): new Open vSwitch Interface device (/org/freedesktop/NetworkManager/Devices/4)
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0364] manager: (ovn-k8s-mp0): new Open vSwitch Interface device (/org/freedesktop/NetworkManager/Devices/5)
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0372] manager: (ens3): new Ethernet device (/org/freedesktop/NetworkManager/Devices/6)
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0376] device (ens3): state change: unmanaged -> unavailable (reason 'managed', sys-iface-state: 'external')
Aug 28 15:51:31 localhost kernel: IPv6: ADDRCONF(NETDEV_UP): ens3: link is not ready
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0435] device (ens3): carrier: link connected
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0453] device (genev_sys_6081): carrier: link connected
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0456] manager: (genev_sys_6081): new Generic device (/org/freedesktop/NetworkManager/Devices/7)
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0464] manager: (br-ex): new Open vSwitch Port device (/org/freedesktop/NetworkManager/Devices/8)
Aug 28 15:51:31 localhost dbus-daemon[1146]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0469] manager: (br-ex): new Open vSwitch Bridge device (/org/freedesktop/NetworkManager/Devices/9)
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0473] device (br-ex): state change: unmanaged -> unavailable (reason 'managed', sys-iface-state: 'external')
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0485] manager: (br-ex): new Open vSwitch Interface device (/org/freedesktop/NetworkManager/Devices/10)
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0489] device (br-ex): state change: unmanaged -> unavailable (reason 'managed', sys-iface-state: 'external')
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0502] manager: (ens3): new Open vSwitch Port device (/org/freedesktop/NetworkManager/Devices/11)
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0507] device (ens3): state change: unmanaged -> unavailable (reason 'managed', sys-iface-state: 'external')
Aug 28 15:51:31 localhost systemd[1]: Started Network Manager Script Dispatcher Service.
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0572] ovs: ovs interface "92714f253c5cba9" ((null)) failed: could not open network device 92714f253c5cba9 (No such device)
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0572] ovs: ovs interface "6eb64ae03e9a18f" ((null)) failed: could not open network device 6eb64ae03e9a18f (No such device)
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0572] ovs: ovs interface "70f3adba4eb5107" ((null)) failed: could not open network device 70f3adba4eb5107 (No such device)
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0573] ovs: ovs interface "18fb30d42c5147b" ((null)) failed: could not open network device 18fb30d42c5147b (No such device)
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0573] ovs: ovs interface "bf59011c624959f" ((null)) failed: could not open network device bf59011c624959f (No such device)
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0573] ovs: ovs interface "ddec5a12e243f08" ((null)) failed: could not open network device ddec5a12e243f08 (No such device)
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0573] ovs: ovs interface "c488d2b84d35146" ((null)) failed: could not open network device c488d2b84d35146 (No such device)
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0577] manager: (6eb64ae03e9a18f): new Open vSwitch Port device (/org/freedesktop/NetworkManager/Devices/12)
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0584] manager: (92714f253c5cba9): new Open vSwitch Port device (/org/freedesktop/NetworkManager/Devices/13)
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0591] manager: (patch-lnet-node_local_switch-to-br-int): new Open vSwitch Port device (/org/freedesktop/NetworkManager/Devices/14)
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0599] manager: (patch-br-int-to-lnet-node_local_switch): new Open vSwitch Port device (/org/freedesktop/NetworkManager/Devices/15)
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0611] manager: (patch-br-ex_test-gz9pv-master-1-to-br-int): new Open vSwitch Port device (/org/freedesktop/NetworkManager/Devices/16)
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0619] manager: (br-int): new Open vSwitch Port device (/org/freedesktop/NetworkManager/Devices/17)
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0626] manager: (patch-br-int-to-br-ex_test-gz9pv-master-1): new Open vSwitch Port device (/org/freedesktop/NetworkManager/Devices/18)
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0633] manager: (ovn-k8s-mp0): new Open vSwitch Port device (/org/freedesktop/NetworkManager/Devices/19)
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0641] manager: (bf59011c624959f): new Open vSwitch Port device (/org/freedesktop/NetworkManager/Devices/20)
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0653] manager: (ovn-c6e3dd-0): new Open vSwitch Port device (/org/freedesktop/NetworkManager/Devices/21)
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0679] manager: (ovn-k8s-gw0): new Open vSwitch Port device (/org/freedesktop/NetworkManager/Devices/22)
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0689] manager: (ddec5a12e243f08): new Open vSwitch Port device (/org/freedesktop/NetworkManager/Devices/23)
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0697] manager: (18fb30d42c5147b): new Open vSwitch Port device (/org/freedesktop/NetworkManager/Devices/24)
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0705] manager: (c488d2b84d35146): new Open vSwitch Port device (/org/freedesktop/NetworkManager/Devices/25)
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0713] manager: (70f3adba4eb5107): new Open vSwitch Port device (/org/freedesktop/NetworkManager/Devices/26)
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0721] manager: (br-local): new Open vSwitch Port device (/org/freedesktop/NetworkManager/Devices/27)
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0729] manager: (br-int): new Open vSwitch Bridge device (/org/freedesktop/NetworkManager/Devices/28)
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0737] manager: (br-local): new Open vSwitch Bridge device (/org/freedesktop/NetworkManager/Devices/29)
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0813] policy: auto-activating connection 'ovs-port-br-ex' (66688bb6-77a3-4e43-8bd4-ed620a470aca)
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0816] device (ens3): state change: unavailable -> disconnected (reason 'none', sys-iface-state: 'managed')
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0823] device (br-ex): state change: unavailable -> disconnected (reason 'none', sys-iface-state: 'managed')
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0825] device (br-ex): state change: unavailable -> disconnected (reason 'none', sys-iface-state: 'managed')
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0827] device (ens3): state change: unavailable -> disconnected (reason 'none', sys-iface-state: 'managed')
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0833] device (br-ex): state change: unmanaged -> unavailable (reason 'managed', sys-iface-state: 'external')
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0841] device (br-ex): state change: unavailable -> disconnected (reason 'user-requested', sys-iface-state: 'managed')
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0848] device (br-ex): Activation: starting connection 'ovs-port-br-ex' (66688bb6-77a3-4e43-8bd4-ed620a470aca)
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0849] policy: auto-activating connection 'ovs-if-phys0' (215ef9cd-4337-4d6f-ab55-74f234d132ff)
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0850] policy: auto-activating connection 'br-ex' (7f199444-421a-436e-b14d-44b8e1a11c98)
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0851] policy: auto-activating connection 'ovs-if-br-ex' (810c8081-134d-435a-ad50-c3df0143a7ea)
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0853] policy: auto-activating connection 'ovs-port-phys0' (820dd895-36d0-4664-a3ea-670375e85743)
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0856] device (br-ex): Activation: starting connection 'br-ex' (7f199444-421a-436e-b14d-44b8e1a11c98)
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0857] device (br-ex): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'managed')
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0860] manager: NetworkManager state is now CONNECTING
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0862] device (ens3): Activation: starting connection 'ovs-if-phys0' (215ef9cd-4337-4d6f-ab55-74f234d132ff)
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0864] device (br-ex): Activation: starting connection 'ovs-if-br-ex' (810c8081-134d-435a-ad50-c3df0143a7ea)
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0867] device (ens3): Activation: starting connection 'ovs-port-phys0' (820dd895-36d0-4664-a3ea-670375e85743)
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0868] device (br-ex): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'managed')
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0872] device (br-ex): state change: prepare -> config (reason 'none', sys-iface-state: 'managed')
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0875] device (ens3): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'managed')
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0879] device (br-ex): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'managed')
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0883] device (br-ex): state change: prepare -> config (reason 'none', sys-iface-state: 'managed')
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0888] device (ens3): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'managed')
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0894] device (ens3): state change: prepare -> config (reason 'none', sys-iface-state: 'managed')
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0897] device (br-ex): state change: prepare -> config (reason 'none', sys-iface-state: 'managed')
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0900] device (ens3): state change: prepare -> config (reason 'none', sys-iface-state: 'managed')
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0905] device (br-ex): state change: config -> ip-config (reason 'none', sys-iface-state: 'managed')
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0906] device (br-ex): state change: ip-config -> secondaries (reason 'ip-config-unavailable', sys-iface-state: 'managed')
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0921] device (br-ex): state change: config -> ip-config (reason 'none', sys-iface-state: 'managed')
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0928] device (br-ex): Activation: connection 'ovs-if-br-ex' enslaved, continuing activation
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0930] device (ens3): state change: config -> ip-config (reason 'none', sys-iface-state: 'managed')
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0931] device (ens3): Activation: connection 'ovs-port-phys0' enslaved, continuing activation
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0932] device (ens3): state change: ip-config -> secondaries (reason 'ip-config-unavailable', sys-iface-state: 'managed')
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0938] device (br-ex): state change: config -> ip-config (reason 'none', sys-iface-state: 'managed')
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0941] device (br-ex): Activation: connection 'ovs-port-br-ex' enslaved, continuing activation
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0944] device (br-ex): state change: ip-config -> secondaries (reason 'ip-config-unavailable', sys-iface-state: 'managed')
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0946] device (br-ex): state change: secondaries -> activated (reason 'none', sys-iface-state: 'managed')
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0956] policy: set-hostname: set hostname to 'localhost.localdomain' (no default device)
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0957] device (br-ex): Activation: successful, device activated.
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0964] device (ens3): state change: config -> ip-config (reason 'none', sys-iface-state: 'managed')
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0971] device (ens3): Activation: connection 'ovs-if-phys0' enslaved, continuing activation
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0973] device (ens3): state change: ip-config -> ip-check (reason 'none', sys-iface-state: 'managed')
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0978] device (ens3): state change: secondaries -> activated (reason 'none', sys-iface-state: 'managed')
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0987] device (ens3): Activation: successful, device activated.
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.0994] device (br-ex): state change: secondaries -> activated (reason 'none', sys-iface-state: 'managed')
Aug 28 15:51:31 localhost NetworkManager[1420]: <info>  [1598629891.1001] device (br-ex): Activation: successful, device activated.
Aug 28 15:51:31 localhost.localdomain systemd-hostnamed[1432]: Changed host name to 'localhost.localdomain'
Aug 28 15:51:31 localhost.localdomain NetworkManager[1420]: <info>  [1598629891.1098] device (ens3): state change: ip-check -> secondaries (reason 'none', sys-iface-state: 'managed')
Aug 28 15:51:31 localhost.localdomain NetworkManager[1420]: <info>  [1598629891.1100] device (ens3): state change: secondaries -> activated (reason 'none', sys-iface-state: 'managed')
Aug 28 15:51:31 localhost.localdomain NetworkManager[1420]: <info>  [1598629891.1109] device (ens3): Activation: successful, device activated.
Aug 28 15:51:31 localhost.localdomain kernel: device br-ex entered promiscuous mode
Aug 28 15:51:31 localhost.localdomain systemd-udevd[1485]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Aug 28 15:51:31 localhost.localdomain systemd-udevd[1485]: Could not generate persistent MAC address for br-ex: No such file or directory
Aug 28 15:51:31 localhost.localdomain NetworkManager[1420]: <info>  [1598629891.1359] device (br-ex): set-hw-addr: set-cloned MAC address to 52:54:00:56:F9:70 (52:54:00:56:F9:70)
Aug 28 15:51:31 localhost.localdomain NetworkManager[1420]: <info>  [1598629891.1366] device (br-ex): carrier: link connected
Aug 28 15:51:31 localhost.localdomain ovs-vswitchd[1336]: ovs|00109|bridge|ERR|interface br-ex: ignoring mac in Interface record (use Bridge record to set local port's mac)
Aug 28 15:51:31 localhost.localdomain NetworkManager[1420]: <info>  [1598629891.1378] dhcp4 (br-ex): activation: beginning transaction (timeout in 45 seconds)
Aug 28 15:51:31 localhost.localdomain NetworkManager[1420]: <info>  [1598629891.1508] dhcp4 (br-ex): option dhcp_lease_time      => '3600'
Aug 28 15:51:31 localhost.localdomain NetworkManager[1420]: <info>  [1598629891.1509] dhcp4 (br-ex): option domain_name          => 'test.alexander'
Aug 28 15:51:31 localhost.localdomain NetworkManager[1420]: <info>  [1598629891.1509] dhcp4 (br-ex): option domain_name_servers  => '192.168.126.1'
Aug 28 15:51:31 localhost.localdomain NetworkManager[1420]: <info>  [1598629891.1509] dhcp4 (br-ex): option expiry               => '1598633491'
Aug 28 15:51:31 localhost.localdomain NetworkManager[1420]: <info>  [1598629891.1509] dhcp4 (br-ex): option host_name            => 'test-gz9pv-master-1'
Aug 28 15:51:31 localhost.localdomain NetworkManager[1420]: <info>  [1598629891.1509] dhcp4 (br-ex): option ip_address           => '192.168.126.12'
Aug 28 15:51:31 localhost.localdomain NetworkManager[1420]: <info>  [1598629891.1509] dhcp4 (br-ex): option next_server          => '192.168.126.1'
Aug 28 15:51:31 localhost.localdomain NetworkManager[1420]: <info>  [1598629891.1509] dhcp4 (br-ex): option requested_broadcast_address => '1'
Aug 28 15:51:31 localhost.localdomain NetworkManager[1420]: <info>  [1598629891.1509] dhcp4 (br-ex): option requested_domain_name => '1'
Aug 28 15:51:31 localhost.localdomain NetworkManager[1420]: <info>  [1598629891.1510] dhcp4 (br-ex): option requested_domain_name_servers => '1'
Aug 28 15:51:31 localhost.localdomain NetworkManager[1420]: <info>  [1598629891.1510] dhcp4 (br-ex): option requested_domain_search => '1'
Aug 28 15:51:31 localhost.localdomain NetworkManager[1420]: <info>  [1598629891.1510] dhcp4 (br-ex): option requested_host_name  => '1'
Aug 28 15:51:31 localhost.localdomain NetworkManager[1420]: <info>  [1598629891.1510] dhcp4 (br-ex): option requested_interface_mtu => '1'
Aug 28 15:51:31 localhost.localdomain NetworkManager[1420]: <info>  [1598629891.1510] dhcp4 (br-ex): option requested_ms_classless_static_routes => '1'
Aug 28 15:51:31 localhost.localdomain NetworkManager[1420]: <info>  [1598629891.1510] dhcp4 (br-ex): option requested_nis_domain => '1'
Aug 28 15:51:31 localhost.localdomain NetworkManager[1420]: <info>  [1598629891.1510] dhcp4 (br-ex): option requested_nis_servers => '1'
Aug 28 15:51:31 localhost.localdomain NetworkManager[1420]: <info>  [1598629891.1510] dhcp4 (br-ex): option requested_ntp_servers => '1'
Aug 28 15:51:31 localhost.localdomain NetworkManager[1420]: <info>  [1598629891.1510] dhcp4 (br-ex): option requested_rfc3442_classless_static_routes => '1'
Aug 28 15:51:31 localhost.localdomain NetworkManager[1420]: <info>  [1598629891.1511] dhcp4 (br-ex): option requested_root_path  => '1'
Aug 28 15:51:31 localhost.localdomain NetworkManager[1420]: <info>  [1598629891.1511] dhcp4 (br-ex): option requested_routers    => '1'
Aug 28 15:51:31 localhost.localdomain NetworkManager[1420]: <info>  [1598629891.1511] dhcp4 (br-ex): option requested_static_routes => '1'
Aug 28 15:51:31 localhost.localdomain NetworkManager[1420]: <info>  [1598629891.1511] dhcp4 (br-ex): option requested_subnet_mask => '1'
Aug 28 15:51:31 localhost.localdomain NetworkManager[1420]: <info>  [1598629891.1511] dhcp4 (br-ex): option requested_time_offset => '1'
Aug 28 15:51:31 localhost.localdomain NetworkManager[1420]: <info>  [1598629891.1511] dhcp4 (br-ex): option requested_wpad       => '1'
Aug 28 15:51:31 localhost.localdomain NetworkManager[1420]: <info>  [1598629891.1511] dhcp4 (br-ex): option routers              => '192.168.126.1'
Aug 28 15:51:31 localhost.localdomain NetworkManager[1420]: <info>  [1598629891.1511] dhcp4 (br-ex): option subnet_mask          => '255.255.255.0'
Aug 28 15:51:31 localhost.localdomain NetworkManager[1420]: <info>  [1598629891.1512] dhcp4 (br-ex): state changed unknown -> bound
Aug 28 15:51:31 localhost.localdomain NetworkManager[1420]: <info>  [1598629891.1552] device (br-ex): state change: ip-config -> ip-check (reason 'none', sys-iface-state: 'managed')
Aug 28 15:51:31 localhost.localdomain NetworkManager[1420]: <info>  [1598629891.1597] device (br-ex): state change: ip-check -> secondaries (reason 'none', sys-iface-state: 'managed')
Aug 28 15:51:31 localhost.localdomain NetworkManager[1420]: <info>  [1598629891.1600] device (br-ex): state change: secondaries -> activated (reason 'none', sys-iface-state: 'managed')
Aug 28 15:51:31 localhost.localdomain NetworkManager[1420]: <info>  [1598629891.1606] manager: NetworkManager state is now CONNECTED_LOCAL
Aug 28 15:51:31 localhost.localdomain NetworkManager[1420]: <info>  [1598629891.1631] manager: NetworkManager state is now CONNECTED_SITE
Aug 28 15:51:31 localhost.localdomain NetworkManager[1420]: <info>  [1598629891.1632] policy: set 'ovs-if-br-ex' (br-ex) as default for IPv4 routing and DNS
Aug 28 15:51:31 localhost.localdomain NetworkManager[1420]: <info>  [1598629891.1634] policy: set-hostname: set hostname to 'test-gz9pv-master-1' (from DHCPv4)
Aug 28 15:51:31 localhost.localdomain NetworkManager[1420]: <info>  [1598629891.1679] device (br-ex): Activation: successful, device activated.
Aug 28 15:51:31 localhost.localdomain ovs-vswitchd[1336]: ovs|00117|bridge|ERR|interface br-ex: ignoring mac in Interface record (use Bridge record to set local port's mac)
Aug 28 15:51:31 localhost.localdomain dbus-daemon[1146]: [system] Activating via systemd: service name='org.freedesktop.resolve1' unit='dbus-org.freedesktop.resolve1.service' requested by ':1.6' (uid=0 pid=1420 comm="/usr/sbin/NetworkMan>
Aug 28 15:51:31 localhost.localdomain NetworkManager[1420]: <info>  [1598629891.1697] manager: NetworkManager state is now CONNECTED_GLOBAL
Aug 28 15:51:31 test-gz9pv-master-1 dbus-daemon[1146]: [system] Activation via systemd failed for unit 'dbus-org.freedesktop.resolve1.service': Unit dbus-org.freedesktop.resolve1.service not found.
Aug 28 15:51:31 test-gz9pv-master-1 systemd-hostnamed[1432]: Changed host name to 'test-gz9pv-master-1'
Aug 28 15:51:31 test-gz9pv-master-1 NetworkManager[1420]: <info>  [1598629891.1712] manager: startup complete
Aug 28 15:51:31 test-gz9pv-master-1 systemd[1]: Started Network Manager Wait Online.
Aug 28 15:51:31 test-gz9pv-master-1 network-manager/90-long-hostname[1506]: hostname is already set
Aug 28 15:51:31 test-gz9pv-master-1 network-manager/90-long-hostname[1525]: hostname is already set
Aug 28 15:51:31 test-gz9pv-master-1 network-manager/90-long-hostname[1538]: hostname is already set
Aug 28 15:51:31 test-gz9pv-master-1 network-manager/90-long-hostname[1546]: hostname is already set
Aug 28 15:51:31 test-gz9pv-master-1 network-manager/90-long-hostname[1559]: hostname is already set
Aug 28 15:51:31 test-gz9pv-master-1 network-manager/90-long-hostname[1572]: hostname is already set
Aug 28 15:51:31 test-gz9pv-master-1 network-manager/90-long-hostname[1585]: hostname is already set
Aug 28 15:51:31 test-gz9pv-master-1 bash[1590]: node identified as test-gz9pv-master-1
Aug 28 15:51:31 test-gz9pv-master-1 systemd[1]: Started Ensure the node hostname is valid for the cluster.
Aug 28 15:51:31 test-gz9pv-master-1 systemd[1]: Starting Configures OVS with proper host networking configuration...
Aug 28 15:51:31 test-gz9pv-master-1 configure-ovs.sh[1592]: + iface=
Aug 28 15:51:31 test-gz9pv-master-1 configure-ovs.sh[1592]: + counter=0
Aug 28 15:51:31 test-gz9pv-master-1 configure-ovs.sh[1592]: + '[' 0 -lt 12 ']'
Aug 28 15:51:31 test-gz9pv-master-1 configure-ovs.sh[1592]: ++ jq -r '.[0].dev'
Aug 28 15:51:31 test-gz9pv-master-1 configure-ovs.sh[1592]: ++ ip -j route show default
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]: + iface=br-ex
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]: + [[ -n br-ex ]]
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]: + [[ br-ex != \n\u\l\l ]]
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]: + echo 'IPv4 Default gateway interface found: br-ex'
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]: IPv4 Default gateway interface found: br-ex
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]: + break
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]: + '[' br-ex = br-ex ']'
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]: + echo 'Networking already configured and up for br-ex!'
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]: Networking already configured and up for br-ex!
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]: + ip a
Aug 28 15:51:32 test-gz9pv-master-1 systemd[1]: Started Configures OVS with proper host networking configuration.
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]: 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]:     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]:     inet 127.0.0.1/8 scope host lo
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]:        valid_lft forever preferred_lft forever
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]:     inet6 ::1/128 scope host
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]:        valid_lft forever preferred_lft forever
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]: 2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master ovs-system state UP group default qlen 1000
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]:     link/ether 52:54:00:56:f9:70 brd ff:ff:ff:ff:ff:ff
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]: 3: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]:     link/ether fa:14:4b:fb:b1:a1 brd ff:ff:ff:ff:ff:ff
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]: 4: ovn-k8s-mp0: <BROADCAST,MULTICAST> mtu 1400 qdisc noop state DOWN group default qlen 1000
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]:     link/ether a6:85:9f:a1:9d:2c brd ff:ff:ff:ff:ff:ff
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]: 5: br-int: <BROADCAST,MULTICAST> mtu 1400 qdisc noop state DOWN group default qlen 1000
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]:     link/ether 16:30:f2:b5:0b:47 brd ff:ff:ff:ff:ff:ff
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]: 6: genev_sys_6081: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65000 qdisc noqueue master ovs-system state UNKNOWN group default qlen 1000
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]:     link/ether e2:41:79:73:b0:a2 brd ff:ff:ff:ff:ff:ff
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]:     inet6 fe80::e041:79ff:fe73:b0a2/64 scope link
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]:        valid_lft forever preferred_lft forever
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]: 7: ovn-k8s-gw0: <BROADCAST,MULTICAST> mtu 1400 qdisc noop state DOWN group default qlen 1000
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]:     link/ether 0a:58:a9:fe:00:01 brd ff:ff:ff:ff:ff:ff
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]: 8: br-local: <BROADCAST,MULTICAST> mtu 1400 qdisc noop state DOWN group default qlen 1000
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]:     link/ether ea:3b:2d:a2:a1:4f brd ff:ff:ff:ff:ff:ff
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]: 9: br-ex: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]:     link/ether 52:54:00:56:f9:70 brd ff:ff:ff:ff:ff:ff
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]:     inet 192.168.126.12/24 brd 192.168.126.255 scope global dynamic noprefixroute br-ex
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]:        valid_lft 3600sec preferred_lft 3600sec
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]:     inet6 fe80::bc88:75ad:4faa:b92/64 scope link tentative noprefixroute
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]:        valid_lft forever preferred_lft forever
Aug 28 15:51:32 test-gz9pv-master-1 configure-ovs.sh[1592]: + exit 0
Aug 28 15:51:32 test-gz9pv-master-1 systemd[1]: ovs-configuration.service: Consumed 57ms C

Comment 17 Alexander Constantinescu 2020-08-28 16:36:02 UTC
OK, it's certainly NetworkManager on restart. I've inserted a custom service that runs before NetworkManager-wait-online.service.

Here's the output:

Aug 28 16:32:17 localhost systemd[1]: Starting A small hello world from Alex...
Aug 28 16:32:17 localhost alex.sh[1138]: + echo Hello world from Alex
Aug 28 16:32:17 localhost alex.sh[1138]: Hello world from Alex
Aug 28 16:32:17 localhost alex.sh[1138]: + ip a
Aug 28 16:32:17 localhost systemd[1]: Starting NTP client/server...
Aug 28 16:32:17 localhost bash[1139]: waiting for non-localhost hostname to be assigned
Aug 28 16:32:17 localhost systemd[1]: Starting Open vSwitch Database Unit...
Aug 28 16:32:17 localhost systemd[1]: Started irqbalance daemon.
Aug 28 16:32:17 localhost systemd[1]: Starting System Security Services Daemon...
Aug 28 16:32:17 localhost systemd[1]: Reached target sshd-keygen.target.
Aug 28 16:32:17 localhost systemd[1]: Starting Generate /run/issue.d/console-login-helper-messages.issue...
Aug 28 16:32:17 localhost alex.sh[1138]: 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
Aug 28 16:32:17 localhost alex.sh[1138]:     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
Aug 28 16:32:17 localhost alex.sh[1138]:     inet 127.0.0.1/8 scope host lo
Aug 28 16:32:17 localhost alex.sh[1138]:        valid_lft forever preferred_lft forever
Aug 28 16:32:17 localhost alex.sh[1138]:     inet6 ::1/128 scope host
Aug 28 16:32:17 localhost alex.sh[1138]:        valid_lft forever preferred_lft forever
Aug 28 16:32:17 localhost alex.sh[1138]: 2: ens3: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
Aug 28 16:32:17 localhost alex.sh[1138]:     link/ether 52:54:00:56:f9:70 brd ff:ff:ff:ff:ff:ff
Aug 28 16:32:17 localhost chronyd[1157]: chronyd version 3.5 starting (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP +SCFILTER +SIGND +ASYNCDNS +SECHASH +IPV6 +DEBUG)
Aug 28 16:32:17 localhost chown[1143]: /usr/bin/chown: cannot access '/var/run/openvswitch': No such file or directory
Aug 28 16:32:17 localhost chronyd[1157]: Frequency -19.709 +/- 5.979 ppm read from /var/lib/chrony/drift
Aug 28 16:32:17 localhost chronyd[1157]: Using right/UTC timezone to obtain leap second data
Aug 28 16:32:17 localhost systemd[1]: Started A small hello world from Alex.
Aug 28 16:32:17 localhost systemd[1]: alex.service: Consumed 5ms CPU time

Comment 18 Alexander Constantinescu 2020-08-28 16:37:52 UTC
So at this point I am not sure what's faster: fixing the kubelet, fixing NetworkManager, or fixing OpenStack?

Comment 19 Dan Williams 2020-08-28 17:14:02 UTC
Anything that relies on interface index ordering is fundamentally broken. We should not be working around dumb kubelet bugs by manipulating interface indexes at all.

Comment 20 Alexander Constantinescu 2020-08-28 17:19:30 UTC
And actually, the more I think about it, the more I tell myself that the kubelet *cannot* be fixed.

Go has no way of retrieving the default route without performing OS-specific syscalls (which is why netlink can do that, but only compiles for Linux). The kubelet cannot be built like that, and I think they were really counting on `net.LookupIP(node.Name)` not returning crazy stuff, and thus on just picking the first item in the list (which would have equated to the IP of the default route).

So this should probably just be sent over to the OpenStack team / NetworkManager.

Comment 21 Dan Winship 2020-08-28 17:52:28 UTC
(In reply to Alexander Constantinescu from comment #16)
> > either "files" or "dns" succeeds so it never gets to "myhostname".
> 
> It's true but it doesn't matter, because as I mentioned: the DNS resolution
> of myhostname (equivalent to "getent ahosts `hostname`") returns "only the
> eth0 IPv4" on GCP. Only Openstack returns that screwy list of all IPs.

The way nsswitch.conf works is that if it says "hosts   files dns myhostname", then when someone tries to resolve a hostname, it first uses "files" (i.e., /etc/hosts), and if there's an answer there, it returns that answer. If there's no answer, then it uses "dns", and if there's an answer there, it returns that answer. If there's still no answer, then it uses "myhostname".

So what's happening is that on GCP, either "files" or "dns" has a match for the system hostname, so we don't have to try "myhostname". While on OpenStack, the node's name does not appear in either /etc/hosts or in DNS, so it falls back to myhostname. If the host's name did appear in /etc/hosts or DNS on OpenStack, then we would not fall back to "myhostname" on OpenStack either.
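
The resolution order described above can be modeled with a small sketch. This is not glibc's NSS implementation; the function names, the hostname "node-a", and the addresses are all hypothetical, and "myhostname" is simplified to "answer with every locally configured address".

```python
from typing import Callable, Dict, List, Tuple

Source = Tuple[str, Callable[[str], List[str]]]


def make_sources(files: Dict[str, List[str]],
                 dns: Dict[str, List[str]],
                 local_addresses: List[str]) -> List[Source]:
    """Build the ordered source list for 'hosts: files dns myhostname'."""
    return [
        ("files", lambda host: files.get(host, [])),
        ("dns", lambda host: dns.get(host, [])),
        # Simplification: treat every queried name as the local hostname,
        # for which myhostname answers with all locally configured addresses.
        ("myhostname", lambda host: list(local_addresses)),
    ]


def resolve(host: str, sources: List[Source]) -> Tuple[str, List[str]]:
    """Return (source name, addresses) from the first source with an answer."""
    for name, lookup in sources:
        addresses = lookup(host)
        if addresses:
            return name, addresses
    return "", []


# Hypothetical addresses, for illustration only.
local = ["10.128.0.2", "192.168.126.12"]
with_dns = make_sources({}, {"node-a": ["192.168.126.12"]}, local)
without_dns = make_sources({}, {}, local)

print(resolve("node-a", with_dns))     # answered by "dns"; myhostname is never consulted
print(resolve("node-a", without_dns))  # falls through to "myhostname": every local address
```

The second call is the OpenStack case: nothing in "files" or "dns" matches, so the caller gets the whole list of local addresses, in whatever order they happen to come back.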

> Even my local libvirt cluster returns the same result as on GCP:

Presumably your hostname appears in /etc/hosts, and so "myhostname" does not get used.

> or NetworkManager does this (but I am unable to understand from reading its logs, see below)

The NetworkManager logs show that NetworkManager is _observing_ the network configuration, not that it's creating it.

I'm guessing OVS must be the one creating those interfaces? Are we persisting our ovsdb across reboots? We don't want that, since ovn-kubernetes is not written to deal with the possibility that there might be leftover interfaces from the last time it was run (which might have been with an older version of ovn-kubernetes that had a slightly different internal architecture).

> And actually the more I think about it, I tell myself that the kubelet
> *cannot* be fixed. 
> 
> Go has no way of retrieving the default route without performing OS specific
> syscalls (which is why netlink can do that, but only compiles for linux).

kubelet *already* has a function that finds the IP corresponding to the default route; it's just that it only uses it if `LookupIP()` fails, so if you have "myhostname" configured in nsswitch.conf, it will never get used. (Go uses netlink in the standard library, it just doesn't expose it via any public APIs)

Comment 22 Dan Williams 2020-08-28 18:24:03 UTC
If a cloud provider is non-functional or no cloud provider is used, and for some reason looking up the node's hostname does not provide a usable result (in this case because DNS does not provide a result, but myhostname appears to provide one that is not useful) then the standard practice is to set the --node-ip or --bind-addr of components run on that node to the address the system administrator expects the node to have.

Actual bare-metal (with devscripts) does this correctly, OpenStack must also do this if it does not already.

Over to Node since I'm not sure where to put bare-metal/cloud-provider/etc issues.

Comment 23 Martin André 2020-09-01 16:25:11 UTC
(In reply to Alexander Constantinescu from comment #12)
> OK, I think I've narrowed the problem down. 
> 
> On Openstack we run the kubelet with the flag `--cloud-provider=`, that means
> it's up to the kubelet to set the IP address without looking up the node's
> IP address from the external cloud provider. This is done here:
> 
> https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/nodestatus/
> setters.go#L205

For OpenStack we set the --node-ip option [1] so I'm not sure the above code is used at all. It's reading its value from a /etc/systemd/system/kubelet.service.d/20-nodenet.conf file dropped by the nodeip-finder script [2] from baremetal-runtimecfg. Would it be enough to discard the ovn-k8s-* interfaces?

[1] https://github.com/openshift/machine-config-operator/blob/36f37f2d6009affe8174854f5ef5538e0cc49034/templates/master/01-master-kubelet/openstack/units/kubelet.service.yaml#L27
[2] https://github.com/openshift/baremetal-runtimecfg/blob/b2b74d7c6a5c02811f7d8262ee2e0c00e73f8b68/scripts/nodeip-finder

Comment 24 Dan Williams 2020-09-01 16:53:13 UTC
(In reply to Martin André from comment #23)
> (In reply to Alexander Constantinescu from comment #12)
> > OK, I think I've narrowed the problem down. 
> > 
> > On Openstack we run the kubelet with the flag `--cloud-provider=`, that means
> > it's up to the kubelet to set the IP address without looking up the node's
> > IP address from the external cloud provider. This is done here:
> > 
> > https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/nodestatus/
> > setters.go#L205
> 
> For OpenStack we set the --node-ip option [1] so I'm not sure the above code
> is used at all. It's reading its value from a
> /etc/systemd/system/kubelet.service.d/20-nodenet.conf file dropped by the
> nodeip-finder script [2] from baremetal-runtimecfg. Would it be enough to
> discard the ovn-k8s-* interfaces?
> 
> [1]
> https://github.com/openshift/machine-config-operator/blob/
> 36f37f2d6009affe8174854f5ef5538e0cc49034/templates/master/01-master-kubelet/
> openstack/units/kubelet.service.yaml#L27
> [2]
> https://github.com/openshift/baremetal-runtimecfg/blob/
> b2b74d7c6a5c02811f7d8262ee2e0c00e73f8b68/scripts/nodeip-finder

That would be insufficient. The information cannot come from scraping a bunch of random interfaces, it needs to come from the actual network configuration of the node. Something knows what that configuration is.

Comment 25 Dan Williams 2020-09-01 16:58:42 UTC
To be clear, it needs to come from the actual network configuration *as seen from outside the node*. Like, whatever is provisioning the node knows exactly what the IP is. And that's the IP(s) that kubelet needs to report.

Comment 26 Dan Winship 2020-09-01 18:24:38 UTC
> That would be insufficient. The information cannot come from scraping a bunch of random interfaces,
> it needs to come from the actual network configuration of the node. Something knows what that configuration is.

The nodeip-finder script is the thing that is supposed to understand OCP node configuration. In particular, it exists because kubelet isn't smart enough to know to ignore IPs added by ipfailoverd, so nodeip-finder is.

However, nodeip-finder should already have been smart enough to not use the ovn-k8s-mp0 interface, because that interface doesn't have a route to the apiserver... so it seems like something went wrong there? It should not need to specifically know to ignore ovn-k8s-mp0.

Comment 27 Mike Fedosin 2020-09-03 14:31:59 UTC
This bug may be fixed with: https://github.com/openshift/machine-config-operator/pull/2031
It makes sure that we set correct hostname before starting networking configuration. 

We didn't have issues with pure OVS, but with OVN there were some weird race conditions, and I expect this bug may be one of their manifestations.

Comment 29 Pierre Prinetti 2020-09-03 14:43:03 UTC
Possibly a duplicate of bug 1851540; could you verify if this is still happening with the latest nightly please?

Comment 33 Martin André 2020-09-09 14:34:32 UTC
I got confirmation from Gaoyun Pei that this isn't a deployment on OpenStack but rather simulated BM nodes using OpenStack VMs.
Re-assigning to Node component.

Comment 34 Seth Jennings 2020-09-09 22:19:58 UTC
Ok, this bug has had a ton of noise.

Just to level set, this is the bare metal platform using OVN, installed on OpenStack VMs (no cloud provider integration configured).

It is the combination of bare-metal + OVN that leads to this situation.

The issue is the kubelet doesn't select the expected interface address for the internal IP.

The kubelet is designed around a host with a single candidate address (per address family) for the internal IP.

comment #12 shows how the kubelet would do the selection

> IP: fe80::c06d:abff:fe70:9a09 is skipped because: nodeIP can't be a link-local unicast address
> IP: fe80::58c3:b9ff:fe32:3348 is skipped because: nodeIP can't be a link-local unicast address
> IP: fe80::f64e:de02:c198:b6db is skipped because: nodeIP can't be a link-local unicast address
> IP: fe80::f4d3:1aff:fe0b:5765 is skipped because: nodeIP can't be a link-local unicast address
> IP: fe80::78a9:45ff:fe87:487 is skipped because: nodeIP can't be a link-local unicast address
> IP: fe80::4839:9dff:fe04:25d3 is skipped because: nodeIP can't be a link-local unicast address
> IP: fe80::7cb5:37ff:fe42:b1e5 is skipped because: nodeIP can't be a link-local unicast address
> IP: fe80::e81d:e3ff:fe9e:f894 is skipped because: nodeIP can't be a link-local unicast address
> IP: fe80::9087:99ff:fe3c:3d3 is skipped because: nodeIP can't be a link-local unicast address
> IP: fe80::dcfa:a5ff:feb1:3fdf is skipped because: nodeIP can't be a link-local unicast address
> IP: fe80::c4d5:f2ff:fec1:1acf is skipped because: nodeIP can't be a link-local unicast address
> IP: fe80::9828:d0ff:feca:7068 is skipped because: nodeIP can't be a link-local unicast address
> IP is: 2620:52:0:60:946a:c6c1:950f:c7aa
> IP: 169.254.0.1 is skipped because: nodeIP can't be a link-local unicast address
> IP is: 10.128.2.2
> IP is: 10.0.97.10

comment #20 is correct

> And actually the more I think about it, I tell myself that the kubelet *cannot* be fixed. 
> 
> Go has no way of retrieving the default route without performing OS specific syscalls (which is why netlink can do that, but only compiles for linux). The kubelet cannot be built like that...and I think they were really counting on `net.LookupIP(node.Name)` not returning crazy stuff and thus just picking the first item in the list (which would have equated to the IP of the default route).
> 
> As seen in the code referenced just before: the kubelet takes the first IPv4 address it finds and assigns that IP to the InternalIP address. Thus 10.128.2.2 (in this example), which is the ovn-k8s-mp0 address. This is presumably because the interface index of ovn-k8s-mp0 is lower than br-ex.

comment #21

> kubelet *already* has a function that finds the IP corresponding to the default route; it's just that it only uses it if `LookupIP()` fails, so if you have "myhostname" configured in nsswitch.conf, it will never get used. (Go uses netlink in the standard library, it just doesn't expose it via any public APIs)

If OVN is going to create another interface on the host that has another candidate address for the internal IP, the --node-ip flag will have to be provided to the kubelet to disambiguate.
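
To make the selection concrete, here is a small Python sketch of the behaviour described in the quoted comments: link-local addresses are skipped and the first remaining IPv4 address wins. It is a simplified model, not the kubelet source; the function name `pick_node_ip` is made up, the IPv6 fallback is an assumption, and the input is a shortened copy of the list from comment #12.

```python
import ipaddress
from typing import List, Optional


def pick_node_ip(resolved: List[str]) -> Optional[str]:
    """Return the first usable IPv4 address, else the first usable IPv6 one."""
    first_v6 = None
    for text in resolved:
        ip = ipaddress.ip_address(text)
        # Mirrors "nodeIP can't be a link-local unicast address" in the log.
        if ip.is_link_local or ip.is_loopback:
            continue
        if ip.version == 4:
            return text
        if first_v6 is None:
            # Assumption: an IPv6 address is only used if no IPv4 is found.
            first_v6 = text
    return first_v6


# Shortened version of the address list shown in comment #12.
resolved = [
    "fe80::c06d:abff:fe70:9a09",
    "fe80::58c3:b9ff:fe32:3348",
    "2620:52:0:60:946a:c6c1:950f:c7aa",
    "169.254.0.1",
    "10.128.2.2",   # ovn-k8s-mp0
    "10.0.97.10",   # br-ex
]

chosen = pick_node_ip(resolved)
```

With this input the sketch picks 10.128.2.2, the ovn-k8s-mp0 address, simply because it appears before 10.0.97.10 in the list.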

Comment 35 Seth Jennings 2020-09-09 22:30:27 UTC
OVN is creating this new source of ambiguity.  OVN depended on cloud provider integration to resolve this ambiguity, but it is not always present i.e. bare-metal.  Routing to them.

Comment 36 Dan Winship 2020-09-10 15:04:49 UTC
> Ok, this bug has had a ton of noise.

Yeah...

> OVN is creating this new source of ambiguity.  OVN depended on cloud
> provider integration to resolve this ambiguity, but it is not always present
> i.e. bare-metal.  Routing to them.

Per comment #23, this type of openstack install uses the baremetal-cfg nodeip-finder to generate a `--node-ip` to pass to kubelet. The nodeip-finder code ought to be doing the right thing here and it is not clear why it is not (comment #26). It works fine on actual-bare-metal-via-dev-scripts. eg, rerunning nodeip-configuration.service after ovn-kubernetes is up shows:

    Parsed Virtual IP 192.168.111.5
    Checking whether address 192.168.111.23/24 br-ex contains VIP 192.168.111.5
    Address 192.168.111.23/24 br-ex contains VIP 192.168.111.5
    Checking whether address 169.254.0.1/20 ovn-k8s-gw0 contains VIP 192.168.111.5
    Checking whether address 172.22.0.35/24 enp1s0 contains VIP 192.168.111.5
    Checking whether address 10.131.0.2/23 ovn-k8s-mp0 contains VIP 192.168.111.5
    Checking whether address 127.0.0.1/8 lo contains VIP 192.168.111.5
    Chosen Node IP 192.168.111.23

(deleted all the IPv6 link-local address lines to make the output shorter). I'm not sure what order it's checking in, but it doesn't matter, since it looks at both br-ex and ovn-k8s-mp0, and sees that br-ex is correct and ovn-k8s-mp0 is not.
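
The check in that output can be sketched in a few lines of Python. This is only an illustration of the "does the address's subnet contain the VIP" test using the values logged above, not the actual baremetal-runtimecfg code; `choose_node_ip` is a made-up name.

```python
import ipaddress
from typing import List, Optional, Tuple


def choose_node_ip(vip: str, addresses: List[Tuple[str, str]]) -> Optional[str]:
    """Return the first address whose subnet contains the VIP."""
    target = ipaddress.ip_address(vip)
    for cidr, _iface in addresses:
        candidate = ipaddress.ip_interface(cidr)
        if target in candidate.network:
            return str(candidate.ip)
    return None


# Addresses and interfaces copied from the output above.
addresses = [
    ("192.168.111.23/24", "br-ex"),
    ("169.254.0.1/20", "ovn-k8s-gw0"),
    ("172.22.0.35/24", "enp1s0"),
    ("10.131.0.2/23", "ovn-k8s-mp0"),
    ("127.0.0.1/8", "lo"),
]

chosen = choose_node_ip("192.168.111.5", addresses)
chosen_reversed = choose_node_ip("192.168.111.5", list(reversed(addresses)))
```

Only the br-ex subnet contains the VIP, so the result is the same regardless of the order in which the interfaces are examined.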

@huirwang, we need to see what's happening with the nodeip-configuration.service on the reboot in your cluster that breaks things; what arguments it is being passed, and what it outputs. Also, what does "ip a" and "ip r" show on the node when it runs? If must-gather isn't functional you can log in via openstack console or something to get logs...

Comment 37 Martin André 2020-09-10 15:28:56 UTC
(In reply to Dan Winship from comment #36)
> Per comment #23, this type of openstack install uses the baremetal-cfg
> nodeip-finder to generate a `--node-ip` to pass to kubelet. The
> nodeip-finder code ought to be doing the right thing here and it is not
> clear why it is not (comment #26). It works fine on
> actual-bare-metal-via-dev-scripts.

Comment #23 is no longer relevant because it's not deploying on the OpenStack platform. From what I could tell, none of the static pods you'd normally find in OpenStack or BM deployments (keepalived, haproxy, coredns, ...) were running on those nodes. It didn't run the NM dispatcher scripts either.

Comment 38 Dan Williams 2020-09-10 15:53:38 UTC
(In reply to Seth Jennings from comment #35)
> OVN is creating this new source of ambiguity.  OVN depended on cloud
> provider integration to resolve this ambiguity, but it is not always present
> i.e. bare-metal.  Routing to them.

Seth, this is not an OVN/ovnkube issue, and ovnkube has been around since OCP 4.1 anyway. The logic of nodeip-finder is fundamentally wrong. OVN does not depend on cloud provider integration for anything. But it does require (like anything else, even openshift-sdn) that the platform provider is doing the right thing when it determines the node IP to pass to kubelet.

I don't know whether this is a Node bug or a Cloud provider bug or what, but it is *not* networking.

Comment 40 Dan Winship 2020-09-11 15:15:08 UTC
Ah, you are not running the nodeip-configuration service. Kubelet is being started without `--node-ip`, so it's running the default kubelet node-ip-detecting code, which is not able to deal with this.

nodeip-configuration gets run if MCO thinks the node is "baremetal", and that definitely happens if you do the "official" install procedure using dev-scripts. I'm not sure why you are getting the "base" kubelet config (https://github.com/openshift/machine-config-operator/blob/master/templates/master/01-master-kubelet/_base/units/kubelet.service.yaml) rather than the "baremetal" one (https://github.com/openshift/machine-config-operator/blob/master/templates/master/01-master-kubelet/baremetal/units/kubelet.service.yaml) in this cluster. Either you are configuring things wrong, or MCO is detecting things wrong...

So I guess over to MCO...

Comment 41 Kirsten Garrison 2020-09-11 19:17:46 UTC
Assigning Baremetal to take a look here

Comment 42 Ben Nemec 2020-09-11 20:21:43 UTC
This is being deployed as a None platform:

...
spec:
  cloudConfig:
    name: ""
  platformSpec:
    type: None
status:
  apiServerInternalURI: https://api-int.huirwang0911.qe.devcluster.openshift.com:6443
  apiServerURL: https://api.huirwang0911.qe.devcluster.openshift.com:6443
  etcdDiscoveryDomain: huirwang0911.qe.devcluster.openshift.com
  infrastructureName: huirwang0911-srh9h
  platform: None
  platformStatus:
    type: None

That's why our services aren't running. I don't believe our nodeip-configuration code will work without a platform, so if the None platform was intentional then another method would have to be used. If the None platform was not intentional then whatever caused the infrastructure object to be populated this way needs to be fixed.

Comment 43 Dan Winship 2020-09-11 21:30:16 UTC
I noticed that the install-config in this cluster said platform "none", but that's what the docs say to do for bare metal: https://docs.openshift.com/container-platform/4.5/installing/installing_bare_metal/installing-bare-metal.html#installation-bare-metal-config-yaml_installing-bare-metal

Is that wrong or is there something else that triggers a baremetal platform in the infrastructure object?

Comment 44 Ben Nemec 2020-09-14 18:23:56 UTC
Okay, I missed that this was UPI because of all the discussion of IPI components. My team only does IPI so I don't really know anything about UPI. I do know that if you install with UPI then you don't get any of our stuff because it's all dependent on baremetal (IPI) platform configuration options. It's possible it could be adapted for use with UPI, but I believe you'd need to talk to the installer team about that. They maintain baremetal UPI.

Comment 45 Abhinav Dahiya 2020-09-14 18:28:53 UTC
Moving to the networking team, as OVN net-interfaces are not something the installer team can triage or fix.

Comment 46 Dan Williams 2020-09-14 19:50:07 UTC
(In reply to Abhinav Dahiya from comment #45)
> Moving to networking team , as OVN net-interfaces are not something the
> installer team can triage or fix.

Seriously people. There is literally nothing that is networking specific about this. The *PLATFORM* in use must pass --node-ip to kubelet, and it must do something intelligent to figure out what the node's actual NIC is. The networking team does not do nor is it involved with host-level networking configuration.

It is not a networking team problem. I don't know if it's an installer problem, or what. But whatever platform is being used, even if it is none, MUST PASS A VALID --node-ip TO KUBELET.  And it's that platform's job to figure out what IP addresses are useful, because hey, it's the platform, and it knows how networking is configured for that deployment.  That has nothing to do with the SDN/OVN/whatever.

What kind of meeting do we need to call with everyone so that I may clearly explain this?

Comment 47 Dan Winship 2020-09-14 20:18:55 UTC
OK, so.

On bare metal IPI, we run the nodeip-configuration service, which figures out the right node IP to use and passes it to kubelet via --node-ip. However, this depends on having an apiserver VIP, which doesn't exist in the UPI case. We cannot just pass an arbitrary IP in the UPI case because the helper binary used by nodeip-configuration expects to find an interface that has a _direct_ route to the provided IP, not just a default route.

On bare metal UPI, kubelet tries to find the node IP like so:

  1. Was --node-ip passed? No
  2. Is the node name actually an IP address? No
  3. Does the node name resolve to an IP? YES!
  4. Find an IP on an interface with a default route. (Not reached because step 3 succeeded)

The problem with step 3, as diagnosed earlier, is that the node has the nss-myhostname plugin installed (this is standard in Fedora; presumably also RHEL/RHCOS?) and so when we try to resolve the hostname in step 3, it returns all of the node's IPs in interface number order, which includes the ovn-k8s-mp0 IP before the br-ex IP, and kubelet does not do any sanity checking of routes in this case, so it uses that bad IP. (If ovn-kubernetes had not fiddled with the node's network configuration then there would not be any other network interfaces with IPs on them at startup, and so kubelet would have picked the correct IP.)
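
The four steps can be sketched roughly as follows in Python. This is a simplified model of the order of checks listed above, not kubelet's code: step 3 just takes the first resolved address, and `find_node_ip`, the `lookup` callable, and the `default_route_ip` callable are hypothetical stand-ins.

```python
import ipaddress
from typing import Callable, List, Tuple


def find_node_ip(
    node_ip_flag: str,
    node_name: str,
    lookup: Callable[[str], List[str]],
    default_route_ip: Callable[[], str],
) -> Tuple[str, str]:
    """Return (step that decided, chosen IP) following the order above."""
    # 1. Was --node-ip passed?
    if node_ip_flag:
        return "flag", node_ip_flag
    # 2. Is the node name actually an IP address?
    try:
        ipaddress.ip_address(node_name)
        return "name-is-ip", node_name
    except ValueError:
        pass
    # 3. Does the node name resolve to an IP? (no route sanity check here)
    resolved = lookup(node_name)
    if resolved:
        return "lookup", resolved[0]
    # 4. Only reached when the lookup returns nothing.
    return "default-route", default_route_ip()


# Hypothetical node: nss-myhostname lists the ovn-k8s-mp0 address first.
def with_myhostname(name: str) -> List[str]:
    return ["10.128.2.2", "10.0.97.10"]


def without_myhostname(name: str) -> List[str]:
    return []


def route_ip() -> str:
    return "10.0.97.10"


upi_result = find_node_ip("", "compute-0", with_myhostname, route_ip)
fallback_result = find_node_ip("", "compute-0", without_myhostname, route_ip)
flag_result = find_node_ip("10.0.97.10", "compute-0", with_myhostname, route_ip)
```

In this model the default-route step gives the right answer but is only reached when name resolution returns nothing, which never happens once nss-myhostname is in play.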

So, some fixes

  1. We could add a new mode to runtimecfg node-ip to just figure out the best node IP
     in a global sense (ie not relative to the apiserver VIP), and make MCO call it in
     that mode on bare metal UPI and pass that --node-ip to kubelet. There is a small
     chance that this could result in the default node IP changing on UPI nodes where
     people are doing strange things with multiple default routes.

  2. We could try to fix ovs-configure.sh to set things up in such a way that br-ex ends
     up being created before ovn-k8s-mp0 on reboot, so that naively iterating interfaces
     will find the correct one first. This depends on internal details of OVS and
     NetworkManager and may not be possible.

       2a. We could make br-ex be transient so it didn't get automatically recreated by
           OVS on restart, but IIRC there's some reason to not do that which I don't
           remember.

  3. We could try uninstalling nss-myhostname, but I suspect we'd run into other problems on
     the system if we tried to remove it.

  4. We could have ovs-configuration.sh add a node name to IP mapping in /etc/hosts so that
     nss-myhostname would be bypassed and kubelet would get the right IP when it looked up
     the node name. (And other processes? It is possible that kubelet is not the only piece
     of software on the system that is being thwarted by the unexpected weird interaction
     of nss-myhostname and ovs-configuration.sh)

  5. We could fix kubelet to look more carefully at the results of `net.LookupIP(node.Name)`
     and require/prefer an IP on the interface with the default route. _Theoretically_ this is
     an incompatible behavior change and people might object to it, but it seems unlikely.
     This may be worth doing even if we also do one of the other options.

  6. We could say that users need to manually configure the node IP (somehow) when doing bare
     metal UPI + ovn-kubernetes.

Comment 48 Dan Winship 2020-09-14 20:20:31 UTC
I didn't actually mean to click "undo all of dcbw's change" but at any rate, if it's not Networking it's MCO, not Installer anyway

Comment 49 Dan Williams 2020-09-14 21:24:28 UTC
(In reply to Dan Winship from comment #47)
> OK, so.
> 
> On bare metal IPI, we run the nodeip-configuration service, which figures
> out the right node IP to use and passes it to kubelet via --node-ip.
> However, this depends on having an apiserver VIP, which doesn't exist in the
> UPI case. We cannot just pass an arbitrary IP in the UPI case because the
> helper binary used by nodeip-configuration expects to find an interface that
> has a _direct_ route to the provided IP, not just a default route.
> 
> On bare metal UPI, kubelet tries to find the node IP like so:
> 
>   1. Was --node-ip passed? No
>   2. Is the node name actually an IP address? No
>   3. Does the node name resolve to an IP? YES!
>   4. Find an IP on an interface with a default route. (Not reached because
> step 3 succeeded)
> 
> The problem with step 3, as diagnosed earlier, is that the node has the
> nss-myhostname plugin installed (this is standard in Fedora; presumably also
> RHEL/RHCOS?) and so when we try to resolve the hostname in step 3, it
> returns all of the node's IPs in interface number order, which includes the
> ovn-k8s-mp0 IP before the br-ex IP, and kubelet does not do any sanity
> checking of routes in this case, so it uses that bad IP. (If ovn-kubernetes
> had not fiddled with the node's network configuration then there would not
> be any other network interfaces with IPs on them at startup, and so kubelet
> would have picked the correct IP.)

Nothing can expect interfaces to be in any specific order. Ever. It doesn't matter if ovn-kubernetes fiddles with them, or if some other magic VPN the customer wants creates VPN tunnels as part of startup, or if for whatever reason they run IPsec and that creates a magic interface too.

> So, some fixes
> 
>   1. We could add a new mode to runtimecfg node-ip to just figure out the
> best node IP
>      in a global sense (ie not relative to the apiserver VIP), and make MCO
> call it in
>      that mode on bare metal UPI and pass that --node-ip to kubelet. There
> is a small
>      chance that this could result in the default node IP changing on UPI
> nodes where
>      people are doing strange things with multiple default routes.

In all cases that don't have a cloud provider or external heavily-managed DNS the correct auto-detect approach is "the IP of whatever interface has the default route". That's where kubelet falls down, because UPI simply doesn't have the heavily-managed DNS infrastructure.

For cloud providers that have a more nuanced idea of internal/external/DNS/etc it makes sense to do a DNS lookup, because the cloud controls DNS and you need to take what the cloud wants as the node's identity/IP.
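
For the non-cloud case, "the IP of whatever interface has the default route" could be sketched like this in Python. It is an illustration only: the function name is made up, the route lines imitate `ip -4 route` output, and the gateway, metrics, and addresses are hypothetical values, not taken from the affected cluster.

```python
from typing import Dict, List, Optional, Tuple


def default_route_ip(route_lines: List[str], addrs_by_iface: Dict[str, List[str]]) -> Optional[str]:
    """Pick the first address on the interface carrying the best default route."""
    best: Optional[Tuple[int, str]] = None  # (metric, interface)
    for line in route_lines:
        fields = line.split()
        if not fields or fields[0] != "default" or "dev" not in fields[:-1]:
            continue
        iface = fields[fields.index("dev") + 1]
        metric = 0
        if "metric" in fields[:-1]:
            metric = int(fields[fields.index("metric") + 1])
        # Lower metric means a more preferred route.
        if best is None or metric < best[0]:
            best = (metric, iface)
    if best is None:
        return None
    ips = addrs_by_iface.get(best[1], [])
    return ips[0] if ips else None


# Hypothetical routing table and address list for a node running OVN.
routes = [
    "default via 10.0.96.1 dev br-ex proto dhcp metric 100",
    "10.0.96.0/19 dev br-ex proto kernel scope link src 10.0.97.10 metric 100",
    "10.128.0.0/14 via 10.128.2.1 dev ovn-k8s-mp0",
]
addrs = {
    "ovn-k8s-mp0": ["10.128.2.2"],
    "br-ex": ["10.0.97.10"],
}

node_ip = default_route_ip(routes, addrs)
```

Because the choice is driven by the routing table rather than by interface order, the extra ovn-k8s-mp0 address does not affect the result.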

>   2. We could try to fix ovs-configure.sh to set things up in such a way
> that br-ex ends
>      up being created before ovn-k8s-mp0 on reboot, so that naively
> iterating interfaces
>      will find the correct one first. This depends on internal details of
> OVS and
>      NetworkManager and may not be possible.

Nope. This has nothing to do with ovs-configure.sh and is not its responsibility.

>        2a. We could make br-ex be transient so it didn't get automatically
> recreated by
>            OVS on restart, but IIRC there's some reason to not do that which
> I don't
>            remember.

Still not the problem. Interfaces come and go and nothing can rely on their ordering, ever.

>   3. We could try uninstalling nss-myhostname, but I suspect we'd run into
> other problems on
>      the system if we tried to remove it.
> 
>   4. We could have ovs-configuration.sh add a node name to IP mapping in
> /etc/hosts so that
>      nss-myhostname would be bypassed and kubelet would get the right IP
> when it looked up
>      the node name. (And other processes? It is possible that kubelet is not
> the only piece
>      of software on the system that is being thwarted by the unexpected
> weird interaction
>      of nss-myhostname and ovs-configuration.sh)

Nope, still not ovs-configure.sh's problem. If the platform (even if it's "none") doesn't set the machine up correctly, we should not be working around that.

>   5. We could fix kubelet to look more carefully at the results of
> `net.LookupIP(node.Name)`
>      and require/prefer an IP on the interface with the default route.
> _Theoretically_ this is
>      an incompatible behavior change and people might object to it, but it
> seems unlikely.
>      This may be worth doing even if we also do one of the other options.

Possibly, yes, because kubelet is the thing that actually needs the right information. But as you say, I'm pretty sure upstream kubelet won't care much about non-cloud-provider cases at large scale and will just punt back to making the machine itself be set up correctly. We can try though.

>   6. We could say that users need to manually configure the node IP
> (somehow) when doing bare
>      metal UPI + ovn-kubernetes.

Seems like scripting should really be doing this for us, eg #1, a quasi-cloud-provider thing that does half what a cloud provider does, but assumes no external intelligence.

Comment 50 Dan Williams 2020-09-14 21:25:55 UTC
To be clear, my vote is Winship's #1 option; runtimecfg node-ip.

Comment 54 Dan Winship 2020-09-15 12:54:31 UTC
> Nothing can expect interfaces to be in any specific order. Ever.

Unless there's only one of them. Until we landed shared gateway mode, it was guaranteed that a node that only had one interface could just run kubelet without needing to override --node-ip, so...

> This has nothing to do with ovs-configure.sh

...I 100% disagree with that.


All that said, it turns out that nss-myhostname *doesn't* simply return the addresses in interface order:

       ·   The local, configured hostname is resolved to all locally configured IP addresses
           ordered by their scope

We are claiming that the IP address on ovn-k8s-mp0 has global scope, so nss-myhostname considers it valid to return as the node's primary IP, which seems not implausible.

Though OTOH it seems like nobody uses scope correctly... on my laptop the libvirt bridge and the VPN tunnel both also have "scope global"
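
A small Python sketch of why ordering by scope does not help here. It assumes addresses sort by the numeric kernel scope (0 is global, 253 is link, 254 is host) and, as a further assumption not stated in the man page excerpt above, that ties fall back to interface index; the interface indexes and addresses are hypothetical.

```python
from typing import List, NamedTuple

# Numeric values of the kernel's address scopes.
SCOPE_GLOBAL = 0
SCOPE_LINK = 253
SCOPE_HOST = 254


class Addr(NamedTuple):
    iface: str
    ifindex: int
    ip: str
    scope: int


def order_by_scope(addrs: List[Addr]) -> List[str]:
    """Sort by scope first; the ifindex tie-break is an assumption."""
    return [a.ip for a in sorted(addrs, key=lambda a: (a.scope, a.ifindex))]


# Hypothetical node: both "real" candidates claim global scope.
addrs = [
    Addr("lo", 1, "127.0.0.1", SCOPE_HOST),
    Addr("br-ex", 7, "10.0.97.10", SCOPE_GLOBAL),
    Addr("ovn-k8s-mp0", 5, "10.128.2.2", SCOPE_GLOBAL),
]

ordered = order_by_scope(addrs)
```

Since both addresses are marked global, scope cannot distinguish them, and whatever tie-break applies ends up deciding which one is listed first.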

Comment 55 Dan Williams 2020-09-15 13:19:34 UTC
(In reply to Dan Winship from comment #54)
> > Nothing can expect interfaces to be in any specific order. Ever.
> 
> Unless there's only one of them. Until we landed shared gateway mode, it was
> guaranteed that a node that only had one interface could just run kubelet
> without needing to override --node-ip, so...

"Guaranteed" because it mostly worked this way before, even though something was doing the wrong thing, is not really guaranteed. It's "by accident it worked this way in the past".

> > This has nothing to do with ovs-configure.sh
> 
> ...I 100% disagree with that.

It may be exposed by ovs-configure.sh, but the problem is 100% not the fault of the network configuration done there, precisely because ovs-configure.sh is not the only thing that touches host networking. Anything else can also do that, especially in UPI where the customer is free to run whatever they want on the host, including VMs, tunnels, whatever. Any of those things may also confuse nss-myhostname.

> All that said, it turns out that nss-myhostname *doesn't* simply return the
> addresses in interface order:
> 
>        ·   The local, configured hostname is resolved to all locally
> configured IP addresses
>            ordered by their scope
> 
> We are claiming that the IP address on ovn-k8s-mp0 has global scope, so
> nss-myhostname considers it valid to return as the node's primary IP, which
> seems not implausible.

Sure, we should fix that. But...

> Though OTOH it seems like nobody uses scope correctly... on my laptop the
> libvirt bridge and the VPN tunnel both also have "scope global"

I can guarantee that nobody uses it correctly, as you've found. I appreciate the attempt by nss-myhostname to bring some order to the chaos, but it's a long road and stuff gets updated incrementally.

Even if we update ovs-configure.sh and ovn-kubernetes *and* openshift-sdn to set the right scope on tun0/mp0/gw0/etc everything else in the world sets the wrong scope and we'll *still* have this problem on UPI whenever anyone does anything custom to the machine's networking.

This can all be avoided by just *doing the right thing* on UPI by using the IP addresses of the NIC that has the default route, and let the administrator override it explicitly if there are multiple NICs in the machine.

Comment 57 Dan Winship 2020-09-15 13:25:09 UTC
Everyone agrees that on UPI machines with "complicated" networking configurations, the administrator is going to have to configure stuff. I am arguing specifically that if you have a machine with a _trivial_ network configuration, it used to work without needing any further manual tweaking, and now we have broken that. Saying "it will break if anyone else does anything custom" is irrelevant because I am specifically only talking about the case where no one did anything custom.

Comment 58 Dan Williams 2020-09-15 13:58:29 UTC
(In reply to Dan Winship from comment #57)
> Everyone agrees that on UPI machines with "complicated" networking
> configurations, the administrator is going to have to configure stuff. I am
> arguing specifically that if you have a machine with a _trivial_ network
> configuration, it used to work without needing any further manual tweaking,
> and now we have broken that. Saying "it will break if anyone else does
> anything custom" is irrelevant because I am specifically only talking about
> the case where no one did anything custom.

Disagree. The whole point of UPI is *because* you want to do custom things and heavily manage the OS. Otherwise you would use RHCOS.

Comment 59 Dan Williams 2020-09-15 14:00:24 UTC
"Otherwise you would use RHCOS and metal. Not "none" platform."

Or is that not what we expect?

Comment 61 Dan Winship 2020-09-16 10:08:50 UTC
The merged PR is not a complete fix (though it may unblock testing by letting you manually override the node IP by setting KUBELET_NODE_IP=... in /etc/kubernetes/kubelet-env).

Comment 62 Sunil Choudhary 2020-09-18 07:39:16 UTC
Hello Dan,

After setting KUBELET_NODE_IP in /etc/kubernetes/kubelet-env and restarting kubelet, do we need any other steps? I tried this but kubelet is not picking up this IP.

# cat /etc/kubernetes/kubelet-env 
KUBELET_NODE_IP="10.0.98.97"

Restarted kubelet service.

F   UID     PID    PPID PRI  NI    VSZ   RSS WCHAN  STAT TTY        TIME COMMAND
4     0  375383       1  20   0 2545984 157636 -    Ssl  ?          0:29 kubelet --config=/etc/kubernetes/kubelet.conf --bootstrap-kubeconfig=/etc/kubernetes/kubeconfig --kubeconfig=/var/lib/kubelet/kubeconfig --container-runtime=remote --container-runtime-endpoint=/var/run/crio/crio.sock --runtime-cgroups=/system.slice/crio.service --node-labels=node-role.kubernetes.io/worker,node.openshift.io/os_id=rhcos --node-ip=${KUBELET_NODE_IP:-} --minimum-container-ttl-duration=6m0s --volume-plugin-dir=/etc/kubernetes/kubelet-plugins/volume/exec --cloud-provider= --pod-infra-container-image=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:70a88050d9c755137378b8a8618bb42b35e78397b9d044986aa3ba72aae97077 --v=4

# journalctl -u kubelet | grep -i "node_ip"
Sep 18 02:56:49 wsun09181ci-6ncwv-compute-0 hyperkube[1785]: I0918 02:56:49.649436    1785 flags.go:59] FLAG: --node-ip="${KUBELET_NODE_IP:-}"
Sep 18 03:12:44 wsun09181ci-6ncwv-compute-0 hyperkube[1742]: I0918 03:12:44.846087    1742 flags.go:59] FLAG: --node-ip="${KUBELET_NODE_IP:-}"
Sep 18 03:22:08 wsun09181ci-6ncwv-compute-0 hyperkube[1737]: I0918 03:22:08.735134    1737 flags.go:59] FLAG: --node-ip="${KUBELET_NODE_IP:-}"
Sep 18 06:57:05 wsun09181ci-6ncwv-compute-0 hyperkube[338775]: I0918 06:57:05.590494  338775 flags.go:59] FLAG: --node-ip="${KUBELET_NODE_IP:-}"
Sep 18 06:58:07 wsun09181ci-6ncwv-compute-0 hyperkube[340344]: I0918 06:58:07.113976  340344 flags.go:59] FLAG: --node-ip="${KUBELET_NODE_IP:-}"
Sep 18 06:58:23 wsun09181ci-6ncwv-compute-0 hyperkube[340748]: I0918 06:58:23.770677  340748 flags.go:59] FLAG: --node-ip="${KUBELET_NODE_IP:-}"
Sep 18 07:00:20 wsun09181ci-6ncwv-compute-0 hyperkube[344096]: I0918 07:00:20.339392  344096 flags.go:59] FLAG: --node-ip="${KUBELET_NODE_IP:-}"
Sep 18 07:00:29 wsun09181ci-6ncwv-compute-0 hyperkube[344221]: I0918 07:00:29.352124  344221 flags.go:59] FLAG: --node-ip="${KUBELET_NODE_IP:-}"
Sep 18 07:06:11 wsun09181ci-6ncwv-compute-0 hyperkube[353937]: I0918 07:06:11.828694  353937 flags.go:59] FLAG: --node-ip="${KUBELET_NODE_IP:-}"
Sep 18 07:17:48 wsun09181ci-6ncwv-compute-0 hyperkube[373719]: I0918 07:17:48.370699  373719 flags.go:59] FLAG: --node-ip="${KUBELET_NODE_IP:-}"
Sep 18 07:18:51 wsun09181ci-6ncwv-compute-0 hyperkube[375383]: I0918 07:18:51.079080  375383 flags.go:59] FLAG: --node-ip="${KUBELET_NODE_IP:-}"

Comment 63 Dan Winship 2020-09-18 10:25:01 UTC
Oops, I'm an idiot. It doesn't currently work.

Comment 65 Dan Winship 2020-09-23 15:06:26 UTC
(In reply to Sunil Choudhary from comment #62)
> After setting KUBELET_NODE_IP in /etc/kubernetes/kubelet-env and restarting
> kubelet, do we need any other steps? I tried this but kubelet is not picking
> up this IP.

For reasons I don't understand, `/etc/kubernetes/kubelet-env` does not actually seem to get read by the service, but it works (with the latest code, not the earlier version) if you create an additional file. eg:

  cat > /etc/systemd/system/kubelet.service.d/80-nodeip.conf
  [Service]
  Environment=KUBELET_NODE_IP="10.0.98.97"

