Bug 2073475

Summary: (OVN Azure EgressIP) egressIP is not assigned to node after the node has k8s.ovn.org/egress-assignable enabled
Product: OpenShift Container Platform Reporter: jechen <jechen>
Component: Networking Assignee: Periyasamy Palanisamy <pepalani>
Networking sub component: ovn-kubernetes QA Contact: Dan Brahaney <dbrahane>
Status: CLOSED DUPLICATE Docs Contact:
Severity: low    
Priority: low CC: akaris, ffernand, pepalani
Version: 4.10 Flags: jechen: needinfo-
Target Milestone: ---   
Target Release: 4.10.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2022-04-14 08:48:58 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 2072439, 2075444    
Bug Blocks:    

Description jechen 2022-04-08 14:57:11 UTC
Description of problem:
The EgressIP is not assigned to a node after the node is labeled with k8s.ovn.org/egress-assignable.

Version-Release number of selected component (if applicable):
$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2022-04-08-061429   True        False         33m     Cluster version is 4.10.0-0.nightly-2022-04-08-061429


How reproducible:


Steps to Reproduce:
1.
$ oc get node
NAME                                 STATUS   ROLES    AGE   VERSION
jechen-0408a-2w6ds-master-0          Ready    master   67m   v1.23.5+9ce5071
jechen-0408a-2w6ds-master-1          Ready    master   67m   v1.23.5+9ce5071
jechen-0408a-2w6ds-master-2          Ready    master   67m   v1.23.5+9ce5071
jechen-0408a-2w6ds-worker-westus-1   Ready    worker   50m   v1.23.5+9ce5071
jechen-0408a-2w6ds-worker-westus-2   Ready    worker   50m   v1.23.5+9ce5071
jechen-0408a-2w6ds-worker-westus-3   Ready    worker   50m   v1.23.5+9ce5071


$ oc label node jechen-0408a-2w6ds-worker-westus-1 "k8s.ovn.org/egress-assignable"=""

$ oc describe node jechen-0408a-2w6ds-worker-westus-1
Name:               jechen-0408a-2w6ds-worker-westus-1
Roles:              worker
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=Standard_D4s_v3
                    beta.kubernetes.io/os=linux
                    failure-domain.beta.kubernetes.io/region=westus
                    failure-domain.beta.kubernetes.io/zone=0
                    k8s.ovn.org/egress-assignable=
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=jechen-0408a-2w6ds-worker-westus-1
                    kubernetes.io/os=linux
                    node-role.kubernetes.io/worker=
                    node.kubernetes.io/instance-type=Standard_D4s_v3
                    node.openshift.io/os_id=rhcos
                    topology.disk.csi.azure.com/zone=
                    topology.kubernetes.io/region=westus
                    topology.kubernetes.io/zone=0
Annotations:        cloud.network.openshift.io/egress-ipconfig:
                      [{"interface":"jechen-0408a-2w6ds-worker-westus-1-nic","ifaddr":{"ipv4":"10.0.0.0/16"},"capacity":{"ip":255}}]
                    csi.volume.kubernetes.io/nodeid: {"disk.csi.azure.com":"jechen-0408a-2w6ds-worker-westus-1"}
                    k8s.ovn.org/host-addresses: ["10.0.1.5"]
                    k8s.ovn.org/l3-gateway-config:
                      {"default":{"mode":"shared","interface-id":"br-ex_jechen-0408a-2w6ds-worker-westus-1","mac-address":"00:0d:3a:59:e7:16","ip-addresses":["1...
                    k8s.ovn.org/node-chassis-id: 92e77aa2-8d56-423c-9a2f-08488cc80c7f
                    k8s.ovn.org/node-mgmt-port-mac-address: 66:fa:c8:b9:e3:3f
                    k8s.ovn.org/node-primary-ifaddr: {"ipv4":"10.0.1.5/24"}
                    k8s.ovn.org/node-subnets: {"default":"10.129.2.0/23"}
                    machineconfiguration.openshift.io/controlPlaneTopology: HighlyAvailable
                    machineconfiguration.openshift.io/currentConfig: rendered-worker-8bf94a8c88bb4b4a97f8d8a7e42058e6
                    machineconfiguration.openshift.io/desiredConfig: rendered-worker-8bf94a8c88bb4b4a97f8d8a7e42058e6
                    machineconfiguration.openshift.io/reason: 
                    machineconfiguration.openshift.io/state: Done
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Fri, 08 Apr 2022 09:09:16 -0400
Taints:             <none>
Unschedulable:      false
Lease:
  HolderIdentity:  jechen-0408a-2w6ds-worker-westus-1
  AcquireTime:     <unset>
  RenewTime:       Fri, 08 Apr 2022 10:00:30 -0400
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  MemoryPressure   False   Fri, 08 Apr 2022 09:57:12 -0400   Fri, 08 Apr 2022 09:09:16 -0400   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Fri, 08 Apr 2022 09:57:12 -0400   Fri, 08 Apr 2022 09:09:16 -0400   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Fri, 08 Apr 2022 09:57:12 -0400   Fri, 08 Apr 2022 09:09:16 -0400   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            True    Fri, 08 Apr 2022 09:57:12 -0400   Fri, 08 Apr 2022 09:10:18 -0400   KubeletReady                 kubelet is posting ready status
Addresses:
  Hostname:    jechen-0408a-2w6ds-worker-westus-1
  InternalIP:  10.0.1.5
Capacity:
  attachable-volumes-azure-disk:  8
  cpu:                            4
  ephemeral-storage:              133682156Ki
  hugepages-1Gi:                  0
  hugepages-2Mi:                  0
  memory:                         16409428Ki
  pods:                           250
Allocatable:
  attachable-volumes-azure-disk:  8
  cpu:                            3500m
  ephemeral-storage:              123201474766
  hugepages-1Gi:                  0
  hugepages-2Mi:                  0
  memory:                         15258452Ki
  pods:                           250
System Info:
  Machine ID:                             475e16d795364b38b7f71a6b4767a2ba
  System UUID:                            148dbcd6-3c8d-8748-bcb8-d52b6281ebff
  Boot ID:                                b7f04b6b-2f24-41b4-8410-91ec810e4498
  Kernel Version:                         4.18.0-305.40.2.el8_4.x86_64
  OS Image:                               Red Hat Enterprise Linux CoreOS 410.84.202204050541-0 (Ootpa)
  Operating System:                       linux
  Architecture:                           amd64
  Container Runtime Version:              cri-o://1.23.2-4.rhaos4.10.git9ef73d4.el8
  Kubelet Version:                        v1.23.5+9ce5071
  Kube-Proxy Version:                     v1.23.5+9ce5071
ProviderID:                               azure:///subscriptions/53b8f551-f0fc-4bea-8cba-6d1fefd54c8a/resourceGroups/jechen-0408a-2w6ds-rg/providers/Microsoft.Compute/virtualMachines/jechen-0408a-2w6ds-worker-westus-1
Non-terminated Pods:                      (18 in total)
  Namespace                               Name                                   CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
  ---------                               ----                                   ------------  ----------  ---------------  -------------  ---
  openshift-cluster-csi-drivers           azure-disk-csi-driver-node-shbc5       30m (0%)      0 (0%)      150Mi (1%)       0 (0%)         51m
  openshift-cluster-node-tuning-operator  tuned-n5nsg                            10m (0%)      0 (0%)      50Mi (0%)        0 (0%)         51m
  openshift-dns                           dns-default-xr6l5                      60m (1%)      0 (0%)      110Mi (0%)       0 (0%)         50m
  openshift-dns                           node-resolver-mlgbv                    5m (0%)       0 (0%)      21Mi (0%)        0 (0%)         51m
  openshift-image-registry                image-registry-5f4bf4959c-8m8gk        100m (2%)     0 (0%)      256Mi (1%)       0 (0%)         59m
  openshift-image-registry                node-ca-qgdbs                          10m (0%)      0 (0%)      10Mi (0%)        0 (0%)         51m
  openshift-ingress-canary                ingress-canary-kz644                   10m (0%)      0 (0%)      20Mi (0%)        0 (0%)         50m
  openshift-machine-config-operator       machine-config-daemon-929dl            40m (1%)      0 (0%)      100Mi (0%)       0 (0%)         51m
  openshift-monitoring                    alertmanager-main-0                    9m (0%)       0 (0%)      120Mi (0%)       0 (0%)         44m
  openshift-monitoring                    grafana-7d5f5f9bf9-6rhjx               6m (0%)       0 (0%)      99Mi (0%)        0 (0%)         49m
  openshift-monitoring                    node-exporter-8lvr8                    9m (0%)       0 (0%)      47Mi (0%)        0 (0%)         51m
  openshift-monitoring                    prometheus-k8s-0                       100m (2%)     0 (0%)      1104Mi (7%)      0 (0%)         49m
  openshift-monitoring                    thanos-querier-5c867c6cb4-8jnps        15m (0%)      0 (0%)      92Mi (0%)        0 (0%)         49m
  openshift-multus                        multus-additional-cni-plugins-57gcc    10m (0%)      0 (0%)      10Mi (0%)        0 (0%)         51m
  openshift-multus                        multus-cpr4v                           10m (0%)      0 (0%)      65Mi (0%)        0 (0%)         51m
  openshift-multus                        network-metrics-daemon-rdh9g           20m (0%)      0 (0%)      120Mi (0%)       0 (0%)         51m
  openshift-network-diagnostics           network-check-target-vdt6z             10m (0%)      0 (0%)      15Mi (0%)        0 (0%)         51m
  openshift-ovn-kubernetes                ovnkube-node-gw5z7                     55m (1%)      0 (0%)      680Mi (4%)       0 (0%)         51m
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource                       Requests      Limits
  --------                       --------      ------
  cpu                            509m (14%)    0 (0%)
  memory                         3069Mi (20%)  0 (0%)
  ephemeral-storage              0 (0%)        0 (0%)
  hugepages-1Gi                  0 (0%)        0 (0%)
  hugepages-2Mi                  0 (0%)        0 (0%)
  attachable-volumes-azure-disk  0             0
Events:                          <none>


2.
$ cat ./SDN-1332-test/config_egressip1_ovn_ns_team_red_azure.yaml
apiVersion: k8s.ovn.org/v1
kind: EgressIP
metadata:
  name: egressip1
spec:
  egressIPs:
  - 10.0.0.101
  namespaceSelector:
    matchLabels:
      team: red 

$ oc create -f ./SDN-1332-test/config_egressip1_ovn_ns_team_red_azure.yaml 
egressip.k8s.ovn.org/egressip1 created


3.
$ oc get egressip
NAME        EGRESSIPS    ASSIGNED NODE   ASSIGNED EGRESSIPS
egressip1   10.0.0.101                   
(The same command was repeated 15 more times with identical output: no assigned node.)


Actual results:
The egressip object still has no assigned node after a long wait.

Expected results:
The egressip object should have a node assigned.

Additional info:

must-gather is uploaded to https://drive.google.com/file/d/1DYaqvWWFQ2cfJDgULSwwhDziCh6PQSUS/view?usp=sharing

Comment 1 jechen 2022-04-08 15:06:02 UTC
$ oc logs -n openshift-cloud-network-config-controller -l app=cloud-network-config-controller
I0408 14:03:18.977712       1 cloudprivateipconfig_controller.go:271] CloudPrivateIPConfig: "10.0.0.101" will be added to node: "jechen-0408a-2w6ds-worker-westus-1"
E0408 14:03:19.243000       1 controller.go:165] error syncing '10.0.0.101': error assigning CloudPrivateIPConfig: "10.0.0.101" to node: "jechen-0408a-2w6ds-worker-westus-1", err: network.InterfacesClient#CreateOrUpdate: Failure sending request: StatusCode=0 -- Original Error: Code="PrivateIPAddressNotInSubnet" Message="Private static IP address 10.0.0.101 does not belong to the range of subnet prefix 10.0.1.0/24." Details=[], requeuing in cloud-private-ip-config workqueue
I0408 14:03:39.729493       1 cloudprivateipconfig_controller.go:271] CloudPrivateIPConfig: "10.0.0.101" will be added to node: "jechen-0408a-2w6ds-worker-westus-1"
E0408 14:03:40.017498       1 controller.go:165] error syncing '10.0.0.101': error assigning CloudPrivateIPConfig: "10.0.0.101" to node: "jechen-0408a-2w6ds-worker-westus-1", err: network.InterfacesClient#CreateOrUpdate: Failure sending request: StatusCode=0 -- Original Error: Code="PrivateIPAddressNotInSubnet" Message="Private static IP address 10.0.0.101 does not belong to the range of subnet prefix 10.0.1.0/24." Details=[], requeuing in cloud-private-ip-config workqueue
I0408 14:04:20.992426       1 cloudprivateipconfig_controller.go:271] CloudPrivateIPConfig: "10.0.0.101" will be added to node: "jechen-0408a-2w6ds-worker-westus-1"
E0408 14:04:21.313469       1 controller.go:165] error syncing '10.0.0.101': error assigning CloudPrivateIPConfig: "10.0.0.101" to node: "jechen-0408a-2w6ds-worker-westus-1", err: network.InterfacesClient#CreateOrUpdate: Failure sending request: StatusCode=0 -- Original Error: Code="PrivateIPAddressNotInSubnet" Message="Private static IP address 10.0.0.101 does not belong to the range of subnet prefix 10.0.1.0/24." Details=[], requeuing in cloud-private-ip-config workqueue
I0408 14:05:43.239223       1 cloudprivateipconfig_controller.go:271] CloudPrivateIPConfig: "10.0.0.101" will be added to node: "jechen-0408a-2w6ds-worker-westus-1"
E0408 14:05:43.517869       1 controller.go:165] error syncing '10.0.0.101': error assigning CloudPrivateIPConfig: "10.0.0.101" to node: "jechen-0408a-2w6ds-worker-westus-1", err: network.InterfacesClient#CreateOrUpdate: Failure sending request: StatusCode=0 -- Original Error: Code="PrivateIPAddressNotInSubnet" Message="Private static IP address 10.0.0.101 does not belong to the range of subnet prefix 10.0.1.0/24." Details=[], requeuing in cloud-private-ip-config workqueue
I0408 14:08:27.364927       1 cloudprivateipconfig_controller.go:271] CloudPrivateIPConfig: "10.0.0.101" will be added to node: "jechen-0408a-2w6ds-worker-westus-1"
I0408 14:08:28.317484       1 controller.go:160] Dropping key '10.0.0.101' from the cloud-private-ip-config workqueue
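
The error above suggests the requested egress IP lies outside the subnet Azure associates with the node's NIC, even though it falls inside the range advertised in the node's egress-ipconfig annotation. A minimal Python sketch of that check follows, using only values copied from this report. The helper name ip_in_subnet is illustrative and is not part of any OpenShift tooling.

```python
import ipaddress


def ip_in_subnet(ip: str, cidr: str) -> bool:
    """Return True if ip belongs to the network described by cidr."""
    return ipaddress.ip_address(ip) in ipaddress.ip_network(cidr, strict=False)


egress_ip = "10.0.0.101"       # spec.egressIPs entry from egressip1
advertised = "10.0.0.0/16"     # ifaddr.ipv4 in the egress-ipconfig annotation
azure_subnet = "10.0.1.0/24"   # subnet prefix quoted in the Azure error message

# The IP is valid for the advertised range but not for the NIC's real subnet.
print(ip_in_subnet(egress_ip, advertised))    # True
print(ip_in_subnet(egress_ip, azure_subnet))  # False
```

Under this reading, an address such as 10.0.1.101 would pass both checks, though whether Azure would accept it on this cluster was not tested in this report.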

$ oc get nodes -o wide
NAME                                 STATUS   ROLES    AGE    VERSION           INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                                                        KERNEL-VERSION                 CONTAINER-RUNTIME
jechen-0408a-2w6ds-master-0          Ready    master   133m   v1.23.5+9ce5071   10.0.0.8      <none>        Red Hat Enterprise Linux CoreOS 410.84.202204050541-0 (Ootpa)   4.18.0-305.40.2.el8_4.x86_64   cri-o://1.23.2-4.rhaos4.10.git9ef73d4.el8
jechen-0408a-2w6ds-master-1          Ready    master   133m   v1.23.5+9ce5071   10.0.0.7      <none>        Red Hat Enterprise Linux CoreOS 410.84.202204050541-0 (Ootpa)   4.18.0-305.40.2.el8_4.x86_64   cri-o://1.23.2-4.rhaos4.10.git9ef73d4.el8
jechen-0408a-2w6ds-master-2          Ready    master   133m   v1.23.5+9ce5071   10.0.0.6      <none>        Red Hat Enterprise Linux CoreOS 410.84.202204050541-0 (Ootpa)   4.18.0-305.40.2.el8_4.x86_64   cri-o://1.23.2-4.rhaos4.10.git9ef73d4.el8
jechen-0408a-2w6ds-worker-westus-1   Ready    worker   115m   v1.23.5+9ce5071   10.0.1.5      <none>        Red Hat Enterprise Linux CoreOS 410.84.202204050541-0 (Ootpa)   4.18.0-305.40.2.el8_4.x86_64   cri-o://1.23.2-4.rhaos4.10.git9ef73d4.el8
jechen-0408a-2w6ds-worker-westus-2   Ready    worker   115m   v1.23.5+9ce5071   10.0.1.6      <none>        Red Hat Enterprise Linux CoreOS 410.84.202204050541-0 (Ootpa)   4.18.0-305.40.2.el8_4.x86_64   cri-o://1.23.2-4.rhaos4.10.git9ef73d4.el8
jechen-0408a-2w6ds-worker-westus-3   Ready    worker   115m   v1.23.5+9ce5071   10.0.1.4      <none>        Red Hat Enterprise Linux CoreOS 410.84.202204050541-0 (Ootpa)   4.18.0-305.40.2.el8_4.x86_64   cri-o://1.23.2-4.rhaos4.10.git9ef73d4.el8

$ oc get nodes -o yaml | grep egress
      cloud.network.openshift.io/egress-ipconfig: '[{"interface":"jechen-0408a-2w6ds-master-0-nic","ifaddr":{"ipv4":"10.0.0.0/16"},"capacity":{"ip":255}}]'
      cloud.network.openshift.io/egress-ipconfig: '[{"interface":"jechen-0408a-2w6ds-master-1-nic","ifaddr":{"ipv4":"10.0.0.0/16"},"capacity":{"ip":255}}]'
      cloud.network.openshift.io/egress-ipconfig: '[{"interface":"jechen-0408a-2w6ds-master-2-nic","ifaddr":{"ipv4":"10.0.0.0/16"},"capacity":{"ip":255}}]'
      cloud.network.openshift.io/egress-ipconfig: '[{"interface":"jechen-0408a-2w6ds-worker-westus-1-nic","ifaddr":{"ipv4":"10.0.0.0/16"},"capacity":{"ip":255}}]'
      k8s.ovn.org/egress-assignable: ""
      cloud.network.openshift.io/egress-ipconfig: '[{"interface":"jechen-0408a-2w6ds-worker-westus-2-nic","ifaddr":{"ipv4":"10.0.0.0/16"},"capacity":{"ip":255}}]'
      cloud.network.openshift.io/egress-ipconfig: '[{"interface":"jechen-0408a-2w6ds-worker-westus-3-nic","ifaddr":{"ipv4":"10.0.0.0/16"},"capacity":{"ip":255}}]'
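
Every node advertises 10.0.0.0/16 in its egress-ipconfig annotation, while the worker's k8s.ovn.org/node-primary-ifaddr annotation is 10.0.1.5/24. The following Python sketch compares the two annotations for the labeled worker, with the annotation strings copied from the output above. The function names are hypothetical, and the sketch only illustrates the discrepancy rather than describing how the controllers compute these values.

```python
import ipaddress
import json


def advertised_networks(annotation: str):
    """Parse an egress-ipconfig annotation and return its IPv4 networks."""
    entries = json.loads(annotation)
    return [
        ipaddress.ip_network(e["ifaddr"]["ipv4"], strict=False)
        for e in entries
        if "ipv4" in e.get("ifaddr", {})
    ]


def primary_network(primary_ifaddr: str):
    """Return the network of a node-primary-ifaddr annotation value."""
    return ipaddress.ip_interface(json.loads(primary_ifaddr)["ipv4"]).network


# Annotation values copied from jechen-0408a-2w6ds-worker-westus-1.
egress_ipconfig = (
    '[{"interface":"jechen-0408a-2w6ds-worker-westus-1-nic",'
    '"ifaddr":{"ipv4":"10.0.0.0/16"},"capacity":{"ip":255}}]'
)
primary_ifaddr = '{"ipv4":"10.0.1.5/24"}'

node_net = primary_network(primary_ifaddr)
for net in advertised_networks(egress_ipconfig):
    if net != node_net:
        print(f"advertised {net} differs from node subnet {node_net}")
```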

Comment 2 jechen 2022-04-08 15:54:23 UTC
The output of oc adm must-gather -- gather_network_logs is uploaded here:

https://drive.google.com/file/d/1_2GaJ74abZpN4CU2bXviqKA414_dkClH/view?usp=sharing

Comment 10 Andreas Karis 2022-04-14 08:48:58 UTC

*** This bug has been marked as a duplicate of bug 2075444 ***