Bug 2073180 - (OVN Azure EgressIP) egressIP is not assigned to node after the node has k8s.ovn.org/egress-assignable enabled
Keywords:
Status: CLOSED DUPLICATE of bug 2072439
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.11
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.11.0
Assignee: ffernand
QA Contact: Dan Brahaney
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2022-04-07 20:29 UTC by jechen
Modified: 2023-09-15 01:23 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-04-18 18:44:15 UTC
Target Upstream Version:
Embargoed:




Links
System: Github
ID: openshift cloud-network-config-controller pull 32
Private: 0
Priority: None
Status: open
Summary: Bug 2072439: Get subnet information from subnet instead of from network addresses
Last Updated: 2022-04-08 14:13:43 UTC

Description jechen 2022-04-07 20:29:58 UTC
Description of problem:
The EgressIP is not assigned to a node after the node is labeled with k8s.ovn.org/egress-assignable.

Version-Release number of selected component (if applicable):
$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.0-0.nightly-2022-04-07-053433   True        False         4h43m   Cluster version is 4.11.0-0.nightly-2022-04-07-053433


How reproducible:


Steps to Reproduce:
1.
$ oc get node
NAME                                         STATUS   ROLES    AGE   VERSION
jechen-0407e-vgxvv-master-0                  Ready    master   64m   v1.23.3+54654d2
jechen-0407e-vgxvv-master-1                  Ready    master   64m   v1.23.3+54654d2
jechen-0407e-vgxvv-master-2                  Ready    master   64m   v1.23.3+54654d2
jechen-0407e-vgxvv-worker-southcentralus-1   Ready    worker   45m   v1.23.3+54654d2
jechen-0407e-vgxvv-worker-southcentralus-2   Ready    worker   46m   v1.23.3+54654d2
jechen-0407e-vgxvv-worker-southcentralus-3   Ready    worker   45m   v1.23.3+54654d2

$ oc label node jechen-0407e-vgxvv-worker-southcentralus-2  "k8s.ovn.org/egress-assignable"=""
node/jechen-0407e-vgxvv-worker-southcentralus-2 labeled


2.
$ oc create -f ./SDN-1332-test/config_egressip1_ovn_ns_team_red_azure.yaml 
egressip.k8s.ovn.org/egressip1 created

3.

$ oc get egressip
NAME        EGRESSIPS    ASSIGNED NODE   ASSIGNED EGRESSIPS
egressip1   10.0.0.111     

Actual results:
The egressip object has no assigned node, even after waiting long enough.

Expected results:

The egressip object should have a node assigned.


Additional info:

Comment 2 jechen 2022-04-07 22:42:08 UTC
$ cat ./SDN-1332-test/config_egressip1_ovn_ns_team_red_azure.yaml 
apiVersion: k8s.ovn.org/v1
kind: EgressIP
metadata:
  name: egressip1
spec:
  egressIPs:
  - 10.0.0.111
  namespaceSelector:
    matchLabels:
      team: red 


I will get a must-gather shortly.

This problem also happens on 4.10; it occurred in 1 out of the 2 times I tried.

Comment 3 jechen 2022-04-07 22:43:55 UTC
I tested egressIP on GCP for the 4.10 release; today is the first time I have tried it on Azure.

Comment 5 jechen 2022-04-07 23:25:19 UTC
I have a bigger must-gather log, but it is too large to upload here; I need to figure out where to upload it.

Comment 9 jechen 2022-04-08 12:53:55 UTC
I uploaded the bigger must-gather file to: https://drive.google.com/file/d/1Cdctjx4K2IDceoVe8bKsKU2kkel650mZ/view?usp=sharing

Comment 14 Andreas Karis 2022-04-08 15:39:52 UTC
Yeah I agree. 

10.0.0.111 isn't a valid IP address for the workers; the IP must be in 10.0.128.0/17 (the annotation is indeed wrong, and that's what bug 2072439 fixes).

The cncc will have thrown something like "Private static IP address 10.0.0.111 does not belong to the range of subnet prefix 10.0.128.0/17." Details=[], requeuing in cloud-private-ip-config workqueue
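
The range check described above can be sketched with Python's standard `ipaddress` module. This is only an illustration of the membership test, not the controller's actual code; the helper name `ip_in_subnet` is hypothetical, and the addresses are the ones quoted in this bug.

```python
import ipaddress


def ip_in_subnet(ip: str, cidr: str) -> bool:
    """Return True if the address `ip` lies inside the range `cidr`."""
    return ipaddress.ip_address(ip) in ipaddress.ip_network(cidr)


if __name__ == "__main__":
    # Subnet prefix taken from the error message quoted in this comment.
    subnet = "10.0.128.0/17"
    # 10.0.0.111 is the requested egress IP; 10.0.128.101 is an address
    # chosen from inside the subnet for comparison.
    for candidate in ("10.0.0.111", "10.0.128.101"):
        print(candidate, ip_in_subnet(candidate, subnet))
```

10.0.128.0/17 spans 10.0.128.0 through 10.0.255.255, so the check returns False for 10.0.0.111 and True for 10.0.128.101.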

Comment 15 jechen 2022-04-08 16:10:45 UTC
I have destroyed yesterday's 4.11 cluster, but I remember that on that cluster the annotation was 10.0.0.0/16.

I just built a new 4.11 cluster, and the annotation now shows 10.0.128.0/17. With the egressip set to 10.0.128.101, the egressip is now assigned to the node.

$ oc describe node jechen-0408c-x9rhm-worker-a-6q4mh.c.openshift-qe.internal
Name:               jechen-0408c-x9rhm-worker-a-6q4mh.c.openshift-qe.internal
Roles:              worker
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=n1-standard-4
                    beta.kubernetes.io/os=linux
                    failure-domain.beta.kubernetes.io/region=us-central1
                    failure-domain.beta.kubernetes.io/zone=us-central1-a
                    k8s.ovn.org/egress-assignable=
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=jechen-0408c-x9rhm-worker-a-6q4mh.c.openshift-qe.internal
                    kubernetes.io/os=linux
                    node-role.kubernetes.io/worker=
                    node.kubernetes.io/instance-type=n1-standard-4
                    node.openshift.io/os_id=rhcos
                    topology.gke.io/zone=us-central1-a
                    topology.kubernetes.io/region=us-central1
                    topology.kubernetes.io/zone=us-central1-a
Annotations:        cloud.network.openshift.io/egress-ipconfig: [{"interface":"nic0","ifaddr":{"ipv4":"10.0.128.0/17"},"capacity":{"ip":10}}]


$ oc get egressip
NAME        EGRESSIPS      ASSIGNED NODE                                               ASSIGNED EGRESSIPS
egressip1   10.0.128.101   jechen-0408c-x9rhm-worker-a-6q4mh.c.openshift-qe.internal   10.0.128.101

Comment 16 Andreas Karis 2022-04-08 16:25:19 UTC
Can you attach the output of:
~~~
oc get nodes -o yaml | grep egress-ipconfig
~~~

So that I can see the annotation for all nodes?

Did you create this cluster with `nightly` or with `ci` or with what version?

Comment 17 Andreas Karis 2022-04-08 16:26:24 UTC
The capacity throws me off: `"capacity":{"ip":10}`. In Azure, the default capacity is 255 O_o

Comment 18 jechen 2022-04-08 16:41:27 UTC
Sorry, I take back my last comment (comment #15); I was looking at a GCP cluster. An Azure OVN cluster with the latest 4.11 nightly still has 10.0.0.0/16:

$ oc describe node jechen-0408b-txj8d-worker-southcentralus-1
Name:               jechen-0408b-txj8d-worker-southcentralus-1
Roles:              worker
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=Standard_D4s_v3
                    beta.kubernetes.io/os=linux
                    failure-domain.beta.kubernetes.io/region=southcentralus
                    failure-domain.beta.kubernetes.io/zone=0
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=jechen-0408b-txj8d-worker-southcentralus-1
                    kubernetes.io/os=linux
                    node-role.kubernetes.io/worker=
                    node.kubernetes.io/instance-type=Standard_D4s_v3
                    node.openshift.io/os_id=rhcos
                    topology.disk.csi.azure.com/zone=
                    topology.kubernetes.io/region=southcentralus
                    topology.kubernetes.io/zone=0
Annotations:        cloud.network.openshift.io/egress-ipconfig:
                      [{"interface":"jechen-0408b-txj8d-worker-southcentralus-1-nic","ifaddr":{"ipv4":"10.0.0.0/16"},"capacity":{"ip":255}}]
                    csi.volume.kubernetes.io/nodeid:
                      {"disk.csi.azure.com":"jechen-0408b-txj8d-worker-southcentralus-1","file.csi.azure.com":"jechen-0408b-txj8d-worker-southcentralus-1"}
                    k8s.ovn.org/host-addresses: ["10.0.1.6"]
                    k8s.ovn.org/l3-gateway-config:
                      {"default":{"mode":"shared","interface-id":"br-ex_jechen-0408b-txj8d-worker-southcentralus-1","mac-address":"00:22:48:a6:49:2a","ip-addres...
                    k8s.ovn.org/node-chassis-id: 4e5ae092-6bf4-4727-a526-67d369a6f6c8
                    k8s.ovn.org/node-mgmt-port-mac-address: 0e:a4:03:8e:39:68
                    k8s.ovn.org/node-primary-ifaddr: {"ipv4":"10.0.1.6/24"}
                    k8s.ovn.org/node-subnets: {"default":"10.129.2.0/23"}
                    machineconfiguration.openshift.io/controlPlaneTopology: HighlyAvailable
                    machineconfiguration.openshift.io/currentConfig: rendered-worker-5e396f9a52c1edad8416771b0b06323f
                    machineconfiguration.openshift.io/desiredConfig: rendered-worker-5e396f9a52c1edad8416771b0b06323f
                    machineconfiguration.openshift.io/reason: 
                    machineconfiguration.openshift.io/state: Done
                    volumes.kubernetes.io/controller-managed-attach-detach: true
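
To make the mismatch in the output above concrete, here is a minimal Python sketch that compares the range in the `cloud.network.openshift.io/egress-ipconfig` annotation with the subnet implied by `k8s.ovn.org/node-primary-ifaddr`. The function names are hypothetical and the two annotation values are copied from the `oc describe node` output. The sketch only shows that the annotated range is wider than the /24 the node's primary address sits in; the actual fix is tracked in bug 2072439.

```python
import ipaddress
import json

# Values copied from the node annotations shown above.
EGRESS_IPCONFIG = (
    '[{"interface":"jechen-0408b-txj8d-worker-southcentralus-1-nic",'
    '"ifaddr":{"ipv4":"10.0.0.0/16"},"capacity":{"ip":255}}]'
)
NODE_PRIMARY_IFADDR = '{"ipv4":"10.0.1.6/24"}'


def annotated_network(egress_ipconfig: str):
    """Return the IPv4 range from the first entry of the annotation value."""
    entries = json.loads(egress_ipconfig)
    return ipaddress.ip_network(entries[0]["ifaddr"]["ipv4"])


def primary_subnet(node_primary_ifaddr: str):
    """Return the subnet that contains the node's primary IPv4 address."""
    ifaddr = json.loads(node_primary_ifaddr)
    return ipaddress.ip_interface(ifaddr["ipv4"]).network


def annotation_is_wider(egress_ipconfig: str, node_primary_ifaddr: str) -> bool:
    """True if the annotated range strictly contains the node's own subnet."""
    annotated = annotated_network(egress_ipconfig)
    subnet = primary_subnet(node_primary_ifaddr)
    return subnet != annotated and subnet.subnet_of(annotated)


if __name__ == "__main__":
    print("annotated range:", annotated_network(EGRESS_IPCONFIG))
    print("primary subnet: ", primary_subnet(NODE_PRIMARY_IFADDR))
    print("annotation wider than subnet:",
          annotation_is_wider(EGRESS_IPCONFIG, NODE_PRIMARY_IFADDR))
```

For this node the annotated range is 10.0.0.0/16 while the primary address 10.0.1.6/24 belongs to 10.0.1.0/24, so the comparison reports that the annotation is wider than the node's subnet.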

Comment 19 jechen 2022-04-08 16:45:08 UTC
On a new 4.11 OVN Azure cluster:

$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.0-0.nightly-2022-04-07-053433   True        False         45m     Cluster version is 4.11.0-0.nightly-2022-04-07-053433


$ oc get nodes -o yaml | grep egress-ipconfig
      cloud.network.openshift.io/egress-ipconfig: '[{"interface":"jechen-0408b-txj8d-master-0-nic","ifaddr":{"ipv4":"10.0.0.0/16"},"capacity":{"ip":255}}]'
      cloud.network.openshift.io/egress-ipconfig: '[{"interface":"jechen-0408b-txj8d-master-1-nic","ifaddr":{"ipv4":"10.0.0.0/16"},"capacity":{"ip":255}}]'
      cloud.network.openshift.io/egress-ipconfig: '[{"interface":"jechen-0408b-txj8d-master-2-nic","ifaddr":{"ipv4":"10.0.0.0/16"},"capacity":{"ip":255}}]'
      cloud.network.openshift.io/egress-ipconfig: '[{"interface":"jechen-0408b-txj8d-worker-southcentralus-1-nic","ifaddr":{"ipv4":"10.0.0.0/16"},"capacity":{"ip":255}}]'
      cloud.network.openshift.io/egress-ipconfig: '[{"interface":"jechen-0408b-txj8d-worker-southcentralus-2-nic","ifaddr":{"ipv4":"10.0.0.0/16"},"capacity":{"ip":255}}]'
      cloud.network.openshift.io/egress-ipconfig: '[{"interface":"jechen-0408b-txj8d-worker-southcentralus-3-nic","ifaddr":{"ipv4":"10.0.0.0/16"},"capacity":{"ip":255}}]'


$ oc logs -n openshift-cloud-network-config-controller  -l app=cloud-network-config-controller
I0408 15:55:03.665787       1 controller.go:102] Started secret workers
I0408 15:55:03.665831       1 controller.go:160] Dropping key 'jechen-0408b-txj8d-master-0' from the node workqueue
I0408 15:55:03.665843       1 controller.go:160] Dropping key 'jechen-0408b-txj8d-master-1' from the node workqueue
I0408 15:55:03.665867       1 controller.go:160] Dropping key 'jechen-0408b-txj8d-master-2' from the node workqueue
I0408 15:55:04.238626       1 node_controller.go:106] Setting annotation: 'cloud.network.openshift.io/egress-ipconfig: [{"interface":"jechen-0408b-txj8d-worker-southcentralus-2-nic","ifaddr":{"ipv4":"10.0.0.0/16"},"capacity":{"ip":255}}]' on node: jechen-0408b-txj8d-worker-southcentralus-2
I0408 15:55:04.259291       1 node_controller.go:106] Setting annotation: 'cloud.network.openshift.io/egress-ipconfig: [{"interface":"jechen-0408b-txj8d-worker-southcentralus-3-nic","ifaddr":{"ipv4":"10.0.0.0/16"},"capacity":{"ip":255}}]' on node: jechen-0408b-txj8d-worker-southcentralus-3
I0408 15:55:04.271948       1 controller.go:160] Dropping key 'jechen-0408b-txj8d-worker-southcentralus-2' from the node workqueue
I0408 15:55:04.272033       1 node_controller.go:106] Setting annotation: 'cloud.network.openshift.io/egress-ipconfig: [{"interface":"jechen-0408b-txj8d-worker-southcentralus-1-nic","ifaddr":{"ipv4":"10.0.0.0/16"},"capacity":{"ip":255}}]' on node: jechen-0408b-txj8d-worker-southcentralus-1
I0408 15:55:04.296221       1 controller.go:160] Dropping key 'jechen-0408b-txj8d-worker-southcentralus-3' from the node workqueue
I0408 15:55:04.306224       1 controller.go:160] Dropping key 'jechen-0408b-txj8d-worker-southcentralus-1' from the node workqueue



$ oc get nodes -o wide
NAME                                         STATUS   ROLES    AGE   VERSION           INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                                                        KERNEL-VERSION                 CONTAINER-RUNTIME
jechen-0408b-txj8d-master-0                  Ready    master   72m   v1.23.3+54654d2   10.0.0.8      <none>        Red Hat Enterprise Linux CoreOS 411.85.202203242008-0 (Ootpa)   4.18.0-348.20.1.el8_5.x86_64   cri-o://1.24.0-5.rhaos4.11.gitd020fdb.el8
jechen-0408b-txj8d-master-1                  Ready    master   72m   v1.23.3+54654d2   10.0.0.6      <none>        Red Hat Enterprise Linux CoreOS 411.85.202203242008-0 (Ootpa)   4.18.0-348.20.1.el8_5.x86_64   cri-o://1.24.0-5.rhaos4.11.gitd020fdb.el8
jechen-0408b-txj8d-master-2                  Ready    master   72m   v1.23.3+54654d2   10.0.0.7      <none>        Red Hat Enterprise Linux CoreOS 411.85.202203242008-0 (Ootpa)   4.18.0-348.20.1.el8_5.x86_64   cri-o://1.24.0-5.rhaos4.11.gitd020fdb.el8
jechen-0408b-txj8d-worker-southcentralus-1   Ready    worker   50m   v1.23.3+54654d2   10.0.1.6      <none>        Red Hat Enterprise Linux CoreOS 411.85.202203242008-0 (Ootpa)   4.18.0-348.20.1.el8_5.x86_64   cri-o://1.24.0-5.rhaos4.11.gitd020fdb.el8
jechen-0408b-txj8d-worker-southcentralus-2   Ready    worker   50m   v1.23.3+54654d2   10.0.1.5      <none>        Red Hat Enterprise Linux CoreOS 411.85.202203242008-0 (Ootpa)   4.18.0-348.20.1.el8_5.x86_64   cri-o://1.24.0-5.rhaos4.11.gitd020fdb.el8
jechen-0408b-txj8d-worker-southcentralus-3   Ready    worker   53m   v1.23.3+54654d2   10.0.1.4      <none>        Red Hat Enterprise Linux CoreOS 411.85.202203242008-0 (Ootpa)   4.18.0-348.20.1.el8_5.x86_64   cri-o://1.24.0-5.rhaos4.11.gitd020fdb.el8



$ oc get nodes -o yaml | grep egress
      cloud.network.openshift.io/egress-ipconfig: '[{"interface":"jechen-0408b-txj8d-master-0-nic","ifaddr":{"ipv4":"10.0.0.0/16"},"capacity":{"ip":255}}]'
      cloud.network.openshift.io/egress-ipconfig: '[{"interface":"jechen-0408b-txj8d-master-1-nic","ifaddr":{"ipv4":"10.0.0.0/16"},"capacity":{"ip":255}}]'
      cloud.network.openshift.io/egress-ipconfig: '[{"interface":"jechen-0408b-txj8d-master-2-nic","ifaddr":{"ipv4":"10.0.0.0/16"},"capacity":{"ip":255}}]'
      cloud.network.openshift.io/egress-ipconfig: '[{"interface":"jechen-0408b-txj8d-worker-southcentralus-1-nic","ifaddr":{"ipv4":"10.0.0.0/16"},"capacity":{"ip":255}}]'
      cloud.network.openshift.io/egress-ipconfig: '[{"interface":"jechen-0408b-txj8d-worker-southcentralus-2-nic","ifaddr":{"ipv4":"10.0.0.0/16"},"capacity":{"ip":255}}]'
      cloud.network.openshift.io/egress-ipconfig: '[{"interface":"jechen-0408b-txj8d-worker-southcentralus-3-nic","ifaddr":{"ipv4":"10.0.0.0/16"},"capacity":{"ip":255}}]'
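
For anyone who wants to summarize output like the above across all nodes, the following Python sketch parses the grepped annotation lines into a per-interface table. It assumes the lines have exactly the shape shown in this comment (annotation key, then a single-quoted JSON list); the function name is hypothetical, and YAML quoting beyond plain surrounding single quotes is not handled.

```python
import json

ANNOTATION_KEY = "cloud.network.openshift.io/egress-ipconfig:"


def parse_egress_ipconfig_lines(text: str) -> dict:
    """Map interface name -> (IPv4 range, IP capacity) from grep output."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith(ANNOTATION_KEY):
            continue  # skip blank lines and anything that is not the annotation
        # Drop the key, then the single quotes wrapping the JSON value.
        value = line[len(ANNOTATION_KEY):].strip().strip("'")
        for entry in json.loads(value):
            result[entry["interface"]] = (
                entry["ifaddr"]["ipv4"],
                entry["capacity"]["ip"],
            )
    return result


if __name__ == "__main__":
    # Two of the six lines from the output above, copied verbatim.
    sample = """
      cloud.network.openshift.io/egress-ipconfig: '[{"interface":"jechen-0408b-txj8d-master-0-nic","ifaddr":{"ipv4":"10.0.0.0/16"},"capacity":{"ip":255}}]'
      cloud.network.openshift.io/egress-ipconfig: '[{"interface":"jechen-0408b-txj8d-worker-southcentralus-1-nic","ifaddr":{"ipv4":"10.0.0.0/16"},"capacity":{"ip":255}}]'
    """
    for interface, (cidr, capacity) in parse_egress_ipconfig_lines(sample).items():
        print(interface, cidr, capacity)
```

Collecting the distinct ranges from the returned mapping makes it easy to see that every node in this cluster carries the same 10.0.0.0/16 range with a capacity of 255.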

Comment 20 jechen 2022-04-08 16:57:04 UTC
$ oc adm must-gather -- gather_network_logs
The log is here: https://drive.google.com/file/d/16CDArgPffxW_tptT2sjGvxU_WDOFX41n/view?usp=sharing

Comment 21 ffernand 2022-04-18 18:44:15 UTC

*** This bug has been marked as a duplicate of bug 2072439 ***

Comment 22 Red Hat Bugzilla 2023-09-15 01:23:07 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days

