Bug 2076321

Summary: [ocp-4.10][ovn-kubernetes] pod fails to connect kubernetes-service-ip when EgressIP is assigned to a namespace.
Product: OpenShift Container Platform Reporter: siva kanakala <skanakal>
Component: Networking Assignee: Ben Bennett <bbennett>
Networking sub component: ovn-kubernetes QA Contact: huirwang
Status: CLOSED DUPLICATE Docs Contact:
Severity: urgent    
Priority: urgent CC: ffernand, sdodson, surya, trozet
Version: 4.10   
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Last Closed: 2022-04-19 16:43:51 UTC Type: Bug

Description siva kanakala 2022-04-18 17:28:50 UTC
Description of problem:
[ocp-4.10][ovn-kubernetes] A pod fails to connect to the Kubernetes service IP when an EgressIP is assigned to its namespace.

Version-Release number of selected component (if applicable):
4.10.8

How reproducible:

Steps to Reproduce:
1. Assign an EgressIP to a namespace:

  Mon Apr 18 10:38:44 skanakal  ☻ ☀  oc create -f egressip.yml 
egressip.k8s.ovn.org/egress-project1 created
  Mon Apr 18 10:38:57 skanakal  ☻ ☀  oc get egressip
NAME              EGRESSIPS       ASSIGNED NODE                            ASSIGNED EGRESSIPS
egress-project1   192.168.51.13   ci-ln-r4vc4yk-c1627-5lfzr-worker-hk9ps   192.168.51.13
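
The report does not include the contents of egressip.yml. A minimal sketch of what such a manifest might look like for OVN-Kubernetes follows; the object name and IP come from the output above, while the `env: egress-test` namespace label is a placeholder assumption, not something taken from this bug. The egress node also needs the `k8s.ovn.org/egress-assignable` label before an IP can be assigned to it.

```shell
# Hypothetical reconstruction of egressip.yml (not the reporter's actual file).
# Name and IP are from the report; the namespace label is an assumption.
cat > egressip.yml <<'EOF'
apiVersion: k8s.ovn.org/v1
kind: EgressIP
metadata:
  name: egress-project1
spec:
  egressIPs:
  - 192.168.51.13
  namespaceSelector:
    matchLabels:
      env: egress-test   # assumed label; the target namespace must carry it
EOF
```

Once the file exists, `oc create -f egressip.yml` creates the object as in step 1.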

2. Deploy test pods and check connectivity to the Kubernetes service IP (172.30.0.1:443):

Mon Apr 18 10:37:40 skanakal  ☻ ☀  oc get pods -o wide
NAME             READY   STATUS    RESTARTS   AGE   IP            NODE                                     NOMINATED NODE   READINESS GATES
caddy-rc-n7tfx   1/1     Running   0          99s   10.129.2.15   ci-ln-r4vc4yk-c1627-5lfzr-worker-kp8ps   <none>           <none>
caddy-rc-xghl7   1/1     Running   0          99s   10.128.2.22   ci-ln-r4vc4yk-c1627-5lfzr-worker-hk9ps   <none>           <none>

It works from the pod that is running on the egress node:

Mon Apr 18 10:39:01 skanakal  ☻ ☀  oc rsh caddy-rc-xghl7
/srv $ 
/srv $ 
/srv $ nc -zv 172.30.0.1 443
172.30.0.1 (172.30.0.1:443) open
/srv $ 
/srv $ exit

It fails from the pod that is scheduled on a non-egress node:

  Mon Apr 18 10:39:22 skanakal  ☻ ☀  oc rsh  caddy-rc-n7tfx    
/srv $ 
/srv $ nc -zv 172.30.0.1 443
nc: 172.30.0.1 (172.30.0.1:443): Operation timed out
/srv $ exit
command terminated with exit code 1
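
The manual check in step 2 could be repeated for every pod with a small helper. This is only a sketch: `check_svc_from_pods` is a hypothetical name, it assumes `oc` is logged in with the test namespace selected, and it assumes the container image ships an `nc` that accepts `-z` and `-w` (as the one in these pods appears to).

```shell
# Probe a service IP/port from every pod in the current namespace.
# Defaults are the Kubernetes service IP and port used in this report.
check_svc_from_pods() {
  svc_ip="${1:-172.30.0.1}"
  svc_port="${2:-443}"
  # `oc get pods -o name` prints entries such as pod/caddy-rc-n7tfx
  for pod in $(oc get pods -o name); do
    # -z: connect without sending data; -w 3: give up after 3 seconds
    if oc exec "$pod" -- nc -z -w 3 "$svc_ip" "$svc_port" >/dev/null 2>&1; then
      echo "$pod OK"
    else
      echo "$pod FAIL"
    fi
  done
}
```

Calling `check_svc_from_pods` with no arguments probes 172.30.0.1:443. With the EgressIP in place, the behaviour described in this report would show up as `FAIL` for the pod on the non-egress node.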


3. After I delete the EgressIP, the connection works from both pods:

  Mon Apr 18 10:46:25 skanakal  ☻ ☀  oc delete egressip egress-project1
egressip.k8s.ovn.org "egress-project1" deleted
  Mon Apr 18 10:47:31 skanakal  ☻ ☀  oc rsh  caddy-rc-n7tfx
/srv $ 
/srv $ nc -zv 172.30.0.1 443
172.30.0.1 (172.30.0.1:443) open
/srv $ 
/srv $ exit


Actual results:
The pod on the non-egress node fails to connect to the Kubernetes service IP when the EgressIP is attached.

Expected results:
The pod should be able to connect to the Kubernetes service IP even when the EgressIP is attached.

Additional info:

I am able to reproduce this issue locally, and we have must-gather data.

Comment 1 siva kanakala 2022-04-18 18:25:04 UTC
sh-4.4# ovn-nbctl lr-policy-list ovn_cluster_router
Routing Policies
      1004 inport == "rtos-ci-ln-r4vc4yk-c1627-5lfzr-master-0" && ip4.dst == 192.168.51.14 /* ci-ln-r4vc4yk-c1627-5lfzr-master-0 */         reroute                10.129.0.2
      1004 inport == "rtos-ci-ln-r4vc4yk-c1627-5lfzr-master-0" && ip4.dst == 192.168.51.2 /* ci-ln-r4vc4yk-c1627-5lfzr-master-0 */         reroute                10.129.0.2
      1004 inport == "rtos-ci-ln-r4vc4yk-c1627-5lfzr-master-1" && ip4.dst == 192.168.51.19 /* ci-ln-r4vc4yk-c1627-5lfzr-master-1 */         reroute                10.128.0.2
      1004 inport == "rtos-ci-ln-r4vc4yk-c1627-5lfzr-master-2" && ip4.dst == 192.168.51.30 /* ci-ln-r4vc4yk-c1627-5lfzr-master-2 */         reroute                10.130.0.2
      1004 inport == "rtos-ci-ln-r4vc4yk-c1627-5lfzr-worker-hdfdx" && ip4.dst == 192.168.51.20 /* ci-ln-r4vc4yk-c1627-5lfzr-worker-hdfdx */         reroute                10.131.0.2
      1004 inport == "rtos-ci-ln-r4vc4yk-c1627-5lfzr-worker-hk9ps" && ip4.dst == 192.168.51.23 /* ci-ln-r4vc4yk-c1627-5lfzr-worker-hk9ps */         reroute                10.128.2.2
      1004 inport == "rtos-ci-ln-r4vc4yk-c1627-5lfzr-worker-hk9ps" && ip4.dst == 192.168.51.3 /* ci-ln-r4vc4yk-c1627-5lfzr-worker-hk9ps */         reroute                10.128.2.2
      1004 inport == "rtos-ci-ln-r4vc4yk-c1627-5lfzr-worker-kp8ps" && ip4.dst == 192.168.51.12 /* ci-ln-r4vc4yk-c1627-5lfzr-worker-kp8ps */         reroute                10.129.2.2
       101 ip4.src == 10.128.0.0/14 && ip4.dst == 10.128.0.0/14           allow
       101 ip4.src == 10.128.0.0/14 && ip4.dst == 100.64.0.0/16           allow           <<<<<<<<<----------
       101 ip4.src == 10.128.0.0/14 && ip4.dst == 192.168.51.12/32           allow
       101 ip4.src == 10.128.0.0/14 && ip4.dst == 192.168.51.14/32           allow
       101 ip4.src == 10.128.0.0/14 && ip4.dst == 192.168.51.19/32           allow
       101 ip4.src == 10.128.0.0/14 && ip4.dst == 192.168.51.20/32           allow
       101 ip4.src == 10.128.0.0/14 && ip4.dst == 192.168.51.23/32           allow
       101 ip4.src == 10.128.0.0/14 && ip4.dst == 192.168.51.30/32           allow
sh-4.4# exit

It seems host-network access is allowed for services backed by pods that match the EgressIP:


Mon Apr 18 11:41:28 skanakal  ☻ ☀  oc get network cluster -o json | jq '.status'
{
  "clusterNetwork": [
    {
      "cidr": "10.128.0.0/14",
      "hostPrefix": 23
    }
  ],
  "clusterNetworkMTU": 1400,
  "networkType": "OVNKubernetes",
  "serviceNetwork": [
    "172.30.0.0/16"
  ]
}

Comment 2 Scott Dodson 2022-04-18 19:34:03 UTC
If this is believed to be a regression (that is, it worked in 4.9 but not in 4.10), please add the Regression keyword. It's unclear from the description whether this is believed to be a regression or not.

Comment 7 Surya Seetharaman 2022-04-19 07:15:27 UTC
It's possible this is a dupe of https://bugzilla.redhat.com/show_bug.cgi?id=2070929? @flavio: wdyt?
Since the API server backend pods are host-networked, is it possible that the 1004 route takes priority over 101? I'm surprised we haven't noticed this for so long, though; not sure if it's the same in versions earlier than 4.9 as well.
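
For the priority question, it may help to look at each tier of the `lr-policy-list` output separately. OVN evaluates logical router policies from the highest priority number down, so a 1004 entry is consulted before a 101 entry. The helper below is a sketch (`policies_with_priority` is a hypothetical name) that filters `ovn-nbctl lr-policy-list` output by priority. The idea that EgressIP reroute policies sit at priority 100 is an assumption based on upstream ovn-kubernetes; no such entry appears in the listing in comment 1.

```shell
# Print only the routing policies with the given priority.
# Reads `ovn-nbctl lr-policy-list <router>` output on stdin.
policies_with_priority() {
  # The first whitespace-separated field of each policy line is its priority;
  # the "Routing Policies" header line never matches a numeric priority.
  awk -v prio="$1" '$1 == prio'
}
```

A possible invocation, run where `ovn-nbctl` is available: `ovn-nbctl lr-policy-list ovn_cluster_router | policies_with_priority 100`.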

Comment 9 Tim Rozet 2022-04-19 16:43:51 UTC
I think this is a duplicate of 2070929. I can see that SNAT entries are missing on the originating node (not the egress IP node).

*** This bug has been marked as a duplicate of bug 2070929 ***

Comment 10 Red Hat Bugzilla 2023-09-15 01:23:14 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days