Bug 2060549

Summary: ErrorAddingLogicalPort: duplicate IP found in ECMP Pod route cache!
Product: OpenShift Container Platform
Reporter: Murali Krishnasamy <murali>
Component: Networking
Assignee: Tim Rozet <trozet>
Networking sub component: ovn-kubernetes
QA Contact: Anurag saxena <anusaxen>
Status: CLOSED ERRATA
Docs Contact:
Severity: high
Priority: high
CC: rkhan, rsevilla, surya
Version: 4.10
Target Milestone: ---
Target Release: 4.11.0
Hardware: All
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 2061804, 2063863
Environment:
Last Closed: 2022-08-10 10:52:11 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 2061804, 2063863

Description Murali Krishnasamy 2022-03-03 18:35:08 UTC
Description of problem:
Churning an ICNI2 density workload on OpenShift 4.10 causes pods to get stuck in the `ContainerCreating` state forever, with an error event that says:

Warning  ErrorAddingLogicalPort  53s (x9 over 13m)  controlplane  failed to handle external GW check: unable to add pod: serving-ns-33/dep-serving-33-4-serving-job-66f7948f5d-dzx52 as external gateway for namespace: served-ns-33, error: duplicate IP found in ECMP Pod route cache! IP: "192.168.222.33", first pod: "dep-serving-33-4-serving-job-66f7948f5d-dzx52", second pod: "serving-ns-33_dep-serving-33-4-serving-job-66f7948f5d-dzx52"


Version-Release number of selected component (if applicable):
4.10.0-rc.6
ovn21.12-central-21.12.0-25.el8fdp.x86_64
ovn21.12-vtep-21.12.0-25.el8fdp.x86_64
ovn21.12-21.12.0-25.el8fdp.x86_64
ovn21.12-host-21.12.0-25.el8fdp.x86_64

How reproducible:
Always, when churning the ICNI2 workload

Steps to Reproduce:
1. Deploy a healthy cluster on bare-metal nodes (in this case, 120 nodes) with OVN in shared gateway mode
2. Create the ICNI2 workload with serving and served pods, and verify all BFD sessions and ECMP routes
3. Delete and recreate the workload a couple of times, and restart the OVN master pods to force a leader change
4. Re-create the ICNI2 workload and watch for stuck pods or `ErrorAddingLogicalPort` messages in the OVN master logs

Actual results:
Pods get stuck while adding the logical port due to a duplicate IP in the ECMP pod route cache.

Expected results:
Pods should run like normal application pods, without any errors.

Additional info:
This happened only after restarting the OVN master pods.

Comment 2 Surya Seetharaman 2022-03-03 18:39:26 UTC
Spoke to Tim. As we can see from the logs below, we add the key to the cache first, and the actual transact happens much later; if the transact fails, we retry the pod setup, try to add the same pod to the cache again, and hit the duplicate pod IP error.

Tim might be able to explain it better, so I am assigning this to him. We had a few options for fixing this, but since this situation should not occur that often in user environments, we went with the simplest fix: checking for the case where the same pod is being retried.
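
To make the ordering concrete, here is a minimal Go sketch using hypothetical names (ecmpRouteCache, addGatewayIP, addExternalGWRoutes); it is not the actual ovn-kubernetes code, only an illustration of the cache-before-transact pattern and of the same-pod check mentioned above:

```go
package main

import (
	"errors"
	"fmt"
)

type ecmpRouteCache struct {
	byIP map[string]string // gateway IP -> pod that registered it
}

func (c *ecmpRouteCache) addGatewayIP(ip, pod string) error {
	if existing, ok := c.byIP[ip]; ok {
		// Simplest fix described above (sketched): a retry of the *same* pod
		// after a failed transact is not a real conflict, so tolerate it
		// instead of returning the duplicate-IP error.
		if existing == pod {
			return nil
		}
		return fmt.Errorf("duplicate IP found in ECMP Pod route cache! IP: %q, first pod: %q, second pod: %q", ip, existing, pod)
	}
	c.byIP[ip] = pod
	return nil
}

// addExternalGWRoutes mimics the problematic ordering: the cache is updated
// first, and the OVN NB transact only happens afterwards.
func addExternalGWRoutes(cache *ecmpRouteCache, ip, pod string, transact func() error) error {
	if err := cache.addGatewayIP(ip, pod); err != nil {
		return err
	}
	// If this fails (e.g. the NB leader changed and the connection timed out),
	// the whole pod setup is retried and addGatewayIP runs again for the same
	// IP/pod pair; without the same-pod check above, that retry fails forever.
	return transact()
}

func main() {
	cache := &ecmpRouteCache{byIP: map[string]string{}}
	pod := "serving-ns-33/dep-serving-33-4-serving-job-66f7948f5d-dzx52"

	// Round 1: the cache entry is added, then the transact fails.
	fmt.Println(addExternalGWRoutes(cache, "192.168.222.33", pod, func() error {
		return errors.New("context deadline exceeded: while awaiting reconnection")
	}))
	// Round 2 (retry): without the same-pod check this would return the
	// "duplicate IP found in ECMP Pod route cache" error.
	fmt.Println(addExternalGWRoutes(cache, "192.168.222.33", pod, func() error { return nil }))
}
```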

==============================Round 1================================================
I0303 15:45:44.617308       1 pods.go:332] [serving-ns-33/dep-serving-33-4-serving-job-66f7948f5d-dzx52] creating logical port for pod on switch worker003-fc640
I0303 15:45:44.617409       1 kube.go:73] Setting annotations map[k8s.ovn.org/pod-networks:{"default":{"ip_addresses":["10.128.16.48/23"],"mac_address":"0a:58:0a:80:10:30","gateway_ips":["10.128.16.1"],"ip_address":"10.128.16.48/23","gateway_ip":"10.128.16.1"}}] on pod serving-ns-33/dep-serving-33-4-serving-job-66f7948f5d-dzx52
==
I0303 15:45:44.629398       1 egressgw.go:155] External gateway pod: dep-serving-33-4-serving-job-66f7948f5d-dzx52, detected for namespace(s) served-ns-33
I0303 15:45:44.629444       1 egressgw.go:203] Adding routes for external gateway pod: dep-serving-33-4-serving-job-66f7948f5d-dzx52, next hops: "192.168.222.33", namespace: served-ns-33, bfd-enabled: true
==
2022-03-03T15:45:44.630Z|00065|ovsdb_cs|INFO|ssl:192.168.216.12:9641: clustered database server is not cluster leader; trying another server
2022-03-03T15:45:44.630Z|00066|reconnect|INFO|ssl:192.168.216.12:9641: connection attempt timed out
2022-03-03T15:45:44.630Z|00067|reconnect|INFO|ssl:192.168.216.11:9641: connecting...
2022-03-03T15:45:44.634Z|00068|reconnect|INFO|ssl:192.168.216.11:9641: connected
====
I0303 15:45:44.732939       1 client.go:1053]  "msg"="endpoint lost leader, reconnecting" "database"="OVN_Northbound" "endpoint"="ssl:192.168.216.12:9641" "sid"="d627d068-0e72-4397-80f4-22b922bee61e"
I0303 15:45:44.733110       1 client.go:1137]  "msg"="connection lost, reconnecting" "database"="OVN_Northbound" "endpoint"="ssl:192.168.216.10:9641"
I0303 15:45:44.733660       1 transact.go:41] Configuring OVN: [{Op:insert Table:BFD Row:map[dst_ip:192.168.222.33 logical_port:rtoe-GR_worker023-fc640] Rows:[] Columns:[] Mutations:[] Timeout:<nil> Where:[] Until: Durable:<nil> Comment:<nil> Lock:<nil> UUIDName:u2596996875} {Op:insert Table:Logical_Router_Static_Route Row:map[bfd:{GoSet:[{GoUUID:u2596996875}]} ip_prefix:10.130.30.8/32 nexthop:192.168.222.33 options:{GoMap:map[ecmp_symmetric_reply:true]} output_port:{GoSet:[rtoe-GR_worker023-fc640]} policy:{GoSet:[src-ip]}] Rows:[] Columns:[] Mutations:[] Timeout:<nil> Where:[] Until: Durable:<nil> Comment:<nil> Lock:<nil> UUIDName:u2596996876} {Op:mutate Table:Logical_Router Row:map[] Rows:[] Columns:[] Mutations:[{Column:static_routes Mutator:insert Value:{GoSet:[{GoUUID:u2596996876}]}}] Timeout:<nil> Where:[where column _uuid == {37882d04-18f4-4b88-b87e-0d245145be44}] Until: Durable:<nil> Comment:<nil> Lock:<nil> UUIDName:}]
===
2022-03-03T15:45:52.863Z|00069|ovsdb_cs|INFO|ssl:192.168.216.11:9641: clustered database server is not cluster leader; trying another server
2022-03-03T15:45:52.863Z|00070|reconnect|INFO|ssl:192.168.216.11:9641: connection attempt timed out
2022-03-03T15:45:52.863Z|00071|reconnect|INFO|ssl:192.168.216.10:9641: connecting...
2022-03-03T15:45:52.869Z|00072|reconnect|INFO|ssl:192.168.216.10:9641: connected
2022-03-03T15:45:52.870Z|00073|ovsdb_cs|INFO|ssl:192.168.216.10:9641: clustered database server is not cluster leader; trying another server
2022-03-03T15:45:52.870Z|00074|reconnect|INFO|ssl:192.168.216.10:9641: connection attempt timed out
2022-03-03T15:45:52.870Z|00075|reconnect|INFO|ssl:192.168.216.12:9641: connecting...
2022-03-03T15:45:52.874Z|00076|reconnect|INFO|ssl:192.168.216.12:9641: connected
I0303 15:46:04.734211       1 pods.go:322] [serving-ns-33/dep-serving-33-4-serving-job-66f7948f5d-dzx52] addLogicalPort took 20.116904396s
====
E0303 15:46:04.734238       1 ovn.go:648] failed to handle external GW check: unable to add src-ip route to GR router, err: error in transact with ops [{Op:insert Table:BFD Row:map[dst_ip:192.168.222.33 logical_port:rtoe-GR_worker023-fc640] Rows:[] Columns:[] Mutations:[] Timeout:<nil> Where:[] Until: Durable:<nil> Comment:<nil> Lock:<nil> UUIDName:u2596996875} {Op:insert Table:Logical_Router_Static_Route Row:map[bfd:{GoSet:[{GoUUID:u2596996875}]} ip_prefix:10.130.30.8/32 nexthop:192.168.222.33 options:{GoMap:map[ecmp_symmetric_reply:true]} output_port:{GoSet:[rtoe-GR_worker023-fc640]} policy:{GoSet:[src-ip]}] Rows:[] Columns:[] Mutations:[] Timeout:<nil> Where:[] Until: Durable:<nil> Comment:<nil> Lock:<nil> UUIDName:u2596996876} {Op:mutate Table:Logical_Router Row:map[] Rows:[] Columns:[] Mutations:[{Column:static_routes Mutator:insert Value:{GoSet:[{GoUUID:u2596996876}]}}] Timeout:<nil> Where:[where column _uuid == {37882d04-18f4-4b88-b87e-0d245145be44}] Until: Durable:<nil> Comment:<nil> Lock:<nil> UUIDName:}]: context deadline exceeded: while awaiting reconnection
=========================================Round 2===================================
I0303 15:46:04.734287       1 pods.go:332] [serving-ns-33/dep-serving-33-4-serving-job-66f7948f5d-dzx52] creating logical port for pod on switch worker003-fc640
I0303 15:46:04.734406       1 client.go:1248]  "msg"="warning: unable to ensure cache consistency for reading"  "database"="OVN_Northbound"
I0303 15:46:04.734408       1 event.go:282] Event(v1.ObjectReference{Kind:"Pod", Namespace:"serving-ns-33", Name:"dep-serving-33-4-serving-job-66f7948f5d-dzx52", UID:"1b25bb96-66f8-401d-b063-1c90eec18ad5", APIVersion:"v1", ResourceVersion:"5071427", FieldPath:""}): type: 'Warning' reason: 'ErrorAddingLogicalPort' failed to handle external GW check: unable to add src-ip route to GR router, err: error in transact with ops [{Op:insert Table:BFD Row:map[dst_ip:192.168.222.33 logical_port:rtoe-GR_worker023-fc640] Rows:[] Columns:[] Mutations:[] Timeout:<nil> Where:[] Until: Durable:<nil> Comment:<nil> Lock:<nil> UUIDName:u2596996875} {Op:insert Table:Logical_Router_Static_Route Row:map[bfd:{GoSet:[{GoUUID:u2596996875}]} ip_prefix:10.130.30.8/32 nexthop:192.168.222.33 options:{GoMap:map[ecmp_symmetric_reply:true]} output_port:{GoSet:[rtoe-GR_worker023-fc640]} policy:{GoSet:[src-ip]}] Rows:[] Columns:[] Mutations:[] Timeout:<nil> Where:[] Until: Durable:<nil> Comment:<nil> Lock:<nil> UUIDName:u2596996876} {Op:mutate Table:Logical_Router Row:map[] Rows:[] Columns:[] Mutations:[{Column:static_routes Mutator:insert Value:{GoSet:[{GoUUID:u2596996876}]}}] Timeout:<nil> Where:[where column _uuid == {37882d04-18f4-4b88-b87e-0d245145be44}] Until: Durable:<nil> Comment:<nil> Lock:<nil> UUIDName:}]: context deadline exceeded: while awaiting reconnection
=========
I0303 15:46:05.162958       1 egressgw.go:155] External gateway pod: dep-serving-33-4-serving-job-66f7948f5d-dzx52, detected for namespace(s) served-ns-33
I0303 15:46:05.162994       1 pods.go:322] [serving-ns-33/dep-serving-33-4-serving-job-66f7948f5d-dzx52] addLogicalPort took 428.709552ms
E0303 15:46:05.163003       1 ovn.go:648] failed to handle external GW check: unable to add pod: serving-ns-33/dep-serving-33-4-serving-job-66f7948f5d-dzx52 as external gateway for namespace: served-ns-33, error: duplicate IP found in ECMP Pod route cache! IP: "192.168.222.33", first pod: "dep-serving-33-4-serving-job-66f7948f5d-dzx52", second pod: "serving-ns-33_dep-serving-33-4-serving-job-66f7948f5d-dzx52"
I0303 15:46:05.163010       1 egressgw.go:155] External gateway pod: dep-serving-33-2-serving-job-dc6c6d4c8-9jnnr, detected for namespace(s) served-ns-33
I0303 15:46:05.163043       1 pods.go:332] [serving-ns-33/dep-serving-33-4-serving-job-66f7948f5d-dzx52] creating logical port for pod on switch worker003-fc640
I0303 15:46:05.163108       1 event.go:282] Event(v1.ObjectReference{Kind:"Pod", Namespace:"serving-ns-33", Name:"dep-serving-33-4-serving-job-66f7948f5d-dzx52", UID:"1b25bb96-66f8-401d-b063-1c90eec18ad5", APIVersion:"v1", ResourceVersion:"5071434", FieldPath:""}): type: 'Warning' reason: 'ErrorAddingLogicalPort' failed to handle external GW check: unable to add pod: serving-ns-33/dep-serving-33-4-serving-job-66f7948f5d-dzx52 as external gateway for namespace: served-ns-33, error: duplicate IP found in ECMP Pod route cache! IP: "192.168.222.33", first pod: "dep-serving-33-4-serving-job-66f7948f5d-dzx52", second pod: "serving-ns-33_dep-serving-33-4-serving-job-66f7948f5d-dzx52"

Comment 3 Surya Seetharaman 2022-03-03 20:23:18 UTC
(In reply to Surya Seetharaman from comment #2)
> Spoke to Tim. As we can see from the logs below, we add the key to the
> cache first, and the actual transact happens much later; if the transact
> fails, we retry the pod setup, try to add the same pod to the cache again,
> and hit the duplicate pod IP error.
> 
> Tim might be able to explain it better, so I am assigning this to him. We
> had a few options for fixing this, but since this situation should not
> occur that often in user environments, we went with the simplest fix:
> checking for the case where the same pod is being retried.
>

Wait, I take this back. My understanding is that if we fail for any reason during addLogicalPort of an exgw serving pod after the pod has been added to the cache (i.e. during any subsequent add of routes/policies/BFD), we would hit this bug. Maybe it's not such a "rare" event :)
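
One possible alternative (sketched here only for illustration, reusing the hypothetical ecmpRouteCache type from the sketch in comment #2; not necessarily what we would ship) is to roll the cache entry back whenever any step after the cache add fails, so a retry never finds a stale entry for its own gateway IP:

```go
// Hypothetical alternative, not the merged fix: undo the cache entry whenever
// any of the steps that follow the cache add (BFD insert, static route insert,
// router mutate) fails, so the cache never holds an entry for a pod whose
// routes were not actually committed to the NB database.
func addExternalGWRoutesWithRollback(cache *ecmpRouteCache, ip, pod string, steps ...func() error) (err error) {
	if err = cache.addGatewayIP(ip, pod); err != nil {
		return err
	}
	defer func() {
		if err != nil {
			// Roll back on any later failure, e.g. "context deadline exceeded"
			// while the NB cluster is electing a new leader.
			delete(cache.byIP, ip)
		}
	}()
	for _, step := range steps {
		if err = step(); err != nil {
			return err
		}
	}
	return nil
}
```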

Comment 4 Murali Krishnasamy 2022-03-08 21:25:18 UTC
Tested this patch (https://github.com/openshift/ovn-kubernetes/pull/990) and it worked.
I was able to churn the ICNI2 workload on a 120-node environment without the addLogicalPort error.

Comment 9 errata-xmlrpc 2022-08-10 10:52:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069