Description of problem:
NAT rules for egressip were not cleared even after restarting the ovnkube-master pods.

Version-Release number of selected component (if applicable):
4.10.0-0.nightly-2021-12-06-201335

How reproducible:

Steps to Reproduce:
1. Label one node as an egress node.
2. Create an egressip object.
   oc get egressip -o yaml
   ....
   spec:
     egressIPs:
     - 172.31.249.117
     namespaceSelector:
       matchLabels:
         org: pm
     podSelector: {}
   status:
     items:
     - egressIP: 172.31.249.117
       node: compute-0
   ....
3. Create namespace ds36l with 10 pods in it, and label the namespace org=pm.
4. Scale the CNO to 0.
   oc scale deployment network-operator -n openshift-network-operator --replicas 0
5. Delete the ovnkube-master ds. Scale the test pods replicas to 1.
6. Scale the CNO to 1.
   oc scale deployment network-operator -n openshift-network-operator --replicas 1
   deployment.apps/network-operator scaled
7. Check lr-policy-list and snat.
   ovn-nbctl lr-policy-list ovn_cluster_router | grep "100 "
   100 ip4.src == 10.128.2.28 reroute 100.64.0.6
   100 ip4.src == 10.128.2.29 reroute 100.64.0.6
   100 ip4.src == 10.128.2.30 reroute 100.64.0.6
   100 ip4.src == 10.128.2.31 reroute 100.64.0.6
   100 ip4.src == 10.128.2.32 reroute 100.64.0.6
   100 ip4.src == 10.128.2.33 reroute 100.64.0.6
   100 ip4.src == 10.131.0.22 reroute 100.64.0.6
   100 ip4.src == 10.131.0.23 reroute 100.64.0.6
   100 ip4.src == 10.131.0.24 reroute 100.64.0.6
   100 ip4.src == 10.131.0.25 reroute 100.64.0.6
8. The NAT rules are not correct: they include entries beyond the above 10 pod IPs, left over from earlier egressip regression testing on this cluster and from running this case a couple of times.
   sh-4.4# ovn-nbctl --format=csv --no-heading find nat external_ids:name=egressip
   e371ed02-2d3d-4f47-83f8-e47327323a16,[],[],{name=egressip},"""172.31.249.117""",[],"""""","""10.128.2.30""",k8s-compute-0,"{stateless=""false""}",snat
   bd532783-71ed-4e7f-81e3-05c3c13e189a,[],[],{name=egressip},"""172.31.248.78""",[],"""""","""10.131.0.21""",k8s-compute-0,"{stateless=""false""}",snat
   32135123-a84b-4918-88ef-cb20003fbd04,[],[],{name=egressip},"""172.31.248.78""",[],"""""","""10.128.2.14""",k8s-compute-0,"{stateless=""false""}",snat
   4da801e4-c9c5-4220-929d-c2e8e504dd24,[],[],{name=egressip},"""172.31.249.117""",[],"""""","""10.131.0.24""",k8s-compute-0,"{stateless=""false""}",snat
   78eaece4-766e-4358-bfbb-ac6c1e210d05,[],[],{name=egressip},"""172.31.248.53""",[],"""""","""10.128.2.64""",k8s-compute-0,"{stateless=""false""}",snat
   ca7582a5-a7f1-4bed-97e0-9a518e91d558,[],[],{name=egressip},"""172.31.248.78""",[],"""""","""10.128.2.13""",k8s-compute-0,"{stateless=""false""}",snat
   7b1fdd26-1f81-408c-8cad-6d9fcd03731f,[],[],{name=egressip},"""172.31.249.117""",[],"""""","""10.128.2.29""",k8s-compute-0,"{stateless=""false""}",snat
   e008c686-44f8-49ef-9690-845567b800db,[],[],{name=egressip},"""172.31.248.53""",[],"""""","""10.128.2.65""",k8s-compute-0,"{stateless=""false""}",snat
   727c0dc6-fc19-4c6a-b890-e57b97dacd64,[],[],{name=egressip},"""172.31.248.53""",[],"""""","""10.128.2.61""",k8s-compute-0,"{stateless=""false""}",snat
   46231587-9d93-40b0-a9b6-db90383ca133,[],[],{name=egressip},"""172.31.248.212""",[],"""""","""10.131.0.10""",k8s-compute-0,"{stateless=""false""}",snat
   9ee85407-8a48-4963-9bac-d734c674bdff,[],[],{name=egressip},"""172.31.248.212""",[],"""""","""10.128.2.10""",k8s-compute-0,"{stateless=""false""}",snat
   ed59f208-8503-4763-a12c-23d44306d13a,[],[],{name=egressip},"""172.31.248.212""",[],"""""","""10.131.0.9""",k8s-compute-0,"{stateless=""false""}",snat
   5cc6fac9-a3b3-4355-a704-5c796746951d,[],[],{name=egressip},"""172.31.248.78""",[],"""""","""10.128.2.18""",k8s-compute-0,"{stateless=""false""}",snat
   42cdda4b-3d9b-4d31-80f4-30cfc6863283,[],[],{name=egressip},"""172.31.249.117""",[],"""""","""10.128.2.32""",k8s-compute-0,"{stateless=""false""}",snat
   8462e02c-c2ba-4e9a-83ae-2dffbd0019f2,[],[],{name=egressip},"""172.31.248.78""",[],"""""","""10.128.2.16""",k8s-compute-0,"{stateless=""false""}",snat
   4bcd434c-a9cf-4b44-968c-e33e84015f3b,[],[],{name=egressip},"""172.31.248.78""",[],"""""","""10.128.2.17""",k8s-compute-0,"{stateless=""false""}",snat
   535535d0-db85-435c-8acb-2362688a20a9,[],[],{name=egressip},"""172.31.249.117""",[],"""""","""10.131.0.23""",k8s-compute-0,"{stateless=""false""}",snat
   19c316b2-5a5e-44c2-bc3e-dfe95cdf7596,[],[],{name=egressip},"""172.31.248.212""",[],"""""","""10.128.2.14""",k8s-compute-0,"{stateless=""false""}",snat
   d25eb913-3287-4650-9f3a-05110508e25d,[],[],{name=egressip},"""172.31.248.212""",[],"""""","""10.128.2.9""",k8s-compute-0,"{stateless=""false""}",snat
   55b8da53-fc85-4092-b985-3c18411f6a88,[],[],{name=egressip},"""172.31.248.212""",[],"""""","""10.131.0.15""",k8s-compute-0,"{stateless=""false""}",snat
   e3256ad4-63d1-4f02-a114-e3539129139a,[],[],{name=egressip},"""172.31.248.53""",[],"""""","""10.128.2.63""",k8s-compute-0,"{stateless=""false""}",snat
   4339efb5-b75a-429a-8f0b-b11f93f97686,[],[],{name=egressip},"""172.31.249.117""",[],"""""","""10.128.2.31""",k8s-compute-0,"{stateless=""false""}",snat
   ac1fdc15-f300-4fa5-a271-74982450601a,[],[],{name=egressip},"""172.31.248.212""",[],"""""","""10.128.2.13""",k8s-compute-0,"{stateless=""false""}",snat
   d93ec1b5-9774-4422-b0ee-8334354d2d32,[],[],{name=egressip},"""172.31.249.117""",[],"""""","""10.131.0.25""",k8s-compute-0,"{stateless=""false""}",snat
   5190eef6-335f-4a98-ac68-e7e033aafde4,[],[],{name=egressip},"""172.31.248.212""",[],"""""","""10.128.2.12""",k8s-compute-0,"{stateless=""false""}",snat
   c9b4eb58-d2d0-4b55-a2bc-c246f536eb1f,[],[],{name=egressip},"""172.31.249.117""",[],"""""","""10.131.0.22""",k8s-compute-0,"{stateless=""false""}",snat
   a0628f39-ee24-4e73-9bc7-f93e3f83410e,[],[],{name=egressip},"""172.31.248.78""",[],"""""","""10.131.0.23""",k8s-compute-0,"{stateless=""false""}",snat
   9217a2f9-06cb-4e56-94b0-d5be7dc808fc,[],[],{name=egressip},"""172.31.248.212""",[],"""""","""10.128.2.11""",k8s-compute-0,"{stateless=""false""}",snat
   4b0d787b-facb-4119-92c6-b6b1d66a4788,[],[],{name=egressip},"""172.31.249.117""",[],"""""","""10.128.2.28""",k8s-compute-0,"{stateless=""false""}",snat
   4ac19f7d-9c7f-4982-b0f0-ad305170fbaa,[],[],{name=egressip},"""172.31.248.53""",[],"""""","""10.128.2.60""",k8s-compute-0,"{stateless=""false""}",snat
   ac8a781e-0d17-4928-89a4-0f809720aa04,[],[],{name=egressip},"""172.31.248.78""",[],"""""","""10.131.0.22""",k8s-compute-0,"{stateless=""false""}",snat

Actual results:
Stale lr-policy-list and snat rules are left behind.

Expected results:
No stale lr-policy-list or snat rules are left behind.

Additional info:
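The comparison between step 7 and step 8 can be sketched in a few lines of Python. This is a minimal sketch, not part of any OVN tooling: the function names are hypothetical, the CSV column positions (external IP at index 4, logical IP at index 7) are read off the sample rows in this report rather than taken from a schema, and "stale" is assumed to mean "not SNATed to the current egress IP, or not a pod IP that has a priority-100 reroute policy". The sample data is a small subset of the output shown in this report.

```python
import csv
import io
import re

# Two lr-policy-list lines and three NAT rows copied from this report.
POLICY_OUTPUT = """\
100 ip4.src == 10.128.2.30 reroute 100.64.0.6
100 ip4.src == 10.131.0.24 reroute 100.64.0.6
"""

NAT_OUTPUT = '''\
e371ed02-2d3d-4f47-83f8-e47327323a16,[],[],{name=egressip},"""172.31.249.117""",[],"""""","""10.128.2.30""",k8s-compute-0,"{stateless=""false""}",snat
bd532783-71ed-4e7f-81e3-05c3c13e189a,[],[],{name=egressip},"""172.31.248.78""",[],"""""","""10.131.0.21""",k8s-compute-0,"{stateless=""false""}",snat
4da801e4-c9c5-4220-929d-c2e8e504dd24,[],[],{name=egressip},"""172.31.249.117""",[],"""""","""10.131.0.24""",k8s-compute-0,"{stateless=""false""}",snat
'''


def policy_pod_ips(policy_output):
    """Collect the pod IPs matched by priority-100 reroute policies."""
    pattern = re.compile(r"^\s*100\s+ip4\.src == (\S+)\s+reroute\s", re.MULTILINE)
    return set(pattern.findall(policy_output))


def nat_pairs(nat_output):
    """Return (external_ip, logical_ip) for each 11-column NAT row."""
    pairs = []
    for row in csv.reader(io.StringIO(nat_output)):
        if len(row) != 11:
            continue  # skip blank or unexpected lines
        # Assumed positions, based on the sample rows: 4 and 7.
        pairs.append((row[4].strip('"'), row[7].strip('"')))
    return pairs


def stale_nat_entries(nat_output, egress_ip, pod_ips):
    """NAT entries that do not belong to the current egress IP and pods."""
    return [
        (ext, log)
        for ext, log in nat_pairs(nat_output)
        if ext != egress_ip or log not in pod_ips
    ]


stale = stale_nat_entries(NAT_OUTPUT, "172.31.249.117", policy_pod_ips(POLICY_OUTPUT))
print(stale)
```

Fed the full command output instead of the subset, the same check would list every NAT row that belongs to a different egress IP or to a pod without a matching reroute policy.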
Stale rows are not getting deleted because of the following error in the transaction:

I1221 22:49:13.197021 37 model_client.go:313] Delete operations generated as: [{Op:delete Table:Logical_Router_Policy Row:map[] Rows:[] Columns:[] Mutations:[] Timeout:0 Where:[where column _uuid == {a97cd20d-c22b-4a99-882f-e19ebf9d6af7}] Until: Durable:<nil> Comment:<nil> Lock:<nil> UUIDName:}]
E1221 22:49:13.197075 37 egressip.go:877] XXX syncStaleEgressReroutePolicy will delete egressip: {a97cd20d-c22b-4a99-882f-e19ebf9d6af7 reroute map[name:egressip] ip4.src == 10.244.1.27 <nil> [100.64.0.4] map[] 100}
I1221 22:49:13.197388 37 model_client.go:304] Mutate operations generated as: [{Op:mutate Table:Logical_Router Row:map[] Rows:[] Columns:[] Mutations:[{Column:policies Mutator:delete Value:{GoSet:[{GoUUID:a97cd20d-c22b-4a99-882f-e19ebf9d6af7}]}}] Timeout:0 Where:[where column _uuid == {4a681460-5950-463c-9fb9-745721734569}] Until: Durable:<nil> Comment:<nil> Lock:<nil> UUIDName:}]
I1221 22:49:13.197447 37 transact.go:41] Configuring OVN: [{Op:delete Table:Logical_Router_Policy Row:map[] Rows:[] Columns:[] Mutations:[] Timeout:0 Where:[where column _uuid == {891376ae-6907-4ba9-b72f-d476ebaeb5c6}] Until: Durable:<nil> Comment:<nil> Lock:<nil> UUIDName:} {Op:delete Table:Logical_Router_Policy Row:map[] Rows:[] Columns:[] Mutations:[] Timeout:0 Where:[where column _uuid == {a97cd20d-c22b-4a99-882f-e19ebf9d6af7}] Until: Durable:<nil> Comment:<nil> Lock:<nil> UUIDName:} {Op:mutate Table:Logical_Router Row:map[] Rows:[] Columns:[] Mutations:[{Column:policies Mutator:delete Value:{GoSet:[{GoUUID:a97cd20d-c22b-4a99-882f-e19ebf9d6af7}]}}] Timeout:0 Where:[where column _uuid == {4a681460-5950-463c-9fb9-745721734569}] Until: Durable:<nil> Comment:<nil> Lock:<nil> UUIDName:}]
I1221 22:49:13.197651 37 client.go:726] "msg"="transacting operations" "database"="OVN_Northbound" "operations"="[{Op:delete Table:Logical_Router_Policy Row:map[] Rows:[] Columns:[] Mutations:[] Timeout:0 Where:[where column _uuid == {891376ae-6907-4ba9-b72f-d476ebaeb5c6}] Until: Durable:<nil> Comment:<nil> Lock:<nil> UUIDName:} {Op:delete Table:Logical_Router_Policy Row:map[] Rows:[] Columns:[] Mutations:[] Timeout:0 Where:[where column _uuid == {a97cd20d-c22b-4a99-882f-e19ebf9d6af7}] Until: Durable:<nil> Comment:<nil> Lock:<nil> UUIDName:} {Op:mutate Table:Logical_Router Row:map[] Rows:[] Columns:[] Mutations:[{Column:policies Mutator:delete Value:{GoSet:[{GoUUID:a97cd20d-c22b-4a99-882f-e19ebf9d6af7}]}}] Timeout:0 Where:[where column _uuid == {4a681460-5950-463c-9fb9-745721734569}] Until: Durable:<nil> Comment:<nil> Lock:<nil> UUIDName:}]"
E1221 22:49:13.198459 37 egressip.go:895] Unable to remove stale logical router policies, err: error in transact with ops [{Op:delete Table:Logical_Router_Policy Row:map[] Rows:[] Columns:[] Mutations:[] Timeout:0 Where:[where column _uuid == {891376ae-6907-4ba9-b72f-d476ebaeb5c6}] Until: Durable:<nil> Comment:<nil> Lock:<nil> UUIDName:} {Op:delete Table:Logical_Router_Policy Row:map[] Rows:[] Columns:[] Mutations:[] Timeout:0 Where:[where column _uuid == {a97cd20d-c22b-4a99-882f-e19ebf9d6af7}] Until: Durable:<nil> Comment:<nil> Lock:<nil> UUIDName:} {Op:mutate Table:Logical_Router Row:map[] Rows:[] Columns:[] Mutations:[{Column:policies Mutator:delete Value:{GoSet:[{GoUUID:a97cd20d-c22b-4a99-882f-e19ebf9d6af7}]}}] Timeout:0 Where:[where column _uuid == {4a681460-5950-463c-9fb9-745721734569}] Until: Durable:<nil> Comment:<nil> Lock:<nil> UUIDName:}] results [{Count:1 Error: Details: UUID:{GoUUID:} Rows:[]} {Count:1 Error: Details: UUID:{GoUUID:} Rows:[]} {Count:1 Error: Details: UUID:{GoUUID:} Rows:[]} {Count:0 Error:referential integrity violation Details:cannot delete Logical_Router_Policy row 891376ae-6907-4ba9-b72f-d476ebaeb5c6 because of 1 remaining reference(s) UUID:{GoUUID:} Rows:[]}] and errors []: referential integrity violation: cannot delete Logical_Router_Policy row 891376ae-6907-4ba9-b72f-d476ebaeb5c6 because of 1 remaining reference(s)

Something is wrong with the delete mutation, because the wrong UUID is being removed: the transaction deletes Logical_Router_Policy rows 891376ae-6907-4ba9-b72f-d476ebaeb5c6 and a97cd20d-c22b-4a99-882f-e19ebf9d6af7, but the mutate on Logical_Router only removes a97cd20d-c22b-4a99-882f-e19ebf9d6af7 from the policies column, so 891376ae-6907-4ba9-b72f-d476ebaeb5c6 is still referenced. To be investigated further.
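The mismatch in the failing transaction can be checked mechanically. The following is a minimal sketch that models the three operations from the log as plain dictionaries; the dictionary layout and the function name are hypothetical simplifications for illustration and are not the libovsdb or ovn-kubernetes data structures. It reports every row delete that has no matching removal from the referencing column in the same transaction.

```python
# Simplified model of the three operations shown in the log above.
OPS = [
    {"op": "delete", "table": "Logical_Router_Policy",
     "uuid": "891376ae-6907-4ba9-b72f-d476ebaeb5c6"},
    {"op": "delete", "table": "Logical_Router_Policy",
     "uuid": "a97cd20d-c22b-4a99-882f-e19ebf9d6af7"},
    {"op": "mutate", "table": "Logical_Router",
     "mutations": [{"column": "policies", "mutator": "delete",
                    "uuids": ["a97cd20d-c22b-4a99-882f-e19ebf9d6af7"]}]},
]


def deletes_still_referenced(ops,
                             row_table="Logical_Router_Policy",
                             ref_table="Logical_Router",
                             ref_column="policies"):
    """Return deleted row UUIDs that no mutate op removes from ref_column."""
    deleted = [op["uuid"] for op in ops
               if op["op"] == "delete" and op["table"] == row_table]

    dereferenced = set()
    for op in ops:
        if op["op"] != "mutate" or op["table"] != ref_table:
            continue
        for mutation in op["mutations"]:
            if mutation["column"] == ref_column and mutation["mutator"] == "delete":
                dereferenced.update(mutation["uuids"])

    return [uuid for uuid in deleted if uuid not in dereferenced]


missing = deletes_still_referenced(OPS)
print(missing)
```

For the operations in the log, the only UUID reported is the one named in the referential integrity violation, which is consistent with the delete and mutate operations having been built from different sets of policies.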
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:0056