Bug 1809739

Summary: [ovn-kubernetes] When a node gets deleted, the Chassis record for that node is not deleted from the sbdb.
Product: OpenShift Container Platform
Reporter: Aniket Bhat <anbhat>
Component: Networking
Assignee: Aniket Bhat <anbhat>
Networking sub component: ovn-kubernetes
QA Contact: Ross Brattain <rbrattai>
Status: CLOSED WONTFIX
Severity: high
Priority: high
CC: aconstan, bbennett, dcbw, rbrattai, zzhao
Version: 4.4
Target Release: 4.3.z
Hardware: Unspecified
OS: Unspecified
Whiteboard: SDN-CI-IMPACT,SDN-BP
Clone Of: 1809738
Last Closed: 2020-05-19 21:27:43 UTC
Bug Depends On: 1809738

Description Aniket Bhat 2020-03-03 19:11:06 UTC
+++ This bug was initially created as a clone of Bug #1809738 +++

Description of problem:

When a node is deprovisioned or deleted from the cluster, the Chassis record for that node in the OVN southbound database is not deleted. As a result, the other nodes in the cluster keep stale Geneve tunnels and ovs-vswitchd flows for it. At scale, this can mean thousands of stale tunnels and unused flows.

Version-Release number of selected component (if applicable):
4.4

How reproducible:
Always

Steps to Reproduce:
1. Create an OVN cluster
2. Add a few nodes
3. Delete one node
4. Note that the tunnels corresponding to the deleted node, and the OVS flows for its remote IP endpoint, remain in OVS on the other nodes.

Actual results:

Flows and tunnels corresponding to the node being deleted stay as stale entries.

Expected results:

All flows and the corresponding tunnels for the node being deleted are cleaned up when the node goes away.

Additional info:

Upstream issue: https://github.com/ovn-org/ovn-kubernetes/issues/1105

Comment from Russell Bryant:

Just some more detail ... ovn-controller will delete its associated Chassis record if it shuts down gracefully. I'm not sure that's ever the case, though. The fallback is that something else needs to do the cleanup. ovn-kubernetes is already watching Nodes, so it can add this as another thing it does when syncing Nodes or when it sees a Node get deleted.

This will require knowing which Chassis record in the ovn southbound database corresponds to a Node. ovn-kubernetes already ensures that the hostname field of the Chassis is equal to the Node name.
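
The matching described above can be sketched roughly as follows. This is only an illustration in Python, not the ovn-kubernetes implementation (which is written in Go); the Chassis class and the stale_chassis function are hypothetical names, and the sketch assumes the caller has already fetched the Chassis rows from the southbound database and the current Node names from the API server.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Chassis:
    """Hypothetical stand-in for one row of the southbound Chassis table."""
    name: str      # value of the Chassis "name" column
    hostname: str  # value of the "hostname" column, expected to equal the Node name


def stale_chassis(chassis: Iterable[Chassis], node_names: Iterable[str]) -> List[Chassis]:
    """Return the Chassis records whose hostname matches no existing Node."""
    live = set(node_names)
    return [c for c in chassis if c.hostname not in live]


if __name__ == "__main__":
    records = [Chassis("chassis-1", "node-a"), Chassis("chassis-2", "node-b")]
    # Assume node-b has been deleted from the cluster.
    for c in stale_chassis(records, ["node-a"]):
        print(f"stale chassis {c.name} (hostname {c.hostname})")
```

Each record returned this way would be a candidate for removal from the southbound database, for example with `ovn-sbctl chassis-del <name>`. Whether to run this on every Node sync or only on a Node delete event is a separate design choice.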

Comment 6 Red Hat Bugzilla 2023-09-14 05:53:49 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days