Description of problem:

When `spec.exportNetworkFlows` is removed from the network.operator object, the existing collector targets are not cleared from OVS. Running

oc patch network.operator cluster --type='json' \
  -p='[{"op":"remove", "path":"/spec/exportNetworkFlows"}]'

does not remove the collector targets in OVS.

Version-Release number of selected component (if applicable):

4.9.0-0.nightly-2021-07-21-081948

How reproducible:

Always

Steps to Reproduce:

1. Patch network.operator to add a collector target (a sketch of the full patch command is at the end of this comment):

spec:
  exportNetworkFlows:
    netFlow:
      collectors:
        - 10.129.0.7:2056

2. Wait for the ovnkube-node pods to recycle.

3. Delete `spec.exportNetworkFlows`:

oc patch network.operator cluster --type='json' \
  -p='[{"op":"remove", "path":"/spec/exportNetworkFlows"}]'

4. Verify that the collector target is still configured in OVS:

for f in $(oc get pods -n openshift-ovn-kubernetes -l app=ovnkube-node -o jsonpath='{range @.items[*]}{.metadata.name}{"\n"}{end}') ; do oc -n openshift-ovn-kubernetes exec -c ovnkube-node $f -- bash -c 'for f in ipfix sflow netflow ; do ovs-vsctl find $f ; done' ; done

Actual results:

The ovnkube-node pods are immediately terminated and no OVS command is run to clear the collector targets.

Expected results:

The OVS NetFlow collector targets are cleared when the CNO `spec.exportNetworkFlows` field is deleted.

Additional info:

A workaround is to clear the flows manually by running `ovs-vsctl -- clear Bridge br-int <FLOW>` for each flow type:

for f in $(oc get pods -n openshift-ovn-kubernetes -l app=ovnkube-node -o jsonpath='{range @.items[*]}{.metadata.name}{"\n"}{end}') ; do oc -n openshift-ovn-kubernetes exec -c ovnkube-node $f -- bash -c 'for f in ipfix sflow netflow ; do ovs-vsctl -- clear Bridge br-int $f ; done' ; done

If the administrator changes the collectors, the old targets are overwritten, so another workaround is to point the collector at a non-routable address and accept whatever cost OVS incurs sending flows there. `spec.exportNetworkFlows.netFlow.collectors` has `minItems: 1` in the API, so the collectors cannot be cleared with null or an empty list.
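For reference, a hedged sketch of how the collector in step 1 can be added in the first place; any equivalent edit of the network.operator object works, and the merge patch below is just one option (the collector address is the one used in this report):

oc patch network.operator cluster --type=merge \
  -p='{"spec":{"exportNetworkFlows":{"netFlow":{"collectors":["10.129.0.7:2056"]}}}}'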
I reproduced this upstream easily enough:

1. Run the "e2e br-int NetFlow export validation" upstream e2e test on a local kind cluster.

2. Manually remove the NetFlow targets and allow the ovnkube-node pods to restart:

kubectl -n ovn-kubernetes set env daemonset/ovnkube-node -c ovnkube-node OVN_NETFLOW_TARGETS=""

3. Check whether the targets are still in OVS:

[astoycos@nfvsdn-02-oot ovn-kubernetes]$ for f in $(kubectl get pods -n ovn-kubernetes -l app=ovnkube-node -o jsonpath='{range @.items[*]}{.metadata.name}{"\n"}{end}') ; do kubectl -n ovn-kubernetes exec -c ovnkube-node $f -- bash -c 'for f in ipfix sflow netflow ; do ovs-vsctl find $f ; done' ; done
_uuid               : 5fff6c60-4e62-4d4d-9a85-8ef04adaa03a
active_timeout      : 60
add_id_to_interface : false
engine_id           : []
engine_type         : []
external_ids        : {}
targets             : ["172.18.0.5:2056"]

_uuid               : ed886d14-538d-4035-8d0c-c880572ae42a
active_timeout      : 60
add_id_to_interface : false
engine_id           : []
engine_type         : []
external_ids        : {}
targets             : ["172.18.0.5:2056"]

_uuid               : ae82ae95-aa42-4c14-b355-550ada8b8cb9
active_timeout      : 60
add_id_to_interface : false
engine_id           : []
engine_type         : []
external_ids        : {}
targets             : ["172.18.0.5:2056"]

I will post an upstream fix shortly and will extend the CI coverage to ensure we don't hit this again.

Thanks,
Andrew
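For illustration only (this is not the actual upstream patch), the per-node logic the fix needs is roughly: when the configured target list is empty, clear the corresponding flow-export column on br-int instead of leaving the old record in place. A minimal shell sketch of that idea, where NETFLOW_TARGETS is a hypothetical stand-in for whatever the node agent reads:

# Hypothetical sketch of the desired behaviour; NETFLOW_TARGETS is an
# illustrative variable, not a real ovn-kubernetes setting.
if [ -z "${NETFLOW_TARGETS:-}" ]; then
  # No collectors configured: drop any stale export config from br-int.
  ovs-vsctl -- clear Bridge br-int netflow
else
  # Single collector: (re)create the NetFlow record and attach it to br-int.
  # The inner quotes are needed because the target contains a colon.
  ovs-vsctl -- set Bridge br-int netflow=@nf \
            -- --id=@nf create NetFlow targets="\"${NETFLOW_TARGETS}\"" active_timeout=60
fi

The same applies to the sflow and ipfix columns for the other export types.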
Upstream fix can be seen here -> https://github.com/ovn-org/ovn-kubernetes/pull/2462
Verified on 4.10.0-0.nightly-2021-11-29-191648
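One way to confirm this on a cluster (an assumed check, not necessarily the exact verification that was run): after removing `spec.exportNetworkFlows`, every node's br-int should report empty flow-export columns.

for pod in $(oc get pods -n openshift-ovn-kubernetes -l app=ovnkube-node -o name) ; do oc -n openshift-ovn-kubernetes exec -c ovnkube-node "$pod" -- ovs-vsctl get Bridge br-int netflow sflow ipfix ; done

Each pod should print three empty sets ("[]").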
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056