Bug 1985838
| Summary: | [OVN] CNO exportNetworkFlows does not clear collectors when deleted | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Ross Brattain <rbrattai> |
| Component: | Networking | Assignee: | Andrew Stoycos <astoycos> |
| Networking sub component: | ovn-kubernetes | QA Contact: | Ross Brattain <rbrattai> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | medium | | |
| Priority: | medium | CC: | jechen, jgeorge, memodi, nweinber, rh-container |
| Version: | 4.8 | | |
| Target Milestone: | --- | | |
| Target Release: | 4.10.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | No Doc Update |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2022-03-12 04:36:01 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
I reproduced this upstream easily enough:
1. Run the "e2e br-int NetFlow export validation" upstream e2e test on a local kind cluster.
2. Manually remove the Netflow targets like so -> kubectl -n ovn-kubernetes set env daemonset/ovnkube-node -c ovnkube-node OVN_NETFLOW_TARGETS=""
...Allow the ovnkube-node pods to restart (a convenience for waiting on the rollout is sketched after the output below)
3. Check whether the targets are still present in OVS
[astoycos@nfvsdn-02-oot ovn-kubernetes]$ for f in $(kubectl get pods -n ovn-kubernetes -l app=ovnkube-node -o jsonpath='{range@.items[*]}{.metadata.name}{"\n"}{end}' ) ; do kubectl -n ovn-kubernetes exec -c ovnkube-node $f -- bash -c 'for f in ipfix sflow netflow ; do ovs-vsctl find $f ; done' ; done
_uuid : 5fff6c60-4e62-4d4d-9a85-8ef04adaa03a
active_timeout : 60
add_id_to_interface : false
engine_id : []
engine_type : []
external_ids : {}
targets : ["172.18.0.5:2056"]
_uuid : ed886d14-538d-4035-8d0c-c880572ae42a
active_timeout : 60
add_id_to_interface : false
engine_id : []
engine_type : []
external_ids : {}
targets : ["172.18.0.5:2056"]
_uuid : ae82ae95-aa42-4c14-b355-550ada8b8cb9
active_timeout : 60
add_id_to_interface : false
engine_id : []
engine_type : []
external_ids : {}
targets : ["172.18.0.5:2056"]
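For convenience when re-checking, a hedged helper built from the commands already used above; the kubectl rollout wait and the --columns flag to ovs-vsctl find are additions made here for illustration, not part of the original report:
# Wait for the ovnkube-node daemonset to finish rolling out after the env change
kubectl -n ovn-kubernetes rollout status daemonset/ovnkube-node --timeout=300s
# Print only the targets column for each flow table so stale collectors stand out
for f in $(kubectl get pods -n ovn-kubernetes -l app=ovnkube-node -o jsonpath='{range @.items[*]}{.metadata.name}{"\n"}{end}') ; do
  kubectl -n ovn-kubernetes exec -c ovnkube-node "$f" -- \
    bash -c 'for t in ipfix sflow netflow ; do ovs-vsctl --columns=targets find "$t" ; done'
done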
I will post an upstream fix shortly and will extend the CI coverage to ensure we don't hit this again.
Thanks,
Andrew
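For context, the OVS-level end state the fix needs to produce is the same one the manual workaround in the description below achieves on each node; a minimal sketch of that cleanup (the intended effect when the flow targets are emptied, not the actual code in the PR):
# Sketch only: clear the stale sampling config from br-int when the targets become empty
for f in ipfix sflow netflow ; do
  ovs-vsctl -- clear Bridge br-int $f
done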
Upstream fix can be seen here -> https://github.com/ovn-org/ovn-kubernetes/pull/2462

Verified on 4.10.0-0.nightly-2021-11-29-191648

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056
Description of problem:

When `spec.exportNetworkFlows` is removed from the `network.operator` object, the existing collector targets are not cleared from OVS.

oc patch network.operator cluster --type='json' \
 -p='[{"op":"remove", "path":"/spec/exportNetworkFlows"}]'

does not remove the collector targets in OVS.

Version-Release number of selected component (if applicable):
4.9.0-0.nightly-2021-07-21-081948

How reproducible:
Always

Steps to Reproduce:
1. Patch the network.operator object to add a collector target

spec:
  exportNetworkFlows:
    netFlow:
      collectors:
        - 10.129.0.7:2056

2. Wait for the ovnkube-node pods to recycle
3. Delete `spec.exportNetworkFlows`

oc patch network.operator cluster --type='json' \
 -p='[{"op":"remove", "path":"/spec/exportNetworkFlows"}]'

4. Verify the collector target is still configured in OVS

for f in $(oc get pods -n openshift-ovn-kubernetes -l app=ovnkube-node -o jsonpath='{range@.items[*]}{.metadata.name}{"\n"}{end}' ) ; do oc -n openshift-ovn-kubernetes exec -c ovnkube-node $f -- bash -c 'for f in ipfix sflow netflow ; do ovs-vsctl find $f ; done' ; done

Actual results:
The ovnkube-node pods are immediately terminated and no OVS command is run to clear the collector targets.

Expected results:
OVS netflow collector targets are cleared when CNO `spec.exportNetworkFlows` is deleted.

Additional info:
Workaround is to clear the flows manually by running `ovs-vsctl -- clear Bridge br-int <FLOW>` for each flow type:

for f in $(oc get pods -n openshift-ovn-kubernetes -l app=ovnkube-node -o jsonpath='{range@.items[*]}{.metadata.name}{"\n"}{end}' ) ; do oc -n openshift-ovn-kubernetes exec -c ovnkube-node $f -- bash -c 'for f in ipfix sflow netflow ; do ovs-vsctl -- clear Bridge br-int $f ; done' ; done

If the administrator changes the collectors, the old targets are overwritten, so another workaround is to set the collector to a non-routable address and accept whatever overhead OVS incurs sending to a non-routable address. `spec.exportNetworkFlows.netFlow.collectors` has `minItems: 1` in the API, so the collectors cannot be cleared with null or an empty list.
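A hedged example of that second workaround (the TEST-NET-1 address 192.0.2.10:2056 is only an illustration of a non-routable collector, not a value from the original report):

# Assumed illustration: point the collector at a documentation-only, non-routable address
oc patch network.operator cluster --type=merge \
 -p='{"spec":{"exportNetworkFlows":{"netFlow":{"collectors":["192.0.2.10:2056"]}}}}'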