Bug 2175041
Summary: | Slowness due to high number nftables rules causes CNV to pile up cnv-bridge commands on VM shutdown | ||
---|---|---|---|
Product: | Container Native Virtualization (CNV) | Reporter: | Germano Veit Michel <gveitmic> |
Component: | Networking | Assignee: | Edward Haas <edwardh> |
Status: | CLOSED ERRATA | QA Contact: | Yossi Segev <ysegev> |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 4.12.5 | CC: | edwardh, mduarted, phoracek, psutter, rgertzbe |
Target Milestone: | --- | ||
Target Release: | 4.13.1 | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | cnv-containernetworking-plugins-rhel9 v4.13.1-2 | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2023-06-20 13:41:05 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Description
Germano Veit Michel
2023-03-03 01:12:58 UTC
It also causes further performance issues for any remaining workload. SystemMemoryExceedsReservation is firing.

About 1h later my system CPU is at 90% sys and becoming unresponsive... will need to reboot.

    Tasks: 420 total, 70 running, 349 sleeping, 0 stopped, 1 zombie
    %Cpu(s): 9.1 us, 90.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.4 hi, 0.5 si, 0.0 st
    MiB Mem : 63676.7 total, 44272.9 free, 10140.4 used, 9263.5 buff/cache
    MiB Swap: 0.0 total, 0.0 free, 0.0 used. 52758.4 avail Mem

    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    2676831 root 20 0 90548 45984 3752 R 14.6 0.1 1:56.92 nft
    2688369 root 20 0 90548 46048 3812 R 14.3 0.1 1:17.33 nft
    2640644 root 20 0 96752 46128 3796 R 14.0 0.1 4:49.54 nft
    2635746 root 20 0 100876 46192 3800 R 12.0 0.1 5:47.65 nft
    2629093 root 20 0 100984 46420 3908 R 11.3 0.1 7:03.59 nft
    2642279 root 20 0 96752 46096 3764 R 11.0 0.1 4:47.25 nft
    2691595 root 20 0 90548 45976 3740 R 11.0 0.1 1:10.51 nft
    2701454 root 20 0 90548 45852 3620 R 9.6 0.1 0:45.91 nft
    2632472 root 20 0 102328 47596 3908 R 9.0 0.1 6:17.97 nft
    2673566 root 20 0 90548 45996 3764 R 8.3 0.1 2:05.88 nft
    2662077 root 20 0 90548 46000 3764 R 8.0 0.1 2:47.36 nft
    11778 nfsnobo+ 20 0 18.9g 3.5g 593068 S 7.6 5.7 416:48.34 prometheus
    2670315 root 20 0 90548 45920 3620 R 6.3 0.1 2:16.28 nft
    2719477 root 20 0 77788 45252 3572 R 6.3 0.1 0:06.07 nft
    2630836 root 20 0 102328 47532 3840 R 6.0 0.1 6:42.57 nft
    2652173 root 20 0 96752 46160 3836 R 6.0 0.1 3:40.10 nft
    2660435 root 20 0 90548 46004 3772 R 5.6 0.1 2:50.65 nft
    2618679 root 20 0 102572 47980 3880 R 5.3 0.1 11:31.10 nft
    2665356 root 20 0 90548 46096 3860 R 5.3 0.1 2:30.95 nft
    2712927 root 20 0 90548 46000 3764 R 5.3 0.1 0:18.51 nft
    2616988 root 20 0 103296 48624 3800 R 5.0 0.1 12:27.95 nft
    2625519 root 20 0 102244 47616 3840 R 5.0 0.1 8:13.98 nft
    2650542 root 20 0 90548 46068 3836 R 5.0 0.1 3:46.58 nft
    2694870 root 20 0 90548 46068 3836 R 5.0 0.1 1:00.83 nft
    2707978 root 20 0 90532 46068 3836 R 5.0 0.1 0:27.24 nft
    2620410 root 20 0 102304 47640 3808 R 4.7 0.1 10:31.64 nft
    2622067 root 20 0 103632 49000 3836 R 4.7 0.1 9:35.03 nft
    2637397 root 20 0 100876 46180 3796 R 4.7 0.1 5:28.79 nft
    2647257 root 20 0 96752 46112 3780 R 4.7 0.1 4:01.83 nft
    2648912 root 20 0 96752 46072 3752 R 4.7 0.1 4:08.45 nft
    2657156 root 20 0 90548 46060 3808 R 4.7 0.1 3:15.75 nft
    2658799 root 20 0 90548 45972 3740 R 4.7 0.1 3:06.88 nft
    2668673 root 20 0 90548 46008 3772 R 4.7 0.1 2:22.91 nft
    2686731 root 20 0 90548 46068 3836 R 4.7 0.1 1:19.20 nft
    2689983 root 20 0 90524 46044 3812 R 4.7 0.1 1:13.95 nft
    2704714 root 20 0 90548 46000 3764 R 4.7 0.1 0:35.78 nft
    2709687 root 20 0 90548 45988 3756 R 4.7 0.1 0:26.04 nft
    2711271 root 20 0 90548 46044 3812 R 4.7 0.1 0:22.96 nft
    2623742 root 20 0 102388 47724 3808 R 4.3 0.1 8:56.21 nft
    2634094 root 20 0 100876 46164 3776 R 4.3 0.1 5:57.05 nft
    2667021 root 20 0 90548 45852 3620 R 4.3 0.1 2:23.67 nft
    2671934 root 20 0 90548 46068 3836 R 4.3 0.1 2:14.16 nft
    2685089 root 20 0 90548 45984 3752 R 4.3 0.1 1:25.33 nft
    2706343 root 20 0 90548 45976 3740 R 4.3 0.1 0:32.82 nft
    2627304 root 20 0 102684 47968 3812 R 4.0 0.1 7:25.48 nft
    2653850 root 20 0 90548 46072 3836 R 4.0 0.1 3:26.21 nft
    2663720 root 20 0 90548 46016 3784 R 4.0 0.1 2:42.30 nft
    2680153 root 20 0 90548 45988 3752 R 4.0 0.1 1:36.95 nft
    2681807 root 20 0 90548 46044 3812 R 4.0 0.1 1:35.82 nft
    2683425 root 20 0 90548 46068 3836 R 4.0 0.1 1:33.73 nft
    2693226 root 20 0 90548 46000 3768 R 4.0 0.1 1:04.78 nft
    2696494 root 20 0 90548 46000 3764 R 4.0 0.1 0:56.00 nft
    2699824 root 20 0 90548 46072 3836 R 4.0 0.1 0:48.42 nft
    2714559 root 20 0 77260 44712 3564 R 4.0 0.1 0:15.21 nft
    2639024 root 20 0 96752 46088 3764 R 3.7 0.1 4:57.81 nft
    2643983 root 20 0 96752 46064 3748 R 3.7 0.1 4:34.95 nft
    2645632 root 20 0 90548 46072 3836 R 3.7 0.1 4:21.11 nft
    2675236 root 20 0 90548 46044 3808 R 3.7 0.1 1:57.68 nft
    2678502 root 20 0 90548 46000 3768 R 3.7 0.1 1:49.10 nft
    2698181 root 20 0 90548 46056 3824 R 3.7 0.1 0:52.71 nft
    2703075 root 20 0 90548 46044 3808 R 3.7 0.1 0:38.60 nft

Apologies for the trouble. Edy, would you please explore the potential solutions you had in mind?

At the moment, this is what we are planning:
- Introduce a timeout to the `nft` calls, which should ensure the CNI command fails if `nft` is not responsive.
- Investigate the option of moving the nftables rules into the pod network namespace.

I will update on the investigation results soon.

Verified on CNV 4.13.1
cnv-containernetworking-plugins-rhel9:v4.13.1-2
OS: Red Hat Enterprise Linux CoreOS 413.92.202305231734-0 (RHEL 9.2 based)

1. Create a linux bridge interface on a single node using this policy:

    apiVersion: nmstate.io/v1
    kind: NodeNetworkConfigurationPolicy
    metadata:
      name: linux-bridge-ens11
    spec:
      desiredState:
        interfaces:
        - name: test-br
          type: linux-bridge
          state: up
          ipv4:
            dhcp: true
            enabled: true
          bridge:
            options:
              stp:
                enabled: false
            port:
            - name: ens11
      nodeSelector:
        kubernetes.io/hostname: c01-n-ys-4131o-k59g2-worker-0-lc5wc

2. Create a NetworkAttachmentDefinition that utilizes the bridge interface:

    apiVersion: k8s.cni.cncf.io/v1
    kind: NetworkAttachmentDefinition
    metadata:
      annotations:
        k8s.v1.cni.cncf.io/resourceName: bridge.network.kubevirt.io/test-br
      name: test-br-nad
      namespace: yoss-ns
    spec:
      config: '{"cniVersion": "0.3.1", "name": "test-br", "type": "cnv-bridge", "bridge": "test-br", "macspoofchk":true,"ipam":{}}'

3. Create 14 similar VMs on the same node (the one where the bridge was created in step 1), each with a secondary NIC backed by the NAD created in the previous step. All VMs were created by applying the attached VM spec yaml; the specs differ only in the VM name.

    $ oc apply -f vm1.yaml
    virtualmachine.kubevirt.io/vm1 created
    $ oc apply -f vm2.yaml
    virtualmachine.kubevirt.io/vm2 created
    ...

4. Start all VMs:

    $ virtctl start vm1
    VM vm14 was scheduled to start
    $ virtctl start vm2
    VM vm2 was scheduled to start
    ...

Note: After each VM start, I waited for the VM to reach the Running phase before starting the next VM, for example:

    $ virtctl start vm5
    VM vm5 was scheduled to start
    $ oc get vmi -w
    NAME   AGE     PHASE        IP            NODENAME                              READY
    vm1    8m57s   Running      10.129.2.70   c01-n-ys-4131o-k59g2-worker-0-lc5wc   True
    vm2    8m44s   Running      10.129.2.72   c01-n-ys-4131o-k59g2-worker-0-lc5wc   True
    vm3    8m31s   Running      10.129.2.73   c01-n-ys-4131o-k59g2-worker-0-lc5wc   True
    vm4    8m18s   Running      10.129.2.75   c01-n-ys-4131o-k59g2-worker-0-lc5wc   True
    vm5    2s      Scheduling                                                       False
    vm5    4s      Scheduled                  c01-n-ys-4131o-k59g2-worker-0-lc5wc   False
    vm5    6s      Scheduled                  c01-n-ys-4131o-k59g2-worker-0-lc5wc   False
    vm5    6s      Running      10.129.2.76   c01-n-ys-4131o-k59g2-worker-0-lc5wc   False
    $ virtctl start vm6
    VM vm6 was scheduled to start

5. On the node where all VMs run, run the script from the bug description in the background (. nft.sh &):

    for a in {1..500}
    do
        nft add table ip table$a
        for b in {1..500}
        do
            nft add chain ip table$a chain$b
        done
    done

6. After 10-15 minutes, I started shutting down all VMIs:

    $ virtctl stop vm1
    VM vm1 was scheduled to stop
    $ virtctl stop vm2
    VM vm2 was scheduled to stop
    ...

7. Every few minutes, for the next ~30 minutes, I checked top for:
   a. nft instances
   b. cnv-bridge instances
   c. CPU usage

nft was seen every few seconds, corresponding to each new nft invocation from the running script, but no "pile" of instances was observed (I also verified this by checking for nft in ps, and found only one such process every few seconds).
cnv-bridge was not seen at all.
CPU usage remained stable and did not rise to the level described in the BZ description (90%). For example, ~25-30 minutes after shutting down the VMs (while the nft script was still running), the CPU usage was ~16%.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory (OpenShift Virtualization 4.13.1 Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2023:3686

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days
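
The first mitigation planned in this report, putting a timeout on `nft` calls so that the caller fails instead of waiting on an unresponsive `nft`, can be illustrated with a minimal shell sketch. This is not the code shipped in the fix. The helper name `run_with_timeout` and the time limits are assumptions made for illustration, and the sketch relies on the GNU coreutils `timeout` command, which exits with status 124 when the limit expires. `sleep` stands in for `nft` so the example runs without root or nftables.

```shell
#!/usr/bin/env bash
# Sketch only: bound the run time of a command so a slow `nft` cannot
# block its caller indefinitely. run_with_timeout is a hypothetical
# helper name, not part of cnv-bridge.

run_with_timeout() {
    local seconds="$1"
    shift
    local rc=0
    # GNU coreutils `timeout` sends SIGTERM when the limit expires
    # and then exits with status 124.
    timeout "${seconds}" "$@" || rc=$?
    if [ "${rc}" -eq 124 ]; then
        echo "command timed out after ${seconds}s: $*" >&2
    fi
    return "${rc}"
}

# Stand-in demo: `sleep 3` plays the role of an unresponsive nft call.
run_with_timeout 1 sleep 3 || echo "caller saw failure, exit status: $?"
```

On a node, the same helper could wrap a real call such as `run_with_timeout 10 nft list ruleset`; the 10-second limit is an arbitrary choice, and the caller decides how to report the failure.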
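
The periodic check in step 7 of the verification (counting `nft` and `cnv-bridge` instances) was done by hand with top and ps. A small sketch of how the same sampling could be scripted follows. The helper names `count_comm` and `sample_once` and the sampling interval are hypothetical; `ps -eo comm=` prints one command name per process without a header.

```shell
#!/usr/bin/env bash
# Sketch only: count how many running processes have a given command name.
# count_comm and sample_once are hypothetical helper names.

# Reads one command name per line on stdin and prints how many lines
# are exactly equal to the name given as the first argument.
count_comm() {
    # -c prints the match count, -x matches whole lines only,
    # -F treats the name as a literal string.
    # grep exits 1 when the count is 0, so that status is masked.
    grep -c -x -F -- "$1" || true
}

# Prints one line such as "nft=1 cnv-bridge=0" from the live process table.
sample_once() {
    local procs
    procs="$(ps -eo comm=)"
    printf 'nft=%s cnv-bridge=%s\n' \
        "$(printf '%s\n' "${procs}" | count_comm nft)" \
        "$(printf '%s\n' "${procs}" | count_comm cnv-bridge)"
}

# Example use on the node (interval chosen arbitrarily):
#   while true; do date; sample_once; sleep 120; done
```

A steadily growing `nft` or `cnv-bridge` count across samples would correspond to the pile-up described in this bug, while a count that stays at 0 or 1 matches the verified behaviour.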