Description of problem: I have configured RHOSP 16.1 with OVN-DPDK and am observing that the PMD cores spend most of their time in kernel space. A reboot sometimes resolves this, although overall utilization then appears lower. The PMD cores move from user to kernel space very frequently, and as a result NIC throughput for OVN-DPDK is severely degraded.

~~~
[root@xxxxxxxxx-computeovsdpdk-0 heat-admin]# pidstat -t -p `pidof ovs-vswitchd`
Linux 4.18.0-305.12.1.el8_4.x86_64 (overcloud-computeovsdpdk-0)  01/31/2022  _x86_64_  (128 CPU)

09:32:11 AM   UID      TGID       TID    %usr %system  %guest   %wait    %CPU   CPU  Command
09:32:11 AM   993      2952         -  107.77  972.87    0.00    0.11 1080.64     0  ovs-vswitchd
09:32:11 AM   993         -      2952    0.79    0.23    0.00    0.11    1.02     0  |__ovs-vswitchd
09:32:11 AM   993         -      2953    0.00    0.00    0.00    0.00    0.00    56  |__eal-intr-thread
09:32:11 AM   993         -      2954    0.00    0.00    0.00    0.00    0.00   127  |__rte_mp_handle
09:32:11 AM   993         -      2955    0.00    0.00    0.00    0.00    0.00    32  |__lcore-worker-32
09:32:11 AM   993         -      2956    0.00    0.00    0.00    0.00    0.00    64  |__lcore-worker-64
09:32:11 AM   993         -      2957    0.00    0.00    0.00    0.00    0.00    96  |__lcore-worker-96
09:32:11 AM   993         -      2958    0.00    0.00    0.00    0.00    0.00   117  |__ovs-vswitchd
09:32:11 AM   993         -      2959    0.04    0.00    0.00    0.00    0.04     0  |__dpdk_watchdog1
09:32:11 AM   993         -      2961    0.01    0.22    0.00    0.02    0.23     0  |__urcu2
09:32:11 AM   993         -      2962    0.00    0.01    0.00    0.00    0.01     0  |__ct_clean4
09:32:11 AM   993         -      2963    0.00    0.00    0.00    0.00    0.00     0  |__ipf_clean3
09:32:11 AM   993         -      2968    0.00    0.00    0.00    0.00    0.00     0  |__handler6
09:32:11 AM   993         -      2969    0.06    0.12    0.00    0.00    0.18     0  |__revalidator5
09:32:11 AM   993         -      2972   11.96   87.92    0.00    0.00   99.89     6  |__pmd-c06/id:14
09:32:11 AM   993         -      2973   11.75   84.19    0.00    0.00   95.95     9  |__pmd-c09/id:13
09:32:11 AM   993         -      2974    1.84   98.08    0.00    0.00   99.92    11  |__pmd-c11/id:11
09:32:11 AM   993         -      2975   11.85   84.09    0.00    0.00   95.95     7  |__pmd-c07/id:12
09:32:11 AM   993         -      2976   11.88   88.06    0.00    0.00   99.93     8  |__pmd-c08/id:8
09:32:11 AM   993         -      2977   13.56   86.32    0.00    0.00   99.88     2  |__pmd-c02/id:10
09:32:11 AM   993         -      2978   13.54   86.37    0.00    0.00   99.91     4  |__pmd-c04/id:7
09:32:11 AM   993         -      2979   13.45   82.50    0.00    0.00   95.95     3  |__pmd-c03/id:9
09:32:11 AM   993         -      2980    1.76   94.18    0.00    0.00   95.95    10  |__pmd-c10/id:16
09:32:11 AM   993         -      2981    1.84   98.03    0.00    0.00   99.88    12  |__pmd-c12/id:15
09:32:11 AM   993         -      2982   13.41   82.54    0.00    0.00   95.95     5  |__pmd-c05/id:17
09:32:11 AM   993         -      2983    0.00    0.01    0.00    0.00    0.01    58  |__vhost_reconn
09:32:11 AM   993         -      2984    0.00    0.01    0.00    0.00    0.02    95  |__vhost-events
~~~

After reboot:

~~~
[root@xxxxxxxxx-computeovsdpdk-0 heat-admin]# pidstat -t -p `pidof ovs-vswitchd`
Linux 4.18.0-305.12.1.el8_4.x86_64 (overcloud-computeovsdpdk-0)  02/02/2022  _x86_64_  (128 CPU)

10:49:20 AM   UID      TGID       TID    %usr %system  %guest   %wait    %CPU   CPU  Command
10:49:20 AM   993      3007         -  906.94    1.16    0.00    0.04  908.10     0  ovs-vswitchd
10:49:20 AM   993         -      3007    1.32    1.08    0.00    0.04    2.41     0  |__ovs-vswitchd
10:49:20 AM   993         -      3008    0.00    0.00    0.00    0.00    0.00   124  |__eal-intr-thread
10:49:20 AM   993         -      3009    0.00    0.00    0.00    0.00    0.00    61  |__rte_mp_handle
10:49:20 AM   993         -      3010    0.00    0.00    0.00    0.00    0.00    32  |__lcore-worker-32
10:49:20 AM   993         -      3011    0.00    0.00    0.00    0.00    0.00    64  |__lcore-worker-64
10:49:20 AM   993         -      3012    0.00    0.00    0.00    0.00    0.00    96  |__lcore-worker-96
10:49:20 AM   993         -      3013    0.00    0.00    0.00    0.00    0.00   117  |__ovs-vswitchd
10:49:20 AM   993         -      3014    0.04    0.00    0.00    0.00    0.04     0  |__dpdk_watchdog1
10:49:20 AM   993         -      3016    0.01    0.03    0.00    0.01    0.04     0  |__urcu2
10:49:20 AM   993         -      3017    0.00    0.00    0.00    0.00    0.00     0  |__ct_clean4
10:49:20 AM   993         -      3018    0.00    0.00    0.00    0.00    0.00     0  |__ipf_clean3
10:49:20 AM   993         -      3023    0.00    0.00    0.00    0.01    0.00     0  |__handler6
10:49:20 AM   993         -      3025    0.03    0.00    0.00    0.01    0.03     0  |__revalidator5
10:49:20 AM   993         -      3029   89.84    0.00    0.00    0.01   89.84     6  |__pmd-c06/id:16
10:49:20 AM   993         -      3030   70.80    0.00    0.00    0.00   70.80     9  |__pmd-c09/id:15
10:49:20 AM   993         -      3031   92.98    0.01    0.00    0.00   92.99    11  |__pmd-c11/id:14
10:49:20 AM   993         -      3032   92.98    0.00    0.00    0.00   92.99     7  |__pmd-c07/id:9
10:49:20 AM   993         -      3033   74.31    0.00    0.00    0.00   74.31     8  |__pmd-c08/id:11
10:49:20 AM   993         -      3034   92.98    0.01    0.00    0.00   92.99     2  |__pmd-c02/id:12
10:49:20 AM   993         -      3035   86.28    0.00    0.00    0.00   86.28     4  |__pmd-c04/id:7
10:49:20 AM   993         -      3036   70.79    0.00    0.00    0.00   70.79     3  |__pmd-c03/id:10
10:49:20 AM   993         -      3037   92.98    0.01    0.00    0.00   92.99    10  |__pmd-c10/id:8
10:49:20 AM   993         -      3038   70.80    0.00    0.00    0.00   70.80    12  |__pmd-c12/id:13
10:49:20 AM   993         -      3039   70.79    0.00    0.00    0.00   70.79     5  |__pmd-c05/id:17
10:49:20 AM   993         -      3040    0.00    0.00    0.00    0.00    0.00    30  |__vhost_reconn
10:49:20 AM   993         -      3041    0.00    0.01    0.00    0.00    0.01    31  |__vhost-events
~~~
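The two snapshots show the symptom directly: before the reboot nearly all PMD time is charged to %system (kernel space), whereas after the reboot it is charged to %usr, as expected for busy-polling PMD threads. To catch the bad state without reading the whole table, something like the following can be used. This is a minimal sketch added for illustration, not from the report; the 10% threshold is an arbitrary assumption, and the field positions assume the 12-hour-clock pidstat layout shown above:

~~~
#!/bin/bash
# Flag ovs-vswitchd PMD threads whose kernel-space (%system) share is high.
# A healthy PMD thread busy-polls in user space, so %system should be ~0.
THRESHOLD=${1:-10}   # percent; arbitrary cut-off for this sketch

pidstat -t -p "$(pidof ovs-vswitchd)" | awk -v thr="$THRESHOLD" '
    # Columns (12h locale): time AM UID TGID TID %usr %system %guest %wait %CPU CPU Command
    $NF ~ /pmd-c/ && ($7 + 0) > thr {
        printf "WARN: %s on CPU %s: %.2f%% system time\n", $NF, $(NF-1), $7
    }'
~~~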
During further troubleshooting we noticed that the DPDK bond member ports were flapping:

~~~
less ovs-vswitchd.log | egrep "dpdk.*link"
2022-01-29T19:20:17.905Z|00019|dpdk|INFO|EAL: Detected static linkage of DPDK
2022-01-29T19:20:51.514Z|00016|dpdk|INFO|EAL: Detected static linkage of DPDK
2022-01-29T19:21:07.404Z|00116|bond|INFO|member dpdk0: link state up
2022-01-29T19:21:11.127Z|00119|bond|INFO|member dpdk1: link state up
2022-01-30T03:20:58.717Z|00016|dpdk|INFO|EAL: Detected static linkage of DPDK
2022-01-30T03:21:11.166Z|00229|bond|INFO|member dpdk0: link state down
2022-01-30T03:21:11.742Z|00233|bond|INFO|member dpdk0: link state up
2022-01-30T03:21:23.854Z|00279|bond|INFO|member dpdk1: link state up
2022-01-30T03:33:22.655Z|00016|dpdk|INFO|EAL: Detected static linkage of DPDK
2022-01-30T03:33:33.899Z|00235|bond|INFO|member dpdk0: link state down
2022-01-30T03:33:34.211Z|00239|bond|INFO|member dpdk0: link state up
2022-01-30T03:33:47.295Z|00289|bond|INFO|member dpdk1: link state up
2022-01-30T04:13:44.257Z|00593|bond|INFO|member dpdk1: link state down
2022-01-30T04:13:45.769Z|00595|bond|INFO|member dpdk1: link state up
2022-01-30T05:07:43.519Z|00016|dpdk|INFO|EAL: Detected static linkage of DPDK
2022-01-30T05:07:54.803Z|00235|bond|INFO|member dpdk0: link state down
2022-01-30T05:07:55.510Z|00239|bond|INFO|member dpdk0: link state up
2022-01-30T05:08:08.631Z|00262|bond|INFO|member dpdk1: link state up
2022-01-30T05:21:08.724Z|00016|dpdk|INFO|EAL: Detected static linkage of DPDK
2022-01-30T05:21:21.313Z|00226|bond|INFO|member dpdk0: link state down
2022-01-30T05:21:21.879Z|00230|bond|INFO|member dpdk0: link state up
2022-01-30T05:21:33.934Z|00247|bond|INFO|member dpdk1: link state up
2022-01-30T06:47:58.051Z|00016|dpdk|INFO|EAL: Detected static linkage of DPDK
2022-01-30T06:48:09.354Z|00236|bond|INFO|member dpdk0: link state down
2022-01-30T06:48:09.396Z|00238|bond|INFO|member dpdk0: link state up
2022-01-30T06:48:22.831Z|00263|bond|INFO|member dpdk1: link state up
2022-01-30T07:29:57.730Z|00016|dpdk|INFO|EAL: Detected static linkage of DPDK
2022-01-30T07:30:13.219Z|00234|bond|INFO|member dpdk0: link state down
2022-01-30T07:30:13.233Z|00236|bond|INFO|member dpdk0: link state up
2022-01-30T07:30:27.218Z|00261|bond|INFO|member dpdk1: link state up
2022-01-30T08:02:37.052Z|00588|bond|INFO|member dpdk1: link state down
2022-01-30T08:02:40.571Z|00590|bond|INFO|member dpdk1: link state up
2022-01-30T08:38:48.583Z|00592|bond|INFO|member dpdk0: link state down
2022-01-30T08:38:52.095Z|00595|bond|INFO|member dpdk0: link state up
2022-01-30T09:57:40.326Z|00016|dpdk|INFO|EAL: Detected static linkage of DPDK
2022-01-30T09:57:57.123Z|00228|bond|INFO|member dpdk0: link state down
2022-01-30T09:57:57.429Z|00232|bond|INFO|member dpdk0: link state up
2022-01-30T09:58:10.641Z|00249|bond|INFO|member dpdk1: link state up
2022-01-30T16:51:29.645Z|00584|bond|INFO|member dpdk1: link state down
2022-01-30T16:51:33.166Z|00586|bond|INFO|member dpdk1: link state up
2022-01-30T16:52:20.502Z|00588|bond|INFO|member dpdk0: link state down
2022-01-30T16:52:24.031Z|00591|bond|INFO|member dpdk0: link state up
2022-01-30T16:54:05.649Z|00593|bond|INFO|member dpdk1: link state down
2022-01-30T16:54:09.169Z|00596|bond|INFO|member dpdk1: link state up
2022-01-30T17:03:46.906Z|00598|bond|INFO|member dpdk0: link state down
2022-01-30T17:03:50.433Z|00601|bond|INFO|member dpdk0: link state up
2022-01-30T17:05:00.853Z|00603|bond|INFO|member dpdk1: link state down
2022-01-30T17:05:04.375Z|00606|bond|INFO|member dpdk1: link state up
2022-01-30T17:05:20.507Z|00608|bond|INFO|member dpdk0: link state down
2022-01-30T17:05:24.032Z|00611|bond|INFO|member dpdk0: link state up
2022-01-30T17:15:56.084Z|00613|bond|INFO|member dpdk1: link state down
2022-01-30T17:15:59.609Z|00616|bond|INFO|member dpdk1: link state up
2022-01-30T17:16:15.736Z|00618|bond|INFO|member dpdk0: link state down
2022-01-30T17:16:19.262Z|00621|bond|INFO|member dpdk0: link state up
2022-01-30T17:17:49.337Z|00623|bond|INFO|member dpdk0: link state down
2022-01-30T17:17:52.866Z|00625|bond|INFO|member dpdk0: link state up
2022-01-30T17:18:32.083Z|00627|bond|INFO|member dpdk1: link state down
2022-01-30T17:18:35.606Z|00630|bond|INFO|member dpdk1: link state up
2022-01-30T17:28:24.886Z|00632|bond|INFO|member dpdk1: link state down
2022-01-30T17:28:28.560Z|00634|bond|INFO|member dpdk1: link state up
2022-01-30T17:29:15.743Z|00636|bond|INFO|member dpdk0: link state down
2022-01-30T17:29:19.267Z|00639|bond|INFO|member dpdk0: link state up
2022-01-30T17:40:53.691Z|00641|bond|INFO|member dpdk1: link state down
2022-01-30T17:40:57.213Z|00644|bond|INFO|member dpdk1: link state up
2022-01-30T17:53:10.953Z|00646|bond|INFO|member dpdk0: link state down
2022-01-30T17:53:14.487Z|00649|bond|INFO|member dpdk0: link state up
2022-01-30T17:53:22.497Z|00651|bond|INFO|member dpdk1: link state down
2022-01-30T17:53:25.965Z|00654|bond|INFO|member dpdk1: link state up
2022-01-30T18:05:51.302Z|00656|bond|INFO|member dpdk1: link state down
2022-01-30T18:05:54.828Z|00658|bond|INFO|member dpdk1: link state up
2022-01-30T18:18:08.594Z|00660|bond|INFO|member dpdk0: link state down
2022-01-30T18:18:12.117Z|00663|bond|INFO|member dpdk0: link state up
2022-01-30T18:30:17.744Z|00665|bond|INFO|member dpdk1: link state down
2022-01-30T18:30:21.253Z|00668|bond|INFO|member dpdk1: link state up
2022-01-30T18:30:37.400Z|00670|bond|INFO|member dpdk0: link state down
2022-01-30T18:30:40.939Z|00673|bond|INFO|member dpdk0: link state up
2022-01-30T19:07:32.617Z|00675|bond|INFO|member dpdk0: link state down
2022-01-30T19:07:36.141Z|00677|bond|INFO|member dpdk0: link state up
2022-01-30T19:32:30.236Z|00679|bond|INFO|member dpdk0: link state down
2022-01-30T19:32:33.759Z|00681|bond|INFO|member dpdk0: link state up
2022-01-30T20:46:51.891Z|00683|bond|INFO|member dpdk0: link state down
2022-01-30T20:46:55.418Z|00685|bond|INFO|member dpdk0: link state up
2022-01-30T20:47:03.435Z|00687|bond|INFO|member dpdk1: link state down
2022-01-30T20:47:06.961Z|00690|bond|INFO|member dpdk1: link state up
2022-01-31T03:10:49.496Z|00692|bond|INFO|member dpdk1: link state down
2022-01-31T03:10:53.021Z|00694|bond|INFO|member dpdk1: link state up
2022-01-31T03:13:25.498Z|00696|bond|INFO|member dpdk1: link state down
2022-01-31T03:13:29.023Z|00698|bond|INFO|member dpdk1: link state up
2022-01-31T03:13:45.153Z|00700|bond|INFO|member dpdk0: link state down
2022-01-31T03:13:48.490Z|00703|bond|INFO|member dpdk0: link state up
~~~
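The flap frequency can be tallied straight from the log. A quick sketch, assuming the log sits at the usual host path (on a containerized deployment it may instead live under /var/log/containers/openvswitch/):

~~~
# Count link-state transitions per bond member; a steadily growing "down"
# count indicates real flapping rather than a one-off event.
grep 'link state down' /var/log/openvswitch/ovs-vswitchd.log \
    | awk '{print $2}' | sort | uniq -c
~~~

`ovs-appctl bond/show` additionally reports the bond's current view of each member's link state, which helps correlate these log entries with what the NIC itself reports.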
We need some help understanding this behavior and how we can make sure the PMD cores run in user space all, or at least most, of the time.

Version-Release number of selected component (if applicable):
puppet-ovn-15.5.0-2.20210601013539.a6b0f69.el8ost.2.noarch
[we updated the OVS version to the latest during a troubleshooting session]

Additional info:
PMD and VM cores are isolated properly.

Intel card details:
~~~
[root@xxxxxxxxx-computeovsdpdk-0 openvswitch]# lspci | grep -i ethernet
10:00.0 Ethernet controller: Intel Corporation Ethernet Controller E810-XXV for SFP (rev 02)
10:00.1 Ethernet controller: Intel Corporation Ethernet Controller E810-XXV for SFP (rev 02)
48:00.0 Ethernet controller: Intel Corporation Ethernet Controller E810-XXV for SFP (rev 02)
48:00.1 Ethernet controller: Intel Corporation Ethernet Controller E810-XXV for SFP (rev 02)
~~~
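Since the report states that PMD and VM cores are isolated properly, it is worth capturing the evidence for that alongside the pidstat data. A minimal checklist in shell form; these are standard commands, but the expected values depend on the deployment:

~~~
# Confirm that host-level isolation and OVS PMD pinning agree.
grep -o 'isolcpus=[^ ]*' /proc/cmdline                    # kernel-isolated cores, if set
tuned-adm active                                          # expect the cpu-partitioning profile
ovs-vsctl get Open_vSwitch . other_config:pmd-cpu-mask    # cores OVS pins PMD threads to
ovs-appctl dpif-netdev/pmd-rxq-show                       # rxq-to-PMD assignment
~~~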
(In reply to Sanjay Upadhyay from comment #31)
> (In reply to rohit londhe from comment #30)
> > (In reply to Sanjay Upadhyay from comment #27)
> > > (In reply to Chris Fields from comment #26)
> > > > Customer said this in the support case:
> > > >
> > > > I don't know of anything QE is doing for this bug. Sanjay, do you know of anything?
> > >
> > > No, we are not doing anything, unless there is a direction we should test. AFAIK, in our testing with OVN-DPDK, the performance results with the latest 16.1 compose (RHOS-16.1-RHEL-8-20220315) are a normal 1.4 Mpps on a DPDK Geneve tunnel.
> >
> > For which frame size is RHQE seeing 1.4 Mpps?
>
> Hi Rohit,
>
> We use 64-byte packets. The DUT is DPDK bound to a single NIC. I noticed that our DPDK tests run over a single NIC; we have not tested performance over a DPDK bond. The performance test on the NFV side was proposed long ago and may need to be updated.

(In reply to Sanjay Upadhyay from comment #29)
> > We run both balance-slb and balance-tcp on OVN deployments. Both bonds are ovs_dpdk_bond over ovs_user_bridge.
>
> This is only functional testing, and our DUT is not the bonded bridges.

The customer is using frame sizes of 512 bytes and above. Any thoughts on throughput performance?
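For context on the frame-size question: theoretical line rate drops steeply with frame size, so a 64-byte result is not directly comparable to 512-byte-and-up traffic. A back-of-envelope calculation added here for illustration, assuming a single 25GbE E810-XXV link (per the lspci output above) and the standard 20 bytes of per-frame wire overhead:

~~~
# Maximum packet rate vs frame size on a 25GbE link.
# Wire cost per frame = frame + 7B preamble + 1B SFD + 12B inter-frame gap.
for frame in 64 512 1518; do
    awk -v f="$frame" 'BEGIN {
        pps = 25e9 / ((f + 20) * 8)
        printf "%5d B frames: %6.2f Mpps line rate\n", f, pps / 1e6
    }'
done
# 64 B -> ~37.20 Mpps, 512 B -> ~5.87 Mpps, 1518 B -> ~2.03 Mpps
~~~

On that basis, 1.4 Mpps at 64 bytes is a small fraction of line rate, while the same packet rate at 512 bytes would already be roughly a quarter of what a single 25GbE wire can carry.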