Description of problem:

Scenario:
- Two virtual machines for load generation and reception, running on different compute nodes.
- When both VMs (sender and receiver) are on the same compute node, 100% throughput capacity is reached for the larger packet sizes and packet drop is almost none.
- The traffic flow is unidirectional, i.e. from one VM (generation) towards the second VM (reception).
- The emulator thread policy is set to isolated, with the NICs aligned on the same NUMA node.
- The IsolCPUs used for the PMD threads and for the VMs are kept separate, so no PMD thread core is shared for handling interrupts.
- The CPUs handling OS processes are separate from the IsolCPUs.
- DPDK socket memory for NUMA node 1 was set to 8192 MB. This tuning gave the best results, but 100% throughput capacity is still not reached and packet loss is still present.
- BIOS settings appear to be aligned with our recommendations [1].

The customer is observing consistent packet drop at the physical DPDK ports, while the drop rate across different packet sizes is inconsistent, so the overall result is not as expected. After applying different tuning parameters the throughput improved considerably but is still below target, and the inconsistent packet drop across packet sizes (64, 1024, 2048 and 8192 bytes) is still observed.

The PMD threads are running on the same NUMA node, and the vCPUs allocated to the VMs are also from that NUMA node.

The following tunings were made (an illustrative command sketch follows after the Steps to Reproduce section):
1. Changing the Rx & Tx ring sizes of the physical DPDK ports (no effective improvement)
2. Changing the number of Rx & Tx queues (no effective improvement)
3. Isolating the emulator threads (throughput increased)
4. Moving the NIC alignment from a different NUMA node to the same NUMA node (throughput increased, with inconsistent packet loss across packet sizes)

From our end we tried some other approaches:
- The DPDK userspace bridge had its link up; it was then set to 'down' [2].
- A 2x improvement in forwarding rate was seen for packet sizes of 1024 bytes and above between the 1st and 2nd iteration.
- The same improvement was not seen for 64-byte packets.
- The bond mode on the compute nodes was changed from balance-tcp to balance-slb.

We do not see any saturation of CPU resources for the PMD threads.

--------------------------------------------------------------------------------------------------------------------------------------------
Compute 0:

sos_commands/openvswitch/ovs-appctl_dpif-netdev.pmd-rxq-show

pmd thread numa_id 1 core_id 3:
  isolated : false
  port: vhudcbc8718-7b  queue-id: 0  pmd usage: 28 %

Since this instance is running on NUMA 1, the PMD polling its port is also on NUMA 1 (core_id 3), and the instance is sending traffic over a port from the provider network mapped to the ovs-dpdk bridge on NUMA 1.

Compute 1:

sos_commands/openvswitch/ovs-appctl_dpif-netdev.pmd-rxq-show

pmd thread numa_id 1 core_id 3:
  isolated : false
  port: dpdk3  queue-id: 0  pmd usage: 16 %
--------------------------------------------------------------------------------------------------------------------------------------------

[1] https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/13/html-single/network_functions_virtualization_planning_and_configuration_guide/index#review_bios_settings
[2] https://access.redhat.com/solutions/3381011

Version-Release number of selected component (if applicable):
- openvswitch-2.9.0-103.el7fdp.x86_64

How reproducible:
Repeatedly

Steps to Reproduce:
1.
2.
3.
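For reference, a minimal sketch of the OVS and Nova knobs behind the tunings listed in the description. This is an assumption of how they were applied, not taken from the case: the interface, bond, bridge, and flavor names below are placeholders (dpdk3 appears in the Compute 1 output but is used here only as an example), the NUMA 0 socket-memory value is illustrative, and in an OSP 13 deployment these settings would normally be driven through the director templates rather than set by hand.

  # 1. Rx/Tx descriptor ring sizes on a physical DPDK port
  ovs-vsctl set Interface dpdk3 options:n_rxq_desc=2048 options:n_txq_desc=2048

  # 2. Number of Rx queues polled by the PMD threads on that port
  ovs-vsctl set Interface dpdk3 options:n_rxq=2

  # 3. Emulator thread isolation, set per flavor
  openstack flavor set nfv-flavor --property hw:emulator_threads_policy=isolate

  # Per-NUMA DPDK socket memory; 8192 MB on NUMA node 1 as in the scenario above
  ovs-vsctl set Open_vSwitch . other_config:dpdk-socket-mem="4096,8192"

  # Bond mode change from balance-tcp to balance-slb
  ovs-vsctl set Port dpdkbond0 bond_mode=balance-slb

  # Userspace bridge link set to 'down' as per [2]
  ip link set dev br-link down

After each change to the DPDK port options, the rxq-to-PMD assignment can be confirmed with ovs-appctl dpif-netdev/pmd-rxq-show, as captured in the sosreport output above.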
Actual results:


Expected results:
The same results between different compute nodes as were obtained in the same-compute-node case.

Additional info:
The logs are on supportshell under /cases/02694622
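To cross-check the "no PMD saturation" observation on a live system, the standard ovs-appctl commands below can be used (the sosreport files quoted in the description are their static equivalent):

  # Current rxq-to-PMD distribution and per-queue PMD usage
  ovs-appctl dpif-netdev/pmd-rxq-show

  # Clear and re-sample PMD cycle counters to compare processing vs. idle cycles per core
  ovs-appctl dpif-netdev/pmd-stats-clear
  ovs-appctl dpif-netdev/pmd-stats-show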