Bug 1360431

Summary: OVS Remove PMD latency on seq_mutex
Product: Red Hat Enterprise Linux 7 Reporter: Flavio Leitner <fleitner>
Component: openvswitch Assignee: Flavio Leitner <fleitner>
Status: CLOSED CURRENTRELEASE QA Contact: Jean-Tsung Hsiao <jhsiao>
Severity: high Docs Contact:
Priority: high    
Version: 7.3 CC: atheurer, atragler, fleitner, jhsiao, jniu, xzhou
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: openvswitch-2.5.0-7.git20160727.el7fdb Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-01-12 17:50:57 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1342229    

Description Flavio Leitner 2016-07-26 17:27:59 UTC
Description of problem:

Current openvswitch-2.5.0-5.git20160628.el7fdb causes latency spikes on seq_mutex.

The following upstream patch fixes the issue:
commit 2e6a1eae96615cab458757d62c95ce9993df7202
Author: Flavio Leitner <fbl>
Date:   Tue Jul 5 10:33:38 2016 -0300

    dpif-netdev: Remove PMD latency on seq_mutex
    
    The PMD thread needs to keep processing RX queues in order
    to achieve maximum throughput. It also needs to sweep emc
    cache and quiesce which use seq_mutex. That mutex can
    eventually block the PMD thread causing latency spikes and
    affecting the throughput.
    
    Since there is no requirement for running those tasks at a
    specific time, this patch extend seq API to allow tentative
    locking instead.
    
    Reported-by: Karl Rister <krister>
    Co-authored-by: Karl Rister <krister>
    Signed-off-by: Flavio Leitner <fbl>
    Signed-off-by: Daniele Di Proietto <diproiettod>
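
For reference, the upstream change can be inspected directly in an Open vSwitch source tree using the commit hash above (any clone of the upstream repository works; shown only as a convenience):

   # git clone https://github.com/openvswitch/ovs.git
   # cd ovs
   # git show 2e6a1eae96615cab458757d62c95ce9993df7202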

Comment 4 Flavio Leitner 2016-08-01 12:57:07 UTC
*** Bug 1299635 has been marked as a duplicate of this bug. ***

Comment 7 Flavio Leitner 2016-09-09 17:43:56 UTC
Hi Jean,

On an OVS-DPDK Phy-to-Phy environment, measure the packet latency.
Before the patch you will see significant latency spikes; after the patch there should be none.

https://github.com/fleitner/XenaPythonLib/blob/master/examples/latency.py

Comment 8 Jean-Tsung Hsiao 2016-09-12 14:20:41 UTC
(In reply to Flavio Leitner from comment #7)
> Hi Jean,
> 
> On an OVS-DPDK Phy-to-Phy environment, measure the packet latency.
> Before the patch you will see significant latency spikes; after the patch
> there should be none.
> 
> https://github.com/fleitner/XenaPythonLib/blob/master/examples/latency.py

Hi Flavio,

Please see the numbers attached below.

There is not much difference between ovs-2.5.0-7 and ovs-2.5.0-5.

Each has a 500+ us max latency in its first test, but the average latency is about the same.

Please advise.

Thanks!

Jean

== ovs-2.5.0-5 ==

Test 1
Mon Sep 12 00:15:40 EDT 2016
INFO:root:Packet: Using sample packet
INFO:root:XenaSocket: Connected
INFO:root:XenaManager: Logged succefully
INFO:root:XenaPort: 1/0 starting traffic
INFO:root:XenaPort: 1/0 stopping traffic
Average latency: 3599.00 ns
Max latency: 530192.00 ns
Min latency: 0.00 ns
Test 2
Mon Sep 12 00:21:11 EDT 2016
INFO:root:Packet: Using sample packet
INFO:root:XenaSocket: Connected
INFO:root:XenaManager: Logged succefully
INFO:root:XenaPort: 1/0 starting traffic
INFO:root:XenaPort: 1/0 stopping traffic
Average latency: 3593.00 ns
Max latency: 21485.00 ns
Min latency: 0.00 ns
Test 3
Mon Sep 12 00:26:44 EDT 2016
INFO:root:Packet: Using sample packet
INFO:root:XenaSocket: Connected
INFO:root:XenaManager: Logged succefully
INFO:root:XenaPort: 1/0 starting traffic
INFO:root:XenaPort: 1/0 stopping traffic
Average latency: 3591.00 ns
Max latency: 21926.00 ns
Min latency: 0.00 ns
 

== ovs-2.5.0-7 ==

Test 1
Mon Sep 12 00:40:28 EDT 2016
INFO:root:Packet: Using sample packet
INFO:root:XenaSocket: Connected
INFO:root:XenaManager: Logged succefully
INFO:root:XenaPort: 1/0 starting traffic
INFO:root:XenaPort: 1/0 stopping traffic
Average latency: 3604.00 ns
Max latency: 566185.00 ns
Min latency: 0.00 ns
Test 2
Mon Sep 12 00:46:00 EDT 2016
INFO:root:Packet: Using sample packet
INFO:root:XenaSocket: Connected
INFO:root:XenaManager: Logged succefully
INFO:root:XenaPort: 1/0 starting traffic
INFO:root:XenaPort: 1/0 stopping traffic
Average latency: 3607.00 ns
Max latency: 18767.00 ns
Min latency: 0.00 ns
Test 3
Mon Sep 12 00:51:32 EDT 2016
INFO:root:Packet: Using sample packet
INFO:root:XenaSocket: Connected
INFO:root:XenaManager: Logged succefully
INFO:root:XenaPort: 1/0 starting traffic
INFO:root:XenaPort: 1/0 stopping traffic
Average latency: 3604.00 ns
Max latency: 15183.00 ns
Min latency: 0.00 ns
[root@netqe-infra01 XenaScripts.old]#


[root@netqe5 dpdk-multique-scripts]# cat ovs_config_dpdk0_dpdk1.sh
ovs-vsctl --if-exists del-br ovsbr0
ovs-vsctl set Open_vSwitch . other_config={}
ovs-vsctl set Open_vSwitch . other_config:n-dpdk-rxqs=4
ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0xaa0000
ovs-vsctl add-br ovsbr0 -- set bridge ovsbr0 datapath_type=netdev
ovs-vsctl add-port ovsbr0 dpdk0  \
    -- set interface dpdk0 type=dpdk ofport_request=10
ovs-vsctl add-port ovsbr0 dpdk1 \
    -- set interface dpdk1 type=dpdk ofport_request=11

ovs-ofctl del-flows ovsbr0
ovs-ofctl add-flow ovsbr0 in_port=10,actions=output:11
ovs-ofctl add-flow ovsbr0 in_port=11,actions=output:10
ovs-ofctl dump-flows ovsbr0

echo 0 > /proc/sys/kernel/watchdog

[root@netqe5 jhsiao]# cat set_non_pmd_cpu_range.sh
NON_PMD_CPU_RANGE="13,15"
OVS_PID="$(pidof ovs-vswitchd)"
for pid in $(ps -e -T | grep ${OVS_PID} | grep -v 'pmd' | awk '{ print $2 }')
do
    taskset -p -c ${NON_PMD_CPU_RANGE} ${pid}
done
 
tuna -t ovs-vswitchd -CP
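
Before pushing traffic it can also help to confirm that the bridge and DPDK ports from ovs_config_dpdk0_dpdk1.sh came up and that the PMD threads are polling. Standard OVS commands, listed here only as a convenience:

   # ovs-vsctl show
   # ovs-appctl dpif-netdev/pmd-stats-show

The second command should list each pmd thread along with its polling and processing statistics.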

Comment 9 Jean-Tsung Hsiao 2016-09-13 04:54:41 UTC
*** Verification Procedure ***

As the latency can be impacted by many factors, Flavio has suggested the following verification procedure:

The idea is to get the setup ready, then attach perf to record the
number of scheduling events:

1. Get everything ready

2. Push traffic and wait a couple of seconds for it to stabilize

3. Attach perf to a PMD CPU (see the note after this list for deriving the PMD CPUs from pmd-cpu-mask)
   # perf record -e sched:sched_switch -g -C <PMD CPU> sleep 10

4. perf will exit in 10 seconds

5. Review the report:
   # perf report -g  --no-children --stdio
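
The PMD CPUs for the -C argument follow from the pmd-cpu-mask set in ovs_config_dpdk0_dpdk1.sh above: 0xaa0000 has bits 17, 19, 21 and 23 set, so the PMD threads run on CPUs 17, 19, 21 and 23. They can also be read off the running threads; the commands below are only a convenience, not part of the original procedure:

   # ps -T -p $(pidof ovs-vswitchd) -o spid,comm,psr | grep pmd
   # perf record -e sched:sched_switch -g -C 17 sleep 10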


You have reproduced the problem if you see this backtrace:
       50.00%  pmd36    [kernel.kallsyms]  [k] __schedule
              |
              ---__schedule
                 schedule
                 futex_wait_queue_me
                 futex_wait
                 do_futex
                 sys_futex
                 tracesys
                 __lll_lock_wait
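
When checking several PMD CPUs it can be quicker to grep each report for the mutex wait symbol at the bottom of the backtrace; a non-zero count in a report recorded on a PMD CPU means a thread there blocked waiting on a mutex. This is only a shortcut, not part of the procedure above:

   # perf report -g --no-children --stdio | grep -c __lll_lock_wait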


*** Verification ***

Ran "python latency.py" with ovs-2.5.0-5 saw the mutex trace for all pmd cores.

But, with  ovs-2.5.0-7 there were no such trace for all pmd cores.

Thus, the fix has been verified.