Bug 1401436

Summary: lockless en-queuing for vhost
Product: Red Hat Enterprise Linux 7 Reporter: jason wang <jasowang>
Component: kernel Assignee: Wei <wexu>
kernel sub component: Virtualization QA Contact: Quan Wenli <wquan>
Status: CLOSED ERRATA Docs Contact: Jiri Herrmann <jherrman>
Severity: unspecified    
Priority: high CC: ailan, chayang, juzhang, michen, mtessun, pezhang, weliao, wexu, wquan, yama
Version: 7.4 Keywords: FutureFeature
Target Milestone: rc   
Target Release: 7.4   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: kernel-3.10.0-628.el7 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-08-02 04:53:19 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1395265    

Description jason wang 2016-12-05 09:16:22 UTC
Description of problem:

The following patches need to be backported:

commit 04b96e5528ca97199b429810fe963185a67dd40e
Author: Jason Wang <jasowang>
Date:   Mon Apr 25 22:14:33 2016 -0400

    vhost: lockless enqueuing
    
    We use spinlock to synchronize the work list now which may cause
    unnecessary contentions. So this patch switch to use llist to remove
    this contention. Pktgen tests shows about 5% improvement:
    
    Before:
    ~1300000 pps
    After:
    ~1370000 pps
    
    Signed-off-by: Jason Wang <jasowang>
    Reviewed-by: Michael S. Tsirkin <mst>
    Signed-off-by: Michael S. Tsirkin <mst>
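
The idea behind the commit above can be illustrated outside the kernel. The following is a minimal userspace sketch using C11 atomics, not the vhost code itself: the names `work_node`, `work_list`, `work_list_add`, `work_list_del_all`, and `work_list_reverse` are hypothetical and only loosely modeled on the kernel's llist helpers. Producers push with a compare-and-swap loop instead of taking a spinlock, and the consumer detaches the whole list with one atomic exchange.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical types; loosely modeled on the kernel's llist, not copied from it. */
struct work_node {
    struct work_node *next;
    int id;
};

struct work_list {
    _Atomic(struct work_node *) first;
};

/* Producer side: push with a compare-and-swap loop, no lock is taken. */
static void work_list_add(struct work_list *list, struct work_node *node)
{
    assert(node != NULL);
    struct work_node *old = atomic_load(&list->first);
    do {
        /* On CAS failure, `old` is refreshed with the current head. */
        node->next = old;
    } while (!atomic_compare_exchange_weak(&list->first, &old, node));
}

/* Consumer side: detach every queued node in one atomic exchange. */
static struct work_node *work_list_del_all(struct work_list *list)
{
    return atomic_exchange(&list->first, (struct work_node *)NULL);
}

/* Pushes build a LIFO chain, so reverse it to recover queueing order. */
static struct work_node *work_list_reverse(struct work_node *head)
{
    struct work_node *prev = NULL;
    while (head) {
        struct work_node *next = head->next;
        head->next = prev;
        prev = head;
        head = next;
    }
    return prev;
}
```

In this sketch a worker would call `work_list_del_all()`, pass the result through `work_list_reverse()`, and then run the nodes in order. Because the producers never wait on a lock held by the worker, the contention described in the commit message does not arise on the enqueue path.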

commit 7235acdb1144460d9f520f0d931f3cbb79eb244c
Author: Jason Wang <jasowang>
Date:   Mon Apr 25 22:14:32 2016 -0400

    vhost: simplify work flushing
    
    We used to implement the work flushing through tracking queued seq,
    done seq, and the number of flushing. This patch simplify this by just
    implement work flushing through another kind of vhost work with
    completion. This will be used by lockless enqueuing patch.
    
    Signed-off-by: Jason Wang <jasowang>
    Reviewed-by: Michael S. Tsirkin <mst>
    Signed-off-by: Michael S. Tsirkin <mst>
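
The flushing scheme described in the commit above can be sketched as follows. This is a simplified, single-threaded illustration under stated assumptions, not the vhost implementation: all names (`work`, `flush_work`, `work_queue`, `queue_work`, `queue_flush`, `run_pending`, `counting_work`) are hypothetical, and a plain `bool` stands in for the kernel's completion object, which a real caller would block on while a worker thread runs the queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct work;
typedef void (*work_fn_t)(struct work *work);

struct work {
    struct work *next;
    work_fn_t fn;
};

/* A flush request is an ordinary work item plus a completion flag. */
struct flush_work {
    struct work work;   /* kept first so the cast in flush_fn() is valid */
    bool done;          /* assumption: stand-in for a kernel completion */
};

/* Example payload work used to show ordering; hypothetical. */
struct counting_work {
    struct work work;   /* kept first so the cast in count_fn() is valid */
    int *counter;
};

struct work_queue {
    struct work *head;
    struct work *tail;
};

static void queue_work(struct work_queue *q, struct work *w)
{
    assert(w != NULL && w->fn != NULL);
    w->next = NULL;
    if (q->tail)
        q->tail->next = w;
    else
        q->head = w;
    q->tail = w;
}

static void count_fn(struct work *w)
{
    struct counting_work *c = (struct counting_work *)w;
    (*c->counter)++;
}

static void flush_fn(struct work *w)
{
    struct flush_work *f = (struct flush_work *)w;
    f->done = true;     /* everything queued before the flush has now run */
}

/* Flushing needs no sequence counters: queue one more work item. */
static void queue_flush(struct work_queue *q, struct flush_work *f)
{
    f->done = false;
    f->work.fn = flush_fn;
    queue_work(q, &f->work);
}

/* Worker: detach the pending items and run them in FIFO order. */
static void run_pending(struct work_queue *q)
{
    struct work *w = q->head;
    q->head = q->tail = NULL;
    while (w) {
        struct work *next = w->next;
        w->fn(w);
        w = next;
    }
}
```

Because the worker runs items in queueing order, the flush item completing implies that every item queued before it has already run. That ordering is what lets the flush avoid tracking queued and done sequence numbers.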

Testing:
- compare the pktgen/netperf performance


Comment 1 Wei 2017-03-15 10:18:34 UTC
Downstream test result on my laptop:
Before:
tap2 RX  1564831 pkts/s RX Dropped: 0 pkts/s
tap1 TX  2180650 pkts/s TX Dropped: 1677842 pkts/s

After:
tap2 RX  1582509 pkts/s RX Dropped: 0 pkts/s
tap1 TX  2232357 pkts/s TX Dropped: 1702915 pkts/s

Comment 3 Rafael Aquini 2017-03-25 00:27:16 UTC
Patch(es) committed on kernel repository and an interim kernel build is undergoing testing

Comment 5 Rafael Aquini 2017-03-27 13:16:42 UTC
Patch(es) available on kernel-3.10.0-628.el7

Comment 8 Wei 2017-04-24 15:43:48 UTC
Hi Jiri,
I had a quick look at the previous release notes for RHEL 7.0. Since this BZ is a performance improvement rather than a new feature, it should be kept out of the release notes, AFAICT.

Comment 9 xiywang 2017-05-23 02:29:14 UTC
Hi Wenli,

Could you help run the performance test?

Thanks,
Xiyue

Comment 10 Quan Wenli 2017-05-24 08:17:52 UTC
(In reply to xiywang from comment #9)
> Hi Wenli,
> 
> Could you help run the performance test?
> 
> Thanks,
> Xiyue

PPS performance does improve with kernel-3.10.0-628, so I am setting this bug to VERIFIED.

Steps:
1. Run pktgen.sh on the tap0 device.
2. Gather the pps result on the guest.


host kernel        | pkts/s
-------------------+---------
kernel-3.10.0-627  | 1130227
kernel-3.10.0-628  | 1166072
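
For reference, the relative gain implied by the two figures above is (1166072 - 1130227) / 1130227, or roughly 3.2%. A small helper for this kind of before/after comparison might look like the sketch below; the function name `pps_gain_percent` is hypothetical and not part of any test tooling mentioned in this bug.

```c
#include <assert.h>

/* Relative change from `before` to `after`, expressed in percent. */
static double pps_gain_percent(double before, double after)
{
    assert(before > 0.0);   /* a zero baseline has no meaningful ratio */
    return (after - before) / before * 100.0;
}
```

Calling `pps_gain_percent(1130227.0, 1166072.0)` gives the figure for the two host kernels compared here. The same helper applied to the upstream commit's numbers (~1300000 and ~1370000 pps) gives a little over 5%, in line with the "about 5%" stated in the commit message.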

Comment 13 errata-xmlrpc 2017-08-02 04:53:19 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:1842