Bug 1472854

Summary: [fdProd] Update OVS to 2.7.2
Product: Red Hat Enterprise Linux 7
Reporter: Timothy Redaelli <tredaelli>
Component: openvswitch
Assignee: Timothy Redaelli <tredaelli>
Status: CLOSED ERRATA
QA Contact: ovs-qe
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 7.4
CC: atelang, atragler, ctrautma, fleitner, ovs-team, pezhang, qding
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
URL: https://mail.openvswitch.org/pipermail/ovs-announce/2017-July/000239.html
Whiteboard:
Fixed In Version: openvswitch-2.7.2-1.git20170719.el7fdp
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-08-03 12:35:39 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Timothy Redaelli 2017-07-19 14:20:27 UTC
From ovs-announce mailing list:

"Due to an issue introduced in the previous release, it is highly recommended that users of 2.7.1 upgrade."

commit 9a03615624972142e750484950d7348190d68b8b
Author: Han Zhou <zhouhan>
Date:   Thu Jul 13 23:44:51 2017 -0700

    Revert "netdev: Fix netdev_open() to adhere to class type if given"
    
    This reverts commit d3b8f5052292b3ba9084ffed097e90b87f2950f5.
    
    The commit introduced a problem that "File exists" will be reported
    when trying to open br0.
    
    The operation that adds eth0 to br0 while moving IP address from
    eth0 to bridge internal interface br0 reproduces this issue.
    
    $ ip a del <ip> dev eth0; ip a add <ip> dev br0; ovs-vsctl add-port br0 eth0
    $ ovs-dpctl show
    ...
    port 1: br0 (internal: open failed (File exists))
    ...
    
    At this point restarting OVS will result in connection lost for the
    node.
    
    Reverting the change fixes the problem. Since adding physical interface
    to OVS bridge is quite normal operation, the problem is more severe
    than the original problem fixed by commit d3b8f5052, so revert this
    before a better fix is found for the original problem.
    
    Signed-off-by: Han Zhou <zhouhan>
    Signed-off-by: Justin Pettit <jpettit>
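
For reference, the scenario from the commit message can be re-checked with roughly the following sequence; this is only a sketch, and the interface names and address are placeholders for whatever the test host actually uses:

$ ip addr del 192.0.2.10/24 dev eth0      # placeholder address and NIC
$ ip addr add 192.0.2.10/24 dev br0
$ ovs-vsctl add-port br0 eth0

# On an affected 2.7.1 build the internal port fails to open:
$ ovs-dpctl show | grep 'open failed' || echo "br0 opened cleanly"

# With 2.7.1, restarting the daemons at this point cost the node its
# connectivity; with 2.7.2 the bridge should come back up normally.
$ systemctl restart openvswitch
$ ovs-dpctl show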

Comment 4 qding 2017-07-26 08:59:44 UTC
Verified with openvswitch-2.7.2-1.git20170719.el7fdp

Reference to https://bugzilla.redhat.com/show_bug.cgi?id=1451911#c10
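
A quick sanity check for anyone re-running this (package NVR taken from the Fixed In Version field above; the grep pattern matches the failure string quoted in the commit message):

$ rpm -q openvswitch
openvswitch-2.7.2-1.git20170719.el7fdp.x86_64
$ ovs-dpctl show | grep 'open failed' || echo "all datapath ports opened cleanly"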

Comment 5 Pei Zhang 2017-07-28 11:04:23 UTC
Live migration with single queue testing: PASS

Versions:
3.10.0-693.el7.x86_64
qemu-kvm-rhev-2.9.0-16.el7.x86_64
libvirt-3.2.0-14.el7.x86_64
openvswitch-2.7.2-1.git20170719.el7fdp.x86_64
tuned-profiles-cpu-partitioning-2.8.0-5.el7.noarch
dpdk-17.05-3.el7fdb.x86_64


==Scenario 1: live migration with vhost-user single queue==

Checked downtime, total time, and ping/MoonGen packet loss during live migration; looks good.

Host&Guest: 1G hugepage size
Guest: 8G memory 
MoonGen flow: 3Mpps/bidirectional
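
For context, the two modes below differ only in which side creates the vhost-user socket. A minimal sketch of the OVS side of each setup (bridge name, port names, and socket path are placeholders, not taken from the actual test configuration):

# OVS acts as vhost-user server: OVS creates the socket, QEMU connects to it
$ ovs-vsctl add-br ovsbr0 -- set bridge ovsbr0 datapath_type=netdev
$ ovs-vsctl add-port ovsbr0 vhost-user0 -- set Interface vhost-user0 type=dpdkvhostuser

# OVS acts as vhost-user client: QEMU creates the socket, OVS connects to it
$ ovs-vsctl add-port ovsbr0 vhost-user1 -- set Interface vhost-user1 \
    type=dpdkvhostuserclient options:vhost-server-path=/tmp/vhost-user1.sock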


(1) openvswitch acts as vhostuser server mode:
No Stream_Rate Downtime(ms) Totaltime(ms) Ping_Loss Moongen_Loss 
 0       3Mpps      157     12878          18      1121785 
 1       3Mpps      148     12855          17      1071197 
 2       3Mpps      156     13035          18      1129271 
 3       3Mpps      151     12893          16      1093942 
 4       3Mpps      159     12826          18      1138983 
 5       3Mpps      161     13065          18      1150769 
 6       3Mpps      165     12863          19      1179172 
 7       3Mpps      146     12888          17      1150652 
 8       3Mpps      165     12869          18      1179632 
 9       3Mpps      161     12896          18      1163880

(2) openvswitch acts as vhostuser client mode:
No Stream_Rate Downtime(ms) Totaltime(ms) Ping_Loss Moongen_Loss 
 0       3Mpps      154     15419          18      1135606 
 1       3Mpps      149     15199          16      1088184 
 2       3Mpps      159     14750          18      1147670 
 3       3Mpps      162     15580          18      1165776 
 4       3Mpps      160     15083          17      1154604 
 5       3Mpps      162     14918          18      1183521 
 6       3Mpps      155     15485          18      1123589 
 7       3Mpps      162     15329          18      1153911 
 8       3Mpps      163     14892          19      1173294 
 9       3Mpps      161     15719          18      1148970


==Scenario 2: live migration with vhost-user 2 queues==
Will be continued in next Comment..

Comment 6 Pei Zhang 2017-08-01 15:32:16 UTC
Live migration with 2 queues testing: PASS

==Scenario 2: live migration with vhost-user 2 queues==

Versions:
3.10.0-693.el7.x86_64
qemu-kvm-rhev-2.9.0-16.el7.x86_64
libvirt-3.2.0-14.el7.x86_64
openvswitch-2.7.2-1.git20170719.el7fdp.x86_64
tuned-profiles-cpu-partitioning-2.8.0-5.el7.noarch
dpdk-17.05-3.el7fdb.x86_64

Default parameters:
Traffic Generator: MoonGen      
Acceptable Loss: 0.002%        
Frame Size: 64Byte                    
Bidirectional: Yes          
Search run time:60s                         
Validation run time: 30s              
Virtio features: default             
CPU: Intel(R) Xeon(R) CPU E5-2650 v3 @ 2.30GHz      
NIC: 10-Gigabit X540-AT2
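
For reference, the 2-queue part of the setup mostly lives on the QEMU/guest side; a minimal sketch, assuming OVS owns the vhost-user socket under its default run directory (socket path, device ids, and the PMD CPU mask are placeholders):

# QEMU side (command-line fragment): 2 virtqueue pairs, vectors = 2*queues + 2
  -chardev socket,id=charnet0,path=/var/run/openvswitch/vhost-user0 \
  -netdev vhost-user,id=hostnet0,chardev=charnet0,queues=2 \
  -device virtio-net-pci,netdev=hostnet0,mq=on,vectors=6

# Guest side: enable both queue pairs on the virtio interface
$ ethtool -L eth0 combined 2

# OVS side: give the datapath more than one PMD core so both queues are polled
$ ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x6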


Checked the throughput value after each migration; looks good. Below are ping-pong migration results.

(1) openvswitch acts as vhostuser server mode:
run 1: 18.519083Mpps (throughput before migration)
run 2: 19.084536Mpps (throughput after migrating from host1 to host2)
run 3: 18.519032Mpps (throughput after migrating from host2 to host1)


(2) openvswitch acts as vhostuser client mode:
run 1: 19.084477Mpps (throughput before migration)
run 2: 19.084590Mpps (throughput after migrating from host1 to host2)
run 3: 19.763328Mpps (throughput after migrating from host2 to host1)


Note: Because qemu bug[1] below is still open, only the throughput value is checked here.

[1]Bug 1450680 - Migrating guest with vhost-user 2 queues and packets flow over dpdk+openvswitch fails: guest hang, and qemu hang or crash

Comment 8 errata-xmlrpc 2017-08-03 12:35:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:2418