Bug 1947398 - [ovn-controller] ovn-controller should update OF rules atomically
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Fast Datapath
Classification: Red Hat
Component: OVN
Version: FDP 21.C
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Ilya Maximets
QA Contact: Jianlin Shi
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-04-08 12:00 UTC by Ilya Maximets
Modified: 2021-07-29 20:05 UTC (History)

Fixed In Version: ovn-2021-21.06.0-3
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-07-29 20:05:04 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Bugzilla | 1947056 | 1 | high | CLOSED | [ovn-controller] Packet drops when using logical_dp_groups when a lflow dp_group is updated. | 2024-12-20 19:51:55 UTC
Red Hat Product Errata | RHBA-2021:2969 | 0 | None | None | None | 2021-07-29 20:05:13 UTC

Description Ilya Maximets 2021-04-08 12:00:49 UTC
Currently, ovn-controller updates OF rules one by one.  This means that
if an update of logical flows requires removing one OF rule and adding a
different one, ovn-controller will remove the old rule first and add the
new one later.  The service provided by the old rule might therefore stop
working until the new rule is installed, leading to dataplane downtime and
packet loss.  This might be significant for large-scale setups with a high
number of OF rules.

To avoid this problem, ovn-controller should add all the flow modifications
to an OF bundle and commit the bundle with all changes atomically.  This
way there will be no time period during which no relevant OF rules are
installed.

This might also relieve some pressure on ovs-vswitchd, which would no longer
need to create a new version of the OF tables and trigger revalidation for
each flow update.
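
The idea can be illustrated from the command line. This is only a sketch using ovs-ofctl, not ovn-controller's own OpenFlow code: with --bundle, "ovs-ofctl add-flows" sends all flow modifications from a file as a single OpenFlow 1.4 bundle, and each line may carry an add/delete/modify keyword. The bridge name br-int is taken from the reproducer in this bug; the table number, match and actions are placeholders invented for the example, and nothing is sent to the switch unless RUN=1 is set.

```shell
#!/bin/sh
# Hypothetical example: replace one OF rule with another in a single atomic
# bundle, so there is no window in which neither rule is installed.
# The table, match and actions below are placeholders, not real OVN flows.

bridge=${BRIDGE:-br-int}
flow_file=$(mktemp)

# One flow modification per line; with --bundle, add-flows accepts an
# optional add/delete/modify keyword in front of each flow.
cat > "$flow_file" <<'EOF'
delete table=0,in_port=1
add table=0,priority=100,in_port=1,actions=output:2
EOF

bundle_cmd="ovs-ofctl --bundle add-flows $bridge $flow_file"

if [ "${RUN:-0}" = 1 ]; then
    # Both modifications are committed together or not at all.
    $bundle_cmd
else
    # Dry run by default: only show what would be executed.
    echo "would run: $bundle_cmd"
fi
```

Without --bundle the same two changes would have to be sent as separate del-flows and add-flow calls, which is the one-by-one behaviour described above.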

Comment 1 Ilya Maximets 2021-04-08 12:45:48 UTC
Sent to the mailing list for review:
  https://patchwork.ozlabs.org/project/ovn/patch/20210408123112.678123-1-i.maximets@ovn.org/

Comment 3 Jianlin Shi 2021-06-08 09:46:40 UTC
Reproduced with the following script on ovn2.13-20.12.0-135.el7:

systemctl start openvswitch                                 
systemctl start ovn-northd                           
ovn-nbctl set-connection ptcp:6641   
ovn-sbctl set-connection ptcp:6642
ovs-vsctl set open . external_ids:system-id=hv1 external_ids:ovn-remote=tcp:127.0.0.1:6642 external_ids:ovn-encap-type=geneve external_ids:ovn-encap-ip=127.0.0.1
systemctl restart ovn-controller

ovn-nbctl --wait=hv set NB_Global . options:use_logical_dp_groups=true
# Start ovn-nbctl daemon mode:
export OVN_NB_DAEMON=$(ovn-nbctl --detach)

# Enable vconn debug logs (ovn-controller to ovs-vswitchd openflow connection)
ovn-appctl -t ovn-controller vlog/disable-rate-limit vconn
ovn-appctl -t ovn-controller vlog/set vconn:dbg

if1=tap30687bca-dd
if2=tapf5637489-e3
ovn-nbctl ls-add ls1 
ovn-nbctl lsp-add ls1 $if1 
ovn-nbctl lsp-set-addresses $if1 "fa:16:3e:3a:25:31 10.0.126.50"

ovn-nbctl lsp-add ls1 $if2 
ovn-nbctl lsp-set-addresses $if2 "fa:16:3e:75:69:6e 10.0.126.80"

ovs-vsctl add-port br-int $if1 -- set interface $if1 type=internal external_ids:iface-id=$if1
ovs-vsctl add-port br-int $if2 -- set interface $if2 type=internal external_ids:iface-id=$if2

# Move the two bound OVS ports into network namespaces and configure them:
ip netns add vm1
ip link set $if1 netns vm1
ip netns exec vm1 ip link set $if1 address fa:16:3e:3a:25:31
ip netns exec vm1 ip addr add 10.0.126.50/24 dev $if1
ip netns exec vm1 ip link set $if1 up

ip netns add vm2
ip link set $if2 netns vm2
ip netns exec vm2 ip link set $if2 address fa:16:3e:75:69:6e
ip netns exec vm2 ip addr add 10.0.126.80/24 dev $if2
ip netns exec vm2 ip link set $if2 up

# Start continuous ping from one port to the other, e.g.: vm1 -> vm2
ip netns exec vm1 ping 10.0.126.80 -i 0.1  &> ping.log &
ping_pid=$!

# Add an internal OVS port for an unrelated logical switch (the switch itself
# is created and deleted in the loop below):
ovs-vsctl add-port br-int vm-test -- set interface vm-test type=internal -- set interface vm-test external_ids:iface-id=vm-test

# In a loop, simulate CMS changes to the topology by removing and adding the
# unrelated logical switch:
for i in {1..10}
do
  ovn-nbctl --wait=hv ls-add ls -- lsp-add ls vm-test
  ovn-sbctl list logical_dp_group
  sleep 5
  ovn-nbctl --wait=hv ls-del ls
  ovn-sbctl list logical_dp_group
  sleep 5
done

kill -2 $ping_pid
tail ping.log
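
Since the script enables vconn debug logging, the ovn-controller log can also be checked for bundle traffic. The sketch below rests on two assumptions that are not stated in this bug: that the log is at /var/log/ovn/ovn-controller.log, and that a fixed ovn-controller logs its bundle messages under the names OVS normally prints (OFPT_BUNDLE_CONTROL and OFPT_BUNDLE_ADD_MESSAGE). count_bundle_msgs is a hypothetical helper, not part of OVN.

```shell
#!/bin/sh
# count_bundle_msgs LOGFILE: print how many log lines mention an OpenFlow
# bundle control or bundle add message.
count_bundle_msgs() {
    grep -c -E 'OFPT_BUNDLE_(CONTROL|ADD_MESSAGE)' "$1"
}

# Assumed default log location; override with OVN_CONTROLLER_LOG if needed.
log=${OVN_CONTROLLER_LOG:-/var/log/ovn/ovn-controller.log}
if [ -r "$log" ]; then
    echo "bundle messages in $log: $(count_bundle_msgs "$log")"
fi
```

A count of 0 would suggest that flow updates are still sent one by one; a non-zero count only shows that bundles are in use, not that no packets were lost.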

[root@dell-per740-12 bz1947398]# rpm -qa | grep -E "openvswitch2.13|ovn2.13"
openvswitch2.13-2.13.0-96.el7fdp.x86_64
ovn2.13-host-20.12.0-135.el7fdp.x86_64
ovn2.13-central-20.12.0-135.el7fdp.x86_64
ovn2.13-20.12.0-135.el7fdp.x86_64

+ tail ping.log
64 bytes from 10.0.126.80: icmp_seq=1011 ttl=64 time=0.047 ms
64 bytes from 10.0.126.80: icmp_seq=1012 ttl=64 time=0.048 ms
64 bytes from 10.0.126.80: icmp_seq=1013 ttl=64 time=0.046 ms
64 bytes from 10.0.126.80: icmp_seq=1014 ttl=64 time=0.047 ms
64 bytes from 10.0.126.80: icmp_seq=1015 ttl=64 time=0.048 ms
64 bytes from 10.0.126.80: icmp_seq=1016 ttl=64 time=0.048 ms

--- 10.0.126.80 ping statistics ---
1016 packets transmitted, 1014 received, 0% packet loss, time 101510ms
rtt min/avg/max/mdev = 0.017/0.043/4.073/0.127 ms

<=== 2 packets lost
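
The "0% packet loss" in the summary is rounded, so the loss is easier to see from the raw counters. A small sketch of extracting it from ping.log (lost_packets is a hypothetical helper name; it assumes the iputils summary format shown above):

```shell
#!/bin/sh
# lost_packets LOGFILE: print transmitted minus received, taken from the
# "N packets transmitted, M received" summary line of ping output.
lost_packets() {
    awk '/packets transmitted/ { print $1 - $4 }' "$1"
}

# Only report if a ping.log from the reproducer is present.
if [ -r ping.log ]; then
    echo "lost packets: $(lost_packets ping.log)"
fi
```

For the summary line above (1016 transmitted, 1014 received) this reports 2.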

Comment 7 Jianlin Shi 2021-07-08 06:36:52 UTC
Verified on ovn-2021-21.06.0-4:

64 bytes from 10.0.126.80: icmp_seq=971 ttl=64 time=0.050 ms                                          
64 bytes from 10.0.126.80: icmp_seq=972 ttl=64 time=0.050 ms                                          
64 bytes from 10.0.126.80: icmp_seq=973 ttl=64 time=0.051 ms                                          
64 bytes from 10.0.126.80: icmp_seq=974 ttl=64 time=0.050 ms                                          
64 bytes from 10.0.126.80: icmp_seq=975 ttl=64 time=0.050 ms                                          
64 bytes from 10.0.126.80: icmp_seq=976 ttl=64 time=0.050 ms                                          
                                                                                                      
--- 10.0.126.80 ping statistics ---                                                                   
976 packets transmitted, 976 received, 0% packet loss, time 101394ms                                  
rtt min/avg/max/mdev = 0.038/0.050/1.046/0.032 ms                                                     
[root@wsfd-advnetlab16 bz1947398]# rpm -qa | grep -E "openvswitch2.15|ovn-2021"
ovn-2021-21.06.0-4.el8fdp.x86_64                                                                      
openvswitch2.15-2.15.0-26.el8fdp.x86_64                                                               
ovn-2021-central-21.06.0-4.el8fdp.x86_64                                                              
python3-openvswitch2.15-2.15.0-26.el8fdp.x86_64                                                       
ovn-2021-host-21.06.0-4.el8fdp.x86_64

Comment 9 errata-xmlrpc 2021-07-29 20:05:04 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (ovn bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:2969

