Bug 1947056 - [ovn-controller] Packet drops when using logical_dp_groups when a lflow dp_group is updated.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Fast Datapath
Classification: Red Hat
Component: ovn2.13
Version: FDP 20.H
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Dumitru Ceara
QA Contact: Jianlin Shi
URL:
Whiteboard:
Depends On:
Blocks: 1946420
 
Reported: 2021-04-07 15:01 UTC by Dumitru Ceara
Modified: 2024-12-20 19:51 UTC (History)

Fixed In Version: ovn2.13-20.12.0-108.el8fdp
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-06-21 14:44:39 UTC
Target Upstream Version:
Embargoed:


Links
System                  ID              Private  Priority  Status  Summary  Last Updated
Red Hat Issue Tracker   FD-1212         0        None      None    None     2023-01-20 10:15:50 UTC
Red Hat Product Errata  RHBA-2021:2507  0        None      None    None     2021-06-21 14:46:02 UTC

Internal Links: 1947398

Comment 10 Ilya Maximets 2021-04-08 12:48:28 UTC
BZ to track the use of OpenFlow (OF) bundles:
  https://bugzilla.redhat.com/show_bug.cgi?id=1947398

Comment 11 Dumitru Ceara 2021-04-08 18:47:21 UTC
Fix posted for review: http://patchwork.ozlabs.org/project/ovn/list/?series=238154&state=*

Comment 12 ffernand 2021-04-12 17:19:40 UTC
(In reply to Dumitru Ceara from comment #11)
> Fix posted for review:
> http://patchwork.ozlabs.org/project/ovn/list/?series=238154&state=*

V2 posted:
http://patchwork.ozlabs.org/project/ovn/list/?series=238728

Comment 13 Dan Williams 2021-04-20 19:58:23 UTC
Fix accepted upstream; waiting on backport downstream.

Comment 18 Jianlin Shi 2021-06-04 07:38:55 UTC
Tested with the following script:

systemctl start openvswitch
# conf.db in attachment
# All db files are for openvswitch2.15.
cp -f new_version/conf.db /etc/openvswitch/conf.db
systemctl restart openvswitch                               
systemctl start ovn-northd    
# ovnnb_db.db in attachment                       
cp -f new_version/ovnnb_db.db  /var/lib/ovn/ovnnb_db.db
systemctl restart ovn-northd
       
ovn-nbctl set-connection ptcp:6641
ovn-sbctl set-connection ptcp:6642                
ovs-vsctl set open . external_ids:system-id=hv1 external_ids:ovn-remote=tcp:127.0.0.1:6642 external_ids:ovn-encap-type=geneve external_ids:ovn-encap-ip=127.0.0.1
systemctl restart ovn-controller
            
#ovn-nbctl --wait=hv set NB_Global . options:use_logical_dp_groups=true
# Start ovn-nbctl daemon mode:
export OVN_NB_DAEMON=$(ovn-nbctl --detach)

# Enable vconn debug logs (ovn-controller to ovs-vswitchd openflow connection)
ovn-appctl -t ovn-controller vlog/disable-rate-limit vconn
ovn-appctl -t ovn-controller vlog/set vconn:dbg

if1=tap30687bca-dd                                                         
if2=tapf5637489-e3                                                                                                                                                                                         

# Bind two of the OVS ports to OVN:                                         
ip netns add vm1           
ip link set $if1 netns vm1
ip netns exec vm1 ip link set $if1 address fa:16:3e:3a:25:31
ip netns exec vm1 ip addr add 10.0.126.50/24 dev $if1
ip netns exec vm1 ip link set $if1 up
          
ip netns add vm2     
ip link set $if2 netns vm2      
ip netns exec vm2 ip link set $if2 address fa:16:3e:75:69:6e
ip netns exec vm2 ip addr add 10.0.126.80/24 dev $if2
ip netns exec vm2 ip link set $if2 up
                  
while :                         
do
        if ip netns exec vm1 ping 10.0.126.80 -c 1
        then
                break
        else    
                sleep 1
        fi      
done 

# Start continuous ping from one port to the other, e.g.: vm1 -> vm2
ip netns exec vm1 ping 10.0.126.80 -i 0.1  &> ping.log &
ping_pid=$!          
                                
# Add an unrelated logical switch with an internal OVS port attached to it:
ovs-vsctl add-port br-int vm-test -- set interface vm-test type=internal -- set interface vm-test external_ids:iface-id=vm-test
                                     
# In a loop, simulate CMS changes to the topology by removing and adding the
# unrelated logical switch:     
for i in {1..3}
do                                                
  ovn-nbctl ls-add ls -- lsp-add ls vm-test
#ovn-sbctl list logical_dp_group
  sleep 10      
  ovn-nbctl ls-del ls  
#ovn-sbctl list logical_dp_group
  sleep 10
done
                                                                    
kill -2 $ping_pid
tail ping.log

reproduced on ovn2.13-20.12.0-104.el8:

--- 10.0.126.80 ping statistics ---
414 packets transmitted, 207 received, 50% packet loss, time 60076ms
rtt min/avg/max/mdev = 0.044/0.066/0.853/0.056 ms

Verified on ovn2.13-20.12.0-135.el8:

--- 10.0.126.80 ping statistics ---
579 packets transmitted, 579 received, 0% packet loss, time 60054ms
rtt min/avg/max/mdev = 0.016/0.051/0.664/0.033 ms

[root@dell-per730-03 bz1947056]# rpm -qa | grep -E "openvswitch2.15|ovn2.13"
openvswitch2.15-2.15.0-23.el8fdp.x86_64
ovn2.13-20.12.0-135.el8fdp.x86_64
ovn2.13-central-20.12.0-135.el8fdp.x86_64
ovn2.13-host-20.12.0-135.el8fdp.x86_64
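
The pass/fail criterion in this comment is the packet-loss figure in the ping summary. A minimal sketch of how that check could be automated at the end of the script follows; the helper names `ping_loss_pct` and `ping_no_loss` are hypothetical and not part of the original test.

```shell
# Hypothetical helper (not from the original script): read ping output on
# stdin and print the packet-loss percentage from the summary line.
ping_loss_pct() {
    sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}

# Hypothetical helper: succeed only if the summary reports exactly 0% loss.
ping_no_loss() {
    [ "$(ping_loss_pct)" = "0" ]
}
```

Possible usage after the ping has been stopped: `ping_no_loss < ping.log || echo "packet loss detected"`.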

Comment 19 Jianlin Shi 2021-06-04 07:52:28 UTC
also verified on ovn-2021-21.03.0-40.el8fdp.x86_64:

[root@dell-per730-03 bz1947056]# rpm -qa | grep -E "openvswitch2.15|ovn-2021"                         
openvswitch2.15-2.15.0-23.el8fdp.x86_64                                                               
ovn-2021-central-21.03.0-40.el8fdp.x86_64                                                             
ovn-2021-21.03.0-40.el8fdp.x86_64                                                                     
ovn-2021-host-21.03.0-40.el8fdp.x86_64  

--- 10.0.126.80 ping statistics ---                                                                   
579 packets transmitted, 579 received, 0% packet loss, time 60053ms                                   
rtt min/avg/max/mdev = 0.017/0.053/0.645/0.028 ms

Comment 20 Jianlin Shi 2021-06-04 07:57:20 UTC
For RHEL 7, the attached db files should first be converted with "ovsdb-tool compact $db_file".

With the db files converted, reproduced on ovn2.13-20.12.0-104.el7:

--- 10.0.126.80 ping statistics ---                                                                   
1007 packets transmitted, 1007 received, 0% packet loss, time 100600ms                                
rtt min/avg/max/mdev = 0.033/0.073/0.619/0.031 ms 

Verified on ovn2.13-20.12.0-135.el7:

[root@dell-per740-12 bz1947056]# rpm -qa | grep -E "openvswitch2.13|ovn2.13"                          
openvswitch2.13-2.13.0-96.el7fdp.x86_64                                                               
ovn2.13-host-20.12.0-135.el7fdp.x86_64                                                                
ovn2.13-central-20.12.0-135.el7fdp.x86_64                                                             
ovn2.13-20.12.0-135.el7fdp.x86_64 

--- 10.0.126.80 ping statistics ---                                                                   
1007 packets transmitted, 1007 received, 0% packet loss, time 100600ms                                
rtt min/avg/max/mdev = 0.033/0.073/0.619/0.031 ms
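
The conversion step mentioned at the top of this comment could be scripted roughly as follows. This is a sketch, assuming the database paths from the script in comment 18 and that the services are stopped while the files are rewritten; the function name and the `OVSDB_TOOL` override are hypothetical.

```shell
# Hypothetical helper: compact each database file given as an argument, in
# place, using "ovsdb-tool compact" as described in this comment.
# OVSDB_TOOL is an illustrative override so the loop can be dry-run
# (for example with OVSDB_TOOL=echo).
compact_dbs() {
    for db in "$@"; do
        "${OVSDB_TOOL:-ovsdb-tool}" compact "$db" || return 1
    done
}
```

Possible usage before copying the files into place: `compact_dbs new_version/conf.db new_version/ovnnb_db.db`.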

Comment 21 Jianlin Shi 2021-06-08 09:15:57 UTC
Reproduced without the NB database files, using the following script:

systemctl start openvswitch                                                                                
systemctl start ovn-northd                                                             
ovn-nbctl set-connection ptcp:6641                             
ovn-sbctl set-connection ptcp:6642
ovs-vsctl set open . external_ids:system-id=hv1 external_ids:ovn-remote=tcp:127.0.0.1:6642 external_ids:ovn-encap-type=geneve external_ids:ovn-encap-ip=127.0.0.1
systemctl restart ovn-controller                                
          
ovn-nbctl --wait=hv set NB_Global . options:use_logical_dp_groups=true
# Start ovn-nbctl daemon mode:                                               
export OVN_NB_DAEMON=$(ovn-nbctl --detach)                                                                 
                                                                                                                                        
# Enable vconn debug logs (ovn-controller to ovs-vswitchd openflow connection)               
ovn-appctl -t ovn-controller vlog/disable-rate-limit vconn
ovn-appctl -t ovn-controller vlog/set vconn:dbg
                                                             
for i in {1..99}                                               
do                                                                                                    
        ovn-nbctl ls-add lstest$i                    
        ovn-nbctl ls-add lstest${i}_p                                                                                                            
        ovn-nbctl lr-add lrtest$i
        ovn-nbctl lrp-add lrtest$i lrt${i}-ls$i 00:00:00:00:00:$i 10.0.$i.1/24
        ovn-nbctl lrp-add lrtest$i lrt${i}-ls$i-p fa:00:00:00:00:$i 1.1.$i.1/24
        ovn-nbctl lsp-add lstest$i ls${i}-lrt$i                                                       
        ovn-nbctl lsp-set-type ls${i}-lrt$i router   
        ovn-nbctl lsp-set-addresses ls${i}-lrt$i router
        ovn-nbctl lsp-set-options ls${i}-lrt$i router-port=lrt${i}-ls$i
                                                    
        ovn-nbctl lsp-add lstest${i}_p ls${i}-p-lrt$i
        ovn-nbctl lsp-set-type ls${i}-p-lrt$i router         
        ovn-nbctl lsp-set-addresses ls${i}-p-lrt$i router                                             
        ovn-nbctl lsp-set-options ls${i}-p-lrt$i router-port=lrt${i}-ls$i-p                                
                                                                                       
        ovn-nbctl lr-nat-add lrtest$i snat 1.1.$i.11 10.0.$i.11
        pg_scale=""
                          
        for j in {1..10}                                        
        do                                                                                            
                ovn-nbctl lsp-add lstest$i lstest${i}p$j
                ovn-nbctl lsp-set-addresses lstest${i}p$j "00:$j:00:00:00:$i"
                pg_scale="$pg_scale lstest${i}p$j"                                                         
                ovs-vsctl add-port br-int lstest${i}p$j -- set interface lstest${i}p$j type=internal external_ids:iface-id=lstest${i}p$j
        done                                                                                                                                                                                               

        for j in {1..10}                                                    
        do                                                   
                ovn-nbctl lsp-add lstest${i}_p lstest${i}_p-p$j
                ovn-nbctl lsp-set-addresses lstest${i}_p-p$j "fa:$j:00:00:00:$i"
                pg_scale="$pg_scale lstest${i}_p-p$j"      
                ovs-vsctl add-port br-int lstest${i}_p-p$j -- set interface lstest${i}_p-p$j type=internal external_ids:iface-id=lstest${i}_p-p$j
        done                         
        ovn-nbctl pg-add pg1t$i                           
        ovn-nbctl pg-set-ports pg1t$i $pg_scale 
        ovn-nbctl --type=port-group acl-add pg1t$i from-lport 1001 "inport == @pg1t$i" allow-related
done                                                 
ovn-nbctl ls-add ls1 

for i in {1..99}
do
        ovn-nbctl lsp-add ls1 ls1p$i
        ovn-nbctl lsp-set-addresses ls1p$i "fa:16:3e:3a:26:$i"
        ovs-vsctl add-port br-int ls1p$i -- set interface ls1p$i type=internal external_ids:iface-id=ls1p$i
        ovn-nbctl acl-add ls1 from-lport 1000 "inport==\"ls1p$i\" && ip" allow-related
done    

ovn-nbctl ls-add ls2
for i in {1..99} 
do
        ovn-nbctl lsp-add ls2 ls2p$i
        ovn-nbctl lsp-set-addresses ls2p$i "fa:16:3e:3a:27:$i"
        ovs-vsctl add-port br-int ls2p$i -- set interface ls2p$i type=internal external_ids:iface-id=ls2p$i
        ovn-nbctl acl-add ls2 from-lport 1000 "inport==\"ls2p$i\" && ip" allow-related
done    

ovn-nbctl lr-add lr1
ovn-nbctl lrp-add lr1 lr1-ls1 00:00:00:00:01:01 10.0.126.1/24
ovn-nbctl lsp-add ls1 ls1-lr1 
ovn-nbctl lsp-set-type ls1-lr1 router
ovn-nbctl lsp-set-options ls1-lr1 router-port=lr1-ls1
ovn-nbctl lsp-set-addresses ls1-lr1 router

ovn-nbctl lrp-add lr1 lr1-ls2 00:00:00:00:01:02 1.1.1.1/24
ovn-nbctl lsp-add ls2 ls2-lr1 
ovn-nbctl lsp-set-type ls2-lr1 router
ovn-nbctl lsp-set-options ls2-lr1 router-port=lr1-ls2
ovn-nbctl lsp-set-addresses ls2-lr1 router

ovn-nbctl set logical_router lr1 options:chassis=hv1
for i in {2..200}
do
        ovn-nbctl lr-nat-add lr1 snat 1.1.1.$i 10.0.126.$i
done    

if1=tap30687bca-dd
if2=tapf5637489-e3
ovn-nbctl lsp-add ls1 $if1
ovn-nbctl lsp-set-addresses $if1 "fa:16:3e:3a:25:31 10.0.126.50"

ovn-nbctl lsp-add ls1 $if2
ovn-nbctl lsp-set-addresses $if2 "fa:16:3e:75:69:6e 10.0.126.80"

ovs-vsctl add-port br-int $if1 -- set interface $if1 type=internal external_ids:iface-id=$if1
ovs-vsctl add-port br-int $if2 -- set interface $if2 type=internal external_ids:iface-id=$if2

# Bind two of the OVS ports to OVN:
ip netns add vm1
ip link set $if1 netns vm1
ip netns exec vm1 ip link set $if1 address fa:16:3e:3a:25:31
ip netns exec vm1 ip addr add 10.0.126.50/24 dev $if1
ip netns exec vm1 ip link set $if1 up

ip netns add vm2
ip link set $if2 netns vm2
ip netns exec vm2 ip link set $if2 address fa:16:3e:75:69:6e
ip netns exec vm2 ip addr add 10.0.126.80/24 dev $if2
ip netns exec vm2 ip link set $if2 up

while :
        do
        if ip netns exec vm1 ping 10.0.126.80 -c  1 -w 1 -W 1
        then
                break
        else    
                sleep 1
        fi      
done    

# Start continuous ping from one port to the other, e.g.: vm1 -> vm2
ip netns exec vm1 ping 10.0.126.80 -i 0.1  &> ping.log &
ping_pid=$!

# Add an unrelated logical switch with an internal OVS port attached to it:
ovs-vsctl add-port br-int vm-test -- set interface vm-test type=internal -- set interface vm-test external_ids:iface-id=vm-test

# In a loop, simulate CMS changes to the topology by removing and adding the
# unrelated logical switch:
for i in {1..10}
do
        ovn-nbctl --wait=hv ls-add ls -- lsp-add ls vm-test
        sleep 5
        ovn-nbctl --wait=hv ls-del ls
        sleep 5
done    

kill -2 $ping_pid
tail ping.log
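
Because the script enables vconn debug logging, the ovn-controller log can show whether the add/remove cycles trigger bursts of flow modifications. A rough sketch for summarizing that follows. It assumes the default vlog file format (timestamp as the first `|`-separated field) and that flow modifications are logged with the OpenFlow message name `OFPT_FLOW_MOD`; the function name and the log path in the usage note are assumptions as well.

```shell
# Hypothetical helper: read an ovn-controller log on stdin and print the
# number of lines mentioning OFPT_FLOW_MOD for each one-second interval.
flow_mods_per_second() {
    awk -F'|' '
        /OFPT_FLOW_MOD/ {
            ts = $1
            sub(/\..*/, "", ts)   # drop fractional seconds and the "Z"
            count[ts]++
        }
        END { for (ts in count) print ts, count[ts] }
    ' | sort
}
```

Possible usage: `flow_mods_per_second < /var/log/ovn/ovn-controller.log`, then compare the busy seconds with the times at which `ls-add` and `ls-del` ran.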

reproduced on ovn2.13-20.12.0-104.el7:

[root@dell-per740-12 bz1947056]# tail ping_104.log                                                    
64 bytes from 10.0.126.80: icmp_seq=1397 ttl=64 time=0.033 ms                                         
64 bytes from 10.0.126.80: icmp_seq=1398 ttl=64 time=0.035 ms                                         
64 bytes from 10.0.126.80: icmp_seq=1399 ttl=64 time=0.035 ms                                         
64 bytes from 10.0.126.80: icmp_seq=1400 ttl=64 time=0.035 ms                                         
64 bytes from 10.0.126.80: icmp_seq=1401 ttl=64 time=0.045 ms                                         
64 bytes from 10.0.126.80: icmp_seq=1402 ttl=64 time=0.036 ms                                         
                                                                                                      
--- 10.0.126.80 ping statistics ---                                                                   
1402 packets transmitted, 1168 received, 16% packet loss, time 141682ms                               
rtt min/avg/max/mdev = 0.031/0.062/0.707/0.025 ms 

Verified on ovn2.13-20.12.0-135.el7:

[root@dell-per740-12 bz1947056]# tail ping.log                                                        
64 bytes from 10.0.126.80: icmp_seq=1155 ttl=64 time=0.064 ms                                         
64 bytes from 10.0.126.80: icmp_seq=1156 ttl=64 time=0.063 ms                                         
64 bytes from 10.0.126.80: icmp_seq=1157 ttl=64 time=0.064 ms                                         
64 bytes from 10.0.126.80: icmp_seq=1158 ttl=64 time=0.064 ms                                         
64 bytes from 10.0.126.80: icmp_seq=1159 ttl=64 time=0.065 ms                                         
64 bytes from 10.0.126.80: icmp_seq=1160 ttl=64 time=0.063 ms                                         
                                                                                                      
--- 10.0.126.80 ping statistics ---                                                                   
1160 packets transmitted, 1160 received, 0% packet loss, time 115901ms                                
rtt min/avg/max/mdev = 0.031/0.063/0.622/0.024 ms
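
The summary line gives only the overall loss. To see where in the run the drops occurred (for example, whether they line up with the ls-add/ls-del iterations), the missing ICMP sequence numbers can be listed from the full ping.log. A minimal sketch, with a hypothetical helper name:

```shell
# Hypothetical helper: read ping output on stdin and print each range of
# missing icmp_seq values as "first-last". Replies lost after the final
# received reply are not detected, since no later sequence number follows.
ping_seq_gaps() {
    awk '
        match($0, /icmp_seq=[0-9]+/) {
            # "icmp_seq=" is 9 characters; the digits follow it.
            seq = substr($0, RSTART + 9, RLENGTH - 9) + 0
            if (prev != "" && seq > prev + 1)
                printf "%d-%d\n", prev + 1, seq - 1
            prev = seq
        }
    '
}
```

Possible usage: `ping_seq_gaps < ping.log`. With the ping interval of 0.1 s used in the script, a gap of N sequence numbers corresponds to roughly N/10 seconds without replies.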

Comment 22 Jianlin Shi 2021-06-08 09:43:48 UTC
also verified on ovn-2021-21.03.0-40.el8fdp.x86_64:

+ tail ping.log                                                                                       
64 bytes from 10.0.126.80: icmp_seq=1065 ttl=64 time=0.036 ms                                         
64 bytes from 10.0.126.80: icmp_seq=1066 ttl=64 time=0.018 ms                                         
64 bytes from 10.0.126.80: icmp_seq=1067 ttl=64 time=0.034 ms                                         
64 bytes from 10.0.126.80: icmp_seq=1068 ttl=64 time=0.039 ms                                         
64 bytes from 10.0.126.80: icmp_seq=1069 ttl=64 time=0.040 ms                                         
64 bytes from 10.0.126.80: icmp_seq=1070 ttl=64 time=0.017 ms                                         
                                                                                                      
--- 10.0.126.80 ping statistics ---                                                                   
1070 packets transmitted, 1070 received, 0% packet loss, time 111133ms                                
rtt min/avg/max/mdev = 0.017/0.036/0.648/0.027 ms                                                     
[root@dell-per730-03 bz1947056]# rpm -qa | grep -E "openvswitch2.15|ovn-2021"                         
ovn-2021-host-21.03.0-40.el8fdp.x86_64                                                                
openvswitch2.15-2.15.0-23.el8fdp.x86_64                                                               
ovn-2021-central-21.03.0-40.el8fdp.x86_64                                                             
python3-openvswitch2.15-2.15.0-23.el8fdp.x86_64                                                       
ovn-2021-21.03.0-40.el8fdp.x86_64

Comment 24 errata-xmlrpc 2021-06-21 14:44:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (ovn2.13 bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:2507

