Bug 1928012 - [OVN] ovn-controller crashing with "failed in flood_remove_flows_for_sb_uuid()"
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Fast Datapath
Classification: Red Hat
Component: ovn2.13
Version: FDP 20.H
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Numan Siddique
QA Contact: Jianlin Shi
URL:
Whiteboard:
Depends On:
Blocks: 1929978
Reported: 2021-02-12 06:17 UTC by David Sedgmen
Modified: 2023-01-26 20:46 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1929978
Environment:
Last Closed: 2021-03-15 14:34:36 UTC
Target Upstream Version:
Embargoed:


Links
System                             ID              Private  Priority  Status  Summary  Last Updated
Red Hat Issue Tracker              FD-1091         0        None      None    None     2022-02-21 14:52:48 UTC
Red Hat Knowledge Base (Solution)  6749861         0        None      None    None     2022-02-21 14:36:33 UTC
Red Hat Product Errata             RHBA-2021:0839  0        None      None    None     2021-03-15 14:34:59 UTC

Comment 10 Jianlin Shi 2021-02-19 06:04:03 UTC
Tested with the following script:

# Route core dumps to systemd-coredump and clear old dumps and journals,
# so that coredumpctl only reports crashes from this run.
enable_coredump()
{
        ulimit -c unlimited
        ulimit -s unlimited
        sysctl -w fs.suid_dumpable=2
        if ! sysctl kernel.core_pattern | grep systemd-coredump
        then
                sysctl -w kernel.core_pattern="|/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h %e"
        fi
        rm -rf /var/lib/systemd/coredump/*
        rm -rf /run/log/journal/*
        rm -rf /var/log/journal/*
        systemctl restart systemd-journald
}

systemctl start openvswitch
systemctl start ovn-northd                                  
ovn-nbctl set-connection ptcp:6641                                                              
ovn-sbctl set-connection ptcp:6642                                                                                                                                                                         
ovs-vsctl set open . external_ids:system-id=hv1 external_ids:ovn-remote=tcp:1.1.175.25:6642 external_ids:ovn-encap-type=geneve external_ids:ovn-encap-ip=1.1.175.25                                        
systemctl restart ovn-controller
                                        
ovs-vsctl \                                                                                     
    -- add-port br-int vif1 \                                                                                                                                                                              
    -- set Interface vif1 type=internal external_ids:iface-id=sw0-port1 \                                                                                                                                  
    ofport-request=1
    
ovn-nbctl ls-add sw0
ovn-nbctl lsp-add sw0 sw0-port1
ovn-nbctl lsp-set-addresses sw0-port1 "10:14:00:00:00:01 192.168.0.2"
                                                  
ovn-nbctl lsp-add sw0 sw0-port2                   
ovn-nbctl lsp-add sw0 sw0-port3                   
ovn-nbctl lsp-add sw0 sw0-port4                             
ovn-nbctl lsp-add sw0 sw0-port5                                       
ovn-nbctl lsp-add sw0 sw0-port6
ovn-nbctl lsp-add sw0 sw0-port7                   
                                                            
ovn-nbctl create address_set name=as1                                 
ovn-nbctl set address_set . addresses="10.0.0.10,10.0.0.11,10.0.0.12"
                                                  
ovn-nbctl pg-add pg1 sw0-port1 sw0-port2 sw0-port3          
ovn-nbctl acl-add pg1 to-lport 1002 "outport == @pg1 && ip4.dst == \$as1 && icmp4" drop
ovn-nbctl acl-add pg1 to-lport 1002 "outport == @pg1 && ip4.dst == \$as1 && tcp && tcp.dst >=10000 && tcp.dst <= 20000" drop
ovn-nbctl acl-add pg1 to-lport 1002 "outport == @pg1 && ip4.dst == \$as1 && udp && udp.dst >=10000 && udp.dst <= 20000" drop
    
ovn-nbctl pg-add pg2 sw0-port2 sw0-port3 sw0-port4 sw0-port5
ovn-nbctl acl-add pg2 to-lport 1002 "outport == @pg2 && ip4.dst == \$as1 && icmp4" allow-related
ovn-nbctl acl-add pg2 to-lport 1002 "outport == @pg2 && ip4.dst == \$as1 && tcp && tcp.dst >=30000 && tcp.dst <= 40000" drop
ovn-nbctl acl-add pg2 to-lport 1002 "outport == @pg2 && ip4.dst == \$as1 && udp && udp.dst >=30000 && udp.dst <= 40000" drop

ovn-nbctl pg-add pg3 sw0-port1 sw0-port5
ovn-nbctl acl-add pg3 to-lport 1002 "outport == @pg3 && ip4.dst == \$as1 && icmp4" allow-related
ovn-nbctl acl-add pg3 to-lport 1002 "outport == @pg3 && ip4.dst == \$as1 && tcp && tcp.dst >=20000 && tcp.dst <= 30000" allow-related
ovn-nbctl acl-add pg3 to-lport 1002 "outport == @pg3 && ip4.dst == \$as1 && udp && udp.dst >=20000 && udp.dst <= 30000" allow-related


# Repeatedly empty and regrow each port group; stop at the first core dump.
for i in $(seq 1 10)
do
    ovn-nbctl --wait=hv clear Port_Group pg1 ports
    ovn-nbctl --wait=hv clear Port_Group pg2 ports
    ovn-nbctl --wait=hv clear Port_Group pg3 ports
    ovn-nbctl --wait=hv pg-set-ports pg1 sw0-port1
    ovn-nbctl --wait=hv pg-set-ports pg1 sw0-port1 sw0-port4
    ovn-nbctl --wait=hv pg-set-ports pg1 sw0-port1 sw0-port4 sw0-port5

    ovn-nbctl --wait=hv pg-set-ports pg2 sw0-port2
    ovn-nbctl --wait=hv pg-set-ports pg2 sw0-port2 sw0-port6
    ovn-nbctl --wait=hv pg-set-ports pg2 sw0-port2 sw0-port6 sw0-port7

    ovn-nbctl --wait=hv pg-set-ports pg3 sw0-port1
    ovn-nbctl --wait=hv pg-set-ports pg3 sw0-port1 sw0-port3
    ovn-nbctl --wait=hv pg-set-ports pg3 sw0-port1 sw0-port3 sw0-port6

    # coredumpctl exits 0 only when at least one core dump is recorded.
    if coredumpctl
    then
        break
    fi
done

# Print the iteration at which the loop stopped.
echo "$i"
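
The stop-on-first-crash loop above can be factored into a small helper, which makes the iteration count and the crash check easy to vary. This is only a sketch: `run_until_crash`, `churn_cmd`, `check_cmd`, and `ovn_controller_crashed` are hypothetical names that do not appear in the original script, and the check assumes systemd-coredump is handling core dumps as set up by `enable_coredump`.

```shell
# Sketch: run a churn step up to max_iter times and stop at the first crash.
# All function and parameter names here are hypothetical.
run_until_crash() {
    local max_iter=$1 churn_cmd=$2 check_cmd=$3
    local i
    for i in $(seq 1 "$max_iter"); do
        # One round of configuration churn (for example the pg-set-ports calls).
        "$churn_cmd" "$i"
        if "$check_cmd"; then
            # Crash detected: print the iteration and report failure.
            echo "$i"
            return 1
        fi
    done
    # No crash detected in max_iter iterations.
    echo "$max_iter"
    return 0
}

# Assumed check: succeeds only when a core dump from ovn-controller exists.
# Matching on the executable path ignores crashes of unrelated processes.
ovn_controller_crashed() {
    coredumpctl list /usr/bin/ovn-controller >/dev/null 2>&1
}
```

With a function `churn_port_groups` wrapping the `ovn-nbctl` calls from the loop, the reproducer would become `run_until_crash 10 churn_port_groups ovn_controller_crashed`.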

Reproduced on 20.12.0-17:

[root@wsfd-advnetlab21 bz1928012]# rpm -qa | grep -E "openvswitch2.13|ovn2.13"                        
openvswitch2.13-2.13.0-82.el7fdp.x86_64
ovn2.13-central-20.12.0-17.el7fdp.x86_64                                                              
ovn2.13-20.12.0-17.el7fdp.x86_64                                                                      
ovn2.13-host-20.12.0-17.el7fdp.x86_64

+ ovn-nbctl --wait=hv pg-set-ports pg3 sw0-port1 sw0-port3 sw0-port6                                  
+ coredumpctl                                                                                                               
TIME                            PID   UID   GID SIG PRESENT EXE                                                             
Fri 2021-02-19 00:58:40 EST   85436   996   993   6 * /usr/bin/ovn-controller
+ break                                                     
+ echo 5                                                                                        
5

[root@wsfd-advnetlab21 bz1928012]# coredumpctl info                           
           PID: 85436 (ovn-controller)                              
           UID: 996 (openvswitch)                                   
           GID: 993 (openvswitch)               
        Signal: 6 (ABRT)                                  
     Timestamp: Fri 2021-02-19 00:58:38 EST (18s ago)                
  Command Line: ovn-controller unix:/run/openvswitch/db.sock -vconsole:emer -vsyslog:err -vfile:info --user openvswitch:openvswitch --no-chdir --log-file=/var/log/ovn/ovn-controller.log --pidfile=/run/ovn
    Executable: /usr/bin/ovn-controller                                      
 Control Group: /system.slice/ovn-controller.service                                   
          Unit: ovn-controller.service                                                                                                                                                                     
         Slice: system.slice                                                                                                                                                                               
       Boot ID: bf0f58dfcf874536b854bcd802a113c0                             
    Machine ID: dd49cd68bab449beb64d72335f2acfc2            
      Hostname: wsfd-advnetlab21.anl.lab.eng.bos.redhat.com                                     
      Coredump: /var/lib/systemd/coredump/core.ovn-controller.996.bf0f58dfcf874536b854bcd802a113c0.85436.1613714318000000.xz                                                                               
       Message: Process 85436 (ovn-controller) of user 996 dumped core.                                                                                                                                    
                                                                  
                Stack trace of thread 85436:                                  
                #0  0x00007fdde98e4387 raise (libc.so.6)                                        
                #1  0x00007fdde98e5a78 abort (libc.so.6)                                                                                                                                                   
                #2  0x0000555eec7143ae ovs_abort_valist (ovn-controller)                                                                                                                                   
                #3  0x0000555eec71bb70 vlog_abort_valist (ovn-controller)
                #4  0x0000555eec71bc04 vlog_abort (ovn-controller)           
                #5  0x0000555eec7140ec ovs_assert_failure (ovn-controller)   
                #6  0x0000555eec642d32 flood_remove_flows_for_sb_uuid (ovn-controller)
                #7  0x0000555eec642c0b flood_remove_flows_for_sb_uuid (ovn-controller)
                #8  0x0000555eec642dd2 ofctrl_flood_remove_flows (ovn-controller)
                #9  0x0000555eec63d07f lflow_handle_changed_flows (ovn-controller)
                #10 0x0000555eec65981d flow_output_sb_logical_flow_handler (ovn-controller)
                #11 0x0000555eec6731c0 engine_run (ovn-controller)           
                #12 0x0000555eec630290 main (ovn-controller)                  
                #13 0x00007fdde98d0555 __libc_start_main (libc.so.6)
                #14 0x0000555eec631b75 _start (ovn-controller)      
                                                            
                Stack trace of thread 85439:                          
                #0  0x00007fdde99a1c3d poll (libc.so.6)              
                #1  0x0000555eec70f624 time_poll (ovn-controller)                                                                                                                                           
                #2  0x0000555eec704e9c poll_block (ovn-controller)           
                #3  0x0000555eec70426c stopwatch_thread (ovn-controller)               
                #4  0x0000555eec6ee45f ovsthread_wrapper (ovn-controller)                                                   
                #5  0x00007fddea48eea5 start_thread (libpthread.so.0)                                                       
                #6  0x00007fdde99ac96d __clone (libc.so.6)                   
                                                            
                Stack trace of thread 85437:                                                    
                #0  0x00007fdde99a1c3d poll (libc.so.6)                                                                     
                #1  0x0000555eec70f624 time_poll (ovn-controller)                                                                                                  
                #2  0x0000555eec704e9c poll_block (ovn-controller)
                #3  0x0000555eec6502e3 pinctrl_handler (ovn-controller)       
                #4  0x0000555eec6ee45f ovsthread_wrapper (ovn-controller)                       
                #5  0x00007fddea48eea5 start_thread (libpthread.so.0)
                #6  0x00007fdde99ac96d __clone (libc.so.6)
   
                Stack trace of thread 85438:
                #0  0x00007fdde99a1c3d poll (libc.so.6)
                #1  0x0000555eec70f624 time_poll (ovn-controller)
                #2  0x0000555eec704e9c poll_block (ovn-controller)
                #3  0x0000555eec6ec09e ovsrcu_postpone_thread (ovn-controller)
                #4  0x0000555eec6ee45f ovsthread_wrapper (ovn-controller)
                #5  0x00007fddea48eea5 start_thread (libpthread.so.0)
                #6  0x00007fdde99ac96d __clone (libc.so.6)
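
When comparing several dumps, it can help to reduce the `coredumpctl info` output above to the function names of the first listed thread, which here is the thread that aborted. The following is a minimal sketch, assuming the frame layout shown above (`#N  address  function (module)`); `crashing_frames` is a hypothetical helper name.

```shell
# Sketch: print the function names from the first "Stack trace of thread"
# block of `coredumpctl info` output read on stdin, one name per line.
crashing_frames() {
    awk '
        # Each thread header starts a new block; only the first one is kept.
        /Stack trace of thread/ { block++; next }
        # Frame lines look like: #6  0x0000555eec642d32 function (module)
        block == 1 && $1 ~ /^#[0-9]+$/ { print $3 }
    '
}
```

Usage would be `coredumpctl info | crashing_frames`. For the dump above, the list should include `ovs_assert_failure` followed by `flood_remove_flows_for_sb_uuid`, and none of the `poll` frames from the idle threads.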

Verified on 20.12.0-20:

[root@wsfd-advnetlab21 bz1928012]# rpm -qa | grep -E "openvswitch2.13|ovn2.13"
openvswitch2.13-2.13.0-82.el7fdp.x86_64
ovn2.13-host-20.12.0-20.el7fdp.x86_64
ovn2.13-central-20.12.0-20.el7fdp.x86_64
ovn2.13-20.12.0-20.el7fdp.x86_64

+ coredumpctl
No coredumps found.
+ echo 10
10

[root@wsfd-advnetlab21 bz1928012]# coredumpctl
No coredumps found.
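
Before rerunning the reproducer on another host, it may be useful to confirm that the installed build is at least the one verified above (20.12.0-20). The sketch below compares two version strings with GNU `sort -V`; `version_at_least` is a hypothetical helper name, and `sort -V` ordering is only an approximation of RPM version comparison, which is enough for builds that differ in the release number.

```shell
# Sketch: succeed when version HAVE is greater than or equal to version WANT,
# using GNU sort -V ordering (an approximation of RPM version comparison).
version_at_least() {
    local have=$1 want=$2
    # If WANT sorts first (or both are equal), HAVE is at least WANT.
    [ "$(printf '%s\n%s\n' "$have" "$want" | sort -V | head -n 1)" = "$want" ]
}
```

For example, `version_at_least 20.12.0-17 20.12.0-20` fails, while `version_at_least 20.12.0-24 20.12.0-20` succeeds.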

Comment 11 Jianlin Shi 2021-02-19 06:09:09 UTC
Also verified on the RHEL 8 version:

+ ovn-nbctl --wait=hv pg-set-ports pg3 sw0-port1 sw0-port3 sw0-port6
+ coredumpctl
No coredumps found.
+ echo 10
10
[root@dell-per740-12 bz1928012]# rpm -qa | grep -E "openvswitch2.13|ovn2.13"
ovn2.13-central-20.12.0-20.el8fdp.x86_64
openvswitch2.13-2.13.0-95.el8fdp.x86_64
ovn2.13-20.12.0-20.el8fdp.x86_64
ovn2.13-host-20.12.0-20.el8fdp.x86_64

Comment 14 Jianlin Shi 2021-03-08 02:55:24 UTC
Also no crash on 20.12.0-24 with the reproducer from comment 10; setting to VERIFIED.

Comment 16 errata-xmlrpc 2021-03-15 14:34:36 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (ovn2.13 bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:0839

