Bug 2203003

Summary: ovs-appctl dpctl/dump-flows hung
Product: Red Hat Enterprise Linux Fast Datapath Reporter: ovs-bugzilla <ovs-bugzilla>
Component: openvswitch3.0 Assignee: Eelco Chaudron <echaudro>
Status: VERIFIED --- QA Contact: qding
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: FDP 23.C CC: ctrautma, jhsiao, ralongi
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Fixed In Version: openvswitch3.0-3.0.0-44.el9fdp

Description ovs-bugzilla 2023-05-11 03:31:28 UTC
+++ This bug was initially created as a clone of Bug #2182541 +++

Description of problem:
ovs-appctl dpctl/dump-flows --names -m hangs and never returns; the process stays in the sleeping (S) state:

root     2053841  0.0  0.0  44436  4104 ?        S    19:59   0:00 ovs-appctl dpctl/dump-flows --names -m

The messages below appear in ovs-vswitchd.log. For more details, see the attached ovs-vswitchd.log.

2023-03-28T23:59:47.950Z|00291|dpif|WARN|system@ovs-system: failed to flow_del (No such file or directory) ufid:a923c656-7cd0-4477-8549-5e52e34be41a recirc_id(0),dp_hash(0),skb_priority(0),in_port(2),skb_mark(0),ct_state(0),ct_zone(0),ct_mark(0),ct_label(0),eth(src=e4:11:22:33:44:70,dst=00:00:00:00:01:57),eth_type(0x8100),vlan(vid=0,pcp=0),encap(eth_type(0x0800),ipv4(src=2.2.2.1,dst=2.2.2.2,proto=17,tos=0,ttl=64,frag=no),udp(src=100,dst=44))
2023-03-28T23:59:48.986Z|00001|ovs_rcu(urcu4)|WARN|blocked 1001 ms waiting for revalidator39 to quiesce
2023-03-28T23:59:49.986Z|00002|ovs_rcu(urcu4)|WARN|blocked 2001 ms waiting for revalidator39 to quiesce
2023-03-28T23:59:51.987Z|00003|ovs_rcu(urcu4)|WARN|blocked 4002 ms waiting for revalidator39 to quiesce
2023-03-28T23:59:55.989Z|00004|ovs_rcu(urcu4)|WARN|blocked 8004 ms waiting for revalidator39 to quiesce
2023-03-29T00:00:03.993Z|00005|ovs_rcu(urcu4)|WARN|blocked 16008 ms waiting for revalidator39 to quiesce
2023-03-29T00:00:20.002Z|00006|ovs_rcu(urcu4)|WARN|blocked 32017 ms waiting for revalidator39 to quiesce
2023-03-29T00:00:52.017Z|00007|ovs_rcu(urcu4)|WARN|blocked 64032 ms waiting for revalidator39 to quiesce
2023-03-29T00:01:56.049Z|00008|ovs_rcu(urcu4)|WARN|blocked 128064 ms waiting for revalidator39 to quiesce
2023-03-29T00:04:04.086Z|00009|ovs_rcu(urcu4)|WARN|blocked 256101 ms waiting for revalidator39 to quiesce
2023-03-29T00:08:20.085Z|00010|ovs_rcu(urcu4)|WARN|blocked 512100 ms waiting for revalidator39 to quiesce
2023-03-29T00:16:52.085Z|00011|ovs_rcu(urcu4)|WARN|blocked 1024100 ms waiting for revalidator39 to quiesce
2023-03-29T00:33:56.059Z|00012|ovs_rcu(urcu4)|WARN|blocked 2048074 ms waiting for revalidator39 to quiesce
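
The ovs_rcu warnings above report a blocked time that roughly doubles with each message (1001 ms, 2001 ms, 4002 ms, and so on up to 2048074 ms, about 34 minutes), all naming the same thread, revalidator39. A minimal sketch for pulling the longest stall per thread out of a log follows. The regular expression is inferred only from the lines quoted in this report, and the names RCU_STALL, longest_stalls and SAMPLE are hypothetical helpers, not part of any OVS tooling.

```python
import re

# Pattern inferred from the ovs_rcu WARN lines quoted in this report;
# other OVS versions may format these messages differently (assumption).
RCU_STALL = re.compile(
    r"\|ovs_rcu\((?P<waiter>[^)]+)\)\|WARN\|"
    r"blocked (?P<ms>\d+) ms waiting for (?P<thread>\S+) to quiesce"
)


def longest_stalls(lines):
    """Map each stalled thread name to the longest reported block, in ms."""
    stalls = {}
    for line in lines:
        match = RCU_STALL.search(line)
        if match:
            thread = match.group("thread")
            blocked_ms = int(match.group("ms"))
            stalls[thread] = max(stalls.get(thread, 0), blocked_ms)
    return stalls


# Three lines copied verbatim from the log excerpt in this report.
SAMPLE = [
    "2023-03-28T23:59:48.986Z|00001|ovs_rcu(urcu4)|WARN|blocked 1001 ms waiting for revalidator39 to quiesce",
    "2023-03-28T23:59:49.986Z|00002|ovs_rcu(urcu4)|WARN|blocked 2001 ms waiting for revalidator39 to quiesce",
    "2023-03-28T23:59:51.987Z|00003|ovs_rcu(urcu4)|WARN|blocked 4002 ms waiting for revalidator39 to quiesce",
]

for thread, ms in longest_stalls(SAMPLE).items():
    print(f"{thread}: blocked for at least {ms / 1000:.1f} s")
```

To scan a real log, pass an open file object instead of SAMPLE, since the function only iterates over lines.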


[root@dell-per750-28 ~]# uname -r
4.18.0-372.50.1.el8_6.x86_64
[root@dell-per750-28 ~]# rpm -qa | grep openvswitch
openvswitch-selinux-extra-policy-1.0-29.el8fdp.noarch
kernel-kernel-networking-openvswitch-tunnel_offload-geneve_offload-1.0-38.noarch
openvswitch3.1-3.1.0-8.el8fdp.x86_64
kernel-kernel-networking-openvswitch-common-3.0-16.noarch
kernel-kernel-networking-openvswitch-tunnel_offload-common-1.0-44.noarch
[root@dell-per750-28 ~]# ovs-vsctl show
39c2a6aa-68b1-4f81-bc56-9450ec321350
    Bridge ovsbr0
        Port ovsbr0
            Interface ovsbr0
                type: internal
        Port eth0
            Interface eth0
        Port geneve1
            Interface geneve1
                type: geneve
                options: {key="100", local_ip="1.1.1.1", remote_ip="1.1.1.2"}
    ovs_version: "3.1.1"
[root@dell-per750-28 ~]# ovs-vsctl get Open_vSwitch . other_config
{hw-offload="true", max-idle="3600000", max-revalidator="3600000", tc-policy=none}
[root@dell-per750-28 ~]# ethtool -i ens4f0
driver: mlx5_core
version: 4.18.0-372.50.1.el8_6.x86_64
firmware-version: 22.35.1012 (MT_0000000528)
expansion-rom-version: 
bus-info: 0000:b3:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes
[root@dell-per750-28 ~]# lspci -m -s 0000:b3:00.0
b3:00.0 "Ethernet controller" "Mellanox Technologies" "MT2892 Family [ConnectX-6 Dx]" "Mellanox Technologies" "Device 0083"
[root@dell-per750-28 ~]#

Version-Release number of selected component (if applicable):
openvswitch3.1-3.1.0-8.el8fdp.x86_64
kernel-4.18.0-372.50.1.el8_6.x86_64

How reproducible: frequently


Steps to Reproduce:
beaker jobs:
https://beaker.engineering.redhat.com/jobs/7676426
https://beaker.engineering.redhat.com/jobs/7676427
https://beaker.engineering.redhat.com/jobs/7676428

Actual results:
The Beaker job times out after running ovs-appctl dpctl/dump-flows --names -m

Expected results:


Additional info:

Comment 1 ovs-bugzilla 2023-05-11 03:31:31 UTC
* Wed May 10 2023 Open vSwitch CI <ovs-ci> - 3.0.0-44
- Merging upstream branch-3.0 [RH git: 9b67191d75]
    Commit list:
    4ddfdaff1c netdev-offload: Fix deadlock/recursive use of the netdev_hmap_rwlock rwlock. (#2182541)
    a1dbda7162 ofproto-dpif-xlate: Fix use-after-free when xlate_actions().
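
The first commit title above mentions recursive use of a read/write lock, but this report does not describe the mechanism. One common way a recursively taken read lock deadlocks is sketched below, purely as an illustration: with a lock that makes new readers wait while a writer is queued, a thread that already holds the read lock and takes it again gets stuck behind a writer that is itself waiting for that thread. Whether this matches the OVS code path is an assumption. The sketch is in Python rather than OVS's C, and WriterPreferringRWLock and demo are hypothetical names.

```python
import threading
import time


class WriterPreferringRWLock:
    """Toy read/write lock: new readers wait whenever a writer is queued."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writer_active = False
        self._writers_waiting = 0

    def acquire_read(self, timeout=None):
        with self._cond:
            ok = self._cond.wait_for(
                lambda: not self._writer_active and self._writers_waiting == 0,
                timeout,
            )
            if ok:
                self._readers += 1
            return ok

    def release_read(self):
        with self._cond:
            self._readers -= 1
            self._cond.notify_all()

    def acquire_write(self, timeout=None):
        with self._cond:
            self._writers_waiting += 1
            try:
                ok = self._cond.wait_for(
                    lambda: not self._writer_active and self._readers == 0,
                    timeout,
                )
                if ok:
                    self._writer_active = True
                return ok
            finally:
                self._writers_waiting -= 1
                self._cond.notify_all()

    def release_write(self):
        with self._cond:
            self._writer_active = False
            self._cond.notify_all()

    def writers_waiting(self):
        with self._cond:
            return self._writers_waiting


def demo():
    lock = WriterPreferringRWLock()
    assert lock.acquire_read()  # outer read lock held by this thread

    writer_result = []

    def writer():
        got = lock.acquire_write(timeout=5.0)
        writer_result.append(got)
        if got:
            lock.release_write()

    thread = threading.Thread(target=writer)
    thread.start()
    while lock.writers_waiting() == 0:  # wait until the writer is queued
        time.sleep(0.01)

    # Nested read lock on the same thread: it waits behind the queued writer,
    # which in turn waits for the outer read lock to be released.
    nested_ok = lock.acquire_read(timeout=0.2)

    lock.release_read()  # drop the outer lock so the writer can proceed
    thread.join()
    return nested_ok, writer_result[0]


nested_ok, writer_ok = demo()
print("nested read acquired:", nested_ok)
print("writer acquired after release:", writer_ok)
```

The timeouts exist only so the demonstration terminates. Without them, the nested read and the writer would wait on each other indefinitely, which is the kind of hang that keeps a thread from ever quiescing.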