Bug 1857537

Summary: ovn-controller is not programming the flows properly when ovn-monitor-all is set
Product: Red Hat Enterprise Linux Fast Datapath
Reporter: Numan Siddique <nusiddiq>
Component: ovn2.13
Assignee: Numan Siddique <nusiddiq>
Status: CLOSED ERRATA
QA Contact: Jianlin Shi <jishi>
Severity: urgent
Priority: urgent
Version: RHEL 8.0
CC: ctrautma, jishi, ralongi
Last Closed: 2020-08-18 11:23:51 UTC
Type: Bug

Description Numan Siddique 2020-07-16 06:08:16 UTC
Description of problem:

Steps to reproduce
------

# On every node where ovn-controller is running, run the command below
ovs-vsctl set open . external_ids:ovn-monitor-all=true

# Create the OVN resources
ovn-nbctl ls-add sw0
ovn-nbctl lsp-add sw0 sw0-port1
ovn-nbctl lsp-set-addresses sw0-port1 "10:54:00:00:00:03 10.0.0.3"

ovn-nbctl lsp-add sw0 sw0-port2
ovn-nbctl lsp-set-addresses sw0-port2 "10:54:00:00:00:04 10.0.0.4"


# Create a logical router and attach the logical switch sw0
ovn-nbctl lr-add lr0
ovn-nbctl lrp-add lr0 lr0-sw0 00:00:00:00:ff:01 10.0.0.1/24
ovn-nbctl lsp-add sw0 sw0-lr0
ovn-nbctl lsp-set-type sw0-lr0 router
ovn-nbctl lsp-set-addresses sw0-lr0 00:00:00:00:ff:01
ovn-nbctl lsp-set-options sw0-lr0 router-port=lr0-sw0


# On any node where ovn-controller is running

ovs-vsctl add-port br-int sw0p1 -- set interface sw0p1 type=internal
ip netns add sw0p1
ip link set sw0p1 netns sw0p1
ip netns exec sw0p1 ip link set lo up
ip netns exec sw0p1 ip link set sw0p1 up
ip netns exec sw0p1 ip link set sw0p1 address 10:54:00:00:00:03
ip netns exec sw0p1 ip addr add 10.0.0.3/24 dev sw0p1
ip netns exec sw0p1 ip route add default via 10.0.0.1 dev sw0p1
# iface-id must match the logical switch port name created above (sw0-port1)
ovs-vsctl set Interface sw0p1 external_ids:iface-id=sw0-port1

# Now ping the router IP, 10.0.0.1
ip netns exec sw0p1 ping -c3 10.0.0.1

The ping fails.

Expected result:
Ping should succeed

The issue is not seen when ovn-monitor-all is not set to true.
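
Since the failure depends on this setting, it may help to confirm what a node actually has configured before testing. The following is a minimal sketch and not part of the original report: `monitor_all_is_true` is a hypothetical helper, and stripping double quotes is an assumption about how `ovs-vsctl get` may print the string value.

```shell
# Hypothetical helper: succeed if the given value means "true".
# Double quotes are stripped and the value is lowercased, since the value
# may be printed quoted and the option may have been set as "True".
monitor_all_is_true() {
    value=$(printf '%s' "$1" | tr -d '"' | tr '[:upper:]' '[:lower:]')
    [ "$value" = "true" ]
}

# Only query Open vSwitch when the tool is installed on this node.
if command -v ovs-vsctl >/dev/null 2>&1; then
    # --if-exists avoids an error when the key has never been set.
    current=$(ovs-vsctl --timeout=5 --if-exists get open . \
        external_ids:ovn-monitor-all 2>/dev/null || true)
    if monitor_all_is_true "$current"; then
        echo "ovn-monitor-all is enabled"
    else
        echo "ovn-monitor-all is not enabled (value: ${current:-unset})"
    fi
fi
```

Running this on each chassis before the ping test shows whether that node is in the configuration where the issue was reported.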

The issue is seen with version ovn2.13-20.06.1-2 (which is not yet available in FDP).

Comment 3 Jianlin Shi 2020-07-20 09:33:07 UTC
Failed to reproduce with the reproducer in the description.
Tested instead with https://github.com/dceara/ovn-heater.
Reproduced on ovn2.13-20.06.1-2:

Installed with:
RPM_SELINUX=http://download-node-02.eng.bos.redhat.com/brewroot/packages/openvswitch-selinux-extra-policy/1.0/23.el8fdp/noarch/openvswitch-selinux-extra-policy-1.0-23.el8fdp.noarch.rpm \
RPM_OVS=http://download-node-02.eng.bos.redhat.com/brewroot/packages/openvswitch2.13/2.13.0/48.el8fdp/x86_64/openvswitch2.13-2.13.0-48.el8fdp.x86_64.rpm \
RPM_OVN_COMMON=http://download-node-02.eng.bos.redhat.com/brewroot/packages/ovn2.13/20.06.1/2.el8fdp/x86_64/ovn2.13-20.06.1-2.el8fdp.x86_64.rpm \
./do.sh install

Then ran the following test 10 times:
./do.sh browbeat-run browbeat-scenarios/switch-per-node-low-scale.yml
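
Because the failure is intermittent, repeating the run in a loop and counting failures can make the comparison between builds easier. This is a rough sketch only: `count_failed_runs` is a hypothetical helper, and it assumes the command's exit status reflects whether the run succeeded, which this report does not confirm for `do.sh`.

```shell
# Hypothetical helper: run a command N times and print how many runs failed.
# Usage: count_failed_runs N command [args...]
count_failed_runs() {
    runs=$1
    shift
    failed=0
    i=1
    while [ "$i" -le "$runs" ]; do
        # Send the command's own output to stderr so that only the
        # final count is written to stdout.
        if ! "$@" >&2; then
            failed=$((failed + 1))
        fi
        i=$((i + 1))
    done
    echo "$failed"
}

# Example invocation (not executed here):
#   count_failed_runs 10 ./do.sh browbeat-run \
#       browbeat-scenarios/switch-per-node-low-scale.yml
```

If the exit status turns out not to track ping failures, the captured log has to be inspected instead.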

The issue occurred 2 times, as follows:

PING 16.1.255.254 (16.1.255.254) 56(84) bytes of data.

--- 16.1.255.254 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms                                         

PING 16.1.255.254 (16.1.255.254) 56(84) bytes of data.                                                

--- 16.1.255.254 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms

PING 16.1.255.254 (16.1.255.254) 56(84) bytes of data.

--- 16.1.255.254 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms                                         

PING 16.1.255.254 (16.1.255.254) 56(84) bytes of data.                                                

--- 16.1.255.254 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms

PING 16.1.255.254 (16.1.255.254) 56(84) bytes of data.

--- 16.1.255.254 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms

<=== failed to ping router in the test
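
Failures like the ones above can be counted from a saved log rather than read by eye. The sketch below is an illustration under assumptions: `count_total_loss` is a hypothetical helper, and it relies only on the `100% packet loss` text shown in the ping summaries above.

```shell
# Hypothetical helper: read ping output on stdin and print the number of
# summary lines that report 100% packet loss.
count_total_loss() {
    # grep -c exits non-zero when the count is 0; "|| true" keeps the
    # helper's exit status at success in that case.
    grep -c ' 100% packet loss' || true
}

# Example invocation (test.log is a hypothetical file name):
#   count_total_loss < test.log
```

A non-zero count means that at least one ping in the log got no reply.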

Verified on ovn2.13-20.06.1-3:

RPM_SELINUX=http://download-node-02.eng.bos.redhat.com/brewroot/packages/openvswitch-selinux-extra-policy/1.0/23.el8fdp/noarch/openvswitch-selinux-extra-policy-1.0-23.el8fdp.noarch.rpm \
RPM_OVS=http://download-node-02.eng.bos.redhat.com/brewroot/packages/openvswitch2.13/2.13.0/48.el8fdp/x86_64/openvswitch2.13-2.13.0-48.el8fdp.x86_64.rpm \
RPM_OVN_COMMON=http://download-node-02.eng.bos.redhat.com/brewroot/packages/ovn2.13/20.06.1/3.el8fdp/x86_64/ovn2.13-20.06.1-3.el8fdp.x86_64.rpm \
./do.sh install

The issue did not occur in 10 runs of the test.

Comment 5 errata-xmlrpc 2020-08-18 11:23:51 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (ovn2.13 bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:3488