Bug 1908391

Summary: [ovn] Missing/wrong openflow rules after ofport change
Product: Red Hat Enterprise Linux Fast Datapath Reporter: Daniel Alvarez Sanchez <dalvarez>
Component: ovn2.13Assignee: Dumitru Ceara <dceara>
Status: CLOSED ERRATA QA Contact: Fei Liu <feliu>
Severity: urgent Docs Contact:
Priority: urgent    
Version: FDP 21.ICC: alisci, ctrautma, dceara, ekuris, jishi, jlibosva, mmichels, ralongi, skaplons, smooney
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: ovn2.13-20.12.0-1.el8fdp ovn2.13-20.12.0-1.el7fdp Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1956748 (view as bug list) Environment:
Last Closed: 2021-02-03 21:55:09 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1956748, 1970948    
Attachments:
Description Flags
dbs and logs
none
OVN controller logs none

Description Daniel Alvarez Sanchez 2020-12-16 15:25:10 UTC
Created attachment 1739682 [details]
dbs and logs

When a port is replugged several times to the OVN bridge, its ofport will change.

ovn-controller doesn't seem to honor these changes in table 65 and traffic from such port will not work anymore.

The issue gets fixed triggering a recompute via: ovn-appctl -t ovn-controller recompute so everything seems to point to an issue of the incremental processing framework.

Tests have reported this failing with ovn2.13-20.06/20.09 and a scratch build rebased on 20.12.

Example:

Port with ofport 745 shows in table 0:

bash-4.4# ovs-ofctl dump-flows br-int table=0 | grep 745
 cookie=0x38937d8e, duration=40387.372s, table=0, n_packets=1863, n_bytes=111678, idle_age=1, priority=100,in_port=745 actions=load:0x4b->NXM_NX_REG13[],load:0x6a->NXM_NX_REG11[],load:0x69->NXM_NX_REG12[],load:0x18d->OXM_OF_METADATA[],load:0x1->NXM_NX_REG14[],resubmit(,8)


However 737 is used in table 65:


bash-4.4# ovs-ofctl dump-flows br-int table=65 | grep 8937d8e
 cookie=0x38937d8e, duration=40593.699s, table=65, n_packets=1848, n_bytes=98960, idle_age=2599, priority=100,reg15=0x1,metadata=0x18d actions=output:737


Older entry in the OVS database shows that 737 was the previous ofport of this port before it was re-added to the bridge:

record 6516: 2020-12-14 22:22:54.184
  table Interface row "tap71a5dfc1-10" (073801e2):
    ofport=737


I'm attaching NB/SB/OVS databases as well as physical and logical flows dumps for analysis.

Comment 1 Slawek Kaplonski 2020-12-16 15:41:54 UTC
Created attachment 1739685 [details]
OVN controller logs

Comment 5 Fei Liu 2020-12-22 06:51:23 UTC
Hi  Dumitru

   I failed to reproduce it. Could you help to check if I am missing something?

Steps:

	ovn-nbctl ls-add sw
	ovn-nbctl lsp-add sw lsp1
	ovn-nbctl lsp-add sw lsp2

	ovs-vsctl \
	    -- add-port br-int vm1 \
	    -- set Interface vm1  external_ids:iface-id=lsp1 \
	    ofport-request=1  
	ovs-vsctl \
	    -- add-port br-int vm2 \
	    -- set Interface vm2  external_ids:iface-id=lsp2 \
	    ofport-request=2


	for i in `seq 3 100`
	do
		ovs-vsctl \
		    -- del-port vm1 \
		    -- del-port vm2

		ovs-vsctl \
		    -- add-port br-int vm1 \
		    -- set Interface vm1  external_ids:iface-id=lsp1 \
		    ofport-request=$i  \
		    -- add-port br-int vm2 \
		    -- set Interface vm2  external_ids:iface-id=lsp2 \
		    ofport-request=$(($i+1))

	echo $i
	done

# ovs-ofctl dump-flows br-int table=0 |grep in_port
 cookie=0xc5e16eb4, duration=25.042s, table=0, n_packets=0, n_bytes=0, idle_age=25, priority=100,in_port=101 actions=load:0x1->NXM_NX_REG13[],load:0x3->NXM_NX_REG11[],load:0x2->NXM_NX_REG12[],load:0x1->OXM_OF_METADATA[],load:0x2->NXM_NX_REG14[],resubmit(,8)
 cookie=0x1192e79a, duration=25.042s, table=0, n_packets=0, n_bytes=0, idle_age=25, priority=100,in_port=100 actions=load:0x4->NXM_NX_REG13[],load:0x3->NXM_NX_REG11[],load:0x2->NXM_NX_REG12[],load:0x1->OXM_OF_METADATA[],load:0x1->NXM_NX_REG14[],resubmit(,8)

# ovs-ofctl dump-flows br-int table=65|grep output
 cookie=0xc5e16eb4, duration=36.695s, table=65, n_packets=0, n_bytes=0, idle_age=36, priority=100,reg15=0x2,metadata=0x1 actions=output:101
 cookie=0x1192e79a, duration=36.695s, table=65, n_packets=0, n_bytes=0, idle_age=36, priority=100,reg15=0x1,metadata=0x1 actions=output:100


[root@dell-per730-17 ~]# rpm -qa|grep ovn
ovn2.13-20.09.0-8.el8fdp.x86_64
ovn2.13-central-20.09.0-8.el8fdp.x86_64
ovn2.13-host-20.09.0-8.el8fdp.x86_64
[root@dell-per730-17 ~]# rpm -qa|grep openv
python3-openvswitch2.13-2.13.0-9.el8fdp.x86_64
openvswitch2.13-2.13.0-9.el8fdp.x86_64
[root@dell-per730-17 ~]#

Comment 6 Dumitru Ceara 2020-12-22 10:38:50 UTC
Hi Fei Liu,

With ovn2.13-20.09.0-23.el8fdp I did:

ovn-nbctl ls-add sw
ovn-nbctl lsp-add sw lsp1
ovn-nbctl lsp-add sw lsp2

ovs-vsctl \
    -- add-port br-int vm1 \
    -- set Interface vm1 type=internal \
    -- set Interface vm1  external_ids:iface-id=lsp1 \
    ofport-request=1  
ovs-vsctl \
    -- add-port br-int vm2 \
    -- set Interface vm2 type=internal \
    -- set Interface vm2  external_ids:iface-id=lsp2 \
    ofport-request=2

while [ $(ovn-sbctl show | grep -c Port_Binding) -ne 2 ]; do sleep 1; done;

ovn-appctl -t ovn-controller debug/pause
ovs-vsctl \
    -- del-port vm1 \
    -- del-port vm2
ovs-vsctl \
    -- add-port br-int vm1 \
    -- set Interface vm1 type=internal \
    -- set Interface vm1  external_ids:iface-id=lsp1 \
    ofport-request=$i  \
    -- add-port br-int vm2 \
    -- set Interface vm2 type=internal \
    -- set Interface vm2  external_ids:iface-id=lsp2 \
    ofport-request=$(($i+1))
ovn-appctl -t ovn-controller debug/resume

$ ovs-ofctl dump-flows br-int table=0 |grep in_port
 cookie=0xa3cb8b6d, duration=13.346s, table=0, n_packets=0, n_bytes=0, idle_age=13, priority=100,in_port=1 actions=load:0x1->NXM_NX_REG13[],load:0x3->NXM_NX_REG11[],load:0x2->NXM_NX_REG12[],load:0x1->OXM_OF_METADATA[],load:0x1->NXM_NX_REG14[],resubmit(,8)
 cookie=0xbd13d6c3, duration=13.334s, table=0, n_packets=0, n_bytes=0, idle_age=13, priority=100,in_port=2 actions=load:0x4->NXM_NX_REG13[],load:0x3->NXM_NX_REG11[],load:0x2->NXM_NX_REG12[],load:0x1->OXM_OF_METADATA[],load:0x2->NXM_NX_REG14[],resubmit(,8)
 cookie=0xa3cb8b6d, duration=13.263s, table=0, n_packets=0, n_bytes=0, idle_age=13, priority=100,in_port=3 actions=load:0x1->NXM_NX_REG13[],load:0x3->NXM_NX_REG11[],load:0x2->NXM_NX_REG12[],load:0x1->OXM_OF_METADATA[],load:0x1->NXM_NX_REG14[],resubmit(,8)
 cookie=0xbd13d6c3, duration=13.263s, table=0, n_packets=0, n_bytes=0, idle_age=13, priority=100,in_port=4 actions=load:0x4->NXM_NX_REG13[],load:0x3->NXM_NX_REG11[],load:0x2->NXM_NX_REG12[],load:0x1->OXM_OF_METADATA[],load:0x2->NXM_NX_REG14[],resubmit(,8)

$ ovs-ofctl dump-flows br-int table=65|grep output
 cookie=0xa3cb8b6d, duration=19.024s, table=65, n_packets=0, n_bytes=0, idle_age=19, priority=100,reg15=0x1,metadata=0x1 actions=output:1
 cookie=0xbd13d6c3, duration=19.013s, table=65, n_packets=0, n_bytes=0, idle_age=19, priority=100,reg15=0x2,metadata=0x1 actions=output:2

# Now if we trigger a manual recompute we see the correct flows:
$ ovn-appctl -t ovn-controller recompute

$ ovs-ofctl dump-flows br-int table=0 |grep in_port
 cookie=0xa3cb8b6d, duration=194.976s, table=0, n_packets=0, n_bytes=0, idle_age=194, priority=100,in_port=3 actions=load:0x1->NXM_NX_REG13[],load:0x3->NXM_NX_REG11[],load:0x2->NXM_NX_REG12[],load:0x1->OXM_OF_METADATA[],load:0x1->NXM_NX_REG14[],resubmit(,8)
 cookie=0xbd13d6c3, duration=194.976s, table=0, n_packets=0, n_bytes=0, idle_age=194, priority=100,in_port=4 actions=load:0x4->NXM_NX_REG13[],load:0x3->NXM_NX_REG11[],load:0x2->NXM_NX_REG12[],load:0x1->OXM_OF_METADATA[],load:0x2->NXM_NX_REG14[],resubmit(,8)

$ ovs-ofctl dump-flows br-int table=65|grep output
 cookie=0xa3cb8b6d, duration=212.088s, table=65, n_packets=0, n_bytes=0, idle_age=212, hard_age=36, priority=100,reg15=0x1,metadata=0x1 actions=output:3
 cookie=0xbd13d6c3, duration=212.076s, table=65, n_packets=0, n_bytes=0, idle_age=212, hard_age=36, priority=100,reg15=0x2,metadata=0x1 actions=output:4

Regards,
Dumitru

Comment 7 Fei Liu 2020-12-23 02:37:10 UTC
(In reply to Dumitru Ceara from comment #6)
> Hi Fei Liu,
> 
> With ovn2.13-20.09.0-23.el8fdp I did:
> 
> ovn-nbctl ls-add sw
> ovn-nbctl lsp-add sw lsp1
> ovn-nbctl lsp-add sw lsp2
> 
> ovs-vsctl \
>     -- add-port br-int vm1 \
>     -- set Interface vm1 type=internal \
>     -- set Interface vm1  external_ids:iface-id=lsp1 \
>     ofport-request=1  
> ovs-vsctl \
>     -- add-port br-int vm2 \
>     -- set Interface vm2 type=internal \
>     -- set Interface vm2  external_ids:iface-id=lsp2 \
>     ofport-request=2
> 
> while [ $(ovn-sbctl show | grep -c Port_Binding) -ne 2 ]; do sleep 1; done;
> 
> ovn-appctl -t ovn-controller debug/pause
> ovs-vsctl \
>     -- del-port vm1 \
>     -- del-port vm2
> ovs-vsctl \
>     -- add-port br-int vm1 \
>     -- set Interface vm1 type=internal \
>     -- set Interface vm1  external_ids:iface-id=lsp1 \
>     ofport-request=$i  \
>     -- add-port br-int vm2 \
>     -- set Interface vm2 type=internal \
>     -- set Interface vm2  external_ids:iface-id=lsp2 \
>     ofport-request=$(($i+1))
> ovn-appctl -t ovn-controller debug/resume
> 
> $ ovs-ofctl dump-flows br-int table=0 |grep in_port
>  cookie=0xa3cb8b6d, duration=13.346s, table=0, n_packets=0, n_bytes=0,
> idle_age=13, priority=100,in_port=1
> actions=load:0x1->NXM_NX_REG13[],load:0x3->NXM_NX_REG11[],load:0x2-
> >NXM_NX_REG12[],load:0x1->OXM_OF_METADATA[],load:0x1->NXM_NX_REG14[],
> resubmit(,8)
>  cookie=0xbd13d6c3, duration=13.334s, table=0, n_packets=0, n_bytes=0,
> idle_age=13, priority=100,in_port=2
> actions=load:0x4->NXM_NX_REG13[],load:0x3->NXM_NX_REG11[],load:0x2-
> >NXM_NX_REG12[],load:0x1->OXM_OF_METADATA[],load:0x2->NXM_NX_REG14[],
> resubmit(,8)
>  cookie=0xa3cb8b6d, duration=13.263s, table=0, n_packets=0, n_bytes=0,
> idle_age=13, priority=100,in_port=3
> actions=load:0x1->NXM_NX_REG13[],load:0x3->NXM_NX_REG11[],load:0x2-
> >NXM_NX_REG12[],load:0x1->OXM_OF_METADATA[],load:0x1->NXM_NX_REG14[],
> resubmit(,8)
>  cookie=0xbd13d6c3, duration=13.263s, table=0, n_packets=0, n_bytes=0,
> idle_age=13, priority=100,in_port=4
> actions=load:0x4->NXM_NX_REG13[],load:0x3->NXM_NX_REG11[],load:0x2-
> >NXM_NX_REG12[],load:0x1->OXM_OF_METADATA[],load:0x2->NXM_NX_REG14[],
> resubmit(,8)
> 
> $ ovs-ofctl dump-flows br-int table=65|grep output
>  cookie=0xa3cb8b6d, duration=19.024s, table=65, n_packets=0, n_bytes=0,
> idle_age=19, priority=100,reg15=0x1,metadata=0x1 actions=output:1
>  cookie=0xbd13d6c3, duration=19.013s, table=65, n_packets=0, n_bytes=0,
> idle_age=19, priority=100,reg15=0x2,metadata=0x1 actions=output:2
> 
> # Now if we trigger a manual recompute we see the correct flows:
> $ ovn-appctl -t ovn-controller recompute
> 
> $ ovs-ofctl dump-flows br-int table=0 |grep in_port
>  cookie=0xa3cb8b6d, duration=194.976s, table=0, n_packets=0, n_bytes=0,
> idle_age=194, priority=100,in_port=3
> actions=load:0x1->NXM_NX_REG13[],load:0x3->NXM_NX_REG11[],load:0x2-
> >NXM_NX_REG12[],load:0x1->OXM_OF_METADATA[],load:0x1->NXM_NX_REG14[],
> resubmit(,8)
>  cookie=0xbd13d6c3, duration=194.976s, table=0, n_packets=0, n_bytes=0,
> idle_age=194, priority=100,in_port=4
> actions=load:0x4->NXM_NX_REG13[],load:0x3->NXM_NX_REG11[],load:0x2-
> >NXM_NX_REG12[],load:0x1->OXM_OF_METADATA[],load:0x2->NXM_NX_REG14[],
> resubmit(,8)
> 
> $ ovs-ofctl dump-flows br-int table=65|grep output
>  cookie=0xa3cb8b6d, duration=212.088s, table=65, n_packets=0, n_bytes=0,
> idle_age=212, hard_age=36, priority=100,reg15=0x1,metadata=0x1
> actions=output:3
>  cookie=0xbd13d6c3, duration=212.076s, table=65, n_packets=0, n_bytes=0,
> idle_age=212, hard_age=36, priority=100,reg15=0x2,metadata=0x1
> actions=output:4
> 
> Regards,
> Dumitru

Thanks for your reply,it help me a lot. BTW , On which version is this issue fixed??

Comment 8 Dumitru Ceara 2020-12-23 09:22:08 UTC
(In reply to Fei Liu from comment #7)
[...]
> 
> Thanks for your reply,it help me a lot. BTW , On which version is this issue
> fixed??

This is fixed in ovn2.13-20.12.0-1.

Comment 9 Fei Liu 2020-12-24 03:12:34 UTC
# cat reproducer.sh 

ovn-nbctl ls-add sw
ovn-nbctl lsp-add sw lsp1
ovn-nbctl lsp-add sw lsp2

ovs-vsctl \
    -- add-port br-int vm1 \
    -- set Interface vm1 type=internal \
    -- set Interface vm1  external_ids:iface-id=lsp1 \
    ofport-request=1  
ovs-vsctl \
    -- add-port br-int vm2 \
    -- set Interface vm2 type=internal \
    -- set Interface vm2  external_ids:iface-id=lsp2 \
    ofport-request=2

while [ $(ovn-sbctl show | grep -c Port_Binding) -ne 2 ]; do sleep 1; done;

ovn-appctl -t ovn-controller debug/pause
ovs-vsctl \
    -- del-port vm1 \
    -- del-port vm2
ovs-vsctl \
    -- add-port br-int vm1 \
    -- set Interface vm1 type=internal \
    -- set Interface vm1  external_ids:iface-id=lsp1 \
    ofport-request=3  \
    -- add-port br-int vm2 \
    -- set Interface vm2 type=internal \
    -- set Interface vm2  external_ids:iface-id=lsp2 \
    ofport-request=4
ovn-appctl -t ovn-controller debug/resume

 ovs-ofctl dump-flows br-int table=0 |grep in_port

 ovs-ofctl dump-flows br-int table=65|grep output

 ovn-appctl -t ovn-controller recompute

 ovs-ofctl dump-flows br-int table=0 |grep in_port

 ovs-ofctl dump-flows br-int table=65|grep output

reproduce with ovn2.13-20.09.0-23

# ./reproducer.sh 
 cookie=0xd4c894f6, duration=0.100s, table=0, n_packets=0, n_bytes=0, idle_age=0, priority=100,in_port=1 actions=load:0x1->NXM_NX_REG13[],load:0x3->NXM_NX_REG11[],load:0x2->NXM_NX_REG12[],load:0x1->OXM_OF_METADATA[],load:0x1->NXM_NX_REG14[],resubmit(,8)
 cookie=0xd017a563, duration=0.085s, table=0, n_packets=0, n_bytes=0, idle_age=0, priority=100,in_port=2 actions=load:0x4->NXM_NX_REG13[],load:0x3->NXM_NX_REG11[],load:0x2->NXM_NX_REG12[],load:0x1->OXM_OF_METADATA[],load:0x2->NXM_NX_REG14[],resubmit(,8)
 cookie=0xd4c894f6, duration=0.004s, table=0, n_packets=0, n_bytes=0, idle_age=0, priority=100,in_port=3 actions=load:0x1->NXM_NX_REG13[],load:0x3->NXM_NX_REG11[],load:0x2->NXM_NX_REG12[],load:0x1->OXM_OF_METADATA[],load:0x1->NXM_NX_REG14[],resubmit(,8)
 cookie=0xd017a563, duration=0.004s, table=0, n_packets=0, n_bytes=0, idle_age=0, priority=100,in_port=4 actions=load:0x4->NXM_NX_REG13[],load:0x3->NXM_NX_REG11[],load:0x2->NXM_NX_REG12[],load:0x1->OXM_OF_METADATA[],load:0x2->NXM_NX_REG14[],resubmit(,8)
 cookie=0xd4c894f6, duration=0.105s, table=65, n_packets=0, n_bytes=0, idle_age=0, priority=100,reg15=0x1,metadata=0x1 actions=output:1
 cookie=0xd017a563, duration=0.090s, table=65, n_packets=0, n_bytes=0, idle_age=0, priority=100,reg15=0x2,metadata=0x1 actions=output:2
 cookie=0xd4c894f6, duration=0.016s, table=0, n_packets=0, n_bytes=0, idle_age=0, priority=100,in_port=3 actions=load:0x1->NXM_NX_REG13[],load:0x3->NXM_NX_REG11[],load:0x2->NXM_NX_REG12[],load:0x1->OXM_OF_METADATA[],load:0x1->NXM_NX_REG14[],resubmit(,8)
 cookie=0xd017a563, duration=0.016s, table=0, n_packets=0, n_bytes=0, idle_age=0, priority=100,in_port=4 actions=load:0x4->NXM_NX_REG13[],load:0x3->NXM_NX_REG11[],load:0x2->NXM_NX_REG12[],load:0x1->OXM_OF_METADATA[],load:0x2->NXM_NX_REG14[],resubmit(,8)
 cookie=0xd4c894f6, duration=0.116s, table=65, n_packets=0, n_bytes=0, idle_age=0, priority=100,reg15=0x1,metadata=0x1 actions=output:3
 cookie=0xd017a563, duration=0.101s, table=65, n_packets=0, n_bytes=0, idle_age=0, priority=100,reg15=0x2,metadata=0x1 actions=output:4


verify with ovn2.13-20.12.0-1 both in rhel7 and rhel8 
# ./reproducer.sh 
 cookie=0x3237e55e, duration=0.004s, table=0, n_packets=0, n_bytes=0, idle_age=0, priority=100,in_port=4 actions=load:0x4->NXM_NX_REG13[],load:0x3->NXM_NX_REG11[],load:0x2->NXM_NX_REG12[],load:0x1->OXM_OF_METADATA[],load:0x2->NXM_NX_REG14[],resubmit(,8)
 cookie=0x5a7d20b0, duration=0.004s, table=0, n_packets=0, n_bytes=0, idle_age=0, priority=100,in_port=3 actions=load:0x1->NXM_NX_REG13[],load:0x3->NXM_NX_REG11[],load:0x2->NXM_NX_REG12[],load:0x1->OXM_OF_METADATA[],load:0x1->NXM_NX_REG14[],resubmit(,8)
 cookie=0x3237e55e, duration=0.009s, table=65, n_packets=0, n_bytes=0, idle_age=0, priority=100,reg15=0x2,metadata=0x1 actions=output:4
 cookie=0x5a7d20b0, duration=0.009s, table=65, n_packets=0, n_bytes=0, idle_age=0, priority=100,reg15=0x1,metadata=0x1 actions=output:3
 cookie=0x3237e55e, duration=0.015s, table=0, n_packets=0, n_bytes=0, idle_age=0, priority=100,in_port=4 actions=load:0x4->NXM_NX_REG13[],load:0x3->NXM_NX_REG11[],load:0x2->NXM_NX_REG12[],load:0x1->OXM_OF_METADATA[],load:0x2->NXM_NX_REG14[],resubmit(,8)
 cookie=0x5a7d20b0, duration=0.015s, table=0, n_packets=0, n_bytes=0, idle_age=0, priority=100,in_port=3 actions=load:0x1->NXM_NX_REG13[],load:0x3->NXM_NX_REG11[],load:0x2->NXM_NX_REG12[],load:0x1->OXM_OF_METADATA[],load:0x1->NXM_NX_REG14[],resubmit(,8)
 cookie=0x3237e55e, duration=0.020s, table=65, n_packets=0, n_bytes=0, idle_age=0, priority=100,reg15=0x2,metadata=0x1 actions=output:4
 cookie=0x5a7d20b0, duration=0.020s, table=65, n_packets=0, n_bytes=0, idle_age=0, priority=100,reg15=0x1,metadata=0x1 actions=output:3

Comment 11 errata-xmlrpc 2021-02-03 21:55:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (ovn2.13 bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:0407

Comment 12 Elvira 2021-03-15 10:15:10 UTC
*** Bug 1918616 has been marked as a duplicate of this bug. ***

Comment 13 Jakub Libosvar 2021-03-17 15:08:00 UTC
*** Bug 1889419 has been marked as a duplicate of this bug. ***

Comment 15 Lucas Alvares Gomes 2021-09-22 08:39:17 UTC
*** Bug 1976792 has been marked as a duplicate of this bug. ***