Bug 2076604

Summary: Random tempest test failures due to missing flows for metadata port
Product: Red Hat Enterprise Linux Fast Datapath
Reporter: Yatin Karel <ykarel>
Component: ovn-2021
Assignee: Ales Musil <amusil>
Status: CLOSED ERRATA
QA Contact: Ehsan Elahi <eelahi>
Severity: unspecified
Docs Contact:
Priority: high
Version: FDP 22.A
CC: amusil, ctrautma, jiji, jishi, mmichels
Target Milestone: ---
Keywords: Regression
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 2088454
Environment:
Last Closed: 2022-06-30 18:00:08 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 2088454, 2093901, 2129866, 2144564    
Attachments:
Description                                                   Flags
OVN, OVS DB's archive and flows before and after recompute    none

Description Yatin Karel 2022-04-19 13:26:37 UTC
Created attachment 1873516 [details]
OVN, OVS DB's archive and flows before and after recompute

Description of problem:
Originally detected in an upstream OpenStack job: https://bugs.launchpad.net/tripleo/+bug/1968732.
This is seen after the ovn-2021 update [1] in the CentOS Stream NFV SIG (ovn-2021-21.06.0-17.el9s --> ovn-2021-21.12.0-11.el9s).

When the issue happens (SSH fails because the VM cannot connect to the metadata proxy at 169.254.169.254:80), the flows for the metadata port (IP 10.100.0.2) are missing. After running recompute (ovs-appctl -t /var/run/ovn/ovn-controller.2.ctl recompute), the flows appear and everything works fine.

[root@node-0002427663 /]# ovs-ofctl show br-int                           
OFPT_FEATURES_REPLY (xid=0x2): dpid:0000ea1fd3f2be98
n_tables:254, n_buffers:0
capabilities: FLOW_STATS TABLE_STATS PORT_STATS QUEUE_STATS ARP_MATCH_IP
actions: output enqueue set_vlan_vid set_vlan_pcp strip_vlan mod_dl_src mod_dl_dst mod_nw_src mod_nw_dst mod_nw_tos mod_tp_src mod_tp_dst
 1(patch-br-int-to): addr:32:44:73:93:c8:40
     config:     0
     state:      0
     speed: 0 Mbps now, 0 Mbps max
 46(tapef1645c0-13): addr:fe:16:3e:a7:0e:10
     config:     0                                                       
     state:      0
     current:    10MB-FD COPPER
     speed: 10 Mbps now, 0 Mbps max
 47(tapa3803bbf-b0): addr:ea:fe:c4:d2:ab:b8
     config:     0                                                       
     state:      0
     current:    10GB-FD COPPER
     speed: 10000 Mbps now, 0 Mbps max
 48(tap4987daed-f3): addr:fe:16:3e:e2:c5:59
     config:     0                                                       
     state:      0        
     current:    10MB-FD COPPER
     speed: 10 Mbps now, 0 Mbps max
 49(tap96ca70a5-6d): addr:fe:16:3e:29:c1:e9
     config:     0
     state:      0
     current:    10MB-FD COPPER
     speed: 10 Mbps now, 0 Mbps max
 50(tape6d9123e-6f): addr:fe:16:3e:9a:67:86
     config:     0
     state:      0
     current:    10MB-FD COPPER
     speed: 10 Mbps now, 0 Mbps max
 51(tapd59710fd-ed): addr:fe:16:3e:a0:2c:2e
     config:     0
     state:      0
     current:    10MB-FD COPPER
     speed: 10 Mbps now, 0 Mbps max
 52(tapfe28971b-5c): addr:fe:16:3e:fa:ee:21
     config:     0
     state:      0
     current:    10MB-FD COPPER
     speed: 10 Mbps now, 0 Mbps max
 71(tap67e2e6ae-0e): addr:fe:16:3e:8a:77:69
     config:     0
     state:      0
     current:    10MB-FD COPPER
     speed: 10 Mbps now, 0 Mbps max
 72(tapd00f18e1-90): addr:42:f4:52:46:51:da
     config:     0
     state:      0
     current:    10GB-FD COPPER
     speed: 10000 Mbps now, 0 Mbps max
 LOCAL(br-int): addr:ea:1f:d3:f2:be:98
     config:     PORT_DOWN
     state:      LINK_DOWN
     speed: 0 Mbps now, 0 Mbps max
OFPT_GET_CONFIG_REPLY (xid=0x4): frags=normal miss_send_len=0

Metadata port ==> 72(tapd00f18e1-90): addr:42:f4:52:46:51:da

[root@node-0002427663 /]# ovs-ofctl dump-flows br-int|grep output:
 cookie=0xf91d47f4, duration=12983.494s, table=65, n_packets=2043, n_bytes=207683, idle_age=155, priority=100,reg15=0x1,metadata=0x2 actions=output:1
 cookie=0xe2f863bd, duration=5577.208s, table=65, n_packets=68, n_bytes=7626, idle_age=5475, priority=100,reg15=0x3,metadata=0x4 actions=output:46
 cookie=0xa397bc61, duration=5576.884s, table=65, n_packets=583, n_bytes=50253, idle_age=5530, priority=100,reg15=0x1,metadata=0x4 actions=output:47
 cookie=0x2124f02e, duration=5575.984s, table=65, n_packets=67, n_bytes=7500, idle_age=5479, priority=100,reg15=0x4,metadata=0x4 actions=output:48
 cookie=0x42b7518f, duration=5575.761s, table=65, n_packets=64, n_bytes=7374, idle_age=5487, priority=100,reg15=0x5,metadata=0x4 actions=output:49
 cookie=0xb2cef043, duration=5575.489s, table=65, n_packets=63, n_bytes=7332, idle_age=5477, priority=100,reg15=0x6,metadata=0x4 actions=output:50
 cookie=0x72093ac3, duration=5575.131s, table=65, n_packets=61, n_bytes=7248, idle_age=5471, priority=100,reg15=0x7,metadata=0x4 actions=output:51
 cookie=0x1cddeec2, duration=5574.421s, table=65, n_packets=58, n_bytes=7098, idle_age=5469, priority=100,reg15=0x8,metadata=0x4 actions=output:52
 cookie=0xf589365e, duration=946.264s, table=65, n_packets=62, n_bytes=4544, idle_age=15, priority=100,reg15=0x3,metadata=0x8 actions=output:71

# Flow missing ^ for metadata port 72
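
The manual check above (grepping the dump for an output action on the metadata port) can be sketched as a small helper. This is only an illustration: the function names `ports_with_output_flow` and `has_output_flow` are made up here, and the sketch assumes the flow dump has the same text layout as the output shown in this report.

```shell
# Hypothetical helpers, not part of OVS or OVN.

# List the OpenFlow port numbers that have a table=65 output flow.
# Reads "ovs-ofctl dump-flows" text from stdin.
ports_with_output_flow() {
    sed -n 's/.*table=65,.*actions=output:\([0-9][0-9]*\).*/\1/p' | sort -n -u
}

# Succeed only if the given OpenFlow port number ($1) is in that list.
has_output_flow() {
    ports_with_output_flow | grep -qx "$1"
}

# Possible usage on the affected node (72 is the metadata port from this report):
#   ovs-ofctl dump-flows br-int | has_output_flow 72 || echo "output flow missing"
```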

[root@node-0002427663 /]# ovs-appctl -t /var/run/ovn/ovn-controller.2.ctl recompute

[root@node-0002427663 /]# ovs-ofctl dump-flows br-int|grep output:
 cookie=0xf91d47f4, duration=13472.388s, table=65, n_packets=2056, n_bytes=208745, idle_age=371, priority=100,reg15=0x1,metadata=0x2 actions=output:1
 cookie=0xe2f863bd, duration=6066.102s, table=65, n_packets=68, n_bytes=7626, idle_age=5964, priority=100,reg15=0x3,metadata=0x4 actions=output:46
 cookie=0xa397bc61, duration=6065.778s, table=65, n_packets=583, n_bytes=50253, idle_age=6019, priority=100,reg15=0x1,metadata=0x4 actions=output:47
 cookie=0x2124f02e, duration=6064.878s, table=65, n_packets=67, n_bytes=7500, idle_age=5968, priority=100,reg15=0x4,metadata=0x4 actions=output:48
 cookie=0x42b7518f, duration=6064.655s, table=65, n_packets=64, n_bytes=7374, idle_age=5976, priority=100,reg15=0x5,metadata=0x4 actions=output:49
 cookie=0xb2cef043, duration=6064.383s, table=65, n_packets=63, n_bytes=7332, idle_age=5966, priority=100,reg15=0x6,metadata=0x4 actions=output:50
 cookie=0x72093ac3, duration=6064.025s, table=65, n_packets=61, n_bytes=7248, idle_age=5959, priority=100,reg15=0x7,metadata=0x4 actions=output:51
 cookie=0x1cddeec2, duration=6063.315s, table=65, n_packets=58, n_bytes=7098, idle_age=5958, priority=100,reg15=0x8,metadata=0x4 actions=output:52
 cookie=0xf589365e, duration=1435.158s, table=65, n_packets=66, n_bytes=4768, idle_age=366, priority=100,reg15=0x3,metadata=0x8 actions=output:71
 cookie=0x6f326bd2, duration=52.307s, table=65, n_packets=0, n_bytes=0, idle_age=52, priority=100,reg15=0x1,metadata=0x8 actions=output:72

# After recompute, the flow for port 72 appears ^. The following additional flows also appear after running recompute:

cookie=0x6f326bd2, duration=30.050s, table=0, n_packets=0, n_bytes=0, idle_age=30, priority=100,in_port=72 actions=load:0x1b->NXM_NX_REG13[],load:0x18->NXM_NX_REG11[],load:0x19->NXM_NX_REG12[],load:0x8->OXM_OF_METADATA[],load:0x1->NXM_NX_REG14[],load:0x1->NXM_NX_REG10[10],resubmit(,8)
cookie=0x6f326bd2, duration=30.053s, table=37, n_packets=0, n_bytes=0, idle_age=30, priority=150,reg14=0x1,metadata=0x8 actions=resubmit(,38)
cookie=0x6f326bd2, duration=30.053s, table=38, n_packets=0, n_bytes=0, idle_age=30, priority=100,reg15=0x1,metadata=0x8 actions=load:0x1b->NXM_NX_REG13[],load:0x18->NXM_NX_REG11[],load:0x19->NXM_NX_REG12[],resubmit(,39)
cookie=0x6f326bd2, duration=30.053s, table=39, n_packets=0, n_bytes=0, idle_age=30, priority=100,reg10=0/0x1,reg14=0x1,reg15=0x1,metadata=0x8 actions=drop
cookie=0x6f326bd2, duration=30.054s, table=64, n_packets=0, n_bytes=0, idle_age=30, priority=100,reg10=0x1/0x1,reg15=0x1,metadata=0x8 actions=push:NXM_OF_IN_PORT[],load:0xffff->NXM_OF_IN_PORT[],resubmit(,65),pop:NXM_OF_IN_PORT[]
cookie=0x6f326bd2, duration=30.054s, table=65, n_packets=0, n_bytes=0, idle_age=30, priority=100,reg15=0x1,metadata=0x8 actions=output:72
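
One way to find the flows that only appear after recompute is to compare two saved dumps with the per-flow counters stripped, since duration, packet and byte counts differ on every dump. The sketch below assumes the dumps were saved to files; the function names and file names are hypothetical.

```shell
# Hypothetical helpers for comparing two saved "ovs-ofctl dump-flows" outputs.

# Keep only flow lines, drop counters that change between dumps, and sort.
normalize_flows() {
    grep 'cookie=' |
        sed -E -e 's/(duration|n_packets|n_bytes|idle_age|hard_age)=[^,]*, //g' \
               -e 's/^[[:space:]]+//' |
        LC_ALL=C sort
}

# new_flows BEFORE_FILE AFTER_FILE
# Print the flows present in AFTER_FILE but not in BEFORE_FILE.
new_flows() {
    _before=$(mktemp)
    _after=$(mktemp)
    normalize_flows < "$1" > "$_before"
    normalize_flows < "$2" > "$_after"
    LC_ALL=C comm -13 "$_before" "$_after"
    rm -f "$_before" "$_after"
}

# Possible usage (file names are placeholders):
#   ovs-ofctl dump-flows br-int > flows-before.txt
#   ovs-appctl -t /var/run/ovn/ovn-controller.2.ctl recompute
#   ovs-ofctl dump-flows br-int > flows-after.txt
#   new_flows flows-before.txt flows-after.txt
```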


[1] https://review.rdoproject.org/r/c/nfvinfo/+/40817

Version-Release number of selected component (if applicable):
ovn-2021-21.12.0-11.el9s

How reproducible:
Reproduces randomly, showing up as a failure of one of the tempest tests.

Steps to Reproduce:
1. Deploy OpenStack Wallaby with OVN on multiple nodes.
2. Run the following until it fails: sudo tempest run --regex tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_update_router_admin_state
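
Step 2 can be automated with a simple retry loop. The following is a minimal sketch, not something taken from the job itself: `run_until_failure` is a hypothetical helper, and the attempt limit is an arbitrary choice.

```shell
# run_until_failure MAX CMD [ARGS...]
# Hypothetical helper: run CMD up to MAX times. Return 1 as soon as one run
# fails (printing the attempt number); return 0 if every run passed.
run_until_failure() {
    _max=$1
    shift
    _attempt=1
    while [ "$_attempt" -le "$_max" ]; do
        if ! "$@"; then
            echo "failed on attempt $_attempt"
            return 1
        fi
        _attempt=$((_attempt + 1))
    done
    echo "no failure in $_max attempts"
    return 0
}

# Possible usage (20 is an arbitrary limit):
#   run_until_failure 20 sudo tempest run --regex \
#     tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_update_router_admin_state
```

Stopping at the first failure leaves the node in the broken state, so the flows can be inspected before anything triggers a recompute.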


Actual results:
Test fails while performing SSH.

Expected results:
Test should pass

Additional info:
This is a regression introduced with ovn-2021-21.12.0-11.el9s.
Attaching the following in a tar archive:
- ovnnb_db before recompute
- ovnsb_db before recompute
- ovs db before and after recompute
- Output of "ovs-ofctl dump-flows br-int" before and after running recompute

Comment 6 Ales Musil 2022-05-23 08:47:42 UTC
Patches posted: https://patchwork.ozlabs.org/project/ovn/list/?series=301546

Comment 10 Ehsan Elahi 2022-06-06 21:01:49 UTC
### Reproduced on 
[root@bz-2076604 ~]# rpm -qa |grep -E 'ovn|openvswitch'
openvswitch-selinux-extra-policy-1.0-28.el8fdp.noarch
ovn-2021-21.12.0-11.el8fdp.x86_64
ovn-2021-central-21.12.0-11.el8fdp.x86_64
ovn-2021-host-21.12.0-11.el8fdp.x86_64
openvswitch2.15-2.15.0-93.el8fdp.x86_64


systemctl start ovn-northd
ovn-nbctl set-connection ptcp:6641
ovn-sbctl set-connection ptcp:6642
systemctl start openvswitch
ovs-vsctl set open . external_ids:system-id=hv1

# IP address configuration to physical interface
ifconfig ens1f0 42.42.42.1 netmask 255.255.0.0

ovs-vsctl set open . external_ids:ovn-remote=tcp:42.42.42.1:6642
ovs-vsctl set open . external_ids:ovn-encap-type=geneve
ovs-vsctl set open . external_ids:ovn-encap-ip=42.42.42.1
systemctl start ovn-controller

ovn-nbctl ls-add ls1
 
ovn-nbctl lsp-add ls1 ls1p1
ovn-nbctl lsp-set-addresses ls1p1 "00:00:00:01:01:11 42.42.42.11"
 
ovn-nbctl lsp-add ls1 ls1lp
ovn-nbctl lsp-set-type ls1lp localport
ovn-nbctl lsp-set-addresses ls1lp "00:00:00:01:01:02 42.42.42.2"
 
ovn-nbctl lr-add lr1                                                                                  
ovn-nbctl lrp-add lr1 lr1-ls1 00:00:00:00:00:01 42.42.42.154/24 2001::a/64
ovn-nbctl lsp-add ls1 ls1-lr1                                                                        
ovn-nbctl lsp-set-addresses ls1-lr1 "00:00:00:00:00:01"
ovn-nbctl lsp-set-type ls1-lr1 router                                                                
ovn-nbctl lsp-set-options ls1-lr1 router-port=lr1-ls1
                                                                                                     
ovn-nbctl lrp-add lr1 lr1-pub  00:00:00:00:00:02 172.17.1.154/24 7011::a/64
ovn-nbctl lrp-set-gateway-chassis lr1-pub hv1                                                        
ovn-nbctl lr-route-add lr1 0.0.0.0/0 172.17.1.100 lr1-pub
ovn-nbctl lr-route-add lr1 ::/0 7011::100 lr1-pub                                                    
                                                                                                     
ovn-nbctl ls-add pub                                                                                  
ovn-nbctl lsp-add pub pub-lr1                                                                        
ovn-nbctl lsp-set-type pub-lr1 router                                                                
ovn-nbctl lsp-set-addresses pub-lr1 router                                                            
ovn-nbctl lsp-set-options pub-lr1 router-port=lr1-pub
                                                                                                     
ovn-nbctl lsp-add pub ln0                                                                            
ovn-nbctl lsp-set-type ln0 localnet                                                                  
ovn-nbctl lsp-set-options ln0 network_name=phys                                                      
ovn-nbctl lsp-set-addresses ln0 unknown                
 
ovs-vsctl add-port br-int ls1p1 -- set interface ls1p1 type=internal external_ids:iface-id=ls1p1
ovs-vsctl add-port br-int ls1lp -- set interface ls1lp type=internal external_ids:iface-id=ls1lp

COOKIE=$(ovn-sbctl find port_binding logical_port=ls1lp|grep uuid|cut -d: -f2| cut -c1-9)
ovs-ofctl dump-flows br-int|grep $COOKIE

cookie=0x46c772c8, duration=2.354s, table=0, n_packets=0, n_bytes=0, idle_age=2, priority=100,in_port=2 actions=load:0x7->NXM_NX_REG13[],load:0x1->NXM_NX_REG11[],load:0x5->NXM_NX_REG12[],load:0x1->OXM_OF_METADATA[],load:0x2->NXM_NX_REG14[],load:0x1->NXM_NX_REG10[10],resubmit(,8)
 cookie=0x46c772c8, duration=2.354s, table=38, n_packets=0, n_bytes=0, idle_age=2, priority=100,reg15=0x2,metadata=0x1 actions=load:0x7->NXM_NX_REG13[],load:0x1->NXM_NX_REG11[],load:0x5->NXM_NX_REG12[],resubmit(,39)
 cookie=0x46c772c8, duration=2.354s, table=39, n_packets=0, n_bytes=0, idle_age=2, priority=100,reg10=0/0x1,reg14=0x2,reg15=0x2,metadata=0x1 actions=drop
 cookie=0x46c772c8, duration=2.354s, table=64, n_packets=0, n_bytes=0, idle_age=2, priority=100,reg10=0x1/0x1,reg15=0x2,metadata=0x1 actions=push:NXM_OF_IN_PORT[],load:0xffff->NXM_OF_IN_PORT[],resubmit(,65),pop:NXM_OF_IN_PORT[]
 cookie=0x46c772c8, duration=2.354s, table=65, n_packets=0, n_bytes=0, idle_age=2, priority=100,reg15=0x2,metadata=0x1 actions=output:2
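
In the output above, the cookie on each flow matches the first eight hex digits of the Port_Binding UUID, which is what the `cut` pipeline relies on. A slightly more explicit version of that mapping could look like the sketch below. `uuid_to_cookie` is a hypothetical helper, the UUIDs in the comments and tests are made up apart from the `46c772c8` prefix, and dropping leading zeros is an assumption about how ovs-ofctl prints cookies.

```shell
# uuid_to_cookie UUID
# Hypothetical helper: print the cookie string (as it would appear in a flow
# dump) for a Port_Binding UUID, i.e. the first 32 bits of the UUID in hex.
# Leading zeros are dropped on the assumption that ovs-ofctl prints
# cookies without padding.
uuid_to_cookie() {
    _hex=$(printf '%s' "$1" | tr -d ' -' | cut -c1-8)
    printf '0x%x\n' "0x$_hex"
}

# Possible usage (should be equivalent to the grep/cut pipeline above):
#   UUID=$(ovn-sbctl --bare --columns=_uuid find port_binding logical_port=ls1lp)
#   ovs-ofctl dump-flows br-int | grep "cookie=$(uuid_to_cookie "$UUID"),"
```

Matching on `cookie=<value>,` with the trailing comma avoids accidental hits on longer cookies that share the same prefix.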

# Cleanup 
ovs-vsctl del-port br-int ls1p1
ovn-nbctl lrp-del-gateway-chassis lr1-pub hv1
ovn-nbctl lrp-del lr1-pub
ovn-nbctl lrp-del lr1-ls1
 
ovn-nbctl lsp-del pub-lr1
ovn-nbctl lsp-del ls1-lr1
 
ovn-nbctl lsp-del ln0
ovn-nbctl ls-del pub
 
ovn-nbctl lsp-del ls1lp
ovn-nbctl ls-del ls1
ovn-nbctl lr-del lr1
ovs-vsctl del-port br-int ls1lp

ovs-ofctl dump-flows br-int|grep $COOKIE
cookie=0x46c772c8, duration=1.034s, table=37, n_packets=0, n_bytes=0, idle_age=1, priority=150,reg14=0x2,metadata=0x1 actions=resubmit(,38)

<=========================== One flow still exists
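
To see at a glance which tables still hold flows for a removed port, the leftover flows can be grouped by table. This is a sketch under the assumption that the dump uses the text layout shown above; `flows_by_table` is a hypothetical name.

```shell
# flows_by_table COOKIE
# Hypothetical helper: read a flow dump on stdin and print "table=N COUNT"
# for every table that still contains flows carrying the given cookie.
flows_by_table() {
    awk -v c="cookie=$1," '
        index($0, c) {
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^table=[0-9]+,$/) {
                    t = $i
                    sub(/,$/, "", t)
                    count[t]++
                    break
                }
            }
        }
        END { for (t in count) print t, count[t] }
    ' | sort -t= -k2 -n
}

# Possible usage after the cleanup steps:
#   ovs-ofctl dump-flows br-int | flows_by_table 0x46c772c8
```

For the reproduction above this would report a single leftover flow in table 37; an empty result corresponds to the fixed behaviour.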

### Verified on 
[root@bz-2076604 ~]# rpm -qa |grep -E 'ovn|openvswitch'
openvswitch-selinux-extra-policy-1.0-28.el8fdp.noarch
ovn-2021-21.12.0-73.el8fdp.x86_64
ovn-2021-central-21.12.0-73.el8fdp.x86_64
ovn-2021-host-21.12.0-73.el8fdp.x86_64
openvswitch2.15-2.15.0-93.el8fdp.x86_64

ovs-ofctl dump-flows br-int|grep $COOKIE
<=========================== All flows removed

Comment 11 Ehsan Elahi 2022-06-07 12:41:48 UTC
### Also verified on 

[root@bz-2074537 ~]# rpm -qa |grep -E 'ovn|openvswitch'
 openvswitch-selinux-extra-policy-1.0-28.el8fdp.noarch
 ovn22.03-22.03.0-52.el8fdp.x86_64
 ovn22.03-host-22.03.0-52.el8fdp.x86_64
 openvswitch2.15-2.15.0-93.el8fdp.x86_64
 ovn22.03-central-22.03.0-52.el8fdp.x86_64

### And verified on

[root@bz-2074537 ~]# rpm -qa |grep -E 'ovn|openvswitch'
openvswitch-selinux-extra-policy-1.0-31.el9fdp.noarch
openvswitch2.16-2.16.0-52.el9fdp.x86_64
ovn22.03-22.03.0-52.el9fdp.x86_64
ovn22.03-central-22.03.0-52.el9fdp.x86_64
ovn22.03-host-22.03.0-52.el9fdp.x86_64

Comment 13 errata-xmlrpc 2022-06-30 18:00:08 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (ovn bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:5446