Created attachment 1873516
OVN, OVS DB's archive and flows before and after recompute

Description of problem:
Originally detected in the upstream OpenStack job https://bugs.launchpad.net/tripleo/+bug/1968732. This is seen after the ovn-2021 update[1] in the CentOS Stream NFV SIG (ovn-2021-21.06.0-17.el9s --> ovn-2021-21.12.0-11.el9s).

When the issue happens (SSH fails because the VM cannot connect to the metadata proxy at 169.254.169.254:80), the flows for the metadata port (IP 10.100.0.2) are missing. After running a recompute (ovs-appctl -t /var/run/ovn/ovn-controller.2.ctl recompute), the flows appear and everything works fine.

[root@node-0002427663 /]# ovs-ofctl show br-int
OFPT_FEATURES_REPLY (xid=0x2): dpid:0000ea1fd3f2be98
n_tables:254, n_buffers:0
capabilities: FLOW_STATS TABLE_STATS PORT_STATS QUEUE_STATS ARP_MATCH_IP
actions: output enqueue set_vlan_vid set_vlan_pcp strip_vlan mod_dl_src mod_dl_dst mod_nw_src mod_nw_dst mod_nw_tos mod_tp_src mod_tp_dst
 1(patch-br-int-to): addr:32:44:73:93:c8:40
     config:     0
     state:      0
     speed: 0 Mbps now, 0 Mbps max
 46(tapef1645c0-13): addr:fe:16:3e:a7:0e:10
     config:     0
     state:      0
     current:    10MB-FD COPPER
     speed: 10 Mbps now, 0 Mbps max
 47(tapa3803bbf-b0): addr:ea:fe:c4:d2:ab:b8
     config:     0
     state:      0
     current:    10GB-FD COPPER
     speed: 10000 Mbps now, 0 Mbps max
 48(tap4987daed-f3): addr:fe:16:3e:e2:c5:59
     config:     0
     state:      0
     current:    10MB-FD COPPER
     speed: 10 Mbps now, 0 Mbps max
 49(tap96ca70a5-6d): addr:fe:16:3e:29:c1:e9
     config:     0
     state:      0
     current:    10MB-FD COPPER
     speed: 10 Mbps now, 0 Mbps max
 50(tape6d9123e-6f): addr:fe:16:3e:9a:67:86
     config:     0
     state:      0
     current:    10MB-FD COPPER
     speed: 10 Mbps now, 0 Mbps max
 51(tapd59710fd-ed): addr:fe:16:3e:a0:2c:2e
     config:     0
     state:      0
     current:    10MB-FD COPPER
     speed: 10 Mbps now, 0 Mbps max
 52(tapfe28971b-5c): addr:fe:16:3e:fa:ee:21
     config:     0
     state:      0
     current:    10MB-FD COPPER
     speed: 10 Mbps now, 0 Mbps max
 71(tap67e2e6ae-0e): addr:fe:16:3e:8a:77:69
     config:     0
     state:      0
     current:    10MB-FD COPPER
     speed: 10 Mbps now, 0 Mbps max
 72(tapd00f18e1-90): addr:42:f4:52:46:51:da
     config:     0
     state:      0
     current:    10GB-FD COPPER
     speed: 10000 Mbps now, 0 Mbps max
 LOCAL(br-int): addr:ea:1f:d3:f2:be:98
     config:     PORT_DOWN
     state:      LINK_DOWN
     speed: 0 Mbps now, 0 Mbps max
OFPT_GET_CONFIG_REPLY (xid=0x4): frags=normal miss_send_len=0

Metadata port ==> 72(tapd00f18e1-90): addr:42:f4:52:46:51:da

[root@node-0002427663 /]# ovs-ofctl dump-flows br-int|grep output:
 cookie=0xf91d47f4, duration=12983.494s, table=65, n_packets=2043, n_bytes=207683, idle_age=155, priority=100,reg15=0x1,metadata=0x2 actions=output:1
 cookie=0xe2f863bd, duration=5577.208s, table=65, n_packets=68, n_bytes=7626, idle_age=5475, priority=100,reg15=0x3,metadata=0x4 actions=output:46
 cookie=0xa397bc61, duration=5576.884s, table=65, n_packets=583, n_bytes=50253, idle_age=5530, priority=100,reg15=0x1,metadata=0x4 actions=output:47
 cookie=0x2124f02e, duration=5575.984s, table=65, n_packets=67, n_bytes=7500, idle_age=5479, priority=100,reg15=0x4,metadata=0x4 actions=output:48
 cookie=0x42b7518f, duration=5575.761s, table=65, n_packets=64, n_bytes=7374, idle_age=5487, priority=100,reg15=0x5,metadata=0x4 actions=output:49
 cookie=0xb2cef043, duration=5575.489s, table=65, n_packets=63, n_bytes=7332, idle_age=5477, priority=100,reg15=0x6,metadata=0x4 actions=output:50
 cookie=0x72093ac3, duration=5575.131s, table=65, n_packets=61, n_bytes=7248, idle_age=5471, priority=100,reg15=0x7,metadata=0x4 actions=output:51
 cookie=0x1cddeec2, duration=5574.421s, table=65, n_packets=58, n_bytes=7098, idle_age=5469, priority=100,reg15=0x8,metadata=0x4 actions=output:52
 cookie=0xf589365e, duration=946.264s, table=65, n_packets=62, n_bytes=4544, idle_age=15, priority=100,reg15=0x3,metadata=0x8 actions=output:71
# Flow missing ^ for metadata port 72

[root@node-0002427663 /]# ovs-appctl -t /var/run/ovn/ovn-controller.2.ctl recompute

[root@node-0002427663 /]# ovs-ofctl dump-flows br-int|grep output:
 cookie=0xf91d47f4, duration=13472.388s, table=65, n_packets=2056, n_bytes=208745, idle_age=371, priority=100,reg15=0x1,metadata=0x2 actions=output:1
 cookie=0xe2f863bd, duration=6066.102s, table=65, n_packets=68, n_bytes=7626, idle_age=5964, priority=100,reg15=0x3,metadata=0x4 actions=output:46
 cookie=0xa397bc61, duration=6065.778s, table=65, n_packets=583, n_bytes=50253, idle_age=6019, priority=100,reg15=0x1,metadata=0x4 actions=output:47
 cookie=0x2124f02e, duration=6064.878s, table=65, n_packets=67, n_bytes=7500, idle_age=5968, priority=100,reg15=0x4,metadata=0x4 actions=output:48
 cookie=0x42b7518f, duration=6064.655s, table=65, n_packets=64, n_bytes=7374, idle_age=5976, priority=100,reg15=0x5,metadata=0x4 actions=output:49
 cookie=0xb2cef043, duration=6064.383s, table=65, n_packets=63, n_bytes=7332, idle_age=5966, priority=100,reg15=0x6,metadata=0x4 actions=output:50
 cookie=0x72093ac3, duration=6064.025s, table=65, n_packets=61, n_bytes=7248, idle_age=5959, priority=100,reg15=0x7,metadata=0x4 actions=output:51
 cookie=0x1cddeec2, duration=6063.315s, table=65, n_packets=58, n_bytes=7098, idle_age=5958, priority=100,reg15=0x8,metadata=0x4 actions=output:52
 cookie=0xf589365e, duration=1435.158s, table=65, n_packets=66, n_bytes=4768, idle_age=366, priority=100,reg15=0x3,metadata=0x8 actions=output:71
 cookie=0x6f326bd2, duration=52.307s, table=65, n_packets=0, n_bytes=0, idle_age=52, priority=100,reg15=0x1,metadata=0x8 actions=output:72
# After recompute flow for port 72 appears ^.
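The by-eye check above (port 72 is listed by ovs-ofctl show but has no output flow) could be scripted roughly as follows. This is only a sketch: the function name and the trimmed sample strings are illustrative, and it assumes every numbered port should have a flow whose action is a plain output:N, which holds for the table=65 flows shown here but is not guaranteed in general.

```python
import re

# Matches numbered ports such as "72(tapd00f18e1-90): addr:..." in
# `ovs-ofctl show` text; the LOCAL port has no number and is skipped.
PORT_RE = re.compile(r"\b(\d+)\(([^)]+)\): addr:")
# Matches the port number in "actions=output:72".
OUTPUT_RE = re.compile(r"actions=output:(\d+)")


def ports_without_output_flow(show_text, flows_text):
    """Return {port_number: name} for ports that no output flow points at."""
    ports = {int(num): name for num, name in PORT_RE.findall(show_text)}
    with_flow = {int(num) for num in OUTPUT_RE.findall(flows_text)}
    return {num: name for num, name in ports.items() if num not in with_flow}


# Trimmed sample taken from the output in this report (before recompute).
SHOW = """\
 71(tap67e2e6ae-0e): addr:fe:16:3e:8a:77:69
 72(tapd00f18e1-90): addr:42:f4:52:46:51:da
 LOCAL(br-int): addr:ea:1f:d3:f2:be:98
"""
FLOWS = (
    " cookie=0xf589365e, duration=946.264s, table=65, n_packets=62, "
    "n_bytes=4544, idle_age=15, priority=100,reg15=0x3,metadata=0x8 "
    "actions=output:71\n"
)

if __name__ == "__main__":
    print(ports_without_output_flow(SHOW, FLOWS))
```

In practice the two strings would come from saved ovs-ofctl show br-int and ovs-ofctl dump-flows br-int output.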
The following additional flows appear after running recompute:

 cookie=0x6f326bd2, duration=30.050s, table=0, n_packets=0, n_bytes=0, idle_age=30, priority=100,in_port=72 actions=load:0x1b->NXM_NX_REG13[],load:0x18->NXM_NX_REG11[],load:0x19->NXM_NX_REG12[],load:0x8->OXM_OF_METADATA[],load:0x1->NXM_NX_REG14[],load:0x1->NXM_NX_REG10[10],resubmit(,8)
 cookie=0x6f326bd2, duration=30.053s, table=37, n_packets=0, n_bytes=0, idle_age=30, priority=150,reg14=0x1,metadata=0x8 actions=resubmit(,38)
 cookie=0x6f326bd2, duration=30.053s, table=38, n_packets=0, n_bytes=0, idle_age=30, priority=100,reg15=0x1,metadata=0x8 actions=load:0x1b->NXM_NX_REG13[],load:0x18->NXM_NX_REG11[],load:0x19->NXM_NX_REG12[],resubmit(,39)
 cookie=0x6f326bd2, duration=30.053s, table=39, n_packets=0, n_bytes=0, idle_age=30, priority=100,reg10=0/0x1,reg14=0x1,reg15=0x1,metadata=0x8 actions=drop
 cookie=0x6f326bd2, duration=30.054s, table=64, n_packets=0, n_bytes=0, idle_age=30, priority=100,reg10=0x1/0x1,reg15=0x1,metadata=0x8 actions=push:NXM_OF_IN_PORT[],load:0xffff->NXM_OF_IN_PORT[],resubmit(,65),pop:NXM_OF_IN_PORT[]
 cookie=0x6f326bd2, duration=30.054s, table=65, n_packets=0, n_bytes=0, idle_age=30, priority=100,reg15=0x1,metadata=0x8 actions=output:72

[1] https://review.rdoproject.org/r/c/nfvinfo/+/40817

Version-Release number of selected component (if applicable):
ovn-2021-21.12.0-11.el9s

How reproducible:
Reproduces randomly, as one of the tempest tests fails.

Steps to Reproduce:
1. Deploy an OVN-based multinode OpenStack Wallaby environment.
2. Run the following until it fails:
   sudo tempest run --regex tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_update_router_admin_state

Actual results:
The test fails while performing SSH.

Expected results:
The test should pass.

Additional info:
This is a regression introduced in ovn-2021-21.12.0-11.el9s.
Attaching the following in a tar archive:
- ovnnb_db before recompute
- ovnsb_db before recompute
- ovs db before and after recompute
- Output of "ovs-ofctl dump-flows br-int" before and after running recompute
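For anyone comparing the attached before/after flow dumps, a small script along these lines may help. It is a minimal sketch, not part of the attachment: the function names are made up for illustration, and it assumes that dropping the counters (duration, n_packets, n_bytes, idle_age, hard_age) is enough to make two dumps comparable.

```python
import re

# Fields that change between two dumps even when the flow itself is the same.
VOLATILE = re.compile(
    r"\b(?:duration|n_packets|n_bytes|idle_age|hard_age)=[^,\s]+,?\s*"
)


def normalize(line):
    """Strip the volatile counters from one `ovs-ofctl dump-flows` line."""
    return VOLATILE.sub("", line).strip()


def flow_set(dump_text):
    """Set of normalized flows found in a full dump."""
    return {normalize(l) for l in dump_text.splitlines() if "cookie=" in l}


def added_flows(before_text, after_text):
    """Flows present only in the second dump, e.g. installed by recompute."""
    return sorted(flow_set(after_text) - flow_set(before_text))


# Two lines from the table=65 output in this report, before and after.
BEFORE = (
    " cookie=0xf589365e, duration=946.264s, table=65, n_packets=62, "
    "n_bytes=4544, idle_age=15, priority=100,reg15=0x3,metadata=0x8 "
    "actions=output:71\n"
)
AFTER = (
    " cookie=0xf589365e, duration=1435.158s, table=65, n_packets=66, "
    "n_bytes=4768, idle_age=366, priority=100,reg15=0x3,metadata=0x8 "
    "actions=output:71\n"
    " cookie=0x6f326bd2, duration=52.307s, table=65, n_packets=0, "
    "n_bytes=0, idle_age=52, priority=100,reg15=0x1,metadata=0x8 "
    "actions=output:72\n"
)

if __name__ == "__main__":
    for flow in added_flows(BEFORE, AFTER):
        print(flow)
```

Swapping the two arguments would list flows that exist only before the recompute, which is the stale-flow direction.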
Patches posted: https://patchwork.ozlabs.org/project/ovn/list/?series=301546
### Reproduced on
[root@bz-2076604 ~]# rpm -qa |grep -E 'ovn|openvswitch'
openvswitch-selinux-extra-policy-1.0-28.el8fdp.noarch
ovn-2021-21.12.0-11.el8fdp.x86_64
ovn-2021-central-21.12.0-11.el8fdp.x86_64
ovn-2021-host-21.12.0-11.el8fdp.x86_64
openvswitch2.15-2.15.0-93.el8fdp.x86_64

systemctl start ovn-northd
ovn-nbctl set-connection ptcp:6641
ovn-sbctl set-connection ptcp:6642
systemctl start openvswitch
ovs-vsctl set open . external_ids:system-id=hv1

# IP address configuration to physical interface
ifconfig ens1f0 42.42.42.1 netmask 255.255.0.0

ovs-vsctl set open . external_ids:ovn-remote=tcp:42.42.42.1:6642
ovs-vsctl set open . external_ids:ovn-encap-type=geneve
ovs-vsctl set open . external_ids:ovn-encap-ip=42.42.42.1
systemctl start ovn-controller

ovn-nbctl ls-add ls1
ovn-nbctl lsp-add ls1 ls1p1
ovn-nbctl lsp-set-addresses ls1p1 "00:00:00:01:01:11 42.42.42.11"
ovn-nbctl lsp-add ls1 ls1lp
ovn-nbctl lsp-set-type ls1lp localport
ovn-nbctl lsp-set-addresses ls1lp "00:00:00:01:01:02 42.42.42.2"

ovn-nbctl lr-add lr1
ovn-nbctl lrp-add lr1 lr1-ls1 00:00:00:00:00:01 42.42.42.154/24 2001::a/64
ovn-nbctl lsp-add ls1 ls1-lr1
ovn-nbctl lsp-set-addresses ls1-lr1 "00:00:00:00:00:01"
ovn-nbctl lsp-set-type ls1-lr1 router
ovn-nbctl lsp-set-options ls1-lr1 router-port=lr1-ls1

ovn-nbctl lrp-add lr1 lr1-pub 00:00:00:00:00:02 172.17.1.154/24 7011::a/64
ovn-nbctl lrp-set-gateway-chassis lr1-pub hv1
ovn-nbctl lr-route-add lr1 0.0.0.0/0 172.17.1.100 lr1-pub
ovn-nbctl lr-route-add lr1 ::/0 7011::100 lr1-pub

ovn-nbctl ls-add pub
ovn-nbctl lsp-add pub pub-lr1
ovn-nbctl lsp-set-type pub-lr1 router
ovn-nbctl lsp-set-addresses pub-lr1 router
ovn-nbctl lsp-set-options pub-lr1 router-port=lr1-pub
ovn-nbctl lsp-add pub ln0
ovn-nbctl lsp-set-type ln0 localnet
ovn-nbctl lsp-set-options ln0 network_name=phys
ovn-nbctl lsp-set-addresses ln0 unknown

ovs-vsctl add-port br-int ls1p1 -- set interface ls1p1 type=internal external_ids:iface-id=ls1p1
ovs-vsctl add-port br-int ls1lp -- set interface ls1lp type=internal external_ids:iface-id=ls1lp

COOKIE=$(ovn-sbctl find port_binding logical_port=ls1lp|grep uuid|cut -d: -f2| cut -c1-9)
ovs-ofctl dump-flows br-int|grep $COOKIE
 cookie=0x46c772c8, duration=2.354s, table=0, n_packets=0, n_bytes=0, idle_age=2, priority=100,in_port=2 actions=load:0x7->NXM_NX_REG13[],load:0x1->NXM_NX_REG11[],load:0x5->NXM_NX_REG12[],load:0x1->OXM_OF_METADATA[],load:0x2->NXM_NX_REG14[],load:0x1->NXM_NX_REG10[10],resubmit(,8)
 cookie=0x46c772c8, duration=2.354s, table=38, n_packets=0, n_bytes=0, idle_age=2, priority=100,reg15=0x2,metadata=0x1 actions=load:0x7->NXM_NX_REG13[],load:0x1->NXM_NX_REG11[],load:0x5->NXM_NX_REG12[],resubmit(,39)
 cookie=0x46c772c8, duration=2.354s, table=39, n_packets=0, n_bytes=0, idle_age=2, priority=100,reg10=0/0x1,reg14=0x2,reg15=0x2,metadata=0x1 actions=drop
 cookie=0x46c772c8, duration=2.354s, table=64, n_packets=0, n_bytes=0, idle_age=2, priority=100,reg10=0x1/0x1,reg15=0x2,metadata=0x1 actions=push:NXM_OF_IN_PORT[],load:0xffff->NXM_OF_IN_PORT[],resubmit(,65),pop:NXM_OF_IN_PORT[]
 cookie=0x46c772c8, duration=2.354s, table=65, n_packets=0, n_bytes=0, idle_age=2, priority=100,reg15=0x2,metadata=0x1 actions=output:2

# Cleanup
ovs-vsctl del-port br-int ls1p1
ovn-nbctl lrp-del-gateway-chassis lr1-pub hv1
ovn-nbctl lrp-del lr1-pub
ovn-nbctl lrp-del lr1-ls1
ovn-nbctl lsp-del pub-lr1
ovn-nbctl lsp-del ls1-lr1
ovn-nbctl lsp-del ln0
ovn-nbctl ls-del pub
ovn-nbctl lsp-del ls1lp
ovn-nbctl ls-del ls1
ovn-nbctl lr-del lr1
ovs-vsctl del-port br-int ls1lp

ovs-ofctl dump-flows br-int|grep $COOKIE
 cookie=0x46c772c8, duration=1.034s, table=37, n_packets=0, n_bytes=0, idle_age=1, priority=150,reg14=0x2,metadata=0x1 actions=resubmit(,38)
<=========================== One flow still exists

### Verified on
[root@bz-2076604 ~]# rpm -qa |grep -E 'ovn|openvswitch'
openvswitch-selinux-extra-policy-1.0-28.el8fdp.noarch
ovn-2021-21.12.0-73.el8fdp.x86_64
ovn-2021-central-21.12.0-73.el8fdp.x86_64
ovn-2021-host-21.12.0-73.el8fdp.x86_64
openvswitch2.15-2.15.0-93.el8fdp.x86_64

ovs-ofctl dump-flows br-int|grep $COOKIE
<=========================== All flows removed
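The COOKIE lookup in the reproducer relies on the flows for a port carrying the first 8 hex digits of its Port_Binding UUID as the OpenFlow cookie. The same idea can be sketched in Python for use against saved dumps. The function names are illustrative, and the sample UUID is made up apart from its first 8 hex digits (46c772c8), which are taken from the cookie shown above; the full UUID is not given in this report. The sketch also assumes ovs-ofctl prints the cookie without zero padding.

```python
import uuid


def cookie_from_uuid(uuid_str):
    """Top 32 bits of a UUID, formatted like an ovs-ofctl cookie (0x...)."""
    return hex(uuid.UUID(uuid_str).int >> 96)


def flows_with_cookie(dump_text, cookie):
    """Lines of an `ovs-ofctl dump-flows` dump that carry the given cookie."""
    needle = "cookie=%s," % cookie
    return [line.strip() for line in dump_text.splitlines() if needle in line]


# Hypothetical UUID: only the leading 46c772c8 comes from the report.
SAMPLE_UUID = "46c772c8-0000-4000-8000-000000000000"

# The stale flow seen after cleanup on the affected build, plus one
# unrelated flow from the original description for contrast.
DUMP = (
    " cookie=0x46c772c8, duration=1.034s, table=37, n_packets=0, n_bytes=0, "
    "idle_age=1, priority=150,reg14=0x2,metadata=0x1 actions=resubmit(,38)\n"
    " cookie=0xf589365e, duration=946.264s, table=65, n_packets=62, "
    "n_bytes=4544, idle_age=15, priority=100,reg15=0x3,metadata=0x8 "
    "actions=output:71\n"
)

if __name__ == "__main__":
    cookie = cookie_from_uuid(SAMPLE_UUID)
    leftovers = flows_with_cookie(DUMP, cookie)
    print(cookie, len(leftovers))
```

A non-empty result after the port has been deleted corresponds to the "One flow still exists" case; an empty one corresponds to "All flows removed". Matching on the full "cookie=0x...," token is slightly stricter than the plain grep, which would also match the digits anywhere else in the line.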
### Also verified on
[root@bz-2074537 ~]# rpm -qa |grep -E 'ovn|openvswitch'
openvswitch-selinux-extra-policy-1.0-28.el8fdp.noarch
ovn22.03-22.03.0-52.el8fdp.x86_64
ovn22.03-host-22.03.0-52.el8fdp.x86_64
openvswitch2.15-2.15.0-93.el8fdp.x86_64
ovn22.03-central-22.03.0-52.el8fdp.x86_64

### And verified on
[root@bz-2074537 ~]# rpm -qa |grep -E 'ovn|openvswitch'
openvswitch-selinux-extra-policy-1.0-31.el9fdp.noarch
openvswitch2.16-2.16.0-52.el9fdp.x86_64
ovn22.03-22.03.0-52.el9fdp.x86_64
ovn22.03-central-22.03.0-52.el9fdp.x86_64
ovn22.03-host-22.03.0-52.el9fdp.x86_64
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (ovn bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2022:5446