Bug 1766410 - [OVN] Stale ports in ovn-northd
Summary: [OVN] Stale ports in ovn-northd
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: python-networking-ovn
Version: 15.0 (Stein)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Assignee: Terry Wilson
QA Contact: Eran Kuris
URL:
Whiteboard:
Duplicates: 1770295
Depends On:
Blocks: 1770953
Reported: 2019-10-29 00:32 UTC by Robin Cernin
Modified: 2023-10-06 18:45 UTC
CC: 13 users

Fixed In Version: python-networking-ovn-4.0.3-19.el7ost
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-03-10 11:26:15 UTC
Target Upstream Version:
Embargoed:


Attachments


Links
Red Hat Issue Tracker OSP-29383 (last updated 2023-10-06 18:45:03 UTC)
Red Hat Knowledge Base (Solution) 4565331 (last updated 2020-04-07 08:32:16 UTC)
Red Hat Product Errata RHBA-2020:0770 (last updated 2020-03-10 11:26:58 UTC)

Description Robin Cernin 2019-10-29 00:32:27 UTC
Description of problem:

Stale ports in ovn-northd

Version-Release number of selected component (if applicable):

neutron-metadata-agent-ovn:13.0-73, with a hotfix version of ovn-controller: ovn-controller:13.0-86 (BZ# 1740335)

How reproducible:

The metadata server responds with HTTP 404 to VMs [1]
Ran ovn-metadata-agent in debug mode on that compute node; the logs show the port is not bound to that chassis [2][3]
Checking with ovn-nbctl shows two ports; the one mentioned in [3] is unbound [4]
Deleting the port (ovn-nbctl lsp-del) makes the metadata service work again [5]
Checked against a similar VM [6]. Deleting the VM leaves the stale port behind, so it is leftover from a previous activity/failure.
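
The triage above can be sketched as a command sequence; the port UUID is a placeholder, and the exact ovn-nbctl invocation point (controller container vs. host) varies per deployment:

```shell
# Inside the affected VM: metadata returns 404 for this instance
curl -D - http://169.254.169.254/openstack

# On a controller: list the logical switch ports of the affected network
ovn-nbctl show

# For each suspect port, check the chassis binding; an empty
# requested-chassis here pointed at the stale, unbound port
ovn-nbctl lsp-get-options <port-uuid>

# Workaround from this report: delete the stale logical switch port
ovn-nbctl lsp-del <port-uuid>
```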

[1]
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether fa:16:aa:3b:c8:4f brd ff:ff:ff:ff:ff:ff
    inet x.y.z.z/24 brd x.y.z.255 scope global ens3
       valid_lft forever preferred_lft forever
# curl -D -  http://169.254.169.254/openstack
HTTP/1.1 404 Not Found
Content-Length: 154
Content-Type: text/html; charset=UTF-8
Date: Tue, 15 Oct 2019 20:49:58 GMT

<html>
 <head>
  <title>404 Not Found</title>
 </head>
 <body>
  <h1>404 Not Found</h1>
  The resource could not be found.<br /><br />

[2]
/var/log/containers/neutron/ovn-metadata-agent.log
X-Forwarded-For: x.y.z.z
X-Ovn-Network-Id: 0ae4a2d3-4230-4d96-ad03-a38425dcd271 __call__ /usr/lib/python2.7/site-packages/networking_ovn/agent/metadata/server.py:64
2019-10-15 20:48:09.487 310202 DEBUG ovsdbapp.backend.ovs_idl.transaction [-] Running txn command(idx=0): DbListCommand(if_exists=False, records=None, table=Port_Binding, columns=None, row=True) do_commit /usr/lib/python2.7/site-packages/ovsdbapp/backend/ovs_idl/transaction.py:84
2019-10-15 20:48:09.491 310202 DEBUG ovsdbapp.backend.ovs_idl.transaction [-] Transaction caused no change do_commit /usr/lib/python2.7/site-packages/ovsdbapp/backend/ovs_idl/transaction.py:121
2019-10-15 20:48:09.541 310152 DEBUG ovsdbapp.backend.ovs_idl.event [-] PortBindingChassisEvent : Matched Port_Binding, update, None None matches /usr/lib/python2.7/site-packages/ovsdbapp/backend/ovs_idl/event.py:40
2019-10-15 20:48:09.570 310202 INFO eventlet.wsgi.server [-] 10.152.127.123,<local> "GET /openstack HTTP/1.1" status: 404  len: 297 time: 0.0841720


[3]
/var/log/containers/neutron/ovn-metadata-agent.log
2019-10-15 20:48:08.370 310204 DEBUG eventlet.wsgi.server [-] (310204) accepted '' server /usr/lib/python2.7/site-packages/eventlet/wsgi.py:883
2019-10-15 20:48:08.371 310204 DEBUG networking_ovn.agent.metadata.server [-] Request: GET /openstack HTTP/1.0
Accept: */*
Connection: close
Content-Type: text/plain
Host: 169.254.169.254
User-Agent: curl/7.47.0
X-Forwarded-For: x.y.z.z
X-Ovn-Network-Id: 0ae4a2d3-4230-4d96-ad03-a38425dcd271 __call__ /usr/lib/python2.7/site-packages/networking_ovn/agent/metadata/server.py:64
2019-10-15 20:48:08.372 310204 DEBUG ovsdbapp.backend.ovs_idl.transaction [-] Running txn command(idx=0): DbListCommand(if_exists=False, records=None, table=Port_Binding, columns=None, row=True) do_commit /usr/lib/python2.7/site-packages/ovsdbapp/backend/ovs_idl/transaction.py:84
2019-10-15 20:48:08.375 310204 DEBUG ovsdbapp.backend.ovs_idl.transaction [-] Transaction caused no change do_commit /usr/lib/python2.7/site-packages/ovsdbapp/backend/ovs_idl/transaction.py:121
2019-10-15 20:48:08.453 310204 INFO eventlet.wsgi.server [-] 10.152.127.123,<local> "GET /openstack HTTP/1.1" status: 404  len: 297 time: 0.0826011

[4]
ovn-nbctl show
switch 5f3db358-e37c-4b12-9830-6cc67f101eaf (neutron-ea56cf76-622f-4113-8691-36a2c524773c) (aka Network-Test)
    port 4c2169b8-ca46-421b-b298-32a08fa942b3
        addresses: ["fa:16:aa:66:d2:4b x.y.z.z"]
    port 2ae58c96-ce55-4e9f-bda9-226905a2328d
        addresses: ["fa:16:aa:3b:c8:4f x.y.z.z"]

()[root@controller-2 /]# ovn-nbctl lsp-get-options 4c2169b8-ca46-421b-b298-32a08fa942b3
requested-chassis=
()[root@controller-2 /]# ovn-nbctl lsp-get-options 2ae58c96-ce55-4e9f-bda9-226905a2328d
requested-chassis=compute.example.com

[5]
root@xyz:~# curl -D -  http://169.254.169.254/openstack
HTTP/1.1 200 OK
Content-Type: text/plain; charset=UTF-8
Content-Length: 83
Date: Tue, 15 Oct 2019 22:01:54 GMT

2012-08-10
2013-04-04
2013-10-17
2015-10-15
2016-06-30
2016-10-06
2017-02-22

[6]
    port df6d5ffd-afb4-4099-a7c2-bd9054b5ebd0
        addresses: ["fa:16:aa:8e:de:86 x.y.z.z"]
    port aff8ef4b-30da-447c-9bc8-582858acfd80
        addresses: ["fa:16:aa:31:9f:9d x.y.z.z"]

()[root@controller-2 /]# ovn-nbctl lsp-get-options df6d5ffd-afb4-4099-a7c2-bd9054b5ebd0
requested-chassis=
()[root@controller-2 /]# ovn-nbctl lsp-get-options aff8ef4b-30da-447c-9bc8-582858acfd80
requested-chassis=compute.example.com

After deleting the target VM, the stale port remains there.

===

The stale ports have now been reproduced in the environment; the ones below were generated in the last 24 hours. [1][2]

Attaching the SQL dump and the OVN port list to crosscheck. [3]


[1]
_uuid               : 2463d9ba-d4db-4fbb-82db-8bff1722dd27
addresses           : ["fa:16:aa:78:ad:65 y.x.z.z"]
dhcpv4_options      : aa745ae1-21b0-4369-a833-c5652e2f2d53
dhcpv6_options      : []
dynamic_addresses   : []
enabled             : true
external_ids        : {"neutron:cidrs"="y.x.z.z/23", "neutron:device_id"="db9ff13f-0119-4d0b-b838-a7a57ac17ace", "neutron:device_owner"="", "neutron:network_name"="neutron-589d89c5-45b7-4f8d-b7f7-bb725b403474", "neutron:port_name"="", "neutron:project_id"="ded6134a50fe44dab021417d872d6582", "neutron:revision_number"="6", "neutron:security_group_ids"="58d194f8-bfae-41e2-a4f6-304270f6a4bc b8928044-879d-44d2-bc38-000790d52a28 c9839265-3d5c-4db9-ab58-d8368de1ba0c eedea03f-d200-45c2-98e1-ccedee8626e1"}
ha_chassis_group    : []
name                : "c9abf07f-07cf-44d2-ad8c-a5eaa3d169f4"
options             : {requested-chassis=""}
parent_name         : []
port_security       : ["fa:16:aa:78:ad:65 y.x.z.z"]
tag                 : []
tag_request         : []
type                : ""
up                  : false

$> nova instance-action-list  db9ff13f-0119-4d0b-b838-a7a57ac17ace
+--------+------------------------------------------+---------+----------------------------+----------------------------+
| Action | Request_ID                               | Message | Start_Time                 | Updated_At                 |
+--------+------------------------------------------+---------+----------------------------+----------------------------+
| create | req-20cebc3c-228c-46d3-ae59-ce2a8283efaa | -       | 2019-10-24T06:20:45.000000 | 2019-10-24T06:24:27.000000 |
| delete | req-30addcaa-2ed8-40bd-bb0f-75e3d7ad4861 | -       | 2019-10-24T22:18:47.000000 | 2019-10-24T22:19:30.000000 |
+--------+------------------------------------------+---------+----------------------------+----------------------------+

[2]
_uuid               : 27d57cfd-1aea-49f9-8a35-23bc6c32628a
addresses           : ["fa:16:aa:41:b6:7a y.x.z.z"]
dhcpv4_options      : aa745ae1-21b0-4369-a833-c5652e2f2d53
dhcpv6_options      : []
dynamic_addresses   : []
enabled             : true
external_ids        : {"neutron:cidrs"="y.x.z.z/23", "neutron:device_id"="f75c0e18-5552-4d79-8c60-ca7f9ff7dd36", "neutron:device_owner"="", "neutron:network_name"="neutron-589d89c5-45b7-4f8d-b7f7-bb725b403474", "neutron:port_name"="", "neutron:project_id"="ded6134a50fe44dab021417d872d6582", "neutron:revision_number"="6", "neutron:security_group_ids"="7750e306-107b-4804-8a9f-a179d2df28c4 8b6ec5a6-9656-425b-950b-df5bfd1e7f79 b8928044-879d-44d2-bc38-000790d52a28"}
ha_chassis_group    : []
name                : "f798a667-bf31-4298-b301-c41ef007bc90"
options             : {requested-chassis=""}
parent_name         : []
port_security       : ["fa:16:aa:41:b6:7a y.x.z.z"]
tag                 : []
tag_request         : []
type                : ""
up                  : false

$> nova instance-action-list  f75c0e18-5552-4d79-8c60-ca7f9ff7dd36
+--------+------------------------------------------+---------+----------------------------+----------------------------+
| Action | Request_ID                               | Message | Start_Time                 | Updated_At                 |
+--------+------------------------------------------+---------+----------------------------+----------------------------+
| create | req-5ff8167e-c5f9-4fcd-a002-2e17dadc79ae | -       | 2019-10-24T05:59:38.000000 | 2019-10-24T06:01:33.000000 |
| delete | req-1a494fe2-2f4f-4aa8-a043-0d577b5008b6 | -       | 2019-10-24T06:07:55.000000 | 2019-10-24T06:08:21.000000 |
+--------+------------------------------------------+---------+----------------------------+----------------------------+

[3]
docker exec -it galera-bundle-docker-1 mysql -B -e 'select mac_address from ports' ovs_neutron >/tmp/neutron_macs
docker exec -it ovn-dbs-bundle-docker-0 ovn-nbctl list Logical_Switch_Port >/tmp/ovn_logical_ports
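
The crosscheck itself can be sketched as a small shell pipeline. The file contents below are fabricated samples standing in for the two dumps above, and the grep pattern assumes the default fa:16: MAC prefix Neutron uses:

```shell
# Sample stand-in for the Neutron DB export (header row + one MAC per line)
cat > /tmp/neutron_macs <<'EOF'
mac_address
fa:16:aa:3b:c8:4f
fa:16:aa:8e:de:86
EOF

# Sample stand-in for the ovn-nbctl Logical_Switch_Port dump
cat > /tmp/ovn_logical_ports <<'EOF'
addresses           : ["fa:16:aa:3b:c8:4f 10.0.0.5"]
addresses           : ["fa:16:aa:78:ad:65 10.0.0.9"]
addresses           : ["fa:16:aa:8e:de:86 10.0.0.7"]
EOF

# Extract the MAC from each OVN "addresses" line, then report any MAC with
# no matching row in the Neutron export: those are stale-port candidates
grep -o 'fa:16:[0-9a-f:]*' /tmp/ovn_logical_ports | sort -u > /tmp/ovn_macs
tail -n +2 /tmp/neutron_macs | sort -u > /tmp/db_macs
comm -23 /tmp/ovn_macs /tmp/db_macs
```

With the sample data, only the unmatched MAC fa:16:aa:78:ad:65 is printed.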


We have looked at https://bugzilla.redhat.com/show_bug.cgi?id=1762777, but that does not seem to be the same issue.

Comment 3 Numan Siddique 2019-11-04 13:21:05 UTC
OVN does not create any logical ports on its own. It is the CMS (networking-ovn) that creates the logical ports, and it is its responsibility to clean them up.
Moving the BZ to networking-ovn.
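
For confirmed stale ports, networking-ovn ships a sync utility that can reconcile the OVN NB database against Neutron. A hedged sketch; the config-file paths are deployment-specific and the exact flags may differ between releases:

```shell
# Audit first: "log" mode only reports Neutron/OVN NB inconsistencies
neutron-ovn-db-sync-util \
    --config-file /etc/neutron/neutron.conf \
    --config-file /etc/neutron/plugins/ml2/ml2_conf.ini \
    --ovn-neutron_sync_mode log

# "repair" mode removes OVN resources with no matching Neutron object;
# run it during a maintenance window
neutron-ovn-db-sync-util \
    --config-file /etc/neutron/neutron.conf \
    --config-file /etc/neutron/plugins/ml2/ml2_conf.ini \
    --ovn-neutron_sync_mode repair
```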

Comment 40 Eran Kuris 2020-02-10 10:15:13 UTC
Fixed in version:
python3-networking-ovn-6.0.1-0.20200116080451.e669382.el8ost.noarch

Red Hat Enterprise Linux release 8.1 (Ootpa) 
RHOS_TRUNK-15.0-RHEL-8-20200130.n.1

Created a cloud with 2 networks and 2 VMs per network.
Checked that the metadata service is working, and also validated internal and external connectivity.

Comment 43 errata-xmlrpc 2020-03-10 11:26:15 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0770

Comment 44 Numan Siddique 2020-04-07 08:32:17 UTC
*** Bug 1770295 has been marked as a duplicate of this bug. ***

