Bug 2196735 - [Octavia] Interface name collision in the amphora
Summary: [Octavia] Interface name collision in the amphora
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-octavia
Version: 16.1 (Train)
Hardware: Unspecified
OS: Unspecified
medium
medium
Target Milestone: async
: 16.1 (Train on RHEL 8.2)
Assignee: Gregory Thiemonge
QA Contact: Bruna Bonguardo
Greg Rakauskas
URL:
Whiteboard:
Depends On: 2192413 2196744
Blocks:
TreeView+ depends on / blocked
 
Reported: 2023-05-10 06:10 UTC by Gregory Thiemonge
Modified: 2023-09-18 06:42 UTC (History)
7 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 2192413
Environment:
Last Closed: 2023-09-18 06:39:47 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
OpenStack gerrit 885727 0 None MERGED Avoid interface name collisions in the amphora 2023-06-19 15:32:42 UTC
Red Hat Issue Tracker OSP-25082 0 None None None 2023-05-15 23:43:50 UTC

Description Gregory Thiemonge 2023-05-10 06:10:56 UTC
+++ This bug was initially created as a clone of Bug #2192413 +++

Description of problem:

After adding/deleting members from different networks, adding a new member may trigger an interface name collision in the amphora.



it fails because eth3 already exists

Version-Release number of selected component (if applicable):
This issue happens in 16.1 and 16.2 


How reproducible:
Always

Steps to reproduce:
- create a LB, a listener and a pool
- add member1 (subnet1/network1) -> eth2 is created in the ns
- add member2 (subnet2/network2) -> eth3 is created
- delete member1 (subnet1/network1) -> eth2 is deleted
- add member3 (subnet3/network3) -> Plugged interface ens8 will become eth3 in the namespace amphora-haproxy

Actual results:


Expected results:


Additional info:

Upstream Bug details:
https://bugs.launchpad.net/octavia/+bug/2017894

--- Additional comment from Gregory Thiemonge on 2023-05-02 06:10:49 UTC ---

Filtered logs of the amphora:

[root@controller-0 heat-admin]# grep -e POST -e NetlinkError -e eth octavia-amphora.log.1 
Apr 27 03:08:15 amphora-4ed50bc1-237c-4014-b831-b419356fe84b gunicorn.gunicorn.error: [1387] POST /1.0/plug/vip/10.0.0.167
Apr 27 03:08:25 amphora-4ed50bc1-237c-4014-b831-b419356fe84b gunicorn.gunicorn.access: [1387] ::ffff:172.24.2.103 - - [27/Apr/2023:03:08:25 -0400] "POST /1.0/plug/vip/10.0.0.167 HTTP/1.1" 202 69 "-" "Octavia HaProxy Rest Client/0.5 (https://wiki.openstack.org/wiki/Octavia)"
Apr 27 03:08:43 amphora-4ed50bc1-237c-4014-b831-b419356fe84b gunicorn.gunicorn.error: [1387] POST /1.0/plug/network
Apr 27 03:08:43 amphora-4ed50bc1-237c-4014-b831-b419356fe84b amphora-agent[1387]: 2023-04-27 03:08:43.062 1387 INFO octavia.amphorae.backends.agent.api_server.plug [-] Plugged interface eth1 will become eth2 in the namespace amphora-haproxy
Apr 27 03:08:47 amphora-4ed50bc1-237c-4014-b831-b419356fe84b gunicorn.gunicorn.access: [1387] ::ffff:172.24.2.103 - - [27/Apr/2023:03:08:47 -0400] "POST /1.0/plug/network HTTP/1.1" 202 54 "-" "Octavia HaProxy Rest Client/0.5 (https://wiki.openstack.org/wiki/Octavia)"
Apr 27 03:08:58 amphora-4ed50bc1-237c-4014-b831-b419356fe84b gunicorn.gunicorn.error: [1387] POST /1.0/plug/network
Apr 27 03:08:58 amphora-4ed50bc1-237c-4014-b831-b419356fe84b amphora-agent[1387]: 2023-04-27 03:08:58.171 1387 INFO octavia.amphorae.backends.agent.api_server.plug [-] Plugged interface eth1 will become eth3 in the namespace amphora-haproxy
Apr 27 03:09:02 amphora-4ed50bc1-237c-4014-b831-b419356fe84b gunicorn.gunicorn.access: [1387] ::ffff:172.24.2.103 - - [27/Apr/2023:03:09:02 -0400] "POST /1.0/plug/network HTTP/1.1" 202 54 "-" "Octavia HaProxy Rest Client/0.5 (https://wiki.openstack.org/wiki/Octavia)"
Apr 27 03:09:17 amphora-4ed50bc1-237c-4014-b831-b419356fe84b gunicorn.gunicorn.error: [1387] POST /1.0/plug/network
Apr 27 03:09:17 amphora-4ed50bc1-237c-4014-b831-b419356fe84b gunicorn.gunicorn.access: [1387] ::ffff:172.24.2.103 - - [27/Apr/2023:03:09:17 -0400] "POST /1.0/plug/network HTTP/1.1" 500 47 "-" "Octavia HaProxy Rest Client/0.5 (https://wiki.openstack.org/wiki/Octavia)"
Apr 27 03:09:17 amphora-4ed50bc1-237c-4014-b831-b419356fe84b amphora-agent[1387]: 2023-04-27 03:09:17.430 1387 INFO octavia.amphorae.backends.agent.api_server.plug [-] Plugged interface eth1 will become eth3 in the namespace amphora-haproxy
Apr 27 03:09:17 amphora-4ed50bc1-237c-4014-b831-b419356fe84b amphora-agent[1387]: 2023-04-27 03:09:17.437 1387 ERROR flask.app [-] Exception on /1.0/plug/network [POST]: pyroute2.netlink.exceptions.NetlinkError: (17, 'File exists')#0122023-04-27 03:09:17.437 1387 ERROR flask.app Traceback (most recent call last):#0122023-04-27 03:09:17.437 1387 ERROR flask.app   File "/usr/lib/python3.6/site-packages/flask/app.py", line 2292, in wsgi_app#0122023-04-27 03:09:17.437 1387 ERROR flask.app     response = self.full_dispatch_request()#0122023-04-27 03:09:17.437 1387 ERROR flask.app   File "/usr/lib/python3.6/site-packages/flask/app.py", line 1815, in full_dispatch_request#0122023-04-27 03:09:17.437 1387 ERROR flask.app     rv = self.handle_user_exception(e)#0122023-04-27 03:09:17.437 1387 ERROR flask.app   File "/usr/lib/python3.6/site-packages/flask/app.py", line 1718, in handle_user_exception#0122023-04-27 03:09:17.437 1387 ERROR flask.app     reraise(exc_type, exc_value, tb)#0122023-04-27 03:09:17.437 1387 ERROR flask.app   File "/usr/lib/python3.6/site-packages/flask/_compat.py", line 35, in reraise#0122023-04-27 03:09:17.437 1387 ERROR flask.app     raise value#0122023-04-27 03:09:17.437 1387 ERROR flask.app   File "/usr/lib/python3.6/site-packages/flask/app.py", line 1813, in full_dispatch_request#0122023-04-27 03:09:17.437 1387 ERROR flask.app     rv = self.dispatch_request()#0122023-04-27 03:09:17.437 1387 ERROR flask.app   File "/usr/lib/python3.6/site-packages/flask/app.py", line 1799, in dispatch_request#0122023-04-27 03:09:17.437 1387 ERROR flask.app     return self.view_functions[rule.endpoint](**req.view_args)#0122023-04-27 03:09:17.437 1387 ERROR flask.app   File "/usr/lib/python3.6/site-packages/octavia/amphorae/backends/agent/api_server/server.py", line 210, in plug_network#0122023-04-27 03:09:17.437 1387 ERROR flask.app     port_info.get('mtu'))#0122023-04-27 03:09:17.437 1387 ERROR flask.app   File "/usr/lib/python3.6/site-packages/octavia/amphorae/backends/agent/api_server/plug.py", line 218, in plug_network#0122023-04-27 03:09:17.437 1387 ERROR flask.app     IFLA_IFNAME=netns_interface)#0122023-04-27 03:09:17.437 1387 ERROR flask.app   File "/usr/lib/python3.6/site-packages/pyroute2/iproute/linux.py", line 1163, in link#0122023-04-27 03:09:17.437 1387 ERROR flask.app     msg_flags=msg_flags)#0122023-04-27 03:09:17.437 1387 ERROR flask.app   File "/usr/lib/python3.6/site-packages/pyroute2/netlink/nlsocket.py", line 373, in nlm_request#0122023-04-27 03:09:17.437 1387 ERROR flask.app     return tuple(self._genlm_request(*argv, **kwarg))#0122023-04-27 03:09:17.437 1387 ERROR flask.app   File "/usr/lib/python3.6/site-packages/pyroute2/netlink/nlsocket.py", line 864, in nlm_request#0122023-04-27 03:09:17.437 1387 ERROR flask.app     callback=callback):#0122023-04-27 03:09:17.437 1387 ERROR flask.app   File "/usr/lib/python3.6/site-packages/pyroute2/netlink/nlsocket.py", line 376, in get#0122023-04-27 03:09:17.437 1387 ERROR flask.app     return tuple(self._genlm_get(*argv, **kwarg))#0122023-04-27 03:09:17.437 1387 ERROR flask.app   File "/usr/lib/python3.6/site-packages/pyroute2/netlink/nlsocket.py", line 701, in get#0122023-04-27 03:09:17.437 1387 ERROR flask.app     raise msg['header']['error']#0122023-04-27 03:09:17.437 1387 ERROR flask.app pyroute2.netlink.exceptions.NetlinkError: (17, 'File exists')#0122023-04-27 03:09:17.437 1387 ERROR flask.app

--- Additional comment from Gregory Thiemonge on 2023-05-02 06:53:04 UTC ---

(In reply to chrisbro from comment #0)

> - delete member1 (subnet1/network1) -> eth2 is deleted

to be precise:
- in >=17.0, eth2 is removed when the member is deleted
- in 16.2, eth2 is not deleted at this point, but it is removed when the next member is added.

that said, this bug affects 16.x and 17.x


Note You need to log in before you can comment on or make changes to this bug.