Bug 2192413 - [Octavia] Interface name collision in the amphora
Summary: [Octavia] Interface name collision in the amphora
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-octavia
Version: 16.2 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: z6
Target Release: 16.2 (Train on RHEL 8.4)
Assignee: Gregory Thiemonge
QA Contact: Omer Schwartz
Docs Contact: Greg Rakauskas
URL:
Whiteboard:
Depends On:
Blocks: 2196735 2196744
Reported: 2023-05-01 23:24 UTC by chrisbro@redhat.com
Modified: 2023-11-08 19:19 UTC (History)

Fixed In Version: openstack-octavia-5.1.3-2.20230616184949.355f6b1.el8ost
Doc Type: Bug Fix
Doc Text:
Before this update, the name of a new networking interface in the amphora instance could conflict with the name of an existing interface. As a result, adding a new member on a new subnet failed. With this update, the Load-balancing service (octavia) now ensures that the names of the networking interfaces are unique.
Clone Of:
Clones: 2196735 2196744
Environment:
Last Closed: 2023-11-08 19:18:36 UTC
Target Upstream Version:
Embargoed:


Attachments
reproducer script (2.32 KB, text/plain)
2023-05-02 06:10 UTC, Gregory Thiemonge


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Launchpad | 2017894 | 0 | None | None | None | 2023-05-02 06:11:12 UTC
OpenStack gerrit | 881719 | 0 | None | MERGED | Avoid interface name collisions in the amphora | 2023-06-19 15:29:17 UTC
OpenStack gerrit | 885727 | 0 | None | MERGED | Avoid interface name collisions in the amphora | 2023-06-19 15:29:18 UTC
Red Hat Issue Tracker | OSP-24670 | 0 | None | None | None | 2023-05-01 23:26:44 UTC
Red Hat Product Errata | RHBA-2023:6307 | 0 | None | None | None | 2023-11-08 19:19:28 UTC

Description chrisbro@redhat.com 2023-05-01 23:24:57 UTC
Description of problem:

After adding/deleting members from different networks, adding a new member may trigger an interface name collision in the amphora.



Adding the new member fails because an interface named eth3 already exists in the amphora-haproxy namespace.

Version-Release number of selected component (if applicable):
This issue occurs in 16.1 and 16.2.


How reproducible:
Always

Steps to reproduce:
- Create a load balancer, a listener, and a pool.
- Add member1 (subnet1/network1) -> eth2 is created in the amphora-haproxy namespace.
- Add member2 (subnet2/network2) -> eth3 is created.
- Delete member1 (subnet1/network1) -> eth2 is deleted.
- Add member3 (subnet3/network3) -> the agent logs "Plugged interface ens8 will become eth3 in the namespace amphora-haproxy", which collides with the existing eth3.
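
The sequence above can be replayed in a small standalone Python sketch. It assumes, based only on the names seen in the agent logs (eth2, eth3, then eth3 again), that the new name is derived from the number of interfaces already present in the namespace. The function names and the collision-free alternative are hypothetical illustrations, not the Octavia code or the merged fix.

```python
import errno


def count_based_name(existing):
    """Assumed buggy scheme: 'eth' + number of interfaces in the namespace."""
    return f"eth{len(existing)}"


def unique_name(existing):
    """Hypothetical alternative: start at the count, skip names already in use."""
    index = len(existing)
    while f"eth{index}" in existing:
        index += 1
    return f"eth{index}"


def replay(pick_name):
    """Replay the reproduction steps against an in-memory set of interface names."""
    # Assumed state after the VIP is plugged: loopback plus eth1.
    namespace = {"lo", "eth1"}
    plugged = {}
    steps = [
        ("add", "member1"),
        ("add", "member2"),
        ("delete", "member1"),
        ("add", "member3"),
    ]
    for action, member in steps:
        if action == "add":
            name = pick_name(namespace)
            if name in namespace:
                # Stand-in for the kernel rejecting a duplicate interface name.
                raise FileExistsError(errno.EEXIST, f"{name} already exists")
            namespace.add(name)
            plugged[member] = name
        else:
            namespace.discard(plugged.pop(member))
    return plugged


if __name__ == "__main__":
    try:
        replay(count_based_name)
    except FileExistsError as exc:
        print("count-based naming collided:", exc)
    print("unique naming:", replay(unique_name))
```

With the count-based scheme, deleting eth2 leaves three interfaces (lo, eth1, eth3), so the next name computed is eth3 again. Any scheme that checks the candidate name against the interfaces actually present avoids the collision.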

Actual results:


Expected results:


Additional info:

Upstream Bug details:
https://bugs.launchpad.net/octavia/+bug/2017894

Comment 1 Gregory Thiemonge 2023-05-02 06:10:05 UTC
Created attachment 1961593 [details]
reproducer script

Comment 2 Gregory Thiemonge 2023-05-02 06:10:49 UTC
Filtered logs of the amphora:

[root@controller-0 heat-admin]# grep -e POST -e NetlinkError -e eth octavia-amphora.log.1 
Apr 27 03:08:15 amphora-4ed50bc1-237c-4014-b831-b419356fe84b gunicorn.gunicorn.error: [1387] POST /1.0/plug/vip/10.0.0.167
Apr 27 03:08:25 amphora-4ed50bc1-237c-4014-b831-b419356fe84b gunicorn.gunicorn.access: [1387] ::ffff:172.24.2.103 - - [27/Apr/2023:03:08:25 -0400] "POST /1.0/plug/vip/10.0.0.167 HTTP/1.1" 202 69 "-" "Octavia HaProxy Rest Client/0.5 (https://wiki.openstack.org/wiki/Octavia)"
Apr 27 03:08:43 amphora-4ed50bc1-237c-4014-b831-b419356fe84b gunicorn.gunicorn.error: [1387] POST /1.0/plug/network
Apr 27 03:08:43 amphora-4ed50bc1-237c-4014-b831-b419356fe84b amphora-agent[1387]: 2023-04-27 03:08:43.062 1387 INFO octavia.amphorae.backends.agent.api_server.plug [-] Plugged interface eth1 will become eth2 in the namespace amphora-haproxy
Apr 27 03:08:47 amphora-4ed50bc1-237c-4014-b831-b419356fe84b gunicorn.gunicorn.access: [1387] ::ffff:172.24.2.103 - - [27/Apr/2023:03:08:47 -0400] "POST /1.0/plug/network HTTP/1.1" 202 54 "-" "Octavia HaProxy Rest Client/0.5 (https://wiki.openstack.org/wiki/Octavia)"
Apr 27 03:08:58 amphora-4ed50bc1-237c-4014-b831-b419356fe84b gunicorn.gunicorn.error: [1387] POST /1.0/plug/network
Apr 27 03:08:58 amphora-4ed50bc1-237c-4014-b831-b419356fe84b amphora-agent[1387]: 2023-04-27 03:08:58.171 1387 INFO octavia.amphorae.backends.agent.api_server.plug [-] Plugged interface eth1 will become eth3 in the namespace amphora-haproxy
Apr 27 03:09:02 amphora-4ed50bc1-237c-4014-b831-b419356fe84b gunicorn.gunicorn.access: [1387] ::ffff:172.24.2.103 - - [27/Apr/2023:03:09:02 -0400] "POST /1.0/plug/network HTTP/1.1" 202 54 "-" "Octavia HaProxy Rest Client/0.5 (https://wiki.openstack.org/wiki/Octavia)"
Apr 27 03:09:17 amphora-4ed50bc1-237c-4014-b831-b419356fe84b gunicorn.gunicorn.error: [1387] POST /1.0/plug/network
Apr 27 03:09:17 amphora-4ed50bc1-237c-4014-b831-b419356fe84b gunicorn.gunicorn.access: [1387] ::ffff:172.24.2.103 - - [27/Apr/2023:03:09:17 -0400] "POST /1.0/plug/network HTTP/1.1" 500 47 "-" "Octavia HaProxy Rest Client/0.5 (https://wiki.openstack.org/wiki/Octavia)"
Apr 27 03:09:17 amphora-4ed50bc1-237c-4014-b831-b419356fe84b amphora-agent[1387]: 2023-04-27 03:09:17.430 1387 INFO octavia.amphorae.backends.agent.api_server.plug [-] Plugged interface eth1 will become eth3 in the namespace amphora-haproxy
Apr 27 03:09:17 amphora-4ed50bc1-237c-4014-b831-b419356fe84b amphora-agent[1387]: 2023-04-27 03:09:17.437 1387 ERROR flask.app [-] Exception on /1.0/plug/network [POST]: pyroute2.netlink.exceptions.NetlinkError: (17, 'File exists')
2023-04-27 03:09:17.437 1387 ERROR flask.app Traceback (most recent call last):
2023-04-27 03:09:17.437 1387 ERROR flask.app   File "/usr/lib/python3.6/site-packages/flask/app.py", line 2292, in wsgi_app
2023-04-27 03:09:17.437 1387 ERROR flask.app     response = self.full_dispatch_request()
2023-04-27 03:09:17.437 1387 ERROR flask.app   File "/usr/lib/python3.6/site-packages/flask/app.py", line 1815, in full_dispatch_request
2023-04-27 03:09:17.437 1387 ERROR flask.app     rv = self.handle_user_exception(e)
2023-04-27 03:09:17.437 1387 ERROR flask.app   File "/usr/lib/python3.6/site-packages/flask/app.py", line 1718, in handle_user_exception
2023-04-27 03:09:17.437 1387 ERROR flask.app     reraise(exc_type, exc_value, tb)
2023-04-27 03:09:17.437 1387 ERROR flask.app   File "/usr/lib/python3.6/site-packages/flask/_compat.py", line 35, in reraise
2023-04-27 03:09:17.437 1387 ERROR flask.app     raise value
2023-04-27 03:09:17.437 1387 ERROR flask.app   File "/usr/lib/python3.6/site-packages/flask/app.py", line 1813, in full_dispatch_request
2023-04-27 03:09:17.437 1387 ERROR flask.app     rv = self.dispatch_request()
2023-04-27 03:09:17.437 1387 ERROR flask.app   File "/usr/lib/python3.6/site-packages/flask/app.py", line 1799, in dispatch_request
2023-04-27 03:09:17.437 1387 ERROR flask.app     return self.view_functions[rule.endpoint](**req.view_args)
2023-04-27 03:09:17.437 1387 ERROR flask.app   File "/usr/lib/python3.6/site-packages/octavia/amphorae/backends/agent/api_server/server.py", line 210, in plug_network
2023-04-27 03:09:17.437 1387 ERROR flask.app     port_info.get('mtu'))
2023-04-27 03:09:17.437 1387 ERROR flask.app   File "/usr/lib/python3.6/site-packages/octavia/amphorae/backends/agent/api_server/plug.py", line 218, in plug_network
2023-04-27 03:09:17.437 1387 ERROR flask.app     IFLA_IFNAME=netns_interface)
2023-04-27 03:09:17.437 1387 ERROR flask.app   File "/usr/lib/python3.6/site-packages/pyroute2/iproute/linux.py", line 1163, in link
2023-04-27 03:09:17.437 1387 ERROR flask.app     msg_flags=msg_flags)
2023-04-27 03:09:17.437 1387 ERROR flask.app   File "/usr/lib/python3.6/site-packages/pyroute2/netlink/nlsocket.py", line 373, in nlm_request
2023-04-27 03:09:17.437 1387 ERROR flask.app     return tuple(self._genlm_request(*argv, **kwarg))
2023-04-27 03:09:17.437 1387 ERROR flask.app   File "/usr/lib/python3.6/site-packages/pyroute2/netlink/nlsocket.py", line 864, in nlm_request
2023-04-27 03:09:17.437 1387 ERROR flask.app     callback=callback):
2023-04-27 03:09:17.437 1387 ERROR flask.app   File "/usr/lib/python3.6/site-packages/pyroute2/netlink/nlsocket.py", line 376, in get
2023-04-27 03:09:17.437 1387 ERROR flask.app     return tuple(self._genlm_get(*argv, **kwarg))
2023-04-27 03:09:17.437 1387 ERROR flask.app   File "/usr/lib/python3.6/site-packages/pyroute2/netlink/nlsocket.py", line 701, in get
2023-04-27 03:09:17.437 1387 ERROR flask.app     raise msg['header']['error']
2023-04-27 03:09:17.437 1387 ERROR flask.app pyroute2.netlink.exceptions.NetlinkError: (17, 'File exists')
2023-04-27 03:09:17.437 1387 ERROR flask.app

Comment 18 errata-xmlrpc 2023-11-08 19:18:36 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 16.2.6 (Train) bug fix and enhancement advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:6307

