Bug 2196744

Summary: [Octavia] Interface name collision in the amphora
Product: Red Hat OpenStack Reporter: Gregory Thiemonge <gthiemon>
Component: openstack-octaviaAssignee: Gregory Thiemonge <gthiemon>
Status: POST --- QA Contact: Bruna Bonguardo <bbonguar>
Severity: medium Docs Contact: Greg Rakauskas <gregraka>
Priority: medium    
Version: 17.1 (Wallaby)CC: bbonguar, chrisbro, gregraka, gthiemon, pgrist, tweining
Target Milestone: z1Keywords: Triaged
Target Release: 17.1Flags: ifrangs: needinfo? (gthiemon)
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 2192413 Environment:
Last Closed: Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 2192413    
Bug Blocks: 2196735    

Description Gregory Thiemonge 2023-05-10 06:12:34 UTC
+++ This bug was initially created as a clone of Bug #2192413 +++

Description of problem:

After adding/deleting members from different networks, adding a new member may trigger an interface name collision in the amphora.



it fails because eth3 already exists

Version-Release number of selected component (if applicable):
This issue happens in 16.1 and 16.2 


How reproducible:
Always

Steps to reproduce:
- create a LB, a listener and a pool
- add member1 (subnet1/network1) -> eth2 is created in the ns
- add member2 (subnet2/network2) -> eth3 is created
- delete member1 (subnet1/network1) -> eth2 is deleted
- add member3 (subnet3/network3) -> Plugged interface ens8 will become eth3 in the namespace amphora-haproxy

Actual results:


Expected results:


Additional info:

Upstream Bug details:
https://bugs.launchpad.net/octavia/+bug/2017894


--- Additional comment from Gregory Thiemonge on 2023-05-02 06:10:49 UTC ---

Filtered logs of the amphora:

[root@controller-0 heat-admin]# grep -e POST -e NetlinkError -e eth octavia-amphora.log.1 
Apr 27 03:08:15 amphora-4ed50bc1-237c-4014-b831-b419356fe84b gunicorn.gunicorn.error: [1387] POST /1.0/plug/vip/10.0.0.167
Apr 27 03:08:25 amphora-4ed50bc1-237c-4014-b831-b419356fe84b gunicorn.gunicorn.access: [1387] ::ffff:172.24.2.103 - - [27/Apr/2023:03:08:25 -0400] "POST /1.0/plug/vip/10.0.0.167 HTTP/1.1" 202 69 "-" "Octavia HaProxy Rest Client/0.5 (https://wiki.openstack.org/wiki/Octavia)"
Apr 27 03:08:43 amphora-4ed50bc1-237c-4014-b831-b419356fe84b gunicorn.gunicorn.error: [1387] POST /1.0/plug/network
Apr 27 03:08:43 amphora-4ed50bc1-237c-4014-b831-b419356fe84b amphora-agent[1387]: 2023-04-27 03:08:43.062 1387 INFO octavia.amphorae.backends.agent.api_server.plug [-] Plugged interface eth1 will become eth2 in the namespace amphora-haproxy
Apr 27 03:08:47 amphora-4ed50bc1-237c-4014-b831-b419356fe84b gunicorn.gunicorn.access: [1387] ::ffff:172.24.2.103 - - [27/Apr/2023:03:08:47 -0400] "POST /1.0/plug/network HTTP/1.1" 202 54 "-" "Octavia HaProxy Rest Client/0.5 (https://wiki.openstack.org/wiki/Octavia)"
Apr 27 03:08:58 amphora-4ed50bc1-237c-4014-b831-b419356fe84b gunicorn.gunicorn.error: [1387] POST /1.0/plug/network
Apr 27 03:08:58 amphora-4ed50bc1-237c-4014-b831-b419356fe84b amphora-agent[1387]: 2023-04-27 03:08:58.171 1387 INFO octavia.amphorae.backends.agent.api_server.plug [-] Plugged interface eth1 will become eth3 in the namespace amphora-haproxy
Apr 27 03:09:02 amphora-4ed50bc1-237c-4014-b831-b419356fe84b gunicorn.gunicorn.access: [1387] ::ffff:172.24.2.103 - - [27/Apr/2023:03:09:02 -0400] "POST /1.0/plug/network HTTP/1.1" 202 54 "-" "Octavia HaProxy Rest Client/0.5 (https://wiki.openstack.org/wiki/Octavia)"
Apr 27 03:09:17 amphora-4ed50bc1-237c-4014-b831-b419356fe84b gunicorn.gunicorn.error: [1387] POST /1.0/plug/network
Apr 27 03:09:17 amphora-4ed50bc1-237c-4014-b831-b419356fe84b gunicorn.gunicorn.access: [1387] ::ffff:172.24.2.103 - - [27/Apr/2023:03:09:17 -0400] "POST /1.0/plug/network HTTP/1.1" 500 47 "-" "Octavia HaProxy Rest Client/0.5 (https://wiki.openstack.org/wiki/Octavia)"
Apr 27 03:09:17 amphora-4ed50bc1-237c-4014-b831-b419356fe84b amphora-agent[1387]: 2023-04-27 03:09:17.430 1387 INFO octavia.amphorae.backends.agent.api_server.plug [-] Plugged interface eth1 will become eth3 in the namespace amphora-haproxy
Apr 27 03:09:17 amphora-4ed50bc1-237c-4014-b831-b419356fe84b amphora-agent[1387]: 2023-04-27 03:09:17.437 1387 ERROR flask.app [-] Exception on /1.0/plug/network [POST]: pyroute2.netlink.exceptions.NetlinkError: (17, 'File exists')#0122023-04-27 03:09:17.437 1387 ERROR flask.app Traceback (most recent call last):#0122023-04-27 03:09:17.437 1387 ERROR flask.app   File "/usr/lib/python3.6/site-packages/flask/app.py", line 2292, in wsgi_app#0122023-04-27 03:09:17.437 1387 ERROR flask.app     response = self.full_dispatch_request()#0122023-04-27 03:09:17.437 1387 ERROR flask.app   File "/usr/lib/python3.6/site-packages/flask/app.py", line 1815, in full_dispatch_request#0122023-04-27 03:09:17.437 1387 ERROR flask.app     rv = self.handle_user_exception(e)#0122023-04-27 03:09:17.437 1387 ERROR flask.app   File "/usr/lib/python3.6/site-packages/flask/app.py", line 1718, in handle_user_exception#0122023-04-27 03:09:17.437 1387 ERROR flask.app     reraise(exc_type, exc_value, tb)#0122023-04-27 03:09:17.437 1387 ERROR flask.app   File "/usr/lib/python3.6/site-packages/flask/_compat.py", line 35, in reraise#0122023-04-27 03:09:17.437 1387 ERROR flask.app     raise value#0122023-04-27 03:09:17.437 1387 ERROR flask.app   File "/usr/lib/python3.6/site-packages/flask/app.py", line 1813, in full_dispatch_request#0122023-04-27 03:09:17.437 1387 ERROR flask.app     rv = self.dispatch_request()#0122023-04-27 03:09:17.437 1387 ERROR flask.app   File "/usr/lib/python3.6/site-packages/flask/app.py", line 1799, in dispatch_request#0122023-04-27 03:09:17.437 1387 ERROR flask.app     return self.view_functions[rule.endpoint](**req.view_args)#0122023-04-27 03:09:17.437 1387 ERROR flask.app   File "/usr/lib/python3.6/site-packages/octavia/amphorae/backends/agent/api_server/server.py", line 210, in plug_network#0122023-04-27 03:09:17.437 1387 ERROR flask.app     port_info.get('mtu'))#0122023-04-27 03:09:17.437 1387 ERROR flask.app   File "/usr/lib/python3.6/site-packages/octavia/amphorae/backends/agent/api_server/plug.py", line 218, in plug_network#0122023-04-27 03:09:17.437 1387 ERROR flask.app     IFLA_IFNAME=netns_interface)#0122023-04-27 03:09:17.437 1387 ERROR flask.app   File "/usr/lib/python3.6/site-packages/pyroute2/iproute/linux.py", line 1163, in link#0122023-04-27 03:09:17.437 1387 ERROR flask.app     msg_flags=msg_flags)#0122023-04-27 03:09:17.437 1387 ERROR flask.app   File "/usr/lib/python3.6/site-packages/pyroute2/netlink/nlsocket.py", line 373, in nlm_request#0122023-04-27 03:09:17.437 1387 ERROR flask.app     return tuple(self._genlm_request(*argv, **kwarg))#0122023-04-27 03:09:17.437 1387 ERROR flask.app   File "/usr/lib/python3.6/site-packages/pyroute2/netlink/nlsocket.py", line 864, in nlm_request#0122023-04-27 03:09:17.437 1387 ERROR flask.app     callback=callback):#0122023-04-27 03:09:17.437 1387 ERROR flask.app   File "/usr/lib/python3.6/site-packages/pyroute2/netlink/nlsocket.py", line 376, in get#0122023-04-27 03:09:17.437 1387 ERROR flask.app     return tuple(self._genlm_get(*argv, **kwarg))#0122023-04-27 03:09:17.437 1387 ERROR flask.app   File "/usr/lib/python3.6/site-packages/pyroute2/netlink/nlsocket.py", line 701, in get#0122023-04-27 03:09:17.437 1387 ERROR flask.app     raise msg['header']['error']#0122023-04-27 03:09:17.437 1387 ERROR flask.app pyroute2.netlink.exceptions.NetlinkError: (17, 'File exists')#0122023-04-27 03:09:17.437 1387 ERROR flask.app

--- Additional comment from Gregory Thiemonge on 2023-05-02 06:53:04 UTC ---

(In reply to chrisbro from comment #0)

> - delete member1 (subnet1/network1) -> eth2 is deleted

to be precise:
- in >=17.0, eth2 is removed when the member is deleted
- in 16.2, eth2 is not deleted at this point, but it is removed when the next member is added.

that said, this bug affects 16.x and 17.x