Bug 1886622

Summary: Floating IP assigned to an instance is not accessible in a scaled-up Multistack env with spine & leaf topology
Product: Red Hat OpenStack Reporter: Marian Krcmarik <mkrcmari>
Component: openstack-neutron Assignee: Slawek Kaplonski <skaplons>
Status: CLOSED ERRATA QA Contact: Marian Krcmarik <mkrcmari>
Severity: high Docs Contact:
Priority: high    
Version: 16.1 (Train) CC: amuller, apevec, bcafarel, bdobreli, bhaley, ccamposr, chrisw, hjensas, owalsh, ralonsoh, rhos-maint, scohen, skaplons
Target Milestone: z3 Keywords: Triaged
Target Release: 16.1 (Train on RHEL 8.2)   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: openstack-neutron-15.2.1-1.20210329123525.el8ost Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-05-26 13:49:37 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
controller0-2 logs none
controller0-1 logs none
controller0-0 logs none
compute3-0 logs none

Description Marian Krcmarik 2020-10-08 22:46:25 UTC
Description of problem:
(Reporting under openvswitch because the only visible ERROR I can see is a failed port binding.)
The deployment is a Multistack deployment with one Central site, which hosts the control plane, and two DCN sites (with compute nodes only), based on a spine & leaf network topology and the OVS network backend. The deployment was scaled up with one new stack on a new leaf: the undercloud was first updated with a new leaf for the new stack; otherwise, the new stack has an identical deployment command line and TripleO templates to the other DCN stacks (differing only in the leaf and its network ranges). This procedure used to work without any of the problems described below.
The problem is that once an instance is spawned on the newly scaled-up stack, ping/SSH into the instance does not work (the instance has a FIP associated from the tenant network), and the port which holds the FIP does not appear as ACTIVE. This does not happen with the previously deployed DCN sites (stacks).
Whenever the instance is created, I can see the following backtrace in the log:
2020-10-08 21:08:54.862 33 DEBUG neutron.plugins.ml2.managers [req-bd9f0aee-1da2-409b-808c-5882337570ab - - - - -] DB exception raised by Mechanism driver 'openvswitch' in update_port_precommit _call_on_drivers /usr/lib/python3.6/site-packages/neutron/plugins/ml2/managers.py:484
2020-10-08 21:08:54.862 33 ERROR neutron.plugins.ml2.managers Traceback (most recent call last):
2020-10-08 21:08:54.862 33 ERROR neutron.plugins.ml2.managers   File "/usr/lib/python3.6/site-packages/neutron/plugins/ml2/managers.py", line 477, in _call_on_drivers
2020-10-08 21:08:54.862 33 ERROR neutron.plugins.ml2.managers     getattr(driver.obj, method_name)(context)
2020-10-08 21:08:54.862 33 ERROR neutron.plugins.ml2.managers   File "/usr/lib/python3.6/site-packages/neutron/plugins/ml2/drivers/mech_agent.py", line 66, in update_port_precommit
2020-10-08 21:08:54.862 33 ERROR neutron.plugins.ml2.managers     self._insert_provisioning_block(context)
2020-10-08 21:08:54.862 33 ERROR neutron.plugins.ml2.managers   File "/usr/lib/python3.6/site-packages/neutron/plugins/ml2/drivers/mech_agent.py", line 81, in _insert_provisioning_block
2020-10-08 21:08:54.862 33 ERROR neutron.plugins.ml2.managers     if context.host_agents(self.agent_type):
2020-10-08 21:08:54.862 33 ERROR neutron.plugins.ml2.managers   File "/usr/lib/python3.6/site-packages/neutron/plugins/ml2/driver_context.py", line 297, in host_agents
2020-10-08 21:08:54.862 33 ERROR neutron.plugins.ml2.managers     'host': [self._binding.host]})
2020-10-08 21:08:54.862 33 ERROR neutron.plugins.ml2.managers   File "/usr/lib/python3.6/site-packages/neutron_lib/db/api.py", line 233, in wrapped
2020-10-08 21:08:54.862 33 ERROR neutron.plugins.ml2.managers     return method(*args, **kwargs)
2020-10-08 21:08:54.862 33 ERROR neutron.plugins.ml2.managers   File "/usr/lib/python3.6/site-packages/neutron/db/agents_db.py", line 301, in get_agents
2020-10-08 21:08:54.862 33 ERROR neutron.plugins.ml2.managers     agents = agent_obj.Agent.get_objects(context, **filters)
2020-10-08 21:08:54.862 33 ERROR neutron.plugins.ml2.managers   File "/usr/lib/python3.6/site-packages/neutron/objects/base.py", line 640, in get_objects
2020-10-08 21:08:54.862 33 ERROR neutron.plugins.ml2.managers     cls, context, _pager=_pager, **cls.modify_fields_to_db(kwargs))
2020-10-08 21:08:54.862 33 ERROR neutron.plugins.ml2.managers   File "/usr/lib/python3.6/site-packages/neutron/objects/db/api.py", line 52, in get_objects
2020-10-08 21:08:54.862 33 ERROR neutron.plugins.ml2.managers     **(_pager.to_kwargs(context, obj_cls) if _pager else {}))
2020-10-08 21:08:54.862 33 ERROR neutron.plugins.ml2.managers   File "/usr/lib/python3.6/site-packages/neutron_lib/db/model_query.py", line 311, in get_collection
2020-10-08 21:08:54.862 33 ERROR neutron.plugins.ml2.managers     for c in query
2020-10-08 21:08:54.862 33 ERROR neutron.plugins.ml2.managers   File "/usr/lib64/python3.6/site-packages/sqlalchemy/orm/query.py", line 3316, in __iter__
2020-10-08 21:08:54.862 33 ERROR neutron.plugins.ml2.managers     self.session._autoflush()
2020-10-08 21:08:54.862 33 ERROR neutron.plugins.ml2.managers   File "/usr/lib64/python3.6/site-packages/sqlalchemy/orm/session.py", line 1576, in _autoflush
2020-10-08 21:08:54.862 33 ERROR neutron.plugins.ml2.managers     self.flush()
2020-10-08 21:08:54.862 33 ERROR neutron.plugins.ml2.managers   File "/usr/lib64/python3.6/site-packages/sqlalchemy/orm/session.py", line 2451, in flush
2020-10-08 21:08:54.862 33 ERROR neutron.plugins.ml2.managers     self._flush(objects)
2020-10-08 21:08:54.862 33 ERROR neutron.plugins.ml2.managers   File "/usr/lib64/python3.6/site-packages/sqlalchemy/orm/session.py", line 2589, in _flush
2020-10-08 21:08:54.862 33 ERROR neutron.plugins.ml2.managers     transaction.rollback(_capture_exception=True)
2020-10-08 21:08:54.862 33 ERROR neutron.plugins.ml2.managers   File "/usr/lib64/python3.6/site-packages/sqlalchemy/util/langhelpers.py", line 68, in __exit__
2020-10-08 21:08:54.862 33 ERROR neutron.plugins.ml2.managers     compat.reraise(exc_type, exc_value, exc_tb)
2020-10-08 21:08:54.862 33 ERROR neutron.plugins.ml2.managers   File "/usr/lib64/python3.6/site-packages/sqlalchemy/util/compat.py", line 129, in reraise
2020-10-08 21:08:54.862 33 ERROR neutron.plugins.ml2.managers     raise value
2020-10-08 21:08:54.862 33 ERROR neutron.plugins.ml2.managers   File "/usr/lib64/python3.6/site-packages/sqlalchemy/orm/session.py", line 2549, in _flush
2020-10-08 21:08:54.862 33 ERROR neutron.plugins.ml2.managers     flush_context.execute()
2020-10-08 21:08:54.862 33 ERROR neutron.plugins.ml2.managers   File "/usr/lib64/python3.6/site-packages/sqlalchemy/orm/unitofwork.py", line 422, in execute
2020-10-08 21:08:54.862 33 ERROR neutron.plugins.ml2.managers     rec.execute(self)
2020-10-08 21:08:54.862 33 ERROR neutron.plugins.ml2.managers   File "/usr/lib64/python3.6/site-packages/sqlalchemy/orm/unitofwork.py", line 589, in execute
2020-10-08 21:08:54.862 33 ERROR neutron.plugins.ml2.managers     uow,
2020-10-08 21:08:54.862 33 ERROR neutron.plugins.ml2.managers   File "/usr/lib64/python3.6/site-packages/sqlalchemy/orm/persistence.py", line 236, in save_obj
2020-10-08 21:08:54.862 33 ERROR neutron.plugins.ml2.managers     update,
2020-10-08 21:08:54.862 33 ERROR neutron.plugins.ml2.managers   File "/usr/lib64/python3.6/site-packages/sqlalchemy/orm/persistence.py", line 1011, in _emit_update_statements
2020-10-08 21:08:54.862 33 ERROR neutron.plugins.ml2.managers     % (table.description, len(records), rows)
2020-10-08 21:08:54.862 33 ERROR neutron.plugins.ml2.managers sqlalchemy.orm.exc.StaleDataError: UPDATE statement on table 'ml2_port_bindings' expected to update 1 row(s); 0 were matched.
2020-10-08 21:08:54.862 33 ERROR neutron.plugins.ml2.managers 
2020-10-08 21:08:54.864 33 DEBUG neutron_lib.db.api [req-bd9f0aee-1da2-409b-808c-5882337570ab - - - - -] Retry wrapper got retriable exception: UPDATE statement on table 'ml2_port_bindings' expected to update 1 row(s); 0 were matched. wrapped /usr/lib/python3.6/site-packages/neutron_lib/db/api.py:183
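
For readers unfamiliar with the error text in the traceback above: SQLAlchemy raises StaleDataError when an UPDATE keyed on a row it previously loaded matches a different number of rows than expected. The sketch below reproduces that mechanism with the standard-library sqlite3 module only. It is an illustration of what the message means, not Neutron code and not a diagnosis of this bug. The two-column key, the trimmed table schema, the `StaleRowError` class, and the `port-1` / `compute3-0` values are all assumptions made for the example.

```python
import sqlite3


class StaleRowError(Exception):
    """Hypothetical stand-in for sqlalchemy.orm.exc.StaleDataError."""


def update_binding(conn, port_id, host, vif_type):
    """UPDATE one binding row by key and verify that exactly one row matched."""
    cur = conn.execute(
        "UPDATE ml2_port_bindings SET vif_type = ? WHERE port_id = ? AND host = ?",
        (vif_type, port_id, host),
    )
    if cur.rowcount != 1:
        raise StaleRowError(
            "UPDATE statement on table 'ml2_port_bindings' expected to "
            f"update 1 row(s); {cur.rowcount} were matched."
        )


def demo():
    """Return the error text produced when the row vanishes before the UPDATE."""
    conn = sqlite3.connect(":memory:")
    # Simplified, assumed schema; the real table has more columns.
    conn.execute(
        "CREATE TABLE ml2_port_bindings ("
        "port_id TEXT, host TEXT, vif_type TEXT, PRIMARY KEY (port_id, host))"
    )
    conn.execute(
        "INSERT INTO ml2_port_bindings VALUES (?, ?, ?)",
        ("port-1", "compute3-0", "unbound"),
    )
    # The row still exists, so this UPDATE matches exactly one row.
    update_binding(conn, "port-1", "compute3-0", "ovs")
    # Simulate a concurrent writer removing the row in the meantime.
    conn.execute("DELETE FROM ml2_port_bindings WHERE port_id = ?", ("port-1",))
    try:
        update_binding(conn, "port-1", "compute3-0", "ovs")
    except StaleRowError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(demo())
```

The second call fails because the WHERE clause no longer matches any row, which is the same condition the "0 were matched" text reports.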

I will upload more logs (from the time of spawning this instance), but in short, my instance is:
| b9b56efe-a581-45e9-958c-d50d6dae5c7f | tempest-ServersTestManualDisk-server-1734458280 | f312bf1e3d6442e0b62842d344a62b75 | ACTIVE | -          | Running     | tempest-ServersTestManualDisk-1041035872-network=10.100.0.9, 10.0.10.209 |
So the FIP is 10.0.10.209

The list of ports:
(central) [stack@site-undercloud-0 ~]$ openstack port list
+--------------------------------------+-------------------------------------------------+-------------------+--------------------------------------------------------------------------------+--------+
| ID                                   | Name                                            | MAC Address       | Fixed IP Addresses                                                             | Status |
+--------------------------------------+-------------------------------------------------+-------------------+--------------------------------------------------------------------------------+--------+
| 239f3f09-be8f-44f3-a42c-c1445bc532ae |                                                 | fa:16:3e:44:25:c4 | ip_address='10.0.10.221', subnet_id='58885031-cc28-4d13-9d6e-520d0d2dc422'     | ACTIVE |
| 2c50139d-e69f-47dc-91ef-236412a0913e |                                                 | fa:16:3e:72:2c:85 | ip_address='10.100.0.3', subnet_id='c2fdfff5-a220-409f-8b26-f7ba362794ad'      | ACTIVE |
| 454f8d83-f927-4528-b170-ffa2d6aa8e43 |                                                 | fa:16:3e:28:dd:72 | ip_address='10.100.0.1', subnet_id='c2fdfff5-a220-409f-8b26-f7ba362794ad'      | ACTIVE |
| 4d0a6db3-4d03-4f9e-95cc-2b5c008bd79c | HA port tenant f312bf1e3d6442e0b62842d344a62b75 | fa:16:3e:f0:15:37 | ip_address='169.254.193.59', subnet_id='2e54ed2e-bb90-4683-ba5a-00b8256ba428'  | ACTIVE |
| 512dac3c-529d-47c4-9778-709835a74902 |                                                 | fa:16:3e:b8:e6:a1 | ip_address='192.168.199.2', subnet_id='906b95fc-3752-4da5-99fb-cd0b8cad83c5'   | ACTIVE |
| 5f99d520-f882-4e56-af19-6b2943bbc436 |                                                 | fa:16:3e:ea:d7:f3 | ip_address='10.100.0.2', subnet_id='c2fdfff5-a220-409f-8b26-f7ba362794ad'      | ACTIVE |
| 736ae698-904f-46b1-9ca4-bb462fe6a9d9 | HA port tenant f312bf1e3d6442e0b62842d344a62b75 | fa:16:3e:d2:f2:92 | ip_address='169.254.192.135', subnet_id='2e54ed2e-bb90-4683-ba5a-00b8256ba428' | ACTIVE |
| 80e7a08c-0260-423a-8f92-aeb9e1bc071b | HA port tenant f312bf1e3d6442e0b62842d344a62b75 | fa:16:3e:fe:99:b2 | ip_address='169.254.192.169', subnet_id='2e54ed2e-bb90-4683-ba5a-00b8256ba428' | ACTIVE |
| b1125019-2821-4343-aa33-7975fbc5fca0 |                                                 | fa:16:3e:28:70:28 | ip_address='10.0.10.209', subnet_id='58885031-cc28-4d13-9d6e-520d0d2dc422'     | N/A    |
| d7caaf5b-fb05-43f2-86f0-e65280ae6fc0 |                                                 | fa:16:3e:04:e0:f4 | ip_address='192.168.199.3', subnet_id='906b95fc-3752-4da5-99fb-cd0b8cad83c5'   | ACTIVE |
| e910bd67-6655-46c0-a069-be9f0aff77b5 |                                                 | fa:16:3e:9c:6d:6b | ip_address='10.100.0.9', subnet_id='c2fdfff5-a220-409f-8b26-f7ba362794ad'      | ACTIVE |
+--------------------------------------+-------------------------------------------------+-------------------+--------------------------------------------------------------------------------+--------+
The port holding the FIP is in N/A Status
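
When the port list is long, the same check can be scripted. The following is a minimal sketch, assuming the table text was captured from `openstack port list` in the default table format shown above; the helper names `parse_cli_table` and `ports_not_active` are made up for this example, and the sample contains only two of the rows from the listing.

```python
def parse_cli_table(text):
    """Parse an ASCII table as printed by the CLI into a list of dicts."""
    rows = []
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip +----+ border lines, prompts and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # the first data-like line holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def ports_not_active(text):
    """Return (ID, Status) for every port whose Status is not ACTIVE."""
    return [
        (row["ID"], row["Status"])
        for row in parse_cli_table(text)
        if row["Status"] != "ACTIVE"
    ]


SAMPLE = """
+----+------+-------------+--------------------+--------+
| ID | Name | MAC Address | Fixed IP Addresses | Status |
+----+------+-------------+--------------------+--------+
| 239f3f09-be8f-44f3-a42c-c1445bc532ae | | fa:16:3e:44:25:c4 | ip_address='10.0.10.221', subnet_id='58885031-cc28-4d13-9d6e-520d0d2dc422' | ACTIVE |
| b1125019-2821-4343-aa33-7975fbc5fca0 | | fa:16:3e:28:70:28 | ip_address='10.0.10.209', subnet_id='58885031-cc28-4d13-9d6e-520d0d2dc422' | N/A |
+----+------+-------------+--------------------+--------+
"""

if __name__ == "__main__":
    for port_id, status in ports_not_active(SAMPLE):
        print(port_id, status)
```

The parser splits on the `|` column separator, so it would need adjusting if a cell value ever contained that character.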

More detail about the port:
+-------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------+
| Field                   | Value                                                                                                                                            |
+-------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------+
| admin_state_up          | UP                                                                                                                                               |
| allowed_address_pairs   |                                                                                                                                                  |
| binding_host_id         |                                                                                                                                                  |
| binding_profile         |                                                                                                                                                  |
| binding_vif_details     |                                                                                                                                                  |
| binding_vif_type        | unbound                                                                                                                                          |
| binding_vnic_type       | normal                                                                                                                                           |
| created_at              | 2020-10-08T21:09:01Z                                                                                                                             |
| data_plane_status       | None                                                                                                                                             |
| description             |                                                                                                                                                  |
| device_id               | 93275416-a311-4af7-b569-1863616195b7                                                                                                             |
| device_owner            | network:floatingip                                                                                                                               |
| dns_assignment          | None                                                                                                                                             |
| dns_domain              | None                                                                                                                                             |
| dns_name                | None                                                                                                                                             |
| extra_dhcp_opts         |                                                                                                                                                  |
| fixed_ips               | ip_address='10.0.10.209', subnet_id='58885031-cc28-4d13-9d6e-520d0d2dc422'                                                                       |
| id                      | b1125019-2821-4343-aa33-7975fbc5fca0                                                                                                             |
| location                | cloud='', project.domain_id=, project.domain_name=, project.id='f312bf1e3d6442e0b62842d344a62b75', project.name=, region_name='regionOne', zone= |
| mac_address             | fa:16:3e:28:70:28                                                                                                                                |
| name                    |                                                                                                                                                  |
| network_id              | 782f4329-f9b9-44a7-9746-fd41c6e0ddbd                                                                                                             |
| port_security_enabled   | False                                                                                                                                            |
| project_id              | f312bf1e3d6442e0b62842d344a62b75                                                                                                                 |
| propagate_uplink_status | None                                                                                                                                             |
| qos_policy_id           | None                                                                                                                                             |
| resource_request        | None                                                                                                                                             |
| revision_number         | 2                                                                                                                                                |
| security_group_ids      |                                                                                                                                                  |
| status                  | N/A                                                                                                                                              |
| tags                    |                                                                                                                                                  |
| trunk_details           | None                                                                                                                                             |
| updated_at              | 2020-10-08T21:09:02Z                                                                                                                             |
+-------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------+

(central) [stack@site-undercloud-0 ~]$ openstack router list
+--------------------------------------+-------------------------------------------------+--------+-------+----------------------------------+------+
| ID                                   | Name                                            | Status | State | Project                          | HA   |
+--------------------------------------+-------------------------------------------------+--------+-------+----------------------------------+------+
| 5d59ca17-df71-472d-b525-1a304d640787 | tempest-ServersTestManualDisk-1041035872-router | ACTIVE | UP    | f312bf1e3d6442e0b62842d344a62b75 | True |
+--------------------------------------+-------------------------------------------------+--------+-------+----------------------------------+------+

(central) [stack@site-undercloud-0 ~]$ openstack port list --router 5d59ca17-df71-472d-b525-1a304d640787
+--------------------------------------+-------------------------------------------------+-------------------+--------------------------------------------------------------------------------+--------+
| ID                                   | Name                                            | MAC Address       | Fixed IP Addresses                                                             | Status |
+--------------------------------------+-------------------------------------------------+-------------------+--------------------------------------------------------------------------------+--------+
| 239f3f09-be8f-44f3-a42c-c1445bc532ae |                                                 | fa:16:3e:44:25:c4 | ip_address='10.0.10.221', subnet_id='58885031-cc28-4d13-9d6e-520d0d2dc422'     | ACTIVE |
| 454f8d83-f927-4528-b170-ffa2d6aa8e43 |                                                 | fa:16:3e:28:dd:72 | ip_address='10.100.0.1', subnet_id='c2fdfff5-a220-409f-8b26-f7ba362794ad'      | ACTIVE |
| 4d0a6db3-4d03-4f9e-95cc-2b5c008bd79c | HA port tenant f312bf1e3d6442e0b62842d344a62b75 | fa:16:3e:f0:15:37 | ip_address='169.254.193.59', subnet_id='2e54ed2e-bb90-4683-ba5a-00b8256ba428'  | ACTIVE |
| 736ae698-904f-46b1-9ca4-bb462fe6a9d9 | HA port tenant f312bf1e3d6442e0b62842d344a62b75 | fa:16:3e:d2:f2:92 | ip_address='169.254.192.135', subnet_id='2e54ed2e-bb90-4683-ba5a-00b8256ba428' | ACTIVE |
| 80e7a08c-0260-423a-8f92-aeb9e1bc071b | HA port tenant f312bf1e3d6442e0b62842d344a62b75 | fa:16:3e:fe:99:b2 | ip_address='169.254.192.169', subnet_id='2e54ed2e-bb90-4683-ba5a-00b8256ba428' | ACTIVE |
+--------------------------------------+-------------------------------------------------+-------------------+--------------------------------------------------------------------------------+--------+

I can provide the deployment for debugging if needed. The problem appeared in tempest tests which previously did not fail this way and, to the best of my knowledge, nothing has changed in the configuration.

Version-Release number of selected component (if applicable):
Red Hat OpenStack Platform release 16.1.2 GA (Train)
RHOS-16.1-RHEL-8-20200930.n.0

How reproducible:
Always with specific tempest tests

The tempest test I used for debugging is:
tempest.api.compute.servers.test_create_server.ServersTestManualDisk.test_host_name_is_same_as_server_name

Additional info:
I am not sure exactly what information to provide since the deployment may be complex, but I can provide any requested info regarding the env and provide the env for debugging.

Comment 1 Marian Krcmarik 2020-10-08 22:47:20 UTC
Created attachment 1720077 [details]
controller0-2 logs

Comment 2 Marian Krcmarik 2020-10-08 22:48:18 UTC
Created attachment 1720078 [details]
controller0-1 logs

Comment 3 Marian Krcmarik 2020-10-08 22:48:52 UTC
Created attachment 1720079 [details]
controller0-0 logs

Comment 4 Marian Krcmarik 2020-10-08 22:49:25 UTC
Created attachment 1720081 [details]
compute3-0 logs

Comment 5 Brian Haley 2020-10-09 14:38:30 UTC
Moved to neutron component and vNES squad.

Comment 31 errata-xmlrpc 2021-05-26 13:49:37 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 16.1.6 bug fix and enhancement advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:2097