Bug 2214259 - VM with trunk port becomes inaccessible after live VM migration performed after Neutron backend OVS-to-OVN migration
Summary: VM with trunk port becomes inaccessible after live VM migration performed after Neutron backend OVS-to-OVN migration
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-neutron
Version: 17.1 (Wallaby)
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: high
Target Milestone: ga
Target Release: 17.1
Assignee: Jakub Libosvar
QA Contact: Roman Safronov
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2023-06-12 11:43 UTC by Roman Safronov
Modified: 2023-08-16 01:16 UTC
CC List: 10 users

Fixed In Version: openstack-neutron-18.6.1-1.20230518200969.el9ost
Doc Type: Bug Fix
Doc Text:
Before this update, in an environment that had been migrated from the OVS mechanism driver to the OVN mechanism driver, an instance with a trunk port could become inaccessible after an operation such as a live migration. Now, you can live migrate, shut down, or reboot instances with a trunk port without issues after migration to the OVN mechanism driver.
Clone Of:
Environment:
Last Closed: 2023-08-16 01:15:43 UTC
Target Upstream Version:
Embargoed:


Attachments: (none)


Links
Red Hat Issue Tracker OSP-25744 (last updated 2023-06-12 11:45:37 UTC)
Red Hat Product Errata RHEA-2023:4577 (last updated 2023-08-16 01:16:19 UTC)

Description Roman Safronov 2023-06-12 11:43:51 UTC
Description of problem:
VM becomes inaccessible after a live VM migration performed after a Neutron backend OVS-to-OVN migration.


From the downstream (d/s) CI job connectivity summary file created during validation of workload operations:

-----------------------------------------------------------------------------------------------
| Status of server 69ea0073-f6b0-4c06-a7ad-1d278d269b68 connectivity before VM migration test |
-----------------------------------------------------------------------------------------------
Connectivity checks from external network
INFO: ping to 10.46.21.239 = passed

INFO: ssh to 10.46.21.239 = passed

Connectivity checks from internal network
INFO: ping 192.168.200.25 via 10.46.21.239 = passed

-------------------------------------------------------
| Overall connectivity status after VM migration test |
-------------------------------------------------------
Connectivity checks from external network
INFO: ping to 10.46.21.245 = passed
INFO: ping to 10.46.21.228 = passed
INFO: ping to 10.46.21.239 = failed
INFO: ping to 10.46.21.251 = passed

INFO: ssh to 10.46.21.245 = passed
INFO: ssh to 10.46.21.228 = passed
INFO: ssh to 10.46.21.239 = failed
INFO: ssh to 10.46.21.251 = passed

Connectivity checks from internal network
INFO: ping 192.168.200.161 via 10.46.21.245 = failed

IP: 10.46.21.239. Failed to retrieve ip address of peer subport

----------------------------------------------------------------------------------------------
| Status of server 69ea0073-f6b0-4c06-a7ad-1d278d269b68 connectivity after VM migration test |
----------------------------------------------------------------------------------------------
Connectivity checks from external network
INFO: ping to 10.46.21.239 = failed

INFO: ssh to 10.46.21.239 = failed
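
The checks above amount to pinging the provider-network address directly and then pinging the peer subport address from inside the guest over SSH. A hedged sketch of the idea, not the actual CI code (the cloud-user login name is an assumption; the addresses are the ones from the log):

import subprocess

def ping(host, count=3):
    """Ping a host from the external network."""
    return subprocess.run(['ping', '-c', str(count), host],
                          capture_output=True).returncode == 0

def ping_via(gateway, target):
    """SSH to the VM on the provider network and ping the peer subport."""
    return subprocess.run(['ssh', 'cloud-user@' + gateway,
                           'ping -c 3 ' + target],
                          capture_output=True).returncode == 0

print('external ping:', ping('10.46.21.239'))
print('internal ping:', ping_via('10.46.21.239', '192.168.200.25'))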



Version-Release number of selected component (if applicable):
RHOS-17.1-RHEL-9-20230607.n.2
openstack-neutron-ovn-migration-tool-18.6.1-1.20230518200966.el9ost.noarch

How reproducible:
100% with VMs using trunk ports. VMs with normal ports remain accessible after migration.

Steps to Reproduce:
1. Deploy an HA environment with the OVS Neutron backend.
2. Create external and internal networks and subnets, create 2 trunk ports on the external (aka provider) network, and create 2 VMs using these trunk ports, with subports on the internal network (see the openstacksdk sketch below).
3. Migrate the Neutron backend of the environment from OVS to OVN.
4. Make sure that the VM with the trunk port responds to ping after the Neutron backend migration.
5. Live migrate the VM with the trunk port.
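
For reference, a minimal reproduction sketch of steps 2 and 5 using openstacksdk (one trunk and one VM shown; the CI job creates two of each). This is not the exact d/s CI code; the cloud name, network/image/flavor IDs, and VLAN ID are placeholders:

import openstack

conn = openstack.connect(cloud='overcloud')  # assumed clouds.yaml entry

external_net_id = '<external-net-uuid>'  # provider network (placeholder)
internal_net_id = '<internal-net-uuid>'  # internal network (placeholder)

# Parent port on the provider network, subport on the internal network.
parent = conn.network.create_port(network_id=external_net_id)
sub = conn.network.create_port(network_id=internal_net_id)

trunk = conn.network.create_trunk(port_id=parent.id, name='trunk0')
conn.network.add_trunk_subports(
    trunk,
    [{'port_id': sub.id, 'segmentation_type': 'vlan', 'segmentation_id': 101}])

server = conn.compute.create_server(
    name='vm0', image_id='<image-uuid>', flavor_id='<flavor-uuid>',
    networks=[{'port': parent.id}])
conn.compute.wait_for_server(server)

# ... run the OVS-to-OVN migration and confirm the VM answers ping, then:
# (block_migration='auto' requires compute microversion >= 2.25)
conn.compute.live_migrate_server(server, host=None, block_migration='auto')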


Actual results:
The VM is inaccessible after the live migration.

Expected results:
The VM is accessible after the migration, i.e., it is pingable from the external network, and its subports on the internal network respond to ping requests from the internal network.

Additional info:

Comment 2 Jakub Libosvar 2023-06-12 20:30:49 UTC
I reproduced this issue. It seems similar to bug 2192726.

2023-06-12 20:14:07.548 2 DEBUG os_vif [req-0e642c7f-ee56-41c7-9636-a2a67f781e1c e6c7c2efe0974edda3d00224bb9d3554 b0f1a039a8754c068b00b0cb70453d62 - default default] Plugging vif VIFOpenVSwitch(active=True,address=fa:16:3e:e0:5b:8c,bridge_name='tbr-7b3be573-9',has_traffic_filtering=True,id=9d286de8-2662-4a4f-b4c5-2eb5c3dea10e,network=Network(8d16ee77-b9d5-466e-adef-392036433273),plugin='ovs',port_profile=VIFPortProfileOpenVSwitch,preserve_on_delete=True,vif_name='tap9d286de8-26') plug /usr/lib/python3.9/site-packages/os_vif/__init__.py:76
2023-06-12 20:14:07.550 2 DEBUG ovsdbapp.backend.ovs_idl.transaction [-] Running txn n=1 command(idx=0): AddBridgeCommand(name=tbr-7b3be573-9, may_exist=True, datapath_type=system) do_commit /usr/lib/python3.9/site-packages/ovsdbapp/backend/ovs_idl/transaction.py:89
2023-06-12 20:14:07.589 2 DEBUG ovsdbapp.backend.ovs_idl.transaction [-] Running txn n=1 command(idx=0): AddPortCommand(bridge=tbr-7b3be573-9, port=tap9d286de8-26, may_exist=True) do_commit /usr/lib/python3.9/site-packages/ovsdbapp/backend/ovs_idl/transaction.py:89

However, Nova in this case was notified:

2023-06-12 19:12:57.160 9 DEBUG nova.api.openstack.wsgi [req-46593683-915d-4a07-910b-d5f2ce3090ab 12464e59d5254ec98f1773b91173283d 795d47bf05a24f0b921b2a16a011e52b - default default] Action: 'create', calling method: <bound method ServerExternalEventsController.create of <nova.api.openstack.compute.server_external_events.ServerExternalEventsController object at 0x7f99cdfe1a00>>, body: {"events": [{"name": "network-changed", "server_uuid": "53212cfb-c0fb-4dec-8971-50579718118e", "tag": "441fa082-0c7d-48aa-be1e-3b7689b3c81b"}, {"name": "network-changed", "server_uuid": "a0ff2953-422d-4fec-b314-8ed8ab9949d6", "tag": "9d286de8-2662-4a4f-b4c5-2eb5c3dea10e"}]} _process_stack /usr/lib/python3.9/site-packages/nova/api/openstack/wsgi.py:511
2023-06-12 19:12:57.177 9 DEBUG oslo_concurrency.lockutils [req-46593683-915d-4a07-910b-d5f2ce3090ab 12464e59d5254ec98f1773b91173283d 795d47bf05a24f0b921b2a16a011e52b - default default] Lock "6136762c-db40-4584-b9c3-7b409a0093d1" acquired by "nova.context.set_target_cell.<locals>.get_or_set_cached_cell_and_set_connections" :: waited 0.000s inner /usr/lib/python3.9/site-packages/oslo_concurrency/lockutils.py:355
2023-06-12 19:12:57.178 9 DEBUG oslo_concurrency.lockutils [req-46593683-915d-4a07-910b-d5f2ce3090ab 12464e59d5254ec98f1773b91173283d 795d47bf05a24f0b921b2a16a011e52b - default default] Lock "6136762c-db40-4584-b9c3-7b409a0093d1" released by "nova.context.set_target_cell.<locals>.get_or_set_cached_cell_and_set_connections" :: held 0.001s inner /usr/lib/python3.9/site-packages/oslo_concurrency/lockutils.py:367
2023-06-12 19:12:57.199 9 INFO nova.api.openstack.compute.server_external_events [req-46593683-915d-4a07-910b-d5f2ce3090ab 12464e59d5254ec98f1773b91173283d 795d47bf05a24f0b921b2a16a011e52b - default default] Creating event network-changed:441fa082-0c7d-48aa-be1e-3b7689b3c81b for instance 53212cfb-c0fb-4dec-8971-50579718118e on compute-0.redhat.local
2023-06-12 19:12:57.200 9 INFO nova.api.openstack.compute.server_external_events [req-46593683-915d-4a07-910b-d5f2ce3090ab 12464e59d5254ec98f1773b91173283d 795d47bf05a24f0b921b2a16a011e52b - default default] Creating event network-changed:9d286de8-2662-4a4f-b4c5-2eb5c3dea10e for instance a0ff2953-422d-4fec-b314-8ed8ab9949d6 on compute-1.redhat.local
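
For reference, a quick way to check whether the tap device actually landed on the trunk bridge on the destination compute is to query the local OVSDB with ovsdbapp, the same library the os-vif log above goes through. A diagnostic sketch, assuming the default OVSDB socket path and run as root on the compute node:

from ovsdbapp.backend.ovs_idl import connection
from ovsdbapp.schema.open_vswitch import impl_idl

idl = connection.OvsdbIdl.from_server(
    'unix:/var/run/openvswitch/db.sock', 'Open_vSwitch')
api = impl_idl.OvsdbIdl(connection.Connection(idl, timeout=10))

# Bridge and tap names taken from the os-vif log above; the tap port
# should be listed on the trunk bridge after the migration.
print(api.list_ports('tbr-7b3be573-9').execute(check_error=True))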

Comment 3 Jakub Libosvar 2023-06-12 21:22:33 UTC
Neutron-side logs:

2023-06-12 19:12:56.069 81 DEBUG neutron.notifiers.nova [-] Sending events: [{'name': 'network-changed', 'server_uuid': '53212cfb-c0fb-4dec-8971-50579718118e', 'tag': '441fa082-0c7d-48aa-be1e-3b7689b3c81b'}, {'name': 'network-changed', 'server_uuid': 'a0ff2953-422d-4fec-b314-8ed8ab9949d6', 'tag': '9d286de8-2662-4a4f-b4c5-2eb5c3dea10e'}] send_events /usr/lib/python3.9/site-packages/neutron/notifiers/nova.py:279
2023-06-12 19:12:56.624 81 DEBUG novaclient.v2.client [-] REQ: curl -g -i -X POST http://172.17.1.91:8774/v2.1/os-server-external-events -H "Accept: application/json" -H "Content-Type: application/json" -H "User-Agent: python-novaclient" -H "X-Auth-Token: {SHA256}b940002fcf9af167338618dd2f325683d38ed39d3173fc7146ebc96f5bd3390c" -H "X-OpenStack-Nova-API-Version: 2.1" -H "X-OpenStack-Request-ID: req-6841ae01-b41a-499b-a863-73185c733dff" -d '{"events": [{"name": "network-changed", "server_uuid": "53212cfb-c0fb-4dec-8971-50579718118e", "tag": "441fa082-0c7d-48aa-be1e-3b7689b3c81b"}, {"name": "network-changed", "server_uuid": "a0ff2953-422d-4fec-b314-8ed8ab9949d6", "tag": "9d286de8-2662-4a4f-b4c5-2eb5c3dea10e"}]}' _http_log_request /usr/lib/python3.9/site-packages/keystoneauth1/session.py:519
2023-06-12 19:12:57.214 81 DEBUG novaclient.v2.client [-] RESP: [200] content-length: 346 content-type: application/json date: Mon, 12 Jun 2023 19:12:56 GMT openstack-api-version: compute 2.1 server: Apache vary: OpenStack-API-Version,X-OpenStack-Nova-API-Version x-compute-request-id: req-46593683-915d-4a07-910b-d5f2ce3090ab x-openstack-nova-api-version: 2.1 x-openstack-request-id: req-46593683-915d-4a07-910b-d5f2ce3090ab _http_log_response /usr/lib/python3.9/site-packages/keystoneauth1/session.py:550
2023-06-12 19:12:57.215 81 DEBUG novaclient.v2.client [-] RESP BODY: {"events": [{"name": "network-changed", "server_uuid": "53212cfb-c0fb-4dec-8971-50579718118e", "tag": "441fa082-0c7d-48aa-be1e-3b7689b3c81b", "status": "completed", "code": 200}, {"name": "network-changed", "server_uuid": "a0ff2953-422d-4fec-b314-8ed8ab9949d6", "tag": "9d286de8-2662-4a4f-b4c5-2eb5c3dea10e", "status": "completed", "code": 200}]} _http_log_response /usr/lib/python3.9/site-packages/keystoneauth1/session.py:582
2023-06-12 19:12:57.215 81 DEBUG novaclient.v2.client [-] POST call to compute for http://172.17.1.91:8774/v2.1/os-server-external-events used request id req-46593683-915d-4a07-910b-d5f2ce3090ab request /usr/lib/python3.9/site-packages/keystoneauth1/session.py:954
2023-06-12 19:12:57.215 81 INFO neutron.notifiers.nova [-] Nova event matching ['req-46593683-915d-4a07-910b-d5f2ce3090ab'] response: {'name': 'network-changed', 'server_uuid': '53212cfb-c0fb-4dec-8971-50579718118e', 'tag': '441fa082-0c7d-48aa-be1e-3b7689b3c81b', 'status': 'completed', 'code': 200}
2023-06-12 19:12:57.215 81 INFO neutron.notifiers.nova [-] Nova event matching ['req-46593683-915d-4a07-910b-d5f2ce3090ab'] response: {'name': 'network-changed', 'server_uuid': 'a0ff2953-422d-4fec-b314-8ed8ab9949d6', 'tag': '9d286de8-2662-4a4f-b4c5-2eb5c3dea10e', 'status': 'completed', 'code': 200}
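
For reference, the notification shown above boils down to a single call on python-novaclient's server external events API. A minimal sketch (the auth endpoint and credentials are placeholders):

from keystoneauth1.identity import v3
from keystoneauth1 import session
from novaclient import client

auth = v3.Password(auth_url='http://172.17.1.91:5000/v3',  # placeholder
                   username='admin', password='<password>',
                   project_name='admin',
                   user_domain_name='Default', project_domain_name='Default')
nova = client.Client('2.1', session=session.Session(auth=auth))

# Same payload as the REQ log line above.
nova.server_external_events.create([
    {'name': 'network-changed',
     'server_uuid': 'a0ff2953-422d-4fec-b314-8ed8ab9949d6',
     'tag': '9d286de8-2662-4a4f-b4c5-2eb5c3dea10e'}])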

Comment 4 Jakub Libosvar 2023-06-13 14:54:14 UTC
I tested this on my development environment, which contains compose RHOS-17.1-RHEL-9-20230404.n.1, and it works there.

Comment 16 Roman Safronov 2023-06-22 09:54:39 UTC
Verified that the issue does not happen on RHOS-17.1-RHEL-9-20230621.n.1 with openstack-neutron-ovn-migration-tool-18.6.1-1.20230518200969.el9ost.noarch.

Comment 24 errata-xmlrpc 2023-08-16 01:15:43 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Release of components for Red Hat OpenStack Platform 17.1 (Wallaby)), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2023:4577

