Bug 2143972 - [OSP16.1] Live-migration failure during post because of keystone unavailable [NEEDINFO]
Summary: [OSP16.1] Live-migration failure during post because of keystone unavailable
Keywords:
Status: NEW
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-nova
Version: 16.1 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: ---
Target Release: ---
Assignee: OSP DFG:Compute
QA Contact: OSP DFG:Compute
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-11-18 14:56 UTC by ggrimaux
Modified: 2023-07-12 10:00 UTC (History)
CC List: 11 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:
Embargoed:
rribaud: needinfo? (rosingh)


Links
System ID Private Priority Status Summary Last Updated
Launchpad 2000069 0 None None None 2023-01-18 19:05:01 UTC
Red Hat Issue Tracker OSP-20303 0 None None None 2022-11-18 15:00:57 UTC

Description ggrimaux 2022-11-18 14:56:30 UTC
Description of problem:

The client attempted a live migration, and it failed during the post-live-migration step.

The source of the problem seems to be related to Keystone:
1064:2022-11-17 15:49:15.296 7 INFO nova.compute.resource_tracker [req-1c837c0b-3ccd-4226-8340-af1b12c6fddb - - - - -] [instance: 67857f95-d07d-4568-b92f-b37c760c94ef] Updating resource usage from migration 1976219d-a89d-48f2-893f-39d3e194a4f3
1067:2022-11-17 15:49:34.143 7 ERROR nova.compute.manager [req-ac4e513d-a39c-452c-8df7-319d2e764095 - - - - -] [instance: 67857f95-d07d-4568-b92f-b37c760c94ef] Post live migration at destination icmlw-p1-r740-070.itpc.uk.pri.o2.com failed: oslo_messaging.rpc.client.RemoteError: Remote error: ServiceUnavailable The server is currently unavailable. Please try again at a later time.<br /><br />
1073:2022-11-17 15:49:34.143 7 ERROR nova.compute.manager [instance: 67857f95-d07d-4568-b92f-b37c760c94ef] Traceback (most recent call last):
1074:2022-11-17 15:49:34.143 7 ERROR nova.compute.manager [instance: 67857f95-d07d-4568-b92f-b37c760c94ef]   File "/usr/lib/python3.6/site-packages/nova/compute/manager.py", line 7579, in _post_live_migration
1075:2022-11-17 15:49:34.143 7 ERROR nova.compute.manager [instance: 67857f95-d07d-4568-b92f-b37c760c94ef]     instance, block_migration, dest)
1076:2022-11-17 15:49:34.143 7 ERROR nova.compute.manager [instance: 67857f95-d07d-4568-b92f-b37c760c94ef]   File "/usr/lib/python3.6/site-packages/nova/compute/rpcapi.py", line 796, in post_live_migration_at_destination
1077:2022-11-17 15:49:34.143 7 ERROR nova.compute.manager [instance: 67857f95-d07d-4568-b92f-b37c760c94ef]     instance=instance, block_migration=block_migration)
1078:2022-11-17 15:49:34.143 7 ERROR nova.compute.manager [instance: 67857f95-d07d-4568-b92f-b37c760c94ef]   File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/client.py", line 181, in call
1079:2022-11-17 15:49:34.143 7 ERROR nova.compute.manager [instance: 67857f95-d07d-4568-b92f-b37c760c94ef]     transport_options=self.transport_options)
1080:2022-11-17 15:49:34.143 7 ERROR nova.compute.manager [instance: 67857f95-d07d-4568-b92f-b37c760c94ef]   File "/usr/lib/python3.6/site-packages/oslo_messaging/transport.py", line 129, in _send
1081:2022-11-17 15:49:34.143 7 ERROR nova.compute.manager [instance: 67857f95-d07d-4568-b92f-b37c760c94ef]     transport_options=transport_options)
1082:2022-11-17 15:49:34.143 7 ERROR nova.compute.manager [instance: 67857f95-d07d-4568-b92f-b37c760c94ef]   File "/usr/lib/python3.6/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 674, in send
1083:2022-11-17 15:49:34.143 7 ERROR nova.compute.manager [instance: 67857f95-d07d-4568-b92f-b37c760c94ef]     transport_options=transport_options)
1084:2022-11-17 15:49:34.143 7 ERROR nova.compute.manager [instance: 67857f95-d07d-4568-b92f-b37c760c94ef]   File "/usr/lib/python3.6/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 664, in _send
1085:2022-11-17 15:49:34.143 7 ERROR nova.compute.manager [instance: 67857f95-d07d-4568-b92f-b37c760c94ef]     raise result
1086:2022-11-17 15:49:34.143 7 ERROR nova.compute.manager [instance: 67857f95-d07d-4568-b92f-b37c760c94ef] oslo_messaging.rpc.client.RemoteError: Remote error: ServiceUnavailable The server is currently unavailable. Please try again at a later time.<br /><br />
1087:2022-11-17 15:49:34.143 7 ERROR nova.compute.manager [instance: 67857f95-d07d-4568-b92f-b37c760c94ef] The Keystone service is temporarily unavailable.


I looked at keystone logs around that time on all 3 controller nodes and couldn't find anything substantial.
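
To narrow down which part of the Keystone logs to inspect, the failure window can be derived from the nova-compute excerpt above. The following is only a minimal Python sketch, assuming the log line layout shown in this report (optional grep line-number prefix, timestamp, PID, level, logger). The names `error_window` and `margin_seconds` are hypothetical and not part of Nova.

```python
import re
from datetime import datetime, timedelta

# Assumed layout, taken from the excerpt in this report:
#   [NNNN:]YYYY-MM-DD HH:MM:SS.mmm <pid> <LEVEL> <logger> <message>
# The leading "NNNN:" is the optional grep line-number prefix.
LINE_RE = re.compile(
    r"^(?:\d+:)?(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"\d+ (?P<level>[A-Z]+) (?P<logger>\S+) "
)


def error_window(lines, instance_uuid, margin_seconds=60):
    """Return (start, end) datetimes around the ERROR lines of one instance.

    Returns None when no ERROR line mentions the instance.
    """
    stamps = []
    for line in lines:
        match = LINE_RE.match(line)
        if not match or match.group("level") != "ERROR":
            continue
        # Keep only lines tagged with the instance we care about.
        if f"[instance: {instance_uuid}]" not in line:
            continue
        stamps.append(
            datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S.%f")
        )
    if not stamps:
        return None
    margin = timedelta(seconds=margin_seconds)
    return min(stamps) - margin, max(stamps) + margin
```

The returned window (here, one minute on either side of 15:49:34.143) is the range of timestamps to compare against the Keystone logs on each controller node.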

The instance was stuck in the "MIGRATING" state, and its nova.instances record still pointed to the source compute.
We had to modify the database so that it points to the new compute.

I don't know whether you can find anything with this information,
or whether we need to enable debug mode in Keystone and wait for the next occurrence of this situation.

If you need anything please let me know.

Thank you.

Version-Release number of selected component (if applicable):
OSP16.1.7
Environment is integrated with Contrail.


How reproducible:
Happened once

Steps to Reproduce:
1. Live-migration
2.
3.

Actual results:
Live-migration failed.

Expected results:
Live-migration succeeds.

Additional info:
We have sosreports from the compute nodes and the controller nodes.

