This bug has been migrated to another issue tracking site. It has been closed here and may no longer be monitored.

If you would like to get updates for this issue, or to participate in it, you may do so at the Red Hat Issue Tracker.
Bug 1742436 - Update instance host and task state when post live migration fails
Summary: Update instance host and task state when post live migration fails
Keywords:
Status: CLOSED MIGRATED
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-nova
Version: 13.0 (Queens)
Hardware: x86_64
OS: Linux
Priority: low
Severity: high
Target Milestone: ga
Target Release: 18.0
Assignee: Amit Uniyal
QA Contact: OSP DFG:Compute
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-08-16 19:05 UTC by David Hill
Modified: 2024-12-20 18:53 UTC
CC: 26 users

Fixed In Version: openstack-nova-27.1.1-18.0.20230801141713.252e660.el9ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2024-01-11 14:55:04 UTC
Target Upstream Version:
Embargoed:


Attachments


Links
System ID Private Priority Status Summary Last Updated
OpenStack gerrit 791135 0 None MERGED [compute] always set instance.host in post_livemigration 2023-06-16 06:37:59 UTC
OpenStack gerrit 864055 0 None MERGED [compute] always set instance.host in post_livemigration 2023-08-22 11:24:56 UTC
Red Hat Issue Tracker   OSP-18384 0 None None None 2024-01-11 14:55:03 UTC
Red Hat Issue Tracker OSP-31144 0 None None None 2024-01-11 14:58:01 UTC
Red Hat Issue Tracker OSP-3145 0 None None None 2021-11-18 14:46:55 UTC
Red Hat Knowledge Base (Solution) 4347541 0 None None None 2020-05-25 00:12:06 UTC

Description David Hill 2019-08-16 19:05:18 UTC
This bug was initially created as a copy of Bug #1636102

I am copying this bug because: 



Update instance host and task state when post live migration fails

If a live migration fails during post-processing, it can leave the instance
shut down on the source node and stuck in a "migrating" task state. At that
point the instance is actually running on the target node, so the instance
host and task state should be updated accordingly.
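
For illustration only, the following is a minimal standalone Python sketch of the behaviour requested here: record the destination host and clear the migrating task state even when post-live-migration processing raises. The names used (FakeInstance, PostMigrationError, post_live_migration_at_destination) are hypothetical stand-ins, not nova's real API, and this is not the actual upstream patch (see the gerrit links above).

    # Sketch: always persist the new host/task state, even on post-migration failure.

    class PostMigrationError(Exception):
        """Stand-in for e.g. a Neutron credential failure during post-processing."""


    class FakeInstance:
        def __init__(self, host):
            self.host = host
            self.task_state = 'migrating'

        def save(self):
            # In nova this would persist the record to the database.
            pass


    def post_live_migration_at_destination(instance):
        # Simulate the destination-side post-processing failing, as in the
        # NeutronAdminCredentialConfigurationInvalid traceback below.
        raise PostMigrationError('Networking client is experiencing an '
                                 'unauthorized exception.')


    def post_live_migration(instance, dest):
        try:
            post_live_migration_at_destination(instance)
        finally:
            # The guest is already running on the destination at this point,
            # so the record should reflect that even if post-processing failed.
            instance.host = dest
            instance.task_state = None
            instance.save()


    if __name__ == '__main__':
        vm = FakeInstance(host='source_compute.localdomain')
        try:
            post_live_migration(vm, dest='remote_compute.localdomain')
        except PostMigrationError:
            pass
        print(vm.host, vm.task_state)  # remote_compute.localdomain None

Without the finally-style handling, the exception would propagate before the host was updated, which is the stale-database symptom described in the comments below.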

Comment 2 David Hill 2019-08-16 19:11:00 UTC
The VM was migrated to remote_compute.localdomain but in the database it remained on source_compute.localdomain. The following error was seen in the compute logs:

2019-08-14 12:55:18.671 1 ERROR nova.compute.manager [req-f0905fc1-ec77-47ec-a4ce-4324718cec3a 6828abf50d114f7cad181a7402571511 cdfe5b92c11d49fc86d242448d99f8cc - default default] [instance: 229f9217-67ed-4ef5-bcd8-796fa89f3901] Post live
 migration at destination remote_compute.localdomain failed: NeutronAdminCredentialConfigurationInvalid_Remote: Networking client is experiencing an unauthorized exception.
2019-08-14 12:55:18.671 1 ERROR nova.compute.manager [instance: 229f9217-67ed-4ef5-bcd8-796fa89f3901] Traceback (most recent call last):
2019-08-14 12:55:18.671 1 ERROR nova.compute.manager [instance: 229f9217-67ed-4ef5-bcd8-796fa89f3901]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 6384, in _post_live_migration
2019-08-14 12:55:18.671 1 ERROR nova.compute.manager [instance: 229f9217-67ed-4ef5-bcd8-796fa89f3901]     instance, block_migration, dest)
2019-08-14 12:55:18.671 1 ERROR nova.compute.manager [instance: 229f9217-67ed-4ef5-bcd8-796fa89f3901]   File "/usr/lib/python2.7/site-packages/nova/compute/rpcapi.py", line 783, in post_live_migration_at_destination
2019-08-14 12:55:18.671 1 ERROR nova.compute.manager [instance: 229f9217-67ed-4ef5-bcd8-796fa89f3901]     instance=instance, block_migration=block_migration)
2019-08-14 12:55:18.671 1 ERROR nova.compute.manager [instance: 229f9217-67ed-4ef5-bcd8-796fa89f3901]   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/client.py", line 174, in call
2019-08-14 12:55:18.671 1 ERROR nova.compute.manager [instance: 229f9217-67ed-4ef5-bcd8-796fa89f3901]     retry=self.retry)
2019-08-14 12:55:18.671 1 ERROR nova.compute.manager [instance: 229f9217-67ed-4ef5-bcd8-796fa89f3901]   File "/usr/lib/python2.7/site-packages/oslo_messaging/transport.py", line 131, in _send
2019-08-14 12:55:18.671 1 ERROR nova.compute.manager [instance: 229f9217-67ed-4ef5-bcd8-796fa89f3901]     timeout=timeout, retry=retry)
2019-08-14 12:55:18.671 1 ERROR nova.compute.manager [instance: 229f9217-67ed-4ef5-bcd8-796fa89f3901]   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 559, in send
2019-08-14 12:55:18.671 1 ERROR nova.compute.manager [instance: 229f9217-67ed-4ef5-bcd8-796fa89f3901]     retry=retry)
2019-08-14 12:55:18.671 1 ERROR nova.compute.manager [instance: 229f9217-67ed-4ef5-bcd8-796fa89f3901]   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 550, in _send

Comment 7 Cristian Muresanu 2020-05-26 22:09:04 UTC
We have observed that if we specify the destination node when triggering the live migration it works, but if we just trigger the live migration and let the scheduler decide on the destination node, it fails.

Comment 8 jhardee 2020-08-03 15:05:46 UTC
We wanted to see if there's any update on this.

Thanks team

Comment 13 Matthew Secaur 2022-03-10 21:42:42 UTC
So, it's been 2.5 years and this BZ hasn't made any progress. In the meantime, I have two customers who are waiting on this to be fixed in OSP 16.1 and, as far as I can tell, OSP 16.2 as well.

What can we do to get more attention on this issue?

Thanks!

Comment 14 aruffin@redhat.com 2022-06-03 19:39:23 UTC
Hello,

Is there any progress on this bug?

Comment 15 Alex Stupnikov 2022-06-06 06:59:28 UTC
Hello. We are backporting the fix upstream and plan to release the RHOSP 16.2 fix in one of the upcoming minor releases.

Comment 16 smooney 2022-06-13 12:37:51 UTC
The fix upstream has not been merged and was deprioritized, so I'm not currently working on this.

When it lands we can likely backport it to 16.2, but I don't plan to backport it to 16.1 or 13.

Comment 20 aruffin@redhat.com 2022-11-03 20:14:52 UTC
Hello,

As this was merged upstream, the customer is asking when and what more is needed to have this backported to 16.2.

Andre

Comment 25 Red Hat Bugzilla 2024-05-11 04:25:02 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days

