Bug 1424481

Summary: Rollback of failed live migration with attached volume fails on destination
Product: Red Hat OpenStack
Reporter: Matthew Booth <mbooth>
Component: openstack-nova
Assignee: Lee Yarwood <lyarwood>
Status: CLOSED CURRENTRELEASE
QA Contact: Prasanth Anbalagan <panbalag>
Severity: medium
Docs Contact:
Priority: medium
Version: 8.0 (Liberty)
CC: berrange, dasmith, ealcaniz, eglynn, jraju, kchamart, lyarwood, mburns, mzheng, sbauza, sferdjao, sgordon, srevivo, vromanso
Target Milestone: Upstream M2
Keywords: Triaged
Target Release: 14.0 (Rocky)
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-07-06 08:57:34 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Matthew Booth 2017-02-17 17:17:18 UTC
Description of problem:
This bug comes out of bug 1421946. In that bug we're doing a live migration of an instance with an attached volume, which fails. The live migration failure itself is not the subject of this bug.

During the rollback we somehow end up calling ComputeManager._driver_detach_volume on the *destination*, which attempts to detach the volume from the ephemeral destination domain. This fails with:

libvirtError: Requested operation is not valid: cannot modify device on transient domain

The logs aren't 100% clear, but I am guessing this is called from _rollback_live_migration() when it calls self.compute_rpcapi.remove_volume_connection(context, instance, bdm.volume_id, dest). Note that the dest target is explicit.

Failure during rollback means that the rollback is incomplete, which is likely wasting resources.
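
To illustrate the suspected failure mode, here is a minimal, self-contained sketch. It is not Nova code: FakeDomain, TransientDomainError and the simplified function bodies are hypothetical stand-ins, and the call chain only mirrors the guess above (the rollback asks the destination to remove the volume connection, which first tries a guest-side detach). The transient-domain check is likewise a simplification of the libvirt behaviour behind the quoted error.

```python
class TransientDomainError(Exception):
    """Hypothetical stand-in for the libvirtError quoted above."""


class FakeDomain:
    """Toy model of a libvirt domain; not the real libvirt API."""

    def __init__(self, persistent):
        self.persistent = persistent
        self.devices = {"vol-1"}

    def detach_device(self, volume_id):
        # Simplified assumption: device changes are rejected on a
        # transient domain, as on the migration destination.
        if not self.persistent:
            raise TransientDomainError(
                "Requested operation is not valid: "
                "cannot modify device on transient domain")
        self.devices.discard(volume_id)


def remove_volume_connection(domain, volume_id, connections):
    # Guessed ordering: detach from the guest first, then drop the
    # host-side volume connection.
    domain.detach_device(volume_id)
    connections.discard(volume_id)


def rollback_live_migration(dest_domain, volume_ids, dest_connections):
    # The destination is targeted explicitly, as noted above.
    for volume_id in volume_ids:
        remove_volume_connection(dest_domain, volume_id, dest_connections)


if __name__ == "__main__":
    dest = FakeDomain(persistent=False)
    dest_connections = {"vol-1"}
    try:
        rollback_live_migration(dest, ["vol-1"], dest_connections)
    except TransientDomainError as exc:
        print("rollback aborted:", exc)
    print("connections left on destination:", dest_connections)
```

In this model the detach raises before the host-side connection is removed, so the destination keeps the connection. That is the kind of leftover resource an incomplete rollback would leave behind.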

Comment 5 Lee Yarwood 2018-03-15 14:47:22 UTC
*** Bug 1353147 has been marked as a duplicate of this bug. ***

Comment 12 Lee Yarwood 2018-07-06 08:57:34 UTC
This was resolved during Pike and backported to stable/ocata and stable/newton:

https://review.openstack.org/#/q/I95948721a0119f5f54dbe50d4455fd47d422164b

Closing as CURRENTRELEASE.

Comment 13 Lee Yarwood 2018-07-17 13:23:23 UTC
To be clear, this fix landed in openstack-nova in the following releases: OSP 10 >= 14.0.8 and OSP 11 >= 15.0.7. All versions of openstack-nova shipped with OSP 12 and 13 have this fix at release.
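
For anyone checking a deployment against these thresholds, a small sketch of the comparison follows. The function names are hypothetical, and it assumes the openstack-nova version is available as a plain dotted string such as "14.0.8"; any RPM release suffix would need to be stripped first. Releases not mentioned in this comment return None rather than a guess.

```python
# Minimum openstack-nova versions carrying the fix, per the comment above.
FIXED_IN = {10: (14, 0, 8), 11: (15, 0, 7)}
# OSP releases that shipped with the fix from the start.
FIXED_AT_RELEASE = {12, 13}


def parse_version(version):
    """Turn a dotted version string like '14.0.8' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def has_fix(osp_release, nova_version):
    """Return True/False if known, or None for releases not covered here."""
    if osp_release in FIXED_AT_RELEASE:
        return True
    minimum = FIXED_IN.get(osp_release)
    if minimum is None:
        return None
    return parse_version(nova_version) >= minimum


if __name__ == "__main__":
    print(has_fix(10, "14.0.7"))
    print(has_fix(10, "14.0.8"))
    print(has_fix(11, "15.0.10"))
```

Comparing tuples of integers rather than raw strings keeps "15.0.10" ordered after "15.0.7".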