Bug 1326868

Summary: Cold migrate of host doesn't migrate the instances from source host
Product: Red Hat OpenStack
Reporter: Ido Ovadia <iovadia>
Component: openstack-nova
Assignee: Eoghan Glynn <eglynn>
Status: CLOSED NOTABUG
QA Contact: Prasanth Anbalagan <panbalag>
Severity: high
Priority: unspecified
Version: 8.0 (Liberty)
CC: berrange, dasmith, eglynn, jschluet, kchamart, ndipanov, sbauza, sferdjao, sgordon, srevivo, vromanso
Target Milestone: ---
Keywords: ZStream
Target Release: 8.0 (Liberty)
Hardware: x86_64
OS: Linux
Doc Type: Bug Fix
Last Closed: 2016-04-21 15:43:31 UTC
Type: Bug
Bug Blocks: 1104445
Attachments: /var/log/nova/nova-compute.log (flags: none)

Description Ido Ovadia 2016-04-13 15:12:40 UTC
Created attachment 1146899: /var/log/nova/nova-compute.log

Description of problem:
=======================

https://bugzilla.redhat.com/show_bug.cgi?id=1104445

Cold migration of all instances from a host failed.
I got a message that the migration was starting, but the instances keep running on the source host.

Version-Release number of selected component:
=============================================
python-django-horizon-8.0.1-2.el7ost.noarch

python-novaclient-3.1.0-2.el7ost.noarch
openstack-nova-console-12.0.2-5.el7ost.noarch
python-nova-12.0.2-5.el7ost.noarch
openstack-nova-common-12.0.2-5.el7ost.noarch
openstack-nova-conductor-12.0.2-5.el7ost.noarch
openstack-nova-cert-12.0.2-5.el7ost.noarch
openstack-nova-api-12.0.2-5.el7ost.noarch
openstack-nova-compute-12.0.2-5.el7ost.noarch
openstack-nova-novncproxy-12.0.2-5.el7ost.noarch
openstack-nova-scheduler-12.0.2-5.el7ost.noarch

How reproducible:
=================
100%

Steps to Reproduce:
===================
1. Deploy undercloud and overcloud with 3 compute hosts and external Ceph.
2. Create 6 instances.
3. As the 'Admin' user, go to: Admin --> System --> Hypervisors.
4. Click the 'Compute Host' tab.
5. Choose a compute host that has running instances and, in the 'Actions' section, click 'Disable Service'.
6. Click 'Migrate Host'.
7. Choose 'Cold Migrate' and click 'Migrate Host'.

Actual results:
===============
Got a success message, but the instances keep running on the source host.

All instances remain in the ACTIVE state:
 
[stack@undercloud ~]$ nova list --all-tenant
+--------------------------------------+-------------------+----------------------------------+--------+------------+-------------+-------------------------+
| ID                                   | Name              | Tenant ID                        | Status | Task State | Power State | Networks                |
+--------------------------------------+-------------------+----------------------------------+--------+------------+-------------+-------------------------+
| 5bc99c07-c6ec-4913-9d05-7f9507f2b1e9 | RHEL-guest-7.2-1  | 6d75a629774942b0bbf50671b027599a | ACTIVE | -          | Running     | Internal=192.168.100.10 |
| ed8fc635-46e1-4bb4-be94-f166e43c96d4 | RHEL-guest-7.2-2  | 6d75a629774942b0bbf50671b027599a | ACTIVE | -          | Running     | Internal=192.168.100.9  |
| 525d9896-b6c2-4ef0-926e-caa8c1ed34ab | RHEL-guest-7.2-3  | 6d75a629774942b0bbf50671b027599a | ACTIVE | -          | Running     | Internal=192.168.100.11 |
| a40b45ef-d4b7-4b91-bb58-6613b52d4d0f | instance_cirros-1 | 6d75a629774942b0bbf50671b027599a | ACTIVE | -          | Running     | Internal=192.168.100.3  |
| 41df4559-9cf4-40c3-a313-9d21cab303a5 | instance_cirros-2 | 6d75a629774942b0bbf50671b027599a | ACTIVE | -          | Running     | Internal=192.168.100.4  |
| 893aa3c7-ae91-44c1-baed-49a99755d2cd | instance_cirros-3 | 6d75a629774942b0bbf50671b027599a | ACTIVE | -          | Running     | Internal=192.168.100.5  |
+--------------------------------------+-------------------+----------------------------------+--------+------------+-------------+-------------------------+


Expected results:
=================
The host migration succeeds and the instances are moved off the source host.

Note: live migration worked successfully.

/var/log/nova/nova-compute.log is attached.

Comment 1 Dan Smith 2016-04-21 15:43:31 UTC
It looks to me like you don't have the requisite SSH infrastructure in place to support this. Here is the relevant part of your log:

> 2016-04-13 14:29:55.028 25641 ERROR oslo_messaging.rpc.dispatcher ResizeError: Resize error: not able to execute ssh command: Unexpected error while running command.
> 2016-04-13 14:29:55.028 25641 ERROR oslo_messaging.rpc.dispatcher Command: ssh 172.16.0.32 mkdir -p /var/lib/nova/instances/41df4559-9cf4-40c3-a313-9d21cab303a5
> 2016-04-13 14:29:55.028 25641 ERROR oslo_messaging.rpc.dispatcher Exit code: 255
> 2016-04-13 14:29:55.028 25641 ERROR oslo_messaging.rpc.dispatcher Stdout: u''
> 2016-04-13 14:29:55.028 25641 ERROR oslo_messaging.rpc.dispatcher Stderr: u'Host key verification failed.\r\n'
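
For anyone triaging a similar report, the pattern in the excerpt above can be pulled out of a larger nova-compute.log automatically. The following is only a rough sketch: the function name and the regular expressions are made up for illustration and assume the log lines look like the ones quoted here.

```python
import re

# Hypothetical patterns, modelled only on the log excerpt quoted in this bug.
_COMMAND_RE = re.compile(r"Command: ssh (\S+) (.+)$")
_STDERR_RE = re.compile(r"Stderr: u?'(.*)'$")


def find_ssh_resize_failures(lines):
    """Collect the failed ssh command and stderr for each ResizeError block."""
    failures = []
    current = None
    for line in lines:
        line = line.rstrip("\n")
        # A ResizeError line about ssh opens a new failure record.
        if "ResizeError" in line and "ssh command" in line:
            current = {"host": None, "command": None, "stderr": None}
            failures.append(current)
            continue
        if current is None:
            continue
        match = _COMMAND_RE.search(line)
        if match:
            current["host"] = match.group(1)
            current["command"] = match.group(2)
            continue
        match = _STDERR_RE.search(line)
        if match:
            # The Stderr line is the last one of the block in the excerpt.
            current["stderr"] = match.group(1)
            current = None
    return failures


if __name__ == "__main__":
    with open("/var/log/nova/nova-compute.log") as log:
        for failure in find_ssh_resize_failures(log):
            print(failure["host"], "->", failure["stderr"])
```

A record whose stderr contains "Host key verification failed" points at missing SSH setup between the compute hosts rather than at a problem in nova itself.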

In order for this feature to work, you need SSH keys and some configuration to be installed everywhere so that images (et al) can be copied between machines. See section 2.3.1 for more information here:

https://access.redhat.com/documentation/en/red-hat-openstack-platform/8/migrating-instances/chapter-2-how-to-migrate-a-static-instance
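
Before retrying the cold migration, it may help to confirm that non-interactive SSH between the compute hosts actually works. The sketch below is an assumption-laden illustration, not part of the documented procedure: the helper names are hypothetical, and it presumes the script is run on each compute host as the same user the nova-compute service runs as, so that the same keys and known_hosts file are used.

```python
import subprocess


def check_ssh(host, run=subprocess.run):
    """Attempt a non-interactive ssh to `host`; return (ok, stderr_text)."""
    # BatchMode=yes makes ssh fail instead of prompting for a password or
    # for host key confirmation, which mirrors how a service would call it.
    argv = ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=5", host, "true"]
    proc = run(argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
               universal_newlines=True)
    return proc.returncode == 0, proc.stderr.strip()


def report(hosts, run=subprocess.run):
    """Map each host to 'ok' or to the error text ssh printed."""
    results = {}
    for host in hosts:
        ok, err = check_ssh(host, run=run)
        results[host] = "ok" if ok else (err or "ssh failed")
    return results


if __name__ == "__main__":
    # 172.16.0.32 is the destination address that appears in the log above;
    # list the other compute hosts of the deployment here as well.
    for host, status in report(["172.16.0.32"]).items():
        print(host, status)
```

If any host reports "Host key verification failed." or a permission error, the SSH keys and known_hosts entries described in the linked documentation are not yet in place for that pair of hosts.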