Bug 1403338 - rhel-osp-director: running "nova host-servers-migrate" as part of upgrade fails.
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: rhosp-director
Version: 9.0 (Mitaka)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: unspecified
Target Milestone: async
Target Release: 9.0 (Mitaka)
Assignee: Angus Thomas
QA Contact: Omri Hochman
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-12-09 17:21 UTC by Alexander Chuzhoy
Modified: 2016-12-16 12:23 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-12-16 12:22:57 UTC
Target Upstream Version:



Description Alexander Chuzhoy 2016-12-09 17:21:48 UTC
rhel-osp-director: running "nova host-servers-migrate" as part of the upgrade fails.

Environment:
openstack-tripleo-heat-templates-5.1.0-7.el7ost.noarch
openstack-nova-scheduler-14.0.2-7.el7ost.noarch
openstack-nova-conductor-14.0.2-7.el7ost.noarch
openstack-nova-compute-14.0.2-7.el7ost.noarch
puppet-nova-9.4.0-1.el7ost.noarch
openstack-puppet-modules-9.3.0-1.el7ost.noarch
instack-undercloud-5.1.0-4.el7ost.noarch
openstack-nova-network-14.0.2-7.el7ost.noarch
openstack-nova-cert-14.0.2-7.el7ost.noarch
python-nova-tests-14.0.2-7.el7ost.noarch
python-nova-14.0.2-7.el7ost.noarch
openstack-nova-placement-api-14.0.2-7.el7ost.noarch
openstack-nova-console-14.0.2-7.el7ost.noarch
openstack-nova-novncproxy-14.0.2-7.el7ost.noarch
python-novaclient-6.0.0-1.el7ost.noarch
openstack-nova-api-14.0.2-7.el7ost.noarch
openstack-tripleo-heat-templates-compat-2.0.0-41.el7ost.noarch
openstack-nova-cells-14.0.2-7.el7ost.noarch
openstack-nova-14.0.2-7.el7ost.noarch
openstack-nova-common-14.0.2-7.el7ost.noarch


Steps to reproduce:
1. Follow the upgrade documentation up to the compute node step: https://access.redhat.com/documentation/en/red-hat-openstack-platform/9/paged/upgrading-red-hat-openstack-platform/chapter-3-director-based-environments-performing-upgrades-to-major-versions

2. At that point the VMs need to be migrated off the compute node:
https://access.redhat.com/documentation/en/red-hat-openstack-platform/9/single/director-installation-and-usage/#sect-Migrating_VMs_from_an_Overcloud_Compute_Node

3. Run:
nova host-servers-migrate [hostname]
The migration request is accepted, but the VM is not migrated.

[stack@instack ~]$ nova host-servers-migrate overcloud-compute-0.localdomain
+--------------------------------------+--------------------+---------------+
| Server UUID                          | Migration Accepted | Error Message |
+--------------------------------------+--------------------+---------------+
| 1d145e4b-c6fa-4b91-99cb-921525f8b084 | True               |               |
+--------------------------------------+--------------------+---------------+




[stack@instack ~]$ nova migration-list
+----+---------------------------------+---------------------------------+---------------------------------+---------------------------------+-----------+-----------+--------------------------------------+------------+------------+----------------------------+----------------------------+----------------+
| Id | Source Node                     | Dest Node                       | Source Compute                  | Dest Compute                    | Dest Host | Status    | Instance UUID                        | Old Flavor | New Flavor | Created At                 | Updated At                 | Type           |
+----+---------------------------------+---------------------------------+---------------------------------+---------------------------------+-----------+-----------+--------------------------------------+------------+------------+----------------------------+----------------------------+----------------+
| 1  | -                               | -                               | overcloud-compute-0.localdomain | overcloud-compute-1.localdomain | -         | completed | 1d145e4b-c6fa-4b91-99cb-921525f8b084 | 4          | 4          | 2016-12-08T15:10:19.000000 | 2016-12-08T15:10:23.000000 | live-migration |
| 4  | -                               | -                               | overcloud-compute-1.localdomain | overcloud-compute-0.localdomain | -         | completed | a6a47318-dab5-4680-8bc3-be48c15568ef | 4          | 4          | 2016-12-08T15:13:47.000000 | 2016-12-08T15:13:58.000000 | live-migration |
| 7  | -                               | -                               | overcloud-compute-1.localdomain | overcloud-compute-0.localdomain | -         | completed | 1d145e4b-c6fa-4b91-99cb-921525f8b084 | 4          | 4          | 2016-12-08T15:13:51.000000 | 2016-12-08T15:14:03.000000 | live-migration |
| 8  | -                               | -                               | overcloud-compute-0.localdomain | overcloud-compute-1.localdomain | -         | completed | a6a47318-dab5-4680-8bc3-be48c15568ef | 4          | 4          | 2016-12-09T14:25:12.000000 | 2016-12-09T14:25:23.000000 | live-migration |
| 11 | overcloud-compute-0.localdomain | overcloud-compute-1.localdomain | overcloud-compute-0.localdomain | overcloud-compute-1.localdomain | 192.0.2.8 | error     | 1d145e4b-c6fa-4b91-99cb-921525f8b084 | 4          | 4          | 2016-12-09T17:08:20.000000 | 2016-12-09T17:08:21.000000 | migration      |
+----+---------------------------------+---------------------------------+---------------------------------+---------------------------------+-----------+-----------+--------------------------------------+------------+------------+----------------------------+----------------------------+----------------+
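
Because "Migration Accepted: True" only means the request was queued, the Status column of the migration list is what shows the failure. A minimal shell sketch for pulling out the failed rows follows. It assumes the column layout shown in the table above (Status in the seventh column, Instance UUID in the eighth); the function name failed_migrations is hypothetical and not part of novaclient.

```shell
# Hypothetical helper: print "<migration id> <instance uuid>" for every row
# of `nova migration-list` output whose Status column is "error".
# Assumes the table layout shown above; border and header rows are skipped.
failed_migrations() {
    awk -F'|' 'NF > 9 {
        # With "|" as the separator, field 1 is empty, field 2 is Id,
        # field 8 is Status and field 9 is Instance UUID.
        gsub(/ /, "", $2); gsub(/ /, "", $8); gsub(/ /, "", $9)
        if ($8 == "error") print $2, $9
    }'
}
```

It reads the table on standard input, for example: nova migration-list | failed_migrations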


The following traceback appears in nova-compute.log on the affected compute node:
2016-12-09 17:08:21.911 2641 ERROR oslo_messaging.rpc.dispatcher     self.instance_events.clear_events_for_instance(instance)
2016-12-09 17:08:21.911 2641 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib64/python2.7/contextlib.py", line 35, in __exit__
2016-12-09 17:08:21.911 2641 ERROR oslo_messaging.rpc.dispatcher     self.gen.throw(type, value, traceback)
2016-12-09 17:08:21.911 2641 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 6643, in _error_out_instance_on_exception
2016-12-09 17:08:21.911 2641 ERROR oslo_messaging.rpc.dispatcher     raise error.inner_exception
2016-12-09 17:08:21.911 2641 ERROR oslo_messaging.rpc.dispatcher ResizeError: Resize error: not able to execute ssh command: Unexpected error while running command.
2016-12-09 17:08:21.911 2641 ERROR oslo_messaging.rpc.dispatcher Command: ssh -o BatchMode=yes 192.0.2.8 mkdir -p /var/lib/nova/instances/1d145e4b-c6fa-4b91-99cb-921525f8b084
2016-12-09 17:08:21.911 2641 ERROR oslo_messaging.rpc.dispatcher Exit code: 255
2016-12-09 17:08:21.911 2641 ERROR oslo_messaging.rpc.dispatcher Stdout: u''
2016-12-09 17:08:21.911 2641 ERROR oslo_messaging.rpc.dispatcher Stderr: u'Permission denied (publickey,gssapi-keyex,gssapi-with-mic).\r\n'
2016-12-09 17:08:21.911 2641 ERROR oslo_messaging.rpc.dispatcher



Expected result:
Migration should work.

Comment 2 Alexander Chuzhoy 2016-12-09 18:24:55 UTC
The SSH keys weren't set up properly on all compute nodes.
The nova user must be able to SSH between the compute nodes, in both directions, without a password prompt.
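
One way to check this is a small shell sketch that repeats the non-interactive SSH test from the failing command in the log (ssh -o BatchMode=yes) against each peer compute node. The function name check_ssh_peers and the ConnectTimeout value are assumptions for illustration; they are not part of director or nova.

```shell
# Hypothetical helper: verify key-based SSH from the current user to each
# host given as an argument. BatchMode=yes disables password prompts, so a
# missing or unauthorized key fails immediately, as in the log above.
check_ssh_peers() {
    failed=0
    for peer in "$@"; do
        # Run a no-op remote command; only the exit status matters.
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$peer" true 2>/dev/null; then
            echo "OK   $peer"
        else
            echo "FAIL $peer"
            failed=1
        fi
    done
    # Non-zero if at least one peer could not be reached.
    return "$failed"
}
```

The check is only meaningful when run as the nova user (for example through sudo -u nova) on each compute node in turn, since the migration uses that user's keys. On overcloud-compute-0 in this environment, check_ssh_peers 192.0.2.8 would be expected to print FAIL until the keys are fixed.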

Comment 3 Ollie Walsh 2016-12-16 12:22:57 UTC
Closing, as the SSH keys were not set up properly.

