Bug 1151126 - fail to migrate instance on shared nova storage with ssh verification error
Summary: fail to migrate instance on shared nova storage with ssh verification error
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-packstack
Version: 5.0 (RHEL 7)
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 8.0 (Liberty)
Assignee: Ivan Chavero
QA Contact: Gabriel Szasz
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2014-10-09 15:54 UTC by Dafna Ron
Modified: 2016-08-29 13:44 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-01-26 03:40:20 UTC
Target Upstream Version:
Embargoed:


Attachments
logs (892.27 KB, application/x-gzip)
2014-10-09 15:54 UTC, Dafna Ron

Description Dafna Ron 2014-10-09 15:54:58 UTC
Created attachment 945379 [details]
logs

Description of problem:

I configured shared storage for nova, and migration failed with the following error:
Error: Failed to launch instance "Dafna1": Please try again later [Error: Unexpected error while running command. Command: ssh 10.35.160.125 mkdir -p /export/instances/bb438456-366b-44bd-8971-8d21e0b36920 Exit code: 255 Stdout: '' Stderr: 'Host key verification failed.\r\n'].

Version-Release number of selected component (if applicable):

openstack-packstack-2014.1.1-0.41.dev1251.el6ost.noarch

How reproducible:

100%

Steps to Reproduce:
1. install a multi-compute setup using packstack and configure: CONFIG_NOVA_COMPUTE_MIGRATE_PROTOCOL=ssh 
2. configure shared storage for nova
3. try to migrate an instance 

Actual results:

Migrating the instance fails with a "Host key verification failed" error.

Expected results:

Migration should succeed, since ssh migration is configured by packstack.

Additional info:

CONFIG_NOVA_COMPUTE_MIGRATE_PROTOCOL=ssh

[root@stripe ~]# grep live_migration_uri -R /etc/nova
/etc/nova/nova.conf:#live_migration_uri=qemu+tcp://%s/system
/etc/nova/nova.conf:live_migration_uri=qemu+ssh://nova@%s/system?no_verify=1&keyfile=/etc/nova/ssh/nova_migration_key
[root@stripe ~]# 
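
As a reading aid, the configured URI can be split into its parts. The sketch below uses only Python's standard urllib.parse; the function name describe_migration_uri is hypothetical, and the host substituted for %s is the destination address from the error message, chosen purely for illustration. It suggests that the key file and no_verify options belong to the qemu+ssh connection made as user nova, whereas the failing command in the traceback is a plain "ssh <dest> mkdir -p ..." call that carries none of these options.

```python
from urllib.parse import urlsplit, parse_qs

# Value of live_migration_uri as shown in /etc/nova/nova.conf above.
LIVE_MIGRATION_URI = (
    "qemu+ssh://nova@%s/system"
    "?no_verify=1&keyfile=/etc/nova/ssh/nova_migration_key"
)


def describe_migration_uri(template: str, host: str) -> dict:
    """Fill in the %s placeholder and split the URI into its parts."""
    parts = urlsplit(template % host)
    return {
        "scheme": parts.scheme,
        "user": parts.username,
        "host": parts.hostname,
        "path": parts.path,
        # parse_qs returns a list per key; each option appears once here.
        "options": {k: v[0] for k, v in parse_qs(parts.query).items()},
    }


if __name__ == "__main__":
    # 10.35.160.125 is the destination host from the error message.
    info = describe_migration_uri(LIVE_MIGRATION_URI, "10.35.160.125")
    for key, value in info.items():
        print(key, "=", value)
```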


Command: ssh 10.35.160.125 mkdir -p /export/instances/bb438456-366b-44bd-8971-8d21e0b36920
Exit code: 255
Stdout: ''
Stderr: 'Host key verification failed.\r\n'
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher Traceback (most recent call last):
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher   File "/usr/lib/python2.6/site-packages/oslo/messaging/rpc/dispatcher.py", line 133, in _dispatch_and_reply
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher     incoming.message))
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher   File "/usr/lib/python2.6/site-packages/oslo/messaging/rpc/dispatcher.py", line 176, in _dispatch
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher     return self._do_dispatch(endpoint, method, ctxt, args)
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher   File "/usr/lib/python2.6/site-packages/oslo/messaging/rpc/dispatcher.py", line 122, in _do_dispatch
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher     result = getattr(endpoint, method)(ctxt, **new_args)
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher   File "/usr/lib/python2.6/site-packages/nova/exception.py", line 88, in wrapped
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher     payload)
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher   File "/usr/lib/python2.6/site-packages/nova/openstack/common/excutils.py", line 68, in __exit__
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher     six.reraise(self.type_, self.value, self.tb)
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher   File "/usr/lib/python2.6/site-packages/nova/exception.py", line 71, in wrapped
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher     return f(self, context, *args, **kw)
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher   File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 274, in decorated_function
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher     pass
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher   File "/usr/lib/python2.6/site-packages/nova/openstack/common/excutils.py", line 68, in __exit__
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher     six.reraise(self.type_, self.value, self.tb)
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher   File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 260, in decorated_function
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher     return function(self, context, *args, **kwargs)
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher   File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 327, in decorated_function
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher     function(self, context, *args, **kwargs)
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher   File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 248, in decorated_function
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher     migration.instance_uuid, exc_info=True)
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher   File "/usr/lib/python2.6/site-packages/nova/openstack/common/excutils.py", line 68, in __exit__
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher     six.reraise(self.type_, self.value, self.tb)
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher   File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 235, in decorated_function
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher     return function(self, context, *args, **kwargs)
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher   File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 303, in decorated_function
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher     e, sys.exc_info())
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher   File "/usr/lib/python2.6/site-packages/nova/openstack/common/excutils.py", line 68, in __exit__
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher     six.reraise(self.type_, self.value, self.tb)
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher   File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 290, in decorated_function
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher     return function(self, context, *args, **kwargs)
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher   File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 3459, in resize_instance
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher     block_device_info)
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher   File "/usr/lib/python2.6/site-packages/nova/virt/libvirt/driver.py", line 4980, in migrate_disk_and_power_off
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher     utils.execute('ssh', dest, 'mkdir', '-p', inst_base)
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher   File "/usr/lib/python2.6/site-packages/nova/utils.py", line 165, in execute
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher     return processutils.execute(*cmd, **kwargs)
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher   File "/usr/lib/python2.6/site-packages/nova/openstack/common/processutils.py", line 193, in execute
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher     cmd=' '.join(cmd))
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher ProcessExecutionError: Unexpected error while running command.
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher Command: ssh 10.35.160.125 mkdir -p /export/instances/bb438456-366b-44bd-8971-8d21e0b36920
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher Exit code: 255
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher Stdout: ''
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher Stderr: 'Host key verification failed.\r\n'
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher 
2014-10-09 18:23:26.576 26527 ERROR oslo.messaging._drivers.common [-] Returning exception Unexpected error while running command.
Command: ssh 10.35.160.125 mkdir -p /export/instances/bb438456-366b-44bd-8971-8d21e0b36920
:

Comment 1 Dafna Ron 2014-10-09 15:56:42 UTC
A workaround would be to ssh as user root from the source compute node to the destination compute node.

Comment 3 Dafna Ron 2014-10-21 11:13:38 UTC
I am testing migration on rhos4 (for upgrade) and the workaround will not work.

You need to manually scp the public keys so that no password is required when running ssh.

scp $HOME/.ssh/id_rsa.pub root.address:~/.ssh/authorized_keys

Please note that you need to scp the key to the storage server as well as to the computes (so from both computes to the storage and to each other).
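
One caveat: copying id_rsa.pub directly onto ~/.ssh/authorized_keys replaces any keys already present on the target, which matters when several hosts each push their key to the same machine. A minimal sketch of an append-only alternative follows. The function name merge_authorized_keys is hypothetical, and it works on file contents as strings so the logic can be checked without touching a real host. The standard OpenSSH ssh-copy-id utility serves a similar purpose.

```python
def merge_authorized_keys(existing: str, new_key: str) -> str:
    """Return authorized_keys content with new_key appended if absent.

    existing: current content of the target's authorized_keys file.
    new_key:  content of the source host's id_rsa.pub (one line).
    """
    new_key = new_key.strip()
    # Keep existing non-empty lines in their original order.
    lines = [line.strip() for line in existing.splitlines() if line.strip()]
    if new_key and new_key not in lines:
        lines.append(new_key)
    return ("\n".join(lines) + "\n") if lines else ""
```

Calling it once per source host keeps every key, and calling it twice with the same key leaves the content unchanged.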

Comment 4 Dafna Ron 2014-10-23 15:07:26 UTC
I installed rhel7, configured ssh in packstack, and ssh'ed from the hosts to each other to save the key, but the workaround does not work.

So this is not working in rhel7 rhos5.

Comment 6 Marko Myllynen 2014-11-28 14:43:04 UTC
(In reply to Dafna Ron from comment #4)
> I installed rhel7 and configured ssh in packstack+ssh'ed from the hosts to
> each other to save the key but the workaround does not work. 
> 
> So this is not working in rhel7 rhos5.

I'm seeing this also with CONFIG_NOVA_COMPUTE_MIGRATE_PROTOCOL=tcp when trying to migrate volume backed instances.

The above-mentioned workaround doesn't work because the error is for the nova user, not for the root user. So (preferably) packstack should set up ~nova/.ssh/known_hosts properly on each compute node or (less preferably) the nova user should use -o StrictHostKeyChecking=no.

But even if ~nova/.ssh/known_hosts is set up properly, or -o StrictHostKeyChecking=no is used, the next failure will be "This account is currently not available", because nova's shell is /sbin/nologin, which prevents nova's login attempts. If nova's shell is changed to /bin/bash, migration works as expected (but this of course has security implications).
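
The two failure stages can be told apart from the stderr of the ssh command alone. The following is a rough Python sketch, not part of nova or packstack; the function name classify_ssh_failure and the returned hint strings are hypothetical, while the two matched messages are the ones quoted in this bug.

```python
def classify_ssh_failure(exit_code: int, stderr: str) -> str:
    """Map the ssh result seen by nova to the likely stage of failure."""
    if exit_code == 0:
        return "ok"
    if "Host key verification failed" in stderr:
        # First stage: the destination's host key is not trusted by nova.
        return "host key not trusted: check ~nova/.ssh/known_hosts"
    if "This account is currently not available" in stderr:
        # Second stage: the host key is accepted, but the login is refused.
        return "login refused: nova's shell is /sbin/nologin"
    return "unclassified"


if __name__ == "__main__":
    # Values taken from the traceback in the description.
    print(classify_ssh_failure(255, "Host key verification failed.\r\n"))
```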

Thanks.

Comment 8 Marko Myllynen 2014-11-28 14:49:34 UTC
Upstream blueprint for a proper solution is at https://review.openstack.org/#/c/85877/.

Comment 9 Martin Magr 2015-10-23 10:42:02 UTC
Packstack definitely should not touch ssh setting and user's console.

Comment 10 Marko Myllynen 2015-10-23 11:30:44 UTC
(In reply to Martin Magr from comment #9)
> Packstack definitely should not touch ssh setting

Well, during the installation ~nova/.ssh/id_rsa and ~nova/.ssh/authorized_keys are already being created, so why would touching ~nova/.ssh/known_hosts be such a bad thing after creating those two files?

> and user's console.

No, it shouldn't; as mentioned, it would have implications.

So if we are going to leave this regression unfixed, should we document it somewhere? More than one person has already wasted time troubleshooting this.

Thanks.

Comment 11 Martin Magr 2015-10-27 11:10:49 UTC
You are misusing the word regression. Since Packstack was never modifying known_hosts, it cannot be a regression. Nevertheless, with known_hosts you are right, we could add such a feature.

Comment 13 Marko Myllynen 2015-10-27 11:15:22 UTC
(In reply to Martin Magr from comment #11)
> You are misusing the word regression. Since Packstack was never modifying
> known_hosts it cannot be regression.

I was referring to https://bugzilla.redhat.com/show_bug.cgi?id=1151126#c5, not the lack of known_hosts.

> Nevertheless with known_hosts you are right, we could add such feature.

Thanks.

Comment 14 Marko Myllynen 2015-10-29 12:38:46 UTC
There's also some related discussion at:

https://bugzilla.redhat.com/show_bug.cgi?id=1028186

Thanks.

Comment 15 Marko Myllynen 2015-11-11 13:22:37 UTC
Also somewhat related:

https://bugzilla.redhat.com/show_bug.cgi?id=1221776

Thanks.

Comment 17 Ivan Chavero 2016-01-26 03:40:20 UTC
It's documented that, in order to use live migration, the user has to manually modify the nova user entry in /etc/passwd:

**CONFIG_NOVA_COMPUTE_MIGRATE_PROTOCOL**
    Protocol used for instance migration. Valid options are: tcp and ssh. Note that by default, the Compute user is created with the /sbin/nologin shell so that the SSH protocol will not work. To make the SSH protocol work, you must configure the Compute user on compute hosts manually. ['tcp', 'ssh']


If the shared storage is properly configured and the nova user is properly modified, live migration works as expected (tested on RHEL 7.2 on Liberty).

Packstack does its work OK, so I'm closing this bug.
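
Whether the nova user has been modified can be checked from its /etc/passwd entry. The sketch below is a Python illustration only: the helper names, the set of shells treated as non-login, and the sample passwd lines are assumptions, not taken from Packstack.

```python
# Shells assumed to block interactive and ssh command logins.
NOLOGIN_SHELLS = {"/sbin/nologin", "/usr/sbin/nologin", "/bin/false"}


def login_shell(passwd_line: str) -> str:
    """Return the shell field (7th, colon-separated) of a passwd entry."""
    fields = passwd_line.rstrip("\n").split(":")
    if len(fields) != 7:
        raise ValueError("expected 7 colon-separated fields in passwd entry")
    return fields[6]


def can_log_in(passwd_line: str) -> bool:
    """True if the entry's shell is not one of the known non-login shells."""
    return login_shell(passwd_line) not in NOLOGIN_SHELLS


if __name__ == "__main__":
    # Illustrative entry; the real one can be read with `getent passwd nova`.
    sample = "nova:x:162:162:OpenStack Nova Daemons:/var/lib/nova:/sbin/nologin"
    print(login_shell(sample), can_log_in(sample))
```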

