Created attachment 945379 [details]
logs

Description of problem:
I configured shared storage for nova and failed migration with the following error:

Error: Failed to launch instance "Dafna1": Please try again later [Error: Unexpected error while running command.
Command: ssh 10.35.160.125 mkdir -p /export/instances/bb438456-366b-44bd-8971-8d21e0b36920
Exit code: 255
Stdout: ''
Stderr: 'Host key verification failed.\r\n'].

Version-Release number of selected component (if applicable):
openstack-packstack-2014.1.1-0.41.dev1251.el6ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. install a multi-compute setup using packstack and configure: CONFIG_NOVA_COMPUTE_MIGRATE_PROTOCOL=ssh
2. configure shared storage for nova
3. try to migrate an instance

Actual results:
we fail to migrate the instance with Host key verification failed error

Expected results:
we should succeed to migrate since ssh migration is configured by packstack

Additional info:
CONFIG_NOVA_COMPUTE_MIGRATE_PROTOCOL=ssh

[root@stripe ~]# grep live_migration_uri -R /etc/nova
/etc/nova/nova.conf:#live_migration_uri=qemu+tcp://%s/system
/etc/nova/nova.conf:live_migration_uri=qemu+ssh://nova@%s/system?no_verify=1&keyfile=/etc/nova/ssh/nova_migration_key
[root@stripe ~]#

Command: ssh 10.35.160.125 mkdir -p /export/instances/bb438456-366b-44bd-8971-8d21e0b36920
Exit code: 255
Stdout: ''
Stderr: 'Host key verification failed.\r\n'
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher Traceback (most recent call last):
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher   File "/usr/lib/python2.6/site-packages/oslo/messaging/rpc/dispatcher.py", line 133, in _dispatch_and_reply
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher     incoming.message))
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher   File "/usr/lib/python2.6/site-packages/oslo/messaging/rpc/dispatcher.py", line 176, in _dispatch
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher     return self._do_dispatch(endpoint, method, ctxt, args)
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher   File "/usr/lib/python2.6/site-packages/oslo/messaging/rpc/dispatcher.py", line 122, in _do_dispatch
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher     result = getattr(endpoint, method)(ctxt, **new_args)
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher   File "/usr/lib/python2.6/site-packages/nova/exception.py", line 88, in wrapped
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher     payload)
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher   File "/usr/lib/python2.6/site-packages/nova/openstack/common/excutils.py", line 68, in __exit__
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher     six.reraise(self.type_, self.value, self.tb)
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher   File "/usr/lib/python2.6/site-packages/nova/exception.py", line 71, in wrapped
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher     return f(self, context, *args, **kw)
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher   File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 274, in decorated_function
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher     pass
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher   File "/usr/lib/python2.6/site-packages/nova/openstack/common/excutils.py", line 68, in __exit__
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher     six.reraise(self.type_, self.value, self.tb)
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher   File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 260, in decorated_function
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher     return function(self, context, *args, **kwargs)
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher   File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 327, in decorated_function
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher     function(self, context, *args, **kwargs)
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher   File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 248, in decorated_function
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher     migration.instance_uuid, exc_info=True)
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher   File "/usr/lib/python2.6/site-packages/nova/openstack/common/excutils.py", line 68, in __exit__
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher     six.reraise(self.type_, self.value, self.tb)
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher   File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 235, in decorated_function
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher     return function(self, context, *args, **kwargs)
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher   File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 303, in decorated_function
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher     e, sys.exc_info())
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher   File "/usr/lib/python2.6/site-packages/nova/openstack/common/excutils.py", line 68, in __exit__
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher     six.reraise(self.type_, self.value, self.tb)
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher   File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 290, in decorated_function
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher     return function(self, context, *args, **kwargs)
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher   File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 3459, in resize_instance
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher     block_device_info)
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher   File "/usr/lib/python2.6/site-packages/nova/virt/libvirt/driver.py", line 4980, in migrate_disk_and_power_off
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher     utils.execute('ssh', dest, 'mkdir', '-p', inst_base)
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher   File "/usr/lib/python2.6/site-packages/nova/utils.py", line 165, in execute
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher     return processutils.execute(*cmd, **kwargs)
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher   File "/usr/lib/python2.6/site-packages/nova/openstack/common/processutils.py", line 193, in execute
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher     cmd=' '.join(cmd))
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher ProcessExecutionError: Unexpected error while running command.
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher Command: ssh 10.35.160.125 mkdir -p /export/instances/bb438456-366b-44bd-8971-8d21e0b36920
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher Exit code: 255
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher Stdout: ''
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher Stderr: 'Host key verification failed.\r\n'
2014-10-09 18:23:26.573 26527 TRACE oslo.messaging.rpc.dispatcher
2014-10-09 18:23:26.576 26527 ERROR oslo.messaging._drivers.common [-] Returning exception Unexpected error while running command.
Command: ssh 10.35.160.125 mkdir -p /export/instances/bb438456-366b-44bd-8971-8d21e0b36920 :
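For reference, the live_migration_uri from the nova.conf output can be split into its parts to see which user and options it carries. This is only a rough sketch using Python's standard urllib.parse; the helper name describe_migration_uri is hypothetical and the host is the one from the error message. Judging by the traceback, the failing command is a plain `ssh <dest> mkdir -p ...` issued from migrate_disk_and_power_off, so none of these URI options appear to be applied to it.

```python
from urllib.parse import urlsplit, parse_qs

# URI template copied from the nova.conf grep output in this report.
LIVE_MIGRATION_URI = (
    "qemu+ssh://nova@%s/system"
    "?no_verify=1&keyfile=/etc/nova/ssh/nova_migration_key"
)


def describe_migration_uri(template, host):
    """Hypothetical helper: split a libvirt-style URI into readable parts."""
    parts = urlsplit(template % host)
    query = parse_qs(parts.query)
    return {
        "transport": parts.scheme,   # e.g. "qemu+ssh"
        "user": parts.username,      # user the connection is made as
        "host": parts.hostname,
        "path": parts.path,
        # parse_qs returns lists; keep the first value of each option.
        "options": {key: values[0] for key, values in query.items()},
    }


if __name__ == "__main__":
    info = describe_migration_uri(LIVE_MIGRATION_URI, "10.35.160.125")
    for name, value in info.items():
        print(name, "=", value)
```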
A workaround would be to ssh as user root from the source compute to the destination compute.
I am testing migration on rhos4 (for upgrade) and the workaround will not work. You need to manually scp the public keys so that no password will be required when running ssh:

scp $HOME/.ssh/id_rsa.pub root.address:~/.ssh/authorized_keys

Please note that you need to scp that to the storage server as well as the computes (so from both computes to the storage and to each other).
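One caveat with copying id_rsa.pub directly onto authorized_keys is that scp replaces the target file, so with several source hosts only the last key copied would remain. A minimal sketch of appending a key instead, assuming it is run on the receiving host with the public key line already at hand; add_authorized_key is a hypothetical helper and the key strings are placeholders, not real keys:

```python
import os
import tempfile


def add_authorized_key(pubkey_line, authorized_keys_path):
    """Append a public key line unless it is already present.

    Returns True if the key was added, False if it was already there.
    """
    key = pubkey_line.strip()
    content = ""
    if os.path.exists(authorized_keys_path):
        with open(authorized_keys_path) as f:
            content = f.read()
    if key in (line.strip() for line in content.splitlines()):
        return False
    parent = os.path.dirname(authorized_keys_path)
    if parent:
        # sshd normally expects ~/.ssh to be private to its owner.
        os.makedirs(parent, mode=0o700, exist_ok=True)
    with open(authorized_keys_path, "a") as f:
        # Avoid gluing the new key onto an unterminated last line.
        if content and not content.endswith("\n"):
            f.write("\n")
        f.write(key + "\n")
    os.chmod(authorized_keys_path, 0o600)
    return True


if __name__ == "__main__":
    # Demo against a throwaway directory rather than a real ~/.ssh.
    demo = os.path.join(tempfile.mkdtemp(), ".ssh", "authorized_keys")
    print(add_authorized_key("ssh-rsa AAAAplaceholder root@compute1", demo))
    print(add_authorized_key("ssh-rsa AAAAplaceholder root@compute1", demo))
```

Where it is installed, ssh-copy-id does the same kind of append from the sending side.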
I installed RHEL 7, configured ssh in packstack, and ssh'ed from the hosts to each other to save the key, but the workaround does not work.

So this is not working on RHEL 7 with RHOS 5.
(In reply to Dafna Ron from comment #4)
> I installed rhel7 and configured ssh in packstack+ssh'ed from the hosts to
> each other to save the key but the workaround does not work.
>
> So this is not working in rhel7 rhos5.

I'm seeing this also with CONFIG_NOVA_COMPUTE_MIGRATE_PROTOCOL=tcp when trying to migrate volume-backed instances.

The above-mentioned workaround doesn't work because the error is for the nova user, not for the root user. So (preferably) packstack should set up ~nova/.ssh/known_hosts properly on each compute node or (less preferably) the nova user should use -o StrictHostKeyChecking=no.

But even if ~nova/.ssh/known_hosts is set up properly, or -o StrictHostKeyChecking=no is used, the next failure will be "This account is currently not available", because nova's shell is /sbin/nologin, which prevents nova's login attempts.

If nova's shell is changed to /bin/bash, migration works as expected (but this of course has security implications).

Thanks.
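To make the two host-key options concrete, here is a small sketch of how the ssh argument list would differ in each case. It is not nova's code: build_ssh_command is a hypothetical helper, and only the OpenSSH options UserKnownHostsFile and StrictHostKeyChecking are real. The instance path in the demo is the one from the traceback.

```python
def build_ssh_command(dest, remote_argv, known_hosts_file=None, strict=True):
    """Hypothetical helper: build an ssh argv for a remote command.

    known_hosts_file -- optional explicit known_hosts path (preferred fix:
                        keep checking on, but make sure the file is populated)
    strict           -- False maps to StrictHostKeyChecking=no (less preferred,
                        since it disables host key verification)
    """
    cmd = ["ssh"]
    if known_hosts_file:
        cmd += ["-o", "UserKnownHostsFile=%s" % known_hosts_file]
    cmd += ["-o", "StrictHostKeyChecking=%s" % ("yes" if strict else "no")]
    cmd.append(dest)
    cmd.extend(remote_argv)
    return cmd


if __name__ == "__main__":
    remote = ["mkdir", "-p",
              "/export/instances/bb438456-366b-44bd-8971-8d21e0b36920"]
    # Preferred: verification stays on, using nova's own known_hosts file.
    print(" ".join(build_ssh_command(
        "10.35.160.125", remote,
        known_hosts_file="/var/lib/nova/.ssh/known_hosts")))  # assumed path
    # Less preferred: verification switched off.
    print(" ".join(build_ssh_command("10.35.160.125", remote, strict=False)))
```

Neither variant helps with the /sbin/nologin failure described above; that one is about the account's shell, not about host keys.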
Upstream blueprint for a proper solution is at https://review.openstack.org/#/c/85877/.
Packstack definitely should not touch ssh setting and user's console.
(In reply to Martin Magr from comment #9)
> Packstack definitely should not touch ssh setting

Well, during the installation ~nova/.ssh/id_rsa and ~nova/.ssh/authorized_keys are already being created, so why would touching ~nova/.ssh/known_hosts be such a bad thing after creating those two files?

> and user's console.

No, it shouldn't; as mentioned, it would have implications.

So if we are going to leave this regression unfixed, should we document it somewhere? More than one person has already wasted time troubleshooting this.

Thanks.
You are misusing the word regression. Since Packstack was never modifying known_hosts, it cannot be a regression. Nevertheless, you are right about known_hosts; we could add such a feature.
(In reply to Martin Magr from comment #11)
> You are misusing the word regression. Since Packstack was never modifying
> known_hosts it cannot be regression.

I was referring to https://bugzilla.redhat.com/show_bug.cgi?id=1151126#c5, not the lack of known_hosts.

> Nevertheless with known_hosts you are right, we could add such feature.

Thanks.
There's also some related discussion at: https://bugzilla.redhat.com/show_bug.cgi?id=1028186 Thanks.
Also somewhat related: https://bugzilla.redhat.com/show_bug.cgi?id=1221776 Thanks.
It's documented that, in order to use live migration, the user has to manually modify the nova user entry in /etc/passwd:

**CONFIG_NOVA_COMPUTE_MIGRATE_PROTOCOL**
    Protocol used for instance migration. Valid options are: tcp and ssh. Note that by default, the Compute user is created with the /sbin/nologin shell so that the SSH protocol will not work. To make the SSH protocol work, you must configure the Compute user on compute hosts manually. ['tcp', 'ssh']

If the shared storage is properly configured and the nova user is properly modified, live migration works as expected (tested on RHEL 7.2 with Liberty).

Packstack does its work OK, so I'm closing this bug.
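A quick way to check whether the Compute user still has a non-login shell is to look at the last field of its /etc/passwd entry. The sketch below assumes the standard seven-field passwd format; the helper names and the sample entry are illustrative only and not taken from a real host.

```python
# Shells that commonly mean "no interactive or ssh login" (assumed list).
NON_LOGIN_SHELLS = {"/sbin/nologin", "/usr/sbin/nologin", "/bin/false"}


def login_shell(passwd_text, user):
    """Return the shell field for `user` from passwd-format text, or None."""
    for line in passwd_text.splitlines():
        fields = line.split(":")
        # name:password:uid:gid:gecos:home:shell
        if len(fields) == 7 and fields[0] == user:
            return fields[6]
    return None


def can_ssh_login(passwd_text, user):
    """True if the user exists and its shell is not a known non-login shell."""
    shell = login_shell(passwd_text, user)
    return shell is not None and shell not in NON_LOGIN_SHELLS


if __name__ == "__main__":
    # Made-up sample entry; on a real host read /etc/passwd instead.
    sample = "nova:x:1001:1001:Compute user:/var/lib/nova:/sbin/nologin\n"
    print(login_shell(sample, "nova"))
    print(can_ssh_login(sample, "nova"))
```

On a compute host the same check could be run against the contents of /etc/passwd before attempting an ssh-based migration.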