Bug 1221776
Summary: | nova migrate fails with ssh command failure | |||
---|---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Sean Toner <stoner> | |
Component: | rhosp-director | Assignee: | Sven Anderson <svanders> | |
Status: | CLOSED DUPLICATE | QA Contact: | Shai Revivo <srevivo> | |
Severity: | high | Docs Contact: | ||
Priority: | urgent | |||
Version: | 7.0 (Kilo) | CC: | berrange, dasmith, eglynn, emacchi, hbrock, jcoufal, jdonohue, jprovazn, jschluet, jslagle, kchamart, mbooth, mburns, mkrcmari, myllynen, pablo.iranzo, racedoro, rhel-osp-director-maint, sbauza, sclewis, sferdjao, sgordon, srevivo, vromanso | |
Target Milestone: | ga | Keywords: | Triaged | |
Target Release: | 10.0 (Newton) | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | Doc Type: | Bug Fix | ||
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 1240356 (view as bug list) | Environment: | ||
Last Closed: | 2016-11-14 20:47:32 UTC | Type: | Bug | |
Bug Blocks: | 1156010, 1198809, 1241501, 1243520, 1258302 |
Description
Sean Toner
2015-05-14 19:36:13 UTC
Looks to me like the target host key isn't in /etc/ssh/ssh_known_hosts. Unfortunately there are a couple of actions which require compute hosts to be able to ssh directly between each other, and this is one of them. This requires host keys to be propagated to ssh_known_hosts on all nova computes which might have to communicate, so it is safest to do all of them. It also requires ssh keys to have been configured correctly in /var/lib/nova/.ssh on all hosts. Packstack will do this at installation time for all hosts it installs. However, it obviously can't do it for hosts installed subsequently. My guess is that this host was installed later, and its keys haven't been propagated. You'll need to do it outside the control of packstack.

From comment 3 above it almost sounds as if this bug is a design flaw and not a bug per se. But it does bring up a good point as to the expectation of what "nova migrate <instance id>" should be capable of doing, and maybe better error handling when it can't do it due to permissions, so that it's obvious to the operator what to do to make it work.

I agree that the error message is poor, but I don't necessarily think it's a design issue (that level of error handling is typical). When you do a libvirt migration using ssh, each host must be able to talk to the other without keys or passwords needing to be exchanged. Each compute host would need a key generated, then the public keys copied to every other compute host. Nova can't really orchestrate this sort of thing - maybe the director or packstack can. Deploying N compute nodes is part of the director workflow. Adding a new compute node to the openstack deployment is certainly something the director would handle (and would need more key distribution once complete!), so I'm moving this to the director.
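As a rough aid for the diagnosis above, the following sketch checks which compute hosts have no entry in the body of an ssh_known_hosts file. It is only an illustration: the function name and the host names in the usage note are hypothetical, and it matches plain host names exactly, so hashed entries, wildcards and [host]:port patterns are not handled.

```python
def missing_known_hosts(known_hosts_text, compute_hosts):
    """Return the compute hosts that have no plain entry in known_hosts_text.

    Simplifying assumption: only exact, unhashed host names are matched.
    """
    seen = set()
    for line in known_hosts_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        # Drop an optional marker such as @cert-authority or @revoked.
        if fields[0].startswith("@"):
            fields = fields[1:]
        if not fields:
            continue
        # The first field is a comma-separated list of host names/addresses.
        for name in fields[0].split(","):
            seen.add(name)
    return [host for host in compute_hosts if host not in seen]
```

For example, given a file body that lists compute-0 and compute-1, calling missing_known_hosts(body, ["compute-0", "compute-1", "compute-2"]) reports only compute-2, which would be the host whose key still needs propagating.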
Proper SSH user configuration is not automated for now in RDO director; the process of setting up SSH on compute nodes will be part of the documentation. The doc patch is here: https://review.gerrithub.io/#/c/236817/

Per PM, moving this to A1.

Cloned bug 1240356 created on Docs team to make sure this is documented.

*** Bug 1156000 has been marked as a duplicate of this bug. ***

There are two steps to allow SSH compute migration:
* Generate a dedicated keypair for the Nova Compute service, probably not with Puppet but in TripleO tools (during bootstrap).
* Configure Puppet (puppet-nova: ::nova SSH parameters) with the content of the keys, on compute manifests.
puppet-nova will prepare the /var/lib/nova/.ssh directories with the keys and configure libvirt migration automatically. After that, you will be able to migrate instances.

Emilien, agree, we should fix this. Can you reassign appropriately? Thanks.

*** This bug has been marked as a duplicate of bug 1267598 ***

Dup -- QE will decide about automating the original
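The second of the two steps described in this thread (handing the key contents to the ::nova class) could be sketched roughly as follows. This is a hypothetical helper, not part of TripleO or puppet-nova: the parameter names nova::nova_public_key and nova::nova_private_key and their type/key hash layout are assumptions that should be checked against the puppet-nova version in use.

```python
def nova_ssh_params(public_key_line, private_key_pem):
    """Build a dict of assumed ::nova SSH parameters from a generated keypair.

    public_key_line is the single line from the .pub file, e.g.
    "ssh-rsa AAAA... comment"; private_key_pem is the private key text.
    """
    fields = public_key_line.split()
    if len(fields) < 2:
        raise ValueError("expected '<type> <base64-key> [comment]'")
    key_type, key_body = fields[0], fields[1]
    return {
        # Assumed parameter names; verify against your puppet-nova release.
        "nova::nova_public_key": {"type": key_type, "key": key_body},
        "nova::nova_private_key": {"type": key_type, "key": private_key_pem},
    }
```

The resulting dict could then be serialized into whatever data source the compute manifests read from; generating the keypair itself (the first step) is left to the bootstrap tooling.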