Bug 1564445

Summary: live migration broken when live_migration_inbound_addr is set and transport = ssh
Product: Red Hat OpenStack
Reporter: Sven Michels <svmichel>
Component: puppet-nova
Assignee: Ollie Walsh <owalsh>
Status: CLOSED ERRATA
QA Contact: OSP DFG:Compute <osp-dfg-compute>
Severity: high
Docs Contact:
Priority: high
Version: 12.0 (Pike)
CC: awaugama, berrange, dasmith, eglynn, jhakimra, jjoyce, jschluet, kchamart, lyarwood, nlevinki, owalsh, roxenham, sbauza, sgordon, slinaber, srevivo, stephenfin, tvignaud, vromanso
Target Milestone: Upstream M2
Keywords: Triaged
Target Release: 14.0 (Rocky)
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: puppet-nova-13.1.1-0.20180709142740.fa5ce48.el7ost
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
: 1576750 1576751
Environment:
Last Closed: 2019-01-11 11:49:20 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1576750, 1576751

Description Sven Michels 2018-04-06 10:09:12 UTC
Description of problem:

We wanted to run live migration over our storage interface.

To get this working, we set the node's storage IP as live_migration_inbound_addr in the [libvirt] section of nova.conf, using the template setting:

nova::migration::libvirt::live_migration_inbound_addr

As long as live_migration_inbound_addr is *NOT* set, the live_migration_uri defaults to:
qemu+ssh://nova_migration@%s:2022/system?keyfile=/etc/nova/migration/identity
This default is set by puppet (modules/nova/manifests/migration/libvirt.pp), but the same module unsets the URI when live_migration_inbound_addr is set:
if is_service_default($live_migration_inbound_addr) {
  $live_migration_uri = "qemu+${transport_real}://${prefix}%s${postfix}/system${extra_params}"
  $live_migration_scheme = $::os_service_default
} else {
  $live_migration_uri = $::os_service_default
  $live_migration_scheme = $transport_real
}

Since live_migration_uri is deprecated, this might be okay. But it breaks live migration, because the ssh configuration needed to migrate is then missing.

The ssh config for the nova user is written (correctly) to /var/lib/nova/.ssh/config, but live migration is run by root in the nova-libvirt container, and root does not have that config.
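
A minimal Python sketch of that branch may make the effect easier to see. It is not the Puppet module itself: the function name migration_settings and the SERVICE_DEFAULT placeholder are assumptions for illustration, and the default arguments are simply read off the default URI quoted above.

```python
# Sketch only: mirrors the if/else quoted from libvirt.pp, not the real module.
SERVICE_DEFAULT = "<SERVICE DEFAULT>"  # stand-in for $::os_service_default


def migration_settings(inbound_addr=SERVICE_DEFAULT,
                       transport="ssh",
                       prefix="nova_migration@",
                       postfix=":2022",
                       extra_params="?keyfile=/etc/nova/migration/identity"):
    """Return (live_migration_uri, live_migration_scheme) for a given inbound address."""
    if inbound_addr == SERVICE_DEFAULT:
        # No inbound address: the full URI carries user, port and key file.
        uri = f"qemu+{transport}://{prefix}%s{postfix}/system{extra_params}"
        scheme = SERVICE_DEFAULT
    else:
        # Inbound address set: only the scheme survives, so user, port and
        # key file are no longer part of the nova configuration.
        uri = SERVICE_DEFAULT
        scheme = transport
    return uri, scheme


if __name__ == "__main__":
    print(migration_settings())              # URI set, scheme left at default
    print(migration_settings("192.0.2.10"))  # hypothetical storage IP: URI unset
```

In the second call only the scheme "ssh" is left, which matches the observation that the connection then falls back to port 22 unless an ssh config supplies the missing options.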


Version-Release number of selected component (if applicable):
puppet-nova-11.4.0-2

How reproducible:
Deploy without live_migration_inbound_addr set and live migration works (using the default interface).

Set live_migration_inbound_addr to another IP of the node during deployment, using nova::migration::libvirt::live_migration_inbound_addr, and live migration breaks. nova-compute.log shows ssh errors (connection failed, permission denied), and the error shows that it tries to use port 22 (or shows no port at all, which defaults to 22).


Steps to Reproduce:
1. deploy without live_migration_inbound_addr
2. migrate, works
3. deploy with live_migration_inbound_addr
4. migration fails

Actual results:
Live migration fails after setting live_migration_inbound_addr.

Expected results:
live migration works using the correct IP from live_migration_inbound_addr.


Additional info:
The only change I needed to get it working after the addr was set:
copy /var/lib/nova/.ssh/config to /root/.ssh/config inside the nova-libvirt container.
The config contains the same information as the live_migration_uri would provide, so I think the config is the correct way to set the options. We just need to give the root user the same information.
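
To illustrate why the ssh config can stand in for the URI, the following Python sketch pulls the ssh-relevant parts out of the default live_migration_uri and renders them as an ssh_config-style stanza. The helper names and the address 192.0.2.10 are hypothetical, and the real file generated by the deployment may contain different or additional options.

```python
from urllib.parse import parse_qs, urlsplit

# Default URI template as quoted in the description above.
URI_TEMPLATE = ("qemu+ssh://nova_migration@%s:2022/system"
                "?keyfile=/etc/nova/migration/identity")


def ssh_options_from_uri(uri_template, host):
    """Extract the ssh client options that the libvirt URI carries for one host."""
    parts = urlsplit(uri_template % host)
    keyfile = parse_qs(parts.query).get("keyfile", [None])[0]
    return {
        "Host": host,
        "User": parts.username,
        "Port": parts.port,
        "IdentityFile": keyfile,
    }


def render_ssh_config(options):
    """Format the options as an ssh_config-style stanza (illustrative only)."""
    lines = [f"Host {options['Host']}"]
    for key in ("User", "Port", "IdentityFile"):
        if options[key] is not None:
            lines.append(f"    {key} {options[key]}")
    return "\n".join(lines)


if __name__ == "__main__":
    # 192.0.2.10 is a placeholder for the address in live_migration_inbound_addr.
    print(render_ssh_config(ssh_options_from_uri(URI_TEMPLATE, "192.0.2.10")))
```

Whichever user actually opens the ssh connection needs these options, which is why copying the file to /root/.ssh/config restores migration.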

Comment 8 Ollie Walsh 2018-05-10 10:10:33 UTC
https://review.openstack.org/562818 & https://review.openstack.org/562764 merged to master

Comment 15 errata-xmlrpc 2019-01-11 11:49:20 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:0045