Bug 1310140 - nova-serialproxy enabled causes live migration to fail
Summary: nova-serialproxy enabled causes live migration to fail
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-nova
Version: 7.0 (Kilo)
Hardware: All
OS: Linux
Priority: urgent
Severity: unspecified
Target Milestone: async
Target Release: 7.0 (Kilo)
Assignee: Lee Yarwood
QA Contact: Prasanth Anbalagan
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-02-19 15:01 UTC by Jon Jozwiak
Modified: 2020-12-11 12:04 UTC

Fixed In Version: openstack-nova-2015.1.3-2.el7ost
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-03-24 13:55:05 UTC
Target Upstream Version:
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
Launchpad 1455252 0 None None None 2016-02-23 14:25:19 UTC
OpenStack gerrit 191035 0 None None None 2016-02-23 14:28:16 UTC
Red Hat Product Errata RHBA-2016:0507 0 normal SHIPPED_LIVE openstack-nova bug fix advisory 2016-03-24 17:52:57 UTC

Description Jon Jozwiak 2016-02-19 15:01:34 UTC
Description of problem:
When nova-serialproxy is used, live migration fails due to serial port issues

Version-Release number of selected component (if applicable):
RHEL OSP 7 y2

python-nova-2015.1.2-13.el7ost.noarch
openstack-nova-console-2015.1.2-13.el7ost.noarch
openstack-nova-scheduler-2015.1.2-13.el7ost.noarch
openstack-nova-serialproxy-2015.1.2-13.el7ost.noarch
openstack-nova-cert-2015.1.2-13.el7ost.noarch
openstack-nova-novncproxy-2015.1.2-13.el7ost.noarch
openstack-nova-api-2015.1.2-13.el7ost.noarch
openstack-nova-compute-2015.1.2-13.el7ost.noarch
python-novaclient-2.23.0-2.el7ost.noarch
openstack-nova-common-2015.1.2-13.el7ost.noarch
openstack-nova-conductor-2015.1.2-13.el7ost.noarch
libvirt-1.2.17-13.el7_2.3.x86_64

How reproducible:
Can reproduce every time

Steps to Reproduce:
1. Create an OSP director (7.2) deployment with serialproxy enabled (the serial proxy customization is documented here: https://github.com/jonjozwiak/openstack/tree/master/director-examples/serialproxy)
   In my case I used Cinder backed by NFS. The same issue will exist with any backend.
2. Boot an instance backed by cinder volume
nova boot --flavor 2 --block-device source=image,id=<id of image>,dest=volume,size=10,shutdown=preserve,bootindex=0 \
  myInstanceFromVolume
3. Attempt to migrate instance 
nova list 
nova show myInstanceFromVolume | grep host
nova live-migration <Instance ID> 

Actual results:
Instance does not migrate 

Expected results:
Instance moved to another hypervisor

Additional info:
Below is the /var/log/messages output from the compute node, showing a failure to bind a socket. There is an upstream bug that is identical to this problem:
     https://bugs.launchpad.net/nova/+bug/1455252 

To validate that the issue is specific to the serial console, I edited /etc/nova/nova.conf on the compute and controller nodes, commented out the serial_console settings, and restarted nova on those nodes. I then created a new instance and validated that its standard console was working (nova console-log <instance name> -> validate it gets a response). After that, I ran nova live-migration and the migration worked without problems.

If I revert the change and walk through the process again, the migration fails again.

/var/log/messages error on the compute node:  

Feb 18 17:48:55 overcloud-compute-1 journal: internal error: process exited while connecting to monitor: 2016-02-18T22:48:54.829400Z qemu-kvm: -chardev socket,id=charserial0,host=192.168.20.23,port=10000,server,nowait: Failed to bind socket: Cannot assign requested address
Feb 18 17:48:55 overcloud-compute-1 nova-compute: Traceback (most recent call last):
Feb 18 17:48:55 overcloud-compute-1 nova-compute: File "/usr/lib/python2.7/site-packages/eventlet/hubs/hub.py", line 457, in fire_timers
Feb 18 17:48:55 overcloud-compute-1 nova-compute: timer()
Feb 18 17:48:55 overcloud-compute-1 nova-compute: File "/usr/lib/python2.7/site-packages/eventlet/hubs/timer.py", line 58, in __call__
Feb 18 17:48:55 overcloud-compute-1 nova-compute: cb(*args, **kw)
Feb 18 17:48:55 overcloud-compute-1 nova-compute: File "/usr/lib/python2.7/site-packages/eventlet/event.py", line 168, in _do_send
Feb 18 17:48:55 overcloud-compute-1 nova-compute: waiter.switch(result)
Feb 18 17:48:55 overcloud-compute-1 nova-compute: File "/usr/lib/python2.7/site-packages/eventlet/greenthread.py", line 214, in main
Feb 18 17:48:55 overcloud-compute-1 nova-compute: result = function(*args, **kwargs)
Feb 18 17:48:55 overcloud-compute-1 nova-compute: File "/usr/lib/python2.7/site-packages/nova/utils.py", line 997, in context_wrapper
Feb 18 17:48:55 overcloud-compute-1 nova-compute: return func(*args, **kwargs)
Feb 18 17:48:55 overcloud-compute-1 nova-compute: File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5674, in _live_migration_operation
Feb 18 17:48:55 overcloud-compute-1 nova-compute: instance=instance)
Feb 18 17:48:55 overcloud-compute-1 nova-compute: File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 85, in __exit__
Feb 18 17:48:55 overcloud-compute-1 nova-compute: six.reraise(self.type_, self.value, self.tb)
Feb 18 17:48:55 overcloud-compute-1 nova-compute: File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5643, in _live_migration_operation
Feb 18 17:48:55 overcloud-compute-1 nova-compute: CONF.libvirt.live_migration_bandwidth)
Feb 18 17:48:55 overcloud-compute-1 nova-compute: File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 183, in doit
Feb 18 17:48:55 overcloud-compute-1 nova-compute: result = proxy_call(self._autowrap, f, *args, **kwargs)
Feb 18 17:48:55 overcloud-compute-1 nova-compute: File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 141, in proxy_call
Feb 18 17:48:55 overcloud-compute-1 nova-compute: rv = execute(f, *args, **kwargs)
Feb 18 17:48:55 overcloud-compute-1 nova-compute: File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 122, in execute
Feb 18 17:48:55 overcloud-compute-1 nova-compute: six.reraise(c, e, tb)
Feb 18 17:48:55 overcloud-compute-1 nova-compute: File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 80, in tworker
Feb 18 17:48:55 overcloud-compute-1 nova-compute: rv = meth(*args, **kwargs)
Feb 18 17:48:55 overcloud-compute-1 nova-compute: File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1825, in migrateToURI2
Feb 18 17:48:55 overcloud-compute-1 nova-compute: if ret == -1: raise libvirtError ('virDomainMigrateToURI2() failed', dom=self)
Feb 18 17:48:55 overcloud-compute-1 nova-compute: libvirtError: internal error: process exited while connecting to monitor: 2016-02-18T22:48:54.829400Z qemu-kvm: -chardev socket,id=charserial0,host=192.168.20.23,port=10000,server,nowait: Failed to bind socket: Cannot assign requested address
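
The "Cannot assign requested address" text in the log corresponds to the EADDRNOTAVAIL errno, which a bind() call usually returns when the requested IP address is not configured on the local host. A plausible reading of the log (an assumption, not something the trace itself proves) is that qemu-kvm on the destination was asked to listen on 192.168.20.23, an address that belongs to a different compute node. The minimal sketch below reproduces that class of failure outside of nova; the helper name try_bind and the 192.0.2.1 documentation address are illustrative only.

```python
import errno
import socket


def try_bind(host, port=0):
    """Try to bind a TCP socket to (host, port).

    Returns None on success, otherwise the symbolic errno name
    (for example "EADDRNOTAVAIL"). Port 0 lets the kernel pick a port.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        return None
    except OSError as exc:
        # Fall back to the numeric value if the errno has no symbolic name.
        return errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        sock.close()


if __name__ == "__main__":
    # Loopback is local, so this bind is expected to succeed.
    print("127.0.0.1 ->", try_bind("127.0.0.1"))
    # 192.0.2.1 (TEST-NET-1) is normally not assigned to any interface,
    # so on a typical Linux host this bind is expected to fail.
    print("192.0.2.1 ->", try_bind("192.0.2.1"))
```

Running the same check on the destination compute node with the host= value from the qemu-kvm command line would show whether that address is actually local there.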

Comment 2 Lee Yarwood 2016-02-24 11:47:46 UTC
I've backported the following change to enable the live migration of instances with serial consoles attached:

libvirt: enable live migration with serial console
https://review.openstack.org/191035
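
To illustrate the general idea behind this kind of fix, the sketch below rewrites the listen address of TCP serial devices in a libvirt domain XML so that it points at the destination host before migration. This is not the code from the change above; the function name retarget_serial_listen, the sample XML, and the destination address 192.168.20.24 are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def retarget_serial_listen(domain_xml, dest_addr):
    """Return domain_xml with TCP serial listen addresses set to dest_addr.

    Only <source mode='bind'> elements under <serial type='tcp'> are
    touched; everything else in the document is left as it was.
    """
    root = ET.fromstring(domain_xml)
    for source in root.findall("./devices/serial[@type='tcp']/source"):
        if source.get("mode") == "bind":
            source.set("host", dest_addr)
    return ET.tostring(root, encoding="unicode")


# Hypothetical, heavily trimmed domain XML using the address from the log.
SAMPLE_XML = """<domain type='kvm'>
  <devices>
    <serial type='tcp'>
      <source mode='bind' host='192.168.20.23' service='10000'/>
    </serial>
  </devices>
</domain>"""

if __name__ == "__main__":
    # 192.168.20.24 stands in for the destination compute node's address.
    print(retarget_serial_listen(SAMPLE_XML, "192.168.20.24"))
```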

However, we are still susceptible to serial port collisions if the serial port used by the instance on the source is already in use on the destination. This is still being worked on upstream:

https://review.openstack.org/#/q/topic:refactoring-libvirt
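
As a rough illustration of the collision problem, the sketch below picks the first port in a range that is not already taken on the destination. It is only a toy model under stated assumptions: the function name pick_free_port, the half-open range convention, and the set of in-use ports are hypothetical and do not reflect how nova or the upstream refactoring tracks ports.

```python
def pick_free_port(port_range, in_use):
    """Return the first port in [start, stop) that is not in in_use.

    port_range is a (start, stop) tuple and in_use is a set of ports
    already taken on the destination host. Returns None when every
    port in the range is occupied.
    """
    start, stop = port_range
    for port in range(start, stop):
        if port not in in_use:
            return port
    return None


if __name__ == "__main__":
    # Port 10000 is used by the migrating instance on the source, but
    # another instance on the destination already holds it.
    destination_ports = {10000, 10001}
    print(pick_free_port((10000, 10010), destination_ports))
```

A migration that simply reuses the source port would collide in this scenario, whereas choosing a port from the destination's own free set would avoid it.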

I have opened the following bug to track these changes and look into backporting them; however, I'm not entirely sure whether this will be possible at present.

Serial port collisions can occur when live migrating instances
https://bugzilla.redhat.com/show_bug.cgi?id=1311514

Comment 8 errata-xmlrpc 2016-03-24 13:55:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0507.html

