Red Hat Bugzilla – Bug 1477293
If there is no default route on the compute nodes and ephemeral data is local to the compute nodes, a nova resize fails
Last modified: 2017-08-01 12:35:35 EDT
Description of problem:
- 1 controller
- 2 compute nodes
- ephemeral data configured to be local to the compute nodes.
- during a deployment the compute nic configs have no default gateway defined.
When a resize of an instance is performed, the following happens.
The instance is *seemingly* resized properly on the second compute node, however:
- data is lost: any files created on the ephemeral storage are lost in the resize.
- an abnormal effect I observe is that the original compute node still keeps /var/lib/nova/instance/<name>/<disks>, unlike the case where the default gateway is set up.
The customer root-caused the problem to this:
The problem most likely stems from the fact that our compute nodes do not have any default route.
dest_host is set in _create_migration in nova/compute/resource_tracker.py to self.driver.get_host_ip_addr().
On the libvirt driver this returns CONF.my_ip.
CONF.my_ip is set, presumably at nova startup, using a method in oslo_utils/netutils.py called get_my_ipv4. It tries to create a socket to destination 192.0.2.0, which presupposes that this address is routable. On our compute nodes it is not, so this defaults to 127.0.0.1.
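The detection technique described above can be sketched as follows. This is a simplified illustration of the approach taken by oslo_utils.netutils.get_my_ipv4, not the actual oslo code; the port number is arbitrary because no packet is ever sent:

```python
import socket

def get_my_ipv4():
    # Open a UDP socket "toward" the TEST-NET-1 address 192.0.2.0 and ask
    # the kernel which local address it would use as the source. connect()
    # on a UDP socket sends nothing; it only selects a route and binds a
    # local address.
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
            s.connect(("192.0.2.0", 80))
            return s.getsockname()[0]
    except OSError:
        # With no default route the connect() fails with
        # "Network is unreachable", and the fallback ends up being
        # the loopback address -- exactly the failure mode in this bug.
        return "127.0.0.1"
```

On a host with no default route (and no more specific route covering 192.0.2.0), the except branch is taken and 127.0.0.1 is returned, which nova then records as the node's host IP.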
With dest_host set to 127.0.0.1, the migration/resize performs a local copy.
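Assuming the operator can edit nova.conf on each compute node, a possible workaround is to set the node's real management IP explicitly so that the get_my_ipv4 probe is never consulted (my_ip is a standard nova [DEFAULT] option; the address below is only illustrative):

```ini
[DEFAULT]
# Set this compute node's management IP explicitly; nova then uses this
# value instead of probing for a routable interface via 192.0.2.0.
my_ip = 172.16.0.11
```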
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Set up an OpenStack environment with 1 controller and 2 computes.
2. Configure ephemeral storage local to the compute nodes.
3. Define no default routes in the compute NIC configs (I have uploaded my yaml files for reference in case they are useful; note that my yaml files use Ceph for all the other storage).
4. Resize an instance.
Actual results:
- Ephemeral data is lost.

Expected results:
- Ephemeral data should be copied over to the second compute node rather than the disk being recreated.

Additional info:
- The false positive for a resize could cause serious damage for users running applications like Hadoop that depend heavily on local compute storage.
*** This bug has been marked as a duplicate of bug 1477294 ***