Bug 1477293 - If there is no default route on the compute nodes, ephemeral data local on compute nodes, a nova resize fails
Summary: If there is no default route on the compute nodes, ephemeral data local on co...
Keywords:
Status: CLOSED DUPLICATE of bug 1477294
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-nova
Version: 10.0 (Newton)
Hardware: All
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Eoghan Glynn
QA Contact: Joe H. Rahme
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-08-01 16:26 UTC by Ruchika K
Modified: 2022-08-16 12:47 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-08-01 16:35:35 UTC
Target Upstream Version:
Embargoed:


Attachments


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker OSP-4669 0 None None None 2022-08-16 12:47:35 UTC

Description Ruchika K 2017-08-01 16:26:39 UTC
Description of problem:
Environment:
- 1 controller
- 2 compute nodes
- ephemeral data configured to be local to the compute nodes.
- during a deployment the compute nic configs have no default gateway defined.
  (reference attachment)

When an instance is resized, this is what happens:
The instance is *seemingly* resized properly on the second compute node; however:
- data is lost: any files created on the ephemeral storage are lost in the resize.
- an abnormal effect I observe is that the original compute node still has /var/lib/nova/instance/<name>/<disks> in place, unlike the case when the default gateway is set up.
   
The customer root-caused the problem as follows:

The problem probably has to do with the fact that our compute nodes do not have any default route.
dest_host will be set in _create_migration in nova/compute/resource_tracker.py to self.driver.get_host_ip_addr() 

On the libvirt driver this is set to CONF.my_ip. 

CONF.my_ip is set, presumably at nova startup, using a function in oslo_utils/netutils.py called get_my_ipv4. It tries to create a socket to destination 192.0.2.0. This presupposes that this address is routable. On our compute nodes, it is not, so this defaults to 127.0.0.1.
With dest_host set to 127.0.0.1, the migration/resize will perform a local copy.
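
To illustrate the mechanism described above, here is a minimal Python sketch of a route-based address probe. It is not the oslo.utils source: the function name probe_local_ipv4 and its parameters are hypothetical, and the fallback is simplified to a fixed loopback address, whereas the real library's fallback path may do more. It assumes only the behaviour stated in this report, i.e. that the probe targets 192.0.2.0 and ends up at 127.0.0.1 when that address is not routable.

```python
import socket


def probe_local_ipv4(probe_addr="192.0.2.0", probe_port=80, fallback="127.0.0.1"):
    """Return the source IPv4 address the kernel would use to reach probe_addr.

    Hypothetical helper for illustration only. If no route to probe_addr
    exists (for example, no default route), return the fallback address.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # connect() on a UDP socket sends no packets; it only asks the
        # kernel to select a route and a source address for probe_addr.
        sock.connect((probe_addr, probe_port))
        return sock.getsockname()[0]
    except OSError:
        # Typically "Network is unreachable" when there is no matching route.
        return fallback
    finally:
        sock.close()
```

On a host with a default route, calling probe_local_ipv4() should return the address of the outgoing interface. On a host without one, the connect() call is expected to fail and the function returns 127.0.0.1, which matches the dest_host value described in this report.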

Version-Release number of selected component (if applicable):


How reproducible:
100%

Steps to Reproduce:
1. Set up an OpenStack environment with 1 controller and 2 computes
2. Configure ephemeral storage to be local on the compute nodes
3. Define no default routes in the compute NIC configs (I have uploaded my yaml files for reference in case they are useful. Please note my yaml files use ceph for all the other storage)
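
To confirm the precondition in step 3 on a compute node, a small Python check along the following lines may help. This is a sketch under stated assumptions: the helper names has_default_route and read_route_table are hypothetical, and the parsing assumes the usual Linux /proc/net/route layout (whitespace-separated columns, hexadecimal Destination, Flags and Mask fields, with a header line first).

```python
RTF_GATEWAY = 0x0002  # Linux routing flag: the route goes through a gateway


def has_default_route(route_table_text):
    """Return True if the given /proc/net/route text contains a default route."""
    lines = route_table_text.strip().splitlines()
    for line in lines[1:]:  # skip the header line
        fields = line.split()
        if len(fields) < 8:
            continue
        destination, flags, mask = fields[1], int(fields[3], 16), fields[7]
        # A default IPv4 route has destination 0.0.0.0/0 and the gateway flag set.
        if destination == "00000000" and mask == "00000000" and flags & RTF_GATEWAY:
            return True
    return False


def read_route_table(path="/proc/net/route"):
    """Read the kernel IPv4 routing table as text (Linux only)."""
    with open(path) as handle:
        return handle.read()
```

Running has_default_route(read_route_table()) on each compute node should return False in the environment described here.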
  

Actual results:
- Ephemeral data lost


Expected results:
- Ephemeral data should be copied over to the second compute node rather than a disk being recreated
- The false positive for a resize could cause serious damage for users running applications like Hadoop that depend heavily on local compute storage.

Additional info:

Comment 1 Ruchika K 2017-08-01 16:35:35 UTC

*** This bug has been marked as a duplicate of bug 1477294 ***

