Bug 1477293 - If there is no default route on the compute nodes and ephemeral data is local to the compute nodes, a nova resize fails
Status: CLOSED DUPLICATE of bug 1477294
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-nova
Version: 10.0 (Newton)
Hardware: All Linux
Priority: unspecified
Severity: medium
Assigned To: Eoghan Glynn
QA Contact: Joe H. Rahme
Reported: 2017-08-01 12:26 EDT by Ruchika K
Modified: 2017-08-01 12:35 EDT

Last Closed: 2017-08-01 12:35:35 EDT
Type: Bug
Attachments: None
Description Ruchika K 2017-08-01 12:26:39 EDT
Description of problem:
Environment:
- 1 controller
- 2 compute nodes
- ephemeral data configured to be local to the compute nodes.
- during deployment, the compute NIC configs have no default gateway defined.
  (reference attachment)

When an instance is resized, this is what happens:
The instance is *seemingly* resized properly on the second compute node, however:
- data is lost: any files created on the ephemeral storage are lost in the resize.
- an abnormal effect I observe is that the original compute node still has /var/lib/nova/instance/<name>/<disks> in place, unlike the case where the default gateway is set up.
   
The customer root-caused the problem as follows:

The problem probably has to do with the fact that our compute nodes do not have any default route.
dest_host will be set in _create_migration in nova/compute/resource_tracker.py to self.driver.get_host_ip_addr().

On the libvirt driver this is set to CONF.my_ip. 

CONF.my_ip is set, presumably at nova startup, using a function in oslo_utils/netutils.py called get_my_ipv4. It tries to open a socket to destination 192.0.2.0, which presupposes that this address is routable. On our compute nodes it is not, so this defaults to 127.0.0.1.
With dest_host set to 127.0.0.1, the migration/resize performs a local copy.
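
For illustration, the probe described above can be sketched in a few lines of Python. This is not the oslo.utils source: the function name probe_local_ipv4 and the port number are assumptions made for the example, and the real get_my_ipv4 may try other fallbacks before settling on loopback. The sketch only shows why an unroutable probe address can end in 127.0.0.1.

```python
import socket

LOOPBACK = "127.0.0.1"


def probe_local_ipv4(target="192.0.2.0", port=80):
    """Return the source address the kernel would use to reach target.

    Hypothetical helper for illustration only. Connecting a UDP socket
    sends no packets; it just asks the kernel to pick a route and a
    source address. If no route covers the target, connect() fails and
    this sketch falls back to the loopback address.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.connect((target, port))
        return sock.getsockname()[0]
    except OSError:
        # No route to the probe address (for example, no default route).
        return LOOPBACK
    finally:
        sock.close()
```

On a host with no route to 192.0.2.0, this sketch would be expected to return 127.0.0.1, which matches the dest_host value the customer describes.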

Version-Release number of selected component (if applicable):


How reproducible:
100%

Steps to Reproduce:
1. Set up an OpenStack environment with 1 controller and 2 compute nodes.
2. Configure ephemeral storage to be local to the compute nodes.
3. Define no default route in the compute NIC configs (I have uploaded my yaml files for reference in case they are useful. Please note my yaml files use ceph for all the other storage).
4. Write files to an instance's ephemeral disk, then resize the instance.
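
To confirm the precondition in step 3 on a compute node, one option is to inspect the kernel routing table. The following is a minimal sketch, assuming a Linux host where /proc/net/route is available; the helper name has_default_route is hypothetical. It takes the table as text so it can be checked against saved output as well as a live node.

```python
def has_default_route(route_table_text):
    """Return True if text in /proc/net/route format has a default route.

    Hypothetical helper for illustration. In that format a default route
    is a row whose Destination and Mask columns are both 00000000.
    """
    lines = route_table_text.strip().splitlines()
    for line in lines[1:]:  # the first line is the column header
        fields = line.split()
        if len(fields) < 8:
            continue  # skip malformed or truncated rows
        destination, mask = fields[1], fields[7]
        if destination == "00000000" and mask == "00000000":
            return True
    return False
```

On a node, has_default_route(open("/proc/net/route").read()) should return False when the environment matches the one in this report.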
  

Actual results:
- Ephemeral data lost


Expected results:
- Ephemeral data should be copied over to the second compute node rather than the disk being recreated.
- A resize that falsely reports success could cause serious damage for users running applications such as Hadoop that depend heavily on local compute storage.

Additional info:
Comment 1 Ruchika K 2017-08-01 12:35:35 EDT

*** This bug has been marked as a duplicate of bug 1477294 ***
