Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1098581

Summary: nova: _resize files are not cleaned when we destroy instances after failed resize because of disk space
Product: Red Hat OpenStack
Reporter: Dafna Ron <dron>
Component: openstack-nova
Assignee: Vladan Popovic <vpopovic>
Status: CLOSED UPSTREAM
QA Contact: Ami Jeain <ajeain>
Severity: high
Priority: high
Version: 5.0 (RHEL 7)
CC: dron, ndipanov, rbs.shashank, sgordon, vpopovic, yeylon
Target Milestone: ---
Target Release: 6.0 (Juno)
Hardware: x86_64
OS: Linux
Whiteboard: storage
Doc Type: Bug Fix
Type: Bug
Last Closed: 2014-07-30 15:59:59 UTC
Attachments: logs

Description Dafna Ron 2014-05-16 15:18:54 UTC
Created attachment 896447 [details]
logs

Description of problem:

I configured my setup to use preallocation and started resize actions on instances launched with the tiny flavour (1 GB disk).
The resize of the 8th instance failed because of disk space allocation, and that caused all instances waiting for resize confirmation to fail their resizes as well.
When I destroyed the instances, I noticed that the _resize directories were not deleted, leaving 1 GB disks on the HDD.

I am setting this to high severity since it is basically a storage leak: we are consuming space on the HDD for instances that no longer exist.

Version-Release number of selected component (if applicable):

openstack-nova-compute-2014.1-2.el7ost.noarch

How reproducible:

100%

Steps to Reproduce:
1. Configure nova for preallocation and allow resize on the same host by setting the following in nova.conf:
preallocate_images=space 
allow_resize_to_same_host=True
scheduler_default_filters=AllHostsFilter
then restart the nova services: 
#for i in $(/bin/systemctl -a | awk ' /nova/ { print $1 } '); do systemctl restart $i ; done
2. Launch instances.
3. Resize the instances one by one until a resize fails because the host runs out of disk space.
4. Confirm the resize on all instances that are waiting for resize confirmation.
5. Once a resize has failed, destroy all instances.
6. ls -l /var/lib/nova/instances
7. run: qemu-img info /var/lib/nova/instances/<instance>_resize/disk
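The resize-until-failure part of steps 2-5 can be sketched roughly as below. This is only an illustrative sketch: the instance names and the m1.small target flavor are made up, and it assumes keystone credentials are sourced so the nova CLI works.

```shell
# Hypothetical sketch of the "resize one by one until failure" step.
# Instance names and the m1.small target flavor are assumptions, not
# taken from this bug; requires sourced keystone credentials.
resize_all() {
    # Resize each named instance in turn; stop at the first failure,
    # which is the out-of-disk condition this bug is about.
    for name in "$@"; do
        nova resize "$name" m1.small || {
            echo "resize failed: $name"
            return 1
        }
    done
}
```

After the loop stops, destroying the remaining instances and listing /var/lib/nova/instances shows whether the <instance>_resize directories were left behind.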

Actual results:

The <instance>_resize directories are not deleted along with the disks.

Expected results:

All disks belonging to destroyed instances should be cleaned from the compute node.

Additional info: logs 


[root@orange-vdsf instances(keystone_admin)]# ls -l /var/lib/nova/instances/
total 52
drwxr-xr-x 2 nova nova 4096 May 16 17:34 1a98eefe-ba41-49b7-931a-ebe54796e343
drwxr-xr-x 2 nova nova 4096 May 16 17:33 1a98eefe-ba41-49b7-931a-ebe54796e343_resize
drwxr-xr-x 2 nova nova 4096 May 16 17:32 5f87fd7e-4f85-4c41-ab12-3bf680160281
drwxr-xr-x 2 nova nova 4096 May 16 17:29 5f87fd7e-4f85-4c41-ab12-3bf680160281_resize
drwxr-xr-x 2 nova nova 4096 May 16 17:39 7610109e-b631-40a5-9fa0-1bf150560503
drwxr-xr-x 2 nova nova 4096 May 16 17:38 7610109e-b631-40a5-9fa0-1bf150560503_resize
drwxr-xr-x 2 nova nova 4096 May 16 17:31 a4aefac9-feb8-40d7-9228-1df6a02d9e13
drwxr-xr-x 2 nova nova 4096 May 16 17:29 a4aefac9-feb8-40d7-9228-1df6a02d9e13_resize
drwxr-xr-x 2 nova nova 4096 May 16 13:20 _base
-rw-r--r-- 1 nova nova    0 May 16 17:48 compute_nodes
drwxr-xr-x 2 nova nova 4096 May 16 17:40 d1e28a92-3ba2-402e-a798-ff8ff9a6bd98
drwxr-xr-x 2 nova nova 4096 May 16 17:38 d1e28a92-3ba2-402e-a798-ff8ff9a6bd98_resize
drwxr-xr-x 2 nova nova 4096 May 15 11:09 locks
drwxr-xr-x 2 nova nova 4096 May 15 18:34 snapshots
[root@orange-vdsf instances(keystone_admin)]# 
[root@orange-vdsf instances(keystone_admin)]# ls -l /var/lib/nova/instances/
total 32
drwxr-xr-x 2 nova nova 4096 May 16 17:33 1a98eefe-ba41-49b7-931a-ebe54796e343_resize
drwxr-xr-x 2 nova nova 4096 May 16 17:29 5f87fd7e-4f85-4c41-ab12-3bf680160281_resize
drwxr-xr-x 2 nova nova 4096 May 16 17:38 7610109e-b631-40a5-9fa0-1bf150560503_resize
drwxr-xr-x 2 nova nova 4096 May 16 17:29 a4aefac9-feb8-40d7-9228-1df6a02d9e13_resize
drwxr-xr-x 2 nova nova 4096 May 16 13:20 _base
-rw-r--r-- 1 nova nova    0 May 16 17:48 compute_nodes
drwxr-xr-x 2 nova nova 4096 May 16 17:38 d1e28a92-3ba2-402e-a798-ff8ff9a6bd98_resize
drwxr-xr-x 2 nova nova 4096 May 15 11:09 locks
drwxr-xr-x 2 nova nova 4096 May 15 18:34 snapshots
[root@orange-vdsf instances(keystone_admin)]# qemu-img info /var/lib/nova/instances/1a98eefe-ba41-49b7-931a-ebe54796e343_resize/disk
image: /var/lib/nova/instances/1a98eefe-ba41-49b7-931a-ebe54796e343_resize/disk
file format: raw
virtual size: 1.0G (1073741824 bytes)
disk size: 1.0G
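A quick way to spot the leak in a listing like the one above is to print the _resize directories whose base instance directory is already gone. This is purely a diagnostic sketch, not part of nova; INSTANCES_DIR is a plain shell variable used here for illustration, not a nova option.

```shell
# Diagnostic sketch (not part of nova): print <uuid>_resize directories
# whose matching <uuid> directory no longer exists, i.e. the leaked
# preallocated disks described in this bug.
INSTANCES_DIR=${INSTANCES_DIR:-/var/lib/nova/instances}

find_orphaned_resize_dirs() {
    for d in "$INSTANCES_DIR"/*_resize; do
        [ -d "$d" ] || continue          # glob matched nothing
        base=${d%_resize}
        [ -d "$base" ] || echo "$d"      # base dir is gone -> leaked
    done
}
```

In the second listing above, every remaining _resize directory is orphaned, and each holds a fully preallocated 1 GB raw disk, which is exactly the leaked space.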

Comment 1 Vladan Popovic 2014-06-30 16:22:07 UTC
Sorry, I didn't manage to reproduce this.
Version used: openstack-nova-compute-2014.1-6.el7ost.noarch

I deployed packstack allinone, configured nova to support resize on the same host, and had this when I started:

# ls /var/lib/nova/instances/
_base  compute_nodes  locks

I created a tinysmall flavor so that I don't run out of resources on my laptop:

# nova flavor-show m1.tinysmall
+----------------------------+--------------------------------------+
| Property                   | Value                                |
+----------------------------+--------------------------------------+
| OS-FLV-DISABLED:disabled   | False                                |
| OS-FLV-EXT-DATA:ephemeral  | 0                                    |
| disk                       | 2                                    |
| extra_specs                | {}                                   |
| id                         | 5df35497-34eb-4380-ab75-88a85473add4 |
| name                       | m1.tinysmall                         |
| os-flavor-access:is_public | True                                 |
| ram                        | 1024                                 |
| rxtx_factor                | 1.0                                  |
| swap                       |                                      |
| vcpus                      | 1                                    |
+----------------------------+--------------------------------------+


# nova boot --image cirros --flavor m1.tiny ohai
# ls /var/lib/nova/instances/
_base  compute_nodes  f9a96888-6be7-43a4-80f4-461463fbc201  locks
# nova resize ohai m1.tinysmall
# nova list 
+--------------------------------------+------+--------+---------------+-------------+---------------------+
| ID                                   | Name | Status | Task State    | Power State | Networks            |
+--------------------------------------+------+--------+---------------+-------------+---------------------+
| f9a96888-6be7-43a4-80f4-461463fbc201 | ohai | RESIZE | resize_finish | Running     | public=172.24.4.237 |
+--------------------------------------+------+--------+---------------+-------------+---------------------+

# ls /var/lib/nova/instances/
_base  compute_nodes  f9a96888-6be7-43a4-80f4-461463fbc201  f9a96888-6be7-43a4-80f4-461463fbc201_resize  locks
# nova delete ohai
# ls /var/lib/nova/instances/
_base  compute_nodes  locks


Dafna, can you please check whether my steps are correct and try it with the newer version? If it is not reproducible, we could close this with CURRENTRELEASE.
Thanks,
Vladan

Comment 2 Dafna Ron 2014-07-01 09:32:19 UTC
Vladan, looking at how you reproduced this, it is not at all like the steps I wrote in the bug description. 
The whole point is to launch as many instances as possible with raw, preallocated disks and then resize all of them. 
After we destroy the instances (the resize should fail once we run out of space), the disks are not cleaned up as they should be. 
From what you wrote above, you are not following the steps to reproduce the bug.