Bug 1812988 - Partial cleanup after failed image to volume conversions
Summary: Partial cleanup after failed image to volume conversions
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-cinder
Version: 13.0 (Queens)
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: z13
Target Release: 13.0 (Queens)
Assignee: Rajat Dhasmana
QA Contact: Tzach Shefi
Docs Contact: Chuck Copello
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-03-12 16:11 UTC by Ganesh Kadam
Modified: 2023-12-15 17:30 UTC (History)

Fixed In Version: openstack-cinder-12.0.10-18.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-10-28 18:22:28 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
OpenStack gerrit 721206 0 None MERGED RBD: Cleanup temporary file during exception 2021-02-18 15:41:39 UTC
Red Hat Product Errata RHSA-2020:4391 0 None None None 2020-10-28 18:22:46 UTC

Comment 2 Rajat Dhasmana 2020-03-17 10:33:52 UTC
Hi Ganesh,

From a quick overview of the code, we already have a condition that checks for available space before creating the volume or writing the image to it, but it only considers image_size, and I feel the condition can be improved to handle this type of situation as well [1].
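
As a rough illustration of how such a check could be made stricter, here is a minimal sketch, not Cinder's actual code. It assumes the worst case needs room for one raw copy of the image plus, for encrypted volumes, a second full-size temporary copy; that factor and the names required_bytes, has_room and check_conversion_dir are hypothetical.

```python
import shutil


def required_bytes(virtual_size, encrypted):
    """Worst-case temporary space, in bytes.

    Assumption: one raw copy is always needed, and encryption
    writes a second full-size temporary copy next to it.
    """
    copies = 2 if encrypted else 1
    return virtual_size * copies


def has_room(free_bytes, virtual_size, encrypted):
    """Return True if free_bytes exceeds the worst-case requirement."""
    return free_bytes > required_bytes(virtual_size, encrypted)


def check_conversion_dir(path, virtual_size, encrypted):
    """Apply has_room() to the filesystem that holds `path`."""
    free = shutil.disk_usage(path).free
    return has_room(free, virtual_size, encrypted)
```

With a check like this, a request that cannot fit would be rejected before any temporary file is written, instead of failing halfway through the conversion.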

Coming to the cleanup part, this probably failed while encrypting the image [2], so wrapping the code in a try/finally block would do the cleanup, like this:

'''
from oslo_utils import fileutils

try:
    with tempfile.NamedTemporaryFile(prefix='luks_',
    ...
finally:
    fileutils.delete_if_exists(tmp_dir)
'''

But looking deeper, there are multiple places in this code flow where temporary files are created, and the out-of-disk error could occur at any of them, so all of those places would need similar handling.
For the current scenario, the above change should do the job.
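
To make the try/finally idea concrete, here is a self-contained sketch that runs without Cinder installed. It is an illustration only, not the merged fix: delete_if_exists is a standard-library stand-in for oslo_utils.fileutils.delete_if_exists, and convert_with_cleanup and its convert callback are hypothetical names.

```python
import os
import tempfile


def delete_if_exists(path):
    """Stdlib stand-in for oslo_utils.fileutils.delete_if_exists."""
    try:
        os.unlink(path)
    except FileNotFoundError:
        pass


def convert_with_cleanup(work_dir, convert):
    """Run convert(path) on a temp file in work_dir.

    The temporary file is removed whether convert() returns
    normally or raises (for example, on an out-of-disk error).
    """
    fd, path = tempfile.mkstemp(prefix='luks_', dir=work_dir)
    os.close(fd)
    try:
        return convert(path)
    finally:
        delete_if_exists(path)
```

The exception still propagates to the caller, so the volume would still end up in an error state; the only difference is that the temporary file no longer stays behind.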


[1] https://github.com/openstack/cinder/blob/master/cinder/volume/flows/manager/create_volume.py#L840
[2] https://github.com/openstack/cinder/blob/master/cinder/volume/drivers/rbd.py#L1567-L1588

Comment 3 Rajat Dhasmana 2020-03-17 12:42:17 UTC
Correction, the second link is
[2] https://github.com/openstack/cinder/blob/master/cinder/volume/drivers/rbd.py#L1547-L1560

Comment 21 Tzach Shefi 2020-09-25 07:38:19 UTC
Verified on:
python-cinder-12.0.10-19.el7ost.noarch


First we upload a source image, in my case a ~12G qcow2 image.
(overcloud) [stack@undercloud-0 ~]$ glance image-create --name large12g --disk-format qcow2 --container-format bare --file large12g.qcow2 
+------------------+----------------------------------------------------------------------------------+
| Property         | Value                                                                            |
+------------------+----------------------------------------------------------------------------------+
| checksum         | a05ead3a04ae663da77eee5d2cb2fa73                                                 |
| container_format | bare                                                                             |
| created_at       | 2020-09-25T04:56:31Z                                                             |
| direct_url       | rbd://157e5fda-fe74-11ea-820e-525400a01fe0/images/616ddf2c-cb34-436e-            |
|                  | 9dc8-6971b6efdb69/snap                                                           |
| disk_format      | qcow2                                                                            |
| id               | 616ddf2c-cb34-436e-9dc8-6971b6efdb69                                             |
| locations        | [{"url": "rbd://157e5fda-fe74-11ea-820e-525400a01fe0/images/616ddf2c-cb34-436e-  |
|                  | 9dc8-6971b6efdb69/snap", "metadata": {}}]                                        |
| min_disk         | 0                                                                                |
| min_ram          | 0                                                                                |
| name             | large12g                                                                         |
| owner            | d630c26336d041f08fcb70edebcaf3f3                                                 |
| protected        | False                                                                            |
| size             | 12001017856                                                                      |
| status           | active                                                                           |
| tags             | []                                                                               |
| updated_at       | 2020-09-25T04:59:28Z                                                             |
| virtual_size     | None                                                                             |
| visibility       | shared                                                                           |
+------------------+----------------------------------------------------------------------------------+

Then we create an encrypted volume from that image. This should complete without any problems,
as the controller running c-vol has ample free space for both the qcow2-to-raw conversion and the conversion to the encrypted volume type.

(overcloud) [stack@undercloud-0 ~]$ cinder create 20 --volume-type LUKS --image large12g
+--------------------------------+--------------------------------------+
| Property                       | Value                                |
+--------------------------------+--------------------------------------+
| attachments                    | []                                   |
| availability_zone              | nova                                 |
| bootable                       | false                                |
| consistencygroup_id            | None                                 |
| created_at                     | 2020-09-25T05:26:37.000000           |
| description                    | None                                 |
| encrypted                      | True                                 |
| id                             | ff108fcc-47c3-4f87-9f1c-269df98a2091 |
| metadata                       | {}                                   |
| migration_status               | None                                 |
| multiattach                    | False                                |
| name                           | None                                 |
| os-vol-host-attr:host          | hostgroup@tripleo_ceph#tripleo_ceph  |
| os-vol-mig-status-attr:migstat | None                                 |
| os-vol-mig-status-attr:name_id | None                                 |
| os-vol-tenant-attr:tenant_id   | d630c26336d041f08fcb70edebcaf3f3     |
| replication_status             | None                                 |
| size                           | 20                                   |
| snapshot_id                    | None                                 |
| source_volid                   | None                                 |
| status                         | creating                             |
| updated_at                     | 2020-09-25T05:26:37.000000           |
| user_id                        | 0ed1a958550d4e49a0b71714df531420     |
| volume_type                    | LUKS                                 |
+--------------------------------+--------------------------------------+

(overcloud) [stack@undercloud-0 ~]$ cinder list
+--------------------------------------+----------+------+------+-------------+----------+-------------+
| ID                                   | Status   | Name | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+----------+------+------+-------------+----------+-------------+
| ff108fcc-47c3-4f87-9f1c-269df98a2091 | creating | -    | 20   | LUKS        | false    |             |
+--------------------------------------+----------+------+------+-------------+----------+-------------+
(overcloud) [stack@undercloud-0 ~]$ cinder list
+--------------------------------------+-----------+------+------+-------------+----------+-------------+
| ID                                   | Status    | Name | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+-----------+------+------+-------------+----------+-------------+
| ff108fcc-47c3-4f87-9f1c-269df98a2091 | available | -    | 20   | LUKS        | true     |             |
+--------------------------------------+-----------+------+------+-------------+----------+-------------+

As expected, the basic positive flow works: an encrypted volume is successfully created from an image.


Now let's simulate the situation from this bz, where free disk space on the controller is limited.
In my case I simply copied the qcow2 file over to the controller to consume free space.
With just 25G of free disk space, we now try to create another encrypted volume from the same image:

(overcloud) [stack@undercloud-0 ~]$ cinder create 20 --volume-type LUKS --image large12g --name limitedDiskSpace
+--------------------------------+--------------------------------------+
| Property                       | Value                                |
+--------------------------------+--------------------------------------+
| attachments                    | []                                   |
| availability_zone              | nova                                 |
| bootable                       | false                                |
| consistencygroup_id            | None                                 |
| created_at                     | 2020-09-25T06:03:25.000000           |
| description                    | None                                 |
| encrypted                      | True                                 |
| id                             | 2d15bbc0-2f72-4b40-b33e-98c03f108e82 |
| metadata                       | {}                                   |
| migration_status               | None                                 |
| multiattach                    | False                                |
| name                           | limitedDiskSpace                     |
| os-vol-host-attr:host          | hostgroup@tripleo_ceph#tripleo_ceph  |
| os-vol-mig-status-attr:migstat | None                                 |
| os-vol-mig-status-attr:name_id | None                                 |
| os-vol-tenant-attr:tenant_id   | d630c26336d041f08fcb70edebcaf3f3     |
| replication_status             | None                                 |
| size                           | 20                                   |
| snapshot_id                    | None                                 |
| source_volid                   | None                                 |
| status                         | creating                             |
| updated_at                     | 2020-09-25T06:03:25.000000           |
| user_id                        | 0ed1a958550d4e49a0b71714df531420     |
| volume_type                    | LUKS                                 |
+--------------------------------+--------------------------------------+

And after a while:
(overcloud) [stack@undercloud-0 ~]$ cinder list
+--------------------------------------+-----------+------------------+------+-------------+----------+-------------+
| ID                                   | Status    | Name             | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+-----------+------------------+------+-------------+----------+-------------+
| 2d15bbc0-2f72-4b40-b33e-98c03f108e82 | creating  | limitedDiskSpace | 20   | LUKS        | false    |             |
| ff108fcc-47c3-4f87-9f1c-269df98a2091 | available | -                | 20   | LUKS        | true     |             |
+--------------------------------------+-----------+------------------+------+-------------+----------+-------------+

Eventually, as expected, the volume creation failed:
(overcloud) [stack@undercloud-0 ~]$ cinder list
+--------------------------------------+-----------+------------------+------+-------------+----------+-------------+
| ID                                   | Status    | Name             | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+-----------+------------------+------+-------------+----------+-------------+
| 2d15bbc0-2f72-4b40-b33e-98c03f108e82 | error     | limitedDiskSpace | 20   | LUKS        | false    |             |
| ff108fcc-47c3-4f87-9f1c-269df98a2091 | available | -                | 20   | LUKS        | true     |             |
+--------------------------------------+-----------+------------------+------+-------------+----------+-------------+

During volume creation I monitored the /var/lib/cinder/conversion path on the controller running c-vol.
Cleanup under that path was performed as expected: the temporary file created during this volume's failed creation attempt was completely removed.
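
For repeated runs, the manual check of the conversion directory could be scripted. The following is only a sketch: leftover_temp_files is a hypothetical helper, and the 'tmp' prefix is an assumption based on the temporary file names that appear in the directory listings in this comment.

```python
import os


def leftover_temp_files(conversion_dir, prefix='tmp'):
    """Return sorted names of regular files in conversion_dir
    whose names start with prefix."""
    return sorted(
        entry.name
        for entry in os.scandir(conversion_dir)
        if entry.is_file() and entry.name.startswith(prefix)
    )
```

Calling leftover_temp_files('/var/lib/cinder/conversion') after a failed volume creation should return an empty list if the cleanup worked.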

Just to be sure, I repeated both steps.
I deleted the qcow2 file from the controller running c-vol to free up sufficient disk space for another attempt.
With 36G free, we create a third volume:

(overcloud) [stack@undercloud-0 ~]$ cinder create 20 --volume-type LUKS --image large12g --name ThirdAttemptAmpleFreeSpace
+--------------------------------+--------------------------------------+
| Property                       | Value                                |
+--------------------------------+--------------------------------------+
| attachments                    | []                                   |
| availability_zone              | nova                                 |
| bootable                       | false                                |
| consistencygroup_id            | None                                 |
| created_at                     | 2020-09-25T06:38:27.000000           |
| description                    | None                                 |
| encrypted                      | True                                 |
| id                             | f348606e-b40c-44f2-9715-a69cae6e8462 |
| metadata                       | {}                                   |
| migration_status               | None                                 |
| multiattach                    | False                                |
| name                           | ThirdAttemptAmpleFreeSpace           |
| os-vol-host-attr:host          | hostgroup@tripleo_ceph#tripleo_ceph  |
| os-vol-mig-status-attr:migstat | None                                 |
| os-vol-mig-status-attr:name_id | None                                 |
| os-vol-tenant-attr:tenant_id   | d630c26336d041f08fcb70edebcaf3f3     |
| replication_status             | None                                 |
| size                           | 20                                   |
| snapshot_id                    | None                                 |
| source_volid                   | None                                 |
| status                         | creating                             |
| updated_at                     | 2020-09-25T06:38:27.000000           |
| user_id                        | 0ed1a958550d4e49a0b71714df531420     |
| volume_type                    | LUKS                                 |
+--------------------------------+--------------------------------------+


During creation:

(overcloud) [stack@undercloud-0 ~]$ cinder list
+--------------------------------------+-----------+----------------------------+------+-------------+----------+-------------+
| ID                                   | Status    | Name                       | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+-----------+----------------------------+------+-------------+----------+-------------+
| 2d15bbc0-2f72-4b40-b33e-98c03f108e82 | error     | limitedDiskSpace           | 20   | LUKS        | false    |             |
| f348606e-b40c-44f2-9715-a69cae6e8462 | creating  | ThirdAttemptAmpleFreeSpace | 20   | LUKS        | false    |             |
| ff108fcc-47c3-4f87-9f1c-269df98a2091 | available | -                          | 20   | LUKS        | true     |             |
+--------------------------------------+-----------+----------------------------+------+-------------+----------+-------------+

The temporary file created on the controller:
[root@controller-0 conversion]# ll
total 12768260
-rw-------. 1 root 42400   603521024 Sep 25 06:38 tmpaI3hpnhostgroup@tripleo_ceph


(overcloud) [stack@undercloud-0 ~]$ cinder list
+--------------------------------------+-------------+----------------------------+------+-------------+----------+-------------+
| ID                                   | Status      | Name                       | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+-------------+----------------------------+------+-------------+----------+-------------+
| 2d15bbc0-2f72-4b40-b33e-98c03f108e82 | error       | limitedDiskSpace           | 20   | LUKS        | false    |             |
| f348606e-b40c-44f2-9715-a69cae6e8462 | downloading | ThirdAttemptAmpleFreeSpace | 20   | LUKS        | false    |             |
| ff108fcc-47c3-4f87-9f1c-269df98a2091 | available   | -                          | 20   | LUKS        | true     |             |
+--------------------------------------+-------------+----------------------------+------+-------------+----------+-------------+

And after a while the volume is available:
(overcloud) [stack@undercloud-0 ~]$ cinder list
+--------------------------------------+-----------+----------------------------+------+-------------+----------+-------------+
| ID                                   | Status    | Name                       | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+-----------+----------------------------+------+-------------+----------+-------------+
| 2d15bbc0-2f72-4b40-b33e-98c03f108e82 | error     | limitedDiskSpace           | 20   | LUKS        | false    |             |
| f348606e-b40c-44f2-9715-a69cae6e8462 | available | ThirdAttemptAmpleFreeSpace | 20   | LUKS        | true     |             |
| ff108fcc-47c3-4f87-9f1c-269df98a2091 | available | -                          | 20   | LUKS        | true     |             |
+--------------------------------------+-----------+----------------------------+------+-------------+----------+-------------+


For one last attempt, let's copy the qcow2 image on the controller's disk to reduce its free space:
[root@controller-0 conversion]# cp large12g2.qcow2 large12g3.qcow2 
[root@controller-0 conversion]# df -h | grep vda2
/dev/vda2        60G   36G   25G  60% /
[root@controller-0 conversion]# 
 
We are left with 25G free, which is little enough to make our next attempt fail.
(overcloud) [stack@undercloud-0 ~]$ cinder create 20 --volume-type LUKS --image large12g --name ForthAttemptShouldFail
+--------------------------------+--------------------------------------+
| Property                       | Value                                |
+--------------------------------+--------------------------------------+
| attachments                    | []                                   |
| availability_zone              | nova                                 |
| bootable                       | false                                |
| consistencygroup_id            | None                                 |
| created_at                     | 2020-09-25T07:20:26.000000           |
| description                    | None                                 |
| encrypted                      | True                                 |
| id                             | eacf5f09-14a5-4568-ac9f-67ea167bedeb |
| metadata                       | {}                                   |
| migration_status               | None                                 |
| multiattach                    | False                                |
| name                           | ForthAttemptShouldFail               |
| os-vol-host-attr:host          | hostgroup@tripleo_ceph#tripleo_ceph  |
| os-vol-mig-status-attr:migstat | None                                 |
| os-vol-mig-status-attr:name_id | None                                 |
| os-vol-tenant-attr:tenant_id   | d630c26336d041f08fcb70edebcaf3f3     |
| replication_status             | None                                 |
| size                           | 20                                   |
| snapshot_id                    | None                                 |
| source_volid                   | None                                 |
| status                         | creating                             |
| updated_at                     | 2020-09-25T07:20:26.000000           |
| user_id                        | 0ed1a958550d4e49a0b71714df531420     |
| volume_type                    | LUKS                                 |
+--------------------------------+--------------------------------------+

During vol creation:
(overcloud) [stack@undercloud-0 ~]$ cinder list
+--------------------------------------+-----------+----------------------------+------+-------------+----------+-------------+
| ID                                   | Status    | Name                       | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+-----------+----------------------------+------+-------------+----------+-------------+
| 2d15bbc0-2f72-4b40-b33e-98c03f108e82 | error     | limitedDiskSpace           | 20   | LUKS        | false    |             |
| eacf5f09-14a5-4568-ac9f-67ea167bedeb | creating  | ForthAttemptShouldFail     | 20   | LUKS        | false    |             |
| f348606e-b40c-44f2-9715-a69cae6e8462 | available | ThirdAttemptAmpleFreeSpace | 20   | LUKS        | true     |             |
| ff108fcc-47c3-4f87-9f1c-269df98a2091 | available | -                          | 20   | LUKS        | true     |             |
+--------------------------------------+-----------+----------------------------+------+-------------+----------+-------------+

Conversion folder path:
[root@controller-0 conversion]# ll
total 28158024
-rw-r--r--. 1 root 42400 12001017856 Sep 25 06:01 large12g2.qcow2
-rw-r--r--. 1 root 42400 12001017856 Sep 25 07:17 large12g3.qcow2    (both images are used only to consume free disk space)
-rw-------. 1 root 42400  4387831808 Sep 25 07:21 tmp_2UUsIhostgroup@tripleo_ceph
 

As expected, we fail to create the fourth volume:
(overcloud) [stack@undercloud-0 ~]$ cinder list
+--------------------------------------+-----------+----------------------------+------+-------------+----------+-------------+
| ID                                   | Status    | Name                       | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+-----------+----------------------------+------+-------------+----------+-------------+
| 2d15bbc0-2f72-4b40-b33e-98c03f108e82 | error     | limitedDiskSpace           | 20   | LUKS        | false    |             |
| eacf5f09-14a5-4568-ac9f-67ea167bedeb | error     | ForthAttemptShouldFail     | 20   | LUKS        | false    |             |
| f348606e-b40c-44f2-9715-a69cae6e8462 | available | ThirdAttemptAmpleFreeSpace | 20   | LUKS        | true     |             |
| ff108fcc-47c3-4f87-9f1c-269df98a2091 | available | -                          | 20   | LUKS        | true     |             |
+--------------------------------------+-----------+----------------------------+------+-------------+----------+-------------+

And the temporary file has been removed:
[root@controller-0 conversion]# ll
total 23439496
-rw-r--r--. 1 root 42400 12001017856 Sep 25 06:01 large12g2.qcow2
-rw-r--r--. 1 root 42400 12001017856 Sep 25 07:17 large12g3.qcow2

Ergo, good to verify.

Comment 29 errata-xmlrpc 2020-10-28 18:22:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: openstack-cinder security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:4391

