Bug 1812988

Summary: Partial cleanup after failed image to volume conversions
Product: Red Hat OpenStack
Component: openstack-cinder
Version: 13.0 (Queens)
Hardware: All
OS: Linux
Status: CLOSED ERRATA
Severity: high
Priority: high
Target Milestone: z13
Target Release: 13.0 (Queens)
Fixed In Version: openstack-cinder-12.0.10-18.el7ost
Keywords: Triaged, ZStream
Type: Bug
Last Closed: 2020-10-28 18:22:28 UTC
Reporter: Ganesh Kadam <gkadam>
Assignee: Rajat Dhasmana <rdhasman>
QA Contact: Tzach Shefi <tshefi>
Docs Contact: Chuck Copello <ccopello>
CC: rdhasman, tshefi

Comment 2 Rajat Dhasmana 2020-03-17 10:33:52 UTC
Hi Ganesh,

From a quick overview of the code, we already have a condition that checks for available space before creating the volume or writing the image to it, but it only considers image_size, and I feel the condition can be improved to handle this type of situation as well [1].
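
As a rough illustration of the kind of free-space check discussed here, the sketch below compares the free space in a conversion directory against a caller-supplied requirement. The helper name `has_space_for_conversion` and the `required_bytes` parameter are assumptions made for illustration; this is not the check that Cinder implements at [1].

```python
import shutil
import tempfile


def has_space_for_conversion(conversion_dir, required_bytes):
    """Return True if conversion_dir has at least required_bytes free.

    Hypothetical helper: how required_bytes is estimated (image size,
    raw size, encrypted copy, ...) is left to the caller.
    """
    free_bytes = shutil.disk_usage(conversion_dir).free
    return free_bytes >= required_bytes


if __name__ == "__main__":
    # Demo against a scratch directory rather than a real conversion path.
    with tempfile.TemporaryDirectory() as scratch:
        print(has_space_for_conversion(scratch, 1))
```

A check like this only narrows the window. Free space can still run out between the check and the write, so cleanup on failure is needed regardless.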

Coming to the cleanup part, this probably failed while encrypting the image [2], so wrapping the code in a try/finally block will do the cleanup, like this:

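'''
import tempfile

from oslo_utils import fileutils

try:
    with tempfile.NamedTemporaryFile(prefix='luks_',
    ...
finally:
    # Runs whether or not the encryption step raised, so the temporary path is always removed.
    fileutils.delete_if_exists(tmp_dir)
'''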

However, looking deeper, there are multiple places in this code flow where temporary files are created, and the out-of-disk error could occur at any of them, so all of those places would need similar handling.
For the current scenario, the above change should do the task.
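
Since several steps in the flow create temporary files, one way to apply the same try/finally idea uniformly is a small wrapper, sketched below with only the standard library. The names `convert_with_cleanup`, `work`, and `failing_step` are hypothetical and are not taken from the Cinder code base; the sketch only demonstrates that the temporary directory is removed even when the wrapped step raises.

```python
import os
import shutil
import tempfile


def convert_with_cleanup(work, base_dir=None):
    """Run work(tmp_dir) and always remove tmp_dir afterwards.

    Hypothetical wrapper: `work` stands in for any step that writes
    temporary files (download, qcow2-to-raw conversion, encryption).
    """
    tmp_dir = tempfile.mkdtemp(prefix="luks_", dir=base_dir)
    try:
        return work(tmp_dir)
    finally:
        # Runs on success and on failure (for example ENOSPC mid-write).
        shutil.rmtree(tmp_dir, ignore_errors=True)


def failing_step(tmp_dir):
    # Simulate a step that writes a partial file and then runs out of space.
    with open(os.path.join(tmp_dir, "partial.img"), "wb") as f:
        f.write(b"\0" * 1024)
    raise OSError(28, "No space left on device")
```

Calling `convert_with_cleanup(failing_step)` still raises the `OSError`, but the temporary directory and the partial file inside it are gone by the time the exception reaches the caller.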


[1] https://github.com/openstack/cinder/blob/master/cinder/volume/flows/manager/create_volume.py#L840
[2] https://github.com/openstack/cinder/blob/master/cinder/volume/drivers/rbd.py#L1567-L1588

Comment 3 Rajat Dhasmana 2020-03-17 12:42:17 UTC
Correction, the second link is
[2] https://github.com/openstack/cinder/blob/master/cinder/volume/drivers/rbd.py#L1547-L1560

Comment 21 Tzach Shefi 2020-09-25 07:38:19 UTC
Verified on:
python-cinder-12.0.10-19.el7ost.noarch


First we upload a source image, in my case a ~12G qcow2 image.
(overcloud) [stack@undercloud-0 ~]$ glance image-create --name large12g --disk-format qcow2 --container-format bare --file large12g.qcow2 
+------------------+----------------------------------------------------------------------------------+
| Property         | Value                                                                            |
+------------------+----------------------------------------------------------------------------------+
| checksum         | a05ead3a04ae663da77eee5d2cb2fa73                                                 |
| container_format | bare                                                                             |
| created_at       | 2020-09-25T04:56:31Z                                                             |
| direct_url       | rbd://157e5fda-fe74-11ea-820e-525400a01fe0/images/616ddf2c-cb34-436e-            |
|                  | 9dc8-6971b6efdb69/snap                                                           |
| disk_format      | qcow2                                                                            |
| id               | 616ddf2c-cb34-436e-9dc8-6971b6efdb69                                             |
| locations        | [{"url": "rbd://157e5fda-fe74-11ea-820e-525400a01fe0/images/616ddf2c-cb34-436e-  |
|                  | 9dc8-6971b6efdb69/snap", "metadata": {}}]                                        |
| min_disk         | 0                                                                                |
| min_ram          | 0                                                                                |
| name             | large12g                                                                         |
| owner            | d630c26336d041f08fcb70edebcaf3f3                                                 |
| protected        | False                                                                            |
| size             | 12001017856                                                                      |
| status           | active                                                                           |
| tags             | []                                                                               |
| updated_at       | 2020-09-25T04:59:28Z                                                             |
| virtual_size     | None                                                                             |
| visibility       | shared                                                                           |
+------------------+----------------------------------------------------------------------------------+

Then we create an encrypted volume from said image. This should complete without any problems,
as the controller running c-vol has ample free space for both the qcow2-to-raw conversion and the conversion to the encrypted volume type.

(overcloud) [stack@undercloud-0 ~]$ cinder create 20 --volume-type LUKS --image large12g
+--------------------------------+--------------------------------------+
| Property                       | Value                                |
+--------------------------------+--------------------------------------+
| attachments                    | []                                   |
| availability_zone              | nova                                 |
| bootable                       | false                                |
| consistencygroup_id            | None                                 |
| created_at                     | 2020-09-25T05:26:37.000000           |
| description                    | None                                 |
| encrypted                      | True                                 |
| id                             | ff108fcc-47c3-4f87-9f1c-269df98a2091 |
| metadata                       | {}                                   |
| migration_status               | None                                 |
| multiattach                    | False                                |
| name                           | None                                 |
| os-vol-host-attr:host          | hostgroup@tripleo_ceph#tripleo_ceph  |
| os-vol-mig-status-attr:migstat | None                                 |
| os-vol-mig-status-attr:name_id | None                                 |
| os-vol-tenant-attr:tenant_id   | d630c26336d041f08fcb70edebcaf3f3     |
| replication_status             | None                                 |
| size                           | 20                                   |
| snapshot_id                    | None                                 |
| source_volid                   | None                                 |
| status                         | creating                             |
| updated_at                     | 2020-09-25T05:26:37.000000           |
| user_id                        | 0ed1a958550d4e49a0b71714df531420     |
| volume_type                    | LUKS                                 |
+--------------------------------+--------------------------------------+

(overcloud) [stack@undercloud-0 ~]$ cinder list
+--------------------------------------+----------+------+------+-------------+----------+-------------+
| ID                                   | Status   | Name | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+----------+------+------+-------------+----------+-------------+
| ff108fcc-47c3-4f87-9f1c-269df98a2091 | creating | -    | 20   | LUKS        | false    |             |
+--------------------------------------+----------+------+------+-------------+----------+-------------+
(overcloud) [stack@undercloud-0 ~]$ cinder list
+--------------------------------------+-----------+------+------+-------------+----------+-------------+
| ID                                   | Status    | Name | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+-----------+------+------+-------------+----------+-------------+
| ff108fcc-47c3-4f87-9f1c-269df98a2091 | available | -    | 20   | LUKS        | true     |             |
+--------------------------------------+-----------+------+------+-------------+----------+-------------+

As expected, the basic positive flow works: an encrypted volume is successfully created from an image.


Now let's simulate the situation described in this bz, where free disk space on said controller is limited.
In my case I just copied the qcow2 file over to the controller as a means of consuming free space.
With just 25G of free disk space, we now try to create another encrypted volume from said image:

(overcloud) [stack@undercloud-0 ~]$ cinder create 20 --volume-type LUKS --image large12g --name limitedDiskSpace
+--------------------------------+--------------------------------------+
| Property                       | Value                                |
+--------------------------------+--------------------------------------+
| attachments                    | []                                   |
| availability_zone              | nova                                 |
| bootable                       | false                                |
| consistencygroup_id            | None                                 |
| created_at                     | 2020-09-25T06:03:25.000000           |
| description                    | None                                 |
| encrypted                      | True                                 |
| id                             | 2d15bbc0-2f72-4b40-b33e-98c03f108e82 |
| metadata                       | {}                                   |
| migration_status               | None                                 |
| multiattach                    | False                                |
| name                           | limitedDiskSpace                     |
| os-vol-host-attr:host          | hostgroup@tripleo_ceph#tripleo_ceph  |
| os-vol-mig-status-attr:migstat | None                                 |
| os-vol-mig-status-attr:name_id | None                                 |
| os-vol-tenant-attr:tenant_id   | d630c26336d041f08fcb70edebcaf3f3     |
| replication_status             | None                                 |
| size                           | 20                                   |
| snapshot_id                    | None                                 |
| source_volid                   | None                                 |
| status                         | creating                             |
| updated_at                     | 2020-09-25T06:03:25.000000           |
| user_id                        | 0ed1a958550d4e49a0b71714df531420     |
| volume_type                    | LUKS                                 |
+--------------------------------+--------------------------------------+

And after a while:
(overcloud) [stack@undercloud-0 ~]$ cinder list
+--------------------------------------+-----------+------------------+------+-------------+----------+-------------+
| ID                                   | Status    | Name             | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+-----------+------------------+------+-------------+----------+-------------+
| 2d15bbc0-2f72-4b40-b33e-98c03f108e82 | creating  | limitedDiskSpace | 20   | LUKS        | false    |             |
| ff108fcc-47c3-4f87-9f1c-269df98a2091 | available | -                | 20   | LUKS        | true     |             |
+--------------------------------------+-----------+------------------+------+-------------+----------+-------------+

However, eventually, as expected, the volume creation failed:
(overcloud) [stack@undercloud-0 ~]$ cinder list
+--------------------------------------+-----------+------------------+------+-------------+----------+-------------+
| ID                                   | Status    | Name             | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+-----------+------------------+------+-------------+----------+-------------+
| 2d15bbc0-2f72-4b40-b33e-98c03f108e82 | error     | limitedDiskSpace | 20   | LUKS        | false    |             |
| ff108fcc-47c3-4f87-9f1c-269df98a2091 | available | -                | 20   | LUKS        | true     |             |
+--------------------------------------+-----------+------------------+------+-------------+----------+-------------+

During volume creation I monitored the path /var/lib/cinder/conversion on the controller running c-vol.
Cleanup of the tmp content under that path was performed as expected: everything temporary created during this volume's failed creation attempt was completely removed.
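
This manual check of the conversion path could also be scripted. The sketch below lists entries in a directory whose names start with a given prefix. The function name `leftover_tmp_entries` is hypothetical, and the default prefix `tmp` is an assumption based on the temporary file names that appear in the directory listings in this comment.

```python
import os
import tempfile


def leftover_tmp_entries(conversion_dir, prefix="tmp"):
    """Return the sorted names in conversion_dir that start with prefix."""
    return sorted(
        name for name in os.listdir(conversion_dir) if name.startswith(prefix)
    )


if __name__ == "__main__":
    # Demo on a scratch directory instead of the real conversion path.
    with tempfile.TemporaryDirectory() as scratch:
        open(os.path.join(scratch, "tmpDemo"), "w").close()
        print(leftover_tmp_entries(scratch))
```

Run against /var/lib/cinder/conversion after a failed attempt, an empty list would indicate that cleanup happened. Files placed there on purpose, such as the qcow2 copies used to fill the disk, do not match the prefix and are ignored.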

Just to be sure, I repeated both steps.
I deleted the qcow2 file from the controller running c-vol to free up sufficient disk space for another attempt.
With 36G free, we try to create a third volume:

(overcloud) [stack@undercloud-0 ~]$ cinder create 20 --volume-type LUKS --image large12g --name ThirdAttemptAmpleFreeSpace
+--------------------------------+--------------------------------------+
| Property                       | Value                                |
+--------------------------------+--------------------------------------+
| attachments                    | []                                   |
| availability_zone              | nova                                 |
| bootable                       | false                                |
| consistencygroup_id            | None                                 |
| created_at                     | 2020-09-25T06:38:27.000000           |
| description                    | None                                 |
| encrypted                      | True                                 |
| id                             | f348606e-b40c-44f2-9715-a69cae6e8462 |
| metadata                       | {}                                   |
| migration_status               | None                                 |
| multiattach                    | False                                |
| name                           | ThirdAttemptAmpleFreeSpace           |
| os-vol-host-attr:host          | hostgroup@tripleo_ceph#tripleo_ceph  |
| os-vol-mig-status-attr:migstat | None                                 |
| os-vol-mig-status-attr:name_id | None                                 |
| os-vol-tenant-attr:tenant_id   | d630c26336d041f08fcb70edebcaf3f3     |
| replication_status             | None                                 |
| size                           | 20                                   |
| snapshot_id                    | None                                 |
| source_volid                   | None                                 |
| status                         | creating                             |
| updated_at                     | 2020-09-25T06:38:27.000000           |
| user_id                        | 0ed1a958550d4e49a0b71714df531420     |
| volume_type                    | LUKS                                 |
+--------------------------------+--------------------------------------+


During creation

(overcloud) [stack@undercloud-0 ~]$ cinder list
+--------------------------------------+-----------+----------------------------+------+-------------+----------+-------------+
| ID                                   | Status    | Name                       | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+-----------+----------------------------+------+-------------+----------+-------------+
| 2d15bbc0-2f72-4b40-b33e-98c03f108e82 | error     | limitedDiskSpace           | 20   | LUKS        | false    |             |
| f348606e-b40c-44f2-9715-a69cae6e8462 | creating  | ThirdAttemptAmpleFreeSpace | 20   | LUKS        | false    |             |
| ff108fcc-47c3-4f87-9f1c-269df98a2091 | available | -                          | 20   | LUKS        | true     |             |
+--------------------------------------+-----------+----------------------------+------+-------------+----------+-------------+

The tmp file being created on the controller:
[root@controller-0 conversion]# ll
total 12768260
-rw-------. 1 root 42400   603521024 Sep 25 06:38 tmpaI3hpnhostgroup@tripleo_ceph


(overcloud) [stack@undercloud-0 ~]$ cinder list
+--------------------------------------+-------------+----------------------------+------+-------------+----------+-------------+
| ID                                   | Status      | Name                       | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+-------------+----------------------------+------+-------------+----------+-------------+
| 2d15bbc0-2f72-4b40-b33e-98c03f108e82 | error       | limitedDiskSpace           | 20   | LUKS        | false    |             |
| f348606e-b40c-44f2-9715-a69cae6e8462 | downloading | ThirdAttemptAmpleFreeSpace | 20   | LUKS        | false    |             |
| ff108fcc-47c3-4f87-9f1c-269df98a2091 | available   | -                          | 20   | LUKS        | true     |             |
+--------------------------------------+-------------+----------------------------+------+-------------+----------+-------------+

And after a while our volume is available:
(overcloud) [stack@undercloud-0 ~]$ cinder list
+--------------------------------------+-----------+----------------------------+------+-------------+----------+-------------+
| ID                                   | Status    | Name                       | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+-----------+----------------------------+------+-------------+----------+-------------+
| 2d15bbc0-2f72-4b40-b33e-98c03f108e82 | error     | limitedDiskSpace           | 20   | LUKS        | false    |             |
| f348606e-b40c-44f2-9715-a69cae6e8462 | available | ThirdAttemptAmpleFreeSpace | 20   | LUKS        | true     |             |
| ff108fcc-47c3-4f87-9f1c-269df98a2091 | available | -                          | 20   | LUKS        | true     |             |
+--------------------------------------+-----------+----------------------------+------+-------------+----------+-------------+


One last attempt: let's clone the qcow2 image on the controller's disk to reduce its free space:
[root@controller-0 conversion]# cp large12g2.qcow2 large12g3.qcow2 
[root@controller-0 conversion]# df -h | grep vda2
/dev/vda2        60G   36G   25G  60% /
[root@controller-0 conversion]# 
 
We are left with 25G free, which is little enough to block our next attempt, which should fail.
(overcloud) [stack@undercloud-0 ~]$ cinder create 20 --volume-type LUKS --image large12g --name ForthAttemptShouldFail
+--------------------------------+--------------------------------------+
| Property                       | Value                                |
+--------------------------------+--------------------------------------+
| attachments                    | []                                   |
| availability_zone              | nova                                 |
| bootable                       | false                                |
| consistencygroup_id            | None                                 |
| created_at                     | 2020-09-25T07:20:26.000000           |
| description                    | None                                 |
| encrypted                      | True                                 |
| id                             | eacf5f09-14a5-4568-ac9f-67ea167bedeb |
| metadata                       | {}                                   |
| migration_status               | None                                 |
| multiattach                    | False                                |
| name                           | ForthAttemptShouldFail               |
| os-vol-host-attr:host          | hostgroup@tripleo_ceph#tripleo_ceph  |
| os-vol-mig-status-attr:migstat | None                                 |
| os-vol-mig-status-attr:name_id | None                                 |
| os-vol-tenant-attr:tenant_id   | d630c26336d041f08fcb70edebcaf3f3     |
| replication_status             | None                                 |
| size                           | 20                                   |
| snapshot_id                    | None                                 |
| source_volid                   | None                                 |
| status                         | creating                             |
| updated_at                     | 2020-09-25T07:20:26.000000           |
| user_id                        | 0ed1a958550d4e49a0b71714df531420     |
| volume_type                    | LUKS                                 |
+--------------------------------+--------------------------------------+

During vol creation:
(overcloud) [stack@undercloud-0 ~]$ cinder list
+--------------------------------------+-----------+----------------------------+------+-------------+----------+-------------+
| ID                                   | Status    | Name                       | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+-----------+----------------------------+------+-------------+----------+-------------+
| 2d15bbc0-2f72-4b40-b33e-98c03f108e82 | error     | limitedDiskSpace           | 20   | LUKS        | false    |             |
| eacf5f09-14a5-4568-ac9f-67ea167bedeb | creating  | ForthAttemptShouldFail     | 20   | LUKS        | false    |             |
| f348606e-b40c-44f2-9715-a69cae6e8462 | available | ThirdAttemptAmpleFreeSpace | 20   | LUKS        | true     |             |
| ff108fcc-47c3-4f87-9f1c-269df98a2091 | available | -                          | 20   | LUKS        | true     |             |
+--------------------------------------+-----------+----------------------------+------+-------------+----------+-------------+

Conversion folder path:
[root@controller-0 conversion]# ll
total 28158024
-rw-r--r--. 1 root 42400 12001017856 Sep 25 06:01 large12g2.qcow2
-rw-r--r--. 1 root 42400 12001017856 Sep 25 07:17 large12g3.qcow2
-rw-------. 1 root 42400  4387831808 Sep 25 07:21 tmp_2UUsIhostgroup@tripleo_ceph
(Both qcow2 images are there only to consume free disk space.)
 

As expected, we fail to create the fourth volume:
(overcloud) [stack@undercloud-0 ~]$ cinder list
+--------------------------------------+-----------+----------------------------+------+-------------+----------+-------------+
| ID                                   | Status    | Name                       | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+-----------+----------------------------+------+-------------+----------+-------------+
| 2d15bbc0-2f72-4b40-b33e-98c03f108e82 | error     | limitedDiskSpace           | 20   | LUKS        | false    |             |
| eacf5f09-14a5-4568-ac9f-67ea167bedeb | error     | ForthAttemptShouldFail     | 20   | LUKS        | false    |             |
| f348606e-b40c-44f2-9715-a69cae6e8462 | available | ThirdAttemptAmpleFreeSpace | 20   | LUKS        | true     |             |
| ff108fcc-47c3-4f87-9f1c-269df98a2091 | available | -                          | 20   | LUKS        | true     |             |
+--------------------------------------+-----------+----------------------------+------+-------------+----------+-------------+

And the tmp file has been removed:
[root@controller-0 conversion]# ll
total 23439496
-rw-r--r--. 1 root 42400 12001017856 Sep 25 06:01 large12g2.qcow2
-rw-r--r--. 1 root 42400 12001017856 Sep 25 07:17 large12g3.qcow2

Ergo, good to verify.

Comment 29 errata-xmlrpc 2020-10-28 18:22:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: openstack-cinder security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:4391