Bug 1504670

Summary: Problem in creating several cinder backups at same time
Product: Red Hat OpenStack    Reporter: Gorka Eguileor <geguileo>
Component: openstack-cinder    Assignee: Gorka Eguileor <geguileo>
Status: CLOSED ERRATA    QA Contact: Avi Avraham <aavraham>
Severity: high    Docs Contact:
Priority: medium
Version: 11.0 (Ocata)    CC: aavraham, cschwede, dciabrin, dhill, ebeaudoi, geguileo, juwu, lkuchlan, mbayer, pgrist, scohen, srevivo, tshefi
Target Milestone: z5    Keywords: Triaged, ZStream
Target Release: 11.0 (Ocata)    Flags: tshefi: automate_bug+
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: openstack-cinder-10.0.6-13.el7ost Doc Type: Bug Fix
Doc Text:
Previously, certain method calls for backup/restore operations would block the eventlet's thread switching. Consequently, operations were slower and connection errors were observed in the database and RabbitMQ logs. With this update, proxy blocking method calls were changed into native threads to prevent blocking. As a result, restore/backup operations are faster and the connection issues are resolved.
Story Points: ---
Clone Of: 1504661
: 1504671 (view as bug list) Environment:
Last Closed: 2018-05-18 16:48:49 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1504671    
Bug Blocks: 1464146, 1504661, 1542607    

Description Gorka Eguileor 2017-10-20 11:26:13 UTC
+++ This bug was initially created as a clone of Bug #1504661 +++

>> Description of problem:

Creating a cinder backup manually works without problems:

cinder --os-tenant-name sandbox create --display_name volume_kris 1
cinder --os-tenant-name sandbox backup-create --display-name back_kris --force 5a06927c-1892-45b8-b8a3-3981dafea875


When we run the scripts below, we see failures:

Creating 10 volumes:

#!/bin/sh
for var in {0..9}
do
  cinder --os-tenant-name sandbox create --display_name volume_kris_$var 1
done

Creating 10 volume backups:
#!/bin/sh
i=0
for var in $(cinder --os-tenant-name sandbox list | grep volume_kris_ |awk '{print $2}')
do
cinder --os-tenant-name sandbox backup-create --display-name back_kris_$i --force $var
i=$((i+1))
done


>> Version-Release number of selected component (if applicable):
OpenStack 9
openstack-cinder-8.1.1-4.el7ost.noarch                      Sat Mar 18 03:48:42 2017
python-cinder-8.1.1-4.el7ost.noarch                         Sat Mar 18 03:42:38 2017
python-cinderclient-1.6.0-1.el7ost.noarch                   Sat Mar 18 03:37:14 2017


>> How reproducible:
Re-run the above scripts.
Note: not all of the backup creations fail.


>> Actual results:
A few cinder backups are not created, ending up stuck in the "error" or "creating" state.

Expected results:
After running the scripts, all the cinder backups are created successfully.

Additional info:
After we increased the client/server timeouts for the MySQL listener in the HAProxy configuration as shown below, we got better results:
  listen mysql
  timeout client 180m
  timeout server 180m

This seems to be caused by Cinder's data compression (a CPU-intensive operation) during backups being done directly in the greenthread, which prevents switching to other greenthreads for the duration of the compression call.

Given enough greenthreads doing compression, they would end up running mostly serially and starving the other threads, including the ones servicing database and RabbitMQ connections.

The solution is to run the compression in a native thread so it doesn't interfere with greenthread switching.
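A minimal sketch of that idea, using the standard-library ThreadPoolExecutor as a stand-in for eventlet's native thread pool (the actual Cinder fix proxies the calls through eventlet.tpool; the function names here are illustrative, not the ones in the patch):

```python
# Offload CPU-bound compression to a native worker thread so a
# cooperative (greenthread) scheduler's other tasks are not starved.
import zlib
from concurrent.futures import ThreadPoolExecutor

_pool = ThreadPoolExecutor(max_workers=4)


def compress_chunk(data):
    # zlib.compress is CPU-intensive; called directly from a greenthread
    # it would block eventlet's thread switching until it returns.
    return zlib.compress(data, 9)


def compress_async(data):
    # Submitting to a native thread pool lets the caller yield while
    # compression runs; eventlet.tpool.execute() plays the analogous
    # role in the real fix, releasing the hub during the call.
    return _pool.submit(compress_chunk, data)


payload = b"cinder backup chunk " * 1024
future = compress_async(payload)
compressed = future.result()
assert zlib.decompress(compressed) == payload
```

With the compression running off the greenthread, DB heartbeats and RabbitMQ traffic keep flowing even when many backups compress data concurrently, which is why the HAProxy timeout workaround above is no longer needed after the fix.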

--- Additional comment from Gorka Eguileor on 2017-09-26 08:39:45 EDT ---

Seems to be the same issue as in bz #1403948

Comment 6 Tzach Shefi 2018-05-07 12:04:37 UTC
Verified on:
openstack-cinder-10.0.6-24.el7ost.noarch

Using loop created 10 volumes from Cirros image to have some data. 

[stack@undercloud-0 ~]$ cinder list
+--------------------------------------+-----------+----------+------+-------------+----------+-------------+
| ID                                   | Status    | Name     | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+-----------+----------+------+-------------+----------+-------------+
| 05a601f9-9548-447b-8e54-5101677850d8 | available | volume_4 | 1    | -           | true     |             |
| 624b991b-404a-40ea-b5c8-6f4a626f162f | available | volume_0 | 1    | -           | true     |             |
| 62a5d3ec-71bb-4505-ac8b-dd50cc5537ba | available | volume_9 | 1    | -           | true     |             |
| 6eae8ea2-0088-4e69-99b9-9fc0bdd3da67 | available | volume_3 | 1    | -           | true     |             |
| 70175086-f5de-49aa-8439-269b89f929d9 | available | volume_1 | 1    | -           | true     |             |
| 7fd66bfb-625a-4550-a49a-268cf28d49d5 | available | volume_5 | 1    | -           | true     |             |
| b3a3a05a-466c-45e1-a91c-2aa486599ddd | available | volume_2 | 1    | -           | true     |             |
| c4ceb5fb-9fcb-4f27-a0a8-a4d10af51edb | available | volume_6 | 1    | -           | true     |             |
| f5e4bc15-bbf3-4a8f-8ae3-8d6499e79101 | available | volume_7 | 1    | -           | true     |             |
| f68ad387-6391-4e36-934f-8cea5355769a | available | volume_8 | 1    | -           | true     |             |
+--------------------------------------+-----------+----------+------+-------------+----------+-------------+


With the second loop, all volumes were backed up at "once" with no problem.
[stack@undercloud-0 ~]$ cat backup.sh 
#!/bin/sh
i=0
for var in $(cinder list | grep volume_ |awk '{print $2}')
do
cinder backup-create --display-name back_$i --force $var
i=$((i+1))
done

cinder backup-list    (while in creating state) 
+--------------------------------------+--------------------------------------+----------+--------+------+--------------+---------------+
| ID                                   | Volume ID                            | Status   | Name   | Size | Object Count | Container     |
+--------------------------------------+--------------------------------------+----------+--------+------+--------------+---------------+
| 0ea6b171-0664-44b9-9b45-b4ed6bc420d0 | f68ad387-6391-4e36-934f-8cea5355769a | creating | back_9 | 1    | 0            | volumebackups |
| 21a10e38-058d-44b6-80a1-e931e2642335 | b3a3a05a-466c-45e1-a91c-2aa486599ddd | creating | back_6 | 1    | 0            | volumebackups |
| 31736f5f-c93e-46e5-bc79-f51add3ad5d6 | 6eae8ea2-0088-4e69-99b9-9fc0bdd3da67 | creating | back_3 | 1    | 0            | volumebackups |
| 5bef8aea-df6f-43d4-8cde-f2b5341b2d4e | c4ceb5fb-9fcb-4f27-a0a8-a4d10af51edb | creating | back_7 | 1    | 0            | volumebackups |
| 6db6448d-4881-4e8d-87ee-edd183ad167f | 62a5d3ec-71bb-4505-ac8b-dd50cc5537ba | creating | back_2 | 1    | 0            | volumebackups |
| 8f674884-4a72-4e34-8628-92ff4bc16a0d | 624b991b-404a-40ea-b5c8-6f4a626f162f | creating | back_1 | 1    | 0            | volumebackups |
| a1178f4c-9cba-4008-8304-c89471b564d2 | f5e4bc15-bbf3-4a8f-8ae3-8d6499e79101 | creating | back_8 | 1    | 0            | volumebackups |
| bdef9994-e9a7-4764-b670-dc9c80cb4a79 | 7fd66bfb-625a-4550-a49a-268cf28d49d5 | creating | back_5 | 1    | 0            | volumebackups |
| c4fb7d7a-9078-4d43-aac3-ad736a867fe0 | 05a601f9-9548-447b-8e54-5101677850d8 | creating | back_0 | 1    | 0            | volumebackups |
| e5df201b-8402-4e42-a7ad-2d59eebe3500 | 70175086-f5de-49aa-8439-269b89f929d9 | creating | back_4 | 1    | 0            | volumebackups |
+--------------------------------------+--------------------------------------+----------+--------+------+--------------+---------------+

Once all backups completed:

cinder backup-list
+--------------------------------------+--------------------------------------+-----------+--------+------+--------------+---------------+
| ID                                   | Volume ID                            | Status    | Name   | Size | Object Count | Container     |
+--------------------------------------+--------------------------------------+-----------+--------+------+--------------+---------------+
| 0ea6b171-0664-44b9-9b45-b4ed6bc420d0 | f68ad387-6391-4e36-934f-8cea5355769a | available | back_9 | 1    | 22           | volumebackups |
| 21a10e38-058d-44b6-80a1-e931e2642335 | b3a3a05a-466c-45e1-a91c-2aa486599ddd | available | back_6 | 1    | 22           | volumebackups |
| 31736f5f-c93e-46e5-bc79-f51add3ad5d6 | 6eae8ea2-0088-4e69-99b9-9fc0bdd3da67 | available | back_3 | 1    | 22           | volumebackups |
| 5bef8aea-df6f-43d4-8cde-f2b5341b2d4e | c4ceb5fb-9fcb-4f27-a0a8-a4d10af51edb | available | back_7 | 1    | 22           | volumebackups |
| 6db6448d-4881-4e8d-87ee-edd183ad167f | 62a5d3ec-71bb-4505-ac8b-dd50cc5537ba | available | back_2 | 1    | 22           | volumebackups |
| 8f674884-4a72-4e34-8628-92ff4bc16a0d | 624b991b-404a-40ea-b5c8-6f4a626f162f | available | back_1 | 1    | 22           | volumebackups |
| a1178f4c-9cba-4008-8304-c89471b564d2 | f5e4bc15-bbf3-4a8f-8ae3-8d6499e79101 | available | back_8 | 1    | 22           | volumebackups |
| bdef9994-e9a7-4764-b670-dc9c80cb4a79 | 7fd66bfb-625a-4550-a49a-268cf28d49d5 | available | back_5 | 1    | 22           | volumebackups |
| c4fb7d7a-9078-4d43-aac3-ad736a867fe0 | 05a601f9-9548-447b-8e54-5101677850d8 | available | back_0 | 1    | 22           | volumebackups |
| e5df201b-8402-4e42-a7ad-2d59eebe3500 | 70175086-f5de-49aa-8439-269b89f929d9 | available | back_4 | 1    | 22           | volumebackups |
+--------------------------------------+--------------------------------------+-----------+--------+------+--------------+---------------+

Looking fine.

Comment 9 errata-xmlrpc 2018-05-18 16:48:49 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1611