Bug 1504661 - Problem creating several cinder backups at the same time
Summary: Problem creating several cinder backups at the same time
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-cinder
Version: 10.0 (Newton)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: z7
Target Release: 10.0 (Newton)
Assignee: Gorka Eguileor
QA Contact: Tzach Shefi
URL:
Whiteboard:
Depends On: 1504670 1504671
Blocks: 1464146 1542607
 
Reported: 2017-10-20 11:12 UTC by Gorka Eguileor
Modified: 2022-08-16 11:49 UTC
CC List: 18 users

Fixed In Version: openstack-cinder-9.1.4-23.el7ost
Doc Type: Bug Fix
Doc Text:
Previously, certain method calls for backup/restore operations would block the eventlet's thread switching. Consequently, operations were slower and connection errors were observed in the database and RabbitMQ logs. With this update, proxy blocking method calls were changed into native threads to prevent blocking. As a result, restore/backup operations are faster and the connection issues are resolved.
Clone Of: 1464146
Clones: 1504670 1542607
Environment:
Last Closed: 2018-02-27 16:39:47 UTC
Target Upstream Version:
Embargoed:
tshefi: automate_bug+


Attachments


Links
System ID Private Priority Status Summary Last Updated
Launchpad 1719580 0 None None None 2017-10-20 11:12:31 UTC
OpenStack gerrit 507510 0 None MERGED Run backup compression on native thread 2019-11-13 02:31:20 UTC
OpenStack gerrit 518316 0 None MERGED Run backup-restore operations on native thread 2019-11-13 02:31:19 UTC
Red Hat Issue Tracker OSP-4731 0 None None None 2022-08-16 11:49:31 UTC
Red Hat Product Errata RHBA-2018:0360 0 normal SHIPPED_LIVE openstack-cinder bug fix advisory 2018-02-27 21:35:04 UTC

Description Gorka Eguileor 2017-10-20 11:12:31 UTC
+++ This bug was initially created as a clone of Bug #1464146 +++

>> Description of problem:

Creating a cinder backup manually works fine:

cinder --os-tenant-name sandbox create --display_name volume_kris 1
cinder --os-tenant-name sandbox backup-create --display-name back_kris --force 5a06927c-1892-45b8-b8a3-3981dafea875


When we run the scripts below, we hit problems:

Creating 10 volumes:

#!/bin/sh
for var in {0..9}
do
  cinder --os-tenant-name sandbox create --display_name volume_kris_$var 1
done

Creating 10 backups of the volumes:
#!/bin/sh
i=0
for var in $(cinder --os-tenant-name sandbox list | grep volume_kris_ | awk '{print $2}')
do
  cinder --os-tenant-name sandbox backup-create --display-name back_kris_$i --force $var
  i=$((i+1))
done


>> Version-Release number of selected component (if applicable):
OpenStack 9
openstack-cinder-8.1.1-4.el7ost.noarch                      Sat Mar 18 03:48:42 2017
python-cinder-8.1.1-4.el7ost.noarch                         Sat Mar 18 03:42:38 2017
python-cinderclient-1.6.0-1.el7ost.noarch                   Sat Mar 18 03:37:14 2017


>> How reproducible:
Re-run the scripts above.
Note: not every cinder backup creation fails.


>> Actual results:
A few cinder backups are not created and end up in the "error" or "creating" state.

Expected results:
After the scripts complete, all the cinder backups are created.

Additional info:
After we modified the HAProxy timeouts for the MySQL listener as shown below, we got better results.
  listen mysql
  timeout client 180m
  timeout server 180m

This seems to be caused by Cinder's data compression (a CPU-intensive operation) during backups running directly in the greenthread, which prevents switching to other greenthreads.

With enough greenthreads doing compression, they end up running mostly serially and prevent other greenthreads from running.

The solution would be to run the compression on a native thread so it does not interfere with greenthread switching, as sketched below.
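
A minimal sketch of that approach (illustrative only, not the merged cinder change), assuming eventlet's tpool helpers, which run calls on native OS threads so the hub can keep scheduling the other greenthreads:

# Illustrative sketch only -- not the merged cinder patch. Assumes
# eventlet's tpool module (tpool.execute / tpool.Proxy), which runs
# calls on native OS threads so the eventlet hub keeps scheduling
# other greenthreads (DB heartbeats, RabbitMQ connections, RPC).
import zlib

from eventlet import tpool

compressor = zlib.compressobj()
chunk = b"\x00" * (8 * 1024 * 1024)  # stand-in for one backup chunk

# Run a single CPU-bound call on a native thread: compress() never
# yields to the hub, so executing it via tpool avoids starving the
# other greenthreads.
compressed = tpool.execute(compressor.compress, chunk)
compressed += tpool.execute(compressor.flush)

# The same idea for whole objects: tpool.Proxy wraps an object so every
# blocking method call is proxied to a native thread, matching the
# "proxy blocking method calls" wording in the doc text above.
decompressor = tpool.Proxy(zlib.decompressobj())
restored = decompressor.decompress(compressed)
assert restored == chunk

The actual upstream changes are linked above in the gerrit entries ("Run backup compression on native thread" and "Run backup-restore operations on native thread").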

--- Additional comment from Gorka Eguileor on 2017-09-26 08:39:45 EDT ---

Seems to be the same issue as in bz #1403948

Comment 11 Tzach Shefi 2018-02-08 15:32:16 UTC
Verified on:
openstack-cinder-9.1.4-24.el7ost.noarch

Ran Gorka's scripts to generate 10 volumes (NFS-backed in my case), then backed them up to the Swift backup backend.
All volumes and backups are available.


[stack@undercloud-0 ~]$ cinder backup-list
+--------------------------------------+--------------------------------------+-----------+-------------+------+--------------+---------------+                                                                                                                                
| ID                                   | Volume ID                            | Status    | Name        | Size | Object Count | Container     |                                                                                                                                
+--------------------------------------+--------------------------------------+-----------+-------------+------+--------------+---------------+                                                                                                                                
| 2181a5a8-0fef-4448-9079-77f15e65b81b | 3f7fb4e6-c007-4911-bd42-091c796fc74c | available | back_kris_2 | 1    | 22           | volumebackups |                                                                                                                                
| 4165e80d-39b1-482d-a2d2-bc02e4fa8801 | f6ac45bc-5d82-4102-a1a9-5cae0be1678a | available | back_kris_8 | 1    | 22           | volumebackups |
| 51958304-2472-4a38-80f7-c007e73f4cb2 | 88957b5d-d452-46d3-8a25-32d90fa3b36b | available | back_kris_5 | 1    | 22           | volumebackups |
| 5b561ae2-97d2-4c17-87f1-436f33c142f8 | 154e2445-e2a7-479b-805d-bd6e0a2cf3bc | available | back_kris_0 | 1    | 22           | volumebackups |
| 857a0aa6-1d59-4111-b204-5bcfc1dd50c2 | d32d58de-f324-4b1d-88bd-98805802d10a | available | back_kris_7 | 1    | 22           | volumebackups |
| 9127a127-2504-4db2-93ac-3726964bab25 | c6e3e0d0-41d6-4d01-9329-30010aebcd55 | available | back_kris_6 | 1    | 22           | volumebackups |
| b18ffd27-31ca-4c09-9e01-f49b382bea5c | 7a5e3cc2-420b-4f64-ae43-2fd58018e622 | available | back_kris_4 | 1    | 22           | volumebackups |
| bfe72142-f51f-4c56-bd74-b5e0d3b15fad | 753d375e-2bd6-4747-9a6a-d95d686e413c | available | back_kris_3 | 1    | 22           | volumebackups |
| c3e903e5-4d20-4538-a0bf-116a160dd6c0 | 3d5afdfd-e45a-4d6a-b64d-c7421a90e98f | available | back_kris_1 | 1    | 22           | volumebackups |
| cabaca92-1b60-43f6-93ca-7c31a5cbf36c | 03d10afe-667c-466c-8dd2-7cdfc0ddae13 | available | -           | 1    | 22           | volumebackups |
+--------------------------------------+--------------------------------------+-----------+-------------+------+--------------+---------------+


Reran the same scripts; this time I filled 1 volume with random data,
created an image from that filled volume, and created 10 new volumes from that image.
Again all backups were created successfully, though as expected they took a bit longer to complete.

Comment 13 errata-xmlrpc 2018-02-27 16:39:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0360

