Bug 1452704

Summary: [RFE] Backport Cinder Multibackend availability support
Product: Red Hat OpenStack
Reporter: Luca Miccini <lmiccini>
Component: openstack-cinder
Assignee: Gorka Eguileor <geguileo>
Status: CLOSED ERRATA
QA Contact: Tzach Shefi <tshefi>
Severity: high
Docs Contact:
Priority: high
Version: 11.0 (Ocata)
CC: dcadzow, dnavale, dwojewod, eharney, geguileo, pgrist, pkundal, scohen, sputhenp, srevivo
Target Milestone: z4
Keywords: FeatureBackport, FutureFeature, Triaged, ZStream
Target Release: 11.0 (Ocata)
Flags: scohen: needinfo+
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version: openstack-cinder-10.0.6-5.el7ost
Doc Type: Enhancement
Doc Text:
Adds support for multiple Availability Zones within a single Block Storage volume service by defining the Availability Zone in each driver section.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-02-13 16:29:16 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1466000
Bug Blocks: 1488390, 1510931
Attachments:
test patch (flags: none)

Description Luca Miccini 2017-05-19 13:51:35 UTC
Description of problem:

I would like to request the backport of the following cinder feature:

https://blueprints.launchpad.net/cinder/+spec/multibackend-az-support

https://review.openstack.org/#/c/433437/

"Currently, all Cinder services on one node must be part of the same storage availability zone due to this option being in the DEFAULT config section. This made sense when only one backend was specified. But since adding multibackend support, and deprecating support for defining backends in the DEFAULT section, this puts an unnecessary restriction on backend service setup.

The API and scheduler service already need to be made available outside of a given availability zone/fault domain. Since the majority of backends are external devices, making the c-vol service just a control path like the api and scheduler service, it should be fine to include c-vol in with these others. The backends themselves can be in different fault domains. We should therefore allow defining per-backend availability zones."
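
In practice this moves the AZ setting from the global section into the individual backend sections. A minimal cinder.conf sketch of the before/after (section and zone names are illustrative only, not taken from this report):

[DEFAULT]
# before: a single zone applies to every backend on this host
storage_availability_zone = az1

[backend1]
# after the backport: each backend section can declare its own zone
backend_availability_zone = az1

[backend2]
backend_availability_zone = az2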


Version-Release number of selected component (if applicable):

openstack-cinder-10 (ocata)

How reproducible:

always

Steps to Reproduce:
1. verify no backend_availability_zone option is available in cinder.conf (a check is sketched after this list)
2.
3.
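
A hedged way to run the check from step 1 on a RHEL 7 / OSP 11 node (config and package paths are assumptions, not taken from this report):

# grep backend_availability_zone /etc/cinder/cinder.conf
# grep -r backend_availability_zone /usr/lib/python2.7/site-packages/cinder/

Before the backport both greps should come back empty; after it, the second one should find the option defined in the volume driver options.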

Actual results:

1. AZ value cannot be defined on a per-backend basis
2. the cinder-volume service can only manage storage backends in the single AZ defined in cinder.conf

Expected results:

1. cinder to allow the AZ value to be defined on a per-backend basis.
2. a single cinder-volume instance to manage multiple backends in multiple AZs.

Additional info:

Comment 4 Luca Miccini 2017-05-23 10:29:28 UTC
proposed backport: https://review.openstack.org/#/c/467115/

Comment 6 Luca Miccini 2017-05-29 06:12:59 UTC
Created attachment 1283164 [details]
test patch

Comment 14 Tzach Shefi 2018-02-05 09:27:38 UTC
Gorka, let me know if the verification below is sufficient to close this (cloned #3).

Tested on openstack-cinder-10.0.6-9.el7ost.noarch


1. On a system with a Ceph backend, added a second NFS backend.
2. Added backend_availability_zone (dc1 and dc2), one per backend:

[tripleo_ceph]
volume_backend_name=tripleo_ceph
volume_driver=cinder.volume.drivers.rbd.RBDDriver
rbd_ceph_conf=/etc/ceph/ceph.conf
rbd_user=qe
rbd_pool=qe-volumes
rbd_secret_uuid=cdf24729-bd06-46c2-ba47-7018ae220197
backend_host=hostgroup
backend_availability_zone = dc1

[nfs]
volume_backend_name=nfs
volume_driver=cinder.volume.drivers.nfs.NfsDriver
nfs_shares_config=/etc/cinder/nfs_shares.conf
nfs_snapshot_support=True
nas_secure_file_operations=False
nas_secure_file_permissions=False
backend_availability_zone = dc2
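
For these per-backend sections to take effect, the deployment presumably also lists them in enabled_backends under [DEFAULT]; a minimal sketch (section names taken from the config above, the rest of [DEFAULT] omitted):

[DEFAULT]
enabled_backends = tripleo_ceph,nfs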


[stack@undercloud-0 ~]$ cinder extra-specs-list
+--------------------------------------+------+-----------------------------------------+                                                                                        
| ID                                   | Name | extra_specs                             |                                                                                        
+--------------------------------------+------+-----------------------------------------+                                                                                        
| 178b03de-6c4f-4494-9df5-33cee2927fa1 | ceph | {'volume_backend_name': 'tripleo_ceph'} |                                                                                        
| 99fb86be-2d22-48de-8cd9-332b26204335 | nfs  | {'volume_backend_name': 'nfs'}          |                                                                                        
+--------------------------------------+------+-----------------------------------------+
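
The ceph and nfs volume types above were presumably created along these lines (commands assumed, not captured in this report; backend names taken from the extra_specs output):

#cinder type-create ceph
#cinder type-key ceph set volume_backend_name=tripleo_ceph
#cinder type-create nfs
#cinder type-key nfs set volume_backend_name=nfs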


I'd created two volumes, one on each backend, before adding the backend_availability_zone setting, just to make sure the backends were OK:
#cinder create --display-name nfs1 --volume-type nfs 1
#cinder create --display-name ceph1 --volume-type ceph 1

Followed by two more volumes created after setting backend_availability_zone:
#cinder create --display-name nfs2-dc2 --volume-type nfs 1 --availability-zone dc2
#cinder create --display-name ceph1-dc1 --volume-type ceph 1 --availability-zone dc1

cinder list shows all of them as available:
[stack@undercloud-0 ~]$ cinder list
+--------------------------------------+-----------+-----------+------+-------------+----------+-------------+                                                                   
| ID                                   | Status    | Name      | Size | Volume Type | Bootable | Attached to |                                                                   
+--------------------------------------+-----------+-----------+------+-------------+----------+-------------+                                                                   
| 067d4b76-ac44-40a7-ab44-7a1be7307d2d | available | ceph1-dc1 | 1    | ceph        | false    |             |                                                                   
| 5f58935a-940e-467b-ac2f-5e079b179355 | available | nfs2-dc2  | 1    | nfs         | false    |             |                                                                   
| ddb65abf-a7d4-4422-b25d-7afaa97d9d35 | available | ceph1     | 1    | ceph        | false    |             |                                                                   
| fad1ce35-89c9-46fb-9b76-8d533c57923a | available | nfs1      | 1    | nfs         | false    |             |                                                                   
+--------------------------------------+-----------+-----------+------+-------------+----------+-------------+ 

When specifying the wrong --availability-zone, or omitting it altogether (it defaults to none), I get an error status as expected.

#cinder create --display-name ceph1-dc2-fail --volume-type ceph 1 --availability-zone dc2

#cinder create --display-name ceph1-dc2-fail1 --volume-type ceph 1 


| a4109b56-ba39-46c7-9306-f59023caf804 | error     | ceph1-dc2-fail | 1    | ceph        | false    |             |

| 025a5bf7-3d1c-4599-84f8-f5e224c569d9 | error     | ceph1-dc2-fail1 | 1    | ceph        | false    |             |
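
The rejection in these cases presumably comes from the scheduler's AvailabilityZoneFilter; a hedged way to confirm on the controller (log path assumed for this deployment, exact message text may differ between releases):

# grep -i 'no valid' /var/log/cinder/scheduler.log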


Anything else I should check before verification?

Comment 15 Tzach Shefi 2018-02-05 09:30:54 UTC
Also, cinder show displays the correct availability zone for each volume:

cinder show ceph1-dc1
+--------------------------------+--------------------------------------+
| Property                       | Value                                |
+--------------------------------+--------------------------------------+
| attachments                    | []                                   |
| availability_zone              | dc1     

cinder show nfs2-dc2 
+--------------------------------+--------------------------------------+
| Property                       | Value                                |
+--------------------------------+--------------------------------------+
| attachments                    | []                                   |
| availability_zone              | dc2

Comment 16 Tzach Shefi 2018-02-05 09:36:42 UTC
Also adding cinder service-list output:

[stack@undercloud-0 ~]$ cinder service-list
+------------------+------------------------+------+---------+-------+----------------------------+-----------------+
| Binary           | Host                   | Zone | Status  | State | Updated_at                 | Disabled Reason |
+------------------+------------------------+------+---------+-------+----------------------------+-----------------+
| cinder-backup    | hostgroup              | nova | enabled | up    | 2018-02-05T09:36:01.000000 | -               |
| cinder-scheduler | hostgroup              | nova | enabled | up    | 2018-02-05T09:35:57.000000 | -               |
| cinder-volume    | hostgroup@nfs          | dc2  | enabled | up    | 2018-02-05T09:35:52.000000 | -               |
| cinder-volume    | hostgroup@tripleo_ceph | dc1  | enabled | up    | 2018-02-05T09:36:01.000000 | -               |
+------------------+------------------------+------+---------+-------+----------------------------+-----------------+
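
One more check that could complement the above (not run here; the command exists in python-cinderclient): listing the availability zones Cinder exposes, which should now include dc1 and dc2 in addition to nova:

#cinder availability-zone-list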

Comment 17 Gorka Eguileor 2018-02-05 12:11:43 UTC
Verification looks good to me, thanks.

Comment 18 Tzach Shefi 2018-02-05 13:38:50 UTC
Verified on:
openstack-cinder-10.0.6-9.el7ost.noarch
Comments 14-16.

Comment 21 errata-xmlrpc 2018-02-13 16:29:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0306