Bug 1304683
Summary: Cannot delete volume after cinder-volume has moved to another pcmk controller node

Product: Red Hat OpenStack
Component: openstack-tripleo-heat-templates
Version: 7.0 (Kilo)
Target Release: 7.0 (Kilo)
Target Milestone: y3
Hardware: x86_64
OS: Linux
Status: CLOSED ERRATA
Severity: high
Priority: high
Keywords: ZStream
Reporter: Christian Horn <chorn>
Assignee: Giulio Fidente <gfidente>
QA Contact: Udi Shkalim <ushkalim>
CC: athomas, chorn, dbecker, dmacpher, dmesser, eharney, fdinitto, gfidente, jslagle, mburns, morazi, mtessun, nlevinki, ralf.boernemeier, rhel-osp-director-maint, sgotliv, yeylon
Fixed In Version: openstack-tripleo-heat-templates-0.8.6-119.el7ost
Doc Type: Bug Fix
Doc Text: In an Overcloud with HA Controller nodes, the 'cinder-volume' service might move to a new node. This causes problems modifying and deleting volumes, because the volume service's hostname changes. This fix sets a consistent hostname for the 'cinder-volume' service on all Controller nodes. Users can now modify and delete volumes on an HA Overcloud without issue.
Type: Bug
Clone Of: 1303843
Bug Depends On: 1303843
Bug Blocks: 1290377
Last Closed: 2016-02-18 16:52:20 UTC
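For reference, the fix discussed in the comments below amounts to pinning the volume service name in cinder.conf identically on every HA controller. A minimal sketch of the resulting configuration (the value `hostgroup` is what the verified deployment reports; any value shared by all controllers would work):

```ini
; /etc/cinder/cinder.conf -- identical on every HA Controller node
[DEFAULT]
; Pin the service name so cinder-volume registers as "hostgroup@<backend>"
; regardless of which pacemaker node it is currently running on.
host=hostgroup
; The per-backend backend_host option is deliberately left unset here;
; if set, it would override [DEFAULT]/host for that backend.
```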
Comment 1
Sergey Gotliv
2016-02-05 09:29:36 UTC
---
Comment 2
Angus Thomas

Hi Fabio,

What's the correct path to address the issue that Sergey is referring to in the bug linked from his comment above?

Thanks

---
Comment 3
Fabio Massimo Di Nitto

(In reply to Angus Thomas from comment #2)
> Hi Fabio,
>
> What's the correct path to address the issue that Sergey is referring to in
> the bug linked from his comment above?
>
> Thanks

OSPd needs to set:

[DEFAULT]
host=$somevalue

in cinder.conf, and it has to be the same on all controller nodes.

That value could be anything, really; my suggestion would be to keep it simple:

overcloud-$overcloudname-cinder-host

---
Fabio Massimo Di Nitto

(In reply to Fabio Massimo Di Nitto from comment #3)
> OSPd needs to set:
>
> [DEFAULT]
> host=$somevalue
>
> in cinder.conf, and it has to be the same on all controller nodes.
>
> That value could be anything, really; my suggestion would be to keep it
> simple:
>
> overcloud-$overcloudname-cinder-host

Correcting my reply. The above is OK for new installs.

For updates and upgrades, this is going to be very complex, depending on how and where volumes were created before the process, and on the current setup.

On updates/upgrades, if host is set and it's the same on all controllers, then we should be OK. But if it's not the same, then we can't just sanitize it, otherwise current volumes will become unavailable.

I think the best would be to have Sergey's team involved to define a proper action plan.

---
Angus Thomas

Hi Giulio,

Can you work up a patch for the simpler new-install case, to be backported to 7.3?

Thanks,
Angus

---
Comment 7
Giulio Fidente

The Cinder config option was also renamed from 'host' into 'backend_host' in Kilo.

Other backends in addition to NFS could be affected by the same problem, including dellsc, eqlx and netapp.

---
Ralf

(In reply to Giulio Fidente from comment #7)
> The Cinder config option was also renamed from 'host' into 'backend_host'
> in Kilo.

Hi Giulio,

can you please tell me the option name with regard to the different OSP releases?

Kilo (OSP 7.x): backend_host = <unique value>
Liberty (OSP 8.x): host = <unique value> or backend_host = <unique value>

Background of this question: I have deployed OSP 8 Beta 4 with a Controller Pre-Configuration yaml script which configures the following line in "/etc/cinder/cinder.conf" on all 3 Controller nodes:

[DEFAULT]
backend_host = osp8br2-controller.localdomain

The value "backend_host = osp8br2-controller.localdomain" seems to be ignored --> check cinder service-list:

[stack@osp8bdr2 ~(UC)]$ cinder service-list
+------------------+------------------------------------------------+------+---------+-------+----------------------------+-----------------+
| Binary | Host | Zone | Status | State | Updated_at | Disabled Reason |
+------------------+------------------------------------------------+------+---------+-------+----------------------------+-----------------+
| cinder-scheduler | osp8br2-controller-0.localdomain | nova | enabled | up | 2016-02-09T08:01:22.000000 | - |
| cinder-scheduler | osp8br2-controller-1.localdomain | nova | enabled | up | 2016-02-09T08:01:23.000000 | - |
| cinder-scheduler | osp8br2-controller-2.localdomain | nova | enabled | up | 2016-02-09T08:01:22.000000 | - |
| cinder-volume | osp8br2-controller-0.localdomain@tripleo_iscsi | nova | enabled | up | 2016-02-09T08:01:23.000000 | - |
| cinder-volume | osp8br2-controller-0.localdomain@tripleo_nfs | nova | enabled | up | 2016-02-09T08:01:22.000000 | - |
+------------------+------------------------------------------------+------+---------+-------+----------------------------+-----------------+

Thanks for your help!

Regards,
Ralf

---
Giulio Fidente

Hi Ralf,

it's only the per-backend setting which got renamed from host to backend_host. In the DEFAULT stanza, host is still the right key to use.

In the change at [1] I am trying to set this globally, using host, as one can always override the backend_host via hiera for a particular backend. Do you think that could work?

---
Ralf

Hi Giulio,

Thanks for the clarification. I'm not a developer, so I can't really answer your question ;-)

Anyway, from what I understood, it should be OK to have "host" set in the [DEFAULT] stanza of cinder.conf and have the possibility to override this for a specific backend using extradata. Sounds reasonable to me.

Regards,
Ralf

---
Comment 11
Mike Orazi

Fabio,

I think we understand the initial setup case. How do we cover the failover case?

---
Fabio Massimo Di Nitto

(In reply to Mike Orazi from comment #11)
> Fabio,
>
> I think we understand the initial setup case. How do we cover the failover
> case?

As long as host and backend_host are the same across all the nodes, there is nothing else you need to do.

ctrl1:
host=overcloud-foo
backend_host=overcloud-foo

ctrl2:
host=overcloud-foo
backend_host=overcloud-foo

... etc ...

Then you are all set. cinder-volume will use those values consistently, regardless of which node it is started on, migrated to, or running on.

---
Comment 13
Giulio Fidente

Fabio, the proposed patch drops use of backend_host and always sets:

host=hostgroup

on all controllers, which I think fixes the 'new deployment' scenario for all backend drivers. Eric, can you confirm this?

backend_host is left unset; deployments which were using it will continue to see it set to what it was on update, because puppet won't clean it.

It will still be possible to override, on update, any backend_host previously set by providing a new value via hieradata (ExtraConfig).

Does it look like a viable plan?

---
Fabio Massimo Di Nitto

(In reply to Giulio Fidente from comment #13)
> Fabio, the proposed patch drops use of backend_host and always sets:
>
> host=hostgroup
>
> on all controllers, which I think fixes the 'new deployment' scenario for
> all backend drivers. Eric, can you confirm this?

I think so, but I am really not an expert in cinder stuff, so whatever the cinder guys ACK is good for me.

---

We still need confirmation from the cinder folks on what the right fix is here for the update scenario. The assumption we need confirmation of is:

For new installs we will set:

[DEFAULT]
host=overcloud

and will not set any backend_host values for any configured backends; this will fix the problem for new installs.

For updates, where host/backend_host are already configured differently and volumes have been created (or not, we don't really know), we will still set:

[DEFAULT]
host=overcloud

but not unset or change any configured backend_host values. The assumption is that this means existing volumes will continue to function, unless cinder-volume needs to fail over for some reason.

---

After discussing with Giulio, we're going to go with the patches as-is; they should be update-safe for everything but LVM, which has no failover anyway since the volumes aren't replicated.

Giulio also confirmed that backend_host will always override [DEFAULT]/host:

https://github.com/openstack/cinder/blob/c9eef31820dc385a2c9f4ba24dd1d194f9e7d088/cinder/cmd/all.py#L89

---
Udi Shkalim

I have recreated the original steps to reproduce and hit the same error, using openstack-tripleo-heat-templates-0.8.6-119.el7ost. Am I missing any steps to reproduce here?

Thanks,
Udi

---
Udi Shkalim

Verified on: openstack-tripleo-heat-templates-0.8.6-120.el7ost.noarch

Using the original steps to reproduce:

[stack@undercloud ~]$ cinder list
+--------------------------------------+-----------+--------------+------+-------------+----------+-------------+
| ID | Status | Display Name | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+-----------+--------------+------+-------------+----------+-------------+
| fd603d0e-e45b-4155-b14a-bf08909b3aea | available | - | 1 | - | false | |
+--------------------------------------+-----------+--------------+------+-------------+----------+-------------+
[stack@undercloud ~]$ cinder delete fd603d0e-e45b-4155-b14a-bf08909b3aea
[stack@undercloud ~]$ cinder list
+----+--------+--------------+------+-------------+----------+-------------+
| ID | Status | Display Name | Size | Volume Type | Bootable | Attached to |
+----+--------+--------------+------+-------------+----------+-------------+
+----+--------+--------------+------+-------------+----------+-------------+
[stack@undercloud ~]$ cinder create --display-name vol1 1
+---------------------+--------------------------------------+
| Property | Value |
+---------------------+--------------------------------------+
| attachments | [] |
| availability_zone | nova |
| bootable | false |
| created_at | 2016-02-16T11:04:52.827961 |
| display_description | None |
| display_name | vol1 |
| encrypted | False |
| id | e3dc4b2a-e68c-44ac-84a5-1c3206f976df |
| metadata | {} |
| multiattach | false |
| size | 1 |
| snapshot_id | None |
| source_volid | None |
| status | creating |
| volume_type | None |
+---------------------+--------------------------------------+
[stack@undercloud ~]$ cinder list
+--------------------------------------+-----------+--------------+------+-------------+----------+-------------+
| ID | Status | Display Name | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+-----------+--------------+------+-------------+----------+-------------+
| e3dc4b2a-e68c-44ac-84a5-1c3206f976df | available | vol1 | 1 | - | false | |
+--------------------------------------+-----------+--------------+------+-------------+----------+-------------+
[stack@undercloud ~]$ cinder show vol1
+---------------------------------------+--------------------------------------+
| Property | Value |
+---------------------------------------+--------------------------------------+
| attachments | [] |
| availability_zone | nova |
| bootable | false |
| created_at | 2016-02-16T11:04:52.000000 |
| display_description | None |
| display_name | vol1 |
| encrypted | False |
| id | e3dc4b2a-e68c-44ac-84a5-1c3206f976df |
| metadata | {} |
| multiattach | false |
| os-vol-host-attr:host | hostgroup@tripleo_nfs#tripleo_nfs |
| os-vol-mig-status-attr:migstat | None |
| os-vol-mig-status-attr:name_id | None |
| os-vol-tenant-attr:tenant_id | ade04b2ae4f643ab8537074700757e8e |
| os-volume-replication:driver_data | None |
| os-volume-replication:extended_status | None |
| size | 1 |
| snapshot_id | None |
| source_volid | None |
| status | available |
| volume_type | None |
+---------------------------------------+--------------------------------------+

[root@overcloud-controller-0 ~]# crm_resource --resource openstack-cinder-volume --locate
resource openstack-cinder-volume is running on: overcloud-controller-0
[root@overcloud-controller-0 ~]# crm_resource --resource openstack-cinder-volume --move
WARNING: Creating rsc_location constraint 'cli-ban-openstack-cinder-volume-on-overcloud-controller-0' with a score of -INFINITY for resource openstack-cinder-volume on overcloud-controller-0.
This will prevent openstack-cinder-volume from running on overcloud-controller-0 until the constraint is removed using the 'crm_resource --clear' command or manually with cibadmin
This will be the case even if overcloud-controller-0 is the last node in the cluster
This message can be disabled with --quiet
[root@overcloud-controller-0 ~]# crm_resource --resource openstack-cinder-volume --locate
resource openstack-cinder-volume is NOT running
[root@overcloud-controller-0 ~]# crm_resource --resource openstack-cinder-volume --locate
resource openstack-cinder-volume is running on: overcloud-controller-1

[stack@undercloud ~]$ . overcloudrc
[stack@undercloud ~]$ cinder service-list
+------------------+-----------------------+------+---------+-------+----------------------------+-----------------+
| Binary | Host | Zone | Status | State | Updated_at | Disabled Reason |
+------------------+-----------------------+------+---------+-------+----------------------------+-----------------+
| cinder-scheduler | hostgroup | nova | enabled | up | 2016-02-16T11:07:31.000000 | - |
| cinder-volume | hostgroup@tripleo_nfs | nova | enabled | up | 2016-02-16T11:07:31.000000 | - |
+------------------+-----------------------+------+---------+-------+----------------------------+-----------------+
[stack@undercloud ~]$ cinder show vol1
+---------------------------------------+--------------------------------------+
| Property | Value |
+---------------------------------------+--------------------------------------+
| attachments | [] |
| availability_zone | nova |
| bootable | false |
| created_at | 2016-02-16T11:04:52.000000 |
| display_description | None |
| display_name | vol1 |
| encrypted | False |
| id | e3dc4b2a-e68c-44ac-84a5-1c3206f976df |
| metadata | {} |
| multiattach | false |
| os-vol-host-attr:host | hostgroup@tripleo_nfs#tripleo_nfs |
| os-vol-mig-status-attr:migstat | None |
| os-vol-mig-status-attr:name_id | None |
| os-vol-tenant-attr:tenant_id | ade04b2ae4f643ab8537074700757e8e |
| os-volume-replication:driver_data | None |
| os-volume-replication:extended_status | None |
| size | 1 |
| snapshot_id | None |
| source_volid | None |
| status | available |
| volume_type | None |
+---------------------------------------+--------------------------------------+
[stack@undercloud ~]$ cinder list
+--------------------------------------+-----------+--------------+------+-------------+----------+-------------+
| ID | Status | Display Name | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+-----------+--------------+------+-------------+----------+-------------+
| e3dc4b2a-e68c-44ac-84a5-1c3206f976df | available | vol1 | 1 | - | false | |
+--------------------------------------+-----------+--------------+------+-------------+----------+-------------+
[stack@undercloud ~]$ cinder delete vol1
[stack@undercloud ~]$ cinder list
+----+--------+--------------+------+-------------+----------+-------------+
| ID | Status | Display Name | Size | Volume Type | Bootable | Attached to |
+----+--------+--------------+------+-------------+----------+-------------+
+----+--------+--------------+------+-------------+----------+-------------+

---

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0264.html
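The host/backend_host precedence confirmed in the thread (a per-backend backend_host, when set, always overrides [DEFAULT]/host) can be sketched as a small model. This is illustrative only, not cinder source; the function name is hypothetical:

```python
# Sketch of how cinder-volume derives the name under which a backend
# registers itself ("<host>@<backend>"). A per-backend backend_host,
# when set, takes precedence over the [DEFAULT] host value.
def effective_volume_host(default_host, backend_name=None, backend_host=None):
    """Return the service name a cinder-volume backend would report."""
    host = backend_host or default_host  # backend_host wins when set
    return "%s@%s" % (host, backend_name) if backend_name else host

# With the fix: every controller sets host=hostgroup and leaves
# backend_host unset, so the name is stable across pacemaker failovers.
print(effective_volume_host("hostgroup", "tripleo_nfs"))
# -> hostgroup@tripleo_nfs

# Before the fix: the per-node hostname leaked into the service name,
# so a failover to another controller orphaned existing volumes.
print(effective_volume_host("overcloud-controller-0.localdomain", "tripleo_nfs"))
# -> overcloud-controller-0.localdomain@tripleo_nfs
```

This mirrors why the verification above shows `hostgroup@tripleo_nfs` both before and after moving the pacemaker resource between controllers.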