Bug 1483160

Summary: On Giveback after A Share Service is Brought Back after Being Shut Down Manila's Access to Shares is Lost
Product: Red Hat OpenStack Reporter: Dustin Schoenbrun <dschoenb>
Component: puppet-manila Assignee: Tom Barron <tbarron>
Status: CLOSED ERRATA QA Contact: Dustin Schoenbrun <dschoenb>
Severity: medium Docs Contact: Don Domingo <ddomingo>
Priority: high    
Version: 10.0 (Newton) CC: jjoyce, jschluet, slinaber, tvignaud
Target Milestone: z6 Keywords: Triaged, ZStream
Target Release: 10.0 (Newton)   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: openstack-tripleo-heat-templates-5.3.3-1.el7ost puppet-manila-9.5.0-2.el7ost Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1485015 Environment:
Last Closed: 2017-11-15 13:45:13 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1485016    
Bug Blocks:    

Description Dustin Schoenbrun 2017-08-18 22:18:26 UTC
Description of problem:
When the active share service stops, a passive share service takes over as the active one. When the share service that was stopped comes back, the shares that were on the other share service become unavailable for Manila to control.

Version-Release number of selected component (if applicable):
openstack-manila-3.0.0-8.el7ost.noarch
openstack-manila-ui-2.5.1-9.el7ost.noarch
puppet-manila-9.5.0-1.el7ost.noarch
python-manilaclient-1.11.0-1.el7ost.noarch
python-manila-3.0.0-8.el7ost.noarch
openstack-manila-share-3.0.0-8.el7ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Use Infrared to deploy OSP-10z4 with Manila, at least 2 controller nodes, and any number of compute nodes. I used a NetApp backend for the storage.
2. Disable the active Manila Share service. Observe that the service will start on another controller node.
3. Create a share on the new share service.
4. Re-enable the disabled share service and observe that the share that was created is no longer controllable by Manila. 

Actual results:
Manila shares created on the other share service become uncontrollable when the first share service is reactivated.

Expected results:
Disruption of the share service shall not impact Manila shares.

Additional info:
I looked into how Cinder does volume service HA: it uses a "hostgroup" for all of the volume services, so that the "hostname" of the volume service does not change when another volume service takes over the active role. Chances are something similar will need to happen for Manila as well.
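
To illustrate why a changing service hostname would orphan shares, here is a minimal Python sketch. It is a simulation, not Manila code: it assumes each share records the host of the service that created it in the form host@backend#pool, and that a running share service only manages shares whose host part matches its own. The names controller-0, controller-1, netapp and pool0 are hypothetical, and "hostgroup" simply stands in for a value shared by all share services.

```python
from dataclasses import dataclass


@dataclass
class Share:
    name: str
    host: str  # assumed format: "<service host>@<backend>#<pool>"


def service_host(share_host: str) -> str:
    """Return the service-host part of a 'host@backend#pool' string."""
    return share_host.split("@", 1)[0]


def manageable_by(shares, active_service_host):
    """Names of the shares whose recorded host matches the active service."""
    return [s.name for s in shares if service_host(s.host) == active_service_host]


# Per-node hostnames: the share was created while controller-1 was active.
shares = [Share("share-a", "controller-1@netapp#pool0")]
# After giveback, controller-0 is active again and no longer matches.
orphaned_view = manageable_by(shares, "controller-0")

# Shared host value: every share service reports the same name.
shares_ha = [Share("share-a", "hostgroup@netapp#pool0")]
ha_view = manageable_by(shares_ha, "hostgroup")

print(orphaned_view)  # no shares visible to the reactivated service
print(ha_view)        # the share stays manageable across failover
```

With per-node hostnames the reactivated service sees none of the shares created in its absence; with a common host value the match holds regardless of which controller is currently active.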

Comment 1 Tom Barron 2017-09-08 09:53:00 UTC
puppet-manila patch 499937 has merged upstream in stable/newton, but we still need to cherry-pick THT patch 499111 after it merges to stable/ocata.

Comment 2 Tom Barron 2017-09-28 11:33:13 UTC
stable/ocata tripleo-heat-templates patch 499111 has been cherry-picked to stable/newton as 508117

Comment 3 Tom Barron 2017-10-10 07:49:52 UTC
508117 has merged upstream to stable/newton

Comment 7 Dustin Schoenbrun 2017-11-06 18:20:49 UTC
Following the procedure I listed above with the OSP-10z6 puddle, shares successfully survived the loss of the controller node where the share service was running: all shares created before the controller was killed were listed and available while it was down. Looks like we're good here.

Comment 9 errata-xmlrpc 2017-11-15 13:45:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:3231