Bug 1321453

Summary: OSP director failed to set Netapp unified driver
Product: Red Hat OpenStack
Reporter: Yogev Rabl <yrabl>
Component: openstack-tripleo-heat-templates
Assignee: Tom Barron <tbarron>
Status: CLOSED NOTABUG
QA Contact: lkuchlan <lkuchlan>
Severity: medium
Docs Contact:
Priority: high
Version: 8.0 (Liberty)
CC: dcain, eharney, jcoufal, lkuchlan, mburns, mcornea, morazi, ohochman, pgrist, rhel-osp-director-maint, sclewis, tbarron, tshefi, yrabl
Target Milestone: ---
Keywords: Triaged, ZStream
Target Release: 10.0 (Newton)
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-12-15 16:19:07 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Attachments:
Description Flags
Cinder's logs none

Description Yogev Rabl 2016-03-27 10:16:42 UTC
Description of problem:
An overcloud was deployed with the NetApp unified driver as Cinder's back end. The available NetApp is a 7-mode system using the iSCSI protocol, and the deployment used these parameters in the YAML file:
resource_registry:
    OS::TripleO::ControllerExtraConfigPre: /usr/share/openstack-tripleo-heat-templates/puppet/extraconfig/pre_deploy/controller/cinder-netapp.yaml

parameter_defaults:
    CinderEnableNetappBackend: true
    CinderEnableIscsiBackend: false
    CinderNetappServerPort: '80'
    CinderNetappSizeMultiplier: '1.2'
    CinderNetappStorageFamily: 'ontap_7mode'
    CinderNetappStorageProtocol: 'iscsi'
    CinderNetappTransportType: 'http'
    CinderNetappBackendName: 'tripleo_netapp'
    CinderNetappLogin: 'automation'
    CinderNetappVolumeList: 'vol_rhos_auto_iscsi'
    CinderNetappServerHostname: netapp.qa.lab.tlv.redhat.com
    CinderNetappPassword: <password> 

The result in Cinder's configuration file is the following section: 
[tripleo_netapp]
netapp_login=automation
netapp_vfiler=
netapp_password=<password>
nfs_shares_config=/etc/cinder/shares.conf
netapp_storage_pools=
netapp_sa_password=
netapp_server_hostname=netapp.qa.lab.tlv.redhat.com
netapp_size_multiplier=1.2
thres_avl_size_perc_stop=60
netapp_storage_protocol=iscsi
netapp_webservice_path=/devmgr/v2
volume_driver=cinder.volume.drivers.netapp.common.NetAppDriver
netapp_controller_ips=
netapp_volume_list=vol_rhos_auto_iscsi
netapp_storage_family=ontap_7mode
expiry_thres_minutes=720
netapp_server_port=80
netapp_partner_backend_name=
netapp_eseries_host_type=linux_dm_mp
thres_avl_size_perc_start=20
volume_backend_name=tripleo_netapp
netapp_copyoffload_tool_path=
netapp_transport_type=http
netapp_vserver=

There are several parameters that are not relevant to the back end that we are trying to set. These parameters are causing Cinder to malfunction - volumes cannot be created. 
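
A minimal sketch, using only Python's standard configparser, of how the options with empty values in a back end stanza could be listed for review. The function name empty_options and the shortened SAMPLE text are hypothetical and only modeled on the [tripleo_netapp] section above. The sketch does not decide whether an empty option is harmful; it only makes the candidates visible.

```python
import configparser

def empty_options(conf_text, section):
    """Return the sorted option names in `section` whose value is empty."""
    # default_section is set to an unused name so that a real [DEFAULT]
    # section is not merged into every back end stanza.
    parser = configparser.ConfigParser(
        interpolation=None, strict=False, default_section="__unused__"
    )
    parser.read_string(conf_text)
    return sorted(
        name for name, value in parser.items(section) if value.strip() == ""
    )

# Hypothetical, shortened excerpt modeled on the stanza in this report.
SAMPLE = """
[tripleo_netapp]
netapp_login=automation
netapp_vfiler=
netapp_storage_pools=
netapp_storage_protocol=iscsi
netapp_volume_list=vol_rhos_auto_iscsi
netapp_vserver=
"""

if __name__ == "__main__":
    for option in empty_options(SAMPLE, "tripleo_netapp"):
        print(option)
```

Against a deployed controller, the same function could be given the contents of /etc/cinder/cinder.conf instead of SAMPLE.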

Version-Release number of selected component (if applicable):
openstack-tripleo-common-0.3.0-3.el7ost.noarch
openstack-tripleo-image-elements-0.9.9-1.el7ost.noarch
openstack-tripleo-heat-templates-kilo-0.8.12-2.el7ost.noarch
openstack-tripleo-puppet-elements-0.0.5-1.el7ost.noarch
openstack-tripleo-heat-templates-0.8.12-2.el7ost.noarch
python-tripleoclient-0.3.1-1.el7ost.noarch
openstack-tripleo-0.0.7-1.el7ost.noarch


How reproducible:
100%

Steps to Reproduce:
1. Deploy an overcloud with Netapp as the back end of Cinder

Actual results:
The back end fails to function

Expected results:
The back end is set with the parameters that we've set in the YAML file and the other parameters are not part of the back end configuration in Cinder's configuration file.

Additional info:

Comment 2 James Slagle 2016-03-28 18:23:47 UTC
Please describe how "These parameters are causing Cinder to malfunction"...a traceback, an error, cinder logs, what you do to actually create a volume, etc.

Comment 3 Yogev Rabl 2016-03-29 12:06:05 UTC
Created attachment 1141211 [details]
Cinder's logs

Here are Cinder's logs; please look at volume.log in particular.

You'll also notice that another issue appears after I changed the configuration. That is true, but it is an unrelated IT problem.

Comment 4 Dave Cain 2016-03-29 16:30:25 UTC
Hi!  Can you share the entire cinder.conf file here Yogev? I'm curious why the volume.log has a hostgroup defined of tripleo_netapp_nfs, which could potentially mean a corresponding stanza in cinder.conf.

Comment 5 Yogev Rabl 2016-03-30 07:38:51 UTC
Here's the driver's section:

[tripleo_netapp_nfs]
netapp_storage_family=ontap_7mode
netapp_vfiler=
expiry_thres_minutes=720
netapp_size_multiplier=1.2
netapp_vserver=
netapp_volume_list=
netapp_storage_protocol=nfs
thres_avl_size_perc_start=20
netapp_server_hostname=netapp
netapp_eseries_host_type=linux_dm_mp
nfs_shares_config=/etc/cinder/shares.conf
thres_avl_size_perc_stop=60
netapp_copyoffload_tool_path=
netapp_storage_pools=
netapp_login=automation
volume_backend_name=tripleo_netapp_nfs
netapp_partner_backend_name=
netapp_transport_type=http
netapp_webservice_path=/devmgr/v2
netapp_sa_password=
netapp_password= <password>
netapp_controller_ips=
netapp_server_port=80
volume_driver=cinder.volume.drivers.netapp.common.NetAppDriver

Comment 6 Dave Cain 2016-03-31 02:56:51 UTC
I'm confused.  Was this [tripleo_netapp_nfs] created by an Overcloud deployment by the Director, or something that was added later?  Are there actually two stanzas/backends in your cinder.conf, one with [tripleo_netapp] and one with [tripleo_netapp_nfs]?  The log files you've attached make sense and match what's configured for the tripleo_netapp_nfs backend defined above and it's definitely a misconfiguration.

Change netapp_storage_protocol to iscsi.
Change netapp_server_hostname to the hostname or ip address of the management interface

Restart Cinder and see if that helps.  This may also be of assistance: http://netapp.github.io/openstack-deploy-ops-guide/liberty/content/cinder.7mode.iscsi.configuration.html

Comment 7 James Slagle 2016-03-31 12:01:34 UTC
it would also be useful to attach all of /etc/cinder

Comment 8 James Slagle 2016-03-31 17:48:22 UTC
(In reply to Dave Cain from comment #6)
> I'm confused.  Was this [tripleo_netapp_nfs] created by an Overcloud
> deployment by the Director, or something that was added later?  Are there
> actually two stanzas/backends in your cinder.conf, one with [tripleo_netapp]
> and one with [tripleo_netapp_nfs]?  The log files you've attached make sense
> and match what's configured for the tripleo_netapp_nfs backend defined above
> and it's definitely a misconfiguration.
> 
> Change netapp_storage_protocol to iscsi.

netapp_storage_protocol is set by the CinderNetappStorageProtocol parameter in puppet/extraconfig/pre_deploy/controller/cinder-netapp.yaml in tripleo-heat-templates and its default value is nfs.

should the default value be something different?

that parameter could always be set in parameter_defaults to iscsi if that is the correct value (as a workaround).

> Change netapp_server_hostname to the hostname or ip address of the
> management interface

likewise netapp_server_hostname can be set via the CinderNetappServerHostname parameter.

Comment 9 Yogev Rabl 2016-04-03 07:55:34 UTC
(In reply to James Slagle from comment #7)
> it would also be useful to attach all of /etc/cinder

No, there's no additional information in the directory that would help resolve this issue. The problem lies in the parameters that OSPD sets in /etc/cinder/cinder.conf.

Comment 10 Mike Burns 2016-04-04 11:57:21 UTC
If we get a fix, we'll evaluate and consider it, but this is unlikely to block the release at this point.

Comment 11 Mike Burns 2016-04-07 21:36:02 UTC
This bug did not make the OSP 8.0 release.  It is being deferred to OSP 10.

Comment 13 Tom Barron 2016-10-14 13:10:15 UTC
volume.log in the attachments clearly shows attempts to start an NFS backend, which isn't going to work with a NetApp configured for iSCSI only.

From comment #8, the default netapp storage protocol with Director is NFS.

From comment #1, we see Yogev attempting to override the default by deploying with a pre-deploy heat template.  We can see there that an additional backend configuration stanza, for iscsi NetApp, called 'tripleo_netapp' was added to /etc/cinder/cinder.conf, but the volume log tells us that the 'enabled_backends' setting in that file was not updated to activate that stanza rather than the original [tripleo_netapp_nfs] stanza:

> 2016-03-28 11:10:24.980 23728 DEBUG oslo_service.service [req-634aba5b-2895-4be1-b8ad-16a9f95981b6 - - - - -] enabled_backends               = ['tripleo_netapp_nfs'] log_opt_values /usr/lib/python2.7/site-packages/oslo_config/cfg.py:2229

And the volume log tells a sad story of the cinder volume service trying to connect to that NetApp using an NFS driver when the NetApp was configured for iscsi only.
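
The mismatch described above can be checked mechanically. The following is a minimal sketch, assuming enabled_backends is a comma-separated list in [DEFAULT] and that a back end stanza is any section that sets volume_driver. The function name backend_report and the SAMPLE text are hypothetical; SAMPLE only imitates the two stanzas discussed in this bug and is not the reporter's actual file.

```python
import configparser

def backend_report(conf_text):
    """Compare enabled_backends with the back end stanzas that are defined."""
    # Treat [DEFAULT] as an ordinary section so its options are not
    # inherited by the back end stanzas.
    parser = configparser.ConfigParser(
        interpolation=None, strict=False, default_section="__unused__"
    )
    parser.read_string(conf_text)
    raw = parser.get("DEFAULT", "enabled_backends", fallback="")
    enabled = [name.strip() for name in raw.split(",") if name.strip()]
    # Assumption: a section counts as a back end if it sets volume_driver.
    defined = [s for s in parser.sections() if parser.has_option(s, "volume_driver")]
    return {
        "enabled_without_stanza": [b for b in enabled if b not in defined],
        "stanza_not_enabled": [s for s in defined if s not in enabled],
    }

# Hypothetical file contents modeled on the stanzas quoted in this report.
SAMPLE = """
[DEFAULT]
enabled_backends=tripleo_netapp_nfs

[tripleo_netapp_nfs]
volume_driver=cinder.volume.drivers.netapp.common.NetAppDriver
netapp_storage_protocol=nfs

[tripleo_netapp]
volume_driver=cinder.volume.drivers.netapp.common.NetAppDriver
netapp_storage_protocol=iscsi
"""

if __name__ == "__main__":
    print(backend_report(SAMPLE))
```

With this sample, the iSCSI stanza shows up under stanza_not_enabled, which is the situation the volume log suggests.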

The remaining question is whether the pre-deploy heat template cited in Comment 1 **should** have forced the missing enabled_backends update or if this is a case of misconfiguration.

Yogev, can you speak to your source for the pre-deploy heat template?  I'm wondering if what's needed here may be a doc update.

Comment 14 Yogev Rabl 2016-10-25 08:29:51 UTC
(In reply to Tom Barron from comment #13)

> Yogev, can you speak to your source for the pre-deploy heat template?  I'm
> wondering if what's needed here may be a doc update.

I tend to agree, because we seem to pass this problem.

Comment 16 Tom Barron 2016-12-08 19:46:46 UTC
@Yogev:  First, I know you are in a different DFG now, so if this one needs
handing off to a cinder QE, that's cool.

Second, in his last comment [1]  Yogev wrote "we seem to pass this problem."

Am I right in thinking that in the regular QE automation there are tests with this same NetApp 7-mode iscsi backend that succeed, and that these work with configurations deployed via OSPd?  That is, we don't have some general issue with OSPd deployment of the NetApp iscsi backend for Cinder, but rather an issue with this particular way of doing it?

What seems to be happening with this THT template and overcloud deploy command is that an iscsi configuration stanza is added to /etc/cinder/cinder.conf but the enabled_backends line in that configuration did not get updated:

2 matches for "enabled_backends" in buffer: volume.log
    251:2016-03-28 11:10:24.980 23728 DEBUG oslo_service.service [req-634aba5b-2895-4be1-b8ad-16a9f95981b6 - - - - -] enabled_backends               = ['tripleo_netapp_nfs'] log_opt_values /usr/lib/python2.7/site-packages/oslo_config/cfg.py:2229
    735:2016-03-28 11:12:09.551 29704 DEBUG oslo_service.service [req-dbd51d6a-18b4-483c-a843-946b5ca78639 - - - - -] enabled_backends               = ['tripleo_netapp_nfs'] log_opt_values /usr/lib/python2.7/site-packages/oslo_config/cfg.py:2229

so that even after the volume service was restarted it kept trying to load an NFS backend (and of course your filer is configured for iscsi).
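
The search shown above can be repeated with a short script. This is a minimal sketch, assuming the oslo option dump prints the value as a Python-style list, as in the two matches quoted here. The names PATTERN and enabled_backends_from_log are hypothetical, and SAMPLE_LINES simply reuses the two log lines from this comment.

```python
import ast
import re

# Matches "enabled_backends = [...]" as printed in the log excerpt above.
PATTERN = re.compile(r"enabled_backends\s*=\s*(\[.*?\])")

def enabled_backends_from_log(lines):
    """Return (line_number, backends) for each enabled_backends entry found."""
    found = []
    for number, line in enumerate(lines, start=1):
        match = PATTERN.search(line)
        if match:
            # literal_eval safely turns the printed list into a Python list.
            found.append((number, ast.literal_eval(match.group(1))))
    return found

SAMPLE_LINES = [
    "2016-03-28 11:10:24.980 23728 DEBUG oslo_service.service [req-634aba5b-2895-4be1-b8ad-16a9f95981b6 - - - - -] enabled_backends               = ['tripleo_netapp_nfs'] log_opt_values /usr/lib/python2.7/site-packages/oslo_config/cfg.py:2229",
    "2016-03-28 11:12:09.551 29704 DEBUG oslo_service.service [req-dbd51d6a-18b4-483c-a843-946b5ca78639 - - - - -] enabled_backends               = ['tripleo_netapp_nfs'] log_opt_values /usr/lib/python2.7/site-packages/oslo_config/cfg.py:2229",
]

if __name__ == "__main__":
    for number, backends in enabled_backends_from_log(SAMPLE_LINES):
        print(number, backends)
```

For a real log, the lines could come from iterating over the opened volume.log file instead of SAMPLE_LINES.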

So if you were following one of our docs it would be good to put on record
here which doc that is so that we can determine if it is incorrect or out of date.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1321453#c14
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1321453#c0

Comment 18 Tom Barron 2016-12-15 16:19:07 UTC
I've talked to the QE folks who opened this BZ and who own it now, as well as with our NetApp partners.  This is not a customer-reported issue and appears to be the outcome of a non-recommended approach to doing overcloud-deploy with the NetApp iscsi backend.

So I'm going to close this out.  Re-open if there's any issue and we'll take another look.