Bug 1324932

Summary: Cinder user enabled multibackend feature causes deployment failures with Ceph
Product: Red Hat OpenStack Reporter: Rajini Karthik <rajini.karthik>
Component: openstack-tripleo-heat-templates Assignee: Giulio Fidente <gfidente>
Status: CLOSED WORKSFORME QA Contact: Arik Chernetsky <achernet>
Severity: urgent Docs Contact:
Priority: unspecified    
Version: 8.0 (Liberty) CC: achernet, arkady_kanevsky, augol, cdevine, christopher_dearborn, gael_rehault, gfidente, John_walsh, jstransk, kurt_hey, mburns, rajini.karthik, randy_perryman, rhel-osp-director-maint, rsussman, sreichar, wayne_allen
Target Milestone: ---   
Target Release: 10.0 (Newton)   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1314073 Environment:
Last Closed: 2016-04-08 10:22:29 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1312000    
Bug Blocks: 1261979, 1310828    
Attachments:
Description Flags
dell-cinder-backends.yaml none
dell-storage-environment.yaml none
network-isolation.yaml none
network-environment.yaml none
storage-environment.yaml none
network-environment.yaml none
dell-environment.yaml none
Diff of templates none
first-boot.yaml none
post-deploy.yaml none
diff_templates_without_cinder_backends none

Description Rajini Karthik 2016-04-07 15:34:31 UTC
Created attachment 1144791 [details]
dell-cinder-backends.yaml

Description of problem:
Using the JS 5.0.0 Beta 9 Director and Beta 9 OSP bits. With the Cinder user-enabled backends feature, the Ceph storage deployment fails and the deployment of the stack eventually times out.



[osp_admin@director pilot]$ heat resource-list overcloud -n 5  | grep FAILED
| CephStorage                               | 3d225960-793c-4f56-b709-e0dbd81f2ea5          | OS::Heat::ResourceGroup                           | CREATE_FAILED      | 2016-04-06T20:42:05 | overcloud


[osp_admin@director pilot]$ heat resource-show overcloud CephStorage
+------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------+
| Property               | Value                                                                                                                                                   |
+------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------+
| attributes             | {                                                                                                                                                       |
|                        |   "attributes": null,                                                                                                                                   |
|                        |   "refs": null                                                                                                                                          |
|                        | }                                                                                                                                                       |
| creation_time          | 2016-04-06T20:42:05                                                                                                                                     |
| description            |                                                                                                                                                         |
| links                  | http://192.168.120.178:8004/v1/17ac6752a2ae4e489fcb6549aa28579a/stacks/overcloud/f800e28e-8ecc-4a0f-a1d3-b76808ebce4d/resources/CephStorage (self)      |
|                        | http://192.168.120.178:8004/v1/17ac6752a2ae4e489fcb6549aa28579a/stacks/overcloud/f800e28e-8ecc-4a0f-a1d3-b76808ebce4d (stack)                           |
|                        | http://192.168.120.178:8004/v1/17ac6752a2ae4e489fcb6549aa28579a/stacks/overcloud-CephStorage-fk3ejc3mvxp4/3d225960-793c-4f56-b709-e0dbd81f2ea5 (nested) |
| logical_resource_id    | CephStorage                                                                                                                                             |
| physical_resource_id   | 3d225960-793c-4f56-b709-e0dbd81f2ea5                                                                                                                    |
| required_by            | CephStorageCephDeployment                                                                                                                               |
|                        | CephStorageAllNodesDeployment                                                                                                                           |
|                        | CephStorageAllNodesValidationDeployment                                                                                                                 |
|                        | UpdateWorkflow                                                                                                                                          |
|                        | CephStorageNodesPostDeployment                                                                                                                          |
|                        | allNodesConfig                                                                                                                                          |
|                        | AllNodesExtraConfig                                                                                                                                     |
| resource_name          | CephStorage                                                                                                                                             |
| resource_status        | CREATE_FAILED                                                                                                                                           |
| resource_status_reason | CREATE aborted                                                                                                                                          |
| resource_type          | OS::Heat::ResourceGroup                                                                                                                                 |
| updated_time           | 2016-04-06T20:42:05                                                                                                                                     |
+------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------+
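
When the resource-list output needs further processing rather than a one-off grep, the same filter can be scripted. The following is only a minimal sketch: it assumes the pipe-delimited table layout shown above, the function name failed_resources is hypothetical, and the second sample row is invented for illustration.

```python
def failed_resources(table_text):
    """Return (resource_name, status) for each table row whose status ends in _FAILED."""
    failed = []
    for line in table_text.splitlines():
        if not line.lstrip().startswith("|"):
            continue  # skip table borders ("+---+") and blank lines
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        # Look for the status cell by suffix instead of relying on a fixed column index.
        status = next((c for c in cells if c.endswith("_FAILED")), None)
        if status:
            failed.append((cells[0], status))
    return failed


# First row is taken from the output above; the second row is a made-up example.
sample = (
    "| CephStorage | 3d225960-793c-4f56-b709-e0dbd81f2ea5 | OS::Heat::ResourceGroup "
    "| CREATE_FAILED | 2016-04-06T20:42:05 | overcloud\n"
    "| ExampleGroup | example-id | OS::Heat::ResourceGroup | CREATE_COMPLETE | example-time | overcloud"
)

for name, status in failed_resources(sample):
    print(name, status)
```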

See attachments

Version-Release number of selected component (if applicable):
Beta 9 Director
Beta 9 OSP


How reproducible:
Every time

Steps to Reproduce:
1.
2.
3.

Actual results:
Fails with above

Expected results:
Deployment should complete

Additional info:

Comment 1 Rajini Karthik 2016-04-07 15:38:55 UTC
Created attachment 1144793 [details]
dell-storage-environment.yaml

Comment 2 Giulio Fidente 2016-04-07 16:11:21 UTC
Hi, I am trying to investigate this, but the purpose of dell-storage-environment.yaml is not clear. You don't need it to test the user-enabled backends, and it might even collide with that feature depending on what is in the other custom template.

Can you please try using only dell-cinder-backends.yaml and paste the exact deploy command you're using?

Comment 3 Rajini Karthik 2016-04-07 16:20:04 UTC
dell-storage-environment.yaml may not be relevant in this context. We use a number of environment files. We don't have a setup that deploys with only dell-cinder-backends.yaml.

Is there anything else that might help to debug the issue?

Comment 4 Mike Burns 2016-04-07 16:37:24 UTC
Can you provide the deploy command line and the contents of any environment yamls you're passing?

Also, if you can get on Freenode IRC and reach out to us on #tripleo or #rdo, we can likely get to the root of this more quickly.

Comment 5 Rajini Karthik 2016-04-07 17:35:59 UTC
This is the command that runs the overcloud deploy. I will upload the YAML files as attachments.

016-04-06 15:41:54.696 20221 INFO openstackclient.shell [ admin None] START with options: ['overcloud', 'deploy', '--debug', '--log-file', '/home/osp_admin/pilot/overcloud_deployment.log', '-t', '380', '--templates', '/home/osp_admin/pilot/templates/overcloud', '-e', '/home/osp_admin/pilot/templates/overcloud/environments/network-isolation.yaml', '-e', '/home/osp_admin/pilot/templates/network-environment.yaml', '-e', '/home/osp_admin/pilot/templates/overcloud/environments/storage-environment.yaml', '-e', '/home/osp_admin/pilot/templates/dell-environment.yaml', '-e', '/usr/share/openstack-tripleo-heat-templates/environments/puppet-pacemaker.yaml', '-e', '/home/osp_admin/pilot/templates/dell-cinder-backends.yaml', '--control-flavor', 'control', '--compute-flavor', 'compute', '--ceph-storage-flavor', 'ceph-storage', '--swift-storage-flavor', 'swift-storage', '--block-storage-flavor', 'block-storage', '--neutron-public-interface', 'bond1', '--neutron-network-type', 'vlan', '--neutron-disable-tunneling', '--os-auth-url', 'http://192.168.120.178:5000/v2.0', '--os-project-name', 'admin', '--os-user-id', 'admin', '--os-password', 'd24ec4dbe2e7b48ee57497bb74bf0a7152066c42', '--control-scale', '3', '--compute-scale', '3', '--ceph-storage-scale', '3', '--ntp-server', '0.centos.pool.ntp.org', '--neutron-network-vlan-ranges', 'physint:201:220,physext', '--neutron-bridge-mappings', 'physint:br-tenant,physext:br-ex']
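
Since the order of the environment files matters (as far as I understand, Heat merges them in the order given and later files take precedence), it can help to pull the '-e' arguments out of the logged options list. This is a rough sketch only: the helper name environment_files is hypothetical, and the list below is a shortened excerpt of the logged options, not the full command.

```python
import ast


def environment_files(options):
    """Return the value following each '-e' option, in command-line order."""
    return [options[i + 1] for i, arg in enumerate(options[:-1]) if arg == "-e"]


# Shortened excerpt of the "START with options" list from the deployment log.
logged = (
    "['overcloud', 'deploy', '--templates', '/home/osp_admin/pilot/templates/overcloud', "
    "'-e', '/home/osp_admin/pilot/templates/network-environment.yaml', "
    "'-e', '/home/osp_admin/pilot/templates/dell-environment.yaml', "
    "'-e', '/home/osp_admin/pilot/templates/dell-cinder-backends.yaml', "
    "'--control-scale', '3']"
)

# literal_eval safely turns the logged Python list repr back into a list of strings.
options = ast.literal_eval(logged)

for position, path in enumerate(environment_files(options), start=1):
    print(position, path)
```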

Comment 7 Rajini Karthik 2016-04-07 17:39:25 UTC
Created attachment 1144823 [details]
network-isolation.yaml

Comment 8 Rajini Karthik 2016-04-07 17:40:00 UTC
Created attachment 1144824 [details]
network-environment.yaml

Comment 9 Rajini Karthik 2016-04-07 17:40:30 UTC
Created attachment 1144825 [details]
storage-environment.yaml

Comment 10 Rajini Karthik 2016-04-07 17:41:16 UTC
Created attachment 1144826 [details]
network-environment.yaml

Comment 11 Rajini Karthik 2016-04-07 17:41:50 UTC
Created attachment 1144827 [details]
dell-environment.yaml

Comment 12 Rajini Karthik 2016-04-07 18:13:21 UTC
The overcloud eventually times out, after being stuck on Ceph deployment steps

Comment 13 Rajini Karthik 2016-04-07 18:33:15 UTC
Created attachment 1144835 [details]
Diff of templates

diff -rNu /home/osp_admin/pilot/templates/overcloud /usr/share/openstack-tripleo-heat-templates > diff_of_templates

Comment 14 Giulio Fidente 2016-04-07 19:29:08 UTC
I was able to deploy using both your dell-cinder-backends.yaml and ceph successfully. If there is a problem I think it is elsewhere.

From the diff you attached I see you're editing the Ceph hieradata, and I think I spotted a syntax error:

-ceph::profile::params::osds:
-  'sdd/dev/':
-    journal: '/dev/sdb'

Should it read '/dev/sdd'?

Also, it is not necessary to edit the hieradata files, so you don't need to copy the templates. Instead, you can use the stock templates (switching all references to /usr/share/openstack-tripleo-heat-templates) and use a custom environment file for those parameters which need to be overridden, for example:

parameter_defaults:
  ExtraConfig:
    ceph::profile::params::osd_journal_size: 10000
    ceph::profile::params::osd_pool_default_pg_num: 256
    ceph::profile::params::osds:
      '/dev/sdd':
        journal: '/dev/sdb'

Comment 15 Giulio Fidente 2016-04-07 20:27:12 UTC
I noticed you're using some custom first-boot and post-deployment scripts too:

resource_registry:
    OS::TripleO::NodeUserData: /home/osp_admin/pilot/templates/first-boot.yaml
    OS::TripleO::NodeExtraConfigPost: /home/osp_admin/pilot/templates/post-deploy.yaml

Can you attach those two files as well?

Comment 16 Mike Burns 2016-04-07 21:36:02 UTC
This bug did not make the OSP 8.0 release.  It is being deferred to OSP 10.

Comment 17 Rajini Karthik 2016-04-07 22:13:21 UTC
Created attachment 1144904 [details]
first-boot.yaml

Comment 18 Rajini Karthik 2016-04-07 22:13:51 UTC
Created attachment 1144905 [details]
post-deploy.yaml

Comment 19 Rajini Karthik 2016-04-07 22:45:26 UTC
Created attachment 1144906 [details]
diff_templates_without_cinder_backends

Comment 20 Rajini Karthik 2016-04-08 03:13:13 UTC
Giulio was correct. The disk formats were old. Fixing them and generating the correct ceph.yaml file solved the issue.

Was able to successfully deploy ceph, dell eqlx and dell storage center backends on my Stamp.

QA stamps and FutureVille 2 stamps pending tests

Comment 21 Giulio Fidente 2016-04-08 10:03:20 UTC
Hi, thanks for the update.

Unfortunately, the syntax error was in a parameter value, not in a parameter name. Heat does not know the allowed values for each and every parameter, so validating this before the deployment is quite complex.
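
A site-specific check can still catch this particular class of mistake before a deployment is started. The sketch below is not part of Heat or the templates. It assumes the osds hash has already been loaded into a Python dict, and it assumes that in this environment every OSD and journal is an absolute /dev/ path (which is not true in general, since an OSD can also be a directory). The name find_bad_osd_entries is hypothetical.

```python
import re

# Assumption for this environment only: OSD disks and journals are absolute /dev/ paths.
DEVICE_PATH = re.compile(r"^/dev/[A-Za-z0-9_./-]+$")


def find_bad_osd_entries(osds):
    """Return a list of problems found in a ceph::profile::params::osds mapping."""
    problems = []
    for device, options in osds.items():
        if not DEVICE_PATH.match(device):
            problems.append("OSD key %r is not an absolute /dev/ path" % device)
        journal = (options or {}).get("journal")
        if journal is not None and not DEVICE_PATH.match(journal):
            problems.append(
                "journal %r for %r is not an absolute /dev/ path" % (journal, device)
            )
    return problems


# The entry from the attached diff, and the corrected form suggested in comment 14.
broken = {"sdd/dev/": {"journal": "/dev/sdb"}}
fixed = {"/dev/sdd": {"journal": "/dev/sdb"}}

for problem in find_bad_osd_entries(broken):
    print(problem)
```

Running the same function on the corrected mapping returns an empty list, so the check could be wired into whatever script launches the deploy command.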

Waiting on the results from the other environments.

Comment 22 Mike Burns 2016-04-08 10:22:29 UTC
Per comment 20, closing this bug. If the issue reappears, please reopen.