Bug 1251559

Summary: overcloud deploy fails due to ResourceUnknownStatus
Product: Red Hat OpenStack
Reporter: Adriano Petrich <apetrich>
Component: rhosp-director
Assignee: chris alfonso <calfonso>
Status: CLOSED NOTABUG
QA Contact: yeylon <yeylon>
Severity: unspecified
Priority: unspecified
Version: Director
CC: apetrich, hbrock, mburns, rhel-osp-director-maint, sbaker, shardy, srevivo, yeylon
Keywords: Automation, ZStream
Target Release: Director
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2015-08-18 19:29:28 UTC
Type: Bug
Attachments:
deployment log (flags: none)

Description Adriano Petrich 2015-08-07 17:09:37 UTC
Created attachment 1060429
deployment log

Description of problem:
When installing an overcloud, the deployment failed with:

ERROR: openstack Heat Stack create failed.
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/cliff/app.py", line 295, in run_subcommand
    result = cmd.run(parsed_args)
  File "/usr/lib/python2.7/site-packages/cliff/command.py", line 53, in run
    self.take_action(parsed_args)
  File "/usr/lib/python2.7/site-packages/rdomanager_oscplugin/v1/overcloud_deploy.py", line 826, in take_action
    self._deploy_tuskar(stack, parsed_args)
  File "/usr/lib/python2.7/site-packages/rdomanager_oscplugin/v1/overcloud_deploy.py", line 574, in _deploy_tuskar
    parsed_args.timeout)
  File "/usr/lib/python2.7/site-packages/rdomanager_oscplugin/v1/overcloud_deploy.py", line 441, in _heat_deploy
    raise Exception("Heat Stack create failed.")
Exception: Heat Stack create failed.

The previous call had a CREATE_FAILED status:


Resource CREATE failed: ResourceUnknownStatus: Resource failed - Unknown status FAILED due to \"Resource CREATE failed: ResourceUnknownStatus: Resource failed - Unknown status FAILED due to \"Resource CREATE failed: Error: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 6



Version-Release number of selected component (if applicable):
This happened on the gate job for the puddle:

https://rhos-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/rdo_manager-gate_instack_undercloud-downstream-rhos-7_director/29/

How reproducible:


Steps to Reproduce:
1. Run a deployment with: openstack overcloud deploy --debug --log-file overcloud_deployment_71.log --plan overcloud --neutron-network-type vxlan --neutron-tunnel-types vxlan --ntp-server 10.5.26.10 --control-scale 1 --compute-scale 1 --ceph-storage-scale 1 --block-storage-scale 0 --swift-storage-scale 0 --control-flavor baremetal --compute-flavor baremetal --ceph-storage-flavor baremetal --block-storage-flavor baremetal --swift-storage-flavor baremetal
2. The deployment fails with the error shown above.


Actual results:
Deployment fails

Expected results:
Overcloud created 

Additional info:
The job prints a list of debug information at the bottom of its console output.

Comment 3 Adriano Petrich 2015-08-07 17:13:04 UTC
Running heat event-list overcloud shows these errors:


| CephStorageNodesPostDeployment    | e9507fe2-9e6b-46ca-bb22-d1cbae59a508 | ResourceUnknownStatus: Resource failed - Unknown status FAILED due to "Resource CREATE failed: ResourceUnknownStatus: Resource failed - Unknown status FAILED due to "Resource CREATE failed: Error: Deployment to server failed: deploy_status_code : Deployme | CREATE_FAILED      | 2015-08-07T14:49:44Z |
| overcloud                         | f0ca3117-cce5-4cc1-affc-af51ab89118a | Resource CREATE failed: ResourceUnknownStatus: Resource failed - Unknown status FAILED due to "Resource CREATE failed: ResourceUnknownStatus: Resource failed - Unknown status FAILED due to "Resource CREATE failed: Error: Deployment to server failed: deplo | CREATE_FAILED      | 2015-08-07T14:49:45Z |
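
For anyone triaging similar output, here is a minimal Python sketch that pulls the failed resources out of pasted event-list rows. It assumes the five-column layout seen in the rows above (resource name, id, status reason, status, event time) and that the reason text contains no pipe characters. The function name failed_resources is made up for this illustration and is not part of the heat client.

```python
def failed_resources(table_text, failed_status="CREATE_FAILED"):
    """Return (resource_name, status_reason) pairs for rows in a failed state.

    Assumes five pipe-delimited columns per row, as in the pasted rows:
    resource name, id, status reason, status, event time.
    """
    failures = []
    for line in table_text.splitlines():
        if "|" not in line:
            # Skip blank lines and +----+ border rows.
            continue
        # Drop any text printed before the first pipe (e.g. a console prefix).
        row = line[line.index("|"):].strip().strip("|")
        cells = [cell.strip() for cell in row.split("|")]
        if len(cells) != 5:
            continue
        name, _event_id, reason, status, _event_time = cells
        if status == failed_status:
            failures.append((name, reason))
    return failures
```

Feeding it the two rows above should list CephStorageNodesPostDeployment and overcloud, in that order, each with its (truncated) reason string.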

Comment 4 Zane Bitter 2015-08-07 18:30:33 UTC
'UnknownStatus' is a red herring here (and the error message will be improved in a future release) - the status in question is CREATE_FAILED and is quite well-known. The reason for the failure is to be found at the end, not the beginning:

  "Deployment exited with non-zero status code: 6"
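
As a rough illustration of reading the message from the end, the following Python sketch splits the nested reason on each 'due to "' and keeps the last segment, then extracts the exit status. The helper names innermost_reason and exit_status are hypothetical, and the message constant is copied from the description of this bug.

```python
import re

# Nested status reason as pasted in the bug description.
MESSAGE = (
    r'Resource CREATE failed: ResourceUnknownStatus: Resource failed - '
    r'Unknown status FAILED due to \"Resource CREATE failed: '
    r'ResourceUnknownStatus: Resource failed - Unknown status FAILED '
    r'due to \"Resource CREATE failed: Error: Deployment to server failed: '
    r'deploy_status_code : Deployment exited with non-zero status code: 6'
)


def innermost_reason(message):
    """Return the text after the last 'due to "' (the quote may be escaped)."""
    parts = re.split(r'due to \\?"', message)
    # Trim trailing quotes or backslashes left over from the nesting.
    return parts[-1].strip().rstrip('"\\ ')


def exit_status(message):
    """Return the deployment exit status as an int, or None if absent."""
    match = re.search(r"non-zero status code: (\d+)", message)
    return int(match.group(1)) if match else None
```

With the message from this report, innermost_reason(MESSAGE) yields the "Deployment to server failed" part and exit_status(MESSAGE) yields 6.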

Since we know that Puppet routinely exits with status code 6, we can surmise that the SoftwareDeployment that failed was almost certainly one running Puppet, and you've confirmed that it was in fact CephStorageNodesPostDeployment.

You'd need to look into the puppet logs for CephStorageNodesPostDeployment to find the actual cause.
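
For reference, a small Python sketch of how a status code of 6 can be read, assuming the deployment invoked Puppet with --detailed-exitcodes (an assumption; this report does not show the exact invocation). The meanings in the table follow Puppet's documented detailed exit codes; the helper names are made up for this example.

```python
# Meanings of Puppet exit codes when run with --detailed-exitcodes.
PUPPET_DETAILED_EXIT_CODES = {
    0: "run succeeded with no changes",
    1: "run failed",
    2: "run succeeded and some resources were changed",
    4: "run succeeded but some resources failed",
    6: "some resources were changed and some resources failed",
}


def describe_puppet_exit(code):
    """Return a human-readable meaning for a Puppet detailed exit code."""
    return PUPPET_DETAILED_EXIT_CODES.get(
        code, "unrecognized exit code %d" % code
    )


def puppet_run_failed(code):
    """Treat anything other than 0 (no changes) or 2 (changes) as a failure."""
    return code not in (0, 2)
```

Under that assumption, a code of 6 means Puppet applied some changes but at least one resource failed, which is why the Puppet logs on the Ceph storage node are the next place to look.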

Comment 5 chris alfonso 2015-08-18 19:17:29 UTC
Adriano, were you able to get any more info from your deployment based upon Comment 4?

Comment 6 Adriano Petrich 2015-08-18 19:29:28 UTC
Chris, I'm closing this bug as it stopped happening.