Bug 1237318 - Overcloud deploy appears to be stuck when using ceph node
Summary: Overcloud deploy appears to be stuck when using ceph node
Keywords:
Status: CLOSED DUPLICATE of bug 1235908
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: rhosp-director
Version: 7.0 (Kilo)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: Director
Assignee: chris alfonso
QA Contact: yeylon@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2015-06-30 19:27 UTC by Marius Cornea
Modified: 2016-04-18 07:01 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-07-01 17:30:16 UTC
Target Upstream Version:
Embargoed:


Attachments
env file (2.31 KB, application/x-gzip)
2015-06-30 20:47 UTC, Marius Cornea
heat engine log (385.89 KB, text/plain)
2015-06-30 20:48 UTC, Marius Cornea

Description Marius Cornea 2015-06-30 19:27:57 UTC
Description of problem:
I'm doing a deployment with 3 controllers, 1 compute and 1 ceph node on baremetal with network isolation. During the deployment there are long periods when the overcloud nodes don't output anything in the os-collect-config log. The heat-engine log also seems to repeat the same entries in a loop. Note that this happens only when the ceph node is introduced. Deployments with 1 controller and 1 compute, and with 3 controllers and 1 compute, went OK.

Version-Release number of selected component (if applicable):
instack-undercloud-2.1.2-6.el7ost.noarch
openstack-tripleo-common-0.0.1.dev6-0.git49b57eb.el7ost.noarch
openstack-tripleo-puppet-elements-0.0.1-2.el7ost.noarch
openstack-tripleo-0.0.7-0.1.1664e566.el7ost.noarch
openstack-tripleo-image-elements-0.9.6-4.el7ost.noarch
openstack-tripleo-heat-templates-0.8.6-22.el7ost.noarch

How reproducible:
50%

Steps to Reproduce:
1. Deploy 3 controllers, 1 compute and 1 ceph node with network isolation.

Actual results:
The deployment doesn't show any progress, and the overcloud nodes don't show any activity in os-collect-config.

Expected results:
The deployment completes successfully.

Additional info:
Attaching part of the heat-engine.log.
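
The "long periods without output" symptom can be checked mechanically. The following is only a minimal sketch in Python, assuming the os-collect-config output has been captured to a plain file on the node; the function names, the file path and the 600-second threshold are illustrative assumptions and are not taken from this report or from any director tooling.

```python
import os
import time


def seconds_since_last_write(path, now=None):
    """Return how many seconds have passed since `path` was last modified."""
    if now is None:
        now = time.time()
    return now - os.path.getmtime(path)


def is_stalled(path, threshold_seconds=600, now=None):
    """Return True when the log at `path` has been quiet for at least
    `threshold_seconds` (hypothetical default: 10 minutes)."""
    return seconds_since_last_write(path, now) >= threshold_seconds
```

For example, is_stalled("/tmp/os-collect-config.log") (a hypothetical capture file) could be polled during a deployment to record when a node goes quiet, which may help correlate the quiet periods with entries in the heat-engine log.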

Comment 3 Dan Sneddon 2015-06-30 20:07:12 UTC
(In reply to Marius Cornea from comment #0)

Can you clarify what output you see when you deploy the cloud? Does anything start to build?

What was in your network-environment.yaml file?

I have seen an issue where CephStorageNodesPostDeployment would time out when network isolation was enabled, but the Heat resources would all go into CREATE_COMPLETE except for Ceph. Not sure if that's related. That's this bug: https://bugzilla.redhat.com/show_bug.cgi?id=1236167

Comment 4 Marius Cornea 2015-06-30 20:47:21 UTC
Yes, it starts to build: the nodes boot up, and I can log in via ssh and watch the os-collect-config logs. At the end, the deployment fails with this message:

ERROR: openstack ERROR: Authentication failed. Please try again with option --include-password or export HEAT_INCLUDE_PASSWORD=1
Authentication required

Attaching environment file and nic templates.

Comment 5 Marius Cornea 2015-06-30 20:47:50 UTC
Created attachment 1044794 [details]
env file

Comment 6 Marius Cornea 2015-06-30 20:48:13 UTC
Created attachment 1044795 [details]
heat engine log

Comment 7 Mike Burns 2015-07-01 17:30:16 UTC
This appears to be due to the keystone timeout issue. Marking as a duplicate.

*** This bug has been marked as a duplicate of bug 1235908 ***

