Bug 1388139

Summary: Deployments with a Ceph node fail from the GUI
Product: Red Hat OpenStack
Reporter: Jason E. Rist <jrist>
Component: openstack-tripleo-common
Assignee: Jiri Tomasek <jtomasek>
Status: CLOSED ERRATA
QA Contact: Ola Pavlenko <opavlenk>
Severity: high
Docs Contact:
Priority: high
Version: 10.0 (Newton)
CC: apannu, jpichon, jschluet, mburns, rhel-osp-director-maint, sclewis, slinaber, ukalifon
Target Milestone: rc
Keywords: Triaged
Target Release: 10.0 (Newton)
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: openstack-tripleo-common-5.3.0-6.el7ost openstack-tripleo-ui-1.0.4-3.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-12-14 16:24:47 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Jason E. Rist 2016-10-24 14:54:35 UTC
Cloned from launchpad bug 1626426.

Description:

I set CephStorageCount to 1 and included the "storage environment" setting. Deployment failed in ControllerNodesPostDeployment and ComputeNodesPostDeployment with this error on the failed resources:

Error: authentication_type cephx requires either key or keyring to be set but both are undef at /etc/puppet/modules/ceph/manifests/mon.pp:108 on node default-controller-0.localdomain

It looks like the GUI doesn't pass all needed parameters to the stack creation. I don't get this problem when deploying from the CLI.
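
To illustrate the kind of gap described above, here is a minimal sketch of a pre-flight check that reports which Ceph-related parameters are absent before a stack is created. It is not part of tripleo-common or the UI. The function name is hypothetical, and the list of required parameters is an assumption: only CephClusterFSID is named in this bug, while the other names are assumed from the Newton-era tripleo-heat-templates and should be checked against the templates actually in use.

```python
# Hypothetical pre-flight check; not code from tripleo-common or tripleo-ui.
# ASSUMPTION: this parameter list is illustrative. Only CephClusterFSID is
# named in this bug report; the rest are assumed from tripleo-heat-templates.
REQUIRED_CEPH_PARAMETERS = (
    "CephClusterFSID",
    "CephMonKey",
    "CephAdminKey",
    "CephClientKey",
)


def missing_ceph_parameters(parameter_defaults):
    """Return the required Ceph parameter names that are absent or empty."""
    return [
        name
        for name in REQUIRED_CEPH_PARAMETERS
        if not parameter_defaults.get(name)
    ]


if __name__ == "__main__":
    # Placeholder values only; these are not real keys.
    example = {
        "CephMonKey": "placeholder-mon-key",
        "CephAdminKey": "placeholder-admin-key",
    }
    print(missing_ceph_parameters(example))
```

Running a check like this against the parameters the GUI sends would surface the missing values before Puppet fails on the overcloud nodes.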

Specification URL (additional information):

https://bugs.launchpad.net/tripleo/+bug/1626426

Comment 1 Julie Pichon 2016-10-24 15:04:34 UTC
The description is a bit out of date: as indicated in the upstream bug, the only parameter that still seems to be missing is CephClusterFSID.

It looks like the parameter might be set directly by the CLI at https://github.com/openstack/python-tripleoclient/blob/stable/newton/tripleoclient/v1/overcloud_deploy.py#L196 but not when deploying from a plan. However, I don't have much experience deploying with Ceph, and it's possible I'm missing an environment file of some sort.

There's also a related issue where it doesn't seem possible to set the parameter manually from the GUI, which is making it difficult to work around the problem at the moment.

Comment 2 Julie Pichon 2016-10-25 15:06:22 UTC
I think there are two bugs:

1. The CephClusterFSID is not generated automatically with the other credentials.

2. The UI doesn't let you set the parameter manually (that I could find), which is necessary for working around #1 and also when deploying with an externally managed, pre-existing Ceph.
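
As a rough illustration of the missing value in item 1, the sketch below generates an FSID and renders it as a Heat environment fragment. It assumes that a Ceph FSID is a UUID string and that the value is supplied under `parameter_defaults`. Both function names are hypothetical, and the choice of `uuid.uuid4()` is an assumption, not necessarily what the CLI or the eventual fix uses. How the resulting fragment reaches the deployment plan is outside this sketch.

```python
import uuid


def generate_ceph_fsid():
    """Return a new random UUID string to use as a Ceph cluster FSID.

    ASSUMPTION: any well-formed UUID is acceptable as an FSID.
    """
    return str(uuid.uuid4())


def render_environment(fsid):
    """Render a minimal Heat environment fragment that sets CephClusterFSID.

    Raises ValueError if fsid is not a well-formed UUID string.
    """
    uuid.UUID(fsid)  # validation only; raises ValueError on malformed input
    return "parameter_defaults:\n  CephClusterFSID: '{}'\n".format(fsid)


if __name__ == "__main__":
    print(render_environment(generate_ceph_fsid()))
```

For an externally managed, pre-existing Ceph cluster, the FSID would be taken from that cluster instead of being generated.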

Comment 3 Jason E. Rist 2016-11-02 04:33:34 UTC
Fixes posted, need a backport:
https://review.openstack.org/390946

Fixes posted, need reviews:
https://review.openstack.org/390612

Comment 5 Jon Schlueter 2016-11-08 21:04:39 UTC
Updated the external tracker to point to the tripleo-common patch.

Comment 6 Julie Pichon 2016-11-09 15:49:32 UTC
The stable/newton backport for that tripleo-common patch just merged; this should cover all the patches needed for this bug.

Comment 8 Udi Kalifon 2016-11-15 11:02:53 UTC
Deployed 3 controllers, 2 compute nodes, and 1 Ceph node on a bare-metal system with the default plan. Verified.

Comment 10 errata-xmlrpc 2016-12-14 16:24:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-2948.html