Bug 1305260

Summary: create installer API endpoints that create a Ceph cluster
Product: [Red Hat Storage] Red Hat Storage Console
Reporter: Christina Meno <gmeno>
Component: ceph-installer
Assignee: Andrew Schoen <aschoen>
Status: CLOSED ERRATA
QA Contact: ceph-qe-bugs <ceph-qe-bugs>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 2
CC: adeza, aschoen, ceph-eng-bugs, ceph-qe-bugs, flucifre, gmeno, hnallurv, icolle, kdreyer, mkarnik, mkudlej, nthomas, sankarshan
Target Milestone: ---
Target Release: 2
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-08-23 19:46:36 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1291304

Description Christina Meno 2016-02-06 13:55:15 UTC
Description of problem:

Ceph cluster creation

Input parameters: a list of nodes with their roles (OSD/MON for each node), failure domain information such as zones and racks, and an optional journal size.

Cluster creation should be resilient to partial failures. For example, if one or a few OSD creations fail, cluster creation should still go ahead and the cluster should be created.

The Calamari lite service should be configured and turned on as part of cluster creation.

Errors and failures should be communicated to the user of the API.
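
As a purely illustrative sketch of the inputs being asked for here (the field names below are hypothetical and do not describe the eventual ceph-installer API, which is worked out in the comments that follow), the request data could look like:

    # Hypothetical shape of the requested input; names are illustrative only.
    create_cluster_request = {
        "cluster_name": "ceph",
        "nodes": [
            {"host": "mon1.example.com", "roles": ["MON"],
             "failure_domain": {"zone": "zone1", "rack": "rack1"}},
            {"host": "osd1.example.com", "roles": ["OSD"],
             "failure_domain": {"zone": "zone1", "rack": "rack2"}},
        ],
        "journal_size": 5120,  # MB, optional
    }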


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 3 Christina Meno 2016-02-09 04:52:39 UTC
Mrugesh and I spoke. We concluded that USM will work out information such as the failure domain and the global cluster default configuration.

How will the request convey this information? That is, please share an example of what you expect to send in the request.

Comment 4 Christina Meno 2016-02-09 23:10:09 UTC
Create cluster will be satisfied by this workflow:

1. Red Hat Storage Controller will request package installation.

2. Mariner will install packages and report the status of jobs.

3. Red Hat Storage Controller will work out how to avoid asking for monitors that violate the failure domain.

4. Red Hat Storage Controller will request monitor cluster creation.

5. Mariner will configure monitors and report the status of jobs.

6. Red Hat Storage Controller will work out how to avoid asking for OSDs that violate the failure domain.

7. Red Hat Storage Controller will request OSD creation.

8. Mariner will configure OSDs and report the status of jobs.

Comment 6 Dusmant 2016-02-10 11:50:17 UTC
This is how we would like the cluster creation to happen, in keeping with your proposal. (Note: we want to have package installation as a separate step, rather than clubbed with cluster creation. We had a discussion with Mrugesh on this and he agrees with this proposal.)

1. We invoke the package installation for the MONs from Mariner. Something like: we will provide the list-of-nodes on which the Ceph packages for MON need to be installed by Mariner.

2. Using the task id returned by the above step, we will poll and track the progress of this task.

2a. In the meanwhile we will ask Mariner to install the Ceph packages for the OSD nodes on another-list-of-nodes.

3. Once the MON package installation is complete, we will invoke the CreateCluster API, to which we will pass on the list-of-MONs, which Mariner will use to create the cluster. It will return a task-id for this operation.

4. Using this task-id, we monitor the progress of cluster creation.

5. Once the cluster creation using MONs is successful, we will invoke the addition of OSD nodes to the cluster. We will pass on a list-of-OSD-nodes to the Mariner API "AddOSDs", which will internally add all these OSD nodes to the specified cluster. Along with the OSDs, we will pass on disk-specific information, journal information, etc. to the API. The "AddOSDs" API is expected to prepare the OSDs in parallel, instead of serializing them. The other expected behaviour is that even if one or a few OSDs fail, the task should proceed instead of aborting.
This API will return a task-id, which we will poll regularly to see the progress of OSD addition. We want the progress to be in a format that lets us see the lists of succeeded and failed OSDs.
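
A minimal sketch of the polling in steps 2 and 4, assuming the task endpoint (GET /api/tasks/<id>/) and the 'ended'/'succeeded' fields described in the README linked in comment 14; treat the exact URL and field names as assumptions rather than a verified contract:

    import time
    import requests

    INSTALLER = "http://installer.example.com:8181"  # assumed installer address

    def wait_for_task(task_id, poll_interval=10):
        """Poll a ceph-installer task until it ends; returns (succeeded, task)."""
        while True:
            task = requests.get(f"{INSTALLER}/api/tasks/{task_id}/").json()
            if task.get("ended"):  # field name assumed from the README
                return task.get("succeeded", False), task
            time.sleep(poll_interval)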

Comment 7 Mrugesh Karnik 2016-02-10 15:11:29 UTC
The package installation API call could be just a single call. Unlike the mon and osd creation calls, there's no need for sequential actions here - the package installation can be carried out in parallel on all the nodes in question.

Comment 8 Dusmant 2016-02-11 07:27:54 UTC
We would love to have that kind of API.
But, as per Gregory, there is some complexity involved in Mariner if it has to provide that kind of API, one that takes a list of nodes. It would be easier for them to have each node's package installation invoked separately in parallel, so that they can give a task id for each, I suppose. We would then use that task id to poll and see how the task is progressing.

Gregory, can you confirm that?

Comment 10 Nishanth Thomas 2016-02-13 08:22:48 UTC
Failure Domains:

Failure domain configuration requires changes in ceph.conf as well as in the CRUSH map. The configuration is somewhat similar to the CRUSH configuration.

These are the different bucket types supported:
- type 9 region
- type 8 datacenter
- type 7 room
- type 6 pod
- type 5 pdu
- type 4 row
- type 3 rack
- type 2 chassis
- type 1 root

Each host can have a single one of these or a combination of these hierarchies, say region(APAC) -> datacenter(BLR) -> room() -> rack -> chassis, etc. By default all hosts will be added to the root bucket.
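
Purely as an illustration of how such a per-host hierarchy could be conveyed (the structure and field names below are hypothetical, not an agreed API shape):

    # Hypothetical per-host failure-domain description, largest bucket first.
    # Hosts with no hierarchy given would fall under the root bucket by default.
    host_failure_domain = {
        "host": "osd1.example.com",
        "buckets": [
            {"type": "region", "name": "APAC"},
            {"type": "datacenter", "name": "BLR"},
            {"type": "rack", "name": "rack-3"},
        ],
    }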

Comment 11 Nishanth Thomas 2016-02-13 10:33:55 UTC
Configuring the Journals

There are three different use cases here.

A disk can be used as:

1. OSD data and journal co-existing on the same disk - we need to create two partitions here, one for OSD data and the other for the journal.

2. Dedicated OSD data - the disk is dedicated to OSD data only. Journals will be created on a separate disk.

3. Dedicated journal - we need to create multiple partitions based on the requirements. OSDs can use the journals created on this disk.

USM will provide the journal size (system-calculated or user-provided), the type (one of the three above), and the disk (in case of #3) to Mariner, so that Mariner can take care of creating the journals based on this input.
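
A hypothetical sketch of what that journal input could look like for the three cases (field names are illustrative assumptions, not the agreed API):

    # Case 1: OSD data and journal collocated on the same disk (two partitions).
    collocated = {"disk": "/dev/sdb", "journal": "collocated", "journal_size": 5120}

    # Case 2: disk dedicated to OSD data; the journal lives on another disk.
    dedicated_data = {"disk": "/dev/sdc", "journal": "dedicated",
                      "journal_disk": "/dev/sdf", "journal_size": 5120}

    # Case 3: disk dedicated to journals, partitioned for several OSDs to share.
    dedicated_journal = {"disk": "/dev/sdf", "role": "journal",
                         "partitions": 4, "journal_size": 5120}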

Comment 12 Alfredo Deza 2016-02-13 13:37:57 UTC
On polling:
We have been thinking of allowing USM to provide a callback URL when a request to the API is made so that polling is not needed.

The callback URL would be requested when the operation has completed (on either failure or success).

This feature would not be hard to implement in the installer and I think it would allow a better way to handle updates for USM. Individual tasks would still exist and USM could still poll those if needed.
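
A minimal sketch of how USM might consume such a callback, assuming the installer POSTs the finished task as JSON to a URL supplied with the original request (the payload fields are assumptions):

    import json
    from http.server import BaseHTTPRequestHandler, HTTPServer

    class CallbackHandler(BaseHTTPRequestHandler):
        """Receive the completed-task payload POSTed by the installer."""

        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            task = json.loads(self.rfile.read(length) or b"{}")
            # A real handler would update USM's own task state here.
            print("task finished:", task.get("identifier"),
                  "succeeded:", task.get("succeeded"))
            self.send_response(200)
            self.end_headers()

    if __name__ == "__main__":
        HTTPServer(("0.0.0.0", 8080), CallbackHandler).serve_forever()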

Comment 13 Nishanth Thomas 2016-02-13 18:21:14 UTC
USM needs periodic updates, not only the completion status. Suppose a requested operation has 5 steps; USM should get an indication when each step is completed. This is important because the admin wants to see the progress of the task in the UI.

Comment 14 Alfredo Deza 2016-02-15 18:07:48 UTC
We have documented a few things about the API in the hope that this makes it a bit clearer:

Status reporting:

  polling: https://github.com/ceph/ceph-installer#polling
  callbacks: https://github.com/ceph/ceph-installer#callback-system

API endpoint interactions are per-host, except for install tasks. The reasoning for each and the expected behavior are documented:

  installing: https://github.com/ceph/ceph-installer#install-operations
  configuring: https://github.com/ceph/ceph-installer#configure-operations

There are no "composite" tasks.
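
As a hedged sketch of what those interactions could look like from USM's side, assuming the /api/mon/install/ and /api/mon/configure/ endpoints from the linked README (payload fields are a best-effort reading of those docs, not verified here):

    import requests

    INSTALLER = "http://installer.example.com:8181"  # assumed installer address

    # Install operations can cover several hosts in one request
    # ("per-host except for install tasks" above); returns a task.
    install_task = requests.post(
        f"{INSTALLER}/api/mon/install/",
        json={"hosts": ["mon1.example.com", "mon2.example.com", "mon3.example.com"]},
    ).json()

    # Configure operations are per-host, so one request per monitor;
    # the payload below is illustrative only (see the README for required fields).
    configure_task = requests.post(
        f"{INSTALLER}/api/mon/configure/",
        json={"host": "mon1.example.com", "interface": "eth0",
              "fsid": "c4c39f00-0000-4b3e-b0a0-000000000000",
              "monitor_secret": "<monitor-keyring-secret>"},
    ).json()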

Comment 15 Alfredo Deza 2016-02-15 22:39:47 UTC
The docs now show what a full cluster install would look like:

https://github.com/ceph/ceph-installer#creating-a-cluster

Comment 16 Mrugesh Karnik 2016-02-17 05:10:58 UTC
(In reply to Alfredo Deza from comment #14)

WRT https://github.com/ceph/ceph-installer#setup-1

/etc/sudoers should not be modified. /etc/sudoers.d should be used to configure ceph-installer-specific settings, including disabling requiretty for this user. No system-wide configuration should be overridden; changes must be scoped to the ceph-installer user.

Comment 17 Alfredo Deza 2016-02-17 12:52:03 UTC
(In reply to Mrugesh Karnik from comment #16)
Thank you for catching that. We were no longer doing this in the code, but the docs had not been updated. I have just made the changes to reflect this. We are only making changes to `/etc/sudoers.d/ceph-installer`.
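
For reference, a drop-in of that kind would typically look something like the following; this is only a sketch of the approach (scoped to the ceph-installer user), and the exact contents shipped by ceph-installer may differ:

    # /etc/sudoers.d/ceph-installer (sketch; actual shipped content may differ)
    Defaults:ceph-installer !requiretty
    ceph-installer ALL=(ALL) NOPASSWD: ALL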

Comment 18 Mike McCune 2016-03-28 22:16:42 UTC
This bug was accidentally moved from POST to MODIFIED via an error in automation, please see mmccune with any questions

Comment 26 Martin Kudlej 2016-06-21 10:31:07 UTC
It is possible to create a Ceph cluster. Checked with ceph-installer-1.0.11-1.el7scon.noarch -> Verified

Comment 28 errata-xmlrpc 2016-08-23 19:46:36 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2016:1754