Bug 1319833 - Nodes are processed one by one (non parallel) way during Create Cluster task
Summary: Nodes are processed one by one (non parallel) way during Create Cluster task
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Storage Console
Classification: Red Hat Storage
Component: Ceph
Version: 2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 3
Assignee: Shubhendu Tripathi
QA Contact: sds-qe-bugs
URL:
Whiteboard:
Depends On: 1319856
Blocks:
 
Reported: 2016-03-21 15:50 UTC by Martin Bukatovic
Modified: 2018-11-19 05:42 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-11-19 05:42:18 UTC
Embargoed:



Description Martin Bukatovic 2016-03-21 15:50:26 UTC
Description of problem
======================

When machines are configured during the *Create Cluster* task, the process is
not done in parallel. For example, while packages are being installed on one
machine, the other machines are waiting and nothing is happening on them.

Without fixing this BZ, it would not be possible to use USM to
create production-sized clusters.

Version-Release number of selected component
============================================

osd machine:
ceph-0.94.5-9.el7cp.x86_64
ceph-common-0.94.5-9.el7cp.x86_64
ceph-osd-0.94.5-9.el7cp.x86_64
rhscon-agent-0.0.3-3.el7.noarch

mon machine:
ceph-0.94.5-9.el7cp.x86_64
ceph-common-0.94.5-9.el7cp.x86_64
ceph-mon-0.94.5-9.el7cp.x86_64
rhscon-agent-0.0.3-3.el7.noarch

usm server machine:
ceph-0.94.5-9.el7cp.x86_64
ceph-ansible-1.0.1-1.20160307gitb354445.el7.noarch
ceph-common-0.94.5-9.el7cp.x86_64
redhat-ceph-installer-0.2.3-1.20160304gitb3e3c68.el7.noarch
rhscon-ceph-0.0.6-14.el7.x86_64
rhscon-core-0.0.8-14.el7.x86_64
rhscon-ui-0.0.23-1.el7.noarch

How reproducible
================

100 %

Steps to Reproduce
==================

1. Prepare machines for USM (following the USM documentation).
   Allocate at least 3 machines as monitor machines (no extra disks) and
   4 machines as OSD machines (each with at least 2 extra disks for Ceph OSDs).
2. In USM web interface, accept all machines.
3. Use the *Create Cluster* wizard to set up a cluster, using all machines
   prepared in step 1.
4. Check what is happening on each machine (ssh in and monitor CPU, disk,
   and memory usage, e.g. by running `top`).
5. Wait for the process to finish.

Actual results
==============

The setup takes too long because it runs on only a single machine
at a time.

This may also lead to a timeout-caused failure of the Create Cluster task,
but that is not a concern of this BZ.

Expected results
================

A window of machines is processed at the same time, and the
admin can configure the size of this window.

We should also evaluate if full parallel installation on all nodes is feasible.
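The configurable window described above could be sketched with a thread pool, where the window size caps how many nodes are set up concurrently. This is only an illustrative sketch, not the actual USM/ceph-installer code; `install_packages` is a hypothetical stand-in for the per-node setup step:

```python
from concurrent.futures import ThreadPoolExecutor

def install_packages(node):
    # Hypothetical placeholder for the per-node setup step
    # (package installation by the agent); returns the node name.
    return node

def configure_nodes(nodes, window=4):
    """Configure nodes in parallel, at most `window` at a time."""
    # max_workers bounds concurrency, so at any moment only
    # `window` nodes are being configured; the rest queue up.
    with ThreadPoolExecutor(max_workers=window) as pool:
        return list(pool.map(install_packages, nodes))

print(configure_nodes(["mon1", "mon2", "mon3", "osd1"], window=2))
```

With `window=1` this degenerates to the current serial behavior, so the same code path could cover both modes.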

Additional info
===============

The actual component concerned may be different (devs told me that
this may be a ceph-installer problem as well) - devs should properly
reevaluate the component of this BZ during their investigation.

Comment 2 Martin Bukatovic 2016-03-21 17:55:23 UTC
Additional information
======================

From the task details page, we can see that it took about 4 minutes to install
packages on a machine, but since all machines were processed one by one, this
process takes too long and would not scale to clusters with hundreds of
machines:

~~~
Installing packages     Mar 18 2016, 08:34:14 PM                            
Installed packages on dhcp-126-80.lab.eng.brq.redhat.com:   Mar 18 2016, 08:39:06 PM
Installed packages on dhcp-126-85.lab.eng.brq.redhat.com:   Mar 18 2016, 08:43:10 PM
Installed packages on dhcp-126-81.lab.eng.brq.redhat.com:   Mar 18 2016, 08:46:08 PM
Installed packages on dhcp-126-84.lab.eng.brq.redhat.com:   Mar 18 2016, 08:49:21 PM
Installed packages on dhcp-126-82.lab.eng.brq.redhat.com:   Mar 18 2016, 08:50:09 PM
Installed packages on dhcp-126-79.lab.eng.brq.redhat.com:   Mar 18 2016, 08:53:41 PM
~~~
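To put a number on it, the timestamps above span roughly 19.5 minutes for six hosts, i.e. a bit over 3 minutes per host on average. A minimal sketch of that arithmetic (timestamps copied from the log above):

```python
from datetime import datetime

# Completion timestamps copied from the task log above;
# the first entry is the start of the "Installing packages" step.
timestamps = [
    "Mar 18 2016, 08:34:14 PM",  # Installing packages (start)
    "Mar 18 2016, 08:39:06 PM",  # dhcp-126-80
    "Mar 18 2016, 08:43:10 PM",  # dhcp-126-85
    "Mar 18 2016, 08:46:08 PM",  # dhcp-126-81
    "Mar 18 2016, 08:49:21 PM",  # dhcp-126-84
    "Mar 18 2016, 08:50:09 PM",  # dhcp-126-82
    "Mar 18 2016, 08:53:41 PM",  # dhcp-126-79
]
fmt = "%b %d %Y, %I:%M:%S %p"
times = [datetime.strptime(ts, fmt) for ts in timestamps]

total = times[-1] - times[0]          # whole serial run
per_node = total / (len(times) - 1)   # average per host
print(f"total: {total}, per node: {per_node}")
```

At that average rate, a serial run over a hundred hosts would take on the order of five hours for package installation alone, which is why a parallel window matters.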

Comment 3 Shubhendu Tripathi 2018-11-19 05:42:18 UTC
This product is EOL now

