Bug 1401936 - Calamari to run on all monitor nodes
Summary: Calamari to run on all monitor nodes
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: Calamari
Version: 2.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: 2.2
Assignee: Christina Meno
QA Contact: Harish NV Rao
Docs Contact: Bara Ancincova
URL:
Whiteboard:
Depends On:
Blocks: 1406357
 
Reported: 2016-12-06 12:34 UTC by Nishanth Thomas
Modified: 2022-02-21 18:17 UTC
CC List: 6 users

Fixed In Version: RHEL: calamari-server-1.5.0-1.el7cp Ubuntu: calamari_1.5.0-2redhat1xenial
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-03-14 15:47:18 UTC
Embargoed:




Links
System: Red Hat Product Errata  ID: RHBA-2017:0514  Private: 0  Priority: normal  Status: SHIPPED_LIVE  Summary: Red Hat Ceph Storage 2.2 bug fix and enhancement update  Last Updated: 2017-03-21 07:24:26 UTC

Description Nishanth Thomas 2016-12-06 12:34:37 UTC
Description of problem:

The current version has a limitation: Calamari can run on only a single monitor node, otherwise it causes inconsistency in the data it reports. This introduces scalability issues as well as a single point of failure. Calamari should run on all monitor nodes, with consistent results for READ operations.

Comment 2 Christina Meno 2016-12-07 06:28:50 UTC
My memory says that the decision to run on a single monitor was to prevent memory leaks in Calamari from causing data loss in Ceph if it were to cause 3 mons to fail simultaneously. This concern has been mitigated with some systemd limits.
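
For reference, "systemd limits" here would typically mean a memory cap on the Calamari service via a drop-in along the lines below; the unit name, file path, and values are assumptions for illustration, not the exact settings shipped in the product:

# /etc/systemd/system/calamari.service.d/limits.conf   (hypothetical path and values)
[Service]
MemoryLimit=512M
Restart=on-failure

After adding a drop-in like this, systemctl daemon-reload and a restart of the service would apply it.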

If there are specifics on the data inconsistency you claim, would you please share them so we can address it?

Comment 3 Nishanth Thomas 2016-12-08 13:42:53 UTC
Memory leaks were one of the issues. The core issue was the data inconsistency you get when you issue writes to multiple Calamari instances on different MONs. If I remember correctly, it's pretty straightforward to reproduce as well:

1. Create a Ceph cluster with 3 MONs and start Calamari on each of them.
2. Create a Ceph pool by sending requests to each of these Calamari instances.
3. List the Ceph pools from each of these Calamari instances.
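
A minimal sketch of these steps against the Calamari v2 REST API (/api/v2/cluster/<fsid>/pool); the hostnames, credentials, and FSID are placeholders, and auth handling is simplified here (the real API uses session-based login):

import requests

MONS = ["mon1.example.com", "mon2.example.com", "mon3.example.com"]  # placeholder hosts
FSID = "<cluster-fsid>"                                              # placeholder cluster FSID
AUTH = ("admin", "admin")                                            # placeholder credentials

def url(host, path):
    return "http://%s/api/v2/%s" % (host, path)

# Step 2: ask each Calamari instance to create a (distinct) pool.
for i, mon in enumerate(MONS):
    r = requests.post(url(mon, "cluster/%s/pool" % FSID),
                      json={"name": "testpool%d" % i, "pg_num": 64},
                      auth=AUTH)
    print(mon, "create:", r.status_code)

# Step 3: list the pools as seen by each instance and compare the results.
for mon in MONS:
    r = requests.get(url(mon, "cluster/%s/pool" % FSID), auth=AUTH)
    print(mon, "sees:", sorted(p["name"] for p in r.json()))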

Comment 4 Christina Meno 2016-12-13 18:01:37 UTC
We won't be able to synchronize the data this way in 2.2.

The system currently handles the situation you describe above in this way.

If you are creating three distinct pools A, B, and C:
1. can work
2. will work
3. eventually all instances of Calamari will report the same data: the existing pools plus A, B, and C

If you are trying to create the same pool A on each node:
1. can work
2. the request will eventually succeed only once, on whichever node gets the pool create to Ceph first
3. same as above

This is what we can do in 2.2 with what is upstream right now. Perhaps we could discuss the merits of your approach in ceph-integration in the Tendrl project?
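
To verify the eventual consistency described above, a small poll loop like this (same placeholder hosts, FSID, and credentials as the sketch in comment 3) can wait until every instance reports the same pool list:

import time
import requests

MONS = ["mon1.example.com", "mon2.example.com", "mon3.example.com"]
FSID = "<cluster-fsid>"
AUTH = ("admin", "admin")

def pool_names(host):
    r = requests.get("http://%s/api/v2/cluster/%s/pool" % (host, FSID), auth=AUTH)
    return sorted(p["name"] for p in r.json())

deadline = time.time() + 300                 # allow up to five minutes to converge
while time.time() < deadline:
    views = [pool_names(mon) for mon in MONS]
    if all(v == views[0] for v in views):
        print("all instances agree:", views[0])
        break
    time.sleep(10)
else:
    print("instances still disagree:", views)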

Comment 5 Nishanth Thomas 2016-12-15 07:58:33 UTC
(In reply to Gregory Meno from comment #4)
> We won't be able to synchronize the data this way in 2.2.
> 
> The system currently handles the situation you describe above in this way.
> 
> If you are creating three distinct pools A, B, and C:
> 1. can work
> 2. will work
> 3. eventually all instances of Calamari will report the same data: the
> existing pools plus A, B, and C
> 
> If you are trying to create the same pool A on each node:
> 1. can work
> 2. the request will eventually succeed only once, on whichever node gets
> the pool create to Ceph first
> 3. same as above
> 
> This is what we can do in 2.2 with what is upstream right now. Perhaps we
> could discuss the merits of your approach in ceph-integration in the Tendrl
> project?

If we can make the above scenarios work, I think that solves the problem for us. The limitation of working with a single instance of Calamari was an issue. If that is resolved and the data is consistent across all Calamari instances, regardless of which instance you are operating on, that is good enough for the time being.

In Tendrl the data is pushed to a central store (etcd) which all instances access, so the issue of data inconsistency never arises between instances.
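
As a rough illustration of that central-store pattern, using the python-etcd3 client (the endpoint and key layout are invented for the example):

import json
import etcd3

store = etcd3.client(host="etcd.example.com", port=2379)   # placeholder endpoint

# Any instance that creates a pool records it once, centrally...
store.put("/ceph/pools/testpool0", json.dumps({"pg_num": 64}))

# ...and every instance reads the same authoritative view back.
pools = [meta.key.decode() for _, meta in store.get_prefix("/ceph/pools/")]
print(pools)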

Comment 6 Christina Meno 2017-01-04 19:38:31 UTC
I don't see any additional work here on my end. Would you please let me know when you have succeeded with this configuration?

Comment 7 Christina Meno 2017-01-11 15:59:56 UTC
I need to take a look at the docs and then this will go to ON_QA

Comment 8 Christina Meno 2017-01-26 06:06:39 UTC
it's just a matter of updating the docs to read "set up calamari on all monitors"

Comment 9 Christina Meno 2017-01-26 06:06:49 UTC
it's just a matter of updating the docs to read "set up calamari on all monitors"

Comment 13 Shubhendu Tripathi 2017-02-01 09:53:28 UTC
@Gregory, in connection to this can you please reply on https://github.com/Tendrl/specifications/issues/126?

Comment 17 Christina Meno 2017-02-28 17:03:09 UTC
Harish confirmed.

Comment 19 Christina Meno 2017-02-28 18:20:14 UTC
No, no specific code change was needed to support this.

Comment 20 Harish NV Rao 2017-03-01 07:23:43 UTC
Planned tests have passed without blockers. Moving this bug to verified state.

Comment 25 errata-xmlrpc 2017-03-14 15:47:18 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2017-0514.html

