Description of problem: The current version has a limitation that it can run on only a single monitor node; otherwise it causes inconsistency in the data it reports. This introduces scalability issues as well as a single point of failure. Calamari should run on all monitor nodes with consistent results for READ operations.
My memory says that the decision to run on a single monitor was to prevent memory leaks in Calamari from causing data loss in Ceph if they were to cause 3 mons to fail simultaneously. This concern has been mitigated with some systemd limits. If you have specifics on the data inconsistency you describe, would you please share them so we can address it?
Memory leaks were one of the issues. The core issue was the data inconsistency when you issue writes to multiple Calamari instances on different MONs. If I remember correctly, it's pretty straightforward to reproduce as well:
1. Create a Ceph cluster with 3 MONs and start Calamari on each of them.
2. Create a Ceph pool by sending requests to each of these Calamari instances.
3. List the Ceph pools from each of these Calamari instances.
We won't be able to synchronize the data this way in 2.2. The system currently will handle the situation that you describe above in this way.

If you are creating three distinct pools A, B, and C:
1. can work
2. will work
3. eventually all instances of Calamari will report the same data: existing pools + A, B, and C

If you are trying to create the same pool A on each node:
1. can work
2. the requests will eventually succeed only once, on whichever node gets the pool create to Ceph first
3. same as above

This is what we can do in 2.2 with what is upstream right now. Perhaps we could discuss the merits of your approach in ceph-integration in the Tendrl project?
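To illustrate the behavior described above, here is a toy model (not Calamari code, and the class names are hypothetical): the Ceph cluster is the single source of truth, so a duplicate pool-create succeeds only on whichever instance reaches Ceph first, and every instance eventually reports the same pool list.

```python
class CephCluster:
    """Stands in for the real cluster; holds the authoritative pool set."""
    def __init__(self):
        self.pools = {"rbd"}  # a pre-existing pool

    def create_pool(self, name):
        if name in self.pools:
            return False      # duplicate create is rejected
        self.pools.add(name)
        return True

class CalamariInstance:
    """One Calamari server on a monitor; defers all writes to the cluster."""
    def __init__(self, cluster):
        self.cluster = cluster

    def create_pool(self, name):
        return self.cluster.create_pool(name)

    def list_pools(self):
        # Reads converge because they all come from the same cluster state.
        return sorted(self.cluster.pools)

cluster = CephCluster()
instances = [CalamariInstance(cluster) for _ in range(3)]

# The same pool "A" requested on each instance: exactly one create succeeds.
results = [inst.create_pool("A") for inst in instances]
print(results)                                   # [True, False, False]
print([inst.list_pools() for inst in instances]) # three identical lists
```

The key design point is that no Calamari instance keeps its own writable copy of the pool list; Ceph arbitrates, which is why reads converge even when duplicate writes race.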
(In reply to Gregory Meno from comment #4)
> We won't be able to synchronize the data this way in 2.2
>
> the system currently will handle the situation that you describe above in
> this way
>
> If you are creating three distinct pools A B and C:
> 1. can work
> 2. will work
> 3. eventually all instances of calamari will report the same data: existing
> pools + A B and C
>
> If you are trying to create the same pool A on each node
> 1. can work
> 2. the requests will eventually only succeed once on whichever node gets
> the pool create to ceph first.
> 3. same as above
>
> This is what we can do in 2.2 with what is upstream right now. Perhaps we
> could discuss the merits of your approach in ceph-integration in the Tendrl
> project?

If we can make the above scenarios work, I think that solves the problem for us. The limitation of working with only a single instance of Calamari was an issue. If that is resolved and the data is consistent across all Calamari instances regardless of the instance you are operating on, that is good enough for the time being. In Tendrl the data is pushed to a central store (etcd) which is accessed by all instances, so the issue of data inconsistency never arises between them.
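The Tendrl approach mentioned above can be sketched in a few lines. This is a minimal model, not Tendrl code: a plain dict stands in for etcd, and the class name is hypothetical. The point is that every instance reads and writes one shared store, so no per-instance copy can drift.

```python
central_store = {}  # stand-in for etcd, the single shared store

class TendrlInstance:
    """One node's view; all state lives in the central store, not locally."""
    def __init__(self, store):
        self.store = store

    def write(self, key, value):
        self.store[key] = value      # every write lands in the shared store

    def read(self, key):
        return self.store.get(key)   # every read comes from the same place

nodes = [TendrlInstance(central_store) for _ in range(3)]
nodes[0].write("pools/A", {"pg_num": 64})

# Every instance sees the same value, regardless of which one wrote it.
print([n.read("pools/A") for n in nodes])
```

This is why the write-through-any-instance scenario is unproblematic in that architecture: consistency is delegated to the store rather than reconciled between instances.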
I don't see any additional work here on my end. Would you please let me know when you have succeeded with this configuration?
I need to take a look at the docs, and then this will go to ON_QA.
It's just a matter of updating the docs to read "set up Calamari on all monitors".
@Gregory, in connection to this can you please reply on https://github.com/Tendrl/specifications/issues/126?
Harish confirmed.
No, no specific code change was needed to support this.
Planned tests have passed without blockers. Moving this bug to verified state.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2017-0514.html