Bug 1401926 - Calamari performance benchmarks
Summary: Calamari performance benchmarks
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat
Component: Calamari
Version: 2.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: 3.0
Assignee: Christina Meno
QA Contact: ceph-qe-bugs
URL:
Whiteboard:
Depends On:
Blocks: 1406357
 
Reported: 2016-12-06 12:22 UTC by Nishanth Thomas
Modified: 2022-02-21 18:06 UTC (History)
CC List: 4 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-07-12 17:38:34 UTC
Target Upstream Version:



Description Nishanth Thomas 2016-12-06 12:22:51 UTC
Description of problem:

In a pre-defined environment, Calamari should be stress tested with a
refresh interval of 2-3 seconds per API call.
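
For illustration, a minimal polling sketch along these lines (host, port, credentials and endpoint below are placeholders, not values from this environment) could be:

#!/usr/bin/env python
# Hypothetical polling loop: hit one Calamari endpoint every 2-3 seconds
# and record per-call latency. Host, port, credentials and endpoint are
# placeholders.
import time
import requests

BASE = "http://calamari.example.com:8002/api/v2"
session = requests.Session()
session.auth = ("admin", "admin")   # placeholder credentials

for _ in range(100):                # 100 polls, roughly 4-5 minutes
    start = time.time()
    resp = session.get(BASE + "/cluster")
    elapsed = time.time() - start
    print("%s %s %.3fs" % (resp.status_code, resp.url, elapsed))
    time.sleep(2.5)                 # refresh interval between calls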

Comment 2 Christina Meno 2016-12-07 06:24:43 UTC
Do you have a list of specific calls that are exceeding this?
My experience says that the only endpoint that won't be able to meet this is /cli. Are you seeing slowness elsewhere?

Comment 3 Nishanth Thomas 2016-12-08 14:11:32 UTC
Yeah, /cli is a good candidate; we fetch a good amount of monitoring data through that endpoint. Another place where I experienced slowness is the crushnode/crushrule related endpoints when operating on a reasonably big cluster (say 20 OSDs). Also, we need to take a closer look at the APIs we have not yet examined but will be using on a frequent basis (syncobject, pool, cluster, etc.).

Performance of individual endpoints is important, but I am more worried about how Calamari handles a burst of requests over a period of time, say one request per second (probably more than that). Do we have any existing data on this?
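
As a rough sketch of that kind of burst test (again with placeholder host, credentials and endpoint), something like this could fire one request per second for a minute and summarize the latency:

# Hypothetical burst test: roughly one request per second, each on its
# own thread so slow responses do not delay the next request.
import time
import requests
from threading import Thread

BASE = "http://calamari.example.com:8002/api/v2"   # placeholder
timings = []

def one_call():
    start = time.time()
    resp = requests.get(BASE + "/cluster", auth=("admin", "admin"))
    timings.append((resp.status_code, time.time() - start))

threads = []
for _ in range(60):                 # one request per second for a minute
    t = Thread(target=one_call)
    t.start()
    threads.append(t)
    time.sleep(1.0)
for t in threads:
    t.join()

timings.sort(key=lambda pair: pair[1])
print("median %.3fs, worst %.3fs over %d calls"
      % (timings[len(timings) // 2][1], timings[-1][1], len(timings)))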

Comment 4 Christina Meno 2016-12-09 19:12:28 UTC
There is no real load-test data right now. I'm getting calamari_setup running in our lab and will write some tests that stress it.

Yeah, the crush endpoint is a bit of a concern now that you mention it. Its implementation writes out a new map on each change. Perhaps it would be better to try to group the modifications somehow.
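
A rough sketch of that grouping idea (write_crush_map() below is a hypothetical stand-in, not Calamari's actual crush code): buffer changes for a short window and write the map once per window instead of once per change.

import time

def write_crush_map(changes):
    # Hypothetical stand-in for whatever compiles and injects the map;
    # here we only note how many changes were folded into one write.
    print("writing crush map with %d batched changes" % len(changes))

class CrushBatcher(object):
    def __init__(self, flush_interval=2.0):
        self.flush_interval = flush_interval
        self.pending = []
        self.last_flush = time.time()

    def add_change(self, change):
        # Queue the change; only rewrite the map once the window expires.
        self.pending.append(change)
        if time.time() - self.last_flush >= self.flush_interval:
            self.flush()

    def flush(self):
        if self.pending:
            write_crush_map(self.pending)
            self.pending = []
        self.last_flush = time.time()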

Comment 5 Christina Meno 2017-01-04 19:35:26 UTC
Sankarshan and I discussed this before the break. We agreed that I would share a procedure for gathering the measurements and that the console team would gather them and report what they found.

Here is the current procedure:
On a monitor node:
source /opt/calamari/venv/bin/activate
supervisorctl stop calamari-lite
calamari-lite 2>&1 | tee /var/log/calamari/request_timing.log  # run in the foreground to collect access logs with a format like this:

172.21.0.100 - - [2017-01-03 19:05:39] "POST /api/v2/cluster/27246bd8-969a-4c1d-ac5f-8e7b477ad901/pool HTTP/1.1" 202 8264 2.466778

Send me your results and we can investigate the slowest endpoints.
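
To pick out the slowest endpoints from that log, a small script along these lines should do (it assumes the last whitespace-separated field is the request duration in seconds, as in the sample line above):

# Summarize the slowest endpoints from the access log collected above.
import re
from collections import defaultdict

# method, path, status, bytes, seconds -- matches the sample line format
pattern = re.compile(r'"(?P<method>\S+) (?P<path>\S+) [^"]*" \d+ \d+ (?P<secs>[\d.]+)')
timings = defaultdict(list)

with open("/var/log/calamari/request_timing.log") as fh:
    for line in fh:
        m = pattern.search(line)
        if not m:
            continue
        # Heuristic: collapse long hex ids so the same endpoint groups together
        path = re.sub(r'[0-9a-f-]{8,}', ':id', m.group("path"))
        timings[(m.group("method"), path)].append(float(m.group("secs")))

for (method, path), secs in sorted(timings.items(), key=lambda kv: max(kv[1]), reverse=True):
    print("%-6s %-50s worst %.3fs over %d calls" % (method, path, max(secs), len(secs)))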

Comment 6 Nishanth Thomas 2017-01-06 06:13:00 UTC
Setting up the test bench and implementing the sequence of steps would require us to free up hardware, all of which would take around 10 days. I will start on this and update the results here.

Comment 7 Christina Meno 2017-01-11 16:04:27 UTC
Looks like we won't be able to do anything until the next release (since we'll have the measurements too late to act in the 2.2 cycle).

