Bug 1098243

Summary:          Add auto-throttling support for storage client subsystem
Product:          [JBoss] JBoss Operations Network
Component:        Core Server, Storage Node
Version:          JON 3.2
Hardware:         Unspecified
OS:               Unspecified
Priority:         unspecified
Severity:         medium
Status:           CLOSED CURRENTRELEASE
Target Milestone: CR01
Target Release:   JON 3.3.1
Reporter:         John Sanda <jsanda>
Assignee:         Simeon Pinder <spinder>
QA Contact:       Armine Hovsepyan <ahovsepy>
CC:               ahovsepy, fbrychta, genman, hrupp, loleary, lzoubek, mfoley, miburman
Doc Type:         Bug Fix
Clone Of:         1045589
Type:             Bug
Last Closed:      2015-02-27 19:58:22 UTC
Bug Depends On:   1126209, 1045589, 1106508, 1108095, 1149342
Attachments:
  throttling-config (no flags)
  rhq-server-properties-331 (no flags)
  throttling-config-gui-331 (no flags)
  rhq-server-properties-331_new (no flags)

Description John Sanda 2014-05-15 14:42:55 UTC
+++ This bug was initially created as a clone of Bug #1045589 +++

Description of problem:
The changes for bug 1009945 put a new throttling mechanism in place for the storage client. Previously, only the insertion of raw metrics was throttled since that was the only place where async requests were used. We should implement an auto-throttling strategy that is configurable through the RHQ plugin (see bug 1045584).

If the storage cluster cannot keep up with the requests being submitted by an RHQ server and we consequently start hitting timeout exceptions, then we should increase the throttling to avoid further request timeouts. We might also want a pre-defined alert definition for this.

When the storage cluster grows, it stands to reason that we ought to decrease the throttling to improve throughput. And if the cluster decreases in size, then we ought to increase the throttling. Again, all of this should be configurable in some way since the performance characteristics will vary from environment to environment.
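
As a rough sketch of the kind of topology-driven adjustment being described, using Guava's RateLimiter (the class, method names, and delta value below are illustrative, not the RHQ implementation):

    import com.google.common.util.concurrent.RateLimiter;

    // Illustrative sketch: widen the request budget when the storage cluster
    // grows and tighten it when the cluster shrinks. The per-node delta is a
    // made-up example value.
    public class TopologyAwareThrottle {
        private static final double PER_NODE_DELTA = 500.0; // requests/sec (example)

        private volatile RateLimiter limiter;
        private double rate;

        public TopologyAwareThrottle(double initialRate) {
            this.rate = initialRate;
            this.limiter = RateLimiter.create(initialRate);
        }

        // Wire these to the driver's node added/removed notifications.
        public synchronized void onNodeAdded()   { setRate(rate + PER_NODE_DELTA); }
        public synchronized void onNodeRemoved() { setRate(rate - PER_NODE_DELTA); }

        private void setRate(double newRate) {
            rate = Math.max(newRate, PER_NODE_DELTA); // keep a sane floor
            limiter = RateLimiter.create(rate);
        }

        public void acquire() {
            limiter.acquire(); // blocks the caller until a permit is available
        }
    }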


--- Additional comment from Elias Ross on 2014-01-09 14:47:52 EST ---


What I did many years ago for an SMPP server (SMS messages) was to create a notion of a window size, as TCP does. You simply track the number of outstanding requests and then either block or buffer additional requests until the window size goes down. I used JMS at the time to buffer, but you can always reply to the client to stop sending more traffic, etc. (Not sure this would really help matters, though, unless it was a temporary issue.)
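
As a sketch of the window idea using a plain java.util.concurrent.Semaphore (the class name and window size are illustrative):

    import java.util.concurrent.Semaphore;

    // Track outstanding requests against a fixed window; senders block when
    // the window is full, and a slot is released when a response arrives.
    public class RequestWindow {
        private final Semaphore window;

        public RequestWindow(int size) { // e.g. 1000 outstanding requests
            this.window = new Semaphore(size);
        }

        public void beforeSend() throws InterruptedException {
            window.acquire(); // blocks once 'size' requests are in flight
        }

        public void onResponse() { // call from the async completion callback
            window.release();
        }
    }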

But it seems the driver already has this notion. Wouldn't you simply get an exception immediately if more than, say, 1000 async outstanding requests were made?

    PoolingOptions poolingOptions = builder.poolingOptions();
    poolingOptions.setMaxSimultaneousRequestsPerConnectionThreshold(HostDistance.LOCAL,
        option("rhq.storage.client.local-simultaneous", 1000));
    poolingOptions.setMaxSimultaneousRequestsPerConnectionThreshold(HostDistance.REMOTE,
        option("rhq.storage.client.remote-simultaneous", 1000));

Anyway, needs more research, I guess.

--- Additional comment from John Sanda on 2014-01-09 15:15:57 EST ---

The solution you described is in effect what RateLimiter does. The driver does not have any built-in throttling; within the constraints of the PoolingOptions used, it will fire off requests as fast as it can. This makes it easy for the client to overwhelm Cassandra. For RHQ this is a concern particularly when merging measurement reports, where we can quickly wind up with lots of concurrent writes.
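
For reference, the RateLimiter pattern in question looks roughly like this (the class name and rate are arbitrary examples, not RHQ defaults):

    import com.google.common.util.concurrent.RateLimiter;

    // Cap the client at a fixed number of requests per second. acquire()
    // blocks the calling thread until a permit is available, smoothing
    // bursts instead of failing fast.
    public class WriteThrottle {
        private final RateLimiter writePermits = RateLimiter.create(5000.0); // example rate

        public void submit(Runnable asyncInsert) {
            writePermits.acquire(); // may sleep to honor the requests/sec budget
            asyncInsert.run();
        }
    }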

--- Additional comment from Elias Ross on 2014-01-09 15:29:34 EST ---

If Cassandra is busy, the number of outstanding requests will increase, and eventually the driver will indicate that Cassandra is saturated and prevent any additional requests. I would consider this 'throttling', like how TCP does throttling using window size, etc. Effectively you are bound by the maximum number of connections times the maximum number of simultaneous requests per connection. When all connections are fully busy, you should see the TimeoutException.

I agree that setting up rate limiting on the server side may help deal with temporary spikes in load, e.g. metrics aggregation jobs that can safely span an hour, but it does not make sense for insertion of raw metrics IMHO.

The primary issue is that you don't have any way to identify the correct rate limit. If you guess too low, you end up with a growing backlog. If you guess too high, you hit the Cassandra/driver limit anyway.

What you could do is calculate the job duration for aggregation and, if it is less than 15 minutes, add throttling automatically. And vice versa.

--- Additional comment from Elias Ross on 2014-01-10 14:11:31 EST ---

Yes, I see now that every server thread raising a TimeoutException at the same time would cause trouble, and the driver can't tell you when this might happen.

Still, I can imagine scenarios where the rate is fine 99% of the time, but then (say, while you're running 'nodetool repair') the shit hits the fan, so to speak: all the requests back up, clients retry, and this repeats until performance recovers. Fundamentally, I see the rate limiter as sort of a hack on top of bad driver design, so I'm not really in favor of it...

What you really want is a way to detect when the driver limit is being reached, then block the calling threads, pushing back-pressure onto the client (producer) side.

I do agree that the aggregate jobs probably need to be rate limited in some way, just to prevent unnecessary saturation.

--- Additional comment from John Sanda on 2014-01-10 16:49:50 EST ---

@Elias,

Your analogy with TCP window size is a good one, and RateLimiter is what provides that behavior. Without any throttling, we cannot make any guarantees about request rates. Suppose there is a burst of incoming measurement reports while, at the same time, a REST and/or remote client is pushing or pulling a lot of metric data. Or suppose that when those measurement reports come in, the driver is just about at the point of saturation, and then some requests to load metric graphs push us past the tipping point. What would happen? A lot of requests would wind up failing, and overall throughput would suffer.

By using RateLimiter, we can make some guarantees around request throughput. And by making it configurable (see bug 1045584), the throughput can be tailored to your environment. Another thing to keep in mind is that for testing purposes I have been working with only a single node. These errors are far less likely to happen in a production deployment with multiple nodes.

--- Additional comment from Elias Ross on 2014-01-15 15:10:09 EST ---


There are a couple of issues with RateLimiter. I'd like to make some suggestions.

1) The metrics index update: if you rate limit within an execution thread pool, you can end up with a lot of threads or queued future tasks. Neither is great; I've run out of memory in some situations. I think the metrics index update should not require any write permits. There is no feedback mechanism to the client side, and the rate limiting is already happening on the front end anyway.

2) I would suggest obtaining N permits for N data points up front, rather than one permit per data point as you go, because writes can stall in the middle.

@@ -351,8 +351,8 @@ public void addNumericData(final Set<MeasurementDataNumeric> dataSet,
             final long startTime = System.currentTimeMillis();
             final AtomicInteger remainingInserts = new AtomicInteger(dataSet.size());
 
+            writePermits.acquire(dataSet.size());
             for (final MeasurementDataNumeric data : dataSet) {
-                writePermits.acquire();

--- Additional comment from John Sanda on 2014-02-10 11:58:08 EST ---

Changes have finally landed in master. Here is a brief summary. The throttling, which is done using a RateLimiter, has been pushed down into StorageSession, so the code base is no longer littered with calls to RateLimiter.acquire(). When there is a topology change (e.g., node added/removed or node up/down), throttling will be adjusted by a configurable amount. If there is a client-side request timeout, throttling will be increased by a configurable amount. The properties that control and tune the throttling are all defined in rhq-server.properties. As a reminder, bug 1045584 has been created so that these settings will be exposed in the resource configuration of the RHQ Server. Here are the newly added properties:

rhq.storage.request.limit
rhq.storage.request.limit.topology-delta
rhq.storage.request.limit.timeout-delta
rhq.storage.request.limit.timeout-dampening
rhq.storage.request.limit.min
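
For illustration only, the settings might look like this in rhq-server.properties; the values and one-line glosses below are inferred from the property names, not copied from the shipped file:

    # Target request rate for the storage client, in requests per second (example value)
    rhq.storage.request.limit=30000
    # How much the limit changes when the cluster topology changes (example value)
    rhq.storage.request.limit.topology-delta=5000
    # How much the limit drops after a client-side request timeout (example value)
    rhq.storage.request.limit.timeout-delta=5000
    # Dampens how quickly successive timeouts can lower the limit (example value)
    rhq.storage.request.limit.timeout-dampening=30000
    # Floor below which the limit will not drop (example value)
    rhq.storage.request.limit.min=5000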


The properties are documented in rhq-server.properties. I will also update the design doc[1] and add a page to the user docs[2].

[1] https://docs.jboss.org/author/display/RHQ/Request+Throttling
[2] https://docs.jboss.org/author/display/RHQ/RHQ+Storage+Cluster+Administration

--- Additional comment from Heiko W. Rupp on 2014-04-23 08:30:55 EDT ---

Bulk closing of 4.10 issues.

If an issue is not solved for you, please open a new BZ (or clone the existing one) with a version designator of 4.10.

Comment 2 John Sanda 2014-09-02 18:58:22 UTC
The upstream bug 1045589 has already been closed, but this needs to be revisited. We have since upgraded the DataStax driver, and there may have been changes in when the driver sends event notifications that could impact the throttling. I am setting the target milestone to ER4 to allow time to revisit this.

Comment 3 John Sanda 2014-09-26 21:25:44 UTC
The changes for this have actually been in the release branch. I left the bug ASSIGNED, though, due to ongoing work on other, related bugs. Earlier today while testing I realized that there is an issue that needs to be resolved.

When a timeout occurs, StorageSession.java decreases the request limit. The request limit, however, will not be increased back to its original rate without a server restart. This is bad because timeouts are likely to occur during spikes or bursts in traffic. For example, if a server has been offline for a while, agents will spool measurement reports, and when the server comes back online it will get hammered with them. We are more susceptible to timeouts during bursts like this. Once the burst or spike in traffic has subsided, we should ideally return to the original requests-per-second rate. If, on the other hand, there is a pattern of repeated timeouts over a period of time that includes both peaks and valleys in request traffic, that is likely an indicator that we need to either:

A) decrease the rate limit
B) decrease load
C) deploy additional nodes

I think that dealing with repeated timeouts over a sustained interval is beyond the scope of this bug, but we do need to deal with the temporary spikes so that a server restart is not needed to get back to the original request limit. We will have to tackle this in ER05.

Comment 4 Michael Burman 2014-10-10 06:00:13 UTC
Added to master:

commit 57c72fd8254998281c157852d0f3bf3e363c09a6
Author: Michael Burman <miburman>
Date:   Tue Oct 7 12:36:50 2014 +0300

    [BZ 1098243] [BZ 1149342] When encountering a timeout on StorageSession, recreate RateLimiter with higher warmup time.
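
A rough sketch of the approach the commit describes, recreating the limiter with a longer warmup so throughput ramps back to the target rate on its own (the class and field names are illustrative, not from the RHQ source):

    import java.util.concurrent.TimeUnit;
    import com.google.common.util.concurrent.RateLimiter;

    // Illustrative sketch: on a client-side timeout, replace the limiter with
    // one that keeps the same target rate but a longer warmup period, so the
    // request rate ramps back up gradually instead of staying lowered until
    // a server restart.
    public class WarmupThrottle {
        private volatile RateLimiter writePermits;
        private final double targetRate; // requests per second
        private long warmupSeconds = 30; // example starting warmup

        public WarmupThrottle(double targetRate) {
            this.targetRate = targetRate;
            this.writePermits = RateLimiter.create(targetRate, warmupSeconds, TimeUnit.SECONDS);
        }

        // Called when the driver reports a client-side request timeout.
        public synchronized void onRequestTimeout() {
            warmupSeconds *= 2; // back off harder on repeated timeouts
            writePermits = RateLimiter.create(targetRate, warmupSeconds, TimeUnit.SECONDS);
        }

        public void acquire() {
            writePermits.acquire();
        }
    }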

Comment 5 Libor Zoubek 2014-10-14 13:25:50 UTC
branch: release/jon3.3.x
commit b0031db6702b26520b92fb30b83774dcb53f86fb
Author: Michael Burman <miburman>
Date:   Tue Oct 7 12:36:50 2014 +0300

    [BZ 1098243] [BZ 1149342] When encountering a timeout on StorageSession, recreate RateLimiter with higher warmup time.
    
    (cherry picked from commit 57c72fd8254998281c157852d0f3bf3e363c09a6)
    Signed-off-by: Libor Zoubek <lzoubek>

Comment 6 Simeon Pinder 2014-10-21 20:24:18 UTC
Moving to ON_QA as available to test with the latest brew build:
https://brewweb.devel.redhat.com//buildinfo?buildID=394734

Comment 7 Armine Hovsepyan 2014-10-31 13:55:19 UTC
Misleading description in the GUI - screenshot attached.

Comment 8 Armine Hovsepyan 2014-10-31 13:55:52 UTC
Created attachment 952512 [details]
throttling-config

Comment 10 John Sanda 2014-10-31 14:14:11 UTC
I agree that the description should be updated. The property descriptions in rhq-server.properties should also be updated.

Comment 11 Armine Hovsepyan 2014-10-31 15:34:33 UTC
reassigning based on comment #10

Comment 13 Mike Foley 2014-10-31 17:02:27 UTC
Passing TCMS test case execution run for throttling:

https://tcms.engineering.redhat.com/run/191519/

Comment 15 Larry O'Leary 2015-01-05 22:15:04 UTC
Please clarify whether this is functionally complete. From the looks of it, the only remaining task for this BZ is to fix the property descriptions in the JBoss ON UI and the rhq-server.properties file.

Target is being set to 3.3.1, as the description change would be low risk.

Comment 16 Michael Burman 2015-01-07 08:47:47 UTC
Yes, this is functionally complete, but it needs updates to the UI and the properties file to remove redundant entries.

Comment 17 Michael Burman 2015-01-07 12:54:35 UTC
Fixed in master: removed the old information and added new configuration options (as well as a metric for the current warmup period, in case the user wants to create an alert based on throttling).

commit e8a902ed399920177c2cb2bf4801be7ca61309ba
Author: Michael Burman <miburman>
Date:   Wed Jan 7 14:53:30 2015 +0200

    [BZ 1098243] Remove redundant request limit configuration options and make the warmup periods configurable through the UI

Comment 18 Libor Zoubek 2015-01-12 09:43:15 UTC
branch:  release/jon3.3.x
link:    https://github.com/rhq-project/rhq/commit/a8cbf18cf
time:    2015-01-12 10:42:33 +0100
commit:  a8cbf18cf25eb8dfa20886ebc5b7dcc40ce5d06b
author:  Michael Burman - miburman
message: [BZ 1098243] Remove redundant request limit configuration options and
         make the warmup periods configurable through the UI
         (cherry picked from commit
         e8a902ed399920177c2cb2bf4801be7ca61309ba) Signed-off-by: Libor
         Zoubek <lzoubek>

Comment 19 Simeon Pinder 2015-01-26 08:15:07 UTC
Moving to ON_QA as available for test with the latest 3.3.1.ER01 bits from here:
http://download.devel.redhat.com/brewroot/packages/org.jboss.on-jboss-on-parent/3.3.0.GA/12/maven/org/jboss/on/jon-server-patch/3.3.0.GA/jon-server-patch-3.3.0.GA.zip

Comment 21 Armine Hovsepyan 2015-01-28 13:39:33 UTC
1. The fix is visible in the GUI -- see the screenshot attached.
2. The rhq-server.properties fix (from https://github.com/rhq-project/rhq/commit/a8cbf18cf) is not visible after applying the patch -- it seems it was not included in the patch script -- screenshot attached.

reassigning back

Comment 22 Armine Hovsepyan 2015-01-28 13:41:48 UTC
Created attachment 985159 [details]
rhq-server-properties-331

Comment 23 Armine Hovsepyan 2015-01-28 13:43:21 UTC
Created attachment 985160 [details]
throttling-config-gui-331

Comment 25 Simeon Pinder 2015-02-16 04:49:32 UTC
Moving to ON_QA as available to test with latest CP build:
http://download.devel.redhat.com/brewroot/packages/org.jboss.on-jboss-on-parent/3.3.0.GA/16/maven/org/jboss/on/jon-server-patch/3.3.0.GA/jon-server-patch-3.3.0.GA.zip
*Note: jon-server-patch-3.3.0.GA.zip maps to CR01 build of jon-server-3.3.0.GA-update-01.zip.

Comment 26 Armine Hovsepyan 2015-02-16 10:43:09 UTC
Verified in JON 3.3.1 CR1:
1. rhq-server.properties.new is created with the correct changes
2. applied the patch, with rhq-server.properties merged with rhq-server.properties.new

Comment 27 Armine Hovsepyan 2015-02-16 10:45:03 UTC
Created attachment 992109 [details]
rhq-server-properties-331_new