609534 – Perf: OOME when server is swamped with Calltime data

Summary: Perf: OOME when server is swamped with Calltime data

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	RHQ Project
Classification:	Other
Component:	Monitoring
Sub Component:
Version:	3.0.0
Hardware:	All
OS:	All
Priority:	low
Severity:	medium
Target Milestone:	---
Target Release:	---
Assignee:	RHQ Project Maintainer
QA Contact:	Mike Foley
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	rhq-perf

Reported:	2010-06-30 14:19 UTC by Heiko W. Rupp
Modified:	2014-05-29 21:10 UTC
CC List:	2 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2014-05-29 21:10:20 UTC
Embargoed:

Attachments
Patch mentioned in the description (1.56 KB, application/octet-stream) 2010-06-30 14:19 UTC, Heiko W. Rupp

Description Heiko W. Rupp 2010-06-30 14:19:55 UTC

Created attachment 428005
Patch mentioned in the description

When the server is swamped with call-time data, even a slight slowdown on the database (for example) makes the call-time (CT) data pile up until the server throws an OutOfMemoryError (OOME). How much data it takes depends on JVM size, database, etc.; in my case the RHQ server has a 384 MB max heap and I am supplying around 800k to 1M values per hour.
The heap dump shows:
- a prepared statement that is 27 MB in size
- 3 HTTP threads holding 77 MB of data each

The attached (admittedly too simple) patch improves the situation a lot: the data is sent to the database in smaller chunks, so the prepared statement does not grow as large.
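The chunking idea can be sketched roughly as follows. This is not the attached patch, only an illustration of the technique: the class `ChunkedWriter`, the method `writeInChunks` and the `sink` callback (standing in for whatever writes one batch to the database) are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class ChunkedWriter {

    /**
     * Hands the values to the sink in consecutive chunks of at most chunkSize
     * elements, so that no single write has to hold the whole list.
     * Returns the number of chunks that were written.
     */
    static <T> int writeInChunks(List<T> values, int chunkSize, Consumer<List<T>> sink) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        int chunks = 0;
        for (int from = 0; from < values.size(); from += chunkSize) {
            int to = Math.min(from + chunkSize, values.size());
            // subList is a view, so no extra copy of the data is made here
            sink.accept(values.subList(from, to));
            chunks++;
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<Integer> values = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            values.add(i);
        }
        // The sink here only records chunk sizes; a real one would run a batch insert.
        List<Integer> sizes = new ArrayList<>();
        int chunks = writeInChunks(values, 4, chunk -> sizes.add(chunk.size()));
        System.out.println(chunks + " chunks, sizes " + sizes);
    }
}
```

In a real server the sink would presumably bind one chunk to a prepared statement and execute it, which is what keeps the statement small.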

An improved version of the patch would:
- look at the size of the incoming data and not chop it into pieces that are too small
- chunk the data that goes into alert processing as well
- null out the already processed data after the previous step to help garbage collection
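The three points above could look roughly like this. It is a sketch under assumptions, not the proposed patch: `CallTimeChunker`, `chooseChunkSize`, `process` and the `store` / `alert` callbacks are hypothetical names, and the chunk-size bounds are made-up example values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

class CallTimeChunker {

    /**
     * Picks a chunk size from the size of the incoming data: aims for
     * targetChunks pieces, but never goes below minChunk or above maxChunk.
     */
    static int chooseChunkSize(int total, int targetChunks, int minChunk, int maxChunk) {
        if (targetChunks <= 0 || minChunk <= 0 || maxChunk < minChunk) {
            throw new IllegalArgumentException("invalid chunking bounds");
        }
        int size = (total + targetChunks - 1) / targetChunks; // ceiling division
        return Math.max(minChunk, Math.min(maxChunk, size));
    }

    /**
     * Runs storage and alert processing chunk by chunk, then nulls out the
     * processed slots so the garbage collector can reclaim them early.
     * Returns the number of chunks processed.
     */
    static <T> int process(T[] data, int chunkSize, Consumer<List<T>> store, Consumer<List<T>> alert) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        int chunks = 0;
        for (int from = 0; from < data.length; from += chunkSize) {
            int to = Math.min(from + chunkSize, data.length);
            // Copy the chunk so it stays valid for the callbacks after the slots are cleared
            List<T> chunk = new ArrayList<>(Arrays.asList(data).subList(from, to));
            store.accept(chunk);
            alert.accept(chunk);
            Arrays.fill(data, from, to, null); // drop references to processed data
            chunks++;
        }
        return chunks;
    }

    public static void main(String[] args) {
        Integer[] data = new Integer[1000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i;
        }
        int chunkSize = chooseChunkSize(data.length, 10, 50, 500);
        int chunks = process(data, chunkSize, c -> { }, c -> { });
        System.out.println("chunk size " + chunkSize + ", chunks " + chunks);
    }
}
```

The lower bound keeps small reports from being split into many tiny database round trips, while the upper bound caps how much data a single statement or alert pass has to hold.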

Comment 1 Heiko W. Rupp 2010-06-30 15:25:40 UTC

With the patch (and with the end_time index present) I can go to 4-minute intervals, which means ~1.4 million values/hour. So I propose including this in the next release.

Comment 2 Heiko W. Rupp 2010-06-30 15:48:32 UTC

Actually, it allows processing ~100k values per minute, every minute, which amounts to 6 million values per hour, on a Postgres instance running on a single laptop hard disk and an RHQ server VM with 384 MB of RAM.

This has only been observed over a 20-minute interval so far, so it is no definitive proof, but the VM statistics look good.