Bug 699782 - Default max_fsm_pages setting and cumin vacuum interval is not suitable for med/large scale
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: cumin
Version: Development
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: 2.0
Target Release: ---
Assignee: Trevor McKay
QA Contact: MRG Quality Engineering
URL:
Whiteboard:
Duplicates: 697640
Depends On:
Blocks:
Reported: 2011-04-26 15:42 UTC by Trevor McKay
Modified: 2011-06-23 13:15 UTC (History)

Fixed In Version: cumin-0.1.4746-1.el5
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-06-23 13:15:54 UTC
Target Upstream Version:



Links
System: Red Hat Bugzilla
ID: 699859
Private: 0
Priority: medium
Status: CLOSED
Summary: Release note on adjusting max_fsm_pages configuration for cumin postgresql database
Last Updated: 2021-02-22 00:41:40 UTC

Description Trevor McKay 2011-04-26 15:42:27 UTC
Description of problem:

We need to reset defaults and/or provide instructions to end users on how to set the cumin vacuum interval and the postgres parameter max_fsm_pages.

In overnight testing, with 100+ submissions per second and around 4000 slots, we found that postgres was not managing free space effectively.  The database "leaked": more free space had to be tracked per vacuum interval than postgres could track, so instead of reusing the untracked space postgres went to disk for more.

Shortening the vacuum interval to 15 minutes and increasing the max_fsm_pages value to 256K seems to be effective, but we do not have a useful heuristic at this point.  The right numbers will depend on the rate of submissions, completions, etc.
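
To make the trade-off between the two settings concrete, here is a rough sketch in Python. It is not a measured heuristic: the model (pages freed at a steady rate, capacity equal to max_fsm_pages) is a simplifying assumption, and the function names and the example rate are hypothetical.

```python
# Rough model, for illustration only: pages with reusable space pile up
# between vacuums, and postgres can remember at most max_fsm_pages of them.
# The function names and any rates passed in are hypothetical.

MAX_FSM_PAGES_256K = 256 * 1024  # "256K" from the report, as a page count


def pages_needing_tracking(pages_freed_per_minute, vacuum_interval_minutes):
    """Pages that must be tracked by the time the next vacuum runs."""
    return pages_freed_per_minute * vacuum_interval_minutes


def untracked_pages(pages_freed_per_minute, vacuum_interval_minutes,
                    max_fsm_pages):
    """Pages whose free space cannot be tracked, and so is not reused."""
    needed = pages_needing_tracking(pages_freed_per_minute,
                                    vacuum_interval_minutes)
    return max(0, needed - max_fsm_pages)
```

Under this model a shorter interval lowers the number of pages to track and a larger max_fsm_pages raises the ceiling, which matches the direction of both changes described above. A result greater than zero corresponds to the "leak".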

Comment 1 Trevor McKay 2011-04-26 17:32:26 UTC
*** Bug 697640 has been marked as a duplicate of this bug. ***

Comment 2 Trevor McKay 2011-04-26 19:06:22 UTC
The plan is to address this in two ways:

1) change the "out of the box" configuration, which includes multiple cumin-data instances for medium scale and up, to run vacuuming and sample expiration from a single thread with a 15 minute interval.

2) include a Release Note which covers setting the max_fsm_pages postgres parameter, a suggested value, and how to run a SQL command that will indicate whether or not the current value is appropriate. (BZ699859)
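
The comment above does not name the SQL command. One candidate is a database-wide VACUUM VERBOSE, which in postgres releases that have max_fsm_pages ends with a free space map summary. The sketch below assumes that command and assumes the summary uses the "page slots are required" / "Current limits are" wording; the wording differs between postgres versions, so treat both patterns as assumptions to adjust. The function name is hypothetical.

```python
import re

# Assumed wording of the free space map summary; adjust to match the
# output of the postgres version actually in use.
_REQUIRED = re.compile(r"(\d+) page slots are required")
_LIMIT = re.compile(r"Current limits are:\s+(\d+) page slots")


def fsm_headroom(vacuum_output):
    """Return (configured page slots - required page slots), or None.

    A negative result means more free space needs tracking than the
    current max_fsm_pages allows. None means the summary was not found.
    """
    required = _REQUIRED.search(vacuum_output)
    limit = _LIMIT.search(vacuum_output)
    if required is None or limit is None:
        return None
    return int(limit.group(1)) - int(required.group(1))
```

Feeding the captured output of the vacuum run to fsm_headroom gives a single number to watch: if it is negative, or close to zero under load, max_fsm_pages is likely too small.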

Comment 3 Trevor McKay 2011-04-26 20:00:40 UTC
Default config file fixed in revision 4741.

To test, do something like:

1) Run cumin

2) grep -l "is enabled" data.*.log
data.grid.log

3) grep -l "is disabled" data.*.log
data.grid-slots.log
data.grid-submissions.log
data.sesame.log

4) Wait 15 minutes

5) grep -l "Starting vacuum" data.*.log
data.grid.log

6) grep -l "Starting expire" data.*.log
data.grid.log
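
The grep checks in steps 2, 3, 5 and 6 can also be scripted. This is a minimal sketch that assumes the log files sit in one directory and match data.*.log as in the steps above; the function names are hypothetical, and the script only repeats the greps, it does not wait the 15 minutes for you.

```python
import glob
import os


def logs_containing(log_dir, needle, pattern="data.*.log"):
    """Return sorted base names of matching logs that contain needle.

    Equivalent in spirit to: grep -l "<needle>" data.*.log
    """
    matches = []
    for path in sorted(glob.glob(os.path.join(log_dir, pattern))):
        with open(path, errors="replace") as handle:
            if any(needle in line for line in handle):
                matches.append(os.path.basename(path))
    return matches


def only_grid_runs_maintenance(log_dir):
    """True if data.grid.log alone reports enabled, vacuum and expire."""
    expected = ["data.grid.log"]
    enabled_ok = all(
        logs_containing(log_dir, needle) == expected
        for needle in ("is enabled", "Starting vacuum", "Starting expire")
    )
    # The instance that runs maintenance must not also report "is disabled".
    disabled = logs_containing(log_dir, "is disabled")
    return enabled_ok and "data.grid.log" not in disabled
```

Run only_grid_runs_maintenance on the cumin log directory after the 15 minute wait; a False result means some other cumin-data instance is vacuuming or expiring samples, or that data.grid.log has not logged them yet.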

