Bug 1074632 - RFE: Manage storage node snapshots
Summary: RFE: Manage storage node snapshots
Keywords:
Status: ON_QA
Alias: None
Product: RHQ Project
Classification: Other
Component: Plugins, Storage Node
Version: 4.9
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: GA
: RHQ 4.13
Assignee: Thomas Heute
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 1118104 1074633 1099024
 
Reported: 2014-03-10 17:28 UTC by John Sanda
Modified: 2022-03-31 04:27 UTC

Fixed In Version:
Clone Of:
: 1074633 (view as bug list)
Environment:
Last Closed:
Embargoed:



Description John Sanda 2014-03-10 17:28:07 UTC
Description of problem:
Snapshots are generated weekly during scheduled maintenance and when nodes are (un)deployed. A snapshot consists of hard links to SSTable files; consequently, it initially takes up little disk space. But when an SSTable is deleted during compaction, its space is not reclaimed as long as the SSTable is included in a snapshot. This can add up over time. There is currently nothing in place for managing snapshots. Here are a few possible options:

1) Move snapshots older than X to a specified location

2) Move all snapshots to a specified location

3) Delete snapshots older than X

4) Move N snapshots (from oldest to youngest) to a specified location

5) Delete N snapshots (from oldest to youngest)

This could be done as a recurring operation. We could also introduce some new metrics to monitor snapshot disk usage, similar to what we already have for the data directories. If the disk usage exceeds a threshold, we can fire an alert and perform one of the above actions. This is another good step toward providing storage node disk management.
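
The selection rules behind the options above, plus the disk usage threshold check, could be sketched roughly as follows. This is only an illustration, not RHQ code: the `Snapshot` record and the functions `older_than`, `oldest_n`, and `exceeds_threshold` are hypothetical names, and the actual move or delete step is left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, List


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical record describing one snapshot on a storage node."""
    name: str          # snapshot identifier (assumed)
    created: datetime  # when the snapshot was taken
    size_bytes: int    # disk space that only this snapshot keeps alive


def older_than(snapshots: Iterable[Snapshot], max_age: timedelta,
               now: datetime) -> List[Snapshot]:
    """Options 1 and 3: snapshots older than X, oldest first."""
    cutoff = now - max_age
    return sorted((s for s in snapshots if s.created < cutoff),
                  key=lambda s: s.created)


def oldest_n(snapshots: Iterable[Snapshot], n: int) -> List[Snapshot]:
    """Options 4 and 5: the N oldest snapshots, oldest first."""
    return sorted(snapshots, key=lambda s: s.created)[:max(n, 0)]


def exceeds_threshold(snapshots: Iterable[Snapshot],
                      threshold_bytes: int) -> bool:
    """True if total snapshot disk usage is above the alert threshold."""
    return sum(s.size_bytes for s in snapshots) > threshold_bytes
```

A recurring operation could call `exceeds_threshold` first and, only if it returns true, pass the result of `older_than` or `oldest_n` to whichever move or delete action is configured.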

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 1 Heiko W. Rupp 2014-05-08 14:43:00 UTC
Bump the target version now that 4.11 is out.

Comment 2 Libor Zoubek 2014-07-14 15:28:39 UTC
Based on discussion with John:

 - I'll create Storage Cluster Settings (under Topology/StorageNodes/Settings) that will provide an interface to the snapshot management options above
 - I'll implement periodic snapshots as scheduled resource operations
 - when these settings change, we'll reschedule the operations on all storage nodes
 - we also need to schedule the operations when a new node joins the cluster
 - by default, automatic snapshots will be completely disabled
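
The scheduling behaviour in this list could look roughly like the in-memory sketch below. It is an assumption-laden illustration, not the merged implementation: `SnapshotSettings`, `SnapshotScheduler`, and the weekly default interval are hypothetical, and a real version would schedule resource operations on the server rather than fill a dictionary.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass(frozen=True)
class SnapshotSettings:
    """Hypothetical cluster-wide snapshot settings."""
    enabled: bool = False      # automatic snapshots are disabled by default
    interval_hours: int = 168  # assumed default interval (weekly)


class SnapshotScheduler:
    """Keeps one snapshot schedule per storage node (illustrative only)."""

    def __init__(self) -> None:
        self.settings = SnapshotSettings()
        self.nodes: Set[str] = set()
        self.schedules: Dict[str, int] = {}  # node name -> interval in hours

    def _schedule(self, node: str) -> None:
        # Replace the node's schedule, or drop it when snapshots are disabled.
        if self.settings.enabled:
            self.schedules[node] = self.settings.interval_hours
        else:
            self.schedules.pop(node, None)

    def update_settings(self, settings: SnapshotSettings) -> None:
        # A settings change reschedules the operation on all storage nodes.
        self.settings = settings
        for node in self.nodes:
            self._schedule(node)

    def node_joined(self, node: str) -> None:
        # A node joining the cluster picks up the current settings.
        self.nodes.add(node)
        self._schedule(node)
```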

Comment 3 Libor Zoubek 2014-07-24 09:47:24 UTC
First part merged to master:

commit fe30df29d11d85bff76efca6e26302e4b6f96429
Merge: 181e5e7 2ac7010
Author: jsanda <jsanda>
Date:   Wed Jul 23 09:34:08 2014 -0400

    Merge pull request #95 from lzoubek/bugs/1074632

    Bug 1074632 - RFE: Manage storage node snapshots

Comment 4 Libor Zoubek 2014-07-25 13:33:22 UTC
This commit also fixes tests that did not pass on Jenkins:

https://github.com/rhq-project/rhq/commit/2de223b66109636a8b073e70e510f35cd6b70238

