+++ This bug was initially created as a clone of Bug #1074632 +++

Description of problem:

Snapshots are generated weekly during scheduled maintenance and when nodes are (un)deployed. A snapshot consists of hard links to SSTable files; consequently, it takes up little disk space. But when an SSTable is deleted during compaction, its space will not be reclaimed if the SSTable is included in a snapshot. This can add up over time. There is currently nothing in place for managing snapshots. Here are a few possible options:

1) Move snapshots older than X to a specified location
2) Move all snapshots to a specified location
3) Delete snapshots older than X
4) Move N snapshots (from oldest to youngest) to a specified location
5) Delete N snapshots (from oldest to youngest)

This could be done as a recurring operation. We could also introduce some new metrics to monitor snapshot disk usage, similar to what we already have for the data directories. If the disk usage exceeds a threshold, we can fire an alert and perform one of the above actions. This is another good step we should take toward providing storage node disk management.
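To illustrate why a snapshot keeps disk space from being reclaimed, here is a minimal Python sketch of the hard-link behavior described above. It is not RHQ or Cassandra code; the file names and the `demo_hard_link_retention` helper are hypothetical, and it assumes a filesystem that supports hard links.

```python
import os
import tempfile


def demo_hard_link_retention():
    """Show that a hard-linked 'snapshot' keeps data alive after the original is deleted."""
    with tempfile.TemporaryDirectory() as root:
        live_file = os.path.join(root, "data.db")  # stands in for an SSTable (hypothetical name)
        snapshot_dir = os.path.join(root, "snapshots")
        os.mkdir(snapshot_dir)
        snapshot_file = os.path.join(snapshot_dir, "data.db")

        with open(live_file, "wb") as f:
            f.write(b"x" * 4096)

        # Taking the snapshot: a hard link, so no data is copied.
        os.link(live_file, snapshot_file)
        links_before = os.stat(live_file).st_nlink

        # "Compaction" removes the live file, but the snapshot still references the data.
        os.remove(live_file)
        snapshot_exists = os.path.exists(snapshot_file)
        snapshot_size = os.path.getsize(snapshot_file)
        links_after = os.stat(snapshot_file).st_nlink

        return links_before, snapshot_exists, snapshot_size, links_after
```

The space is only freed once the last link, here the one in the snapshot directory, is removed as well.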
Note that snapshots also appear to be created during JBoss ON server start. Is this expected?
Upstream work done; see the comments on Bug 1074632 for cherry-picking.
Cherry-picking commits 6640eabc6735327f8261f0 0bef943a102c2bd13f4296 2de223b66109636a8b073e and 2ac701032cf7b05de8a285

branch: release/jon3.3.x
link: https://github.com/rhq-project/rhq/commit/a2a567746
time: 2014-08-11 18:56:54 +0200
commit: a2a56774602a65677b4ef36bbecea7c19137ca7c
author: Libor Zoubek - lzoubek
message: Bug 1074633 - RFE: Manage storage node snapshots

    Second piece of the implementation for the above BZ. This patch adds server-side capability to manage snapshots for the storage cluster. This basically means that we regularly run the takeSnapshot operation with several parameters, so the user can decide whether to keep all snapshots, keep only the latest ones, or move older ones to a specified location.

    6 new private/read-only system settings have been introduced; those settings can be updated only via the Storage admin pages. The user can enable/disable snapshot management for the storage cluster and set a cron expression to run the management task regularly (the additional 4 settings are basically parameters for the takeSnapshot operation introduced in the previous commit for this BZ).

    When Cluster Settings are saved in the UI, we re-schedule takeSnapshot operations on all StorageNode resources in inventory.

    Snapshot-related code was removed from StorageNodeManagerBean#runClusterMaintenance()

    (cherry picked from commit 6640eabc6735327f8261f0fed23afd942d0ce801)
    Signed-off-by: Jirka Kremser <jkremser>

branch: release/jon3.3.x
link: https://github.com/rhq-project/rhq/commit/b9a429f70
time: 2014-08-11 18:56:54 +0200
commit: b9a429f70973e757f2e89c13421c24bf03cafa80
author: Libor Zoubek - lzoubek
message: Bug 1074633 - RFE: Manage storage node snapshots

    next attempt to fix storageNode takeSnapshots itests

    (cherry picked from commit 0bef943a102c2bd13f42964fa35c8dd651ca9790)
    Signed-off-by: Jirka Kremser <jkremser>

branch: release/jon3.3.x
link: https://github.com/rhq-project/rhq/commit/1da0e2100
time: 2014-08-11 18:56:54 +0200
commit: 1da0e2100b8336d5d4cdf298bbaf618392e44b2c
author: Libor Zoubek - lzoubek
message: Bug 1074633 - RFE: Manage storage node snapshots

    StorageNodeComponentItest - output more details when assertion fails

    (cherry picked from commit 2de223b66109636a8b073e70e510f35cd6b70238)
    Signed-off-by: Jirka Kremser <jkremser>

branch: release/jon3.3.x
link: https://github.com/rhq-project/rhq/commit/b0ab71681
time: 2014-08-11 18:56:54 +0200
commit: b0ab71681d20379c0c98c04d9c252aa5eab3a1fe
author: Libor Zoubek - lzoubek
message: Bug 1074633 - RFE: Manage storage node snapshots

    Added "Take Snapshot" operation to the StorageNode resource. This operation takes several other parameters and it can either move or delete older snapshots. Basically it allows you to
    - keep the N latest snapshots and move/delete older ones
    - keep snapshots not older than N days and move/delete the older ones

    (cherry picked from commit 2ac701032cf7b05de8a285210de619a5f62e5191)
    Signed-off-by: Jirka Kremser <jkremser>
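The two retention modes described for the "Take Snapshot" operation (keep the N latest, or keep those not older than N days) can be sketched roughly as below. This is an illustrative Python sketch, not the RHQ plugin's Java implementation; the `Snapshot` type, `select_for_cleanup`, and the `keep_last` / `max_age_days` parameter names are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional

SECONDS_PER_DAY = 86400


@dataclass(frozen=True)
class Snapshot:
    name: str
    created: float  # creation time in seconds since the epoch


def select_for_cleanup(
    snapshots: List[Snapshot],
    now: float,
    keep_last: Optional[int] = None,
    max_age_days: Optional[float] = None,
) -> List[Snapshot]:
    """Return the snapshots that would be moved or deleted, newest first.

    Exactly one policy is applied: keep_last takes precedence over max_age_days.
    With no policy set, every snapshot is kept.
    """
    newest_first = sorted(snapshots, key=lambda s: s.created, reverse=True)
    if keep_last is not None:
        # Everything after the N newest snapshots is a cleanup candidate.
        return newest_first[keep_last:]
    if max_age_days is not None:
        cutoff = now - max_age_days * SECONDS_PER_DAY
        return [s for s in newest_first if s.created < cutoff]
    return []
```

Whether the selected snapshots are then deleted or moved to another location is a separate step, so the selection logic can be tested without touching the filesystem.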
Additional commit went to master & release branch

master

commit 1c43d4db43f575c22ecf37d3bdb67e4ae449b613
Author: Libor Zoubek <lzoubek>
Date: Tue Aug 12 13:26:35 2014 +0200

    Bug 1074633 - RFE: Manage storage node snapshots

    Improve UX - disable several fields in Cluster Settings based on other field's selection when not relevant.

release branch

commit 0cf44decd9d4e569b168b4634c5e4abbf5b2572e
Author: Libor Zoubek <lzoubek>
Date: Tue Aug 12 13:26:35 2014 +0200

    Bug 1074633 - RFE: Manage storage node snapshots

    Improve UX - disable several fields in Cluster Settings based on other field's selection when not relevant.

    (cherry picked from commit 1c43d4db43f575c22ecf37d3bdb67e4ae449b613)
    Signed-off-by: Libor Zoubek <lzoubek>
Moving to ON_QA as available to test in the following brew build: https://brewweb.devel.redhat.com//buildinfo?buildID=379025
Additional commits fixing a test failure

branch: master
link: https://github.com/rhq-project/rhq/commit/aa02fec5a
time: 2014-09-25 09:59:04 +0200
commit: aa02fec5a20eb492bb911f5117f5eb1b020582d2
author: Libor Zoubek - lzoubek
message: Bug 1074633 - RFE: Manage storage node snapshots

    Fix NPE that occurred in org.rhq.enterprise.server.discovery.DiscoveryBossBeanTest.testAutoImportStorageNode

branch: master
link: https://github.com/rhq-project/rhq/commit/41802d12c
time: 2014-09-25 22:17:02 +0200
commit: 41802d12c152863f271c0aa8ad00902d62c3bbf9
author: Libor Zoubek - lzoubek
message: Bug 1074633 - RFE: Manage storage node snapshots

    avoid NPE
Test failure fixed in master and cherry-picked to the release branch

branch: release/jon3.3.x
link: https://github.com/rhq-project/rhq/commit/8e5616c2b
time: 2014-09-26 09:57:46 +0200
commit: 8e5616c2be3f8b9d5ef215ead8c27e94b1083303
author: Libor Zoubek - lzoubek
message: Bug 1074633 - RFE: Manage storage node snapshots

    avoid NPE

    (cherry picked from commit 41802d12c152863f271c0aa8ad00902d62c3bbf9)
    Signed-off-by: Libor Zoubek <lzoubek>

branch: release/jon3.3.x
link: https://github.com/rhq-project/rhq/commit/26d8e06ce
time: 2014-09-26 09:57:38 +0200
commit: 26d8e06ced4836f70eb87c3bf385dfc066a0a0a4
author: Libor Zoubek - lzoubek
message: Bug 1074633 - RFE: Manage storage node snapshots

    Fix NPE that occurred in org.rhq.enterprise.server.discovery.DiscoveryBossBeanTest.testAutoImportStorageNode

    (cherry picked from commit aa02fec5a20eb492bb911f5117f5eb1b020582d2)
    Signed-off-by: Libor Zoubek <lzoubek>
Moving to ON_QA as available for test with build: https://brewweb.devel.redhat.com/buildinfo?buildID=388959
moving to ER05 to verify with #1141885
Fixed tests

branch: master
link: https://github.com/rhq-project/rhq/commit/56c84f19c
time: 2014-10-14 17:18:48 +0200
commit: 56c84f19c4e4a39de0d0aea67f4ccdcf905c0b28
author: Libor Zoubek - lzoubek
message: Bug 1074633 - RFE: Manage storage node snapshots

    Fix StorageNodeComponentItest - basically our tests were running too fast. Failures happened because the lastModified file attribute is stored in seconds, so our tests tended to generate several snapshots within the same second, and that led to incorrect behavior of the tested "takeSnapshot" operation. Some console logging has also been added.

branch: release/jon3.3.x
link: https://github.com/rhq-project/rhq/commit/084594a8f
time: 2014-10-14 17:19:47 +0200
commit: 084594a8fdd9de051a0dedb21889077cc9990e02
author: Libor Zoubek - lzoubek
message: Bug 1074633 - RFE: Manage storage node snapshots

    Fix StorageNodeComponentItest - basically our tests were running too fast. Failures happened because the lastModified file attribute is stored in seconds, so our tests tended to generate several snapshots within the same second, and that led to incorrect behavior of the tested "takeSnapshot" operation. Some console logging has also been added.

    (cherry picked from commit 56c84f19c4e4a39de0d0aea67f4ccdcf905c0b28)
    Signed-off-by: Libor Zoubek <lzoubek>
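The timestamp problem mentioned in the commit message can be reproduced in isolation. The Python sketch below is only an illustration under assumed names (`oldest_first_truncated`, `oldest_first_exact`, and the `snap-N` labels are hypothetical); it does not show how the RHQ test was actually fixed. It shows that once modification times are truncated to whole seconds, snapshots created within the same second can no longer be ordered by age, so the result depends on the directory listing order.

```python
from typing import List, Tuple

Entry = Tuple[str, float]  # (snapshot name, modification time in seconds)


def oldest_first_truncated(entries: List[Entry]) -> List[str]:
    """Order by modification time truncated to whole seconds (ties keep input order)."""
    return [name for name, _ in sorted(entries, key=lambda e: int(e[1]))]


def oldest_first_exact(entries: List[Entry]) -> List[str]:
    """Order by the full-resolution modification time."""
    return [name for name, _ in sorted(entries, key=lambda e: e[1])]


# Listing order as a directory scan might return it; snap-1 is actually the oldest.
listing = [("snap-2", 1000.80), ("snap-1", 1000.20), ("snap-3", 1001.10)]
```

With the sample `listing`, the truncated ordering puts `snap-2` ahead of the older `snap-1`, so a "keep the N latest" policy could remove the wrong snapshot.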
All related bugs verified; the RFE is qualified. Thank you.