Bug 1567251

Summary: Metrics Cassandra PV running out of space due to snapshots
Product: OpenShift Container Platform Reporter: John Sanda <jsanda>
Component: Hawkular Assignee: John Sanda <jsanda>
Status: CLOSED ERRATA QA Contact: Junqi Zhao <juzhao>
Severity: high Docs Contact:
Priority: unspecified    
Version: 3.9.0 CC: aos-bugs, jsanda, juzhao, pdwyer, rvargasp
Target Milestone: ---   
Target Release: 3.10.0   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Cause: auto_snapshot was set to true in cassandra.yaml, and changes to Hawkular Metrics introduced in OCP 3.7 cause many snapshots to be generated.
Consequence: The snapshots can eventually fill up the disk.
Fix: auto_snapshot is now disabled and snapshotting is configurable. A snapshot is generated only when the installer runs with the openshift_metrics_cassandra_take_snapshot property set to true (case insensitive) in the Ansible inventory file.
Result:
Story Points: ---
Clone Of: 1567250 Environment:
Last Closed: 2018-07-30 19:12:49 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1567222, 1567250    

Comment 1 John Sanda 2018-04-17 14:55:58 UTC
PRs with changes:

https://github.com/openshift/origin-metrics/pull/410
https://github.com/openshift/openshift-ansible/pull/7998

The Cassandra image now sets auto_snapshot to false in cassandra.yaml. The image also has a new environment variable, TAKE_SNAPSHOT, which defaults to false. The environment variable can be controlled by setting the openshift_metrics_cassandra_take_snapshot property in your ansible inventory file. When set to true, a snapshot will be generated during the post startup life cycle hook. The name of the snapshot directory will be a timestamp.
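
The behaviour described above can be sketched roughly as follows. This is only an illustration in Python, not the hook script shipped in the image: the function names, the timestamp format, and the case-insensitive handling of TAKE_SNAPSHOT are assumptions, and the snapshot is assumed to be taken with `nodetool snapshot -t <tag>`.

```python
import os
from datetime import datetime, timezone


def snapshot_enabled(env=None):
    """Return True when TAKE_SNAPSHOT is 'true' in any letter case (assumed parsing)."""
    env = os.environ if env is None else env
    # The variable defaults to false when it is not set.
    return env.get("TAKE_SNAPSHOT", "false").strip().lower() == "true"


def snapshot_command(env=None, now=None):
    """Build a nodetool command for a timestamp-named snapshot, or None if disabled."""
    if not snapshot_enabled(env):
        return None
    now = now or datetime.now(timezone.utc)
    # Hypothetical tag format; the real hook may format the timestamp differently.
    tag = now.strftime("%Y%m%d%H%M%S")
    return ["nodetool", "snapshot", "-t", tag]


if __name__ == "__main__":
    cmd = snapshot_command()
    print("snapshot disabled" if cmd is None else " ".join(cmd))
```

With TAKE_SNAPSHOT unset or false, the sketch produces no command, which matches the new default of taking no snapshot.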

Comment 7 Junqi Zhao 2018-05-17 01:49:23 UTC
The default value of openshift_metrics_cassandra_take_snapshot is false, and the TAKE_SNAPSHOT environment variable is also false. When openshift_metrics_cassandra_take_snapshot is set to true, TAKE_SNAPSHOT is true and a snapshot can be taken.

openshift-ansible version:
openshift-ansible-3.10.0-0.47.0.git.0.c018c8f.el7.noarch

Images:
metrics-heapster/images/v3.10.0-0.47.0.0
metrics-hawkular-metrics/images/v3.10.0-0.47.0.0
metrics-cassandra/images/v3.10.0-0.47.0.0
metrics-schema-installer/images/v3.10.0-0.47.0.0

Comment 9 errata-xmlrpc 2018-07-30 19:12:49 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1816