| Summary: | Metrics not working with 3.1 metrics images | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Jaspreet Kaur <jkaur> |
| Component: | Hawkular | Assignee: | Matt Wringe <mwringe> |
| Status: | CLOSED CANTFIX | QA Contact: | Peng Li <penli> |
| Severity: | low | Docs Contact: | |
| Priority: | low | | |
| Version: | 3.1.0 | CC: | aos-bugs, jcantril, jkaur, mwringe, pweil |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2017-02-08 20:17:03 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Attachments: | | | |
|
Description
Jaspreet Kaur
2016-11-16 12:13:42 UTC
Created attachment 1221133
Cassandra logs
The stats_resolution value dictates how often metrics are gathered. In 3.1 the default is 10s. Setting stats_resolution to 5s causes metrics to be gathered more often, which increases load on the system. Setting stats_resolution above roughly 28-29s causes problems at the 30 minute interval in the console, since it expects data to be collected more often. This shows up as empty segments in the graph at the 30 minute view, but not at the 1 hour view or greater.
If you want to use a hostPath setup, the steps to follow are below.
Deploy metrics, but with 'USE_PERSISTENT_STORAGE=false', since we don't want the deployer to create and wait for a PVC.
# scale down the metric components:
oc scale rc heapster --replicas=0; oc scale rc hawkular-metrics --replicas=0;oc scale rc hawkular-cassandra-1 --replicas=0
# grant permissions to the 'cassandra' service account to allow it to use a hostPath:
oadm policy add-scc-to-user privileged system:serviceaccount:openshift-infra:cassandra
# update the hawkular-cassandra-1 template to specify that it needs a privileged permission since it wants to use a hostPath:
oc patch rc hawkular-cassandra-1 -p '{"spec":{"template":{"spec":{"containers":[{"name":"hawkular-cassandra-1","securityContext":{"privileged": true}}]}}}}'
# change the volume from emptyDir to a hostPath (where $CASSANDRA_DATA_DIRECTORY is the directory on the node where you want to store metrics):
oc set volumes rc hawkular-cassandra-1 --add --overwrite --name=cassandra-data --type=hostPath --path="$CASSANDRA_DATA_DIRECTORY"
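# specify that the hawkular-cassandra-1 pod can only be deployed to a specific node (where $NODE_LABEL is the node label's key and $NODE_KEY is its value; the patch is double-quoted so that the shell expands both variables):
oc patch rc hawkular-cassandra-1 -p "{\"spec\":{\"template\":{\"spec\":{\"nodeSelector\":{\"${NODE_LABEL}\":\"${NODE_KEY}\"}}}}}"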
# bring back up the metric components:
oc scale rc heapster --replicas=1; oc scale rc hawkular-metrics --replicas=1;oc scale rc hawkular-cassandra-1 --replicas=1
Note: we will probably make this a more official option in a future release of OpenShift Metrics so that you only need to specify a few parameters to the deployer pod and it will take care of the rest.
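The node-selector step above only works if the target node actually carries the label, and if the label key and value make it into the JSON patch. A minimal sketch of both, assuming a POSIX shell: the function names `node_selector_patch` and `pin_cassandra_to_node`, and the example node name and label, are hypothetical and not part of the metrics deployer.

```shell
#!/bin/sh
# Hypothetical helper: print the nodeSelector patch for a label key ($1)
# and label value ($2). printf substitutes the values, so no shell
# quoting tricks are needed inside the JSON.
node_selector_patch() {
  printf '{"spec":{"template":{"spec":{"nodeSelector":{"%s":"%s"}}}}}' "$1" "$2"
}

# Hypothetical wrapper: label the node ($1) with key=value ($2, $3), then
# restrict the hawkular-cassandra-1 rc to nodes carrying that label.
pin_cassandra_to_node() {
  oc label node "$1" "$2=$3" --overwrite
  oc patch rc hawkular-cassandra-1 -p "$(node_selector_patch "$2" "$3")"
}

# Example call (node name and label are placeholders):
# pin_cassandra_to_node node1.example.com metrics cassandra
```

Once the components are scaled back up, `oc get pods -o wide` should show which node the Cassandra pod landed on.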
I am closing this issue as 'CANTFIX' because it is caused by the speed of the writes to the network attached storage. There are other types of network attached storage that the user can use, or they can use host volumes, which will not have this issue. |