| Summary: | [userinterface_public_602]Metrics charts for pod are truncated intermittently | | |
|---|---|---|---|
| Product: | OKD | Reporter: | Yanping Zhang <yanpzhan> |
| Component: | Management Console | Assignee: | Fabiano Franz <ffranz> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Yadan Pei <yapei> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 3.x | CC: | aos-bugs, ffranz, jforrest, mmccomas, spadgett |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2016-09-19 13:52:32 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Attachments: | | | |
Description
Yanping Zhang
2016-04-12 09:39:56 UTC
Created attachment 1146296: metrics-on-chrome
Created attachment 1146299: metrics-on-safari
I don't think it's a Safari vs Chrome problem. There's simply not enough data from Hawkular to fill in this area of the chart. This seems to happen if the system is under too much load.
https://github.com/openshift/origin/issues/7679#issuecomment-190369428

Jessica, would you object to making the smallest time range one hour? It'd greatly reduce the chances of charts like this. It'd also allow us to use point.max - point.min for calculated usage rates for cumulative metrics, which would be much cleaner.

Hmm, I suppose so for now, although I'd like to see us be able to have smaller time ranges in the future. I assume that depends on metrics being sampled more often?

The sampling is going to be increased to every 10s from every 30s, but if this is a problem when the system is under load, it might not matter.

Adding Matt. Matt, I believe this is the same underlying issue as origin #7679.

The sampling update is done. Sam, should this be transferred to Matt for the load issue?

Yanping, can you confirm you had the latest metrics template when you tested? The frequency of sampling was increased to every 10s. Try running
$ oc get rc/heapster -n openshift-infra -o yaml
and look for "--metric_resolution=10s" under the container command. If it is not there, we should retry with the latest templates to check whether you still see the issue.

Marking ON_QA. I haven't seen issues with the new metric resolution value. Yanping, please make sure you test with the latest metrics template.

On devenv-rhel7_4354, tested with the latest metrics template and images:
openshift/origin-metrics-cassandra latest 2aa439f8e002 3 hours ago 663.8 MB
openshift/origin-metrics-hawkular-metrics latest 5549efe10a06 3 hours ago 770.7 MB
openshift/origin-metrics-heapster latest 4fcf7f02cb2a 3 hours ago 753.2 MB
openshift/origin-metrics-deployer latest b7215d58ab95 3 hours ago 704.8 MB
$ oc get rc/heapster -n openshift-infra -o yaml
"--metric_resolution=10s" is under the container command.
Checked the metrics in the web console: the smallest Time Range is now "Last hour" and the charts are no longer truncated. The issue appears to be fixed, so moving the bug to Verified.
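
For reference, the point.max - point.min idea mentioned earlier in this thread can be sketched roughly as follows. This is only an illustration, not the console's actual code: the bucket shape (start and end in milliseconds, min, max, and an empty flag) and the function name bucket_rate are assumptions made for the example, and the numbers are made up. A bucket with no samples yields None, which is the kind of gap that shows up as a truncated chart.

```python
from typing import Optional


def bucket_rate(bucket: dict) -> Optional[float]:
    """Approximate per-second rate of a cumulative metric within one bucket.

    Assumed (hypothetical) bucket shape:
    {"start": ms, "end": ms, "min": value, "max": value, "empty": bool}
    """
    if bucket.get("empty") or bucket.get("min") is None or bucket.get("max") is None:
        # No samples arrived for this bucket, so there is nothing to plot.
        return None
    duration_s = (bucket["end"] - bucket["start"]) / 1000.0
    if duration_s <= 0:
        return None
    # A cumulative counter only grows, so max - min is the increase
    # observed between the first and last samples inside the bucket.
    return (bucket["max"] - bucket["min"]) / duration_s


# Made-up sample data: three one-minute buckets, the middle one empty.
buckets = [
    {"start": 0, "end": 60_000, "min": 1_000.0, "max": 1_600.0, "empty": False},
    {"start": 60_000, "end": 120_000, "empty": True},
    {"start": 120_000, "end": 180_000, "min": 2_200.0, "max": 2_500.0, "empty": False},
]
rates = [bucket_rate(b) for b in buckets]
print(rates)  # [10.0, None, 5.0]
```

Note that a bucket holding a single sample has max equal to min and so reports a rate of zero. That is one reason this approach only works when each bucket covers several samples, as with a one hour minimum range and 10s sampling.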