+++ This bug was initially created as a clone of Bug #1369022 +++

Description of problem:
In the last 12 hours we have detected several pods (running Cassandra) that have eaten all the available CPU resources on the node, even though they had much lower quotas/limits. It seems that Hawkular misinterprets the CPU usage: it told us "-11320 available of 800 millicores" when the actual usage was 12120 millicores. See the screenshot for details.

Version-Release number of selected component (if applicable):
3.2

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:

--- Additional comment from Alexander Koksharov on 2016-08-22 06:52 EDT ---

--- Additional comment from Alexander Koksharov on 2016-08-22 06:53 EDT ---

--- Additional comment from Alexander Koksharov on 2016-08-22 06:56:45 EDT ---

Last check done:

The web console displays: "CPU -344 Available of 800 millicores"

Whereas a direct check through the API returns:

[root@i89540 ~]# curl http://localhost:8001/api/v1/namespaces/openshift-infra/services/https:heapster:/proxy/api/v1/model/namespaces/redko-dev/pods/cassandra-15-btpca/metrics/cpu-usage
{
  "metrics": [
    { "timestamp": "2016-08-22T02:42:00-04:00", "value": 1065 },
    { "timestamp": "2016-08-22T02:42:10-04:00", "value": 1064 },
    { "timestamp": "2016-08-22T02:42:30-04:00", "value": 0 },
    { "timestamp": "2016-08-22T02:43:00-04:00", "value": 0 },
    { "timestamp": "2016-08-22T02:43:30-04:00", "value": 0 },
    { "timestamp": "2016-08-22T02:43:40-04:00", "value": 1053 },
    { "timestamp": "2016-08-22T02:44:00-04:00", "value": 0 },
    { "timestamp": "2016-08-22T02:44:10-04:00", "value": 1063 },
    { "timestamp": "2016-08-22T02:44:30-04:00", "value": 0 },
    { "timestamp": "2016-08-22T02:45:00-04:00", "value": 0 },
    { "timestamp": "2016-08-22T02:45:10-04:00", "value": 0 },
    { "timestamp": "2016-08-22T02:45:40-04:00", "value": 1063 },
    { "timestamp": "2016-08-22T02:45:50-04:00", "value": 1064 },
    { "timestamp": "2016-08-22T02:46:00-04:00", "value": 1058 },
    { "timestamp": "2016-08-22T02:46:30-04:00", "value": 0 },
    { "timestamp": "2016-08-22T02:46:40-04:00", "value": 0 },
    { "timestamp": "2016-08-22T02:46:50-04:00", "value": 1102 },
    { "timestamp": "2016-08-22T02:47:20-04:00", "value": 1065 },
    { "timestamp": "2016-08-22T02:47:30-04:00", "value": 1061 },
    { "timestamp": "2016-08-22T02:48:00-04:00", "value": 1060 },
    { "timestamp": "2016-08-22T02:48:10-04:00", "value": 1063 },
    { "timestamp": "2016-08-22T02:48:40-04:00", "value": 1082 },
    { "timestamp": "2016-08-22T02:49:00-04:00", "value": 0 },
    { "timestamp": "2016-08-22T02:49:10-04:00", "value": 1065 },
    { "timestamp": "2016-08-22T02:49:40-04:00", "value": 1058 },
    { "timestamp": "2016-08-22T02:49:50-04:00", "value": 1063 },
    { "timestamp": "2016-08-22T02:50:00-04:00", "value": 1062 },
    { "timestamp": "2016-08-22T02:50:10-04:00", "value": 1061 },
    { "timestamp": "2016-08-22T02:50:20-04:00", "value": 1048 },
    { "timestamp": "2016-08-22T02:50:40-04:00", "value": 0 },
    { "timestamp": "2016-08-22T02:51:10-04:00", "value": 1062 },
    { "timestamp": "2016-08-22T02:52:00-04:00", "value": 1053 },
    { "timestamp": "2016-08-22T02:52:10-04:00", "value": 1071 },
    { "timestamp": "2016-08-22T02:52:30-04:00", "value": 0 },
    { "timestamp": "2016-08-22T02:52:40-04:00", "value": 1065 },
    { "timestamp": "2016-08-22T02:53:10-04:00", "value": 1062 },
    { "timestamp": "2016-08-22T02:53:40-04:00", "value": 1053 },
    { "timestamp": "2016-08-22T02:54:10-04:00", "value": 1068 },
    { "timestamp": "2016-08-22T02:54:30-04:00", "value": 0 },
    { "timestamp": "2016-08-22T02:54:40-04:00", "value": 1059 },
    { "timestamp": "2016-08-22T02:55:10-04:00", "value": 1068 },
    { "timestamp": "2016-08-22T02:55:40-04:00", "value": 1072 },
    { "timestamp": "2016-08-22T02:55:50-04:00", "value": 0 },
    { "timestamp": "2016-08-22T02:56:30-04:00", "value": 1062 },
    { "timestamp": "2016-08-22T02:56:40-04:00", "value": 0 }
  ],
  "latestTimestamp": "2016-08-22T02:56:40-04:00"
}
Will fix the console so it does not show negative values and instead says they are over the limit by X amount.
The following PR updates the message to say "Over limit" rather than showing a negative value. It will also change the donut from blue to orange when the limit is reached. We're still investigating the underlying cause under the original bug. https://github.com/openshift/origin-web-console/pull/440
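For context, here is a rough TypeScript sketch of the display behavior the PR describes (hypothetical names, not the actual code from the PR above): when usage reaches the limit, the label switches from the remaining amount to an over-limit message and the donut switches from the normal blue to a warning color.

// Rough sketch of the over-limit display logic (illustrative only,
// not the actual origin-web-console implementation).

interface UsageDonut {
  label: string; // text shown beside the donut
  color: string; // donut color
}

function cpuDonut(usedMillicores: number, limitMillicores: number): UsageDonut {
  if (usedMillicores >= limitMillicores) {
    // Never show a negative "available" value; flag the overage instead.
    const over = usedMillicores - limitMillicores;
    return { label: `Over limit by ${over} millicores`, color: "orange" };
  }
  const available = limitMillicores - usedMillicores;
  return { label: `${available} Available of ${limitMillicores} millicores`, color: "blue" };
}

// With the values from this bug:
console.log(cpuDonut(1144, 800)); // { label: "Over limit by 344 millicores", color: "orange" }
console.log(cpuDonut(400, 800));  // { label: "400 Available of 800 millicores", color: "blue" }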
Merging to origin in https://github.com/openshift/origin/pull/10580
# openshift version
openshift v3.3.0.25+d2ac65e-dirty
kubernetes v1.3.0+507d3a7
etcd 2.3.0+git

Tested on OCP v3.3.0.25. When usage exceeded the quota limit, the color of the donut changed to yellow and "Over limit" was shown on the left; see the attachment. The issue in the web console has been fixed as designed, so moving the bug to Verified.
Created attachment 1193889 [details] webquota.png
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2016:1933