Bug 1369022
| Field | Value | Field | Value |
|---|---|---|---|
| Summary: | Negative CPU requests for a pod | | |
| Product: | OpenShift Container Platform | Reporter: | Alexander Koksharov <akokshar> |
| Component: | Node | Assignee: | Derek Carr <decarr> |
| Status: | CLOSED DUPLICATE | QA Contact: | DeShuai Ma <dma> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 3.2.0 | CC: | agoldste, akokshar, aos-bugs, eparis, hannsj_uhl, jokerman, mmccomas, yanpzhan |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| Clones: | 1369160 | Environment: | |
| Last Closed: | 2016-10-25 18:34:42 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1336863 | | |
| Bug Blocks: | 1369160 | | |
| Attachments: | 1192900 (QuoraError2), 1192901 (QuotaError) | | |
**Description** (Alexander Koksharov, 2016-08-22 10:51:00 UTC)

Created attachment 1192900: QuoraError2

Created attachment 1192901: QuotaError
Last check done: the web console displays "CPU -344 Available of 800 millicores", whereas a direct check through the API shows:

```
[root@i89540 ~]# curl http://localhost:8001/api/v1/namespaces/openshift-infra/services/https:heapster:/proxy/api/v1/model/namespaces/redko-dev/pods/cassandra-15-btpca/metrics/cpu-usage
{
  "metrics": [
    { "timestamp": "2016-08-22T02:42:00-04:00", "value": 1065 },
    { "timestamp": "2016-08-22T02:42:10-04:00", "value": 1064 },
    { "timestamp": "2016-08-22T02:42:30-04:00", "value": 0 },
    { "timestamp": "2016-08-22T02:43:00-04:00", "value": 0 },
    { "timestamp": "2016-08-22T02:43:30-04:00", "value": 0 },
    { "timestamp": "2016-08-22T02:43:40-04:00", "value": 1053 },
    { "timestamp": "2016-08-22T02:44:00-04:00", "value": 0 },
    { "timestamp": "2016-08-22T02:44:10-04:00", "value": 1063 },
    { "timestamp": "2016-08-22T02:44:30-04:00", "value": 0 },
    { "timestamp": "2016-08-22T02:45:00-04:00", "value": 0 },
    { "timestamp": "2016-08-22T02:45:10-04:00", "value": 0 },
    { "timestamp": "2016-08-22T02:45:40-04:00", "value": 1063 },
    { "timestamp": "2016-08-22T02:45:50-04:00", "value": 1064 },
    { "timestamp": "2016-08-22T02:46:00-04:00", "value": 1058 },
    { "timestamp": "2016-08-22T02:46:30-04:00", "value": 0 },
    { "timestamp": "2016-08-22T02:46:40-04:00", "value": 0 },
    { "timestamp": "2016-08-22T02:46:50-04:00", "value": 1102 },
    { "timestamp": "2016-08-22T02:47:20-04:00", "value": 1065 },
    { "timestamp": "2016-08-22T02:47:30-04:00", "value": 1061 },
    { "timestamp": "2016-08-22T02:48:00-04:00", "value": 1060 },
    { "timestamp": "2016-08-22T02:48:10-04:00", "value": 1063 },
    { "timestamp": "2016-08-22T02:48:40-04:00", "value": 1082 },
    { "timestamp": "2016-08-22T02:49:00-04:00", "value": 0 },
    { "timestamp": "2016-08-22T02:49:10-04:00", "value": 1065 },
    { "timestamp": "2016-08-22T02:49:40-04:00", "value": 1058 },
    { "timestamp": "2016-08-22T02:49:50-04:00", "value": 1063 },
    { "timestamp": "2016-08-22T02:50:00-04:00", "value": 1062 },
    { "timestamp": "2016-08-22T02:50:10-04:00", "value": 1061 },
    { "timestamp": "2016-08-22T02:50:20-04:00", "value": 1048 },
    { "timestamp": "2016-08-22T02:50:40-04:00", "value": 0 },
    { "timestamp": "2016-08-22T02:51:10-04:00", "value": 1062 },
    { "timestamp": "2016-08-22T02:52:00-04:00", "value": 1053 },
    { "timestamp": "2016-08-22T02:52:10-04:00", "value": 1071 },
    { "timestamp": "2016-08-22T02:52:30-04:00", "value": 0 },
    { "timestamp": "2016-08-22T02:52:40-04:00", "value": 1065 },
    { "timestamp": "2016-08-22T02:53:10-04:00", "value": 1062 },
    { "timestamp": "2016-08-22T02:53:40-04:00", "value": 1053 },
    { "timestamp": "2016-08-22T02:54:10-04:00", "value": 1068 },
    { "timestamp": "2016-08-22T02:54:30-04:00", "value": 0 },
    { "timestamp": "2016-08-22T02:54:40-04:00", "value": 1059 },
    { "timestamp": "2016-08-22T02:55:10-04:00", "value": 1068 },
    { "timestamp": "2016-08-22T02:55:40-04:00", "value": 1072 },
    { "timestamp": "2016-08-22T02:55:50-04:00", "value": 0 },
    { "timestamp": "2016-08-22T02:56:30-04:00", "value": 1062 },
    { "timestamp": "2016-08-22T02:56:40-04:00", "value": 0 }
  ],
  "latestTimestamp": "2016-08-22T02:56:40-04:00"
}
```

We have spawned a separate bug to stop showing the negative values in the console: https://bugzilla.redhat.com/show_bug.cgi?id=1369160. Transferring this bug to the cluster infra team in case there needs to be investigation into why it's going over the limit.

I want to determine that CPU cgroup limits were properly set for the pod. To verify this, can we see the following output for the pod that demonstrated this behavior, where `<pod-name>` is that pod?

1. Pod YAML:

```
$ oc get pod <pod-name> -o yaml
```

2. Pod cgroup settings on the node:

```
$ oc exec <pod-name> -- cat /proc/self/cgroup
$ oc exec <pod-name> -- cat /sys/fs/cgroup/cpu/cpu.shares
$ oc exec <pod-name> -- cat /sys/fs/cgroup/cpu/cpu.cfs_quota_us
$ oc exec <pod-name> -- cat /sys/fs/cgroup/cpu/cpu.cfs_period_us
```

Here's the cgroup info. Waiting on pod yaml.
```
[root@i89540 ~]# oc exec cassandra-5-5783w -- cat /proc/self/cgroup
10:cpuset:/system.slice/docker-4404981688d3ca233cb690c5fcb7366dfb822f84b381c6646512a6d7afeb6139.scope
9:devices:/system.slice/docker-4404981688d3ca233cb690c5fcb7366dfb822f84b381c6646512a6d7afeb6139.scope
8:blkio:/system.slice/docker-4404981688d3ca233cb690c5fcb7366dfb822f84b381c6646512a6d7afeb6139.scope
7:net_cls:/system.slice/docker-4404981688d3ca233cb690c5fcb7366dfb822f84b381c6646512a6d7afeb6139.scope
6:perf_event:/system.slice/docker-4404981688d3ca233cb690c5fcb7366dfb822f84b381c6646512a6d7afeb6139.scope
5:freezer:/system.slice/docker-4404981688d3ca233cb690c5fcb7366dfb822f84b381c6646512a6d7afeb6139.scope
4:memory:/system.slice/docker-4404981688d3ca233cb690c5fcb7366dfb822f84b381c6646512a6d7afeb6139.scope
3:cpuacct,cpu:/system.slice/docker-4404981688d3ca233cb690c5fcb7366dfb822f84b381c6646512a6d7afeb6139.scope
2:hugetlb:/system.slice/docker-4404981688d3ca233cb690c5fcb7366dfb822f84b381c6646512a6d7afeb6139.scope
1:name=systemd:/system.slice/docker-4404981688d3ca233cb690c5fcb7366dfb822f84b381c6646512a6d7afeb6139.scope
[root@i89540 ~]# oc exec cassandra-5-5783w -- cat /sys/fs/cgroup/cpu/cpu.shares
1433
[root@i89540 ~]# oc exec cassandra-5-5783w -- cat /sys/fs/cgroup/cpu/cpu.cfs_quota_us
-1
[root@i89540 ~]# oc exec cassandra-5-5783w -- cat /sys/fs/cgroup/cpu/cpu.cfs_period_us
100000
```

Marking UpcomingRelease as we're dependent on a kernel z-stream fix that doesn't have an ETA yet.

I am marking this as a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1336863, as it requires no additional code change beyond that bz being released.

*** This bug has been marked as a duplicate of bug 1336863 ***
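For context on the cgroup values above: `cpu.cfs_quota_us = -1` means no CFS hard limit is applied to the container, so usage above the requested CPU is possible, and `cpu.shares = 1433` is consistent with Kubernetes' usual millicores-to-shares conversion (shares = milliCPU × 1024 / 1000, truncated, with a minimum of 2), which would correspond to a 1400-millicore CPU request. A minimal sketch of that conversion (the function name is ours, not from the code under discussion):

```python
# Sketch of the millicores -> cgroup cpu.shares conversion Kubernetes uses
# (assumption: shares = milliCPU * 1024 / 1000, truncated, minimum of 2).
MIN_SHARES = 2          # kernel-enforced minimum for cpu.shares
SHARES_PER_CPU = 1024   # cpu.shares value representing one full CPU
MILLI_CPU_TO_CPU = 1000 # millicores per CPU

def milli_cpu_to_shares(milli_cpu: int) -> int:
    """Convert a CPU request in millicores to a cgroup cpu.shares value."""
    if milli_cpu == 0:
        # No request set: fall back to the minimum so the cgroup stays valid.
        return MIN_SHARES
    shares = (milli_cpu * SHARES_PER_CPU) // MILLI_CPU_TO_CPU
    return max(shares, MIN_SHARES)

print(milli_cpu_to_shares(1400))  # -> 1433, matching the cpu.shares observed above
```

Under this formula the observed shares value reflects only the *request*; without a CFS quota there is nothing stopping the pod from consuming more than that, which matches the metrics samples above exceeding 1000 millicores.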