Bug 1387274 - Container mem usage exceeds limit
Summary: Container mem usage exceeds limit
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Management Console
Version: 3.4.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Samuel Padgett
QA Contact: Yadan Pei
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-10-20 14:03 UTC by Viet Nguyen
Modified: 2017-03-08 18:43 UTC
CC List: 10 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
When a pod had more than one container, the web console was incorrectly showing total memory and CPU usage for all containers in the pod on the metrics page rather than only the selected container. This could make it appear that memory usage exceeded the limit set for the container. The web console now correctly shows the memory and CPU usage only for the selected container.
Clone Of:
Environment:
Last Closed: 2017-01-18 12:43:47 UTC
Target Upstream Version:
Embargoed:


Attachments
container memory usage graph (102.71 KB, image/png)
2016-10-20 14:03 UTC, Viet Nguyen
free cmd inside container (43.14 KB, image/png)
2016-10-20 14:04 UTC, Viet Nguyen
test template (4.19 KB, text/plain)
2016-10-20 14:05 UTC, Viet Nguyen
container1-memory.png (66.24 KB, image/png)
2016-11-21 09:35 UTC, Yanping Zhang
container2-memory.png (46.17 KB, image/png)
2016-11-21 09:37 UTC, Yanping Zhang


Links
Red Hat Product Errata RHBA-2017:0066 (normal, SHIPPED_LIVE): Red Hat OpenShift Container Platform 3.4 RPM Release Advisory, last updated 2017-01-18 17:23:26 UTC

Description Viet Nguyen 2016-10-20 14:03:21 UTC
Created attachment 1212537 [details]
container memory usage graph

Description of problem:

Containers are given a 2Gi memory limit, but metrics show usage exceeding 2Gi.
See the attached template file.

Version-Release number of selected component (if applicable):
- Hack day env (Oct 19, 2016)

How reproducible:


Steps to Reproduce:
1.  Upload the attached template to your project
2.  oc new-app --template=hawkular-full
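
(A minimal verification sketch, not part of the original steps; the pod name is a placeholder. It confirms that each container in the pod carries its own 2Gi memory limit.)

oc get pods
oc describe pod <hawkular-full-pod> | grep -A 3 'Limits'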

Actual results:
Metric graph shows mem usage > resource limit

Expected results:
OpenShift should restart containers once memory consumption exceeds the resource limit

Additional info:

Comment 1 Viet Nguyen 2016-10-20 14:04:31 UTC
Created attachment 1212538 [details]
free cmd inside container

Comment 2 Viet Nguyen 2016-10-20 14:05:15 UTC
Created attachment 1212539 [details]
test template

Comment 3 Dan McPherson 2016-10-20 14:19:28 UTC
Note that free doesn't report the value from inside the container; it's reporting node-level metrics.
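
(A sketch for illustration, not from the original comment; run inside the container to compare the two sources.)

# free reads the host's /proc/meminfo, so it shows node-level totals
free -b
# the container's own limit and usage come from its memory cgroup
cat /sys/fs/cgroup/memory/memory.limit_in_bytes
cat /sys/fs/cgroup/memory/memory.usage_in_bytes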

Comment 4 Derek Carr 2016-10-20 15:52:41 UTC
Seth - can you try to reproduce using the specified template to see whether the pod that is actually produced looks as expected?

Comment 5 Seth Jennings 2016-11-01 21:43:31 UTC
I was able to reproduce.

# docker ps
CONTAINER ID        IMAGE                                               COMMAND                  CREATED             STATUS              PORTS               NAMES
901ee01d8893        docker.io/hawkularqe/cassandra:latest               "/docker-start.sh"       5 minutes ago       Up 5 minutes                            k8s_hawkular-full-cnode.61f275f8_hawkular-full-1-8k5rs_demo_9338c2d7-a079-11e6-a193-fa163e4dab0a_4466a9bb
45567860dd7b        docker.io/hawkularqe/hawkular-services:latest       "/bin/bash -c ${JBOSS"   5 minutes ago       Up 5 minutes                            k8s_hawkular-full.bd21ab94_hawkular-full-1-8k5rs_demo_9338c2d7-a079-11e6-a193-fa163e4dab0a_484ffbd0

There are two containers in the pod:

-bash-4.2# cd /sys/fs/cgroup/memory/system.slice/docker-901ee01d8893d24f15f674f9a086514a7ba5ab887c1d5d2a628d20c3549ee17c.scope/
-bash-4.2# cat memory.limit_in_bytes 
2147483648
-bash-4.2# cat memory.usage_in_bytes 
1473753088

-bash-4.2# cd /sys/fs/cgroup/memory/system.slice/docker-45567860dd7b191f1f15b19f284d3403c46a1658348d9d370c5639a6e843354a.scope/
-bash-4.2# cat memory.limit_in_bytes 
2147483648
-bash-4.2# cat memory.usage_in_bytes 
1669681152

This is an issue with the web console.

Requests are made at the container level; however, the web console is using the container-level limit with the pod-level usage. It should use the container-level usage of the container selected in the dropdown.
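
(A rough illustration using the numbers above, assuming the console was summing usage across the pod's containers: the pod-level total exceeds a single container's 2Gi limit, which matches the graph in the report.)

# pod-level usage = sum of both containers' usage_in_bytes
echo $(( 1473753088 + 1669681152 ))   # 3143434240 bytes (~2.93Gi)
# per-container limit
echo 2147483648                       # 2Gi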

Comment 6 Derek Carr 2016-11-01 22:53:05 UTC
Moving this to the console team.

Comment 7 Samuel Padgett 2016-11-02 13:22:18 UTC
(In reply to Seth Jennings from comment #5)

> requests are made at the container-level, however the web console is using
> the container-level limit with the pod-level usage.  It should use the
> container-level usage of the container selected in the dropdown.

Seth, Derek, we request container-level metrics for memory usage from Hawkular. We should not be displaying pod-level metrics, which is why we have the container dropdown. Do you see different values if you switch the selected container in the dropdown?

Comment 8 Samuel Padgett 2016-11-02 13:31:05 UTC
I believe I see the problem. This does look like a web console bug.

Comment 10 Jessica Forrester 2016-11-17 19:16:12 UTC
Moving this to OCP since it is not a bug specific to Online.

Comment 11 Yanping Zhang 2016-11-21 09:26:06 UTC
Reproduced the issue on OCP 3.3. Tested on OCP 3.4 with version v3.4.0.28+dfe3a66.
Created an app whose pod has two containers, set a memory limit on each container, and checked the metrics for each container; when a container's memory limit is reached, the graph shows no available memory for that container. Screenshots attached.

The bug has been fixed, so moving it to verified.

Comment 12 Yanping Zhang 2016-11-21 09:35:31 UTC
Created attachment 1222344 [details]
container1-memory.png

Comment 13 Yanping Zhang 2016-11-21 09:37:18 UTC
Created attachment 1222346 [details]
container2-memory.png

Comment 15 errata-xmlrpc 2017-01-18 12:43:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:0066

