Bug 1422049 - EmptyDir could lead to memory exhaustion
Summary: EmptyDir could lead to memory exhaustion
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 3.3.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 3.7.0
Assignee: Seth Jennings
QA Contact: Qixuan Wang
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-02-14 11:24 UTC by Sergi Jimenez Romero
Modified: 2020-09-20 12:52 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: Because of a design limitation, previous versions did not count memory-backed volumes against the pod's cumulative memory limit. Consequence: A user could exhaust memory on the node by creating a large file in a memory-backed volume, regardless of the memory limit. Fix: Pod-level cgroups were added to, among other things, enforce limits on memory-backed volumes. Result: Memory-backed volume sizes are now bounded by cumulative pod memory limits.
Clone Of:
Environment:
Last Closed: 2017-11-28 21:52:23 UTC
Target Upstream Version:


Links
System ID Priority Status Summary Last Updated
Red Hat Bugzilla 1442143 medium CLOSED [RFE] EmptyDir setting per volume memory limits 2020-10-14 00:28:05 UTC
Red Hat Product Errata RHSA-2017:3188 normal SHIPPED_LIVE Moderate: Red Hat OpenShift Container Platform 3.7 security, bug, and enhancement update 2017-11-29 02:34:54 UTC

Internal Links: 1442143

Description Sergi Jimenez Romero 2017-02-14 11:24:49 UTC
Description of problem:

OpenShift allows users to create in-memory EmptyDir volumes for their pods (by setting the option "medium: Memory"), which translates into a tmpfs file system mounted inside the container. However, the API provides no way to limit the size of these tmpfs file systems, which defaults to half of the node's RAM (the usual Linux default). This could lead to memory exhaustion on the node where the pods are running.
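
The default cap mentioned above can be estimated directly on a node. This is a minimal sketch, assuming a Linux host with /proc/meminfo available; MemTotal is slightly below the installed RAM, so the result is an approximation rather than the exact kernel value.

```shell
#!/bin/sh
# Estimate the default tmpfs size cap on a Linux node: about half of RAM.
# /proc/meminfo reports MemTotal in kB (KiB).
mem_total_kib=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
tmpfs_default_kib=$((mem_total_kib / 2))
echo "MemTotal: ${mem_total_kib} KiB, approximate default tmpfs cap: ${tmpfs_default_kib} KiB"
```

Any memory-backed EmptyDir created without an explicit size can grow up to roughly this value, independent of the pod's memory limit on affected versions.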



Version-Release number of selected component (if applicable):

3.3.0

How reproducible:
Always

Steps to Reproduce:
1. Start pod with EmptyDir (medium: memory) and set a memory limit.
2. Rsh to the pod and use dd to create a file on the EmptyDir bigger than the pod's memory limit.
3. The pod will be restarted.
4. Repeat 1-2.
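
Steps 1 and 2 above might be scripted roughly as follows. This is only a sketch: the manifest file name, the pod name, and the /mnt mount path are assumptions for illustration and are not taken from this report, and count=2000 assumes a memory limit below 2000 MiB.

```shell
#!/bin/sh
# Sketch of the reproduction steps. Assumed (hypothetical) names: the manifest
# file, the pod name, and /mnt as the mount path of the in-memory emptyDir.
reproduce_emptydir_fill() {
  manifest="${1:-emptydir-pod.yaml}"   # hypothetical manifest: medium: Memory plus a memory limit
  pod="${2:-emptydir-test}"            # hypothetical pod name defined in that manifest

  # Step 1: start the pod.
  oc create -f "$manifest" || return 1

  # Wait (up to about a minute) for the pod to reach the Running phase.
  tries=0
  while [ "$(oc get pod "$pod" -o jsonpath='{.status.phase}')" != "Running" ]; do
    tries=$((tries + 1))
    [ "$tries" -ge 30 ] && return 1
    sleep 2
  done

  # Step 2: write a file larger than the pod's memory limit into the emptyDir.
  oc rsh "$pod" dd if=/dev/zero of=/mnt/zero bs=1M count=2000
}
```

Calling reproduce_emptydir_fill with no arguments uses the assumed defaults; pass a manifest path and pod name to match a real environment.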

Actual results:
The EmptyDir keeps all the files, which, based on kernel documentation [1][2], could potentially lead to memory exhaustion on the host.

Expected results:

- It should be possible to set a fixed size for an in-memory EmptyDir.
- It should be possible to limit the size of all the in-memory EmptyDir volumes defined in each user pod via limits (as we currently can limit RAM and CPU).
- It should be possible to limit how many GB a user can allocate for in-memory EmptyDir volumes via the quota.

Additional info:

Comment 5 Derek Carr 2017-02-14 22:06:22 UTC
To address this issue, we need the pod-level cgroup hierarchy planned for Kubernetes 1.6.

Comment 11 Seth Jennings 2017-05-05 17:14:24 UTC
I just verified that having pod cgroups enabled on the node in 3.6 (it is enabled by default) enforces the memory limit with respect to memory-backed emptyDirs.

[root@test ~]# cat busybox.yaml 
apiVersion: v1
kind: Pod
metadata:
  name: busybox
spec:
  containers:
  - name: busybox
    image: busybox
    resources:
      limits:
        memory: 1Gi
        cpu: 1
    command:
    - dd
    - if=/dev/zero
    - of=/mnt/zero
    - bs=1M
    - count=2000
    volumeMounts:
    - name: myvol
      mountPath: /mnt
  terminationGracePeriodSeconds: 0
  volumes:
  - name: myvol
    emptyDir:
      medium: Memory
[root@test ~]# oc create -f busybox.yaml 
pod "busybox" created
[root@test ~]# oc describe pod | grep -A 5 "Last State"
    Last State:		Terminated
      Reason:		OOMKilled
      Exit Code:	137
      Started:		Fri, 05 May 2017 17:10:42 +0000
      Finished:		Fri, 05 May 2017 17:10:42 +0000
    Ready:		False
# mount | grep myvol
tmpfs on /var/lib/origin/openshift.local.volumes/pods/c296d664-31b5-11e7-a96c-fa163e71bc65/volumes/kubernetes.io~empty-dir/myvol type tmpfs (rw,relatime,seclabel)
[root@test ~]# cd /var/lib/origin/openshift.local.volumes/pods/c296d664-31b5-11e7-a96c-fa163e71bc65/volumes/kubernetes.io~empty-dir/myvol
[root@test myvol]# ls -alh
total 1023M
drwxrwsrwt. 2 root       1000040000    60 May  5 17:10 .
drwxr-xr-x. 3 root       root          19 May  5 17:10 ..
-rw-r--r--. 1 1000040000 1000040000 1023M May  5 17:11 zero

Even though the pod tries to write a roughly 2Gi file (2000 blocks of 1M), it is OOMKilled when the file reaches about 1Gi in size, i.e. the memory limit set on the container.
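
The numbers in the test above can be checked with a little arithmetic. This sketch only restates the dd parameters and the memory limit from the manifest; the variable names are illustrative.

```shell
#!/bin/sh
# dd in the manifest: bs=1M count=2000, so the attempted write is 2000 MiB.
write_mib=$((1 * 2000))
# Container memory limit in the manifest: 1Gi = 1024 MiB.
limit_mib=$((1 * 1024))
# Amount by which the attempted write exceeds the limit.
excess_mib=$((write_mib - limit_mib))
echo "dd attempts ${write_mib} MiB; limit ${limit_mib} MiB; excess ${excess_mib} MiB"
```

The 1023M file left in the volume is consistent with the write being stopped just short of the 1024 MiB limit.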

Comment 12 Seth Jennings 2017-05-08 13:41:09 UTC
Upstream PR:
https://github.com/kubernetes/kubernetes/pull/41349

Comment 13 Seth Jennings 2017-05-19 18:48:00 UTC
Included in Origin 1.6.1 rebase:
https://github.com/openshift/origin/pull/13653

Comment 14 Qixuan Wang 2017-05-24 06:56:07 UTC
Tested on OCP3.6 (openshift v3.6.79, kubernetes v1.6.1+5115d708d7, etcd 3.1.0)

EmptyDir won't exhaust memory. Moving the bug to VERIFIED, thanks.

Comment 21 errata-xmlrpc 2017-11-28 21:52:23 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:3188

Comment 22 fanlong 2018-12-12 19:24:30 UTC
You can use sizeLimit; I have already verified it.
After entering the container, df -h shows the emptyDir as 128G, but you can only use space up to the sizeLimit.


apiVersion: v1
kind: Pod
metadata:
  name: busybox
spec:
  containers:
  - name: busybox
    image: gcr.io/google_containers/busybox:1.24
    imagePullPolicy: IfNotPresent
    resources:
      limits:
        memory: 1Gi
        cpu: 1
    command: ['sh', '-c', 'echo Hello Kubernetes!>/test-pd/mfltest.txt && sleep 3600' ]
    ports:
    - containerPort: 80
    volumeMounts:
    - mountPath: /test-pd  
      name: test-volume
  volumes:
  - name: test-volume
    emptyDir:
      medium: Memory
      sizeLimit: "1M" 


After entering the container, you can verify by running: dd if=/dev/zero of=/test-pd/zero bs=1M count=10
The container exits.

If you have further questions, please let me know: fanlong_meng@msn.com

