Bug 1349311 - Using EmptyDir as storage option for openshift pods leads to filling up openshift node storage space
Status: CLOSED ERRATA
Product: OpenShift Container Platform
Classification: Red Hat
Component: Pod
Version: 3.2.1
Hardware: x86_64
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Target Release: ---
Assigned To: Seth Jennings
QA Contact: DeShuai Ma
: Performance
Depends On:
Blocks:
 
Reported: 2016-06-23 04:15 EDT by Elvir Kuric
Modified: 2017-07-24 10 EDT
CC: 12 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: Using hostPath for storage could lead to running out of disk space. Consequence: the OpenShift root disk could become full and unusable. Fix: add support for pod eviction based on disk space. Result: if a pod using hostPath uses too much space, it may be evicted from the node.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-04-12 15:05:48 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments


External Trackers
Tracker ID: Red Hat Product Errata RHBA-2017:0884
Priority: normal
Status: SHIPPED_LIVE
Summary: Red Hat OpenShift Container Platform 3.5 RPM Release Advisory
Last Updated: 2017-04-12 18:50:07 EDT

Description Elvir Kuric 2016-06-23 04:15:54 EDT
Description of problem:

If one decides to use EmptyDir as the storage option for OpenShift pods and then writes intensively inside the pods, the space under /var/lib/origin/openshift.local.volumes/pods fills up without any limit. This leads to a situation where the OpenShift node file system is filled and normal operation of the OpenShift node is affected.


Version-Release number of selected component (if applicable):

I noticed this with the packages below:

atomic-openshift-clients-3.2.1.3-1.git.0.dfa4ad6.el7.x86_64
tuned-profiles-atomic-openshift-node-3.2.1.3-1.git.0.dfa4ad6.el7.x86_64
atomic-openshift-3.2.1.3-1.git.0.dfa4ad6.el7.x86_64
atomic-openshift-sdn-ovs-3.2.1.3-1.git.0.dfa4ad6.el7.x86_64
atomic-openshift-master-3.2.1.3-1.git.0.dfa4ad6.el7.x86_64
atomic-openshift-node-3.2.1.3-1.git.0.dfa4ad6.el7.x86_64

How reproducible:

always 


Steps to Reproduce:
1. Create pod(s) with EmptyDir as the storage option (a minimal example spec is sketched after these steps)
2. Write data inside the pod
3. Watch space usage under /var/lib/origin/openshift.local.volumes/pods
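
A minimal pod spec of the kind used in step 1 might look like the sketch below. The pod name, image, and write size are illustrative assumptions, not values from this report; the default (disk-backed) emptyDir is assumed.

$ oc create -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: emptydir-writer                       # hypothetical name
spec:
  containers:
  - name: writer
    image: registry.access.redhat.com/rhel7   # any image with /bin/sh works
    # step 2: write a large amount of data into the emptyDir mount
    command: ["/bin/sh", "-c", "dd if=/dev/zero of=/data/fill bs=1M count=100000; sleep 3600"]
    volumeMounts:
    - name: scratch
      mountPath: /data
  volumes:
  - name: scratch
    emptyDir: {}                               # disk-backed; no medium: Memory
EOF

On the node (step 3), the growth can then be watched with:
# watch -n 10 'du -sh /var/lib/origin/openshift.local.volumes/pods'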


Actual results:

Space usage under /var/lib/origin/openshift.local.volumes/pods keeps going up. If /var is not a separate partition on the OpenShift node, / fills up to 100% and services such as atomic-openshift-node and docker stop functioning.
df output from an affected system:

# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/xvda2       10G   10G   20K 100% /
devtmpfs        3.9G     0  3.9G   0% /dev
tmpfs           3.7G     0  3.7G   0% /dev/shm
tmpfs           3.7G  377M  3.4G  10% /run
tmpfs           3.7G     0  3.7G   0% /sys/fs/cgroup
tmpfs           757M     0  757M   0% /run/user/0


Expected results:

The behavior above should be prevented, as a filled-up / file system means that the OpenShift node services do not function properly and the OpenShift cluster can become degraded.


Additional info:

Once the situation above occurs, the pods residing on the affected node are moved to a new node (once the affected openshift-node stops responding to the master), the process repeats, the new node also gets its / (/var/lib/origin/openshift.local.volumes/pods) filled up, then the next one, and so on.
Comment 1 Andy Goldstein 2016-06-27 16:23:10 EDT
The out of disk eviction work that Derek is doing should help here (and might be sufficient to close this out).
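
For context, a sketch of how kubelet eviction thresholds of this kind are typically configured on an OCP 3.x node; the values below are illustrative assumptions, not settings taken from this bug.

# cat /etc/origin/node/node-config.yaml
...
kubeletArguments:
  eviction-hard:
  - memory.available<100Mi,nodefs.available<10%,nodefs.inodesFree<5%
...
# systemctl restart atomic-openshift-node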
Comment 2 Bradley Childs 2016-08-03 09:27:26 EDT
Upstream:

https://github.com/kubernetes/kubernetes/pull/27199
Comment 3 Bradley Childs 2016-08-08 13:23:45 EDT
@Hou Jianwei The disk pressure changes were merged; can you verify that this resolves the usability issue?
Comment 4 Andy Goldstein 2016-08-08 14:05:44 EDT
This is not in origin yet
Comment 7 Troy Dawson 2016-10-18 12:20:31 EDT
This has been merged into ose and is in OSE v3.4.0.12 or newer.
Comment 9 Wenqi He 2016-11-09 05:43:42 EST
I have tested this on the version below; this is fixed:
oc v3.4.0.23+24b1a58
kubernetes v1.4.0+776c994

Usage of / is limited even after I created a larger chunk of data:

bash-4.3$ dd if=/dev/zero of=/tmp/test bs=3072M count=1
0+1 records in
0+1 records out
2147479552 bytes (2.1 GB) copied, 119.747 s, 17.9 MB/s

On the node:
$ df -h 
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1        10G  6.6G  3.5G  66% /
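
A couple of ways to double-check that the node is reporting disk pressure and evicting pods while the disk fills (the node name is a placeholder; commands assume a 3.4+ cluster with the eviction work merged):

$ oc describe node <node-name> | grep DiskPressure   # condition the kubelet sets under disk pressure
$ oc get pods --all-namespaces | grep Evicted        # evicted pods show status Evicted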

So I will update the status to verified. Thanks.
Comment 12 Derek Carr 2016-11-28 17:17:28 EST
Upstream tracking issue:
https://github.com/kubernetes/kubernetes/issues/35406
Comment 15 Seth Jennings 2017-01-25 15:55:53 EST
Upstream PR merged
https://github.com/kubernetes/kubernetes/pull/37228

Origin PR opened
https://github.com/openshift/origin/pull/12669
Comment 16 Troy Dawson 2017-01-31 15:16:18 EST
This has been merged into ocp and is in OCP v3.5.0.12 or newer.
Comment 17 DeShuai Ma 2017-02-03 04:32:27 EST
Verified on openshift v3.5.0.14+20b49d0
When a pod is terminated, the kubelet should remove the disk-backed emptyDir volume.

Steps:
1. Create Failed/Succeeded pods with a host-disk-backed emptyDir volume
$ oc create -f https://raw.githubusercontent.com/mdshuai/testfile-openshift/master/k8s/terminatedpods/emtydir-host.yaml

2. On the node, make sure the disk-backed emptyDir volume is removed when the pod becomes Failed/Succeeded (see the lookup sketch below for ${pod.uid})
# ls /var/lib/origin/openshift.local.volumes/pods/${pod.uid}/volumes/kubernetes.io~empty-dir/${volumeName}
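
A sketch of how ${pod.uid} in that path can be resolved; <pod-name> is a placeholder and the jsonpath lookup is an illustrative assumption, not part of the referenced YAML (${volumeName} comes from the pod spec's volumes section):

$ oc get pod <pod-name> -o jsonpath='{.metadata.uid}'   # prints the pod UID used in the path above
# ls /var/lib/origin/openshift.local.volumes/pods/<uid>/volumes/kubernetes.io~empty-dir/
(expected once the pod is Failed/Succeeded: the emptyDir directory no longer exists)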
Comment 19 errata-xmlrpc 2017-04-12 15:05:48 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:0884
