Description of problem:
If one decides to use EmptyDir as the storage option for OpenShift pods and then writes intensively inside the pods, space under /var/lib/origin/openshift.local.volumes/pods fills up without any limit. This can fill the OpenShift node's file system and disrupt normal operation of the node.

Version-Release number of selected component (if applicable):
I noticed this with the packages below:
atomic-openshift-clients-3.2.1.3-1.git.0.dfa4ad6.el7.x86_64
tuned-profiles-atomic-openshift-node-3.2.1.3-1.git.0.dfa4ad6.el7.x86_64
atomic-openshift-3.2.1.3-1.git.0.dfa4ad6.el7.x86_64
atomic-openshift-sdn-ovs-3.2.1.3-1.git.0.dfa4ad6.el7.x86_64
atomic-openshift-master-3.2.1.3-1.git.0.dfa4ad6.el7.x86_64
atomic-openshift-node-3.2.1.3-1.git.0.dfa4ad6.el7.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Create pod(s) with EmptyDir as the storage option (a minimal example pod spec is sketched below).
2. Write data inside the pod.
3. Watch space usage in /var/lib/origin/openshift.local.volumes/pods.

Actual results:
Space usage in /var/lib/origin/openshift.local.volumes/pods keeps growing. If /var is not a separate partition on the OpenShift node, / fills up to 100% and services such as atomic-openshift-node and docker stop functioning.

df from an affected system:
# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/xvda2       10G   10G   20K 100% /
devtmpfs        3.9G     0  3.9G   0% /dev
tmpfs           3.7G     0  3.7G   0% /dev/shm
tmpfs           3.7G  377M  3.4G  10% /run
tmpfs           3.7G     0  3.7G   0% /sys/fs/cgroup
tmpfs           757M     0  757M   0% /run/user/0

Expected results:
The behavior above should be prevented, since a filled-up / file system stops the openshift-node service from functioning properly and can degrade the whole cluster.

Additional info:
Once the condition above occurs and the affected openshift-node stops responding to the master, the pods residing on that node are moved to a new node, where the process repeats: the new node also gets / (/var/lib/origin/openshift.local.volumes/pods) filled up, then the next one, and so on.
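For reference, a minimal sketch of the kind of pod used in the reproduction steps. All names and the image are illustrative; the only essential part is the emptyDir volume:

apiVersion: v1
kind: Pod
metadata:
  name: emptydir-writer                        # illustrative name
spec:
  containers:
  - name: writer
    image: registry.access.redhat.com/rhel7   # any image with dd works
    # Keep writing into the emptyDir volume until the node's disk fills up.
    command: ["/bin/sh", "-c", "dd if=/dev/zero of=/data/fill bs=1M; sleep infinity"]
    volumeMounts:
    - name: scratch
      mountPath: /data
  volumes:
  - name: scratch
    emptyDir: {}    # node-local, backed by /var/lib/origin/openshift.local.volumes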
The out-of-disk eviction work that Derek is doing should help here (and might be sufficient to close this out).
Upstream: https://github.com/kubernetes/kubernetes/pull/27199
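The eviction work lets the kubelet reclaim node resources when it detects disk pressure. On OpenShift 3.x nodes these kubelet flags are typically set through kubeletArguments in node-config.yaml; the thresholds below are illustrative values, not shipped defaults:

kubeletArguments:
  eviction-hard:
    - "memory.available<100Mi"
    - "nodefs.available<10%"     # evict pods when node fs drops below 10%
  eviction-soft:
    - "nodefs.available<15%"
  eviction-soft-grace-period:
    - "nodefs.available=1m"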
@Hou Jianwei The disk-pressure changes were merged; can you verify that this resolves the usability issue?
This is not in Origin yet.
This has been merged into OSE and is in OSE v3.4.0.12 or newer.
I have tested this on the version below; this is fixed:
oc v3.4.0.23+24b1a58
kubernetes v1.4.0+776c994

The / usage is limited after I created a larger chunk of data:

bash-4.3$ dd if=/dev/zero of=/tmp/test bs=3072M count=1
0+1 records in
0+1 records out
2147479552 bytes (2.1 GB) copied, 119.747 s, 17.9 MB/s

On the node:
$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1        10G  6.6G  3.5G  66% /

So I will update the status to verified. Thanks.
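As a quick way to confirm the eviction behavior while the volume fills, one can watch the node's DiskPressure condition (a sketch; <node-name> is a placeholder, and the condition requires a version with the disk eviction work, i.e. kubernetes v1.4+):

$ oc get node <node-name> -o jsonpath='{.status.conditions[?(@.type=="DiskPressure")].status}'

This should report True once the kubelet crosses its eviction threshold and starts reclaiming disk.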
Upstream tracking issue: https://github.com/kubernetes/kubernetes/issues/35406
Upstream PR merged: https://github.com/kubernetes/kubernetes/pull/37228
Origin PR opened: https://github.com/openshift/origin/pull/12669
This has been merged into OCP and is in OCP v3.5.0.12 or newer.
Verified on openshift v3.5.0.14+20b49d0

When a pod is terminated, the kubelet now removes the disk-backed emptyDir volume.

Steps:
1. Create Failed/Succeeded pods with a host-disk-backed emptyDir volume (a sketch of such a pod is included after this comment):
   $ oc create -f https://raw.githubusercontent.com/mdshuai/testfile-openshift/master/k8s/terminatedpods/emtydir-host.yaml
2. On the node, make sure the disk-backed emptyDir volume is removed once the pod becomes Failed/Succeeded:
   # ls /var/lib/origin/openshift.local.volumes/pods/${pod.uid}/volumes/kubernetes.io~empty-dir/${volumeName}
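Since the referenced YAML is not reproduced here, a minimal sketch of the kind of pod used for this check: a short-lived pod that writes into an emptyDir and then exits, so it ends up Succeeded. All names and the image are illustrative:

apiVersion: v1
kind: Pod
metadata:
  name: emptydir-terminated    # illustrative name
spec:
  restartPolicy: Never         # let the pod end up Succeeded/Failed instead of restarting
  containers:
  - name: writer
    image: registry.access.redhat.com/rhel7
    # Write some data into the emptyDir, then exit so the pod terminates.
    command: ["/bin/sh", "-c", "dd if=/dev/zero of=/data/test bs=1M count=100"]
    volumeMounts:
    - name: scratch
      mountPath: /data
  volumes:
  - name: scratch
    emptyDir: {}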
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:0884