Bug 1352390 - Monitoring and managing the quota size of containers and local volumes [NEEDINFO]
Status: CLOSED ERRATA
Product: OpenShift Container Platform
Classification: Red Hat
Component: Pod
Version: 3.2.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Assigned To: Derek Carr
QA Contact: DeShuai Ma
Depends On:
Blocks:
 
Reported: 2016-07-04 00:58 EDT by Jaspreet Kaur
Modified: 2017-03-08 13 EST

See Also:
Fixed In Version:
Doc Type: Enhancement
Doc Text:
Feature: Ability to detect local disk pressure and reclaim resources.
Reason: To maintain node stability, the operator can set eviction thresholds that, when crossed, cause the node to reclaim disk resources by pruning images or evicting pods.
Result: The node is able to recover from disk pressure.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-01-18 07:41:20 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
agoldste: needinfo? (jkaur)




External Trackers
Tracker | ID | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2017:0066 | normal | SHIPPED_LIVE | Red Hat OpenShift Container Platform 3.4 RPM Release Advisory | 2017-01-18 12:23:26 EST

Description Jaspreet Kaur 2016-07-04 00:58:04 EDT
Today no process on the node manages the amount of storage in use for:

    Container CoW layers (the container's writable directory)
    Empty dir volumes
    Docker volumes created implicitly on startup of the container

Since neither Docker nor the host operating system can fully manage this, I think it should be possible to have a management loop in or around the kubelet to control the total usage of these elements. Today, it's possible to write to an emptyDir until the filesystem fills up, at which point all pods fail. You could set quotas (if the UIDs of the processes are different), but that requires every pod to run under a unique UID. The alternative, setting the quota to a multiple of the number of pods sharing a UID, doesn't protect against bad actors.

I would propose the following:

    Add something to cAdvisor to track top layer CoW usage where possible for the filesystems that can provide it
        Docker today does not expose this - we could potentially add an abstraction endpoint there, or do the checks directly against devmapper / overlay
    In the kubelet, honor a resource limit for usage of emptyDir (per mount / pod?) as well as the CoW filesystem with a simple check loop.
        In situations where the container CoW layer has exceeded the maximum, restart the container gracefully (this is hard to do outside of the kubelet today)
        In situations where the emptyDir has exceeded size... do something??
    Report the volume and CoW metrics up through cAdvisor
    Support read only containers through docker as a pod setting and allow it to be controlled via pod policy or similar (user cannot set non-readonly containers)
    Be able to deny emptyDir volumes via pod policy or forcibly limit them to a specific size.
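
For illustration only, the "simple check loop" proposed above could look roughly like the following sketch. It is not kubelet code: the function names, the mapping of paths to byte limits, and the use of summed file sizes as the usage measure are all assumptions made for this example. What to do once a limit is exceeded is left open, as it is in the proposal.

```python
import os
from typing import Dict, List


def dir_usage_bytes(path: str) -> int:
    """Sum the lstat sizes of all non-directory entries under path.

    Hypothetical helper: a real implementation would more likely ask the
    filesystem (quota or block accounting) than walk the tree.
    """
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                # The entry vanished between listing and stat; skip it.
                continue
    return total


def find_over_limit(volumes: Dict[str, int]) -> List[str]:
    """Return the paths whose measured usage exceeds their byte limit.

    `volumes` maps a directory (for example an emptyDir backing path)
    to an assumed per-volume limit in bytes.
    """
    return [
        path for path, limit in volumes.items()
        if dir_usage_bytes(path) > limit
    ]
```

A management loop would call `find_over_limit` periodically and hand the result to whatever policy is chosen (restart, evict, or report).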


Additional information: https://github.com/kubernetes/kubernetes/issues/13479
Comment 2 Andy Goldstein 2016-07-22 14:14:24 EDT
You can currently do the following in OpenShift:

- restrict allowed volume types via SCC (i.e. deny emptyDir)

- enforce quota size restrictions on emptyDir (XFS only, see https://docs.openshift.com/enterprise/3.2/install_config/master_node_configuration.html and look for localQuota)

- prohibit containers from starting that use Docker volumes (chattr +i /var/lib/docker/volumes; not the best user experience, but it works for now)


We are currently working on:

- tracking container CoW layer usage (done, will be in 3.3)

- evicting pods when a node gets low on free disk space (in progress, release TBD. see https://trello.com/c/3LvGAHr3/371-5-kubelet-evicts-pods-when-low-on-disk-aid-google-node-reliability to track the status)


We have discussed read-only containers, but have not started implementing them yet.

Jaspreet, will the combination of what you can currently do above plus what we're working on be sufficient to satisfy this RFE?
Comment 3 Andy Goldstein 2016-08-05 16:52:59 EDT
Jaspreet, any update on this?
Comment 7 Derek Carr 2016-10-25 16:02:05 EDT
The ability to detect and respond to local disk pressure was added in Kubernetes 1.4, which OCP has now rebased against. Moving this to QA for testing.
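
As a rough illustration of the behaviour described for this feature (an eviction threshold that, once crossed, leads the node to prune images or evict pods), here is a minimal sketch. It is not the kubelet's implementation: the function name, the fractional threshold, the `reclaimable_image_bytes` input, and the rule of preferring image pruning whenever it alone would clear the threshold are assumptions made for this example.

```python
def decide_reclaim(free_bytes: int,
                   total_bytes: int,
                   min_free_fraction: float,
                   reclaimable_image_bytes: int) -> str:
    """Pick a reclaim action for a filesystem (hypothetical policy).

    Returns "none" when free space is at or above the threshold,
    "prune_images" when deleting unused images would be enough to get
    back above it, and "evict_pods" otherwise.
    """
    required_free = min_free_fraction * total_bytes
    if free_bytes >= required_free:
        return "none"
    if free_bytes + reclaimable_image_bytes >= required_free:
        return "prune_images"
    return "evict_pods"
```

For example, with a 10% threshold on a 100-byte filesystem, 5 bytes free and 10 bytes of reclaimable images gives `"prune_images"`, while 5 bytes free and only 2 reclaimable gives `"evict_pods"`.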
Comment 8 DeShuai Ma 2016-10-26 01:55:21 EDT
Tested on OpenShift v3.4.0.15+9c963ec; disk pressure handling works as expected.
Details are in the card: https://trello.com/c/3LvGAHr3/371-5-kubelet-evicts-pods-when-low-on-disk-node-reliability

Marking this bug as verified.
Comment 10 errata-xmlrpc 2017-01-18 07:41:20 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:0066
