Bug 1352390
| Summary: | Monitoring and managing the quota size of containers and local volumes | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Jaspreet Kaur <jkaur> |
| Component: | Node | Assignee: | Derek Carr <decarr> |
| Status: | CLOSED ERRATA | QA Contact: | DeShuai Ma <dma> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 3.2.0 | CC: | aos-bugs, jkaur, jokerman, mmccomas, tdawson, wmeng |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Enhancement |
| Doc Text: | Feature: Ability to detect local disk pressure and reclaim resources. Reason: To maintain node stability, the operator can set eviction thresholds that, when crossed, cause the node to reclaim disk resources by pruning images or evicting pods (see the sketch after this table). Result: The node is able to recover from disk pressure. | | |
| Story Points: | --- | | |
| Clone Of: | | Environment: | |
| Last Closed: | 2017-01-18 12:41:20 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
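As a companion to the Doc Text above, here is a minimal sketch of what disk-pressure eviction thresholds could look like in an OCP 3.4 node configuration. The thresholds are passed to the kubelet via the `kubeletArguments` stanza of node-config.yaml; the specific values below are illustrative assumptions, not taken from this bug.

```yaml
# Sketch: fragment of a node-config.yaml (values are illustrative).
# When nodefs or imagefs availability drops below these thresholds, the node
# reclaims disk by pruning unused images and, if necessary, evicting pods.
kubeletArguments:
  eviction-hard:
    - "nodefs.available<10%,imagefs.available<15%"
  eviction-soft:
    - "nodefs.available<15%"
  # A soft threshold must be paired with a grace period before eviction starts.
  eviction-soft-grace-period:
    - "nodefs.available=1m30s"
```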
Comments:

You can currently do the following in OpenShift:

- restrict allowed volume types via SCC (e.g. deny emptyDir)
- enforce quota size restrictions on emptyDir (XFS only; see https://docs.openshift.com/enterprise/3.2/install_config/master_node_configuration.html and look for localQuota — a sketch follows this comment section)
- prohibit containers from starting that use Docker volumes (`chattr +i /var/lib/docker/volumes`; not the best user experience, but it works for now)

We are currently working on:

- tracking container CoW layer usage (done, will be in 3.3)
- evicting pods when a node gets low on free disk space (in progress, release TBD; see https://trello.com/c/3LvGAHr3/371-5-kubelet-evicts-pods-when-low-on-disk-aid-google-node-reliability to track the status)

We have discussed read-only containers, but have not started implementing them yet. Jaspreet, will the combination of what you can currently do above plus what we're working on be sufficient to satisfy this RFE?

Jaspreet, any update on this?

The ability to detect and respond to local disk pressure was added in Kubernetes 1.4, which OCP has now rebased against. Moving this to QA for testing.

Tested on openshift v3.4.0.15+9c963ec; disk pressure works as expected. Details in the card: https://trello.com/c/3LvGAHr3/371-5-kubelet-evicts-pods-when-low-on-disk-node-reliability

Verified this bug.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:0066

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days.
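The localQuota option referenced in the first comment lives in the node's node-config.yaml. A minimal sketch follows; the 512Mi limit is an illustrative assumption, and it only takes effect when the emptyDir volume directory sits on an XFS filesystem mounted with group quotas enabled.

```yaml
# Sketch: fragment of a node-config.yaml (the value is illustrative).
# Applies a per-FSGroup XFS quota to emptyDir volumes so that a single pod
# cannot fill the node's local volume filesystem.
volumeConfig:
  localQuota:
    # Limit each FSGroup (and therefore each pod's emptyDir usage) to 512Mi.
    perFSGroup: 512Mi
```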
Description:

Today no process on the node manages the amount of storage in use for:

- Container CoW layers (the container's writable directory)
- Empty dir volumes
- Docker volumes created implicitly on startup of the container

Since neither Docker nor the host operating system can totally manage this, I think it should be possible to have a management loop in / around the kubelet to control the total usage of these elements. Today, it's possible to write to the empty dir until you fill the filesystem, at which point all pods fail. You could set quotas (if the uids of the processes are different), but that requires all your pods to be running unique uids (or set quota to a multiple of the count of the pods with the same UID, which doesn't protect against bad actors).

I would propose the following:

- Add something to cAdvisor to track top-layer CoW usage where possible for the filesystems that can provide it. Docker today does not expose this - we could potentially add an abstraction endpoint there, or do the checks directly against devmapper / overlay.
- In the Kubelet, honor a resource limit for usage of emptyDir (per mount / pod?) as well as the CoW filesystem with a simple check loop.
- In situations where the container CoW layer has exceeded the maximum, restart the container gracefully (this is hard to do outside of the kubelet today).
- In situations where the emptyDir has exceeded size... do something??
- Report the volume and CoW metrics up through cAdvisor.
- Support read-only containers through Docker as a pod setting and allow it to be controlled via pod policy or similar (user cannot set non-readonly containers).
- Be able to deny emptyDir volumes via pod policy or forcibly limit them to a specific size (an SCC-based sketch of these last two items follows below).

Additional information: https://github.com/kubernetes/kubernetes/issues/13479
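The last two proposals (forcing read-only containers and denying emptyDir via policy) roughly map onto the SCC mechanism mentioned in the comments. A minimal sketch of such an SCC follows; the name and the allowed-volume list are illustrative assumptions, not something defined by this bug.

```yaml
# Sketch: an SCC (hypothetical name) that omits emptyDir from the allowed
# volume types and forces containers to run with a read-only root filesystem.
apiVersion: v1
kind: SecurityContextConstraints
metadata:
  name: restricted-no-emptydir
allowPrivilegedContainer: false
readOnlyRootFilesystem: true
runAsUser:
  type: MustRunAsRange
seLinuxContext:
  type: MustRunAs
# emptyDir is deliberately absent from this list, so pods admitted under
# this SCC cannot request emptyDir volumes.
volumes:
  - configMap
  - downwardAPI
  - persistentVolumeClaim
  - secret
```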