Description of problem:
Observed on the starter-ca-central-1 cluster. Our pre-upgrade diagnostics reported disk pressure on 10 compute nodes. On further investigation, the container storage on all 10 nodes had been fully consumed by a single rogue pod (tropospheric) that continuously creates core dumps on the container storage volume.

Version-Release number of selected component (if applicable):
openshift v3.11.44
cri-o://1.11.9

How reproducible:
100%

Steps to Reproduce:
1. Create a project that continuously writes to the container storage volume.

Actual results:
Eventually, the project consumes the entire disk and ultimately cripples the container runtime.

Expected results:
Ideally, a single container should not be able to consume all of the shared resources that the container runtime provides.

Additional info:
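The reproduction step above can be approximated with a small workload like the following. This is only a sketch, not the actual workload from the affected cluster: the function name `fill_storage`, the file name `core.dump`, and the `max_bytes` safety cap are all hypothetical and chosen for illustration. Run inside a container with `target_dir` pointing at a path on the container's writable layer and a very large cap, it should gradually consume the shared container storage in the way described.

```python
import os


def fill_storage(target_dir, chunk_bytes=1024 * 1024, max_bytes=16 * 1024 * 1024):
    """Append data to a file under target_dir until max_bytes have been written.

    Returns the number of bytes written. The max_bytes cap is a safety limit
    added for this sketch; the pod in the report had no such limit.
    """
    path = os.path.join(target_dir, "core.dump")  # hypothetical file name
    chunk = os.urandom(chunk_bytes)  # random data, so it cannot be stored sparsely
    written = 0
    with open(path, "ab") as f:
        while written < max_bytes:
            n = min(chunk_bytes, max_bytes - written)
            f.write(chunk[:n])
            f.flush()
            os.fsync(f.fileno())  # force the data onto the backing volume
            written += n
    return written
```

For example, `fill_storage("/tmp", max_bytes=64 * 1024 * 1024)` writes 64 MiB to `/tmp/core.dump` and returns the byte count. Keeping the cap small is advisable on any node that is not disposable.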
*** Bug 1658385 has been marked as a duplicate of this bug. ***
Do we expose this in the OpenShift Operator for 4.0?
Yes, we did, but Justin wanted this kept open so that it can also be added to openshift-ansible.
Sending to Installer to evaluate the change for 3.x. For 4.x, I opened https://jira.coreos.com/browse/NODE-163 to verify that we can set `overlay.size` in /etc/containers/storage.conf via the ContainerRuntimeConfigs CRD.
Online no longer needs this and there's no customer case attached so marking as deferred.