Bug 1420994 - django quickstart can trigger excessive disk io when it hits memory limits
Summary: django quickstart can trigger excessive disk io when it hits memory limits
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Software Collections
Classification: Red Hat
Component: rh-python35-container
Version: rh-python35
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 3.1
Assignee: Python Maintainers
QA Contact: BaseOS QE - Apps
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2017-02-10 05:12 UTC by Jaspreet Kaur
Modified: 2020-07-16 09:12 UTC
CC: 10 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-12-07 15:57:31 UTC
Target Upstream Version:
Embargoed:



Description Jaspreet Kaur 2017-02-10 05:12:36 UTC
Description of problem: Running the OpenShift recommended example:
https://github.com/openshift/django-ex

results in substantially more memory consumption than the assumed '10M'.

This means that on a shared platform, where memory quotas can make the
memory/CPU ratio look skewed (for example '8' CPUs but '512M' RAM), we can
end up spawning far more workers than we have memory for. The result is new
workers being constantly killed, and high disk IO usage as the workers keep
respawning.

This then repeats ad nauseam, exhausting the burst IO allocated to the AWS
volume and impacting all users of that node.

Note: While Docker can support limiting the number of CPUs seen, currently
Kubernetes and OpenShift do not support this.
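
For illustration only (this is not the actual logic from the s2i image), a
rough sketch of why a CPU-derived worker count overcommits memory; the
two-workers-per-CPU heuristic and the ~60 MiB per-worker footprint are
assumptions, not measured values:

# Sketch: compare a CPU-based worker count against the cgroup memory limit.
import os

ASSUMED_WORKER_RSS = 60 * 1024 * 1024   # ~60 MiB per worker (assumption)

def cgroup_memory_limit():
    # cgroup v1 path, as used on these nodes
    try:
        with open("/sys/fs/cgroup/memory/memory.limit_in_bytes") as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None

cpus = os.cpu_count() or 1          # reports the host's CPUs, not the CPU quota
workers = cpus * 2                  # common "2 x CPUs" heuristic (assumption)
limit = cgroup_memory_limit()

if limit is not None and workers * ASSUMED_WORKER_RSS > limit:
    print("{0} workers would need more memory than the {1} MiB limit; "
          "expect the OOM killer to cycle workers".format(
              workers, limit // (1024 * 1024)))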




Version-Release number of selected component (if applicable):


How reproducible: 

Preparation: 
Our nodes are m4.2xlarge: 32G Ram & 8 CPU
Our docker storage is 100G GP2 (300 IOP/s)
We apply a reasonably aggressive default memory limit of 512Mi, which means pods need to see moderate use before they start having problems. To make the issue easier to trigger you can set the memory limit to around 256Mi; you can also expose it by increasing the number of cores on the host. CPU limits have no impact on the calculation, since they are not seen by the s2i startup script.
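
To illustrate that last point (a sketch, not the image's startup code): even
with a CPU quota applied, the Python process still sees all of the host's
cores, which is what the worker calculation is based on. The cgroup v1 file
paths below are what the kernel exposes for the CFS quota.

import os

def cgroup_cpu_quota():
    # cgroup v1 CFS quota files; a quota of -1 means "unlimited"
    try:
        with open("/sys/fs/cgroup/cpu/cpu.cfs_quota_us") as f:
            quota = int(f.read())
        with open("/sys/fs/cgroup/cpu/cpu.cfs_period_us") as f:
            period = int(f.read())
        return quota / period if quota > 0 else None
    except (OSError, ValueError):
        return None

print("os.cpu_count():", os.cpu_count())        # e.g. 8 on an m4.2xlarge node
print("cgroup CPU quota:", cgroup_cpu_quota())  # enforced limit, ignored above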

Trigger:
Create a project using the 'Python 3.5' example and the example django app: https://github.com/openshift/django-ex

Effect:
Once the pod starts up you'll see that some of the workers keep getting killed by the OOM killer and respawned. Depending on your storage backend you'll also see increased disk IO usage.



Steps to Reproduce:
1.
2.
3.

Actual results: Excessive disk IO with django Quickstart


Expected results: The issue should be prevented in the default image itself.


Additional info: Workaround tested: setting WEB_CONCURRENCY=1 in the dc.
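
For reference, gunicorn uses WEB_CONCURRENCY as the default for its worker
count, so the same cap can also be pinned from a checked-in config file; a
minimal sketch (the file name gunicorn.conf.py and the fallback of 1 are my
choices for illustration, not part of the quickstart):

# gunicorn.conf.py (hypothetical file name for this sketch)
import os

# Pin the worker count to WEB_CONCURRENCY, defaulting to a single worker,
# instead of letting it scale with the host CPU count.
workers = int(os.environ.get("WEB_CONCURRENCY", "1"))

A single worker is what the tested workaround above effectively gives you.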

Comment 7 Charalampos Stratakis 2017-12-07 15:57:31 UTC
The fix is in the latest image, so I am closing the issue.

Please feel free to reopen it if you experience the issue again.

