1512679 – Failed docker builds leave temporary containers on node

Summary: Failed docker builds leave temporary containers on node

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Build
Sub Component:
Version:	3.7.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	high
Target Milestone:	---
Target Release:	3.7.z
Assignee:	Cesar Wong
QA Contact:	Wenjing Zheng
Docs Contact:
URL:
Whiteboard:
Duplicates (1):	1515358
Depends On:
Blocks:	1533181

Reported:	2017-11-13 20:12 UTC by Cesar Wong
Modified:	2021-06-10 13:34 UTC
CC List:	6 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:	Cause: The OpenShift Docker builder invokes the Docker build API without the ForceRmTemp flag. Consequence: Containers from failed builds remain on the node where the build ran. The kubelet does not recognize these containers for garbage collection, so they accumulate until the node runs out of space. Fix: Modified the Docker build API call from the OpenShift Docker builder to force the removal of temporary containers. Result: Failed containers no longer remain on the node where a Docker build ran.
Clone Of:
Clones:	1538413
Environment:
Last Closed:	2017-12-18 13:23:56 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2017:3464	0	normal	SHIPPED_LIVE	Red Hat OpenShift Container Platform 3.7 bug fix and enhancement update	2017-12-18 18:22:05 UTC

Description Cesar Wong 2017-11-13 20:12:40 UTC

Description of problem:
After running a Docker strategy build that fails on a node, a container that represents that build remains on the node. The container is not cleaned up by the Kubelet because it's not a container managed by Kubernetes. This causes the node to keep containers that will not get cleaned up, eventually causing the node to run out of space.

Version-Release number of selected component (if applicable):
All versions

How reproducible:
Always

Steps to Reproduce:
1. Create a Docker build that will fail:
   printf 'FROM openshift/origin:latest\nRUN exit 1\n' | oc new-build -D - --name failing-build
2. Wait for the build to finish
3. Inspect containers on the node where the build ran with 'docker ps -a'
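
To make step 3 easier to read on a busy node, the 'docker ps -a' listing can be filtered down to exited containers that do not look kubelet-managed. The following is a minimal sketch, not part of the original report. It assumes the listing was produced with a tab-separated format such as `docker ps -a --format '{{.ID}}\t{{.Names}}\t{{.Status}}'`, and that kubelet-managed containers on a Docker runtime carry a `k8s_` name prefix. The function name and the sample rows are hypothetical.

```python
def leftover_containers(ps_output):
    """Return (id, name) pairs for exited containers that are not kubelet-named.

    Assumes each line is "<id>\t<name>\t<status>" and that kubelet-managed
    containers have names starting with "k8s_" (an assumption about the
    Docker runtime naming convention, not something stated in the report).
    """
    leftovers = []
    for line in ps_output.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        container_id, name, status = parts
        if status.startswith("Exited") and not name.startswith("k8s_"):
            leftovers.append((container_id, name))
    return leftovers


# Hypothetical sample rows, for illustration only.
sample = (
    "aaa111\tk8s_docker-build_failing-build-1-build_myproject_0\tExited (1) 2 minutes ago\n"
    "bbb222\tsleepy_morse\tExited (1) 2 minutes ago\n"
    "ccc333\tk8s_router_router-1-abcde_default_0\tUp 3 hours\n"
)

for container_id, name in leftover_containers(sample):
    print(container_id, name)
```

With the sample rows above, only the second row is reported, since it is exited and lacks the assumed kubelet prefix.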

Actual results:
The container that ran the failing RUN instruction ('exit 1') still exists on the node

Expected results:
No containers related to the failed build should exist on the node


Additional info:

Comment 1 Cesar Wong 2017-11-13 20:20:50 UTC

PR https://github.com/openshift/origin/pull/17285

Comment 2 Cesar Wong 2017-11-27 15:14:35 UTC

PR for origin master https://github.com/openshift/origin/pull/17283
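
As a rough illustration of the idea in the Doc Text (forcing removal of temporary containers), and not the code from the PRs above: the Docker Engine build endpoint accepts an `rm` query parameter, which removes intermediate containers after a successful build, and a `forcerm` parameter, which removes them even when the build fails. The sketch below only assembles such a request path. The helper name `build_endpoint` and the tag value are hypothetical, and how the OpenShift builder's client library maps its own option onto this parameter is not shown here.

```python
from urllib.parse import urlencode


def build_endpoint(tag, force_rm=True):
    """Assemble a Docker Engine /build request path (illustrative helper).

    rm=1      removes intermediate containers after a successful build.
    forcerm=1 removes intermediate containers even if the build fails.
    """
    params = {
        "t": tag,
        "rm": 1,
        "forcerm": 1 if force_rm else 0,
    }
    return "/build?" + urlencode(params)


# Hypothetical tag, for illustration only.
print(build_endpoint("failing-build"))
print(build_endpoint("failing-build", force_rm=False))
```

With `forcerm` left at 0, a failed RUN step would leave its container behind, which matches the behavior reported in this bug.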

Comment 4 Dongbo Yan 2017-12-06 05:58:20 UTC

Verified
# openshift version
openshift v3.7.11
kubernetes v1.7.6+a08f5eeb62
etcd 3.2.8

Comment 5 Ben Parees 2017-12-06 20:30:19 UTC

*** Bug 1515358 has been marked as a duplicate of this bug. ***

Comment 8 errata-xmlrpc 2017-12-18 13:23:56 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:3464
