Bug 1463534 - Parallel creation of containers with "--rm" leads to "Error response from daemon: Unable to remove filesystem for (...)"
Summary: Parallel creation of containers with "--rm" leads to "Error response from daemon: Unable to remove filesystem for (...)"
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: docker
Version: 7.3
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Assignee: Antonio Murdaca
QA Contact: atomic-bugs@redhat.com
URL:
Whiteboard:
Duplicates: 1460728
Depends On: 1465485
Blocks: 1186913
 
Reported: 2017-06-21 08:14 UTC by Sergio Lopez
Modified: 2021-08-30 13:19 UTC (History)
CC: 11 users

Fixed In Version: docker-1.12.6-48.git0fdc778.el7.x86_64
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-11-21 20:09:05 UTC
Target Upstream Version:
Embargoed:


Attachments: none


Links
System ID Private Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 3235041 0 None None None 2017-12-29 22:11:46 UTC

Description Sergio Lopez 2017-06-21 08:14:37 UTC
Description of problem:

Creating multiple (>= 2) containers with the "--rm" option in parallel frequently causes some of them to fail to be cleaned up properly, giving the user a response like this:

Error response from daemon: Unable to remove filesystem for 815a98e314769f5bedf57c56ffcc9b034e4b12f30c1dc06ab0c0939992cdca36: remove /var/lib/docker/containers/815a98e314769f5bedf57c56ffcc9b034e4b12f30c1dc06ab0c0939992cdca36/shm: device or resource busy


Version-Release number of selected component (if applicable):

Tested with docker-1.12.6-28.git1398f24.el7.x86_64


How reproducible:

~1/3 of the time


Steps to Reproduce:

Manually:
1. Use an application which allows you to control multiple terminals simultaneously (like Terminator).
2. Run "docker run --rm busybox" from 2 or more terminals in parallel.

Scripted:
#!/bin/bash -e
# Save this script as test.sh; the pkill below matches it by name.

for i in $(seq 5); do
  (
    while true; do
      docker run --rm busybox
      sleep 1
    done
  ) &
done

sleep 10
pkill -f test.sh


Actual results:

Some containers will fail to clean up with an error like this:

Error response from daemon: Unable to remove filesystem for 815a98e314769f5bedf57c56ffcc9b034e4b12f30c1dc06ab0c0939992cdca36: remove /var/lib/docker/containers/815a98e314769f5bedf57c56ffcc9b034e4b12f30c1dc06ab0c0939992cdca36/shm: device or resource busy

Additionally, if using the devmapper graphdriver, each time clean up fails, an internal DM device will be leaked (see also BZ #1460728).


Expected results:

Cleanup shouldn't fail.


Additional info:

These symptoms are caused by multiple issues:

1. dockerd unmounts container's "/dev/shm" lazily (using syscall.MNT_DETACH), which causes os.RemoveAll call from daemon/delete.go:cleanupContainer to fail with EBUSY.

2. When using the devmapper graphdriver, dockerd also unmounts lazily the DM devices, causing the os.RemoveAll call from daemon/graphdriver/devmapper/driver.go:Remove to fail with EBUSY.

3. When using the devmapper graphdriver with dm.use_deferred_removal=true and dm.use_deferred_deletion=false (the default options in RHEL 7.3), the removal of the DM device is deferred, so it may still be present when reaching the devicemapper.DeleteDevice call in daemon/graphdriver/devmapper/deviceset.go:deleteTransaction, causing it to fail with EBUSY.

Issues (1) and (2) are fixed by upstream's "Do not remove containers from memory on error" (54dcbab25ea4771da303fa95e0c26f2d39487b49). 

Issue (3) can be fixed or worked around with one of these options:

1. Disabling dm.use_deferred_removal. The non-deferred removal path has a retry loop to deal with potential EBUSY.

2. Enabling dm.use_deferred_deletion. This way the delete operation is also deferred, making the whole operation coherent with dm.use_deferred_removal=true.

3. Implementing a retry loop for devicemapper.DeleteDevice in daemon/graphdriver/devmapper/deviceset.go:deleteTransaction.
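For reference, options (1) and (2) map onto devicemapper storage options that can be set in the daemon configuration. A minimal sketch, assuming a daemon.json-based setup (on RHEL 7 the same options are often passed via /etc/sysconfig/docker-storage instead); this shows option (2), keeping deferred removal enabled and also enabling deferred deletion so the two stay coherent:

```json
{
  "storage-driver": "devicemapper",
  "storage-opts": [
    "dm.use_deferred_removal=true",
    "dm.use_deferred_deletion=true"
  ]
}
```

The daemon needs a restart for storage options to take effect, and deferred removal/deletion require kernel support in the device-mapper stack.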

Comment 2 Sergio Lopez 2017-06-21 10:48:59 UTC
I've just created a PR addressing this issue: https://github.com/projectatomic/docker/pull/258

After rereading the code, I've decided against implementing a retry loop for devicemapper.DeleteDevice in daemon/graphdriver/devmapper/deviceset.go:deleteTransaction, because IMHO the actual problem is that enabling deferred removal without deferred deletion is incoherent: deletion depends on removal (deactivation).

Comment 3 Sergio Lopez 2017-06-27 14:10:36 UTC
*** Bug 1460728 has been marked as a duplicate of this bug. ***

Comment 4 Sergio Lopez 2017-06-27 15:21:02 UTC
The PR mentioned in comment #2 has been merged into Project Atomic's Docker, in branches docker-1.12.6, docker-1.13.1 and docker-1.13.1-rhel.

With this change, the situation has improved, as container clean up is less likely to fail, and if it fails, it will not delete the container, so the user can try to delete it later (this is better than leaking a DM device).

But I'd still like to minimize the situations in which the auto-removal fails. So far I've identified two situations that trigger the issue:

 1. Some container mount points are leaked into the systemd-machined namespace. This looks like a bug in systemd, and I've created a separate bug for it (BZ #1465485). Retrying the call to DeleteDevice will not help, as the DM device will stay busy until systemd-machined exits.

 2. Device-Mapper losing the race against the docker daemon, which calls devicemapper.DeleteDevice before the deferred removal has completed. I'm still not 100% sure whether this is a simple race, or if someone is keeping the DM device busy, but retrying the deletion always succeeds, so I've created a PR upstream implementing a simple retry loop (https://github.com/moby/moby/pull/33846).
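For illustration, the retry idea in (2) can be sketched in shell terms: keep retrying an operation that fails transiently (e.g. with EBUSY) until it succeeds or a retry budget is exhausted. retry_busy below is a hypothetical helper, not part of docker; the real fix is a Go loop inside the devicemapper graphdriver:

```shell
#!/bin/sh
# Hypothetical sketch of a retry loop for a transiently failing operation,
# mirroring the approach in the upstream PR (retry the delete while the
# deferred removal is still in flight). Not part of docker itself.
retry_busy() {
  max=$1; shift            # maximum number of attempts
  n=0
  until "$@"; do           # run the command; stop as soon as it succeeds
    n=$((n + 1))
    [ "$n" -ge "$max" ] && return 1   # give up after $max attempts
    sleep 0.1              # brief pause before retrying
  done
  return 0
}
```

Usage: `retry_busy 5 dmsetup remove mydev` would attempt the removal up to five times before reporting failure.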

Comment 6 Vivek Goyal 2017-07-18 11:09:43 UTC
One more PR has been committed upstream and backported in projectatomic/docker. I am hoping that helps with the problem.

https://github.com/projectatomic/docker/issues/266

Lokesh, can we do another docker build with the latest commit in the tree?

Comment 7 Vivek Goyal 2017-07-24 12:19:42 UTC
The latest docker build has changes which might fix the issue. Give it a try.

docker-2:1.12.6-48.git0fdc778

