Bug 1463534
Summary: | Parallel creation of containers with "--rm" leads to "Error response from daemon: Unable to remove filesystem for (...)" | | |
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Sergio Lopez <slopezpa> |
Component: | docker | Assignee: | Antonio Murdaca <amurdaca> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | atomic-bugs <atomic-bugs> |
Severity: | high | Docs Contact: | |
Priority: | high | | |
Version: | 7.3 | CC: | amurdaca, byount, dornelas, dwalsh, imcleod, inetkach, jamills, lsm5, santiago, schoudha, vgoyal |
Target Milestone: | rc | Keywords: | Extras |
Target Release: | --- | | |
Hardware: | Unspecified | | |
OS: | Unspecified | | |
Whiteboard: | | | |
Fixed In Version: | docker-1.12.6-48.git0fdc778.el7.x86_64 | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2017-11-21 20:09:05 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Bug Depends On: | 1465485 | | |
Bug Blocks: | 1186913 | | |
Description
Sergio Lopez
2017-06-21 08:14:37 UTC
I've just created a PR to address this issue: https://github.com/projectatomic/docker/pull/258

After rereading the code, I've decided against implementing a retry loop for devicemapper.DeleteDevice in daemon/graphdriver/devmapper/deviceset.go:deleteTransaction. IMHO the actual problem is that enabling deferred removal without deferred deletion is not coherent, because deletion depends on removal (deactivation).

*** Bug 1460728 has been marked as a duplicate of this bug. ***

The PR mentioned in comment #2 has been merged into Project Atomic's Docker, in branches docker-1.12.6, docker-1.13.1 and docker-1.13.1-rhel.

With this change the situation has improved: container cleanup is less likely to fail, and if it does fail, the container is not deleted, so the user can try to delete it later (this is better than leaking a DM device).

But I'd still like to minimize the situations in which the auto-removal fails. So far I've identified two situations that trigger the issue:

1. Some container mount points being leaked into the systemd-machined namespace. This looks like a bug in systemd, and I've created a separate bug for it (BZ #1465485). Retrying the call to DeleteDevice will not help, as the DM device will be busy until systemd-machined exits.

2. Device-Mapper losing the race against the docker daemon, which calls devicemapper.DeleteDevice before the deferred removal has completed. I'm still not 100% sure whether this is a simple race or whether someone is keeping the DM device busy, but retrying the deletion always succeeds, so I've created a PR upstream implementing a simple retry loop (https://github.com/moby/moby/pull/33846).

One more PR has been committed upstream and backported in projectatomic/docker. I am hoping that helps with the problem. https://github.com/projectatomic/docker/issues/266

Lokesh, can we do another docker build with the latest commit in the tree?

The latest docker build has the changes which might fix the issue. Give it a try.
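For readers unfamiliar with the retry approach mentioned in situation 2, here is a minimal sketch of a bounded retry loop around a deletion call. It is not the code from the upstream PR: `errDeviceBusy`, `deleteWithRetry`, the attempt count and the delay are all hypothetical names and values chosen for illustration, and the `del` callback merely stands in for a call such as devicemapper.DeleteDevice.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errDeviceBusy is a hypothetical sentinel standing in for the "device busy"
// condition; it is not the real devicemapper error value.
var errDeviceBusy = errors.New("device is busy")

// deleteWithRetry calls del up to maxAttempts times, sleeping for delay
// between attempts. It retries only while del reports errDeviceBusy and
// returns any other error immediately.
func deleteWithRetry(del func() error, maxAttempts int, delay time.Duration) error {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		err = del()
		if err == nil {
			return nil
		}
		if !errors.Is(err, errDeviceBusy) {
			return err
		}
		if attempt < maxAttempts {
			time.Sleep(delay)
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", maxAttempts, err)
}

func main() {
	calls := 0
	// Simulated deletion: busy on the first two calls, then succeeds,
	// mimicking a deferred removal that completes shortly afterwards.
	del := func() error {
		calls++
		if calls < 3 {
			return errDeviceBusy
		}
		return nil
	}
	err := deleteWithRetry(del, 5, 10*time.Millisecond)
	fmt.Println("calls:", calls, "err:", err)
}
```

A loop like this only helps in situation 2, where the device becomes free shortly afterwards. In situation 1 the device stays busy until systemd-machined exits, so the retries would simply be exhausted.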
docker-2:1.12.6-48.git0fdc778