| Summary: | Handle docker corruption effectively | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Jaspreet Kaur <jkaur> |
| Component: | docker | Assignee: | Vivek Goyal <vgoyal> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | atomic-bugs <atomic-bugs> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 7.2 | CC: | aos-bugs, aweiteka, bvincell, dwalsh, ghelleks, jokerman, lsm5, michael.voegele, mmccomas |
| Target Milestone: | rc | Keywords: | Extras |
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2016-10-18 13:16:06 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |

Description
Jaspreet Kaur
2016-03-19 05:21:47 UTC

This seems to be pointed more at the docker registry than at docker. Couldn't you have just removed some container images, after which it would start working again?

Hello,
No, that doesn't help once it is corrupted. It is also not an effective approach, as it might prevent deployment of applications that need those images for an existing project.
Regards,
Jaspreet

My point is that there are probably a lot of junk images that you don't even know about.

    atomic image prune

This should get rid of dangling images that nothing is using. It would temporarily get you out of this situation and get your containers working again. Being able to expand the disk image would presumably help as well.

Does a reboot solve the problem?

If the thin pool is full, xfs can hang indefinitely. To solve that, one needs to add more storage to the thin pool, and as of now the system needs to be rebooted to get rid of the unkillable IO thread. I think that after a reboot, one can also first try to delete some images, and hopefully that will work. If not, we first need to add more storage to the thin pool, make sure it grows successfully, and only then do further docker operations.

Once you run into this situation, please attach the following:
- journalctl output
- Preferably, run the docker daemon in debug mode (-D)
- Output of the commands "lvs" and "vgs"

We need to document a better way to get out of this state. You need to reboot.

    atomic images purge

If you still need more space after that, list the docker images and see whether there are other images that can be removed. Longer term, we have patches for docker-1.10 that will block docker pull and docker create when the system is 90% used up.

No, I am not saying that this will not happen in docker-1.10. We are just taking steps to make it less likely, giving users the remaining 10% of disk space to figure out that they have a problem. This will block new containers and images from being installed, but it will not prevent existing containers from growing.

Thanks, Daniel, for the information. But if docker gets corrupted after containers grow, there should be an easy way to get it back to a ready state. A reboot will not be an option for any of the users. The only concern is that even if they take precautions and still reach the corrupt state, there should be a resolution for it.

xfs going wild is a kernel issue that we cannot fix. I believe there is a kernel bug filed for it. The only way to fix this with current kernels is to reboot.

Hello Daniel,
Can you please share the kernel bugzilla for this?
Regards,
Jaspreet

Vivek, I could not google up a kernel bugzilla for this. Do you know of any?

Dan, the following is one of the bugs that discussed xfs being full and leading to a hang.
https://bugzilla.redhat.com/show_bug.cgi?id=1240437

Since we are now shipping docker-1.10, I am going to close this as fixed in the current release.
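
To make the 90% figure from the thread concrete, here is a minimal sketch of a check an administrator could run before the thin pool fills up. It is not how docker-1.10 implements its guard; it only parses `lvs` output and flags thin pools at or above a threshold. The function names, the dictionary keys, and the choice to also check metadata usage are assumptions made for illustration.

```python
import subprocess

# Mirrors the 90% figure mentioned in the thread (an assumption for this sketch,
# not a value read from docker's configuration).
THRESHOLD_PERCENT = 90.0

# Ask lvs for machine-friendly output: no header row, "|" as field separator.
LVS_CMD = [
    "lvs", "--noheadings", "--separator", "|",
    "-o", "lv_name,vg_name,lv_attr,data_percent,metadata_percent",
]


def parse_lvs(output):
    """Return one dict per thin pool found in the given lvs output."""
    pools = []
    for line in output.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) != 5:
            continue  # blank or unexpected line
        name, vg, attr, data, meta = fields
        # The first lv_attr character is "t" for a thin pool.
        if not attr.startswith("t") or not data:
            continue
        pools.append({
            "lv": name,
            "vg": vg,
            "data_percent": float(data),
            "metadata_percent": float(meta) if meta else 0.0,
        })
    return pools


def pools_over_threshold(pools, threshold=THRESHOLD_PERCENT):
    """Return the pools whose data or metadata usage is at or above threshold."""
    return [
        pool for pool in pools
        if pool["data_percent"] >= threshold
        or pool["metadata_percent"] >= threshold
    ]


def check_host():
    """Run lvs on this host (usually requires root) and print any full pools."""
    result = subprocess.run(LVS_CMD, check=True, capture_output=True, text=True)
    for pool in pools_over_threshold(parse_lvs(result.stdout)):
        print("thin pool {vg}/{lv}: data {data_percent}%, "
              "metadata {metadata_percent}%".format(**pool))
```

Calling `check_host()` from a cron job or monitoring hook would give a warning while there is still room to prune images or extend the pool, which is the window the thread describes.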