Bug 1251458 - docker-selinux: Package upgrade taking a long time and using large amounts of resources
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: docker
Version: 22
Hardware: x86_64
OS: Linux
Target Milestone: ---
Assignee: Lokesh Mandvekar
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: 1262374
Reported: 2015-08-07 10:57 UTC by Daniel Miranda
Modified: 2015-09-28 18:21 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1262374 (view as bug list)
Environment:
Last Closed: 2015-09-28 18:21:56 UTC
Type: Bug
Embargoed:


Attachments
Script to show what fixfiles will exclude on your system (592 bytes, text/plain)
2015-09-11 13:08 UTC, Daniel Walsh
Patch to only run restorecon -R -v /var/lib/docker on initial install (1.28 KB, patch)
2015-09-11 13:50 UTC, Daniel Walsh

Description Daniel Miranda 2015-08-07 10:57:28 UTC
Description of problem:

While upgrading the docker-selinux package, a `restorecon` command is run that takes multiple minutes and consumes a large amount of CPU and I/O.

I looked at the command line and did not manage to save it, but I noticed it included /var/lib/docker. That could take a very long time if many images exist, and I suspect it might be problematic if symbolic or hard links are followed outside of a container environment.


Version-Release number of selected component (if applicable):

docker-selinux.x86_64 1.7.1-4.gitcc60fc3.fc22


How reproducible:

Hard to tell, but probably easily since it seems the command is part of the package's upgrade script.


Steps to Reproduce:

1. Upgrade the docker-selinux package using dnf
2. Watch it take forever


Actual results:

Package update took 10+ minutes on a desktop system with a fast SSD. It would probably take much longer if /var/lib/docker were on a slow HDD, or on a laptop.


Expected results:

Package upgrade does not perform blocking, resource and time-consuming tasks unnecessarily. I also hope it won't happen every time the package is upgraded, which would be very inconvenient.

Comment 1 Daniel Walsh 2015-08-07 11:57:00 UTC
Yes, there is not much we can do about it. If the labeling changes, we need to fix the labels, and if there are a huge number of files, that will take some time.

It should only happen this once. If you did a reinstall of docker-selinux, it should finish much more quickly.

Comment 2 Daniel Miranda 2015-08-07 13:15:15 UTC
What about a service that runs before the docker service, checks whether a relabelling is needed, and does it instead? That seems much saner than doing this with no warning or choice during package upgrades. It seems even crazier on servers if it means a possible service disruption due to resource exhaustion, with no way for an administrator to predict it.

Comment 3 Daniel Walsh 2015-08-08 09:31:33 UTC
This has been the way that SELinux has been updating for more than 10 years.

There should be no resource consumption, since basically it is walking /var/lib/docker and changing context, with a fairly stable tool restorecon/setfiles.

The loading/compiling which takes a lot of memory and CPU is greatly improved in Fedora 23 which should improve the situation.

How many files do you have under /var/lib/docker?

You can also eliminate the relabel of /var/lib/docker by adding /var/lib/docker to /etc/selinux/fixfiles_exclude_dirs.

man fixfiles
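
A minimal sketch of the exclusion step suggested above. The helper name add_exclude_dir is hypothetical and is not part of fixfiles or any SELinux package; it only appends a directory to the exclusion list if that exact line is not already present. The list is read by fixfiles, so this is only expected to help when the relabel is actually driven by fixfiles.

```shell
# Hypothetical helper (an assumption for illustration, not a real SELinux tool):
# append a directory to the fixfiles exclusion list unless it is already listed.
add_exclude_dir() {
    dir=$1
    # Second argument is optional; it defaults to the file named in this comment thread.
    list=${2:-/etc/selinux/fixfiles_exclude_dirs}
    # Create the list if it does not exist yet.
    touch "$list"
    # -x matches whole lines only, -F treats the path as a literal string.
    grep -qxF -- "$dir" "$list" || printf '%s\n' "$dir" >> "$list"
}

# Usage (as root):
#   add_exclude_dir /var/lib/docker
```

Calling it twice with the same directory leaves a single entry in the file.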

Comment 4 Daniel Miranda 2015-08-26 00:32:58 UTC
I just got another update and I can confirm there was actually a lot of resource consumption. I/O was heavily saturated and about 80% CPU was used. Changing the contexts obviously isn't free: it causes a bunch of disk writes and cache thrashing. I understand the need, but it's honestly still quite inconvenient.

My /var/lib/docker folder has 1728485 files totalling 31GB, and this time the relabelling took about 3min.

I don't have a F23 installation right now to test, but it's nice to know work is happening on making things speedier. For now though, I'll probably have to exclude /var/lib/docker from the relabelling and do it myself occasionally.
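
For anyone who wants to report the same figures, here is one way such a count could be gathered. This is a sketch only: count_files is a hypothetical helper name, it counts regular files (a recursive relabel also visits directories and symbolic links, so the real number of visited entries is somewhat higher), and it assumes a find that supports -print0.

```shell
# Hypothetical helper: count regular files below a directory.
count_files() {
    # -xdev keeps find on one file system; NUL-separated output
    # avoids miscounting file names that contain newlines.
    find "$1" -xdev -type f -print0 | tr -dc '\000' | wc -c
}

# Usage (as root, since /var/lib/docker is usually not world-readable):
#   count_files /var/lib/docker
#   du -sh /var/lib/docker
```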

Comment 5 Daniel Walsh 2015-08-26 11:34:05 UTC
This should be a one-time occurrence, not something that happens regularly.  The 80% CPU might be the recompiling of policy.

Comment 6 Daniel Miranda 2015-08-26 16:50:57 UTC
As I mentioned, it happened for a second time in less than a month. If it wasn't supposed to happen I can attempt to provide information to help you debug it.

Comment 7 Daniel Miranda 2015-09-11 12:03:08 UTC
Just for the record: creating a /etc/selinux/fixfiles_exclude_dirs with a single line of /var/lib/docker had no effect on the relabelling. It happened *again* and is taking 10 minutes *again*.

Comment 8 Daniel Walsh 2015-09-11 12:09:19 UTC
Is this happening on a selinux-policy-targeted update?

Comment 9 Daniel Miranda 2015-09-11 12:49:18 UTC
Apologies, but I don't think I understood your question (or I can't answer it). What I see is a very long delay while DNF is upgrading docker-selinux. This time when I saw it in the pending upgrade list, I edited /etc/selinux/fixfiles_exclude_dirs, but the upgrade still seemingly did the relabel, as it took a good amount of time.

Thanks for taking the time to help me.

Comment 10 Daniel Walsh 2015-09-11 13:07:34 UTC
There are two things that can cause the delay. Every selinux-policy update currently takes over a minute while the policy compiles.  This is a SELinux policy compiler problem that we have faced for years; it is finally fixed in Rawhide, but the fix will not be backported to older versions.

You can see this delay by executing:

# semodule -B

The second delay is looking for changes in the file_context files between one version of policy and the next.  If there is a change, then the rpm will run fixfiles/restorecon recursively on as small a part of the file system as possible, based on the difference.  This usually should not take that long, unless it ends up doing something like restorecon -R /var or restorecon -R /home.

In this case it walks those parts of the file system and fixes the labels.
fixfiles is supposed to skip any directories listed in the /etc/selinux/fixfiles_exclude_dirs file.

I will attach a patch script that would show what it will exclude.

Comment 11 Daniel Walsh 2015-09-11 13:08:40 UTC
Created attachment 1072554
Script to show what fixfiles will exclude on your system

This code is cut from the current fixfiles on my machine (Fedora Rawhide).

Comment 12 Daniel Miranda 2015-09-11 13:27:39 UTC
Running `semodule -B` manually takes 13.4 seconds, while the package upgrade takes 10+ minutes. It doesn't look like that is the issue to me.

You say that doing a restorecon on /var would be bad, but I think even descending into /var/lib/docker would already cause a long delay, since it's quite large on my system (currently 20GB).

Running your script (replacing logit with echo) shows:

$ sh fixfiles.sh 
skipping the directory /var/lib/docker
 -e /var/lib/docker

But looking at the docker spec file, I don't see it using the fixfiles script at all. It calls restorecon manually:

http://pkgs.fedoraproject.org/cgit/docker.git/tree/docker.spec#n73

Maybe that's the issue, and why the exclusion had no effect.

Comment 13 Daniel Walsh 2015-09-11 13:37:21 UTC
OK, I thought you were reporting this on a selinux-policy update.

If the docker-selinux package is running restorecon -R /var/lib/docker on every update, that is indeed wrong.

Comment 14 Daniel Walsh 2015-09-11 13:50:38 UTC
Created attachment 1072566
Patch to only run restorecon -R -v /var/lib/docker on initial install

Lokesh, please patch docker.spec to only run restorecon on /var/lib/docker on initial install.  Otherwise, on a large /var/lib/docker, this will be very slow.
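
The attached patch is not reproduced in this thread, so the following is only a sketch of the general technique, not the actual change to docker.spec. It relies on the RPM convention that a %post scriptlet receives the number of installed instances of the package as $1: 1 on a first install, 2 or more on an upgrade. The function name relabel_on_first_install and the RESTORECON override variable are hypothetical and exist only to make the logic easy to try outside of rpm.

```shell
# Sketch of "relabel only on initial install" logic (an assumption about
# the approach, not the contents of the attached patch).
relabel_on_first_install() {
    # In a %post scriptlet this would be "$1" as passed by rpm.
    instances=$1
    target=${2:-/var/lib/docker}
    if [ "$instances" -eq 1 ]; then
        # RESTORECON is a hypothetical override so the command can be stubbed out.
        ${RESTORECON:-restorecon} -R -v "$target"
    fi
}

# Dry run with the real command replaced by echo:
#   RESTORECON=echo relabel_on_first_install 1   # prints the restorecon arguments
#   RESTORECON=echo relabel_on_first_install 2   # prints nothing (upgrade)
```

With a guard like this, an upgrade skips the recursive walk entirely, which is the behavior requested above.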

Comment 15 Daniel Walsh 2015-09-11 13:54:01 UTC
This should go out with the docker-1.8.2 update

Comment 16 Daniel Walsh 2015-09-28 18:21:56 UTC
Fixed in docker-1.8.2

