Bug 1262374 - docker-selinux: Package upgrade taking a long time and using large amounts of resources
Summary: docker-selinux: Package upgrade taking a long time and using large amounts of resources
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: docker
Version: 7.1
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Daniel Walsh
QA Contact: atomic-bugs@redhat.com
URL:
Whiteboard:
Depends On: 1251458
Blocks:
Reported: 2015-09-11 13:51 UTC by Daniel Walsh
Modified: 2019-03-06 00:43 UTC (History)
CC List: 11 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of: 1251458
Environment:
Last Closed: 2016-03-31 23:22:36 UTC
Target Upstream Version:
Embargoed:


Links
System: Red Hat Product Errata
ID: RHBA-2016:0536
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: docker bug fix and enhancement update
Last Updated: 2016-04-01 03:19:56 UTC

Description Daniel Walsh 2015-09-11 13:51:07 UTC
+++ This bug was initially created as a clone of Bug #1251458 +++

Description of problem:

While upgrading the docker-selinux package, a `restorecon` command runs for several minutes and consumes a large amount of CPU and I/O.

I looked at the command line but didn't manage to save it; I did notice that it included /var/lib/docker. That could take a very long time if many images exist, and I suspect it might be problematic if symbolic/hard links are followed outside of a container environment.


Version-Release number of selected component (if applicable):

docker-selinux.x86_64 1.7.1-4.gitcc60fc3.fc22


How reproducible:

Hard to tell, but probably easy to reproduce, since the command seems to be part of the package's upgrade script.


Steps to Reproduce:

1. Upgrade the docker-selinux package using dnf
2. Watch it take forever


Actual results:

Package update took 10+ minutes on a desktop system with a fast SSD. It would probably take much longer if /var/lib/docker were on a slow HDD, or on a laptop.


Expected results:

The package upgrade does not perform blocking, resource-intensive, time-consuming tasks unnecessarily. I also hope it won't happen every time the package is upgraded, which would be very inconvenient.

--- Additional comment from Daniel Walsh on 2015-08-07 07:57:00 EDT ---

Yes, there is not much we can do about it. If the labeling changes, we need to fix the labeling, and if there are a huge number of files, it will take some time.

It should only happen this once. If you did a reinstall of docker-selinux, it should happen much quicker.

--- Additional comment from Daniel Miranda on 2015-08-07 09:15:15 EDT ---

What about a service that runs before the docker service, checks whether a relabelling is needed, and does it instead? That seems much saner than doing this with no warning or choice during package upgrades. It seems even crazier on servers if it means a possible service disruption due to resource exhaustion, with no way for an administrator to predict it.

--- Additional comment from Daniel Walsh on 2015-08-08 05:31:33 EDT ---

This has been the way that SELinux has been updating for more than 10 years.

There should be no resource consumption, since basically it is walking /var/lib/docker and changing context, with a fairly stable tool restorecon/setfiles.

The loading/compiling which takes a lot of memory and CPU is greatly improved in Fedora 23 which should improve the situation.

How many files do you have under /var/lib/docker?

You can also eliminate the relabel of /var/lib/docker by adding /var/lib/docker to /etc/selinux/fixfiles_exclude_dirs.

man fixfiles
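
A minimal sketch of adding the directory to the exclusion list without creating duplicate entries. The helper name add_exclude_dir is hypothetical, and the list path is a parameter so that the function can be tried on a scratch file before touching /etc/selinux/fixfiles_exclude_dirs (which needs root). Note that the list is read by fixfiles; a direct restorecon call would presumably not consult it.

```shell
#!/bin/sh
# add_exclude_dir is a hypothetical helper, not part of fixfiles.
# Usage: add_exclude_dir DIR [LIST]
# LIST defaults to the file that fixfiles reads for excluded directories.
add_exclude_dir() {
    dir=$1
    list=${2:-/etc/selinux/fixfiles_exclude_dirs}
    # -x matches the whole line, -F treats the path as a fixed string.
    if [ -f "$list" ] && grep -qxF -- "$dir" "$list"; then
        return 0
    fi
    printf '%s\n' "$dir" >> "$list"
}

# Dry run against a temporary file rather than the system list.
scratch=$(mktemp)
add_exclude_dir /var/lib/docker "$scratch"
add_exclude_dir /var/lib/docker "$scratch"   # second call adds nothing
cat "$scratch"
```

On a real system the call would be `add_exclude_dir /var/lib/docker`, run as root.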

--- Additional comment from Daniel Miranda on 2015-08-25 20:32:58 EDT ---

I just got another update and I can confirm there was actually a lot of resource consumption. I/O was heavily saturated and about 80% CPU was used. Changing the contexts obviously isn't free: it causes a lot of disk writes and cache thrashing. I understand the need, but it's honestly still quite inconvenient.

My /var/lib/docker folder has 1728485 files totalling 31GB, and this time the relabelling took about 3min.

I don't have a F23 installation right now to test, but it's nice to know work is happening on making things speedier. For now though, I'll probably have to exclude /var/lib/docker from the relabelling and do it myself occasionally.
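
Figures like the file count above can be gathered with standard tools. This is a small sketch; count_tree is a hypothetical helper name, and the demo runs on a scratch directory so it can be tried without root access.

```shell
#!/bin/sh
# count_tree is a hypothetical helper: it prints the number of regular
# files under a directory, staying on one file system (-xdev).
# File names containing newlines would be counted more than once.
count_tree() {
    find "$1" -xdev -type f | wc -l | tr -d ' '
}

# Demo on a scratch directory. On a Docker host one would run, as root:
#   count_tree /var/lib/docker
#   du -sxh /var/lib/docker
demo=$(mktemp -d)
mkdir -p "$demo/a/b"
touch "$demo/f1" "$demo/a/f2" "$demo/a/b/f3"
count_tree "$demo"
```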

--- Additional comment from Daniel Walsh on 2015-08-26 07:34:05 EDT ---

This should be a one-time occurrence, not something that happens regularly. The 80% CPU might be the recompiling of policy.

--- Additional comment from Daniel Miranda on 2015-08-26 12:50:57 EDT ---

As I mentioned, it happened for a second time in less than a month. If it wasn't supposed to happen I can attempt to provide information to help you debug it.

--- Additional comment from Daniel Miranda on 2015-09-11 08:03:08 EDT ---

Just for the record: creating a /etc/selinux/fixfiles_exclude_dirs with a single line of /var/lib/docker had no effect on the relabelling. It happened *again* and is taking 10 minutes *again*.

--- Additional comment from Daniel Walsh on 2015-09-11 08:09:19 EDT ---

Is this happening on a selinux-policy-targeted update?

--- Additional comment from Daniel Miranda on 2015-09-11 08:49:18 EDT ---

Apologies, but I don't think I understood your question (or I can't answer it). What I see is a very long delay while DNF is upgrading docker-selinux. This time when I saw it in the pending upgrade list, I edited /etc/selinux/fixfiles_exclude_dirs, but the upgrade still seemingly did the relabel, as it took a good amount of time.

Thanks for taking the time to help me.

--- Additional comment from Daniel Walsh on 2015-09-11 09:07:34 EDT ---

There are two things that can cause the delay. Every selinux-policy update currently takes over a minute while the policy compiles. This is a SELinux policy compiler problem that we have faced for years; it is finally fixed in Rawhide, but the fix will not be backported to older versions.

You can see this delay by executing:

# semodule -B

The second delay comes from looking for changes in the file_context files between one version of policy and the next. If there is a change, the rpm will do as minimal a recursive fixfiles/restorecon on the difference as possible. This usually should not take that long, unless it ends up doing something like restorecon -R /var or restorecon -R /home.

In this case it walks those parts of the file system and fixes the labels.
fixfiles is supposed to skip any directories listed in the /etc/selinux/fixfiles_exclude_dirs file.

I will attach a patch script that would show what it will exclude.

--- Additional comment from Daniel Walsh on 2015-09-11 09:08:40 EDT ---

This code is cut from the current fixfiles on my machine (Fedora Rawhide).
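
The attachment is not reproduced in this report. As a rough sketch of the kind of logic being described, and not the attached code itself, an exclusion list can be turned into "-e DIR" options as follows. The function name build_exclude_args is hypothetical, and the list path is a parameter so that the sketch can be tried on a scratch file.

```shell
#!/bin/sh
# build_exclude_args is a hypothetical name; this is an illustration,
# not the code from fixfiles. It reads an exclusion list and prints
# one "-e DIR" option per existing, absolute directory entry.
build_exclude_args() {
    list=${1:-/etc/selinux/fixfiles_exclude_dirs}
    args=
    [ -e "$list" ] || return 0
    while IFS= read -r entry; do
        case $entry in
            ''|'#'*) continue ;;   # blank line or comment
            /*) ;;                 # absolute path: keep checking
            *) continue ;;         # relative path: ignore
        esac
        [ -d "$entry" ] || continue   # not an existing directory
        echo "skipping the directory $entry" >&2
        args="$args -e $entry"
    done < "$list"
    printf '%s\n' "$args"
}

# Demo with a scratch list: only the existing absolute directory survives.
demo=$(mktemp -d)
mkdir "$demo/docker"
printf '%s\n' '# comment' '' "$demo/docker" 'relative/path' "$demo/missing" > "$demo/list"
build_exclude_args "$demo/list"
```

The informational message goes to standard error so that only the options appear on standard output, where a caller could capture them.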

--- Additional comment from Daniel Miranda on 2015-09-11 09:27:39 EDT ---

Running `semodule -B` manually takes 13.4 seconds, while the package upgrade takes 10+ minutes. That doesn't look like the issue to me.

You say that doing a restorecon in /var would be bad, but I think even getting into /var/lib/docker would already cause a long delay, since it's quite large on my system (currently 20GB).

Running your script (replacing logit with echo) shows:

$ sh fixfiles.sh 
skipping the directory /var/lib/docker
 -e /var/lib/docker

But looking at the docker spec file, I don't see it using the fixfiles script at all. It calls restorecon manually:

http://pkgs.fedoraproject.org/cgit/docker.git/tree/docker.spec#n73

Maybe that's the issue, and why the exclusion had no effect.

--- Additional comment from Daniel Walsh on 2015-09-11 09:37:21 EDT ---

OK, I thought you were reporting this on a selinux-policy update.

If the docker-selinux package is running restorecon -R /var/lib/docker on every update, that is indeed wrong.

--- Additional comment from Daniel Walsh on 2015-09-11 09:50:38 EDT ---

Lokesh, please patch docker.spec to run restorecon on /var/lib/docker only on initial install. Otherwise, on a large /var/lib/docker, this will be very slow.
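
A minimal sketch of such a guard, assuming the relabel lives in the package's %post scriptlet. rpm passes the number of installed instances of the package to the scriptlet as $1: 1 on initial install, 2 or more on upgrade. The function name post_relabel is hypothetical, the body only prints what it would do instead of calling restorecon, and this is not the patch that was eventually applied to docker.spec.

```shell
#!/bin/sh
# Body of a hypothetical %post scriptlet, wrapped in a function so it
# can be exercised outside rpm. The first argument stands in for the
# $1 that rpm passes to the scriptlet.
post_relabel() {
    instances=$1
    target=${2:-/var/lib/docker}
    if [ "$instances" -eq 1 ]; then
        # Initial install: relabel the tree once.
        # (Dry run: print the command instead of executing it.)
        echo "restorecon -R $target"
    else
        # Upgrade: skip the expensive recursive relabel.
        echo "skip relabel of $target"
    fi
}

post_relabel 1   # initial install
post_relabel 2   # upgrade
```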

Comment 2 Daniel Walsh 2015-10-28 14:16:20 UTC
Fixed in docker-1.9

Comment 4 Luwen Su 2016-02-03 09:19:55 UTC
In docker-1.9.1-15.el7.x86_64, the new patch is in docker.spec.
As I have limited resources to test the install speed, I only checked the code.

Comment 6 errata-xmlrpc 2016-03-31 23:22:36 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0536.html

