Bug 902483 - Cannot handle 2000 mounts
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: systemd
Version: 17
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: systemd-maint
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Duplicates: 908531
Depends On:
Blocks:
Reported: 2013-01-21 18:56 UTC by Ben Greear
Modified: 2013-03-04 22:33 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-03-04 22:33:21 UTC
Type: Bug
Embargoed:


Description Ben Greear 2013-01-21 18:56:13 UTC
Description of problem:

I have a system with 2000+ mount points.  On shutdown, systemd logs many
errors about having too many open files.  Shutdown can hang after this,
although not always, and I'm not sure whether this is the root cause.

systemd[1]: Failed to kill control group: Too many open files
systemd[1]: Failed to kill control group: Too many open files
systemd[1]: irqbalance.service failed to kill processes: Too many open files
systemd[1]: Failed to kill control group: Too many open files
systemd[1]: irqbalance.service failed to kill processes: Too many open files
systemd[1]: Unit irqbalance.service entered failed state.
systemd[1]: Failed to kill control group: Too many open files
systemd[1]: atd.service failed to kill processes: Too many open files
systemd[1]: Failed to kill control group: Too many open files
systemd[1]: atd.service failed to kill processes: Too many open files
systemd[1]: Unit atd.service entered failed state.
systemd[1]: Failed to kill control group: Too many open files
systemd[1]: chronyd.service failed to kill processes: Too many open files
systemd[1]: Failed to kill control group: Too many open files
systemd[1]: chronyd.service failed to kill processes: Too many open files
systemd[1]: Unit chronyd.service entered failed state.
systemd[1]: Failed to kill control group: Too many open files



Version-Release number of selected component (if applicable):


How reproducible:

The error messages are always reproducible. The hang on shutdown is not,
and may or may not be related to systemd.


Steps to Reproduce:
1. Create 2000 NFS mounts (probably other mounts would work as well)
2. 'reboot'
3. Watch console output for the errors.
  
Actual results:

Lots of 'too many open files' error messages.


Expected results:

Clean shutdown.

Additional info:

Comment 1 Jóhann B. Guðmundsson 2013-01-21 19:39:47 UTC
Have you increased the default file descriptor limit in units/system.conf to handle all these mount points? 

Also see...

http://en.usenet.digipedia.org/thread/18978/19676/

Comment 2 Michal Schmidt 2013-01-22 12:40:13 UTC
This may be fixed by:

commit 4096d6f5879aef73e20dd7b62a01f447629945b0
Author: Lennart Poettering <lennart>
Date:   Mon Sep 17 16:35:59 2012 +0200

    main: bump up RLIMIT_NOFILE for systemd itself

... which is not in F17 however.
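
For illustration only, the general technique named in that commit subject (a process raising its own RLIMIT_NOFILE) might look like the sketch below. This is not the systemd patch, which is C code inside systemd itself. It is a minimal Python sketch using the standard `resource` module. The helper name `bump_nofile` is hypothetical, and the 65536 default is an assumption based on the "64k" figure mentioned in this thread.

```python
import resource


def bump_nofile(target=65536):
    """Raise this process's soft RLIMIT_NOFILE toward `target`.

    An unprivileged process may raise its soft limit only as far as the
    hard limit, so the request is clamped to the current hard limit.
    Returns the (soft, hard) pair in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard == resource.RLIM_INFINITY:
        new_soft = target
    else:
        new_soft = min(target, hard)
    # Only ever raise the soft limit; leave it alone if it is already
    # unlimited or already at least as large as the requested value.
    if soft != resource.RLIM_INFINITY and new_soft > soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    print("RLIMIT_NOFILE (soft, hard):", bump_nofile())
```

Raising the hard limit as well would need privileges (CAP_SYS_RESOURCE on Linux), which PID 1 normally has and this sketch does not assume.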

Comment 3 Ben Greear 2013-01-22 18:13:41 UTC
I tried editing the file:

/etc/systemd/system.conf

and changed the setting as below:

DefaultLimitNOFILE=6000


It still complained and hung on reboot with 3000 mounts.
I have now raised it to 12000 and will try again.  But maybe the systemd
code needs better logic to deal with lots of open files, and/or better
recovery logic if it does hit an error?

Even if it can't bring things down gracefully, it would be nice if
at least it could manage a reboot...

Comment 4 Michal Schmidt 2013-01-22 19:25:34 UTC
DefaultLimitNOFILE has no effect on systemd itself. It's a setting for the services it spawns.
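
One way to confirm which limit PID 1 is actually running with is to read its "Max open files" line from /proc. The following is a rough sketch, assuming a Linux /proc filesystem with the usual /proc/<pid>/limits layout. The function names are hypothetical, and reading /proc/1/limits may require root on some systems.

```python
def parse_nofile_limits(limits_text):
    """Return (soft, hard) for 'Max open files' from /proc/<pid>/limits text.

    Each value is an int, or None where the kernel reports 'unlimited'.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Fields: 'Max' 'open' 'files' <soft> <hard> 'files'
            soft, hard = line.split()[3:5]
            return tuple(None if v == "unlimited" else int(v) for v in (soft, hard))
    raise ValueError("no 'Max open files' line found")


def nofile_limits_of(pid=1):
    """Read the open-file limits of a running process (PID 1 by default)."""
    with open(f"/proc/{pid}/limits") as f:
        return parse_nofile_limits(f.read())


if __name__ == "__main__":
    print("PID 1 open-file limits (soft, hard):", nofile_limits_of(1))
```

Running the same check against the main PID of a spawned service would show whether DefaultLimitNOFILE was applied to it, while the PID 1 values stay unchanged.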

We really need to backport that patch.

We can also consider economizing systemd's fd usage. For example, timerfds: we could keep a tree of timeouts and always schedule only the earliest one using a single timerfd.
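
The single-timer idea can be sketched roughly as follows. This is a hypothetical illustration, not systemd code: the class `TimeoutQueue` and its methods are made up for the example, a binary heap stands in for the "tree", and the actual arming of a timerfd (timerfd_settime() in C) is only indicated in a comment so the sketch stays portable.

```python
import heapq
import itertools


class TimeoutQueue:
    """Track many timeouts so that only the earliest needs a kernel timer."""

    def __init__(self):
        self._heap = []                # entries: (deadline, seq, callback)
        self._seq = itertools.count()  # tie-breaker; callbacks are never compared

    def add(self, deadline, callback):
        """Register a callback to run once `deadline` has passed."""
        heapq.heappush(self._heap, (deadline, next(self._seq), callback))
        # A real event loop would re-arm its one timerfd here whenever
        # next_deadline() changed.

    def next_deadline(self):
        """Deadline the single timer should be armed for, or None if idle."""
        return self._heap[0][0] if self._heap else None

    def run_expired(self, now):
        """Run every callback whose deadline is <= now; return how many ran."""
        ran = 0
        while self._heap and self._heap[0][0] <= now:
            _, _, callback = heapq.heappop(self._heap)
            callback()
            ran += 1
        return ran


if __name__ == "__main__":
    q = TimeoutQueue()
    q.add(20.0, lambda: print("second"))
    q.add(10.0, lambda: print("first"))
    print("arm timer for:", q.next_deadline())
    q.run_expired(now=25.0)
```

With this shape, the number of file descriptors used for timeouts stays at one regardless of how many units have pending timeouts, at the cost of O(log n) work per insertion and expiry.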

Comment 5 Ben Greear 2013-01-22 19:36:34 UTC
That patch in comment #2 still hard-codes things (though 64k is
big enough for anyone! (tm))

Maybe instead make it configurable in the system.conf file,
with a 64k default?

If someone can cook up a patched RPM for 64-bit Fedora 17,
I'll be happy to test it.

Comment 6 Lennart Poettering 2013-02-14 18:22:11 UTC
*** Bug 908531 has been marked as a duplicate of this bug. ***

Comment 7 Joe Miller 2013-02-14 20:25:33 UTC
We would love to see this backported to F17, which was our reason for opening the dupe bug 908531 (it was more of a request than a bug report).

Ben, we have backported this to F17 RPMs because it is very important to us. You can build the RPMs yourself from this repo:  https://github.com/pantheon-systems/systemd


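# Install the build tooling and systemd's build dependencies
sudo yum install -y yum-utils rpm-build spectool
sudo yum-builddep systemd
# Fetch the backported sources and switch to the F17 branch
git clone https://github.com/pantheon-systems/systemd.git
cd systemd/
git checkout f17
cd ..
# Prepare the rpmbuild tree, download the upstream sources, and build
mkdir -p ~/rpmbuild/{BUILD,RPMS,SOURCES,SPECS,SRPMS}
spectool --get-files --sourcedir systemd/systemd.spec
cp systemd/* ~/rpmbuild/SOURCES/
rpmbuild -ba systemd/systemd.spec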

Comment 8 Fedora Update System 2013-02-15 10:15:33 UTC
systemd-44-24.fc17 has been submitted as an update for Fedora 17.
https://admin.fedoraproject.org/updates/systemd-44-24.fc17

Comment 9 Fedora Update System 2013-02-16 01:19:25 UTC
Package systemd-44-24.fc17:
* should fix your issue,
* was pushed to the Fedora 17 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing systemd-44-24.fc17'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2013-2564/systemd-44-24.fc17
then log in and leave karma (feedback).

Comment 10 Fedora Update System 2013-03-04 22:33:24 UTC
systemd-44-24.fc17 has been pushed to the Fedora 17 stable repository.  If problems still persist, please make note of it in this bug report.

