Bug 919793 - /etc/security/limits.d/90-nproc.conf enforces ancient ulimit of 1024 nproc's and overrides /etc/security/limits.conf custom settings
Status: CLOSED WONTFIX
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: pam
Version: 6.3
Hardware: All Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Tomas Mraz
QA Contact: Dalibor Pospíšil
Duplicates: 1298576
Depends On:
Blocks: 1002711
Reported: 2013-03-10 01:05 EST by james.greene
Modified: 2016-01-14 20:47 EST
Doc Type: Bug Fix
Last Closed: 2015-09-18 09:54:14 EDT
Type: Bug


Description james.greene 2013-03-10 01:05:08 EST
Description of problem:

/etc/security/limits.d/90-nproc.conf was introduced in pam package in RH6. It contains this line:

# Default limit for number of user's processes to prevent
# accidental fork bombs.
# See rhbz #432903 for reasoning.

*          soft    nproc     1024


This line overrides the conventionally set /etc/security/limits.conf value of the same name. Years of expected behavior are thrown out the window, and honest system administrators are exposed to outages on Red Hat Enterprise Linux 6!

BZ 432903 states many contradictory things in 2008:

"The obvious argument against this is that it doesn't prevent a malicious user
from raising their soft limit and forkbombing the system."

And:

"For typical desktop use, 256 would probably be fine, but it might annoy
power-users, and cause problems for highly-threaded java servlets that aren't so
highly-threaded to warrant limits.conf configuration currently."


The original Bugzilla starts showing some rebellion in 2009, and in 2011, 2012, and 2013, various users complain about their outages when using the previously-working /etc/security/limits.conf settings.

The entire methodology of migrating widely-impacting config settings into separate subdirectory files is suspect and open to abuse by any package maintainer.

An earlier model for this (/etc/modprobe.d) does not have the same logic -- because values for a package will be specific to that particular module. Setting module parameters for a foreign module would be considered rude at best.

But different package maintainers will have different philosophical and practical considerations in adjusting ulimits for all or some users. There is much room for conflict there!

And thus, "* soft nproc 1024" is just the first conflict.


Version-Release number of selected component (if applicable):

Red Hat Enterprise Linux 6, all versions. Tested on 6.1 through 6.3 at least.

How reproducible:

100%

Steps to Reproduce:
1. vi /etc/security/limits.conf and put in desired value for nproc. Verify settings:
# grep '^[^#]*soft.*nproc' /etc/security/limits.conf /etc/security/limits.d/90-nproc.conf
/etc/security/limits.conf:*             soft    nproc   16384
/etc/security/limits.d/90-nproc.conf:*          soft    nproc     1024

2. Re-login and ulimit -Su:
# ulimit -Su
1024

3. vi /etc/security/limits.d/90-nproc.conf and comment out the line:
# grep '^[^#]*soft.*nproc' /etc/security/limits.conf /etc/security/limits.d/90-nproc.conf
/etc/security/limits.conf:*             soft    nproc   16384

4. Re-login and ulimit -Su:
# ulimit -Su
16384

Actual results:

# ulimit -Su
Returns the value in /etc/security/limits.d/90-nproc.conf and the default package install overrides the custom (and customary) value in /etc/security/limits.conf.

Expected results:

/etc/security/limits.conf is the conventional location for ulimit settings.

Creating /etc/security/limits.d/90-nproc.conf and similar files creates a "who is the master file?" conundrum.

Meaning, if you have a value in each of these files, who is the real master? What about other files in /etc/security/limits.d?

Each side of the argument has merit, but the convention is that limits.conf is the master config location. With RH6, this is no longer the case, and package maintainers are free to abuse the new convention by dropping in their own philosophical values.

In this case, 90-nproc.conf has a default value placed in by the pam rpm, and that value is taken as the overriding value, obviating years of experience.

This results in outages. Perhaps only in the upper 5% of use cases, but is that where we want to concentrate avoidable failures?


Additional info:

Invisibly overriding conventional default configuration locations is always a bad engineering practice. The only way a normal sysadmin will know about this setting is after their first, second, or third outage. Presumably they lost their job if there was a fourth outage.

Additional problem: The default pam package's /etc/security/limits.conf does not even reference the new /etc/security/limits.d/ structure in its comments -- missing a critical "heads-up" mechanism to sysadmins everywhere.

Additional problem: The /usr/share/doc/pam-1.1.1/txts/README.pam_limits file references the /etc/security/limits.d/ structure, but does not make plain the consequences of a default 90-nproc.conf file: it will override your normal limits.conf setting. This paragraph is too coy to be comprehended -- only the most paranoid will understand the trap of the 90-nproc.conf file:

"By default limits are taken from the /etc/security/limits.conf config file.
Then individual *.conf files from the /etc/security/limits.d/ directory are
read. The files are parsed one after another in the order of "C" locale. The
effect of the individual files is the same as if all the files were
concatenated together in the order of parsing. If a config file is explicitly
specified with a module option then the files in the above directory are not
parsed."  [snooze]
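
The parse order quoted above can be sketched in shell. This is only an approximation under stated assumptions: the function name effective_wildcard_soft_nproc is made up for illustration, it looks only at wildcard ("*") entries for nproc, and it ignores user- and group-specific entries, which the real pam_limits module weighs differently. It concatenates limits.conf and then the limits.d/*.conf files in "C" locale order and reports the last soft value seen:

```shell
# Hypothetical helper, not part of pam: emulate "limits.conf first, then
# limits.d/*.conf in C locale order" for wildcard soft nproc entries only.
effective_wildcard_soft_nproc() {
    conf=$1    # path to a limits.conf-style file
    dir=$2     # path to a limits.d-style directory
    (
        # Force bytewise glob ordering, matching the "C" locale in the README.
        LC_ALL=C
        export LC_ALL
        cat "$conf" "$dir"/*.conf 2>/dev/null
    ) | awk '
        # A type of "-" sets both soft and hard, so it counts as a soft entry too.
        $1 == "*" && ($2 == "soft" || $2 == "-") && $3 == "nproc" { value = $4 }
        END { if (value != "") print value }
    '
}

# Example call against the real locations (read-only):
#   effective_wildcard_soft_nproc /etc/security/limits.conf /etc/security/limits.d
```

Because the last matching line wins in this model, any file in limits.d that sorts after everything else decides the result, whatever limits.conf says.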

Additional problem: In standard RPM config, there is the macro %config(noreplace), which has the effect of "don't blow away folks' configs even if we have better stuff". By duplicating a config value out of the normal location into somewhere else, you have circumvented the %config vs %config(noreplace) logic. A new macro would logically need to preserve the desirable "don't blow away folks' configs" ethos: %config(don'toverridethisotherfile,"/etc/security/limits.conf"). I don't see this happening.

Final problem: This creates a mechanism whereby random package maintainers have an easy drop-in method to override local /etc/security/limits.conf settings, aside from %{POSTIN}. All RH6 sysadmins must now always defensively manage /etc/security/limits.d, and ANY RPM CAN DROP A * VALUE THERE. As an example, look at the rhevm rpm adding /etc/security/limits.d/10-ovirt-engine.conf:

ovirt - nofile 65535
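
For the defensive management mentioned above, a small audit along these lines might help. It is a sketch only: list_wildcard_limits is a hypothetical name, and it simply prints every uncommented entry whose domain is "*" from the *.conf files in a limits.d-style directory, so user-specific drop-ins such as the ovirt line are not reported:

```shell
# Hypothetical audit helper: show active wildcard ("*") entries per drop-in file.
list_wildcard_limits() {
    dir=$1
    for f in "$dir"/*.conf; do
        [ -f "$f" ] || continue    # skip the unexpanded glob when nothing matches
        # Commented lines start with "#", so their first field is never "*".
        awk -v name="${f##*/}" '$1 == "*" { print name ": " $2, $3, $4 }' "$f"
    done
}

# Example call (read-only):
#   list_wildcard_limits /etc/security/limits.d
```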

Different packages will have different priorities. As referenced in the original BZ 432903, the "make" maintainer may take to heart the comments about "make -j" as a threat to proper system administration, and add 95-make-nproc.conf to the make rpm with another value mentioned in BZ 432903:

*             soft    nproc   256

Think of the outages!

Meanwhile, the maintainer of httpd might see the comments from BZ 432903 from 2011:

"There should be some documentation about this setting for Apache admins going nuts about their MaxClients setting being ignored!"

and then the httpd maintainer says, "I must save my people, er, rpm users!" and puts in 99-httpd-nproc.conf:

*             soft    nproc   106496

Thus, two different package maintainers each can read the same BZ and come up with two mutually-exclusive values to put in. They are both valid from their own packages' standpoint, but they are at odds with each other. Shall we debate whether all the ulimits truly address resource limit exhaustion cases?

But, what is the logic of who wins the /etc/security/limits.d/ race?

Numeric/alpha ordering. A determined and phonebook-savvy package maintainer would then always create his/her limits.d file as: 99-zzzzzzzzzz.conf. And maybe put %config(noreplace) on it.
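
To make the ordering race concrete, here is a minimal sketch that sorts the file names mentioned in this report bytewise ("C" locale) and prints the one that would be parsed last. The helper name last_parsed is an assumption for illustration, and the make and httpd files are the hypothetical examples from above, not real packages' files:

```shell
# Hypothetical helper: given file names, print the one that sorts last
# in "C" locale order, i.e. the one whose entries would be read last.
last_parsed() {
    printf '%s\n' "$@" | LC_ALL=C sort | tail -n 1
}

last_parsed 10-ovirt-engine.conf 90-nproc.conf 95-make-nproc.conf \
            99-httpd-nproc.conf 99-zzzzzzzzzz.conf
# prints: 99-zzzzzzzzzz.conf
```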

A better solution would be to isolate the values by their true linear breakdowns:

/etc/limits.d/[user]/[soft|hard]/[ulimit].conf

So, the ovirt value would be set up in these two files:

/etc/limits.d/ovirt/soft/nofile.conf
/etc/limits.d/ovirt/hard/nofile.conf

Each file contains only the value:

65535

Should comments be a good idea? Hmmm...

But what to do with the really protected stuff (*)? It would be in this directory:

/etc/limits.d/*/soft/
/etc/limits.d/*/hard/

By "*" I mean really, a directory named "*".
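
As a rough illustration of how a lookup against this proposed layout might work: nothing below exists in pam. The function read_proposed_limit, its argument order, and the "fall back to the literal * directory" rule are all assumptions made for this sketch:

```shell
# Hypothetical reader for the proposed layout:
#   <root>/<user>/<soft|hard>/<item>.conf, each file holding only the value.
# Falls back to the directory literally named "*" when the user has no entry.
read_proposed_limit() {
    root=$1; user=$2; kind=$3; item=$4
    for domain in "$user" "*"; do
        f="$root/$domain/$kind/$item.conf"   # "*" is quoted, so no globbing happens
        if [ -f "$f" ]; then
            cat "$f"
            return 0
        fi
    done
    return 1    # no value configured for this user or for "*"
}

# Example call, assuming the layout lived under /etc/limits.d:
#   read_proposed_limit /etc/limits.d ovirt soft nofile
```

With one value per file, two packages can only collide if they ship the very same path, which a package manager would flag as a file conflict at install time.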

This could get ugly, when one considers netgroup entries (@admin) -- but then again, it might be an easier way to manage netgroups as well, just leave that to the end users...
Comment 1 james.greene 2013-03-10 11:16:30 EDT
I came up with a consistent fix for my point that a sysadmin has to defend against new settings popping up in /etc/security/limits.d:

cd /etc/security/limits.d/
ln -s ../limits.conf 99-zzzzzzzzzzzzzz.conf

This way, when all the files are stacked, the limits.conf file is re-examined last of all files in limits.d, due to the symlink with exaggerated alphanumeric listing.
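
If this workaround were scripted (for example from configuration management), an idempotent variant might look like the sketch below. The function name pin_limits_conf_last is made up, and it assumes limits.conf sits one directory above the limits.d directory, as it does under /etc/security:

```shell
# Hypothetical wrapper around the symlink workaround; safe to run repeatedly.
pin_limits_conf_last() {
    dir=$1                                   # a limits.d-style directory
    link="$dir/99-zzzzzzzzzzzzzz.conf"
    # Create the relative symlink only if it is not already there.
    [ -L "$link" ] || ln -s ../limits.conf "$link"
}

# Example call (needs root on a real system):
#   pin_limits_conf_last /etc/security/limits.d
```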

Aside from my other points, can you put this symlink into a base package to protect the original intent of limits.conf being the real config location?
Comment 2 Tomas Mraz 2013-03-12 07:12:20 EDT
I mostly agree with you that the ordering should have been done the other way, that is, the limits.conf contents should override the settings in the files in the limits.d directory. On the other hand, the upstream pam_limits module behaves this way, and we cannot change this in a released version of Red Hat Enterprise Linux, as we could break configurations that depend on it.

We can document your workaround in the manual pages and the limits.conf file.
Comment 3 james.greene 2013-03-23 18:46:45 EDT
Tomas, I understand your statement that this bug is now in the current release and upstream (I see for several years). However, the expected behavior is that /etc/security/limits.conf is the authoritative configuration file.

The bug behavior breaks the user experience, and will lead to continued outages for future generations. And it is only slowly being seen to cause issues because very few packagers (exception: pam's 90-nproc.conf) have seen fit to use it.

To address the statement, "we could break configurations that depend on this", consider the likely impact -- if you revert to original behavior and a user puts in limits.conf a value that overrides the few examples of packagers' limits.d/*.conf files, then who is to blame?

The user is to blame -- with any config file, you are free to set your own values (as long as they are valid) and cause yourself problems.
Comment 4 Jeff Lightner 2013-07-01 15:12:35 EDT
I just ran into this on RHEL 6.4 and agree that it is annoying that this seems to have silently changed in RHEL 6.

It would really help if the messages said not just that a resource had been exceeded but specifically WHICH resource (e.g. nproc, nofile, or some other) had been exceeded.

Failing that, and in line with other posts here, why not at least put a comment in the default limits.conf file that highlights it:
"******NOTE:  These values may be overridden by /etc/security/limits.d/* files.*********"

Related command line symptoms that I saw before the secure.log:
On doing su - <user> (or sudo su - <user>) I saw initially:
su: cannot set user id: Resource temporarily unavailable

Later I saw:
-bash: fork: retry: Resource temporarily unavailable

I guess the fork error should have been my clue it was an nproc limit but that wasn't what I saw initially.
Comment 5 RHEL Product and Program Management 2013-10-14 00:07:45 EDT
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.
Comment 6 Mimmus 2013-12-18 10:40:03 EST
Serious annoyance for me too :(
Half a day struggling with Oracle and this limit.
Comment 7 Jeremy Nickurak 2014-02-21 15:30:07 EST
Chrome/chromium can *easily* have a few hundred threads just by itself.

Without anything else doing crazy multi-process:

$ ps xH | wc -l
490

Do anything interesting on top of that, and it's pretty easy to hit the current limit.
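
A quick way to see how close a user is to the limit is to compare the thread count with the soft nproc value. The sketch below only formats that comparison. The name nproc_usage_report is hypothetical, and the commented example call assumes bash's ulimit -Su and a procps ps that supports -L and -o lwp=:

```shell
# Hypothetical helper: report thread usage against a soft nproc limit.
nproc_usage_report() {
    limit=$1    # e.g. the output of: ulimit -Su
    used=$2     # e.g. the user's current thread count
    if [ "$limit" = "unlimited" ]; then
        echo "nproc: $used used, no soft limit"
    else
        echo "nproc: $used of $limit used, $((limit - used)) left"
    fi
}

# With the numbers from this report (limit 1024, 490 threads):
nproc_usage_report 1024 490
# prints: nproc: 490 of 1024 used, 534 left

# Example call for the current user on a live system:
#   nproc_usage_report "$(ulimit -Su)" "$(ps -L -u "$(id -un)" -o lwp= | wc -l)"
```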
Comment 8 Christina Plummer 2014-07-29 14:38:52 EDT
+1 to everyone saying that this is unexpected behavior for admins/software accustomed to configuring limits.conf - we had no idea that the 90-nproc.conf file existed.  This just caused an issue for us in RHEL 6.5 running WebLogic.
Comment 13 Chris Williams 2015-09-18 09:54:14 EDT
This Bugzilla has been reviewed by Red Hat and is not planned on being addressed in Red Hat Enterprise Linux 6 and therefore will be closed. If this bug is critical to production systems, please contact your Red Hat support representative and provide sufficient business justification.
Comment 14 ihavenoemail 2016-01-14 08:22:51 EST
Why the hell is there such a file? Any good reason?
Comment 15 Tomas Mraz 2016-01-14 09:28:22 EST
*** Bug 1298576 has been marked as a duplicate of this bug. ***
Comment 16 Jeff Lightner 2016-01-14 09:59:16 EST
(In reply to ihavenoemail from comment #14)
> Why the hell is there such a file? Any good reason?

The conf.d exists as it allows one to do individual settings for variables that might be complex (e.g. you've got cases where you're setting up a different value for each of multiple users).   Having separate files allows you to focus only on the value you're interested in rather than having to look at all other values which may also have separate settings for different users.

This is actually a common thing in RHEL, where they will have a standard conf file and a conf.d subdirectory that allows one to separate items.  (Have a look at /etc/xinetd.conf vs /etc/xinetd.d for example.)

My complaint was never that it exists but that they didn't bother to put a note in the main conf file to tell you it exists at the point they first began using it.
Comment 17 ihavenoemail 2016-01-14 20:47:42 EST
(In reply to Jeff Lightner from comment #16)
> (In reply to ihavenoemail from comment #14)
> > Why the hell is there such a file? Any good reason?
> 
> The conf.d exists as it allows one to do individual settings for variables
> that might be complex (e.g. you've got cases where you're setting up a
> different value for each of multiple users).   Having separate files allows
> you to focus only on the value you're interested in rather than having to
> look at all other values which may also have separate settings for different
> users.
> 
> This is actually a common thing in RHEL where they will have standard conf
> file and a conf.d subdirectory that allows one to separate items.  (Have a
> look at /etc/xinetd.conf vs /etc/xinetd.d for example.)
> 
> My complaint was never that it exists but that they didn't bother to put a
> note in the main conf file to tell you it exists at the point they first
> began using it.

I was so furious yesterday because this file works in a sneaky way.
