Bug 460875

Summary: Programs run out of threads and die
Product: [Fedora] Fedora
Reporter: Nigel Horne <njh>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED WONTFIX
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Priority: medium
Version: 9
CC: edwin+bugs, marco.crosio
Hardware: i686
OS: Linux
Doc Type: Bug Fix
Last Closed: 2009-07-14 17:06:15 UTC

Description Nigel Horne 2008-09-02 10:09:02 UTC
Description of problem:
Many multi-threaded programs fail on an out-of-the-box FC9 configuration

Version-Release number of selected component (if applicable):
kernel-2.6.25.14-108.fc9.i686

How reproducible:
Every time

Steps to Reproduce:
1. Download and compile http://people.redhat.com/mingo/cfs-scheduler/tools/hackbench.c (you may need to remove the STACK_MIN stuff on lines 234/235 - a Red Hat program that doesn't compile on Fedora!!)
2. Run "a.out 5 thread 1000"
3.
  
Actual results:
Running with 5*40 (== 200) tasks.
pthread_create failed: Resource temporarily unavailable (11)


Expected results:
Should run fine

Additional info:

Comment 1 Török Edwin 2008-09-02 11:09:26 UTC
(In reply to comment #0)
> Description of problem:
> Many multi-threaded programs fail on an out-of-the-box FC9 configuration
> 
> Version-Release number of selected component (if applicable):
> kernel-2.6.25.14-108.fc9.i686
> 
> How reproducible:
> Every time

Same here, with same kernel version.

> 
> Steps to Reproduce:
> 1. Download and compile
> http://people.redhat.com/mingo/cfs-scheduler/tools/hackbench.c (you may need to
> remove the STACK_MIN stuff on lines 234/235 - a Red Hat program that doesn't
> compile on Fedora!!)

Or add #include <limits.h>, which defines PTHREAD_STACK_MIN.

> 2. Run "a.out 5 thread 1000"
> 3.
> 
> Actual results:
> Running with 5*40 (== 200) tasks.
> pthread_create failed: Resource temporarily unavailable (11)

This works for me if I set vm.overcommit_memory to 0. However,
"./a.out 10 thread 1000" fails with the same error message even with overcommit set to 0.

Spawning the same number of processes works, though:

$ ./a.out 10 process 1000
Running with 10*40 (=400) tasks.
Time: 17.111

$ ./a.out 10 thread 1000
Running with 10*40 (=400) tasks.
pthread_create failed: Resource temporarily unavailable (11)

If I set overcommit to 2, '400 processes' still works, but '200 threads' fails too (it was working before).
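
To check which overcommit mode is active during a test, here is a small sketch that reads the sysctl through procfs. The function names are hypothetical. The mode descriptions follow the kernel's documented meanings for 0, 1 and 2.

```c
#include <stdio.h>

/*
 * Read the current vm.overcommit_memory value from procfs.
 * Returns the value found in the file, or -1 if it cannot be read or parsed.
 */
int read_overcommit_mode(void)
{
    FILE *f = fopen("/proc/sys/vm/overcommit_memory", "r");
    int mode = -1;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%d", &mode) != 1)
        mode = -1;
    fclose(f);
    return mode;
}

/* Short description of each documented mode. */
const char *overcommit_mode_name(int mode)
{
    switch (mode) {
    case 0:  return "heuristic overcommit";
    case 1:  return "always overcommit";
    case 2:  return "strict accounting, no overcommit";
    default: return "unknown";
    }
}
```

Printing overcommit_mode_name(read_overcommit_mode()) before each run makes it easier to tell which setting a given result belongs to.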

I have 1G RAM and 964M swap (Nigel has 0.75G RAM):
$ free
             total       used       free     shared    buffers     cached
Mem:       1025548     444832     580716          0      13744     180440
-/+ buffers/cache:     250648     774900
Swap:       987956          0     987956

ulimit -u is set to default of 1024 processes.

FWIW running a 2.6.26 Debian kernel doesn't show this problem (on same hardware):
Linux thunder 2.6.26-1-686 #1 SMP Thu Aug 28 12:00:54 UTC 2008 i686 GNU/Linux

If I set overcommit to 0 or 2 and run with 5, 10 or 20 threads, it all works on Debian with a 2.6.26 kernel (libc is 2.7 there, and the kernel is compiled with gcc 4.1.2, in case that matters).

Comment 2 Chuck Ebbert 2008-09-12 06:04:09 UTC
Can't reproduce until threads is >25 here. Looks like you're running out of file descriptors??

Comment 3 Török Edwin 2008-09-12 06:56:20 UTC
(In reply to comment #2)
> Can't reproduce until threads is >25 here. Looks like you're running out of
> file descriptors??

I had problems with threads 10 (that is, 400 threads), as you can see above.
That is well below the 1024 file descriptor limit.
And Nigel has problems with threads 5 (that is, 200 threads), which is below the limit as well.

Is there a command we could run that tells us what resource pthread_create has run out of?
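
No single command is named in this thread. As a rough sketch, a program can at least print the limits that commonly bound thread creation. pthread_create returns the error number directly, and EAGAIN (11) by itself only says that some resource or system-imposed limit was hit, not which one. The helper names below are hypothetical, and the choice of limits is an assumption about what is relevant here.

```c
#include <stdio.h>
#include <sys/resource.h>

/*
 * Format the soft limit of one resource into buf, e.g. "open files: 1024".
 * Returns 0 on success, -1 if getrlimit() fails.
 */
int format_soft_limit(const char *name, int resource, char *buf, size_t len)
{
    struct rlimit rl;

    if (getrlimit(resource, &rl) != 0)
        return -1;

    if (rl.rlim_cur == RLIM_INFINITY)
        snprintf(buf, len, "%s: unlimited", name);
    else
        snprintf(buf, len, "%s: %llu", name, (unsigned long long)rl.rlim_cur);
    return 0;
}

/* Print the soft limits that most often matter when creating threads. */
void print_thread_limits(void)
{
    static const struct {
        const char *name;
        int resource;
    } limits[] = {
        { "max user processes", RLIMIT_NPROC },
        { "stack size (bytes)", RLIMIT_STACK },
        { "virtual memory (bytes)", RLIMIT_AS },
        { "open files", RLIMIT_NOFILE },
    };
    char line[128];
    size_t i;

    for (i = 0; i < sizeof limits / sizeof limits[0]; i++)
        if (format_soft_limit(limits[i].name, limits[i].resource,
                              line, sizeof line) == 0)
            puts(line);
}
```

Calling print_thread_limits() right after pthread_create fails shows the limits in effect for the failing process itself, which may differ from those of the shell. System-wide settings such as /proc/sys/kernel/threads-max are not covered by this sketch.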

Comment 4 Chuck Ebbert 2008-09-14 02:25:04 UTC
Can you post the output of 'ulimit -a'?

Comment 5 Török Edwin 2008-09-14 07:47:30 UTC
(In reply to comment #4)
> Can you post the output of 'ulimit -a'?

core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 16237
max locked memory       (kbytes, -l) 32
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 1024
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

I am running hackbench with Nigel's suggestion applied, i.e. with the line that sets the stack size #ifdef'ed out (pthread_attr_getstacksize says the default is 10485760).
If I leave that line in and #include <limits.h>, then it doesn't fail.
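
To put the default stack size in perspective, here is a sketch that queries it and computes how much address space the stacks alone would reserve. With the 10485760-byte default reported above, 200 threads reserve 2000 MiB and 400 threads reserve 4000 MiB. Whether this reservation is what actually makes pthread_create fail has not been confirmed in this bug; the function names are hypothetical.

```c
#include <pthread.h>
#include <stddef.h>

/* Default stack size pthread_create() would use, or 0 on error. */
size_t default_thread_stack_size(void)
{
    pthread_attr_t attr;
    size_t size = 0;

    if (pthread_attr_init(&attr) != 0)
        return 0;
    if (pthread_attr_getstacksize(&attr, &size) != 0)
        size = 0;
    pthread_attr_destroy(&attr);
    return size;
}

/* Address space reserved for thread stacks alone, in MiB (guard pages ignored). */
unsigned long long stack_reservation_mib(unsigned long long threads,
                                         unsigned long long stack_bytes)
{
    return threads * stack_bytes / (1024ULL * 1024ULL);
}
```

For example, stack_reservation_mib(400, default_thread_stack_size()) gives the figure for the current environment.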

It is not just hackbench that fails: programs such as clamav-milter fail to create threads too, and they previously worked on FC8.

FWIW on the same hardware if I boot a Debian 2.6.26 kernel, I get 8388608 as default stack size.

Comment 6 Nigel Horne 2008-09-17 12:51:05 UTC
On my system it is

core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 12287
max locked memory       (kbytes, -l) 32
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 4096
cpu time               (seconds, -t) unlimited
max user processes              (-u) 1024
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

Comment 7 Chuck Ebbert 2008-09-22 03:22:46 UTC
(In reply to comment #5)
> I am running the hackbench with Nigel's suggestion to #ifdef out the line that
> sets the stacksize. (pthread_attr_getstacksize says default is 10485760).
> If I leave that line there, and #include <limits.h>, then it doesn't fail.
>

That explains why mine works -- I didn't change anything and it built with no problems.
 
> It is not just hackbench failing, programs such as clamav-milter fail to create
> threads too, and it was previously working on FC8.
>

With which kernel??
 
> FWIW on the same hardware if I boot a Debian 2.6.26 kernel, I get 8388608 as
> default stack size.

Debian kernel on F8, or Debian kernel on Debian install?

Comment 8 Bug Zapper 2009-06-10 02:36:10 UTC
This message is a reminder that Fedora 9 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 9.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '9'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 9's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 9 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 9 Bug Zapper 2009-07-14 17:06:15 UTC
Fedora 9 changed to end-of-life (EOL) status on 2009-07-10. Fedora 9 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.