Bug 72896

Summary: squid does not support >1024 file descriptors
Product: Red Hat Enterprise Linux 3
Reporter: Jason Duerstock <jason>
Component: squid
Assignee: Martin Stransky <stransky>
Status: CLOSED ERRATA
Severity: high
Priority: medium
Version: 3.0
CC: i, jameskwh, mgalgoci, tao, zing
Hardware: i386   
OS: Linux   
Fixed In Version: RHBA-2006-0322
Doc Type: Bug Fix
Last Closed: 2006-08-08 18:53:08 UTC
Bug Blocks: 181408    
Attachments:
Description                                                                 Flags
Adds file descriptor limit override argument of "-O [fd limit]" to squid    none
Proposed patch for this issue                                               none

Description Jason Duerstock 2002-08-28 20:41:27 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.1) Gecko/20020826

Description of problem:
In my cache.log file, I get MANY "WARNING! Your cache is running out of
filedescriptors" messages when my cache is heavily used.  Despite adjusting the
FD limit with 'ulimit -n' and several attempts to modify the source to behave
'correctly' (ulimit -n must be set before rebuilding the RPM for the configure
script to notice), squid still starts up with "With 1024 file descriptors
available" in the cache.log.
How can I make squid use more than 1024 file descriptors?

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. install squid
2. load the server with > 512 requests
3. watch cache.log
	

Actual Results:  I got several warnings and cache performance was inadequate

Expected Results:  it should happily accept >1024 file descriptors

Additional info:

I run the web cache for my university's network, and it is not performing
acceptably due to this bug.

Comment 1 Jason Duerstock 2002-09-04 03:07:17 UTC
The problem can be remedied with the following instructions:
1) add the following patch to the RPM:
--- squid-2.4.STABLE6/src/main.c        Sat May 19 20:09:59 2001
+++ /root/main.c        Tue Sep  3 14:34:42 2002
@@ -574,8 +574,8 @@
     mode_t oldmask;
 
     debug_log = stderr;
-    if (FD_SETSIZE < Squid_MaxFD)
-       Squid_MaxFD = FD_SETSIZE;
+    /* if (FD_SETSIZE < Squid_MaxFD)
+       Squid_MaxFD = FD_SETSIZE; */
 
     /* call mallopt() before anything else */
 #if HAVE_MALLOPT

2) remove the '--enable-delay-pools' option from the '%configure' statement.
3) ulimit -n 8192
4) rebuild RPM

This will allow squid to use >1024 file descriptors.  '--enable-delay-pools' has
to be disabled because the delay pools code uses select(), which cannot handle
descriptors numbered FD_SETSIZE or higher.
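
For illustration, here is a minimal C sketch of the clamp that the patch above comments out, and of the reason select() cannot simply go higher. The helper names clamp_to_fd_setsize() and fits_in_fd_set() are hypothetical and do not appear in the squid sources.

```c
#include <assert.h>
#include <sys/select.h>

/* Hypothetical helper: mirrors the check the patch comments out.
 * Returns the descriptor limit after clamping it to FD_SETSIZE. */
int clamp_to_fd_setsize(int max_fd)
{
    assert(max_fd >= 0);
    if (FD_SETSIZE < max_fd)
        max_fd = FD_SETSIZE;
    return max_fd;
}

/* Hypothetical helper: an fd_set only has room for descriptors
 * 0 .. FD_SETSIZE-1, so FD_SET() on anything larger is undefined. */
int fits_in_fd_set(int fd)
{
    return fd >= 0 && fd < FD_SETSIZE;
}
```

With the usual glibc value of FD_SETSIZE (1024), clamp_to_fd_setsize(8192) returns 1024, which matches the "With 1024 file descriptors available" message in the original report.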

Comment 2 Jason Duerstock 2002-09-04 03:08:35 UTC
Oops.  I neglected to mention

ulimit -n 8192

needs to be added to /etc/sysconfig/squid to make sure the file descriptors are
set correctly before the daemon is started.

Comment 3 Need Real Name 2002-09-10 05:44:47 UTC
Created attachment 75624 [details]
Adds file descriptor limit override argument of "-O [fd limit]" to squid

Comment 4 Need Real Name 2002-09-10 07:02:53 UTC
  I agree with Jason that the 1024 hard limit built into the Red Hat squid
binaries is problematic under moderate to heavy use.  In that situation, not
only must the RPM be rebuilt, but RHN users should also add squid to the
pkgSkipList to ensure that the binary isn't replaced by a newer version with the
1024 hard limit put back.  Thus, the problem not only limits the usefulness of
the current squid package but also potentially limits the usefulness of RHN for
ongoing maintenance of squid.  The end result is that the current Squid code
works better under Gentoo's compile-everything (including updates) philosophy
than as pre-rolled binaries.

  Jason's solution exchanges the old 1024 limit for a new hard limit of 8192
and breaks delay pools.  While this will probably be much more useful under
moderate loads than the 1024 limit, I dislike the solution of yet another hard
limit compiled into the binary.  The value of 8192 may cause squid to use
excessive memory on systems that do not need the higher value (this problem
would become clearer if someone attempted to create a cheap embedded or
limited-RAM caching "toaster" based on the future package).  Also, while a 2GHz
machine with 4GB of memory should provide enough resources to do caching for a
Class B network/64K nodes, the need to rebuild/recompile may come up again to
raise the limit beyond 8192 as resources and the number of clients grow.
Lastly, the use of select() for delay pools should not be an issue.  It is an
issue in Jason's solution because the Squid developers' (sloppy?) code uses both
a SQUID_MAXFD preprocessor macro and a Squid_MaxFD global variable in
comm_select.c, and the solution does not keep them in sync.

  The squid-2.4.STABLE7-fdoverride.patch attempts to accomplish the following:

  1) Reduce the code's use of the preprocessor macro SQUID_MAXFD in favor of the
global variable Squid_MaxFD (--enable-delay-pools appears to still work)
  2) Add a new file descriptor override argument of "-O [fd value]" to squid to
allow for a runtime modification to Squid_MaxFD

The patch can be added to the squid.spec file as follows:

--- squid.spec-orig     Tue Sep 10 01:15:03 2002
+++ squid.spec  Tue Sep 10 00:07:58 2002
@@ -15,6 +15,7 @@
 Patch2: squid-perlpath.patch
 Patch3: squid-location.patch
 Patch10: squid-2.4.STABLE7-msntauth.patch
+Patch11: squid-2.4.STABLE7-fdoverride.patch
 BuildRoot: %{_tmppath}/%{name}-%{version}-root
 Prereq: /sbin/chkconfig logrotate shadow-utils
 Requires: bash >= 2.0
@@ -40,6 +41,7 @@
 %patch2 -p1 -b .perlpath
 %patch3 -p1
 %patch10 -p1
+%patch11 -p1
 
 %build
 %configure \


Here is an example of how to take advantage of a patched version of the binary
by modification of the /etc/rc.d/init.d/squid script:

--- squid-orig  Tue Sep 10 00:22:03 2002
+++ squid       Tue Sep 10 00:18:12 2002
@@ -34,6 +34,7 @@
 # don't raise an error ifthe config file is incomplete 
 # set defaults instead:
 SQUID_OPTS=${SQUID_OPTS:-"-D"}
+SQUID_FDNUM=32768
 SQUID_PIDFILE_TIMEOUT=${SQUID_PIDFILE_TIMEOUT:-20}
 SQUID_SHUTDOWN_TIMEOUT=${SQUID_SHUTDOWN_TIMEOUT:-100}
 
@@ -57,8 +58,9 @@
             $SQUID -z -F 2>/dev/null
        fi
     done
+    ulimit -n $SQUID_FDNUM
     echo -n $"Starting $prog: "
-    $SQUID $SQUID_OPTS 2> /dev/null
+    $SQUID $SQUID_OPTS -O $SQUID_FDNUM 2> /dev/null
     RETVAL=$?
     if [ $RETVAL -eq 0 ]; then 
        timeout=0;


  *** (Dangerous?) Assumption ***  commAddSlowFd() and commGetSlowFd() used
SQUID_MAXFD to set the size of the global slowfdarr array.  The array is then
used like a stack: commAddSlowFd() pushes onto it and commGetSlowFd() pops from
a random position.  The Add/Get (push/random-pop) functions are only called in
comm_poll() and comm_select().  As far as I can tell, both comm_poll() and
comm_select() do a commGetSlowFd()/random-pop for every commAddSlowFd()/push
issued, so slowfdarr does not have to be global: the stack should be empty
before either comm_poll() or comm_select() returns.  In other words,
comm_select() never expects to get from commGetSlowFd() data that was put there
by a commAddSlowFd() call in comm_poll() (or the other way around, comm_poll()
expecting a commGetSlowFd() result based on an fd from comm_select()).  By
making slowfdarr local to comm_poll() and comm_select(), the Squid_MaxFD
variable can be used instead of the SQUID_MAXFD preprocessor macro, thus
addressing goal #1 of the patch.
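
The push/random-pop behaviour described above can be sketched in C roughly as follows. This is an illustration under stated assumptions and not the code from the attached patch: struct slow_fds and the slow_fds_*() functions are hypothetical names, and rand() stands in for whatever random source squid uses. The storage is sized from a runtime value (the role of Squid_MaxFD) instead of a compile-time macro.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical container standing in for the global slowfdarr. */
struct slow_fds {
    int *fds;      /* storage sized from the runtime limit */
    int count;     /* number of descriptors currently held */
    int capacity;  /* runtime limit, e.g. the value of Squid_MaxFD */
};

/* Allocate room for max_fd descriptors; 0 on success, -1 on failure. */
int slow_fds_init(struct slow_fds *s, int max_fd)
{
    assert(max_fd > 0);
    s->fds = malloc((size_t)max_fd * sizeof *s->fds);
    s->count = 0;
    s->capacity = s->fds ? max_fd : 0;
    return s->fds ? 0 : -1;
}

/* Push a descriptor (the commAddSlowFd() role). */
void slow_fds_add(struct slow_fds *s, int fd)
{
    assert(s->count < s->capacity);
    s->fds[s->count++] = fd;
}

/* Remove and return a randomly chosen descriptor (the commGetSlowFd()
 * role), or -1 when empty. The vacated slot is refilled with the last
 * entry so the array stays dense. */
int slow_fds_get(struct slow_fds *s)
{
    int which, fd;

    if (s->count == 0)
        return -1;
    which = rand() % s->count;
    fd = s->fds[which];
    s->fds[which] = s->fds[--s->count];
    return fd;
}

/* Release the storage; the structure can be re-initialized afterwards. */
void slow_fds_free(struct slow_fds *s)
{
    free(s->fds);
    s->fds = NULL;
    s->count = s->capacity = 0;
}
```

A caller in the position of comm_poll() or comm_select() would call slow_fds_init() on entry, drain the structure with slow_fds_get() until it returns -1, and call slow_fds_free() before returning, which is the property the assumption above relies on.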

  Wishlist item for Squid v2.6:  The original intention in writing the patch
was to make the compiled-in setting of Squid_MaxFD changeable via squid.conf.
But parseConfigFile() is not called until after several xxxInit() functions
have already used Squid_MaxFD to initialize some of the data structures.
Therefore, a modification to mainParseOptions() (which is called before the
xxxInit() function calls) was more appropriate for a change to a "STABLE" code
tree.  Adding this as a squid.conf option in the v2.6 tree should be considered
by the Squid development team.


Comment 5 Jay Fenlason 2003-01-29 16:14:37 UTC
I put up squid-2.5.STABLE1-3.i386.rpm and squid-2.5.STABLE1-3.src.rpm on
ftp://people.redhat.com/fenlason/  The binary was built on 8.0 because I don't
have any 7.3 systems to hand.  It has two changes: one adds the null storage
module, another is the above patch to the number of maximum file descriptors it
can open.  If you can check it out, and it works, I'll see if I can slip it into
the next beta.  It may be too late, though.

Comment 6 Jay Fenlason 2003-01-29 16:16:48 UTC
*** Bug 82987 has been marked as a duplicate of this bug. ***

Comment 7 Jason Duerstock 2003-01-29 17:18:00 UTC
The squid.spec needs 'openssl-devel' and 'cyrus-sasl-devel' added as build
dependencies.

Comment 8 Jason Duerstock 2003-01-29 17:25:30 UTC
I am now getting these messages in my log file:

2003/01/29 11:40:42| errorTryLoadText: '/usr/share/squid/errors/ERR_READ_TIMEOUT': (2) No such file or directory
2003/01/29 11:40:42| errorTryLoadText: '/usr/share/squid/errors/ERR_LIFETIME_EXP': (2) No such file or directory
2003/01/29 11:40:42| errorTryLoadText: '/usr/share/squid/errors/ERR_READ_ERROR': (2) No such file or directory
2003/01/29 11:40:42| errorTryLoadText: '/usr/share/squid/errors/ERR_WRITE_ERROR': (2) No such file or directory

Shouldn't it be trying to read them from /usr/share/squid/errors/English/ ... ?

Jason

Comment 9 Jay Fenlason 2003-01-29 19:16:43 UTC
I think I fixed the errors path to be the correct /etc/squid/errors .  I put a
new set of RPMs (squid-2.5.STABLE1-4.i386.rpm squid-2.5.STABLE1-4.src.rpm) on
people.redhat.com.  See if they work better.

Since I just recently took over squid, I don't have any kind of test setup
running yet.  Real Soon Now. . .

Comment 10 Need Real Name 2003-01-29 23:01:58 UTC
The above "patch" should be *discarded* as explained in the Squid bugzilla entry
#435 ( http://www.squid-cache.org/bugs/show_bug.cgi?id=435 )

To quote Henrik Nordstrom from the Squid project team:

"... the proposed comman line option is not safe, as some parts of Squid depends
on the FD_SETSIZE define at compile time."

Unlike v2.4 of Squid, this new version of Squid will set the compile time
setting of FD_SETSIZE to whatever "ulimit -n" is at the time that "configure" is
run.  So, assuming rpmbuild is being run as root, the best way to address this
is to modify the FD ulimit in the SPEC file to a *large* number such as:

%build
ulimit -n 1048576
%configure \
...


The ulimit should be raised to the same value in the init script.  This
change will make each fd_set take up 128K instead of 128 bytes.  But this
should not be a problem on a setup dedicated to Squid, given the RH 8 minimum
memory requirement of 64MB.
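
The 128 bytes and 128K figures follow from an fd_set holding one bit per descriptor: 1024 / 8 = 128 bytes, and 1048576 / 8 = 131072 bytes = 128K. A small C sketch of that arithmetic, where fd_bitmap_bytes() is a hypothetical helper name and not a squid or libc function:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Bytes needed for a bitmap with one bit per descriptor, rounded up
 * to a whole byte. CHAR_BIT is 8 on the platforms discussed here. */
size_t fd_bitmap_bytes(size_t nfds)
{
    return (nfds + CHAR_BIT - 1) / CHAR_BIT;
}
```

Every fd_set declared in the code pays this cost, whether or not the process ever opens that many descriptors.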

Comment 11 Jason Duerstock 2003-02-10 21:13:19 UTC
from comm_select.c:310:
#if DELAY_POOLS
    fd_set slowfds;
#endif

IMHO, all of the FD_* nonsense should be removed from squid, as the select()
functionality has been replaced by poll().  The remaining FD_* macros are only
used to support --enable-delay-pools, and should be replaced with a more generic
bit array implementation.  The generic implementation needs an FD_ZERO()
equivalent that is given the set size as a parameter instead of assuming it from
the FD_SETSIZE #define.

I've tried addressing this with the actual squid team but they did not seem to
want to hear it.
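
One possible shape for the generic bit array suggested in this comment is sketched below in C. It is an illustration only: struct fd_bits and the fd_bits_*() functions are hypothetical names, not part of squid or of any patch attached to this bug. The point is that the capacity is stored in the set at run time, so the FD_ZERO() counterpart does not depend on FD_SETSIZE.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical runtime-sized replacement for fd_set. */
struct fd_bits {
    unsigned char *bits;
    int nfds;               /* capacity in descriptors, chosen at run time */
};

#define FD_BITS_BYTES(n) (((size_t)(n) + CHAR_BIT - 1) / CHAR_BIT)

/* Allocate a zeroed set for descriptors 0 .. nfds-1; 0 on success, -1 on failure. */
int fd_bits_init(struct fd_bits *s, int nfds)
{
    assert(nfds > 0);
    s->bits = calloc(FD_BITS_BYTES(nfds), 1);
    s->nfds = s->bits ? nfds : 0;
    return s->bits ? 0 : -1;
}

/* FD_ZERO() counterpart: the size comes from the set, not from a #define. */
void fd_bits_zero(struct fd_bits *s)
{
    assert(s->bits != NULL);
    memset(s->bits, 0, FD_BITS_BYTES(s->nfds));
}

/* FD_SET() counterpart. */
void fd_bits_set(struct fd_bits *s, int fd)
{
    assert(fd >= 0 && fd < s->nfds);
    s->bits[fd / CHAR_BIT] |= (unsigned char)(1u << (fd % CHAR_BIT));
}

/* FD_CLR() counterpart. */
void fd_bits_clr(struct fd_bits *s, int fd)
{
    assert(fd >= 0 && fd < s->nfds);
    s->bits[fd / CHAR_BIT] &= (unsigned char)~(1u << (fd % CHAR_BIT));
}

/* FD_ISSET() counterpart: returns 1 if fd is in the set, else 0. */
int fd_bits_isset(const struct fd_bits *s, int fd)
{
    assert(fd >= 0 && fd < s->nfds);
    return (s->bits[fd / CHAR_BIT] >> (fd % CHAR_BIT)) & 1;
}

/* Release the storage. */
void fd_bits_free(struct fd_bits *s)
{
    free(s->bits);
    s->bits = NULL;
    s->nfds = 0;
}
```

Such a set cannot be passed to select() itself, so it only fits the situation described here, where polling already goes through poll() and the set is used purely for bookkeeping.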


Comment 12 james.k 2005-05-03 02:56:43 UTC
I'm still getting this same error in RH AS3. Will RH be releasing a stable
squid version that supports a larger number of file descriptors?

my cust acct:597393

Name        : squid                        
Version     : 2.5.STABLE3                     
Release     : 6.3E.8                        
Build Date: Thu 17 Feb 2005 04:52:27 AM SGT
Source RPM: squid-2.5.STABLE3-6.3E.8.src.rpm


Comment 14 Matthew Galgoci 2005-05-06 23:14:59 UTC
This is still a problem even on fc4/rawhide on a 64bit platform (amd64)!!!!

Here is a test case:

set up squid on 127.0.0.1

install httpd

start httpd

run this: ab -X 127.0.0.1:3128 -n 2048 -c 1024 http://localhost/

[mgalgoci@razor ~]$ ab -X 127.0.0.1:3128 -n 2048 -c 1024 http://127.0.0.1/
This is ApacheBench, Version 2.0.41-dev <$Revision: 1.141 $> apache-2.0
Copyright (c) 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright (c) 1998-2002 The Apache Software Foundation, http://www.apache.org/

Benchmarking 127.0.0.1 [through 127.0.0.1:3128] (be patient)
socket: Too many open files (24)

expected results: I should at least be able to do 16384 connections. I mean,
really. C'mon guys.

Comment 16 Martin Stransky 2005-11-17 15:20:51 UTC
I'm working on it, but I'm going to review this issue for devel first...

Comment 17 Martin Stransky 2005-12-01 14:53:13 UTC
*** Bug 168088 has been marked as a duplicate of this bug. ***

Comment 18 Martin Stransky 2005-12-06 13:40:56 UTC
Created attachment 121905 [details]
Proposed patch for this issue

This patch is applied in rawhide now and I'm going to wait for some feedback.

Comment 19 Zing 2006-02-03 21:15:30 UTC
I applied the patch from #18 to squid-2.5.STABLE11-4.FC3 and have been running
at "max_filedesc 8192" for a couple of days now.  Seems to be ok:

File descriptor usage for squid:
	Maximum number of file descriptors:   8192
	Largest file desc currently in use:   1843
	Number of file desc currently in use: 1508

Comment 20 Martin Stransky 2006-02-06 22:45:27 UTC
The critical part of squid here is the delay pools; they could possibly be
broken by this patch.

Comment 21 james.k 2006-02-07 01:17:46 UTC
Does the patch work for the version of squid included with RH AS 3?

Comment 22 Martin Stransky 2006-02-07 07:46:44 UTC
This patch can be modified for RHEL3/4, I'll make some test packages for RHEL3/4.

Comment 23 Bastien Nocera 2006-02-07 11:08:26 UTC
Patch tested successfully on RHEL3.

Comment 25 Jason Duerstock 2006-02-07 15:57:24 UTC
First off I'd like to thank you for the patch.  I waited a long time for this
but at least in theory this seems to be the right way to fix it, and I really
appreciate the effort.

I don't use delay pools so I'm not sure if this works right with them or not.

This seems to build and work fine for me under RHEL4.

I had to add

   --with-build-environment=default \

after

   --with-build-environment=default \

in the .spec file to get it to build under x86_64 though.  I wouldn't be
surprised if this was the wrong way to fix it for i386 though.

Comment 26 Martin Stransky 2006-02-08 15:15:32 UTC
(In reply to comment #25)
>    --with-build-environment=default \

Why did you have to add this option?

Comment 27 Martin Stransky 2006-02-08 20:58:19 UTC
Packages with this patch for RHEL3/4 are here -
http://people.redhat.com/stransky/squid/


Comment 28 Jason Duerstock 2006-02-08 22:00:33 UTC
(In reply to comment #26)
> (In reply to comment #25)
> >    --with-build-environment=default \
> 
> Why did you have to add this option?

Otherwise I got stuff like this:
gcc -DHAVE_CONFIG_H -I. -I. -I../include -I../include -I../include 
-I/usr/kerberos/include   -m32 -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -fPIE
-Os -g -pipe -fsigned-char -D_REENTRANT -c `test -f rfc2617.c || echo './'`rfc2617.c
In file included from /usr/include/openssl/e_os2.h:56,
                 from /usr/include/openssl/md5.h:62,
                 from ../include/md5.h:16,
                 from rfc2617.c:52:
/usr/include/openssl/opensslconf.h:13:30: opensslconf-i386.h: No such file or directory

I'm not sure exactly what's going on except that it calls
getconf POSIX_V6_ILP32_OFFBIG_CFLAGS
which returns:
-m32 -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64
and then decides that that's the right thing to do and accepts it as its own
CFLAGS.  Then everything goes to hell.

Comment 29 Martin Stransky 2006-02-13 14:08:23 UTC
(In reply to comment #28)
Do you have openssl-devel package?

Comment 30 Martin Stransky 2006-02-13 14:22:00 UTC
(In reply to comment #28)
btw. Why do you compile this package as i386?? (-m32 directive, for x86-64
should be -m64)

Comment 31 Jason Duerstock 2006-02-13 15:15:59 UTC
(In reply to comment #30)
> (In reply to comment #28)
> btw. Why do you compile this package as i386?? (-m32 directive, for x86-64
> should be -m64)

Yes I have openssl-devel installed.  I'm not compiling it for i386.
The configure script (erroneously) pulls in the -m32 when it uses getconf to get
the CFLAGS.


Comment 32 Martin Stransky 2006-02-13 16:48:04 UTC
(In reply to comment #31)
> Yes I have openssl-devel installed.  I'm not compiling it for i386.
> The configure script (erroneously) pulls in the -m32 when it uses getconf to get
> the CFLAGS.

How do you compile this package? Can you check $rpmbuild --rebuild
--target=x86_64 squid....src.rpm?



Comment 33 Jason Duerstock 2006-02-13 19:15:32 UTC
(In reply to comment #32)
> (In reply to comment #31)
> > Yes I have openssl-devel installed.  I'm not compiling it for i386.
> > The configure script (erroneously) pulls in the -m32 when it uses getconf to get
> > the CFLAGS.
> 
> How do you compile this package? Can you check $rpmbuild --rebuild
> --target=x86_64 squid....src.rpm?
> 
> 


I did my best to replicate this with
rpmbuild -ba --target=x86_64 SPECS/squid.spec
and it still blows up.  The problems appear to start around line 2673 of the
configure script.  If "$buildmodel" is null, it tests getconf options starting
with POSIX_V6_ILP32_OFFBIG, which it interprets as working properly.  After
retrieving the CFLAGS (which include -m32), it tries to apply them to the rest
of the build:

$ getconf _POSIX_V6_ILP32_OFFBIG
1
$ getconf POSIX_V6_ILP32_OFFBIG_CFLAGS
-m32 -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64
$ rpm -qf --qf '%{NAME}-%{VERSION}.%{ARCH}\n' `which getconf`
glibc-common-2.3.4.x86_64

Perhaps glibc is to blame for returning such options under x86_64, but this
breaks regardless.

Comment 52 Red Hat Bugzilla 2006-07-20 14:51:18 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2006-0322.html


Comment 54 Martin Stransky 2007-02-09 15:31:21 UTC
Note:

If you'd like to use this feature, you have to edit /etc/squid/squid.conf (the
max_filedesc directive, at the end of the file) and set the same value via
ulimit (#ulimit -n max_filedesc_number). You can check it with "#ulimit -n".
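
The same check can be made from inside a process. The following C sketch uses the standard getrlimit() call to test whether the soft limit on open files is at least the value configured in max_filedesc. The function name nofile_limit_allows() is hypothetical; it is not something squid provides.

```c
#include <assert.h>
#include <sys/resource.h>

/* Hypothetical check: returns 1 if the process soft limit on open
 * files is at least `wanted`, 0 if it is lower, and -1 if the limit
 * cannot be read. */
int nofile_limit_allows(unsigned long wanted)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;
    if (rl.rlim_cur == RLIM_INFINITY)
        return 1;
    return rl.rlim_cur >= (rlim_t)wanted;
}
```

A result of 0 for the max_filedesc value means "ulimit -n" was not raised in the environment that started the process.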