Bug 72896
Summary: | squid does not support >1024 file descriptors
---|---
Product: | Red Hat Enterprise Linux 3
Component: | squid
Version: | 3.0
Hardware: | i386
OS: | Linux
Status: | CLOSED ERRATA
Severity: | high
Priority: | medium
Reporter: | Jason Duerstock <jason>
Assignee: | Martin Stransky <stransky>
CC: | i, jameskwh, mgalgoci, tao, zing
Fixed In Version: | RHBA-2006-0322
Doc Type: | Bug Fix
Last Closed: | 2006-08-08 18:53:08 UTC
Bug Blocks: | 181408
Description
Jason Duerstock
2002-08-28 20:41:27 UTC
The problem can be remedied with the following steps:

1) Add the following patch to the RPM:

    --- squid-2.4.STABLE6/src/main.c Sat May 19 20:09:59 2001
    +++ /root/main.c Tue Sep 3 14:34:42 2002
    @@ -574,8 +574,8 @@
         mode_t oldmask;
     
         debug_log = stderr;
    -    if (FD_SETSIZE < Squid_MaxFD)
    -        Squid_MaxFD = FD_SETSIZE;
    +    /* if (FD_SETSIZE < Squid_MaxFD)
    +        Squid_MaxFD = FD_SETSIZE; */
     
         /* call mallopt() before anything else */
     #if HAVE_MALLOPT

2) Remove the '--enable-delay-pools' option from the '%configure' statement.

3) Run ulimit -n 8192.

4) Rebuild the RPM.

This will allow squid to use >1024 file descriptors. '--enable-delay-pools' has to be disabled because it uses select(), which chokes.

Oops. I neglected to mention that ulimit -n 8192 needs to be added to /etc/sysconfig/squid to make sure the file descriptor limit is set correctly before the daemon is started.

Created attachment 75624 [details]
Adds file descriptor limit override argument of "-O [fd limit]" to squid
I agree with Jason that the 1024 hard limit built into the Red Hat squid binaries is problematic under moderate/heavy use. In such a situation, not only must the RPM be rebuilt, but RHN users should also add squid to the pkgSkipList to ensure that the binary isn't replaced by a newer version with the 1024 hard limit put back. Thus, the problem not only limits the usefulness of the current squid package but also potentially limits the usefulness of RHN for ongoing maintenance of squid. The end result is that the current Squid code works better under Gentoo's compile-everything (including updates) philosophy than as pre-rolled binaries.

Jason's solution exchanges the old 1024 limit for a new hard limit of 8192 and breaks delay pools. While this will probably be much more useful under moderate loads than the 1024 limit, I dislike the solution of yet another hard limit compiled into the binary. The value of 8192 may cause squid to use excessive memory on systems that do not need the higher value (this problem would become clearer if someone attempted to create a cheap embedded or limited-RAM caching "toaster" based around the future package). Also, while a 2GHz machine with 4GB of memory should provide enough resources to do caching for a Class B network/64K nodes, the issue of having to rebuild/recompile may come up again to raise the limit of 8192 further to match the additional resources and number of clients.

Lastly, the use of select() for delay pools should not be an issue. It is an issue in Jason's solution because the Squid developers' (sloppy?) code uses both a SQUID_MAXFD preprocessor macro and a Squid_MaxFD global variable in comm_select.c, and the solution does not keep them in sync.
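To make the "out of sync" concern concrete, here is a minimal sketch in C. It is not Squid's code: the value 1024, the table `slow_fd_table` and the helper `fd_escapes_table()` are assumptions made up for illustration. Only the names SQUID_MAXFD and Squid_MaxFD come from the comment above. The sketch shows how a table sized by the compile-time macro stops covering descriptors once the runtime variable is raised on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative value only; the real macro is fixed when Squid is built. */
#define SQUID_MAXFD 1024

/* Runtime limit: starts equal to the macro but can be changed later. */
int Squid_MaxFD = SQUID_MAXFD;

/* Hypothetical table sized by the macro, standing in for slowfdarr. */
int slow_fd_table[SQUID_MAXFD];

/*
 * Hypothetical helper: returns 1 when fd is allowed by the runtime limit
 * but has no slot in the macro-sized table, i.e. the two limits have
 * drifted apart and indexing the table with fd would overflow it.
 */
int fd_escapes_table(int fd)
{
    assert(fd >= 0);
    size_t slots = sizeof slow_fd_table / sizeof slow_fd_table[0];
    return fd < Squid_MaxFD && (size_t)fd >= slots;
}
```

With both limits at 1024 no descriptor escapes the table. After raising only Squid_MaxFD to 8192, descriptor 4000 is accepted by the runtime check but falls outside the array.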
The squid-2.4.STABLE7-fdoverride.patch attempts to accomplish the following:

1) Reduce the code's reliance on the preprocessor macro SQUID_MAXFD in favor of the global variable Squid_MaxFD (--enable-delay-pools appears to still work)

2) Add a new file descriptor override argument of "-O [fd value]" to squid to allow for a runtime modification to Squid_MaxFD

The patch can be added to the squid.spec file as follows:

    --- squid.spec-orig Tue Sep 10 01:15:03 2002
    +++ squid.spec Tue Sep 10 00:07:58 2002
    @@ -15,6 +15,7 @@
     Patch2: squid-perlpath.patch
     Patch3: squid-location.patch
     Patch10: squid-2.4.STABLE7-msntauth.patch
    +Patch11: squid-2.4.STABLE7-fdoverride.patch
     BuildRoot: %{_tmppath}/%{name}-%{version}-root
     Prereq: /sbin/chkconfig logrotate shadow-utils
     Requires: bash >= 2.0
    @@ -40,6 +41,7 @@
     %patch2 -p1 -b .perlpath
     %patch3 -p1
     %patch10 -p1
    +%patch11 -p1
     
     %build
     %configure \

Here is an example of how to take advantage of a patched version of the binary by modifying the /etc/rc.d/init.d/squid script:

    --- squid-orig Tue Sep 10 00:22:03 2002
    +++ squid Tue Sep 10 00:18:12 2002
    @@ -34,6 +34,7 @@
     # don't raise an error ifthe config file is incomplete
     # set defaults instead:
     SQUID_OPTS=${SQUID_OPTS:-"-D"}
    +SQUID_FDNUM=32768
     SQUID_PIDFILE_TIMEOUT=${SQUID_PIDFILE_TIMEOUT:-20}
     SQUID_SHUTDOWN_TIMEOUT=${SQUID_SHUTDOWN_TIMEOUT:-100}
     
    @@ -57,8 +58,9 @@
                 $SQUID -z -F 2>/dev/null
             fi
         done
    +    ulimit -n $SQUID_FDNUM
         echo -n $"Starting $prog: "
    -    $SQUID $SQUID_OPTS 2> /dev/null
    +    $SQUID $SQUID_OPTS -O $SQUID_FDNUM 2> /dev/null
         RETVAL=$?
         if [ $RETVAL -eq 0 ]; then
            timeout=0;

*** (Dangerous?) Assumption ***

commAddSlowFd() and commGetSlowFd() use SQUID_MAXFD to size the slowfdarr array, which is a global variable. The array is then used like a stack, where commAddSlowFd() pushes onto it and commGetSlowFd() pops from a random position. The Add/Get (push/random-pop) functions are only called in comm_poll() and comm_select().
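The push/random-pop behaviour described above can be sketched as follows. This is not the patch and not Squid's implementation: the type `slow_fd_stack` and the functions `slow_fd_init()`, `slow_fd_add()`, `slow_fd_get()` and `slow_fd_free()` are hypothetical names chosen for illustration. The point is only that the storage can be sized from a runtime value (such as Squid_MaxFD) and owned by the caller instead of being a global array sized by a macro.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical caller-owned replacement for a global, macro-sized array. */
typedef struct {
    int *fds;      /* storage for pending "slow" descriptors */
    int count;     /* number of descriptors currently stored */
    int capacity;  /* slots allocated, taken from a runtime limit */
} slow_fd_stack;

/* Allocate room for max_fd descriptors; returns 0 on success, -1 on failure. */
int slow_fd_init(slow_fd_stack *s, int max_fd)
{
    assert(s != NULL && max_fd > 0);
    s->fds = malloc((size_t)max_fd * sizeof *s->fds);
    s->count = 0;
    s->capacity = s->fds ? max_fd : 0;
    return s->fds ? 0 : -1;
}

/* Push a descriptor; returns -1 if the stack is already full. */
int slow_fd_add(slow_fd_stack *s, int fd)
{
    if (s->count >= s->capacity)
        return -1;
    s->fds[s->count++] = fd;
    return 0;
}

/* Pop a descriptor from a random position; returns -1 when empty. */
int slow_fd_get(slow_fd_stack *s)
{
    if (s->count == 0)
        return -1;
    int i = rand() % s->count;
    int fd = s->fds[i];
    /* Fill the hole with the last element so the array stays dense. */
    s->fds[i] = s->fds[--s->count];
    return fd;
}

/* Release the storage and reset the stack to the empty state. */
void slow_fd_free(slow_fd_stack *s)
{
    free(s->fds);
    s->fds = NULL;
    s->count = s->capacity = 0;
}
```

A polling function would call `slow_fd_init()` on entry with the current runtime limit, drain the stack with `slow_fd_get()` until it returns -1, and call `slow_fd_free()` before returning. Allocating on every call has a cost that a real implementation would probably want to avoid.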
Based on what I can tell, both comm_poll() and comm_select() do a commGetSlowFd()/random-pop for every commAddSlowFd()/push issued, so slowfdarr does not have to be global, since the stack should be empty before either comm_poll() or comm_select() returns. In other words, comm_select() never expects to get from commGetSlowFd() data that was put there by a commAddSlowFd() call in comm_poll() (or the other way around, with comm_poll() expecting a commGetSlowFd() result based on an fd from comm_select()). By initializing slowfdarr locally in comm_poll() and comm_select(), the Squid_MaxFD variable can be used instead of the SQUID_MAXFD preprocessor macro, thus addressing goal #1 of the patch.

Wishlist item for Squid v2.6: The original intention in writing the patch was to make the compiled setting of Squid_MaxFD changeable via squid.conf. But parseConfigFile() is not called until after several xxxInit() functions have already used Squid_MaxFD to initialize some of the data structures. Therefore, a modification to mainParseOptions() (which is called before the xxxInit() function calls) was more appropriate for a change to a "STABLE" code tree. Adding this as a squid.conf option in the v2.6 tree should be considered by the Squid development team.

I put up squid-2.5.STABLE1-3.i386.rpm and squid-2.5.STABLE1-3.src.rpm on ftp://people.redhat.com/fenlason/ The binary was built on 8.0 because I don't have any 7.3 systems to hand. It has two changes: one adds the null storage module, the other is the above patch to the maximum number of file descriptors it can open. If you can check it out, and it works, I'll see if I can slip it into the next beta. It may be too late, though.

*** Bug 82987 has been marked as a duplicate of this bug. ***

The squid.spec needs 'openssl-devel' and 'cyrus-sasl-devel' added as build dependencies.
I am now getting these messages in my log file:

    2003/01/29 11:40:42| errorTryLoadText: '/usr/share/squid/errors/ERR_READ_TIMEOUT': (2) No such file or directory
    2003/01/29 11:40:42| errorTryLoadText: '/usr/share/squid/errors/ERR_LIFETIME_EXP': (2) No such file or directory
    2003/01/29 11:40:42| errorTryLoadText: '/usr/share/squid/errors/ERR_READ_ERROR': (2) No such file or directory
    2003/01/29 11:40:42| errorTryLoadText: '/usr/share/squid/errors/ERR_WRITE_ERROR': (2) No such file or directory

Shouldn't it be trying to read them from /usr/share/squid/errors/English/ ... ?

Jason

I think I fixed the errors path to be the correct /etc/squid/errors . I put a new set of RPMs (squid-2.5.STABLE1-4.i386.rpm squid-2.5.STABLE1-4.src.rpm) on people.redhat.com. See if they work better. Since I just recently took over squid, I don't have any kind of test setup running yet. Real Soon Now. . .

The above "patch" should be *discarded* as explained in the Squid bugzilla entry #435 ( http://www.squid-cache.org/bugs/show_bug.cgi?id=435 ) To quote Henrik Nordstrom from the Squid project team: "... the proposed comman line option is not safe, as some parts of Squid depends on the FD_SETSIZE define at compile time."

Unlike v2.4 of Squid, this new version of Squid will set the compile-time value of FD_SETSIZE to whatever "ulimit -n" is at the time that "configure" is run. So, assuming rpmbuild is being run as root, the best way to address this is to raise the FD ulimit in the SPEC file to a *large* number such as:

    %build
    ulimit -n 1048576
    %configure \
    ...

Also, the ulimit should be increased equally in the init script. This change will alter fd_set allocations to take up 128K instead of 128 bytes. But this should not be a problem on a setup dedicated to Squid with a RH 8 minimal memory requirement of 64Megs.
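The 128K versus 128 bytes figures follow from an fd_set holding one bit per descriptor. A minimal sketch of that arithmetic in C, assuming 8-bit bytes as on the platforms discussed here; `fd_bitmap_bytes()` is a hypothetical helper, not a Squid or libc function:

```c
#include <assert.h>
#include <limits.h>   /* CHAR_BIT */
#include <stddef.h>

/*
 * Hypothetical helper: bytes needed for a bitmap with one bit per
 * descriptor, rounded up to a whole byte. This matches how an fd_set
 * scales with FD_SETSIZE, ignoring any word-size padding.
 */
size_t fd_bitmap_bytes(size_t max_fd)
{
    assert(max_fd > 0);
    return (max_fd + CHAR_BIT - 1) / CHAR_BIT;
}
```

For the default limit, 1024 / 8 = 128 bytes. For the proposed build-time limit, 1048576 / 8 = 131072 bytes, which is 128K per fd_set.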
from comm_select.c:310:

    #if DELAY_POOLS
    fd_set slowfds;
    #endif

IMHO, all of the FD_* nonsense should be removed from squid, as the select() functionality has been replaced by poll(). The remaining FD_* macros are only used to support --enable-delay-pools, and should be replaced with a more generic bit array implementation. The generic implementation needs an FD_ZERO() equivalent that is passed the set size as a parameter, rather than assuming it from the FD_SETSIZE #define. I've tried addressing this with the actual squid team but they did not seem to want to hear it.

I'm still getting this same error in RH AS3. Will RH be releasing a stable squid version that supports a larger number of file descriptors? my cust acct:597393

    Name : squid
    Version : 2.5.STABLE3
    Release : 6.3E.8
    Build Date: Thu 17 Feb 2005 04:52:27 AM SGT
    Source RPM: squid-2.5.STABLE3-6.3E.8.src.rpm

This is still a problem even on fc4/rawhide on a 64bit platform (amd64)!!!!

Here is a test case:

set up squid on 127.0.0.1
install httpd
start httpd
run this: ab -X 127.0.0.1:3128 -n 2048 -c 1024 http://localhost/

    [mgalgoci@razor ~]$ ab -X 127.0.0.1:3128 -n 2048 -c 1024 http://127.0.0.1/
    This is ApacheBench, Version 2.0.41-dev <$Revision: 1.141 $> apache-2.0
    Copyright (c) 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
    Copyright (c) 1998-2002 The Apache Software Foundation, http://www.apache.org/

    Benchmarking 127.0.0.1 [through 127.0.0.1:3128] (be patient)
    socket: Too many open files (24)

expected results: I should at least be able to do 16384 connections. I mean, really. C'mon guys.

I'm working on it, but I'm going to review this issue for devel first...

*** Bug 168088 has been marked as a duplicate of this bug. ***

Created attachment 121905 [details]
Proposed patch for this issue
This patch is applied in rawhide now and I'm going to wait for some feedback.
I applied the patch from #18 to squid-2.5.STABLE11-4.FC3 and have been running at "max_filedesc 8192" for a couple of days now. Seems to be ok:

    File descriptor usage for squid:
    Maximum number of file descriptors: 8192
    Largest file desc currently in use: 1843
    Number of file desc currently in use: 1508

The critical parts of squid are the delay pools; they could potentially be broken by this patch.

Does the patch work for the versions of squid included with RH AS 3?

This patch can be modified for RHEL3/4, I'll make some test packages for RHEL3/4.

Patch tested successfully on RHEL3.

First off I'd like to thank you for the patch. I waited a long time for this but at least in theory this seems to be the right way to fix it, and I really appreciate the effort. I don't use delay pools so I'm not sure if this works right with them or not. This seems to build and work fine for me under RHEL4. I had to add --with-build-environment=default \ after --with-build-environment=default \ in the .spec file to get it to build under x86_64 though. I wouldn't be surprised if this was the wrong way to fix it for i386 though.

(In reply to comment #25)
> --with-build-environment=default \

Why did you have to add this option?

Packages with this patch for RHEL3/4 are here - http://people.redhat.com/stransky/squid/

(In reply to comment #26)
> (In reply to comment #25)
> > --with-build-environment=default \
>
> Why did you have to add this option?

Otherwise I got stuff like this:

    gcc -DHAVE_CONFIG_H -I. -I. -I../include -I../include -I../include -I/usr/kerberos/include -m32 -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -fPIE -Os -g -pipe -fsigned-char -D_REENTRANT -c `test -f rfc2617.c || echo './'`rfc2617.c
    In file included from /usr/include/openssl/e_os2.h:56,
                     from /usr/include/openssl/md5.h:62,
                     from ../include/md5.h:16,
                     from rfc2617.c:52:
    /usr/include/openssl/opensslconf.h:13:30: opensslconf-i386.h: No such file or directory

I'm not sure exactly what's going on except that it calls getconf POSIX_V6_ILP32_OFFBIG_CFLAGS, which returns:

    -m32 -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64

and then decides that that's the right thing to do and accepts it as its own CFLAGS. Then everything goes to hell.

(In reply to comment #28)
Do you have openssl-devel package?

(In reply to comment #28)
btw. Why do you compile this package as i386?? (-m32 directive, for x86-64 should be -m64)

(In reply to comment #30)
> (In reply to comment #28)
> btw. Why do you compile this package as i386?? (-m32 directive, for x86-64
> should be -m64)

Yes I have openssl-devel installed. I'm not compiling it for i386. The configure script (erroneously) pulls in the -m32 when it uses getconf to get the CFLAGS.

(In reply to comment #31)
> Yes I have openssl-devel installed. I'm not compiling it for i386.
> The configure script (erroneously) pulls in the -m32 when it uses getconf to get
> the CFLAGS.

How do you compile this package? Can you check $rpmbuild --rebuild --target=x86_64 squid....src.rpm?

(In reply to comment #32)
> (In reply to comment #31)
> > Yes I have openssl-devel installed. I'm not compiling it for i386.
> > The configure script (erroneously) pulls in the -m32 when it uses getconf to get
> > the CFLAGS.
>
> How do you compile this package? Can you check $rpmbuild --rebuild
> --target=x86_64 squid....src.rpm?

I did my best to replicate this with rpmbuild -ba --target=x86_64 SPECS/squid.spec and it still blows up. The problems appear to start around line 2673 of the configure script.
If "$buildmodel" is null, it tries to test getconf options starting with POSIX_V6_ILP32_OFFBIG, which it interprets as working properly. After retrieving the CFLAGS (which include -m32), it tries to apply them to the rest of the build:

    $ getconf _POSIX_V6_ILP32_OFFBIG
    1
    $ getconf POSIX_V6_ILP32_OFFBIG_CFLAGS
    -m32 -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64
    $ rpm -qf --qf '%{NAME}-%{VERSION}.%{ARCH}\n' `which getconf`
    glibc-common-2.3.4.x86_64

Perhaps glibc is to blame for returning such options under x86_64, but this breaks regardless.

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2006-0322.html

Note: If you'd like to use this feature, you have to edit /etc/squid/squid.conf (the max_filedesc directive is at the end of the file) and set the same value via ulimit (#ulimit -n max_filedesc_number). You can check it with "#ulimit -n".
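The note above asks for two settings to agree: the max_filedesc value in squid.conf and the process limit set with ulimit. As a rough sketch of how a program could verify the second half from inside the process, the following C helper compares a wanted descriptor count against the soft RLIMIT_NOFILE limit, which is the value "ulimit -n" reports. `nofile_limit_covers()` is a hypothetical name; nothing here is taken from the errata packages.

```c
#include <assert.h>
#include <sys/resource.h>   /* getrlimit, RLIMIT_NOFILE */

/*
 * Hypothetical helper: returns 1 if the soft open-file limit of the
 * current process is at least `wanted`, 0 if it is lower, and -1 if
 * the limit could not be read.
 */
int nofile_limit_covers(long wanted)
{
    struct rlimit rl;

    assert(wanted > 0);
    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;
    if (rl.rlim_cur == RLIM_INFINITY)
        return 1;
    return rl.rlim_cur >= (rlim_t)wanted;
}
```

Called with the configured max_filedesc value (8192 in the report earlier in this thread), a result of 0 would mean the init script did not raise the limit before starting the daemon.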