Bug 105725

Summary: long httpd graceful reload times
Product: Red Hat Enterprise Linux 3
Reporter: Christopher McCrory <chrismcc>
Component: httpd
Assignee: Joe Orton <jorton>
Status: CLOSED ERRATA
QA Contact: David Lawrence <dkl>
Severity: medium
Priority: medium    
Version: 3.0   
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2003-12-16 21:45:23 UTC
Bug Blocks: 106712    

Description Christopher McCrory 2003-09-26 20:32:14 UTC
Description of problem:
httpd takes a long time to complete a graceful restart

Version-Release number of selected component (if applicable):
httpd-2.0.46-23.ent

How reproducible:
all graceful restarts, timeout varies

Steps to Reproduce:
1. install httpd
2. sudo /sbin/service httpd graceful
3. wait... :)
    
Actual results:
192.168.10.15 - - [26/Sep/2003:13:14:03 -0700] "GET / HTTP/1.0" 403 2898 "-"
"Lynx/2.8.5dev.7 libwww-FM/2.14 SSL-MM/1.4.1 OpenSSL/0.9.7"
192.168.10.15 - - [26/Sep/2003:13:14:03 -0700] "GET / HTTP/1.0" 403 2898 "-"
"Lynx/2.8.5dev.7 libwww-FM/2.14 SSL-MM/1.4.1 OpenSSL/0.9.7"
192.168.10.15 - - [26/Sep/2003:13:14:12 -0700] "GET / HTTP/1.0" 403 2898 "-"
"Lynx/2.8.5dev.7 libwww-FM/2.14 SSL-MM/1.4.1 OpenSSL/0.9.7"
192.168.10.15 - - [26/Sep/2003:13:14:13 -0700] "GET / HTTP/1.0" 403 2898 "-"
"Lynx/2.8.5dev.7 libwww-FM/2.14 SSL-MM/1.4.1 OpenSSL/0.9.7"
192.168.10.15 - - [26/Sep/2003:13:14:13 -0700] "GET / HTTP/1.0" 403 2898 "-"
"Lynx/2.8.5dev.7 libwww-FM/2.14 SSL-MM/1.4.1 OpenSSL/0.9.7"

[Fri Sep 26 13:14:03 2003] [notice] Graceful restart requested, doing restart
[Fri Sep 26 13:14:12 2003] [notice] Apache/2.0.46 (Red Hat) configured --
resuming normal operations


Expected results:

192.168.10.15 - - [26/Sep/2003:13:12:46 -0700] "GET / HTTP/1.0" 200 843 "-"
"Lynx/2.8.5dev.7 libwww-FM/2.14 SSL-MM/1.4.1 OpenSSL/0.9.7"
192.168.10.15 - - [26/Sep/2003:13:12:46 -0700] "GET / HTTP/1.0" 200 843 "-"
"Lynx/2.8.5dev.7 libwww-FM/2.14 SSL-MM/1.4.1 OpenSSL/0.9.7"
192.168.10.15 - - [26/Sep/2003:13:12:50 -0700] "GET / HTTP/1.0" 200 843 "-"
"Lynx/2.8.5dev.7 libwww-FM/2.14 SSL-MM/1.4.1 OpenSSL/0.9.7"
192.168.10.15 - - [26/Sep/2003:13:12:50 -0700] "GET / HTTP/1.0" 200 843 "-"
"Lynx/2.8.5dev.7 libwww-FM/2.14 SSL-MM/1.4.1 OpenSSL/0.9.7"
192.168.10.15 - - [26/Sep/2003:13:12:50 -0700] "GET / HTTP/1.0" 200 843 "-"
"Lynx/2.8.5dev.7 libwww-FM/2.14 SSL-MM/1.4.1 OpenSSL/0.9.7"


[Fri Sep 26 13:12:46 2003] [notice] Graceful restart requested, doing restart
[Fri Sep 26 13:12:50 2003] [notice] Apache/2.0.44 (Unix) mod_ssl/2.0.44
OpenSSL/0.9.6g configured -- resuming normal operations


Additional info:

The "actual" results are from Taroon.
The "expected" results are from FreeBSD (for reference).


On another taroon machine

30 second timeout
[Fri Sep 26 11:28:46 2003] [notice] Graceful restart requested, doing restart
[Fri Sep 26 11:29:14 2003] [notice] Apache/2.0.46 (Red Hat) configured --
resuming normal operations

I looked for another example, but couldn't find it.  IIRC it was a ~2 minute
timeout.  Possibly the "resuming normal operations" message is logged before
connections are possible.
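
For anyone comparing restart times across machines, here is a minimal sketch of how the gap between the two notices could be pulled out of an error_log. It assumes each log entry is on a single line (the excerpts in this report are wrapped) and that timestamps use the format shown above. The function names are made up for illustration and are not part of httpd. Note that it measures only the time between log messages, so it would not detect the case where the "resuming" message is logged before connections are actually possible.

```python
import re
from datetime import datetime

# Month abbreviations as they appear in the error_log timestamps above.
MONTHS = {name: i for i, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}

# Matches e.g. "[Fri Sep 26 13:14:03 2003] [notice] Graceful restart requested, ..."
NOTICE = re.compile(
    r"^\[\w{3} (\w{3}) +(\d{1,2}) (\d{2}):(\d{2}):(\d{2}) (\d{4})\] \[notice\] (.*)$")


def parse_notice(line):
    """Return (timestamp, message) for a [notice] line, or None for anything else."""
    m = NOTICE.match(line.strip())
    if m is None:
        return None
    mon, day, hh, mm, ss, year, msg = m.groups()
    when = datetime(int(year), MONTHS[mon], int(day), int(hh), int(mm), int(ss))
    return when, msg


def graceful_restart_seconds(lines):
    """Seconds between each 'Graceful restart requested' notice and the
    next 'resuming normal operations' notice, in log order."""
    durations = []
    started = None
    for line in lines:
        parsed = parse_notice(line)
        if parsed is None:
            continue
        when, msg = parsed
        if msg.startswith("Graceful restart requested"):
            started = when
        elif "resuming normal operations" in msg and started is not None:
            durations.append(int((when - started).total_seconds()))
            started = None
    return durations


if __name__ == "__main__":
    # The two Taroon excerpts from this report, with the wrapped lines rejoined.
    sample = [
        "[Fri Sep 26 13:14:03 2003] [notice] Graceful restart requested, doing restart",
        "[Fri Sep 26 13:14:12 2003] [notice] Apache/2.0.46 (Red Hat) configured -- resuming normal operations",
        "[Fri Sep 26 11:28:46 2003] [notice] Graceful restart requested, doing restart",
        "[Fri Sep 26 11:29:14 2003] [notice] Apache/2.0.46 (Red Hat) configured -- resuming normal operations",
    ]
    print(graceful_restart_seconds(sample))
```

Run against the excerpts above, this gives 9 and 28 seconds for the two Taroon machines; the FreeBSD reference log works out to 4 seconds by the same measure.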


I left the priority/severity as normal, but since taroon/RHEL3 is designed for a
server environment (where httpd.conf updates happen frequently and log rotation
is daily/semi-daily rather than weekly), I would recommend a higher priority.

Comment 1 Christopher McCrory 2003-10-15 20:10:55 UTC
Have you had time to look into this?

/me willing to test




Comment 2 Joe Orton 2003-10-16 10:00:00 UTC
Yes, we have a tentative fix for this - some testing would be really useful,
I'll build some packages.

Comment 3 Joe Orton 2003-10-16 10:57:33 UTC
OK, can you try the packages here:

http://people.redhat.com/jorton/Taroon-httpd/

feedback gratefully received!

Comment 4 Christopher McCrory 2003-10-16 16:46:52 UTC
Giving it a try now.




Comment 5 Christopher McCrory 2003-10-16 17:35:27 UTC
That seems to have done the trick!
On a production server, the time was a second or two, about the same as apache 1.3.x

On a test server with ~8000 virtual hosts (yes 8000), the time to reload was
about 15 seconds.  Most of which I imagine was parsing ~35000 lines of configs.

You da' man!






Comment 6 Christopher McCrory 2003-10-16 17:36:40 UTC
more misc info:

running httpd, mod_ssl and php, no mod_{python,perl,etc}



Comment 7 Christopher McCrory 2003-10-23 22:54:49 UTC
Any idea on when this will be pushed to up2date?




Comment 8 Joe Orton 2003-11-12 15:29:27 UTC
It's scheduled for inclusion in the next update.  Thanks for the
report and testing!

Comment 9 Christopher McCrory 2003-12-15 21:15:54 UTC
Is the fix in httpd-2.0.46-26.ent?



Comment 10 Joe Orton 2003-12-15 21:24:55 UTC
Yes indeed, -26.ent has the fix and is included in the Update 1 Beta.

Comment 11 Mark J. Cox 2003-12-16 21:45:23 UTC
An errata has been issued which should help the problem described in this bug report. 
This report is therefore being closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files, please follow the link below. You may reopen 
this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2003-320.html