Description of problem:
Freshly updated httpd daemon hangs during start. It turns out that a read() from /dev/random is hanging on a server that (for some reason) does not accumulate much entropy. Backtrace shows:
0x403db8f8 in read () from /lib/i686/libpthread.so.0
(gdb) where
#0  0x403db8f8 in read () from /lib/i686/libpthread.so.0
#1  0x40366190 in apr_proc_mutex_unix_flock_methods () from /usr/lib/libapr-0.so.0
#2  0x4002296a in _init () from /etc/httpd/modules/mod_auth_digest.so
#3  0x40022afa in _init () from /etc/httpd/modules/mod_auth_digest.so
#4  0x08067fca in ap_run_post_config ()
#5  0x0806d648 in main ()
#6  0x404368c7 in __libc_start_main () from /lib/i686/libc.so.6
(gdb)
Version-Release number of selected component (if applicable):
How reproducible:
Up to date kernel (2.4.21*nptlsmp) and httpd
Actual results:
Server hung
Expected results:
Server running
Additional info:
This is a rather endemic problem in recent Rawhide. Eventually I solved it by throwing away the blocking /dev/random and replacing it with a symlink to /dev/urandom. It isn't pretty, nor exactly kosher, but having services hang is worse than a slight weakening of randomness.
Yes, we're switching to /dev/urandom; there's no real need for strong random bits for what httpd does with them.
Fixed in apr-0.9.3-14.
An errata has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2003-320.html
It looks like this fix only works around a kernel issue where the entropy is never replenished. Did anyone look into that?
There were some known issues with entropy handling in earlier 2.4 kernels, which did get fixed, IIRC. Still, it's expected behaviour that a read() on /dev/random may block for "a long time".
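For anyone who wants to see the blocking behaviour without hanging a process, here is a minimal Python sketch. It is not the apr fix (which simply switched to /dev/urandom); the function names try_read_random and random_bytes are hypothetical, made up for this illustration. It opens the device with O_NONBLOCK so that an empty pool yields no bytes instead of a hang, then tops up from os.urandom().

```python
import os


def try_read_random(n, path="/dev/random"):
    """Return up to n bytes from path without blocking.

    Hypothetical helper for illustration. Returns b"" if the device
    cannot be opened or has no bytes ready right now.
    """
    try:
        fd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
    except OSError:
        return b""
    try:
        # With O_NONBLOCK set, an empty pool raises BlockingIOError
        # (EAGAIN) instead of putting the process to sleep.
        return os.read(fd, n)
    except OSError:
        return b""
    finally:
        os.close(fd)


def random_bytes(n):
    """Return exactly n random bytes, never blocking on /dev/random."""
    data = try_read_random(n)
    if len(data) < n:
        # Fill the remainder from the non-blocking kernel source.
        data += os.urandom(n - len(data))
    return data


if __name__ == "__main__":
    print(len(random_bytes(16)), "bytes obtained without blocking")
```

On a box in the state described in this report, try_read_random() would be expected to return few or no bytes, and random_bytes() would still return promptly.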
You don't happen to know which patchlevels in the 2.4 series had the problem, do you? We have a box running 2.4.20+RH patches that has an entropy of 0 (which hung httpd on restart), and it has been that way for over a day. I symlinked /dev/random to /dev/urandom for now.
It affected the 2.4.21-based RHEL3 kernel; bug 117218 tracked it.
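To check whether a given box is in the "entropy of 0" state mentioned above, the kernel's estimate can be read from procfs. A small sketch, assuming the kernel exposes /proc/sys/kernel/random/entropy_avail; the function name read_entropy_avail is hypothetical, and the path is a parameter so the helper can be pointed at another file.

```python
def read_entropy_avail(path="/proc/sys/kernel/random/entropy_avail"):
    """Return the kernel's entropy estimate (in bits) as an int.

    Assumes the file at path holds a single decimal number, as the
    procfs entry does on Linux kernels that provide it.
    """
    with open(path) as f:
        return int(f.read().strip())


if __name__ == "__main__":
    try:
        print("entropy_avail:", read_entropy_avail())
    except OSError as exc:
        print("could not read entropy estimate:", exc)
```

A value that stays at or near 0 across repeated checks matches the symptom reported here, where a read() on /dev/random can block indefinitely.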