i'm getting frequent hangs recently, but get this: if i attach to the hung process with gdb and then just quit gdb, it unhangs! this system runs nothing but the latest stable rpm software. the hangs have happened repeatedly so far in evolution (1.4.4), galeon (1.2.7), gthumb (2.0.1), and gnucash (1.8.7), in either kde or gnome, under redhat 9, all up2date. i've tried various kernels (2.4.20-18, 2.4.20-19, 2.4.20-20) with no difference. until about 3-6 weeks ago it wasn't happening. i don't know whether to blame the up2date updates, or whether i just wasn't putting quite so heavy a load on the machine back then. my best guess is that it's some sort of latent scheduling bug brought out under load, that is, when i've got enough running to make significant use of swap.
sometimes when something (e.g. galeon) is clearly stuck, all that is required is to bring another window forward, then return to the "stuck" window, and presto, it's fine again. other times the gdb hack above seems to be needed. more rarely, even that doesn't help, and i end up killing and relaunching the app.
the summary is a self-diagnosis that needs confirmation. how would i go about confirming it? i'm getting evolution and galeon lockups several times daily. for an evolution example see http://bugzilla.ximian.com/show_bug.cgi?id=49373
Can you try the test version of the RHL9 errata at ftp://people.redhat.com/jakub/glibc/errata/2.3.2-27.9.4/ and let us know whether it works? This code should have backports of the fixes for the NPTL problems we know of.
ok, after downloading and installing, neither kde nor gnome will launch anymore. the versions i have are from up2date. do i need something even more recent?
There shouldn't be any problems at all. Did you download the i686 version (I assume that is what you used before)? What problems are reported?
ah, yes, the i686 glibc is much better, thank you. but alas, the hang problem is still here. several hours of light-duty computing went by with no problem, but once a few extra apps were running and swap space was more active, evolution hung again. i've learned a new trick for getting out of the hangs: just STOP the process, then CONT it again. usually works, tho not always. <sigh>.
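For anyone wanting to repeat the STOP/CONT trick from a terminal, here is a minimal sketch. The helper name `nudge` and the one-second pause are assumptions made for illustration, not something from the report; it only sends the two signals, and as noted above it may not always unstick the process.

```shell
# nudge PID: hypothetical helper that suspends a process and resumes it.
# Sends SIGSTOP, waits briefly, then sends SIGCONT to the same PID.
nudge() {
    if [ -z "$1" ]; then
        echo "usage: nudge PID" >&2
        return 2
    fi
    # Stop the process; give up if the PID does not exist or is not ours.
    kill -STOP "$1" || return 1
    # The pause length is arbitrary (an assumption, not a tuned value).
    sleep 1
    # Resume the process.
    kill -CONT "$1"
}
```

Usage would look like `nudge 1234`, where 1234 is the PID of the hung app as shown by `ps`.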
If you can try the glibc in Fedora Core 1 (which you should only try with a complete installation), that would help. I very much doubt that there is any problem in FC1, and ordinarily I'd say the backport to RHL9 has the important pieces. But who knows, there have been tons of changes. I'm not going to try hunting down the bug in RHL9. If somebody identifies it and it is indeed a libc problem, we can look into fixing it. But the really up-to-date code is in FC1 and RHEL3.
successful workaround: invoke as follows: $ LD_ASSUME_KERNEL=2.2.5 evolution &
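To avoid retyping the variable for each affected app, the workaround can be wrapped in a small shell function. This is only a sketch: the function name `assume_kernel` is made up for illustration. On Red Hat Linux 9, setting `LD_ASSUME_KERNEL=2.2.5` makes the dynamic loader pick the older LinuxThreads libraries instead of NPTL, which is presumably why the hangs go away.

```shell
# assume_kernel VERSION COMMAND [ARGS...]
# Hypothetical helper: run one command with LD_ASSUME_KERNEL set,
# without exporting the variable into the rest of the shell session.
assume_kernel() {
    if [ "$#" -lt 2 ]; then
        echo "usage: assume_kernel VERSION COMMAND [ARGS...]" >&2
        return 2
    fi
    version="$1"
    shift
    # env sets the variable only for the command it launches.
    env "LD_ASSUME_KERNEL=$version" "$@"
}
```

With that defined, `assume_kernel 2.2.5 evolution &` is equivalent to the command above, and the same call works for galeon, gthumb, or gnucash.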