From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.2) Gecko/20040809

Description of problem:
I've observed for quite some time (since before FC2 was released; possibly earlier) that squid will sometimes enter a state in which it consumes all CPU cycles, repeatedly calling gettimeofday() and poll() with a timeout of zero.

I've finally had a chance to attach to such a runaway process. I couldn't figure out exactly what was wrong or how it gets into this state, but I have some clues that might help someone more familiar with the codebase to do so.

Basically, it appears that we call comm_poll(0) (or at least msec == 0, and the first entry in the tasks list is long overdue at the time I notice the problem and attach to squid), and then we never return from comm_poll because poll() always returns 0 and npending is zero. No active HTTP requests exist, and there are only two open file descriptors, both polled with POLLIN only: the TCP server socket and the ICPv2 UDP socket.

The first entry in the tasks list is:

    {func = 0xf60716 <storeUfsDirCleanEvent>, arg = 0x0, name = 0xf85315 "storeDirClean", when = 1094694402.237715, next = 0x9386bd0, weight = 1, id = 40278}

I don't understand squid well enough to tell for sure how we got into this state, but I don't see anything that would get us out of the comm_poll do/while loop in this case, or that would make comm_poll return or change msec to something other than zero.

Version-Release number of selected component (if applicable):
squid-2.5.STABLE5-5
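
To illustrate the symptom described above, here is a minimal sketch in C. It is not squid's comm_poll code. The helper names count_idle_polls and demo_idle_spin are hypothetical, and an idle pipe stands in for squid's two idle sockets. It only shows that poll() with a timeout of zero on descriptors with nothing to read returns 0 immediately, so a loop that retries while poll() returns 0 spins without ever sleeping. The real loop has no iteration cap; max_spins exists here only so the demonstration terminates.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: call poll() repeatedly with the given timeout and
 * count how many consecutive calls report "nothing ready" (return 0).
 * Stops early if poll() reports a ready descriptor or an error. */
static int count_idle_polls(struct pollfd *fds, nfds_t nfds,
                            int timeout_ms, int max_spins)
{
    int spins = 0;

    assert(max_spins >= 0);
    while (spins < max_spins) {
        int n = poll(fds, nfds, timeout_ms);
        if (n != 0)
            break;      /* something became ready, or poll() failed */
        spins++;        /* nothing ready: the loop goes around again */
    }
    return spins;
}

/* Hypothetical demo: poll the read end of an empty pipe for POLLIN with a
 * zero timeout. Returns the number of idle polls, or -1 if pipe() fails. */
static int demo_idle_spin(int max_spins)
{
    int p[2];
    struct pollfd pfd;
    int spins;

    if (pipe(p) != 0)
        return -1;

    pfd.fd = p[0];
    pfd.events = POLLIN;
    pfd.revents = 0;

    /* Timeout 0 means "do not block": each call returns at once. */
    spins = count_idle_polls(&pfd, 1, 0, max_spins);

    close(p[0]);
    close(p[1]);
    return spins;
}
```

Calling demo_idle_spin(1000) performs all 1000 polls without blocking, because nothing is ever written to the pipe. With a positive timeout, each call would instead sleep in the kernel until data arrived or the timeout expired.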
You should definitely report this upstream at www.squid-cache.org/bugs/index.cgi. I've also recently updated fc3 to squid-2.5.STABLE6. You may want to see if it fixes the problem, although I don't see anything in the list of patches that looks relevant.
It doesn't fix the problem. I've been running it since it made it to rawhide, with no change. The problem appears to happen only on boxes on which I run vpnc and that use Red Hat-internal servers as proxies. It happens more often on boxes that stay up all night, such that vpnc loses the connection and squid becomes unable to access the proxy. I'll file upstream, thanks.
This might be the same bug: http://www.squid-cache.org/bugs/show_bug.cgi?id=354 I'll investigate some more, with some of the tips given there.
One CPU-eating bug was fixed in squid-2.5.STABLE8. You could check if it's the same bug.
Hard to tell. I've been using STABLE7 for quite some time (tracking rawhide) and haven't observed the problem any more on the box where it would usually show up.
Are you still seeing this problem with the later squids, or can I close this?
I haven't run into this for quite a while, running rawhide, which the box that experienced it has been tracking since FC4test1 or so. I don't recall whether the problem was gone before that.