Bug 132136 - squid eats all cpu
Summary: squid eats all cpu
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: squid
Version: 3
Hardware: i386
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Jay Fenlason
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2004-09-09 02:23 UTC by Alexandre Oliva
Modified: 2014-08-31 23:26 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2005-05-25 02:16:58 UTC
Type: ---
Embargoed:



Description Alexandre Oliva 2004-09-09 02:23:24 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.2) Gecko/20040809

Description of problem:
I've observed for quite some time (before FC2 was released; possibly
earlier) that squid will sometimes enter a state in which it consumes
all cpu cycles, repeatedly calling gettimeofday() and poll() with a
timeout of zero.

I've finally had a chance to attach to such a runaway process.  I
couldn't figure out exactly what was wrong or how it gets into this
state, but I have some clues that might help someone more familiar
with the codebase to do so.

Basically, it appears that we call comm_poll(0), or at least msec == 0
and the first entry in the tasks list is long overdue by the time I
notice the problem and attach to squid.  We then never return from
comm_poll, because poll() always returns 0 and npending is zero.

No active HTTP requests exist, and there are only two open file
descriptors, both polled with POLLIN only: the TCP server socket and
the ICPv2 UDP socket.

The first entry in the tasks list is:
{func = 0xf60716 <storeUfsDirCleanEvent>, arg = 0x0,
  name = 0xf85315 "storeDirClean", when = 1094694402.237715,
  next = 0x9386bd0, weight = 1, id = 40278}
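
For reference, the `when` field above reads as seconds since the Unix epoch, so it can be converted to a wall-clock time.  The short sketch below does that and compares it with the timestamp of this report.  The report timestamp is only a stand-in: the exact moment the debugger was attached is not recorded here, so the gap is a rough indication at best.

```python
from datetime import datetime, timezone

# `when` value copied from the task entry above (seconds since the epoch).
when = 1094694402.237715
due = datetime.fromtimestamp(when, tz=timezone.utc)

# Assumption: the report's own timestamp stands in for "now"; the time the
# debugger was actually attached is not known.
reported = datetime(2004, 9, 9, 2, 23, 24, tzinfo=timezone.utc)

print("task was due at:", due.replace(microsecond=0).isoformat())
print("overdue by roughly:", reported - due)
```

By this reckoning the storeDirClean task was due at about 01:46:42 UTC, some 36 minutes before the report was filed.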

I don't quite understand enough about squid to tell for sure how we
got into this state, but I don't see anything that would get us out of
the comm_poll do/while loop in this case, or that would get comm_poll
to return or change msec to something other than zero.
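
To illustrate the symptom (not squid's actual code), here is a minimal sketch of a loop that only exits when poll() reports a ready descriptor.  With a timeout of zero and idle descriptors, poll() returns at once with nothing ready, so the loop spins.  The helper name `count_polls` and the use of a pipe in place of the two sockets are assumptions made for the example.

```python
import os
import select
import time


def count_polls(timeout_ms, duration_s=0.05):
    """Count poll() calls made within duration_s on an idle descriptor."""
    # Assumption: an idle pipe stands in for the idle server sockets.
    read_fd, write_fd = os.pipe()
    poller = select.poll()
    poller.register(read_fd, select.POLLIN)
    calls = 0
    deadline = time.monotonic() + duration_s
    try:
        while time.monotonic() < deadline:
            ready = poller.poll(timeout_ms)  # empty list when nothing is readable
            calls += 1
            if ready:  # the only "normal" exit: some descriptor became ready
                break
    finally:
        os.close(read_fd)
        os.close(write_fd)
    return calls


busy = count_polls(0)       # zero timeout: returns immediately every time
blocking = count_polls(20)  # 20 ms timeout: the process sleeps in poll()
print("polls with timeout 0:", busy)
print("polls with timeout 20 ms:", blocking)
```

The zero-timeout run makes far more poll() calls in the same interval, which matches the pattern of repeated gettimeofday() and poll() calls seen in the runaway process.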

Version-Release number of selected component (if applicable):
squid-2.5.STABLE5-5

Comment 1 Jay Fenlason 2004-10-01 15:35:28 UTC
You should definitely report this upstream at
www.squid-cache.org/bugs/index.cgi

I've also recently updated fc3 to squid-2.5.STABLE6.  You may want
to see if it fixes the problem, although I don't see anything in the
list of patches that looks relevant.

Comment 2 Alexandre Oliva 2004-10-01 22:49:21 UTC
It doesn't fix the problem: I've been running it since it reached
rawhide, with no change.  The problem appears to happen only on boxes on
which I run vpnc, that use Red Hat-internal servers as proxies.  It
happens more often on boxes that stay up all night, such that vpnc
loses the connection and squid becomes unable to access the proxy.

I'll file upstream, thanks.

Comment 3 Alexandre Oliva 2004-10-02 00:06:25 UTC
This might be the same bug:
http://www.squid-cache.org/bugs/show_bug.cgi?id=354

I'll investigate some more, with some of the tips given there.

Comment 4 Nerijus Baliūnas 2005-02-14 20:36:03 UTC
One cpu eating bug was fixed in squid-2.5.STABLE8. You could check if it's the
same bug.

Comment 5 Alexandre Oliva 2005-02-15 19:46:06 UTC
Hard to tell.  I've been using STABLE7 for quite some time (tracking
rawhide) and haven't observed the problem any more on the box where it
would usually show up.

Comment 6 Jay Fenlason 2005-05-24 20:05:03 UTC
Are you still seeing this problem with the later squids, or can I close this? 

Comment 7 Alexandre Oliva 2005-05-25 02:16:58 UTC
I haven't run into this for quite a while, running rawhide, which the box that
experienced it has been tracking since FC4test1 or so.  I don't recall whether
the problem was gone before that.

