Bug 132136 - squid eats all cpu
Status: CLOSED RAWHIDE
Product: Fedora
Classification: Fedora
Component: squid
Version: 3
Hardware: i386 Linux
Priority: medium
Severity: medium
Assigned To: Jay Fenlason
Reported: 2004-09-08 22:23 EDT by Alexandre Oliva
Modified: 2014-08-31 19:26 EDT
Doc Type: Bug Fix
Last Closed: 2005-05-24 22:16:58 EDT
Description Alexandre Oliva 2004-09-08 22:23:24 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.2) Gecko/20040809

Description of problem:
I've observed for quite some time (before FC2 was released; possibly
earlier) that squid will sometimes enter a state in which it consumes
all cpu cycles, repeatedly calling gettimeofday() and poll() with a
timeout of zero.

I've finally had a chance to attach to such a runaway process.  I
couldn't figure out exactly what was wrong or how it gets into this
state, but I have some clues that might help someone more familiar
with the codebase to do so.

Basically, it appears that we call comm_poll(0) (or at least msec == 0,
and the first entry in the tasks list is long overdue by the time I
notice the problem and attach to squid), and then we never return from
comm_poll because poll() always returns 0 and npending is zero.

No active HTTP requests exist, and there are only two open file
descriptors, both polled with POLLIN only: the TCP server socket and
the ICPv2 UDP socket.

The first entry in the tasks list is:
{func = 0xf60716 <storeUfsDirCleanEvent>, arg = 0x0,
  name = 0xf85315 "storeDirClean", when = 1094694402.237715,
  next = 0x9386bd0, weight = 1, id = 40278}
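
To make the link between an overdue entry and msec == 0 concrete, here is a minimal sketch of how a poll timeout could be derived from the head of such a list. The struct only mirrors the field names visible in the dump above; `struct ev_entry`, `ms_until_first` and `fallback_ms` are hypothetical names for illustration, not Squid's actual declarations, and the clamping rule is an assumption about the general technique rather than a copy of Squid's code.

```c
#include <stddef.h>

/* Hypothetical mirror of the fields in the gdb dump; not Squid's declaration. */
typedef void EVH(void *);

struct ev_entry {
    EVH *func;              /* callback to run when the event fires */
    void *arg;              /* argument passed to the callback */
    const char *name;       /* label, e.g. "storeDirClean" */
    double when;            /* absolute due time, seconds since the epoch */
    struct ev_entry *next;  /* next entry, ordered by due time */
    int weight;
    int id;
};

/* Milliseconds until the first entry is due.
 * An overdue entry is clamped to 0, which makes poll() non-blocking.
 * An empty list yields fallback_ms (assumed default timeout). */
static int ms_until_first(const struct ev_entry *tasks, double now,
                          int fallback_ms)
{
    if (tasks == NULL)
        return fallback_ms;
    double delta_ms = (tasks->when - now) * 1000.0;
    if (delta_ms <= 0.0)
        return 0;
    return (int) delta_ms;
}
```

Under this assumption, an entry that stays at the head of the list without ever being run keeps the computed timeout pinned at zero for as long as it remains overdue.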

I don't quite understand enough about squid to tell for sure how we
got into this state, but I don't see anything that would get us out of
the comm_poll do/while loop in this case, or that would get comm_poll
to return or change msec to something other than zero.
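
The loop shape described above can be reproduced in isolation. The following is a sketch only, not Squid's comm_poll: `poll_until_ready` and `demo_idle_spin` are hypothetical names, and the `max_iters` cap is added purely so the demonstration terminates. It assumes a loop that repeats while poll() reports nothing ready and never revisits its timeout.

```c
#include <poll.h>
#include <unistd.h>

/* Hypothetical stand-in for the loop shape described above; not Squid's code.
 * Polls until some descriptor is ready or max_iters passes have been made.
 * Returns the number of passes. */
static long poll_until_ready(struct pollfd *fds, nfds_t nfds, int msec,
                             long max_iters)
{
    long iters = 0;
    int num;
    do {
        num = poll(fds, nfds, msec);  /* msec == 0 returns immediately */
        iters++;
    } while (num == 0 && iters < max_iters);
    return iters;
}

/* Runs the loop against an idle pipe with a zero timeout.
 * Returns the passes made, or -1 if the pipe could not be created. */
static long demo_idle_spin(long max_iters)
{
    int p[2];
    if (pipe(p) != 0)
        return -1;
    struct pollfd pfd = { .fd = p[0], .events = POLLIN, .revents = 0 };
    long iters = poll_until_ready(&pfd, 1, 0, max_iters);
    close(p[0]);
    close(p[1]);
    return iters;
}
```

With idle descriptors and a zero timeout, every pass returns at once and the loop only stops at the artificial cap; without the cap it would consume a full CPU until a descriptor became readable, which matches the poll()/gettimeofday() pattern seen in the runaway process.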

Version-Release number of selected component (if applicable):
squid-2.5.STABLE5-5
Comment 1 Jay Fenlason 2004-10-01 11:35:28 EDT
You should definitely report this upstream at
www.squid-cache.org/bugs/index.cgi

I've also recently updated fc3 to squid-2.5.STABLE6.  You may want
to see if it fixes the problem, although I don't see anything in the
list of patches that looks relevant.
Comment 2 Alexandre Oliva 2004-10-01 18:49:21 EDT
It doesn't fix the problem; I've been running it since it made it to
rawhide, with no change.  The problem appears to happen only on boxes on
which I run vpnc and which use Red Hat-internal servers as proxies.  It
happens more often on boxes that stay up all night, where vpnc
loses the connection and squid becomes unable to access the proxy.

I'll file upstream, thanks.
Comment 3 Alexandre Oliva 2004-10-01 20:06:25 EDT
This might be the same bug:
http://www.squid-cache.org/bugs/show_bug.cgi?id=354

I'll investigate some more, with some of the tips given there.
Comment 4 Nerijus Baliūnas 2005-02-14 15:36:03 EST
One CPU-eating bug was fixed in squid-2.5.STABLE8. You could check whether it's
the same bug.
Comment 5 Alexandre Oliva 2005-02-15 14:46:06 EST
Hard to tell.  I've been using STABLE7 for quite some time (tracking
rawhide) and haven't observed the problem any more on the box where it
would usually show up.
Comment 6 Jay Fenlason 2005-05-24 16:05:03 EDT
Are you still seeing this problem with the later squids, or can I close this? 
Comment 7 Alexandre Oliva 2005-05-24 22:16:58 EDT
I haven't run into this for quite a while, running rawhide, which the box that
experienced it has been tracking since FC4test1 or so.  I don't recall whether
the problem was gone before that.
