Bug 768137

Summary: squid consumes 100% CPU looping on read()
Product: Red Hat Enterprise Linux 6
Component: squid
Version: 6.2
Status: CLOSED CURRENTRELEASE
Reporter: Benjamin Gilbert <bgilbert>
Assignee: Michal Luscon <mluscon>
QA Contact: BaseOS QE Security Team <qe-baseos-security>
CC: prc
Severity: unspecified
Priority: unspecified
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2013-07-24 12:29:45 UTC

Description Benjamin Gilbert 2011-12-15 20:58:56 UTC
Description of problem:
squid sometimes begins consuming 100% CPU and continues until killed.  While doing this, it responds normally to requests.

Version-Release number of selected component (if applicable):
3.1.10-1.el6_2.1

How reproducible:
Has occurred several times on multiple machines, but no reproducer has been found.

Additional info:
stracing an affected squid process produces:

epoll_wait(6, {{EPOLLIN, {u32=19, u64=19}}}, 16384, 205) = 1
read(21, 0x7fff1135c210, 256)           = -1 EAGAIN (Resource temporarily unavailable)
epoll_wait(6, {{EPOLLIN, {u32=19, u64=19}}}, 16384, 205) = 1
read(21, 0x7fff1135c210, 256)           = -1 EAGAIN (Resource temporarily unavailable)
epoll_wait(6, {{EPOLLIN, {u32=19, u64=19}}}, 16384, 205) = 1
read(21, 0x7fff1135c210, 256)           = -1 EAGAIN (Resource temporarily unavailable)
epoll_wait(6, {{EPOLLIN, {u32=19, u64=19}}}, 16384, 205) = 1
read(21, 0x7fff1135c210, 256)           = -1 EAGAIN (Resource temporarily unavailable)
[...]

fd 21 is a pipe:

# ls -l /proc/32128/fd
total 0
lrwx------ 1 squid squid 64 Dec 15 15:15 0 -> /dev/null
lrwx------ 1 squid squid 64 Dec 15 15:15 1 -> /dev/null
l-wx------ 1 squid squid 64 Dec 15 15:15 10 -> /usr7/squid/swap.state
l-wx------ 1 squid squid 64 Dec 15 15:15 12 -> pipe:[169475]
lrwx------ 1 squid squid 64 Dec 15 15:15 15 -> socket:[169678]
lr-x------ 1 squid squid 64 Dec 15 15:15 19 -> pipe:[351636]
lrwx------ 1 squid squid 64 Dec 15 15:15 2 -> /dev/null
l-wx------ 1 squid squid 64 Dec 15 15:15 20 -> pipe:[351636]
lr-x------ 1 squid squid 64 Dec 15 15:15 21 -> pipe:[351637]
l-wx------ 1 squid squid 64 Dec 15 15:15 22 -> pipe:[351637]
lrwx------ 1 squid squid 64 Dec 15 15:15 3 -> /var/log/squid/cache.log-20111211 (deleted)
lrwx------ 1 squid squid 64 Dec 15 15:15 4 -> /dev/null
lrwx------ 1 squid squid 64 Dec 15 15:15 5 -> /var/log/squid/cache.log
lrwx------ 1 squid squid 64 Dec 15 15:15 6 -> anon_inode:[eventpoll]
lrwx------ 1 squid squid 64 Dec 15 15:15 7 -> socket:[169469]
lrwx------ 1 squid squid 64 Dec 15 15:15 8 -> socket:[169470]
lr-x------ 1 squid squid 64 Dec 15 15:15 9 -> pipe:[169474]

gdb backtraces:
#0  0x00007f8c76f94d03 in epoll_wait () at ../sysdeps/unix/syscall-template.S:82
#1  0x00007f8c79706cf2 in comm_select (msec=<value optimized out>) at comm_epoll.cc:266
#2  0x00007f8c797c6c8e in CommSelectEngine::checkEvents (this=<value optimized out>, timeout=<value optimized out>) at comm.cc:2683
#3  0x00007f8c7971cda4 in EventLoop::checkEngine (this=0x7fff977f6fa0, engine=0x7fff977f7010, primary=<value optimized out>)
    at EventLoop.cc:50
#4  0x00007f8c7971cf1d in EventLoop::runOnce (this=0x7fff977f6fa0) at EventLoop.cc:124
#5  0x00007f8c7971d008 in EventLoop::run (this=0x7fff977f6fa0) at EventLoop.cc:94
#6  0x00007f8c7976aa25 in SquidMain (argc=<value optimized out>, argv=<value optimized out>) at main.cc:1399
#7  0x00007f8c7976aff6 in SquidMainSafe (argc=<value optimized out>, argv=<value optimized out>) at main.cc:1159
#8  main (argc=<value optimized out>, argv=<value optimized out>) at main.cc:1151

#0  0x00007f8c78fcc4ed in read () at ../sysdeps/unix/syscall-template.S:82
#1  0x00007f8c797214ec in read (fd=<value optimized out>, buf=<value optimized out>, len=<value optimized out>)
    at /usr/include/bits/unistd.h:45
#2  default_read_method (fd=<value optimized out>, buf=<value optimized out>, len=<value optimized out>) at fd.cc:157
#3  0x00007f8c797bde66 in CommIO::FlushPipe () at comm.cc:2395
#4  0x00007f8c797be408 in CommIO::NULLFDHandler (fd=19, data=<value optimized out>) at comm.cc:2401
#5  0x00007f8c79706e16 in comm_select (msec=<value optimized out>) at comm_epoll.cc:307
#6  0x00007f8c797c6c8e in CommSelectEngine::checkEvents (this=<value optimized out>, timeout=<value optimized out>) at comm.cc:2683
#7  0x00007f8c7971cda4 in EventLoop::checkEngine (this=0x7fff977f6fa0, engine=0x7fff977f7010, primary=<value optimized out>)
    at EventLoop.cc:50
#8  0x00007f8c7971cf1d in EventLoop::runOnce (this=0x7fff977f6fa0) at EventLoop.cc:124
#9  0x00007f8c7971d008 in EventLoop::run (this=0x7fff977f6fa0) at EventLoop.cc:94
#10 0x00007f8c7976aa25 in SquidMain (argc=<value optimized out>, argv=<value optimized out>) at main.cc:1399
#11 0x00007f8c7976aff6 in SquidMainSafe (argc=<value optimized out>, argv=<value optimized out>) at main.cc:1159
#12 main (argc=<value optimized out>, argv=<value optimized out>) at main.cc:1151

squid.conf:

acl localhost src 127.0.0.1/32 ::1
http_access allow localhost
http_access deny all
http_port localhost:3128
access_log none
cache_dir aufs /usr7/squid 225000 16 256
coredump_dir /var/spool/squid
hosts_file /etc/squid/hosts
refresh_pattern -i (/cgi-bin/|\?) 0     0%      0
refresh_pattern .               0       20%     4320

Comment 2 Jiri Skala 2012-01-12 16:11:56 UTC
(In reply to comment #0)
> #0  0x00007f8c78fcc4ed in read () at ../sysdeps/unix/syscall-template.S:82
> #1  0x00007f8c797214ec in read (fd=<value optimized out>, buf=<value optimized out>, len=<value optimized out>) at /usr/include/bits/unistd.h:45
> #2  default_read_method (fd=<value optimized out>, buf=<value optimized out>, len=<value optimized out>) at fd.cc:157
> 
> squid.conf:
> 
> cache_dir aufs /usr7/squid 225000 16 256

As described, the difficulty is with a read operation, so I'd like to ask about /usr7. Is it a mount point?

Thanks, Jiri

Comment 3 Benjamin Gilbert 2012-01-12 18:24:45 UTC
/usr7 is a mountpoint, yes.

Comment 4 Jiri Skala 2012-01-13 07:32:51 UTC
How is /usr7 mounted? Have you tried another location for the cache_dir option? I suspect it could be the source of the issue.

Comment 5 Benjamin Gilbert 2012-01-13 19:49:27 UTC
/usr7 is ext2 with no special mount options.  Since the file descriptor being read is a pipe rather than a regular file, the filesystem is likely not the issue.

Comment 6 Jiri Skala 2012-01-17 09:42:17 UTC
You've mentioned 'Has occurred several times on multiple machines...'. Have you compared these machines' settings and squid configurations?

How often do you observe the 100% CPU consumption? Does only a service restart fix it?

Regarding /usr7: could you please test another cache_dir location? Is this parameter the same on the different machines?

Comment 7 Benjamin Gilbert 2012-01-19 00:41:24 UTC
The machines are all identically configured.  Once the CPU consumption starts, it continues until squid is restarted.

I moved the cache_dir to /var/cache/squid (ext4) and was able to reproduce the problem almost immediately.

I found <http://bugs.squid-cache.org/show_bug.cgi?id=3189>.  Comments 3 and 5 (not the rest of the bug) look exactly like what I'm seeing.

Comment 9 Jiri Skala 2012-02-14 08:57:46 UTC
I've tested squid for a couple of days with automated requests to generate adequate load, and I also followed the 'how to configure squid to fail' recommendation from Squid's BZ#3189.

Unfortunately, I was not able to reproduce this issue. I suspect something unique in your hardware or software configuration is behind this unwanted CPU load. Please provide more details about your system (private mail is also fine with me if you have security concerns).

Comment 10 Suzanne Logcher 2012-02-14 23:24:44 UTC
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated
in the current release, Red Hat is unfortunately unable to
address this request at this time. Red Hat invites you to
ask your support representative to propose this request, if
appropriate and relevant, in the next release of Red Hat
Enterprise Linux. If you would like it considered as an
exception in the current release, please ask your support
representative.

Comment 11 Benjamin Gilbert 2012-02-17 19:57:22 UTC
The request load originates from an in-house multithreaded application running on localhost, and the remote HTTP server is another in-house application.  squid is not listening on any external address.

After some effort, I have been unable to reliably reproduce the problem.  If it recurs, I can attach gdb to the squid process and run some commands for you if you'd like.

Comment 12 Michal Luscon 2013-04-13 13:37:55 UTC
Are you still observing high CPU utilization with the latest version of squid (3.1.10-16)?

Comment 13 Benjamin Gilbert 2013-04-15 19:43:06 UTC
I haven't seen any recurrences recently.

Comment 14 Michal Luscon 2013-07-24 12:29:45 UTC
OK, I suppose this was fixed in the meantime. Feel free to reopen if the high CPU utilization occurs again.