Bug 177075 - httpd related kernel crash
Status: CLOSED NEXTRELEASE
Product: Fedora
Classification: Fedora
Component: kernel
Version: 3
Hardware: i386 Linux
Priority: medium, Severity: medium
Assigned To: Dave Jones
QA Contact: Brian Brock
Reported: 2006-01-05 19:19 EST by William Lovaton
Modified: 2015-01-04 17:24 EST
Doc Type: Bug Fix
Last Closed: 2006-01-05 21:21:45 EST


Attachments
/var/log/messages from the crash with several httpd related backtraces (273.53 KB, text/plain)
2006-01-05 19:22 EST, William Lovaton
Output of lspci -v (6.19 KB, text/plain)
2006-01-05 19:24 EST, William Lovaton
Output of dmesg (25.35 KB, text/plain)
2006-01-05 19:26 EST, William Lovaton

Description William Lovaton 2006-01-05 19:19:19 EST
Today I got an FC3 kernel crash (2.6.12-1.1378_FC3smp) on my production server.
It runs a huge web app with Apache and PHP and handles a large number of
concurrent users.

These logs and traces are new to me. I have never experienced a problem like
this with any of the operating systems we have used (RH9 -> FC2 -> FC3) during
the 3 years our web system has been running.  The only way to get the system
back was a hard reboot.

The server is an IBM xSeries 445, 4GB of RAM, 4 HT processors with a second
(disabled) bank of processors that would boost it to 18GB of RAM and 8 HT
processors. So we are kind of half powered here.

FC3 in general is very stable and these crashes are hard to reproduce.

I know this is not the latest kernel for FC3 but I'll try to update to the
latest software available in the repositories.

Any idea about the cause of this problem?  Also, is there any sign that there
will be one last FC3 kernel update just before FC3 EOL?

I'll attach the output of dmesg, lspci -v and /var/log/messages which shows
several kernel backtraces and some other kind of kernel info.
Comment 1 William Lovaton 2006-01-05 19:22:05 EST
Created attachment 122856
/var/log/messages from the crash with several httpd related backtraces
Comment 2 William Lovaton 2006-01-05 19:24:22 EST
Created attachment 122857
Output of lspci -v
Comment 3 William Lovaton 2006-01-05 19:26:47 EST
Created attachment 122858
Output of dmesg

If you can see any anomaly with the hardware and/or the kernel, could you
please point it out?
Comment 4 Dave Jones 2006-01-05 21:21:45 EST
Basically you ran out of memory.

Here's how your memory is laid out:

  DMA zone: 4096 pages, LIFO batch:1
  Normal zone: 225280 pages, LIFO batch:31
  HighMem zone: 950272 pages, LIFO batch:31

When you put a lot of RAM into 32bit machines, the lower 896MB of memory (your
'Normal zone') has to contain pagetable pointers for every 4KB of memory in the
system.

Certain allocations can also only work if they come from the lower 16MB of
memory (such as DMA for certain device drivers).
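
As a rough check on those figures, the page counts in the zone listing can be converted to megabytes. This is only a sketch: it assumes the 4KB page size mentioned above, and the variable and function names are made up for illustration.

```python
# Page counts copied from the zone listing in this comment.
PAGE_SIZE_KB = 4  # assumption: 4KB pages, as on i386

zones = {"DMA": 4096, "Normal": 225280, "HighMem": 950272}


def pages_to_mb(pages, page_kb=PAGE_SIZE_KB):
    """Convert a page count to whole megabytes."""
    return pages * page_kb // 1024


sizes = {name: pages_to_mb(count) for name, count in zones.items()}

# DMA and Normal together are the directly mapped low memory.
lowmem_mb = sizes["DMA"] + sizes["Normal"]

for name, mb in sizes.items():
    print(f"{name:8s} {mb:5d} MB")
print(f"lowmem (DMA + Normal): {lowmem_mb} MB")
```

The DMA zone works out to 16MB, and DMA plus Normal to 896MB, which matches the two limits described here. Everything else sits in HighMem.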

If an allocation for a 'normal zone' page fails, it falls back to the dma zone.
However with dma zone being so small, bad things happen when this gets depleted.
Because your normal zone is filled with pagetables, it's falling back to the dma
zone for more and more pages, and then when a real 'ZONE_DMA' request comes in,
there's nothing left.
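
The fallback described above can be illustrated with a toy model. This is not how the kernel allocator is implemented: it ignores watermarks, reclaim and everything else the real code does, and the `Zone` class and `alloc` function are hypothetical names used only for this sketch.

```python
class Zone:
    """A memory zone reduced to a name and a free-page counter."""

    def __init__(self, name, free_pages):
        self.name = name
        self.free = free_pages


def alloc(zonelist, pages=1):
    """Take pages from the first zone in zonelist that has enough free.

    Returns the name of the zone that served the request, or None if
    every zone in the list is exhausted.
    """
    for zone in zonelist:
        if zone.free >= pages:
            zone.free -= pages
            return zone.name
    return None


# Page counts from the zone listing in this bug.
dma = Zone("DMA", 4096)
normal = Zone("Normal", 225280)

normal_zonelist = [normal, dma]  # Normal requests may fall back to DMA
dma_zonelist = [dma]             # DMA-only requests have nowhere else to go

# Stand-in for pagetable growth: keep making Normal requests until
# both low-memory zones are empty.
served_from = {"Normal": 0, "DMA": 0}
while True:
    zone_name = alloc(normal_zonelist)
    if zone_name is None:
        break
    served_from[zone_name] += 1

# A genuine DMA request arriving now cannot be satisfied.
dma_request = alloc(dma_zonelist)
print(served_from, "DMA request served by:", dma_request)
```

In this model the Normal requests drain the Normal zone first and then consume the whole DMA zone, so the later DMA-only request fails.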

Later kernels have had some zone balancing changes which may fix this (or at
least keep things running, albeit at a crawl, until the memory usage backs off).
The changes however are massive, and not really an option for backporting to the
FC3 kernel, which is only going to get an update now if some really bad security
problem comes up.
Comment 5 William Lovaton 2006-01-06 07:31:55 EST
Thanks for your insights. Any idea why the memory got so full? I have never
seen anything like this with these systems.  Is there a possibility of a DoS
attack?  Where should I look?
Comment 6 Dave Jones 2006-01-11 23:20:50 EST
httpd logs maybe?
Comment 7 William Lovaton 2006-01-12 08:38:45 EST
How could that be?

Yes, our logs are huge: the server gets about 8 million hits per day, and the
logs grow to more than 4 GB every week before we rotate them (logrotate). This
is a very standard FC3 server.

Is this a problem?
