Bug 245012 - system hangs without any obvious reasons
Status: CLOSED NOTABUG
Product: Fedora
Classification: Fedora
Component: kernel
Version: 7
Hardware: x86_64 Linux
Priority: low
Severity: low
Assigned To: Dave Jones
QA Contact: Fedora Extras Quality Assurance
Reported: 2007-06-20 11:02 EDT by Adrian Reber
Modified: 2015-01-04 17:29 EST
Doc Type: Bug Fix
Last Closed: 2007-09-16 05:18:31 EDT


Attachments
lspci output (1.39 KB, text/plain)
2007-07-03 08:08 EDT, Paul Black

Description Adrian Reber 2007-06-20 11:02:19 EDT
Since we upgraded our system to Fedora 7 both available kernels seem very 
unstable. We have tried 2.6.21-1.3194.fc7 as well as 2.6.21-1.3228.fc7.

Unfortunately we cannot describe the error in more detail than that the system
just hangs. It stops responding on the serial connection as well as on the VGA
console.

It usually happens after about 36 hours and then the system is not reachable
anymore.
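
One way to pin down when such a silent hang occurs is to append a timestamp to a
file at a fixed interval and force it to disk each time. The following is only a
minimal sketch of that idea; the function names, the log path and the 60-second
interval are assumptions made for illustration and are not something the
reporter used.

```python
import os
import time


def write_heartbeat(path, now=None):
    """Append one UTC timestamp line to path and force it to disk."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(now))
    with open(path, "a") as f:
        f.write(stamp + "\n")
        f.flush()
        # fsync so the line is on disk even if the machine hangs right after.
        os.fsync(f.fileno())
    return stamp


def last_heartbeat(path):
    """Return the last non-empty line of the heartbeat file, or None."""
    with open(path) as f:
        lines = [line.strip() for line in f if line.strip()]
    return lines[-1] if lines else None


def run(path, interval=60):
    """Write a heartbeat every `interval` seconds until interrupted."""
    while True:
        write_heartbeat(path)
        time.sleep(interval)
```

After a forced reboot, last_heartbeat() on the same file gives the last moment
the system was demonstrably alive, to within one interval.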

The system is a quad-socket AMD dual-core system (8 cores in total); the last
entry of /proc/cpuinfo is:
processor       : 7
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 65
model name      : Dual-Core AMD Opteron(tm) Processor 8214
stepping        : 2
cpu MHz         : 2200.283
cache size      : 1024 KB
physical id     : 3
siblings        : 2
core id         : 1
cpu cores       : 2
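
The socket and core counts can be derived from the "physical id" and "core id"
fields of /proc/cpuinfo. The following is a minimal sketch, assuming the usual
x86 layout of one blank-line-separated block per logical CPU; the function name
is made up for illustration.

```python
import re


def summarize_cpuinfo(text):
    """Return (logical CPUs, sockets, {physical id: core count}) for cpuinfo text."""
    logical = 0
    cores_by_socket = {}
    # Each logical CPU is described by one block; blocks are separated by blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        fields = {}
        for line in block.splitlines():
            if ":" in line:
                key, _, value = line.partition(":")
                fields[key.strip()] = value.strip()
        if "processor" not in fields:
            continue
        logical += 1
        socket = fields.get("physical id")
        core = fields.get("core id")
        if socket is not None and core is not None:
            cores_by_socket.setdefault(socket, set()).add(core)
    per_socket = {s: len(c) for s, c in cores_by_socket.items()}
    return logical, len(cores_by_socket), per_socket
```

Passing it the contents of /proc/cpuinfo from a machine like the one described
should report 8 logical CPUs on 4 sockets with 2 cores each.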

Dell Poweredge 6950 8GB RAM

The load is rather high most of the time. This is a mirror server and we are
pushing about 300 Mbit/s on average, with much higher peaks possible.
A 4 TB Fibre Channel RAID and a 1 TB iSCSI volume are attached.

We used bonding during FC6 but due to
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=241719
we disabled it.

We have a dual-port e1000 for our main traffic, and the internal dual-port bnx2
is used for the iSCSI connection.

Our system is stable with the latest 2.6.18 from FC6 but newer FC6 kernels were
never tried because of
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=225399

We are writing some values to proc with sysctl for performance reasons:
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_keepalive_time = 1800
net.core.wmem_max = 8388608
net.core.rmem_max = 8388608
net.ipv4.tcp_rmem = 4096 87380 8388608
net.ipv4.tcp_wmem = 4096 87380 8388608
net.ipv4.tcp_max_syn_backlog = 4096
vm.min_free_kbytes = 65536

We have not tried changing these values because they have proved to be good
with the 2.6.18 kernel. I have no idea whether these values are the reason for
our hangs.
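
To check which of these settings are actually in effect on a running kernel,
each sysctl key can be mapped to its file under /proc/sys (dots become path
separators) and compared with the intended value. The following is a minimal
sketch under that assumption; the function names are hypothetical, and keys
that contain dots inside an interface name are not handled.

```python
import os


def sysctl_path(key, root="/proc/sys"):
    """Map a key such as net.ipv4.tcp_tw_reuse to its file under root."""
    return os.path.join(root, *key.split("."))


def parse_settings(text):
    """Parse 'key = value' lines into {key: [value tokens]}."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.split()
    return settings


def diff_settings(wanted, root="/proc/sys"):
    """Return {key: (expected, actual)} for every setting that differs."""
    mismatches = {}
    for key, expected in wanted.items():
        try:
            with open(sysctl_path(key, root)) as f:
                # Multi-value entries are tab-separated in /proc, so compare tokens.
                actual = f.read().split()
        except OSError:
            actual = None  # key missing or unreadable on this kernel
        if actual != expected:
            mismatches[key] = (expected, actual)
    return mismatches
```

Feeding parse_settings() the list above and passing the result to
diff_settings() shows which values the running kernel does not carry, which may
help when reverting the settings one at a time.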

I am aware that this is a bad bug report and you can close it if you like. I
just wanted to report it. Currently we are happy with 2.6.18.
Comment 1 Paul Black 2007-07-03 08:08:50 EDT
Created attachment 158428
lspci output

I'm having a similar problem on a couple of FC7 machines, most recently on a
Dell Optiplex GX270 with kernel-2.6.21-1.3228.fc7: a complete lockup with no
output on the serial console. It has also happened at runlevel 3 (so no X).
I've attached the output of lspci for the GX270 as it might help correlate
issues with specific hardware.
Comment 2 Jeffrey Grace 2007-09-06 04:47:56 EDT
We've been having trouble with random system hangs on Acer Veriton 5800
workstations.

These have a Pentium D (3.4 GHz) processor. Setting maxcpus=1 at boot seems to
stop this from happening. We've noticed the same problem with FC6 machines
running a kernel later than 2.6.20.
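
Whether a machine was really booted with maxcpus=1 can be confirmed from the
kernel command line in /proc/cmdline. The following is a minimal sketch; the
function name is made up for illustration, and quoted values containing spaces
are not handled.

```python
def kernel_param(cmdline, name):
    """Return the value of name=value, '' for a bare flag, or None if absent.

    If the parameter appears more than once, the last occurrence is returned.
    """
    result = None
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == name:
            result = value if sep else ""
    return result
```

For example, kernel_param(open("/proc/cmdline").read(), "maxcpus") returns "1"
on a system booted with the workaround and None otherwise.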
Comment 3 Adrian Reber 2007-09-16 05:18:31 EDT
With 2.6.22.4-65.fc7 we now have an uptime of two weeks. It seems to be fixed.
Closing it.
