Bug 204489 - Computer slowing down and then freezing
Status: CLOSED WORKSFORME
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.4
Hardware: All Linux
Priority: medium, Severity: high
Assigned To: Konrad Rzeszutek
QA Contact: Brian Brock

Reported: 2006-08-29 10:54 EDT by Frederic Medery
Modified: 2007-11-16 20:14 EST
Doc Type: Bug Fix
Last Closed: 2007-07-17 13:30:43 EDT

Description Frederic Medery 2006-08-29 10:54:12 EDT
Description of problem:
Since the 4.4 update, my computer becomes very slow and then freezes every day.
Nothing appears in dmesg or /var/log/messages. Every morning, when I try to
unlock the station, it slows down and then freezes. The problem can also happen
during working hours. There is no problem when booting from 4.3 kernels.

Version-Release number of selected component (if applicable):
kernel*-2.6.9-42.0*

How reproducible:


Steps to Reproduce:
1. Boot with the latest kernel
2. Wait several hours, most of the time more than 8 hours
3. The computer slows down and then freezes
4. Unable to use ctrl-sysrq-{t,m}

Actual results:

Must do a hard reboot

Expected results:


Additional info:
Applications used:
firefox
thunderbird
gnome-terminal

[mederyf@trieste ~]$ lspci
-bash: lspci: command not found
[mederyf@trieste ~]$ lspci
[mederyf@trieste ~]$ /sbin/lspci
00:00.0 Host bridge: Intel Corporation 82845G/GL[Brookdale-G]/GE/PE DRAM Controller/Host-Hub Interface (rev 01)
00:01.0 PCI bridge: Intel Corporation 82845G/GL[Brookdale-G]/GE/PE Host-to-AGP Bridge (rev 01)
00:1d.0 USB Controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB UHCI Controller #1 (rev 01)
00:1d.1 USB Controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB UHCI Controller #2 (rev 01)
00:1d.2 USB Controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB UHCI Controller #3 (rev 01)
00:1d.7 USB Controller: Intel Corporation 82801DB/DBM (ICH4/ICH4-M) USB2 EHCI Controller (rev 01)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 81)
00:1f.0 ISA bridge: Intel Corporation 82801DB/DBL (ICH4/ICH4-L) LPC Interface Bridge (rev 01)
00:1f.1 IDE interface: Intel Corporation 82801DB (ICH4) IDE Controller (rev 01)
00:1f.3 SMBus: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) SMBus Controller (rev 01)
00:1f.5 Multimedia audio controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) AC'97 Audio Controller (rev 01)
01:00.0 VGA compatible controller: nVidia Corporation NV17GL [Quadro4 200/400 NVS] (rev a3)
02:08.0 Ethernet control


I do not use any non-Red Hat kernel modules.
Comment 1 David Herselman 2006-09-13 04:36:52 EDT
We have over 30 servers running RHEL4.4 now and nearly all of them exhibit
a massive slowdown in their I/O performance. The servers aren't necessarily
crashing, but the load average is *MUCH* higher, with 'top' showing massive
iowait.

This occurred after completing the RHEL4.4 update and then booting off the
new kernel...
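
For anyone who wants a number rather than a visual impression from 'top', here is a minimal sketch that computes the iowait share between two samples of the aggregate "cpu" line in /proc/stat. It assumes the 2.6 field order (user, nice, system, idle, iowait, irq, softirq, and steal on later kernels). The function names are made up for illustration and are not part of any tool mentioned in this report.

```python
import time


def parse_cpu_line(line):
    """Return the integer counters from an aggregate 'cpu' line of /proc/stat."""
    fields = line.split()
    if len(fields) < 6 or fields[0] != "cpu":
        raise ValueError("not an aggregate cpu line: %r" % line)
    return [int(value) for value in fields[1:]]


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two 'cpu' lines."""
    # Only the first eight counters are summed; later fields (guest time)
    # are already included in 'user' on kernels that report them.
    old = parse_cpu_line(before)[:8]
    new = parse_cpu_line(after)[:8]
    deltas = [n - o for o, n in zip(old, new)]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is iowait


def read_cpu_line(path="/proc/stat"):
    """Read the first line of /proc/stat (the aggregate over all CPUs)."""
    with open(path) as handle:
        return handle.readline()


if __name__ == "__main__":
    first = read_cpu_line()
    time.sleep(5)  # sampling interval, chosen arbitrarily
    second = read_cpu_line()
    print("iowait: %.1f%%" % iowait_percent(first, second))
```

Running it under both the old and the new kernel with the same workload would give comparable figures.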
Comment 2 David Herselman 2006-09-13 05:17:35 EDT
Most dramatic is the slowdown on Serial ATA drives, most of which are running
on nVidia chipsets and therefore use the sata_nv module.

----------------------------------[ lspci ]----------------------------------
00:06.0 IDE interface: nVidia Corporation CK804 IDE (rev f2)
00:07.0 IDE interface: nVidia Corporation CK804 Serial ATA Controller (rev f3)
00:08.0 IDE interface: nVidia Corporation CK804 Serial ATA Controller (rev f3)
----------------------------------[ lspci ]----------------------------------

-------------------[ grep -i 'sata\|sda' /var/log/dmesg ]-------------------
sata_nv 0000:00:07.0: version 0.8
ata1: SATA max UDMA/133 cmd 0x9F0 ctl 0xBF2 bmdma 0xD800 irq 169
ata2: SATA max UDMA/133 cmd 0x970 ctl 0xB72 bmdma 0xD808 irq 169
ata1: SATA link up 1.5 Gbps (SStatus 113)
nv_sata: Primary device added
nv_sata: Primary device removed
nv_sata: Secondary device added
nv_sata: Secondary device removed
scsi0 : sata_nv
ata2: SATA link up 1.5 Gbps (SStatus 113)
nv_sata: Primary device added
nv_sata: Primary device removed
nv_sata: Secondary device added
nv_sata: Secondary device removed
scsi1 : sata_nv
SCSI device sda: 488397168 512-byte hdwr sectors (250059 MB)
SCSI device sda: drive cache: write back
SCSI device sda: 488397168 512-byte hdwr sectors (250059 MB)
SCSI device sda: drive cache: write back
 sda:<4>nv_sata: Primary device added
nv_sata: Primary device removed
nv_sata: Secondary device added
nv_sata: Secondary device removed
 sda1 sda2 sda3
nv_sata: Primary device added
nv_sata: Primary device removed
nv_sata: Secondary device added
nv_sata: Secondary device removed
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
 sdb:<4>nv_sata: Primary device added
nv_sata: Primary device removed
nv_sata: Secondary device added
nv_sata: Secondary device removed
nv_sata: Primary device added
nv_sata: Primary device removed
nv_sata: Secondary device added
nv_sata: Secondary device removed
ata3: SATA max UDMA/133 cmd 0x9E0 ctl 0xBE2 bmdma 0xC400 irq 177
ata4: SATA max UDMA/133 cmd 0x960 ctl 0xB62 bmdma 0xC408 irq 177
ata3: SATA link down (SStatus 0)
scsi2 : sata_nv
ata4: SATA link down (SStatus 0)
scsi3 : sata_nv
-------------------[ grep -i 'sata\|sda' /var/log/dmesg ]-------------------


When idle:
---------------------[ hdparm -t -T /dev/sda /dev/sdb ]---------------------
/dev/sda:
 Timing cached reads:   3068 MB in  2.00 seconds = 1533.47 MB/sec
 Timing buffered disk reads:  156 MB in  3.01 seconds =  51.75 MB/sec

/dev/sdb:
 Timing cached reads:   3068 MB in  2.00 seconds = 1533.47 MB/sec
 Timing buffered disk reads:  156 MB in  3.01 seconds =  51.75 MB/sec
---------------------[ hdparm -t -T /dev/sda /dev/sdb ]---------------------
Comment 3 Jason Corley 2006-10-03 09:35:13 EDT
I have a nagging suspicion that this is related to:
    http://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=207244
The updated kernel in the bug report fixed the issue on our production mail server.
Comment 4 Frederic Medery 2006-10-06 08:09:51 EDT
No luck for me, the updated kernel did not fix the problem.
So I went back to 2.6.9.32.0.2
The problem occurs only on IBM NetVista stations. All HP and Dell machines are OK.
Comment 5 Konrad Rzeszutek 2006-10-12 12:53:34 EDT
Have you tried passing 'noapic' at boot for the RHEL4 U4 kernel?
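
To confirm that a boot option actually reached the running kernel, the command line can be read back from /proc/cmdline. The following is a minimal sketch; the helper names and the sample command line in the comments are hypothetical and not taken from any machine in this report.

```python
def boot_params(cmdline):
    """Split a kernel command line into a {name: value-or-None} dict."""
    params = {}
    for token in cmdline.split():
        # 'root=LABEL=/' splits into name 'root' and value 'LABEL=/'
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def has_param(cmdline, name):
    """True if the bare option or name=value form appears on the command line."""
    return name in boot_params(cmdline)


if __name__ == "__main__":
    with open("/proc/cmdline") as handle:
        running = handle.read()
    for option in ("noapic", "nolapic"):
        print(option, "present" if has_param(running, option) else "absent")
```

With a hypothetical command line such as "ro root=LABEL=/ rhgb quiet noapic", `has_param` reports 'noapic' as present and 'nolapic' as absent.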
Comment 6 wilburn 2006-10-12 14:01:49 EDT
I have tried 'noapic'. It did not help.

I have 5 IBM NetVista machines with this problem
Comment 7 Konrad Rzeszutek 2006-10-12 14:44:14 EDT
Can you provide the model type of the NetVista machine, please? Also, try the
'nolapic' boot argument. Thanks.
Comment 8 Frederic Medery 2006-10-12 15:25:57 EDT
I also have this problem with IBM NetVista computers:
Model: 8307-81U

I will try with nolapic and noapic
Comment 9 wilburn 2006-10-12 16:23:20 EDT
The type is 8310-47U.
I have rebooted with 'nolapic'. I should know by Monday if there is any change.
Comment 10 wilburn 2006-10-13 17:31:57 EDT
'nolapic' does not help.
Comment 11 Konrad Rzeszutek 2006-10-16 10:47:00 EDT
Darrick,

Do you have any idea who would have tested this machine for RHEL4 U4? Thank you.
Comment 12 Darrick Wong 2006-10-16 18:55:12 EDT
Nope.  There weren't any reports of slowdowns with System X hardware, but
NetVistas are IBM PC Division/Lenovo products, which means they have different
motherboards and different BIOSes.
Comment 13 Konrad Rzeszutek 2006-12-04 11:32:45 EST
Copying this from another BZ that deals also with NetVista machines:
"
This sounds like the BIOS bug seen on a number of Thinkcentre boxes (my desktop
included). Basically a chunk of SMM (BIOS) code runs and corrupts the local apic
registers that define the tick frequency, causing time to increase *very*
slowly. With the recent timeofday work (2.6.18+), time should continue to
increase properly, but increased latencies will be noticed. 

Booting w/ noapic will work around the problem, but the correct fix has been to
update the BIOS, but it seems the BIOS fix has not yet been implemented for this
hardware. The issue should be brought up w/ the hardware folks.

See OSDL bugs:
http://bugme.osdl.org/show_bug.cgi?id=2544
http://bugme.osdl.org/show_bug.cgi?id=6296

The last of which has a patch that functions as a workaround. I'm not sure
however if that patch should go mainline or not (the original developer of the
patch just blamed the BIOS and didn't want to push the patch).
"

However, using 'noapic' did not help you. Have you tried a more recent kernel
version, 2.6.18 for example, to see whether that makes the problem go away?
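
One way to check for the "time increases very slowly" symptom described in the quote is to compare how far the system clock advances against an independent reference over the same interval. The sketch below only does the arithmetic. Where the reference timestamps come from (for example the hardware clock or a remote time source) is left open, and both the function names and the 5% tolerance are assumptions made for illustration.

```python
def clock_rate(sys_start, ref_start, sys_end, ref_end):
    """Ratio of system-clock seconds elapsed per reference second elapsed."""
    ref_elapsed = ref_end - ref_start
    if ref_elapsed <= 0:
        raise ValueError("reference clock did not advance")
    return (sys_end - sys_start) / ref_elapsed


def is_running_slow(rate, tolerance=0.05):
    """True if the system clock lags the reference by more than the tolerance."""
    return rate < 1.0 - tolerance
```

A rate near 1.0 means the system clock keeps pace with the reference. A rate of 0.5 would mean the system clock advanced 30 seconds while the reference advanced 60.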
Comment 14 Konrad Rzeszutek 2007-07-17 10:26:22 EDT
ping?
Comment 15 Frederic Medery 2007-07-17 10:57:05 EDT
Hello,
Just to let you know that the noapic and noacpi options resolved my problem.

Anyway we are migrating stations to RHEL5 now.
Comment 16 Konrad Rzeszutek 2007-07-17 13:30:43 EDT
OK. Closing BZ as WORKSFORME.

There was an update to RHEL5 U1 to solve a timer problem on the NetVista. I am
not sure of the BZ at this 
