Bug 125887

Summary: kernel 2.6.6-1.427 causes HARD system hangs / FREEZES!
Product: [Fedora] Fedora
Reporter: Hans de Goede <hdegoede>
Component: kernel
Assignee: Dave Jones <davej>
Status: CLOSED NEXTRELEASE
Severity: high
Priority: medium
Version: 2
CC: chaghi, pfrields, philip.r.schaffner, waustin
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2005-04-16 04:13:17 UTC
Attachments:
Description                                          Flags
last messages before crash with 2.6.6                none
same messages, this time without a crash with 2.6.5  none
/proc info                                           none

Description Hans de Goede 2004-06-13 09:20:41 UTC
A few days ago my system completely froze while burning a CD; this was
with 2.6.6-1.406. I thought this was just bad luck: even with good
hardware, once in a while a bit falls over. But when I tried to burn
the same CD again (this time running 2.6.6-1.406) my system froze
again. The third time, now running the official update 2.6.6-1.427, my
system froze again!

After downgrading to kernel-2.6.5-1.358 I've already managed to burn 3
CD-Rs successfully without any problems.

On a related note, I've also noticed that with the 2.6.6-1.xxx kernel
some commands take a very long time to complete. For example, doing an
rpm -e on an old kernel after successfully booting the new one takes
at least 5 minutes (after 5 minutes I started doing something else);
when this happened, top showed around 90% wa (I/O wait) CPU usage.
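
For anyone who wants to log the same "wa" figure outside of top, here is a minimal sketch that derives it from two samples of the aggregate "cpu" line in /proc/stat. The function names are hypothetical, and it assumes a kernel that reports the iowait column (2.6-style field order: user, nice, system, idle, iowait, irq, softirq, steal).

```python
def parse_cpu_line(line):
    """Return the counters of the aggregate 'cpu' line from /proc/stat."""
    fields = line.split()
    if fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    # Keep at most the first 8 counters (user..steal); later columns
    # (guest time) are already included in 'user' on newer kernels.
    return [int(value) for value in fields[1:9]]


def iowait_percent(before, after):
    """Percentage of elapsed CPU time spent in iowait between two samples."""
    first = parse_cpu_line(before)
    second = parse_cpu_line(after)
    deltas = [b - a for a, b in zip(first, second)]
    total = sum(deltas)
    if total == 0 or len(deltas) < 5:
        # No time elapsed, or a kernel without the iowait column.
        return 0.0
    # Index 4 is the iowait counter (assumed 2.6-style field order).
    return 100.0 * deltas[4] / total


def read_cpu_line(path="/proc/stat"):
    """Read the first line of /proc/stat, which is the aggregate 'cpu' line."""
    with open(path) as stat_file:
        return stat_file.readline()
```

Calling read_cpu_line() twice, a few seconds apart, and passing both results to iowait_percent() while the slow rpm -e is running should give a number comparable to the wa value top reports.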

My system is:
-Asus Athlon Slot-A MB with Irongate northbridge and VIA southbridge
-Slot-A Athlon 600 MHz
-2x 128 MB 100 MHz SDRAMs
-2x 6.5 GB IDE disks, both master, doing UDMA 33
-Adaptec AIC7860 single-channel narrow Ultra SCSI controller without BIOS
-Sony CDU 76s 4x speed SCSI CD-ROM
-Yamaha CRW4416S 4x speed SCSI CD burner
-Matrox G200, using the default Xorg driver

I've just bought a new DVD player, which can also play mp3 CDs, and
when the freezes occurred I was trying to copy an mp3 CD with Rock Ridge
extensions to a new CD-R with only Joliet extensions, because the DVD
player doesn't seem to like Rock Ridge. I didn't want to make an image
for every mp3 CD, so I decided to do it on the fly at 2x speed, since
my CD-ROM drive can only do 4x speed and burning at 4x speed from
a 4x speed source while remastering seemed too much to me. This
resulted in the following command being used to burn the CD:
mkisofs -J -V mp3-006 /misc/cdrom1 2> mkisofs.log | cdrecord -v -eject speed=2 dev=/dev/cdrom -tao -
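
As a rough sketch only, the same on-the-fly pipeline can be driven from Python. This may help when reproducing the problem, because it reports the exit status of both mkisofs and cdrecord, whereas the shell pipeline only reports the last one. The function names are hypothetical; the command-line arguments are exactly those from the command above.

```python
import subprocess


def build_commands(source_dir, volume_id, device, speed):
    """Return the argv lists for the mkisofs | cdrecord pipeline."""
    mkisofs = ["mkisofs", "-J", "-V", volume_id, source_dir]
    cdrecord = ["cdrecord", "-v", "-eject", "speed=%d" % speed,
                "dev=%s" % device, "-tao", "-"]
    return mkisofs, cdrecord


def burn_on_the_fly(source_dir, volume_id, device, speed, log_path):
    """Pipe mkisofs output straight into cdrecord; return both exit codes."""
    mkisofs_cmd, cdrecord_cmd = build_commands(
        source_dir, volume_id, device, speed)
    with open(log_path, "wb") as log:
        # mkisofs writes the image to stdout and its messages to the log.
        producer = subprocess.Popen(
            mkisofs_cmd, stdout=subprocess.PIPE, stderr=log)
        # cdrecord reads the image from stdin (the trailing "-").
        consumer = subprocess.Popen(cdrecord_cmd, stdin=producer.stdout)
        # Close our copy so mkisofs sees a broken pipe if cdrecord exits.
        producer.stdout.close()
        consumer_rc = consumer.wait()
        producer_rc = producer.wait()
    return producer_rc, consumer_rc
```

With the values from this report the call would be burn_on_the_fly("/misc/cdrom1", "mp3-006", "/dev/cdrom", 2, "mkisofs.log"); it needs the same privileges as the original command.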

I was logged into GNOME as a normal user and had used "su -" to become
root in an xterm. Also note that, to make things even more interesting,
/misc is an autofs filesystem. I must admit that I'm pushing my luck,
but that's why I'm running a real OS, and besides, I've already
"remastered" 3 mp3 CDs this way successfully with kernel-2.6.5-1.358.

Which leaves me wondering why RH has decided to release an official
update to 2.6.6 incorporating the 2.6.7 anonvma changes, which haven't
even been released yet? If the purpose was to get NX support, why not
just add that to 2.6.5-1.358?

Comment 1 Hans de Goede 2004-06-13 09:22:29 UTC
p.s.

I've got my screensaver disabled, so there was no 3D screensaver
kicking in and doing AGP transfers. That would make things even more
interesting, but it isn't the case.


Comment 2 Arjan van de Ven 2004-06-13 09:51:07 UTC
> Which leaves me wondering why RH has decided to release an official
> update to 2.6.6 incorporating the 2.6.7 anonvma changes, which haven't
> even been released yet? If the purpose was to get NX support, why
> not just add that to 2.6.5-1.358?

The purpose of the erratum was mostly to incorporate the hundreds and
hundreds of bugfixes made to the kernel since the 2.6.6-rc3 we shipped
in FC2.

Comment 3 Hans de Goede 2004-06-13 10:11:33 UTC
Ok,

Couldn't you have split out and reverted the anonvma changes? If
this one crashes for me, you can expect it to crash for others too.

Anyway, the reason I submitted this bug was not to start a
discussion about RH's kernel release policy; that was just a side note.

What I really want is to get this bug fixed. Not just for me: I'm a
seasoned Linux user and can live with using 2.6.5 for now, but as an
active Linux advocate I don't want to see Linux crashing on new users;
that's bad publicity.


Comment 4 Hans de Goede 2004-06-15 08:17:36 UTC
I think I've got some additional clues: the problem seems to be in the
aic7xxx driver. I'll attach a part of /var/log/messages containing the
last kernel messages before a freeze. I've got 2 of these in my logs
(and I've had 3 freezes).

I'll also attach a similar message from 2.6.5, where the system kept
running. I must note that with 2.6.5 my CD recorder was unusable after
this message, so I'm now running the latest FC1 kernel.


Comment 5 Hans de Goede 2004-06-15 08:20:11 UTC
Created attachment 101134 [details]
last messages before crash with 2.6.6

Comment 6 Hans de Goede 2004-06-15 08:20:59 UTC
Created attachment 101135 [details]
same messages, this time without a crash with 2.6.5

Comment 7 Mariano Draghi 2004-06-28 01:51:10 UTC
I can confirm these random freezes with kernels 2.6.6-1.427 and 2.6.6-1.435.
They appear during normal use of the box under GNOME. The system
becomes unresponsive; sometimes you click a button and the
action takes several minutes to actually take place. If you drag a
window across the screen, the screen doesn't refresh properly, and
everything becomes cluttered. It seems this affects Nautilus more than
other applications. While this kind of freeze is taking place, if I go
to a text console and then go back to the graphical console, all the
icons from the open windows disappear.
Sometimes it's possible to work normally within an already open
gnome-terminal during the freeze.
I also notice strange audio problems running the 435 kernel. If I point
to a small (28k) .au file in a Nautilus window, Nautilus will
"preview" the first 2 or 3 seconds and start over again, in an
infinite loop (until I move the mouse pointer, of course). If I boot
with the original 358 kernel, the .au file is played only once, and at
its full length.
I can go to a text console without problems and work normally from there
during the freeze. There isn't any suspicious process eating resources.
Sometimes killing the X server solves the problem for several minutes
(then it pops up again).
Sometimes if you kill the X server it won't start again (the gdm
session hangs).
The problem is very reproducible on my box: I only have to boot any of
these kernels and do some work. After a few minutes, the freeze
takes place. Sometimes it goes away on its own after a random period of
time (seconds or even minutes).
None of this happens with the original FC2 kernel (2.6.5).

I've noted several posts on the fedora-list in the last two weeks;
from what I've read, this does not affect Pentium 4 boxes, only AMD
Athlon ones.
Different people are reporting different problems, but all of them seem
related to something in the 2.6.6-1.* kernels. Here are some relevant
posts:
http://www.redhat.com/archives/fedora-list/2004-June/msg05631.html
http://www.redhat.com/archives/fedora-list/2004-June/msg06842.html

(It seems that some people solved the problem by turning ACPI off; this
is NOT my case. I've tried that, with no luck.)

My configuration:
- AMD Athlon XP 1700+
- mobo: Asus A7N8X Deluxe (nVidia nForce2)
- HD: IDE ATA/133 80 GB (Maxtor)
- Video: Abit Siluro Geforce4 MX
- Video driver: Standard Xorg "nv" driver
- No 3rd party / unofficial packages.
- FC2, fresh install, updates as of June 25th.

Comment 8 Hans de Goede 2004-06-28 06:07:48 UTC
Notes:
-I only get freezes when I use my cd drives which are attached to my
adaptec scsi card
-My cd drives also don't work as advertised with the 2.6.5 kernel
-I don't have acpi (according to the kernel my bios is too old)
-I do have an Athlon

So this bug is really 2 bugs:
-freezes on Athlon with 2.6.6
-problems with Adaptec aic7xxx SCSI, which are also there in 2.6.5 and
which trigger the generic Athlon freeze in 2.6.6


Comment 9 Rob Crowther 2004-06-28 21:54:23 UTC
I think the first bug might have to do with Nautilus and
gst-thumbnail. If I open a directory that contains some .asf/.wmv
files, I get issues similar to those described in comment 7. I was
able to telnet in from another machine on several occasions and run
top, and found either a large number of pdflush processes running or
one very memory-hungry gst-thumbnail process using 90% of my RAM and
90% of my swap. I also have an Athlon processor on an ASUS motherboard.

Comment 10 Mariano Draghi 2004-06-28 22:33:13 UTC
I have experienced the freeze in any sort of directory, even in
directories with no multimedia content. In fact, I've experienced the
freeze with no open Nautilus windows at all (for instance, while writing
an e-mail in Evolution). I have failed to identify any suspicious process
during the freeze.
I played a bit more yesterday with the 435 kernel, and noticed that
many of the system sounds (GNOME's event sounds) are changed whenever
I boot that kernel (i.e., the sound when I launch an application is
different (heavily distorted?) on a running 435 kernel). Weird. Very
weird! This is a constant issue (and has no apparent relationship with
the actual freezes).

If it were Nautilus (or any of the GNOME components), how come it goes
away when booting the 2.6.5 kernel? Unless some new code in the 2.6.6
kernel broke GNOME.

Are all of us using GNOME?

I'll test the 435 kernel with KDE, and report later.

Comment 11 Mariano Draghi 2004-06-29 13:34:50 UTC
Ok, now if I start KDE the problem is quite different:
- I couldn't reproduce the random freezes I get with GNOME;
- I have no system sounds at all;
- Any application that deals with sound in some way either hangs
completely or doesn't play any sound at all.

As with GNOME, if I boot the original FC2 2.6.5 kernel, KDE has sound
again, and everything works just fine, without having to touch any
configuration.

It would be interesting if anyone else with this problem could confirm
a different behavior using KDE...


Comment 12 Phil Schaffner 2004-06-29 14:07:20 UTC
Have also experienced frequent slow response in KDE, as well as hard
lockups overnight, with kernel-smp-2.6.6-1.435 on Tyan S2462 Thunder
K7 MB with Dual Athlon MP 1800+.  Not doing any multimedia normally,
but sound works for .au and .wav files (despite ALSA error messages at
startup).  Screensaver is blank-screen.  Have SCSI disks and a CD-RW,
which has not been used lately, but will bang on it and report any new
data.  Nothing obvious in logs after hard lockup.  Will provide any
additional info that might help on request.

[root@radar0 root]# lspci
00:00.0 Host bridge: Advanced Micro Devices [AMD] AMD-760 MP [IGD4-2P] System Controller (rev 11)
00:01.0 PCI bridge: Advanced Micro Devices [AMD] AMD-760 MP [IGD4-2P] AGP Bridge
00:07.0 ISA bridge: Advanced Micro Devices [AMD] AMD-766 [ViperPlus] ISA (rev 02)
00:07.1 IDE interface: Advanced Micro Devices [AMD] AMD-766 [ViperPlus] IDE (rev 01)
00:07.3 Bridge: Advanced Micro Devices [AMD] AMD-766 [ViperPlus] ACPI (rev 01)
00:07.4 USB Controller: Advanced Micro Devices [AMD] AMD-766 [ViperPlus] USB (rev 07)
00:09.0 RAID bus controller: 3ware Inc 3ware 7000-series ATA-RAID (rev 01)
00:0c.0 Multimedia audio controller: Creative Labs SB Live! EMU10k1 (rev 07)
00:0c.1 Input device controller: Creative Labs SB Live! MIDI/Game Port (rev 07)
00:0d.0 SCSI storage controller: Adaptec AIC-7899P U160/m (rev 01)
00:0d.1 SCSI storage controller: Adaptec AIC-7899P U160/m (rev 01)
00:0f.0 Ethernet controller: 3Com Corporation 3c980-C 10/100baseTX NIC [Python-T] (rev 78)
00:10.0 Ethernet controller: 3Com Corporation 3c980-C 10/100baseTX NIC [Python-T] (rev 78)
01:05.0 VGA compatible controller: ATI Technologies Inc Radeon R200 QL [Radeon 8500 LE]


Comment 13 Phil Schaffner 2004-06-29 15:25:52 UTC
Just finished burning FC2 CDs 1 - 4 at 20x on Yamaha CRW2200S using
K3B without incident.  Played the "horse race" sound on completion. 
SCSI CD writing seems not to be an issue for me.

Comment 14 Martin Andersen 2004-07-01 23:14:09 UTC
I just installed FC2 yesterday, fresh install. I ran
up2date and installed all updated packages, including kernel
2.6.6-1.435.2.1. The system runs fine, at least for now, with
kernel 2.6.5, but I had two freezes when trying to boot into
2.6.6.

The same hardware configuration ran fine for many months with
FC1.

I only tried twice to boot 2.6.6, so I haven't had KDE or GNOME
running; I don't want to risk any system corruption.

Hardware :

Asus A7N8X motherboard, Athlon 2400+, 512MB samsung PC2700.
Geforce4 Ti 4200
Soundblaster Live!
Canon Lide 30 scanner
HP laserjet 4+
Logitech 300MX optical mouse
Logitech keyboard
3com netcard, driver 3c59x
Seagate barracuda IDE 80GB hdd
Asus CD-rom
Lite-On DVD burner.

Comment 15 William W. Austin 2004-07-29 12:41:18 UTC
This bug may be related to the one I posted as bug 126391. It's hard to
tell from this end; however, both involve problems on Athlon
systems using the aic7xxx driver, and both include system hangs or
slowdowns.

So, fwiw, I was asked to post a note on this bug referencing the bug I
had posted.

Comment 16 gene smith 2004-10-03 19:22:22 UTC
I have been having this problem since FC2 test. All kernels seem to
have the problem for me. Total X lockup. Only the mouse will move and
there is no keyboard response. Ctrl-Alt-Backspace is non-functional. I
have no SCSI, just an IDE drive and a GeForce3 Ti 200 card using the
stock Fedora/Xorg drivers. I can go to another computer, log in using
VNC with no problem, and restart or shut down the system and/or X. Via
VNC I observe the X process running at about 97% CPU with 3% memory
when locked. CPU is a 500 MHz Athlon. Memtest works fine on overnight
tests. I see nothing in the logs indicating a problem. Ctrl-Alt-(keypad)/
and (keypad)* are enabled in xorg.conf but do nothing during the lock.
Have reported this several times on the list. Have completely updated
FC2. At times it has seemed to be fixed, but it always occurs eventually
(from several times a day to once a week, whether X is heavily loaded
with lots of windows or has just a few). Have seen it with KDE and GNOME.

Comment 17 Vladimir Ivanovic 2004-10-06 05:41:07 UTC
Created attachment 104823 [details]
/proc info

Comment 18 Vladimir Ivanovic 2004-10-06 16:12:23 UTC
I forgot to add that when the X server is tying up a CPU, it is
looping thusly: 

ioctl(8, 0xc0286429, 0xfef9e980)        = -1 EBUSY (Device or resource busy)
--- SIGALRM (Alarm clock) @ 0 (0) ---
sigreturn()                             = ? (mask now [])
ioctl(8, 0xc0286429, 0xfef9e980)        = -1 EBUSY (Device or resource busy)
--- SIGALRM (Alarm clock) @ 0 (0) ---
sigreturn()                             = ? (mask now [])
ioctl(8, 0xc0286429, 0xfef9e980)        = -1 EBUSY (Device or resource busy)
ioctl(8, 0xc0286429, 0xfef9e980)        = -1 EBUSY (Device or resource busy)

and if I read it correctly, lsof says that file descriptor 8 is: 

X       10717 root    8u   CHR      226,0              65536 /dev/dri/card0
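
To narrow down which request is returning EBUSY, the ioctl number from the trace can be split into its generic fields. This is a minimal sketch with a hypothetical helper name; it assumes the usual Linux ioctl encoding as used on x86 (8 bits command number, 8 bits type, 14 bits argument size, 2 bits direction).

```python
def decode_ioctl(request):
    """Split a Linux ioctl request number into its generic fields.

    Assumes the x86 layout: bits 0-7 command number, bits 8-15 type
    character, bits 16-29 argument size, bits 30-31 direction.
    """
    direction_names = {0: "none", 1: "write", 2: "read", 3: "read/write"}
    return {
        "nr": request & 0xFF,
        "type": chr((request >> 8) & 0xFF),
        "size": (request >> 16) & 0x3FFF,
        "direction": direction_names[(request >> 30) & 0x3],
    }


# The request number seen in the strace output above.
fields = decode_ioctl(0xC0286429)
print(fields)
```

For 0xc0286429 this gives type 'd', command 0x29, a 40-byte argument, and read/write direction. The 'd' type is the one the DRM ioctls use, which fits file descriptor 8 being /dev/dri/card0. Command 0x29 looks like DRM_IOCTL_DMA to me, but that is an assumption and should be checked against the drm.h shipped with this kernel.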

Comment 19 Dave Jones 2005-04-16 04:13:17 UTC
Fedora Core 2 has now reached end of life, and no further updates will be
provided by Red Hat.  The Fedora legacy project will be producing further kernel
updates for security problems only.

If this bug has not been fixed in the latest Fedora Core 2 update kernel, please
try to reproduce it under Fedora Core 3, and reopen if necessary, changing the
product version accordingly.

Thank you.