Bug 125887
Summary: kernel 2.6.6-1.427 causes HARD system hangs / FREEZES!
Product: [Fedora] Fedora
Component: kernel
Version: 2
Hardware: All
OS: Linux
Status: CLOSED NEXTRELEASE
Severity: high
Priority: medium
Reporter: Hans de Goede <hdegoede>
Assignee: Dave Jones <davej>
CC: chaghi, pfrields, philip.r.schaffner, waustin
Doc Type: Bug Fix
Last Closed: 2005-04-16 04:13:17 UTC

Description
Hans de Goede
2004-06-13 09:20:41 UTC
p.s. I've got my screensaver disabled, so there was no 3D screensaver kicking in and doing AGP. That would make things even more interesting, but it isn't the case.

> Which leaves me wondering why RH has decided to release an official
> update to 2.6.6 incorporating the 2.6.7 anonvma changes which haven't
> even been released yet? If the purpose was to get NX support why
> not just add that to 2.6.5-1.358 ?

The purpose of the erratum was mostly to incorporate the hundreds and
hundreds of bugfixes made to the kernel since the 2.6.6-rc3 we shipped
in FC2.

OK, couldn't you have split out and reverted the anonvma changes? If this one crashes for me, you can count on it crashing for others too. Anyway, my reason for submitting this bug was not to start a discussion about RH's kernel release policy; that was just a side note. What I really want is to get this bug fixed. Not just for me: I'm a seasoned Linux user and can live with using 2.6.5 for now, but as an active Linux advocate I don't want to see Linux crashing on new users. That's bad publicity.

I think I've got some additional clues: the problem seems to be in the aic7xxx driver. I'll attach a part of /var/log/messages containing the last kernel messages before a freeze. I've got 2 of these in my logs (and I've had 3 freezes). I'll also attach a similar message from 2.6.5, where the system keeps running. I should note that with 2.6.5 my CD recorder was unusable after this message, so I'm now running the latest FC1 kernel.

Created attachment 101134
last messages before crash with 2.6.6

Created attachment 101135
same messages, this time without a crash with 2.6.5

I can confirm these random freezes with kernels 2.6.6-1.427 and 2.6.6-1.435. They appear with normal use of the box under Gnome. The system becomes unresponsive; sometimes you click on a button and the action takes several minutes to actually take place. If you drag a window across the screen, the screen doesn't refresh properly and everything becomes cluttered. This seems to affect Nautilus more than other applications. While this kind of freeze is taking place, if I go to a text console and then go back to the graphical console, all the icons from the opened windows disappear. Sometimes it's possible to work normally within an already opened gnome-terminal during the freeze.

I also note strange audio problems running the 435 kernel. If I point to a little (28k) .au file in a Nautilus window, Nautilus will "preview" the first 2 or 3 seconds and start over again, in an infinite loop (until I move the mouse pointer, of course). If I boot with the original 358 kernel, the .au file is played only once, and in its full length.

I can go to a text console without problems and work normally from there during the freeze. There isn't any suspicious process eating resources. Sometimes killing the X server solves the problem for several minutes (then it pops up again). Sometimes if you kill the X server it won't start again (the gdm session hangs).

The problem is very reproducible on my box: I only have to boot any of these kernels and do some work. After a few minutes, the freeze takes place. Sometimes it goes away by itself after a random period of time (seconds or even minutes). None of this happens with the original FC2 kernel (2.6.5).

I've noted several posts on the fedora-list in the last two weeks; from what I've read, this does not affect Pentium IV boxes, only AMD Athlon ones. Different people are reporting different problems, but all of them seem related to something in the 2.6.6-1.* kernels.

Here are some relevant posts:
http://www.redhat.com/archives/fedora-list/2004-June/msg05631.html
http://www.redhat.com/archives/fedora-list/2004-June/msg06842.html
(It seems that some people solved the problem by turning ACPI off; this is NOT my case. I've tried that, with no luck.)

My configuration:
- AMD Athlon XP 1700+
- mobo: Asus A7N8X Deluxe (nVidia nForce2)
- HD: IDE ATA/133 80 GB (Maxtor)
- Video: Abit Siluro Geforce4 MX
- Video driver: Standard XOrg "nv" driver
- No 3rd party / unofficial packages.
- FC2, fresh install, updates as of June 25th.

Notes:
- I only get freezes when I use my CD drives, which are attached to my Adaptec SCSI card
- My CD drives also don't work as advertised with the 2.6.5 kernel
- I don't have ACPI (according to the kernel my BIOS is too old)
- I do have an Athlon

So this bug really is 2 bugs:
- freezes on Athlon with 2.6.6
- problems with Adaptec aic7xxx SCSI, which are also there in 2.6.5 and which trigger the generic Athlon freeze in 2.6.6

I think the first bug might have to do with Nautilus and gst-thumbnail. If I open a directory that contains some .asf/.wmv files, I get issues similar to those described in #7. I was able to telnet in from another machine on several occasions and run top, and found either a large number of pdflush processes running or one very memory-hungry gst-thumbnail process using 90% of my RAM and 90% of my swap. I also have an Athlon processor on an ASUS motherboard.

I have experienced the freeze in any sort of directory, even in directories with no multimedia content. In fact, I've experienced the freeze with no Nautilus windows open at all (for instance, while writing an e-mail in Evolution). I have failed to identify any suspicious process during the freeze.

I played a bit more yesterday with the 435 kernel, and noticed that many of the system sounds (Gnome's event sounds) are changed whenever I boot that kernel (i.e., the sound when I launch an application is different (heavily distorted?) on a running 435 kernel). Weird. Very weird! This is a constant issue (and has no apparent relationship with the actual freezes).

If it were Nautilus (or any of the Gnome components), how come it goes away when booting the 2.6.5 kernel? Unless some new code in the 2.6.6 kernel broke Gnome. Are all of us using Gnome? I'll test the 435 kernel with KDE and report later.

OK, now if I start KDE the problem is quite different:
- I couldn't reproduce the random freezes I get with Gnome;
- I have no system sounds at all;
- Any application that deals with sound in some way either hangs completely or doesn't play any sound at all.

As with Gnome, if I boot the original FC2 2.6.5 kernel, KDE has sound again and everything works just fine, without having to touch any configuration. It would be interesting if anyone else with this problem could confirm a different behavior using KDE...

Have also experienced frequent slow response in KDE, as well as hard lockups overnight, with kernel-smp-2.6.6-1.435 on a Tyan S2462 Thunder K7 MB with dual Athlon MP 1800+. Not doing any multimedia normally, but sound works for .au and .wav files (despite ALSA error messages at startup). Screensaver is blank-screen. Have SCSI disks and a CD-RW, which has not been used lately, but I will bang on it and report any new data. Nothing obvious in logs after hard lockup. Will provide any additional info that might help on request.

[root@radar0 root]# lspci
00:00.0 Host bridge: Advanced Micro Devices [AMD] AMD-760 MP [IGD4-2P] System Controller (rev 11)
00:01.0 PCI bridge: Advanced Micro Devices [AMD] AMD-760 MP [IGD4-2P] AGP Bridge
00:07.0 ISA bridge: Advanced Micro Devices [AMD] AMD-766 [ViperPlus] ISA (rev 02)
00:07.1 IDE interface: Advanced Micro Devices [AMD] AMD-766 [ViperPlus] IDE (rev 01)
00:07.3 Bridge: Advanced Micro Devices [AMD] AMD-766 [ViperPlus] ACPI (rev 01)
00:07.4 USB Controller: Advanced Micro Devices [AMD] AMD-766 [ViperPlus] USB (rev 07)
00:09.0 RAID bus controller: 3ware Inc 3ware 7000-series ATA-RAID (rev 01)
00:0c.0 Multimedia audio controller: Creative Labs SB Live! EMU10k1 (rev 07)
00:0c.1 Input device controller: Creative Labs SB Live! MIDI/Game Port (rev 07)
00:0d.0 SCSI storage controller: Adaptec AIC-7899P U160/m (rev 01)
00:0d.1 SCSI storage controller: Adaptec AIC-7899P U160/m (rev 01)
00:0f.0 Ethernet controller: 3Com Corporation 3c980-C 10/100baseTX NIC [Python-T] (rev 78)
00:10.0 Ethernet controller: 3Com Corporation 3c980-C 10/100baseTX NIC [Python-T] (rev 78)
01:05.0 VGA compatible controller: ATI Technologies Inc Radeon R200 QL [Radeon 8500 LE]

Just finished burning FC2 CDs 1 - 4 at 20x on a Yamaha CRW2200S using K3B without incident. It played the "horse race" sound on completion. SCSI CD writing seems not to be an issue for me.

I just installed FC2 yesterday, as a fresh install. I ran up2date and installed all updated packages, including kernel 2.6.6-1.435.2.1. The system runs fine, at least for now, with kernel 2.6.5, but I had two freezes when trying to boot into 2.6.6. The same hardware configuration ran fine for many months with FC1. I only tried to boot 2.6.6 twice, so I haven't had KDE or Gnome running under it; I don't want to risk any system corruption.

Hardware:
- Asus A7N8X motherboard, Athlon 2400+, 512MB Samsung PC2700
- Geforce4 Ti 4200
- Soundblaster Live!
- Canon Lide 30 scanner
- HP Laserjet 4+
- Logitech 300MX optical mouse
- Logitech keyboard
- 3Com netcard, driver 3c59x
- Seagate Barracuda IDE 80GB hdd
- Asus CD-ROM
- Lite-On DVD burner

This bug may be related to the one I posted in 126391. It's hard to tell from this end; however, both involve problems on Athlon systems using the aic7xxx driver, and both include system hangs or slowdowns. So, fwiw, I was asked to post a note on this bug referencing the bug I had posted.

I have been having this problem since FC2 test. All kernels seem to have the problem for me. Total X lockup. Only the mouse will move, and there is no keyboard response. Ctl-Alt-BkSp is non-functional. I have no SCSI, just an IDE drive and a GeForce3 Ti 200 card using the standard stock Fedora/xorg drivers. I can go to another computer and log in using vnc with no problem and restart or shut down the system and/or X. Via vnc I observe that the X process is running at about 97% CPU with 3% memory when locked. CPU is a 500 MHz Athlon. Memtest works fine on overnight tests. I see nothing in the logs indicating a problem. Ctl-Alt-(keypad)/ or (keypad)* are enabled in xorg.conf but do nothing during the lock. I have reported this several times on the list. I have completely updated FC2. At times it has seemed to be fixed, but it always occurs eventually (anywhere from several times a day to once a week, whether X is heavily loaded with lots of windows or has just a few). I have seen it with KDE and Gnome.

Created attachment 104823
/proc info

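For anyone who wants to confirm the "X at about 97% CPU" observation from a remote shell without top, here is a minimal Python sketch that samples /proc/<pid>/stat twice and computes the CPU share over the interval. The function names (`parse_stat`, `cpu_percent`, `sample_cpu`) and the sample stat line are hypothetical and only for illustration; they are not taken from this report.

```python
import os
import time


def parse_stat(stat_text):
    """Return (comm, utime_ticks, stime_ticks) from the text of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the first '(' and the LAST ')'.
    lparen = stat_text.index("(")
    rparen = stat_text.rindex(")")
    comm = stat_text[lparen + 1:rparen]
    rest = stat_text[rparen + 1:].split()
    # rest[0] is field 3 (state); utime and stime are fields 14 and 15.
    return comm, int(rest[11]), int(rest[12])


def cpu_percent(ticks_before, ticks_after, interval_s, ticks_per_s):
    """CPU usage in percent of one CPU over the sampling interval."""
    return 100.0 * (ticks_after - ticks_before) / (ticks_per_s * interval_s)


def sample_cpu(pid, interval_s=3.0):
    """Sample a live process twice (Linux only) and return its CPU percentage."""
    ticks_per_s = os.sysconf("SC_CLK_TCK")
    path = "/proc/%d/stat" % pid
    with open(path) as f:
        _, u0, s0 = parse_stat(f.read())
    time.sleep(interval_s)
    with open(path) as f:
        _, u1, s1 = parse_stat(f.read())
    return cpu_percent(u0 + s0, u1 + s1, interval_s, ticks_per_s)


# Hypothetical stat line, made up for illustration only.
SAMPLE = "1234 (X) R 1 1234 1234 0 -1 4194560 100 0 0 0 5000 4700 0 0 15 0 1 0 500"
```

A process stuck in a busy loop should report close to 100 on repeated calls to `sample_cpu(pid)`, while a process that is merely blocked should report close to 0.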
I forgot to add that when the X server is tying up a CPU, it is looping thusly:

ioctl(8, 0xc0286429, 0xfef9e980) = -1 EBUSY (Device or resource busy)
--- SIGALRM (Alarm clock) @ 0 (0) ---
sigreturn() = ? (mask now [])
ioctl(8, 0xc0286429, 0xfef9e980) = -1 EBUSY (Device or resource busy)
--- SIGALRM (Alarm clock) @ 0 (0) ---
sigreturn() = ? (mask now [])
ioctl(8, 0xc0286429, 0xfef9e980) = -1 EBUSY (Device or resource busy)
ioctl(8, 0xc0286429, 0xfef9e980) = -1 EBUSY (Device or resource busy)

and if I read it correctly, lsof says that file descriptor 8 is:

X 10717 root 8u CHR 226,0 65536 /dev/dri/card0

Fedora Core 2 has now reached end of life, and no further updates will be provided by Red Hat. The Fedora legacy project will be producing further kernel updates for security problems only. If this bug has not been fixed in the latest Fedora Core 2 update kernel, please try to reproduce it under Fedora Core 3, and reopen if necessary, changing the product version accordingly. Thank you.
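
As a reading aid for the strace output above, the request number passed to the repeated ioctl on descriptor 8 can be split into its fields. This is a minimal Python sketch, assuming the generic Linux `_IOC` bit layout used on x86 (nr in bits 0-7, type in bits 8-15, size in bits 16-29, direction in bits 30-31). The helper name `decode_ioctl` is hypothetical.

```python
def decode_ioctl(request):
    """Split a Linux ioctl request number into its _IOC fields."""
    # Assumed layout (generic/x86): nr bits 0-7, type bits 8-15,
    # size bits 16-29, direction bits 30-31.
    directions = {0: "none", 1: "write", 2: "read", 3: "read/write"}
    return {
        "nr": request & 0xFF,
        "type": chr((request >> 8) & 0xFF),
        "size": (request >> 16) & 0x3FFF,
        "dir": directions[(request >> 30) & 0x3],
    }


fields = decode_ioctl(0xC0286429)
print(fields)  # {'nr': 41, 'type': 'd', 'size': 40, 'dir': 'read/write'}
```

The type character 'd' is the magic used by DRM ioctls, which is consistent with lsof reporting /dev/dri/card0 for descriptor 8. Request 0x29 with a 40-byte read/write argument appears to match DRM_IOCTL_DMA in the kernel's drm.h, but that mapping has not been checked against the exact kernel and X server versions in this report.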