From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.2) Gecko/20021126

Description of problem:
I have a dual Athlon (AuthenticAMD AMD Athlon(tm) MP Processor 1800+ 1532 MHZ) on a Tyan MB. Video is MGA G550 AGP. Using Adaptec Raid-10 with several external disk drives.

Since upgrading to the last 2 or 3 Red Hat kernels, including the one referenced above, the machine hangs hard (no ctrl-alt-del, blank screen) after a few hours of uptime. The load on this system is not heavy, and it typically fails when no one is using it.

dmesg tells me:
AMD errata #22 may apply: Add "noapic" to the command line if system instability.

I have tried adding "noapic" to the boot parameters, and it had no effect. In fact, the message about "noapic" is still shown.

After several hours of googling, I'm confused about the status of this problem. I have seen recommendations for adding "mem=nopentium" to the boot command line, and I've also seen a statement that it's not necessary with recent kernels. AMD's site says that 'noapic' is only necessary with older kernels.

Version-Release number of selected component (if applicable):
2.4.18-19.7.xsmp

How reproducible:
Always

Steps to Reproduce:
1. Boot and use the system
2. Wait for hang

Actual Results:
System hangs after 10-36 hours

Expected Results:
System continues to run

Additional info:
See attached dmesg output
Created attachment 89012
dmesg output from the affected machine

Note that the "noapic" option is selected, and does not fix the problem.
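For anyone who wants to double-check that a boot option such as "noapic" actually reached the running kernel, here is a minimal sketch. It assumes the system exposes the boot command line at /proc/cmdline, and the helper name has_boot_param is made up for illustration. It uses simple whitespace splitting, so quoted values containing spaces are not handled.

```python
import os


def has_boot_param(cmdline: str, name: str) -> bool:
    """Return True if `name` appears as a whole token, bare or as name=value."""
    for token in cmdline.split():
        # Compare only the part before "=" so "mem=nopentium" matches "mem".
        key = token.split("=", 1)[0]
        if key == name:
            return True
    return False


if __name__ == "__main__":
    path = "/proc/cmdline"  # assumption: Linux with procfs mounted
    if os.path.exists(path):
        with open(path) as fh:
            cmdline = fh.read()
        for param in ("noapic", "mem"):
            state = "present" if has_boot_param(cmdline, param) else "absent"
            print(param, state)
```

This only shows whether the option was passed at boot. It says nothing about whether the kernel acted on it.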
Al, Arjan,

Do you happen to see the following error message in the logs?

APIC error on CPU0: 08(08)

I've built new Red Hat 7.3 CDs with the latest errata applied, including kernel 2.4.18-19.7.x. I can't install on a Tyan Tiger MPX (S2466N-4M, beta BIOS 2466403m) with dual 2200+ CPUs: the kernel 2.4.18-19.7.xBOOT won't stop spilling the error message posted above. The machine won't accept CTRL-ALT-DEL, only the reset button helps.

Installing with the original kernel that came with the installer CDs (2.4.18-3BOOT) works, and so does 2.4.18-4BOOT. I tried adding boot options like nosmp and noapic, to no avail.

In the end I stumbled upon this message:
http://hypermail.idiosynkrasia.net/linux-kernel/archived/2002/week29/0281.html
See the reply from Jack F. Vogel.

I don't know if this has anything to do with the hangs you see. I just thought I'd add the problems I have right now with Athlon MP and kernel 2.4.18-19.7.x.

Greetz
Marc
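If it helps to check for that message, here is a rough sketch that scans saved log text for lines of the form quoted above. The function name find_apic_errors is hypothetical, and the pattern is derived only from the sample line in this comment, so other kernels may format the message differently. The log file paths are whatever you pass on the command line.

```python
import re
import sys

# Pattern based on the sample "APIC error on CPU0: 08(08)" (assumed format).
APIC_ERROR = re.compile(
    r"APIC error on CPU(\d+): ([0-9a-fA-F]+)\(([0-9a-fA-F]+)\)"
)


def find_apic_errors(lines):
    """Return (cpu, first_value, second_value) for each matching line."""
    hits = []
    for line in lines:
        match = APIC_ERROR.search(line)
        if match:
            hits.append((int(match.group(1)), match.group(2), match.group(3)))
    return hits


if __name__ == "__main__":
    # Each argument is a path to a saved log, e.g. a file holding dmesg output.
    for path in sys.argv[1:]:
        with open(path, errors="replace") as fh:
            for cpu, first, second in find_apic_errors(fh):
                print(f"{path}: CPU{cpu} {first}({second})")
```

An empty result only means the message is not in the files that were scanned.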
I don't see the message you noted in the log. It appears (not confirmed yet) that the problem I was seeing may have been a hardware issue caused by heating. We added cooling to the machine and it now has a 6-day uptime. I'm willing to declare my problem solved if I get 10-day uptime.
OK, with over a month of uptime, I'm officially declaring this one a false alarm. The problem appears to have been cooling.