Bug 67737
Summary: | kernel hangs without oopses (block system?) | ||
---|---|---|---|
Product: | [Retired] Red Hat Linux | Reporter: | Pekka Savola <pekkas> |
Component: | kernel | Assignee: | Arjan van de Ven <arjanv> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Brian Brock <bbrock> |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 7.3 | CC: | alvarezp, brian.t.brunner, csieh, mjeffery |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | i386 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2004-09-30 15:39:43 UTC | Type: | --- |
Description
Pekka Savola
2002-07-01 11:33:59 UTC
My problem seems somewhat similar, and it is reproducible on two systems.

Hardware: single Pentium MMX Kontron/Jumptec ETX-MGX board on an in-house carrier card, with a single IBM Travelstar laptop drive. Kernel mods: IRQ0 is sharable, HZ=500.

Stress test:

    cd /usr/src/linux-2.4.18-3;while(`true`);do make dep clean bzImage;done

This runs for between 9 and 18 hours, then the system hangs with no symptoms and no messages, dead to the world. ssh sessions hang. The system is headless (no VGA console), so a pushbutton reset is required.

Attempting work-around: migrate to the 2.4.20 kernel; change fstab ext3->ext2; set nmi_watchdog=1.

I will happily add whatever other instrumentation might help; I just need instructions for dummies. The same board runs full-speed memtest86 for hours and DOS-based ISA device tests for *days*.

Follow-up: the ETX-format CPU card (Kontron/Jumptec ETX-MGX) is on a carrier that has a debug port at 0x228, similar to the traditional port 80.

I downloaded 2.4.20 ("latest stable"), ran menuconfig, built it, ran lilo, and it booted fine. Kernel changes: HZ=500; IRQ0 sharable; modification to timer_interrupt:

    unsigned char intCount;
    ++intCount;
    outb_p(intCount,0x228);

I started the stress test (n.b. run as user root):

    cd /usr/src/linux-2.4.20;while(`true`);do make clean bzImage;done

and the kernel hung. top(1) (run as user root) last said:

    9:56pm up 1:33, 3 users, load average: 1.20, 1.26, 1.20
    31 processes: 28 sleeping, 3 running, 0 zombie, 0 stopped
    CPU states: 95.4% user, 4.5% system, 0.0% nice, 0.0% idle
    Mem: 60696K av, 58460K used, 2236K free, 0K shrd, 4916K buff
    Swap: 152608K av, 116K used, 152492K free 29480K cached

tail -f /var/log/messages (run as user root) last said, about 1 hour earlier:

    May 17 20:27:04 mcut16 ntpd[519]: kernel time discipline status change 41

The debug port shows the timer interrupt counter NOT incrementing, ergo the kernel is not running at any level.
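For anyone repeating this check, here is a minimal sketch of how a series of values read off the debug port could be judged. It is not part of the kernel patch. It assumes the values were captured externally (for example, noted from a port display on the carrier card), and the names `ticks_between`, `looks_hung`, and `stalled_needed` are hypothetical. Because intCount is an unsigned char and HZ=500, the counter wraps every 256/500 = 0.512 s, so a single pair of equal readings can be a coincidence. The sketch therefore requires several consecutive unchanged readings, ideally taken at irregular intervals.

```python
from typing import Sequence

# intCount in the patch is an unsigned char, so the value written to the
# debug port wraps around after 256 timer interrupts.
COUNTER_MODULUS = 256


def ticks_between(prev: int, curr: int) -> int:
    """Increments between two readings, assuming fewer than 256 occurred."""
    return (curr - prev) % COUNTER_MODULUS


def looks_hung(samples: Sequence[int], stalled_needed: int = 3) -> bool:
    """True if the last `stalled_needed` consecutive reading pairs are unchanged.

    `samples` is a hypothetical list of values read from the debug port,
    oldest first. Requiring several unchanged pairs guards against the
    counter having wrapped to the same value between two readings.
    """
    if len(samples) < stalled_needed + 1:
        return False  # not enough evidence yet
    tail = samples[-(stalled_needed + 1):]
    return all(ticks_between(a, b) == 0 for a, b in zip(tail, tail[1:]))
```

A sequence such as 10, 77, 77, 77, 77 would be reported as hung, while 10, 77, 77, 78 would not, since the last reading still advanced.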
hdparm /dev/hda shows:

    /dev/hda:
     multcount    = 16 (on)
     I/O support  =  0 (default 16-bit)
     unmaskirq    =  0 (off)
     using_dma    =  1 (on)
     keepsettings =  0 (off)
     nowerr       =  0 (off)
     readonly     =  0 (off)
     readahead    =  8 (on)
     geometry     = 730/255/63, sectors = 11733120, start = 0
     busstate     =  1 (on)

kernel .config available upon request (776 lines). dmesg available upon request (101 lines). Will perform whatever tests might help find this problem's source and cure it.

Thanks for the bug report. However, Red Hat no longer maintains this version of the product. Please upgrade to the latest version and open a new bug if the problem persists. The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, and if you believe this bug is interesting to them, please report the problem in the bug tracker at: http://bugzilla.fedora.us/