Bug 161743
Summary: | System hard hang under disk I/O | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Craig McLean <craig> |
Component: | kernel | Assignee: | Dave Jones <davej> |
Status: | CLOSED DUPLICATE | QA Contact: | Brian Brock <bbrock> |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | 4 | CC: | pfrields, teicher-fedora, wtogami |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | i686 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2005-07-08 00:58:48 UTC | Type: | --- |
Description
Craig McLean
2005-06-26 21:57:38 UTC
Created attachment 115994 [details]
Combined output from dmesg, lsmod, rpm -qa and lspci
Now on kernel 2.6.12-1.1400_FC5. Same fault. Another simple (and more reliable) way to trigger this problem is to run 'find /etc | xargs grep blahblahblah'. The system grinds for a while, then I get the old flashing caps-lock.

None of those things is strenuous for the disk drive. They are just ordinary operations. Try this under SuSE:

for i in `seq 100` ; do
find /usr/ -type f -exec md5sum \{\} \;
done > /tmp/list_md5
cat /tmp/list_md5 | sort | uniq -c | sort | less

Every count should be 100. What hard drive are you using? Does smartctl -a say the hard drive is bad? What mobo do you have? What does `strace ls -lR /etc` say? `hdparm`?

Fair enough, I assumed (I know, I know) that because the problem only occurred while the disk was thrashing (it's a laptop, so it's easy to hear the heads moving), an I/O+CPU combination was causing it. That script runs without error on either 2.4.24 or 2.6.12; I get 100 of everything. Anyhow, the disk is a TOSHIBA MK4025GAS, as stated in the dmesg attached to the original report. smartctl -a says the disk is OK. The machine is a Toshiba Tecra M2, as stated, so it's Toshiba's own motherboard. The output of hdparm -I /dev/hda1 and smartctl -a /dev/hda1 is in a new attachment. I'll attach an strace when I can get one, but the box will panic, so I'll need to get to init S and write it to a USB drive or something.

Created attachment 116231 [details]
output from smartctl -a and hdparm -I
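The md5sum loop suggested earlier in the thread hashes every regular file repeatedly and then checks that each (checksum, path) pair turns up the same number of times. A minimal Python sketch of the same consistency check follows. The function names `hash_file`, `repeated_hash_counts` and `inconsistent` are hypothetical, chosen for illustration; they are not part of any tool mentioned in this report.

```python
import hashlib
import os
from collections import Counter


def hash_file(path, chunk_size=1 << 16):
    """Return the MD5 hex digest of one file, read in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def repeated_hash_counts(root, passes=100):
    """Hash every regular file under root `passes` times.

    Returns a Counter keyed by (digest, path), mirroring `sort | uniq -c`
    applied to md5sum output lines.
    """
    counts = Counter()
    for _ in range(passes):
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                # Like `find -type f`: skip symlinks and special files.
                if os.path.islink(path) or not os.path.isfile(path):
                    continue
                try:
                    counts[(hash_file(path), path)] += 1
                except OSError:
                    continue  # unreadable file: skipped in this sketch
    return counts


def inconsistent(counts, passes=100):
    """Return paths whose (digest, path) pair was not seen exactly `passes` times."""
    return sorted({path for (_digest, path), seen in counts.items() if seen != passes})
```

Calling `inconsistent(repeated_hash_counts("/usr"))` would return an empty list when every read of every file produced the same checksum, which corresponds to the "100 of everything" result reported above. A file that changes on disk between passes would also be listed, so a non-empty result is only a hint, not proof of bad hardware.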
Created attachment 116232 [details]
output from "strace -f -- find /etc | xargs grep youwontfindme"
This is the strace output from the panic'd system. This command hangs the
system and requires a power-cycle to clear. Interestingly, if "/" is remounted
with "-o sync" this problem seems not to occur.
Created attachment 116233 [details]
gzipped output from "strace -f -- find /etc | xargs grep youwontfindme"
In case of blinking LEDs, the first priority should be getting the oops trace. Re-run the test with the console in text mode.

Sitting on the console shows a panic/oops string (about 2 pages), but this is a laptop with no RS-232 ports, only USB. LKCD requires me to rebuild an old kernel, and netdump (I believe) won't support the ipw2100. Also, my nice (point-and-shoot) digital camera won't focus properly on the TFT screen, so I can only get pretty blurry captures that way. Can you suggest a way of getting the oops output from the machine?

I notice you have the nvidia module loaded. Some versions of their driver (maybe they still do, I haven't looked) created a /dev node in /etc, which, when read, would crash the system. The symptoms you report are in line with what we saw in earlier reports from users of similar setups. Please reopen if you can reproduce without the nvidia driver present (and check your /etc for device nodes).

*** This bug has been marked as a duplicate of 73733 ***
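
The check for device nodes under /etc requested above can be done without opening any file, which matters here because reading the node is what is said to crash the system. A minimal Python sketch, assuming a hypothetical helper name `find_device_nodes`, inspects each directory entry with `os.lstat` only:

```python
import os
import stat


def find_device_nodes(root="/etc"):
    """Return paths under root that are character or block device nodes.

    Only os.lstat() is used, so no file is ever opened or read.
    """
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                mode = os.lstat(path).st_mode
            except OSError:
                continue  # entry vanished or is inaccessible: skipped in this sketch
            if stat.S_ISCHR(mode) or stat.S_ISBLK(mode):
                found.append(path)
    return sorted(found)
```

An empty list from `find_device_nodes("/etc")` would suggest that no such node is present. The shell equivalent is `find /etc \( -type b -o -type c \)`, which likewise only stats the entries.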