Bug 162314
Summary: | Complete system lockup when using IDE with DMA-enabled on ServerWorks chipset | | |
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Markus Hakansson <mh> |
Component: | kernel | Assignee: | Dave Jones <davej> |
Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | Brian Brock <bbrock> |
Severity: | high | Docs Contact: | |
Priority: | medium | | |
Version: | 4 | CC: | davej, pfrields, teicher-fedora, wtogami |
Target Milestone: | --- | | |
Target Release: | --- | | |
Hardware: | i686 | | |
OS: | Linux | | |
Whiteboard: | | | |
Fixed In Version: | | Doc Type: | Bug Fix |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2006-05-05 01:16:23 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Attachments: |
|
Description
Markus Hakansson
2005-07-02 11:13:24 UTC
Updated the kernel to 2.6.12-1.1387_FC4 and removed the tainting kernel-modules, exactly the same issue. I also removed the ide1=ata66 from the boot-parameters, no difference.

This is the dmesg output when probing IDE:

    ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
    SvrWks CSB5: IDE controller at PCI slot 0000:00:0f.1
    SvrWks CSB5: chipset revision 146
    SvrWks CSB5: not 100% native mode: will probe irqs later
    ide0: BM-DMA at 0x2000-0x2007, BIOS settings: hda:pio, hdb:pio
    ide1: BM-DMA at 0x2008-0x200f, BIOS settings: hdc:pio, hdd:pio
    Probing IDE interface ide0...
    hda: Compaq CRD-8402B, ATAPI CD/DVD-ROM drive
    ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
    Probing IDE interface ide1...
    hdc: HDS722516VLAT80, ATA DISK drive
    ide1 at 0x170-0x177,0x376 on irq 15
    hdc: max request size: 1024KiB
    hdc: 321672960 sectors (164696 MB) w/7938KiB Cache, CHS=20023/255/63
    hdc: cache flushes supported
    hdc: hdc1
    hda: ATAPI 40X CD-ROM drive, 128kB Cache

Noticed that there was an old configuration in rc.local that I had forgotten about:

    hdparm -d1 -Xudma4 /dev/hdc

When I removed it, the device stays in the working (but extremely slow) PIO mode. I also tested with:

    hdparm -d 1 /dev/hdc

Instead of freezing, the machine gave these errors:

    attempt to access beyond end of device
    hdc1: rw=0, want=34197063424, limit=321669432
    attempt to access beyond end of device
    hdc1: rw=0, want=34197210440, limit=321669432
    attempt to access beyond end of device
    <... snip ...>
    hdc1: rw=0, want=30201610288, limit=321669432
    attempt to access beyond end of device
    hdc1: rw=0, want=17181966336, limit=321669432
    EXT3-fs error (device hdc1): ext3_readdir: bad entry in directory #131089: rec_len is smaller than minimal - offset=0, inode=1179647, rec_len=2, name_len=12
    Aborting journal on device hdc1.
    ext3_abort called.
    EXT3-fs error (device hdc1): ext3_journal_start_sb: Detected aborted journal
    Remounting filesystem read-only
    hdc: DMA disabled
    __journal_remove_journal_head: freeing b_committed_data

When booting without ide1=ata66 and issuing 'hdparm -d1 /dev/hdc' the system cannot find a filesystem on the disk. If I instead use 'hdparm -d1 -X66 /dev/hdc' I can mount the device but the system freezes when copying a file. I also tried booting with noacpi, same result.

The BIOS appears to have selected PIO. We honour the BIOS by default in this case.

Yes, the BIOS does not have any way to change this setting. With the earlier kernels I could override the BIOS settings by passing 'ide1=ata66' and then setting 'hdparm -d1 -X66 /dev/hdc'. When using FC2 this worked.

[This comment has been added as a mass update for all FC4 kernel bugs. If you have migrated this bug from an FC3 bug today, ignore this comment.]

Please retest your problem with today's 2.6.12-1.1398_FC4 update.

If your problem involved being unable to boot, or some hardware not being detected correctly, please make sure your /etc/modprobe.conf is correct *BEFORE* installing any kernel updates. If in doubt, you can recreate this file using:

    mv /etc/sysconfig/hwconf /etc/sysconfig/hwconf.bak
    mv /etc/modprobe.conf /etc/modprobe.conf.bak
    kudzu

Thank you.

FWIW: I can get a lockup on 2.6.12-1.1398_FC4 pretty fast using the C version of scimark, compiled with gcc 2.95.3, -O2... nothing logged... don't know if this is related or not.

Further, an attempt to recompile gcc 2.95.3, just in case that was required, failed during make bootstrap while looking for a header file that no longer exists:

    /home/software/gcc-2.95.3/gcc/xgcc -B/home/software/gcc-2.95.3/gcc/ -B/usr/local/i686-pc-linux-gnu/bin/ -c -g -O2 -I. -I.
    -D_IO_MTSAFE_IO iogetline.c
    In file included from libio.h:167,
                     from iolibio.h:1,
                     from libioP.h:47,
                     from iogetline.c:26:
    /usr/include/bits/stdio-lock.h:24: lowlevellock.h: No such file or directory
    make[2]: *** [iogetline.o] Error 1
    make[2]: Leaving directory `/home/software/gcc-2.95.3/i686-pc-linux-gnu/libio'
    make[1]: *** [all-target-libio] Error 2
    make[1]: Leaving directory `/home/software/gcc-2.95.3'
    make: *** [bootstrap] Error 2

Created attachment 117329
Scimark2, C version compiled with gcc2.95.3
Here is the scimark2 binary compiled with gcc2.95.3,
built under FC3.
The compiler was built under FC2 or FC3, I don't recall
which.
Interestingly, the scimark2 binary provided does NOT cause a lockup on an otherwise identical FC4 software installation on a similar machine. The major differences are:

System that locks up: ECS [Apollo KT266/A/333] / AMD XP3000 / Nvidia GeForce 4/ti4200 AGP, running X with Nvidia 7667 drivers and using the system AGPGART, AGP 4x mode.

System that does not lock up: Foxconn [KT400/KT600 AGP] / AMD XP2600 / PCI S3 86c764/765 [Trio32/64/64V+], not running X.

I have reported this to Nvidia in their forum and via email on the outside chance that it involves their driver.

I think the missing lowlevellock.h file should be addressed.

Interestingly, disabling ACPI helps when the gcc 2.95.3 scimark2 is run in the console (X not running), but does not solve the problem... it just takes more runs to lock up. Having X running means it locks up at the first run. After a new boot, in console mode, the nvidia kernel module is not loaded, so this cannot involve their driver or other libraries.

I noticed via the nvidia log file that the 2.6.12-1.1398_FC4 kernel was compiled with gcc 4.0.0... I am going to try to rebuild with the current system 4.0.1.

In private mail, it transpired that Charles' issues were nothing to do with ServerWorks IDE, so ignore the last few comments related to this bug.

Today I retested this with 2.6.12-1.1447_FC4, with the same result. I passed ide1=dma66 to the kernel and ran:

    hdparm -d1 -Xudma4 /dev/hdc

I then tried to copy some files and it hung within 10 seconds.

Mass update to all FC4 bugs:

An update has been released (2.6.13-1.1526_FC4) which rebases to a new upstream kernel (2.6.13.2). As there were ~3500 changes upstream between this and the previous kernel, it's possible your bug has been fixed already. Please retest with this update, and update this bug if necessary. Thanks.

Tested with kernel 2.6.13-1.1526_FC4 and the problem still exists.

2.6.14-1.1637_FC4 has been released as an update for FC4.
Please retest with this update, as a large amount of code has been changed in this release, which may have fixed your problem. Thank you.

This is a mass-update to all currently open kernel bugs.

A new kernel update has been released (Version: 2.6.15-1.1830_FC4) based upon a new upstream kernel release. Please retest against this new kernel, as a large number of patches go into each upstream release, possibly including changes that may address this problem.

This bug has been placed in NEEDINFO_REPORTER state. Due to the large volume of inactive bugs in bugzilla, if this bug is still in this state in two weeks' time, it will be closed.

Should this bug still be relevant after this period, the reporter can reopen the bug at any time. Any other users on the Cc: list of this bug can request that the bug be reopened by adding a comment to the bug.

If this bug is a problem preventing you from installing the release this version is filed against, please see bug 169613.

Thank you.

Closing per previous comment.
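
For anyone revisiting this report, the numbers quoted in the dmesg output and in the "attempt to access beyond end of device" errors can be cross-checked with a little arithmetic. The sketch below is only an illustration and not part of the original report. It assumes 512-byte sectors and that hdc1 starts at sector 63 (the conventional DOS partition offset); neither is stated in the report. The function names are made up for this example.

```python
import re

# Values copied from the dmesg output and error messages quoted in this report.
DISK_SECTORS = 321672960      # "hdc: 321672960 sectors (164696 MB)"
CHS = (20023, 255, 63)        # "CHS=20023/255/63"
SECTOR_BYTES = 512            # ASSUMPTION: 512-byte sectors
PARTITION_START = 63          # ASSUMPTION: hdc1 starts at sector 63

ERRORS = """\
hdc1: rw=0, want=34197063424, limit=321669432
hdc1: rw=0, want=34197210440, limit=321669432
hdc1: rw=0, want=30201610288, limit=321669432
hdc1: rw=0, want=17181966336, limit=321669432
"""


def capacity_mb(sectors, sector_bytes=SECTOR_BYTES):
    """Capacity in decimal megabytes (10**6 bytes), rounded down."""
    return sectors * sector_bytes // 10**6


def chs_sectors(cylinders, heads, sectors_per_track):
    """Number of sectors addressable through the reported CHS geometry."""
    return cylinders * heads * sectors_per_track


def parse_wants(text):
    """Extract (want, limit) pairs from the quoted error lines."""
    pairs = re.findall(r"want=(\d+), limit=(\d+)", text)
    return [(int(want), int(limit)) for want, limit in pairs]


if __name__ == "__main__":
    print("capacity (MB):", capacity_mb(DISK_SECTORS))
    print("expected hdc1 size:", chs_sectors(*CHS) - PARTITION_START)
    for want, limit in parse_wants(ERRORS):
        print(f"want={want} is {want / limit:.1f}x the limit")
```

Under those assumptions the capacity and the hdc1 limit both agree with the values the kernel printed, so the partition table itself looks sane. Every quoted "want" value is more than 50 times the partition size, which would be consistent with the filesystem metadata being read back as garbage once DMA was switched on, rather than with a genuinely oversized filesystem. That reading is an inference from the numbers, not something confirmed in this bug.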