From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); de-DE; rv:1.8) Gecko/20051202 Fedora/1.5-1 Firefox/1.5

Description of problem:
After updating to kernel-2.6.15-1.1829_FC4 and udev-071-0.FC4.2, DMA is not working for my IDE hard disk at boot (for the error messages, see URL), but it works if I enable it using hdparm. With 1824 and udev-058 I did not have this problem.

Version-Release number of selected component (if applicable):
kernel-2.6.15-1.1829_FC4

How reproducible:
Always

Steps to Reproduce:
1. boot
2. look at dmesg or hdparm /dev/hda

Actual Results:
no DMA at boot + error messages

Expected Results:
DMA should be enabled at boot + no error messages

Additional info:
00:00.0 Memory controller: nVidia Corporation CK804 Memory Controller (rev a3)
00:01.0 ISA bridge: nVidia Corporation CK804 ISA Bridge (rev a3)
00:01.1 SMBus: nVidia Corporation CK804 SMBus (rev a2)
00:02.0 USB Controller: nVidia Corporation CK804 USB Controller (rev a2)
00:02.1 USB Controller: nVidia Corporation CK804 USB Controller (rev a3)
00:06.0 IDE interface: nVidia Corporation CK804 IDE (rev a2)
00:08.0 IDE interface: nVidia Corporation CK804 Serial ATA Controller (rev a3)
00:09.0 PCI bridge: nVidia Corporation CK804 PCI Bridge (rev a2)
00:0a.0 Bridge: nVidia Corporation CK804 Ethernet Controller (rev a3)
00:0b.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
00:0c.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
00:0d.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
00:0e.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
01:07.0 Multimedia audio controller: Creative Labs SB Live! EMU10k1 (rev 0a)
01:07.1 Input device controller: Creative Labs SB Live! MIDI/Game Port (rev 0a)
01:09.0 FireWire (IEEE 1394): VIA Technologies, Inc. IEEE 1394 Host Controller (rev 80)
01:0a.0 Ethernet controller: Marvell Technology Group Ltd. 88E8001 Gigabit Ethernet Controller (rev 13)
05:00.0 VGA compatible controller: nVidia Corporation GeForce 7800 GTX (rev a1)
This is a mass-update to all currently open kernel bugs. A new kernel update has been released (Version: 2.6.15-1.1830_FC4) based upon a new upstream kernel release. Please retest against this new kernel, as a large number of patches go into each upstream release, possibly including changes that may address this problem. This bug has been placed in NEEDINFO_REPORTER state. Due to the large volume of inactive bugs in bugzilla, if this bug is still in this state in two weeks time, it will be closed. Should this bug still be relevant after this period, the reporter can reopen the bug at any time. Any other users on the Cc: list of this bug can request that the bug be reopened by adding a comment to the bug. If this bug is a problem preventing you from installing the release this version is filed against, please see bug 169613. Thank you.
this bug still exists in 2.6.15-1.1830_FC4 (which is almost the same as 1829).
Could this bug be more udev related than kernel related? Alan Cox wrote on the fedora-test-list that some app is trying to read beyond the end of the device. If I look at the dmesg output I see this:

---
Probing IDE interface ide0...
hda: ST340823A, ATA DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Probing IDE interface ide1...
hdc: LITE-ON LTR-52246S, ATAPI CD/DVD-ROM drive
hdd: LITE-ON DVDRW LDW-811S, ATAPI CD/DVD-ROM drive
ide1 at 0x170-0x177,0x376 on irq 15
hda: max request size: 128KiB
hda: Host Protected Area detected.
    current capacity is 78156288 sectors (40016 MB)
    native capacity is 78156289 sectors (40016 MB)
hda: Host Protected Area disabled.
hda: 78156289 sectors (40016 MB) w/1024KiB Cache, CHS=65535/16/63, UDMA(100)
hda: cache flushes not supported
 hda: hda1
hdc: ATAPI 52X CD-ROM CD-R/RW drive, 2048kB Cache, UDMA(33)
Uniform CD-ROM driver Revision: 3.20
hdd: ATAPI 40X DVD-ROM CD-R/RW drive, 2048kB Cache, UDMA(33)
ide-floppy driver 0.99.newide
---

and at the end, after the SELinux init:

-----------------
SELinux: initialized (dev bdev, type bdev), uses genfs_contexts
SELinux: initialized (dev rootfs, type rootfs), uses genfs_contexts
SELinux: initialized (dev sysfs, type sysfs), uses genfs_contexts
SELinux: initialized (dev usbfs, type usbfs), uses genfs_contexts
hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hda: dma_intr: error=0x10 { SectorIdNotFound }, LBAsect=78156288, sector=78156288
ide: failed opcode was: unknown
hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hda: dma_intr: error=0x10 { SectorIdNotFound }, LBAsect=78156288, sector=78156288
ide: failed opcode was: unknown
hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
[...]
-----

Is this correct, or did I miss something? Should this bug be reassigned to udev?
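To make the comparison in the dmesg output above easier to see, here is a minimal Python sketch that pulls the failing LBA and the two reported capacities out of the log and sets them side by side. The function names (`failing_sectors`, `capacities`) are made up for this illustration, and the sample text is just a few lines copied from the output above; it is not a diagnostic tool.

```python
import re

# A few lines copied from the dmesg output quoted in this comment.
DMESG = """\
hda: Host Protected Area detected.
    current capacity is 78156288 sectors (40016 MB)
    native capacity is 78156289 sectors (40016 MB)
hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hda: dma_intr: error=0x10 { SectorIdNotFound }, LBAsect=78156288, sector=78156288
"""


def failing_sectors(log):
    """Return every LBA named in a dma_intr error line (hypothetical helper)."""
    return [int(n) for n in re.findall(r"LBAsect=(\d+)", log)]


def capacities(log):
    """Return the 'current' and 'native' sector counts from the HPA lines."""
    pattern = r"(current|native) capacity is (\d+) sectors"
    return {kind: int(n) for kind, n in re.findall(pattern, log)}


caps = capacities(DMESG)
for lba in failing_sectors(DMESG):
    # LBAs are zero-based, so a disk with N sectors has a highest valid LBA of N - 1.
    print("failing LBA:", lba)
    print("last LBA within current capacity:", caps["current"] - 1)
    print("last LBA within native capacity: ", caps["native"] - 1)
```

With these numbers the failing LBA is one past the end of the "current" capacity and exactly the last sector of the "native" capacity, which is why the one-sector difference reported by the HPA code looks relevant.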
The same problem exists when I use 2.6.16-rc5 (vanilla).
http://bugzilla.kernel.org/show_bug.cgi?id=6162 Seems like IDE is somehow broken on my box? Has anything changed about IDE from udev 058 -> 071?
I googled for it and found this: http://forums.gentoo.org/viewtopic.php?t=372550 (no solution) could lvm be buggy? (I don't use it)
According to the Seagate manual for your drive it should have 78,165,360 sectors.
http://www.seagate.com/support/disc/manuals/ata/u5pmb01.pdf

The figure for your ST340823A drive doesn't match. It is missing 9072 sectors (78165360 vs 78156288). I don't know where the difference comes from. Two possibilities:

1) Enable LBA mode in your BIOS.
What does "hdparm -i /dev/hda" report? It should tell you about the current LBA status and number of LBA sectors.

2) The drive might be losing some capacity due to some failing sectors.
Could you try inspecting the SMART stats of the drive and get it to run a thorough disk test?
# smartctl -s on /dev/hda
# smartctl -a /dev/hda
# smartctl -t long /dev/hda

I did a Google search for kernel boot logs and found a couple of references where people have that same drive working on a 2.6.15 kernel with the correct number of sectors, e.g. http://kerneltrap.org/node/6306

...
Linux version 2.6.15.5 (root@dadslinux) (gcc version 3.3.6) #1 PREEMPT Tue Mar 7 09:24:36 CST 2006
...
hda: ST340823A, ATA DISK drive
hdb: ST340823A, ATA DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Probing IDE interface ide1...
hdc: CD-W58E, ATAPI CD/DVD-ROM drive
ide1 at 0x170-0x177,0x376 on irq 15
hda: max request size: 128KiB
hda: Host Protected Area detected.
    current capacity is 78165360 sectors (40020 MB)
    native capacity is 78165361 sectors (40020 MB)
hda: Host Protected Area disabled.
hda: 78165361 sectors (40020 MB) w/512KiB Cache, CHS=65535/16/63, UDMA(66)
hda: cache flushes not supported
 hda: hda1 hda2 < hda5 hda6 hda7 hda8 hda9 hda10 hda11 >
hdb: max request size: 128KiB
hdb: 78165360 sectors (40020 MB) w/1024KiB Cache, CHS=65535/16/63, UDMA(66)
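For reference, the arithmetic behind the "missing 9072 sectors" figure can be checked with a short Python sketch. It assumes the usual 512-byte sector size, which is not stated anywhere in this bug, and the helper name `decimal_mb` is only illustrative.

```python
SECTOR_BYTES = 512  # assumption: standard ATA sector size, not stated in this bug

manual_sectors = 78165360    # figure quoted from the Seagate manual
reported_sectors = 78156288  # figure reported for the drive in this bug

missing = manual_sectors - reported_sectors
missing_bytes = missing * SECTOR_BYTES


def decimal_mb(sectors):
    """Capacity in decimal megabytes (10**6 bytes), rounded down."""
    return sectors * SECTOR_BYTES // 10**6


print("missing sectors:", missing)
print("missing bytes:  ", missing_bytes)
print("reported size:  ", decimal_mb(reported_sectors), "MB")
print("manual size:    ", decimal_mb(manual_sectors), "MB")
```

Under that assumption the difference is only a few megabytes, and the two totals agree with the "40016 MB" and "40020 MB" values printed in the respective boot logs.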
(In reply to comment #7)
> According to the Seagate manual for your drive it should have 78,165,360
> sectors. http://www.seagate.com/support/disc/manuals/ata/u5pmb01.pdf
>
> The figure for your ST340823A drive doesn't match. It is missing 9072 sectors
> (78165360 vs 78156288). I don't know where the difference comes from. Two
> possibilities:-
>
> 1) Enable LBA mode in your BIOS.
> What does "hdparm -i /dev/hda" report? It should tell you about the current LBA
> status and number of LBA sectors.
>

/dev/hda:

 Model=ST340823A, FwRev=3.05, SerialNo=6EF0CAAP
 Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs RotSpdTol>.5% }
 RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=4
 BuffType=unknown, BuffSize=1024kB, MaxMultSect=16, MultSect=1
 CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=78156288
 IORDY=on/off, tPIO={min:240,w/IORDY:120}, tDMA={min:120,rec:120}
 PIO modes: pio0 pio1 pio2 pio3 pio4
 DMA modes: mdma0 mdma1 mdma2
 UDMA modes: udma0 udma1 udma2 udma3 udma4 *udma5
 AdvancedPM=no WriteCache=enabled
 Drive conforms to: Unspecified: ATA/ATAPI-1 ATA/ATAPI-2 ATA/ATAPI-3 ATA/ATAPI-4

 * signifies the current active mode

> 2) The drive might be losing some capacity due to some failing sectors.
> Could you try inspecting the SMART stats of the drive and get it to run a
> thorough disk test?
> # smartctl -s on /dev/hda
> # smartctl -a /dev/hda
> # smartctl -t long /dev/hda
>

No, the drive seems to be OK:

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%     15837         -
# 2  Short offline       Completed without error       00%     13627         -
# 3  Short offline       Completed without error       00%     13627         -
# 4  Short offline       Completed without error       00%     12063         -
# 5  Short offline       Completed without error       00%     12063         -
# 6  Short captive       Completed without error       00%         1         -

> I did a Google search for kernel boot logs and found a couple of references
> where people have that same drive working on a 2.6.15 kernel with the correct
> number of sectors, e.g. http://kerneltrap.org/node/6306
>
> ...
> Linux version 2.6.15.5 (root@dadslinux) (gcc version 3.3.6) #1 PREEMPT Tue Mar 7
> 09:24:36 CST 2006
> ...
> hda: ST340823A, ATA DISK drive
> hdb: ST340823A, ATA DISK drive
> ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
> Probing IDE interface ide1...
> hdc: CD-W58E, ATAPI CD/DVD-ROM drive
> ide1 at 0x170-0x177,0x376 on irq 15
> hda: max request size: 128KiB
> hda: Host Protected Area detected.
> current capacity is 78165360 sectors (40020 MB)
> native capacity is 78165361 sectors (40020 MB)
> hda: Host Protected Area disabled.
> hda: 78165361 sectors (40020 MB) w/512KiB Cache, CHS=65535/16/63, UDMA(66)
> hda: cache flushes not supported
> hda: hda1 hda2 < hda5 hda6 hda7 hda8 hda9 hda10 hda11 >
> hdb: max request size: 128KiB
> hdb: 78165360 sectors (40020 MB) w/1024KiB Cache, CHS=65535/16/63, UDMA(66)
>
The problem still exists in FC5 (2.6.16-1.2074_FC5).
What do the top-level SMART stats for the drive look like? There should be a line like:

SMART overall-health self-assessment test result: PASSED

followed by a table showing the power-on hours, number of remapped sectors etc. Could you write all the output from "smartctl -a /dev/hda" to a file and attach that to this bug?

It may be possible for a drive to pass the tests even if it has marked 9000 sectors as bad. I had a new drive once which initially reported a bad sector, but the bad sector disappeared (it was remapped by the firmware) once I'd wiped the drive. I believe drives only have a small number of reserved sectors available for remapping before you start losing real capacity from the drive.

How about trying the disk diagnostic tools from Seagate? They ought to reliably detect the number of sectors on the drive. They come as a bootable CD image, so they can run on anything which can boot from a CD-ROM.
http://www.seagate.com/support/seatools/B7a.html

I'm wondering whether the problem is with the kernel trying to access one more sector than the drive has available, or whether the drive claims to have a sector which it isn't able to read. It is interesting that the kernel HPA code appears to increment the drive sector count by 1; perhaps this is where the problem lies. Unfortunately I can't see an easy user-visible way to switch this off without patching and recompiling the kernel.

Note that the whole idea of clipping the disk capacity using these "host protected area" features was introduced around the time of the first 40GB disks (due to BIOS and hardware problems with disks > 33.8GB). I believe your disk is one of these very first 40GB disks to hit the market, so I wouldn't be too surprised if it was a firmware issue in the drive (or disk controller).
The "large disk howto" http://www.tldp.org/HOWTO/html_single/Large-Disk-HOWTO/#s11 notes that your Seagate drive might have some drive specific quirk: ...For models ST-340016A, ST-340823A, ST-340824A, ST-360021A, ST-380021A: The ATA Set Features F1 sub-command will cause Identify Data words 60-61 to report the true full capacity. One fix might be to repartition the drive so that you don't use the last few MB of the drive.
It might be worth taking a look here http://www.ussg.iu.edu/hypermail/linux/kernel/0104.3/1190.html This is another user with the same drive as you where the drive doesn't seem to respond to the normal HPA command sequence, but does work with the "seagate.c" program which he attached to that message. Maybe it is worth trying that program and see if hdparm reports the full capacity afterwards?
Created attachment 126787 [details] smartctl output

I will provide more info when I get home, but for now I can tell you that I already do not use the last 8 MB of the disk. The smartctl output is attached.
I believe the bug is actually in the official NVidia driver. After reinstalling FC5 a couple of times, and both times getting system libraries corrupted because of this bug, I decided to get a new drive. While I waited for the new drive (assuming this was a Linux kernel issue) I installed FreeBSD 6.1 on the ST340823A. The installation went fine and the drive showed no signs of errors, that is, until I installed the NVIDIA-FreeBSD-x86-1.0-8756 driver. After that, the same attempts to read past the end of the disk showed up. At first I thought the drive was faulty, but after realizing that it was working fine before installing the NVidia driver, I decided to remove the driver, and everything is back to normal. The bug just has to be there.
[This comment added as part of a mass-update to all open FC4 kernel bugs] FC4 has now transitioned to the Fedora legacy project, which will continue to release security related updates for the kernel. As this bug is not security related, it is unlikely to be fixed in an update for FC4, and has been migrated to FC5. Please retest with Fedora Core 5. Thank you.
still happens with 2.6.17-1.2187_FC5
If it's only reproducible with the Nvidia-provided 3D drivers installed on the machine then you need to take it up with Nvidia instead. They have the Linux code; we don't have theirs, so only they can debug it.
for me it happens with and without the nvidia drivers.
Dave - did you turn on the GPT partitioning stuff?

Working through all of this:
- The size issue is a red herring. Actual size depends on disk, firmware and geometry.
- The error trace is an attempt to read the last valid sector. I suspect, however, that we issued a 1K read for it.
- It's almost the only drive type that has both odd sector counts *and* errors on a 1K read for the last sector.

That to me smells like the "last sector" peek stuff one of the partition table handlers uses would be the logical trigger, especially as the box then runs fine.
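To illustrate the suspicion above (it is a hypothesis in this comment, not a confirmed diagnosis), here is a small Python sketch of why a 1K read aimed at the last sector of a disk with an odd sector count runs off the end, while a single-sector read does not. The 512-byte sector size and the helper name `read_fits` are assumptions made for this example; the sector count is the native capacity from the reporter's dmesg.

```python
SECTOR_BYTES = 512  # assumption: standard ATA sector size


def read_fits(total_sectors, start_lba, length_bytes):
    """True if a read of length_bytes starting at start_lba stays on the device."""
    # Round the byte length up to whole sectors.
    sectors = -(-length_bytes // SECTOR_BYTES)
    return start_lba + sectors <= total_sectors


native_sectors = 78156289      # odd sector count from the reporter's dmesg
last_lba = native_sectors - 1  # LBAs are zero-based, so this is 78156288

# A single-sector read of the last LBA stays within the device.
print(read_fits(native_sectors, last_lba, 512))

# A 1K read starting at the same LBA needs LBA 78156289 too, which does not exist.
print(read_fits(native_sectors, last_lba, 1024))
```

Because the sector count is odd, the last sector sits at an even LBA, so a 1K-aligned block containing it necessarily extends one sector past the end of the media.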
You may want to check out bug 163418: https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=163418 I filed this a while ago when I couldn't get DMA on my DVD drive. What exact chipset are you using, e.g. nForce 4 or 5? FC5 has support for both of those, just as Fedora has support for my Intel 915 chipset. But the Fedora kernel doesn't load the correct drivers for me: it loads the IDE driver when it should load the SATA (libata) driver. I compile my own kernel with a custom config, so I don't have problems with new Fedora kernels. I can add: hdc=noprobe combined_mode=libata. Consider looking into this.
No, this isn't the problem. I have an nForce4 SLI chipset and DMA works fine for the CD drives (2), and I can enable DMA using hdparm (and get no errors after doing this).
A new kernel update has been released (Version: 2.6.18-1.2200.fc5) based upon a new upstream kernel release. Please retest against this new kernel, as a large number of patches go into each upstream release, possibly including changes that may address this problem. This bug has been placed in NEEDINFO state. Due to the large volume of inactive bugs in bugzilla, if this bug is still in this state in two weeks time, it will be closed. Should this bug still be relevant after this period, the reporter can reopen the bug at any time. Any other users on the Cc: list of this bug can request that the bug be reopened by adding a comment to the bug. In the last few updates, some users upgrading from FC4->FC5 have reported that installing a kernel update has left their systems unbootable. If you have been affected by this problem please check you only have one version of device-mapper & lvm2 installed. See bug 207474 for further details. If this bug is a problem preventing you from installing the release this version is filed against, please see bug 169613. If this bug has been fixed, but you are now experiencing a different problem, please file a separate bug for the new problem. Thank you.
This still happens :(
How can I disable LVM being started at boot? It seems that it's causing this.
Just wanted to note that 2.6.19-1.2888.fc6 still doesn't fix it. Maybe the old IDE -> libata IDE change in F7 will fix it? If someone has any idea which kind of info I can/should provide, feel free to ask.
kernel-2.6.20-1.2925.fc6 => still the same :(
This may actually have to do with an underpowered power supply. I have an nVidia 6800GT card, 2 IDE DVD-ROMs and 3 hard drives hooked to a 400W power supply. I replaced the Seagate drive this bug refers to with a Western Digital SATA drive; it ran fine for a while but then started giving similar problems. It turns out it was hooked to the same power line as the video card, and apparently that causes fluctuations on the power line. I made it so the DVD-ROMs are now the ones sharing the power line with the card, and now the HDs show no problems, though my DVD burner creates a lot of coasters.
I have a Tagan easycon 480W PSU and a 7800GTX but the videocard is attached to its own power cord (does not share with anything). the same drive also works fine with windows and all livecds I tested so far. (had not mounted it with a f7 livecd but I can try tomorrow if this helps)
Why does the kernel even allow reading beyond the end of a device? And why does it disable DMA in this case? Wouldn't returning EOF be the better action to take here?
That would require an essay sized answer for the disk question.
I'm going to close this WONTFIX, simply because the required work to fix it in the old IDE layer is huge and would be risky. Realistically it won't happen. The SCSI core (used by libata) appears to already handle the odd sized media cases correctly.
this is indeed fixed in kernel-2.6.21-1.3194.fc7 (which uses libata)