Red Hat Bugzilla – Full Text Bug Listing
|Summary:||[pata_it821x] Hang on module load - IRQ routing ?|
|Product:||[Fedora] Fedora||Reporter:||Paul Smith <phhs80>|
|Component:||kernel||Assignee:||Alan Cox <alan>|
|Status:||CLOSED CURRENTRELEASE||QA Contact:||Brian Brock <bbrock>|
|Version:||7||CC:||cebbert, cvizitiu, davej, gotenks, jeff, lance.raymond, martin.vgagern, rje, stsp|
|Fixed In Version:||188.8.131.52-91.fc7||Doc Type:||Bug Fix|
|Last Closed:||2007-10-03 20:13:14 EDT||Type:||---|
Description Paul Smith 2007-06-02 08:01:17 EDT
The upgrade from FC6 to F7 hangs at "Loading pata_it821x driver", so upgrading to F7 is not possible here.
Comment 1 Chris Lumens 2007-06-02 10:40:55 EDT
What do you see on tty3 when this happens?
Comment 2 Paul Smith 2007-06-02 16:04:46 EDT
> What do you see on tty3 when this happens?
It happens after the tty3 stage. I get a blue screen with 4 installation options and I choose the first one. Then I get another blue screen (text mode) and the reported problem when loading SCSI drivers. I do not know whether it matters, but I have a Pentium Dual Core and 3 IDE hard disks. It seems that a similar problem is happening to other people: http://marc.info/?l=fedora-list&m=118080895617949&w=2
Paul
Comment 3 john d 2007-06-03 15:20:38 EDT
the same happens to me when trying to do a fresh installation of fedora 7 on an empty harddisk (mb: asus p5gd2 - harddisk over SATA). john
Comment 4 Paul Smith 2007-06-03 15:24:06 EDT
The following links may be helpful: http://linux.derkeiler.com/Mailing-Lists/Kernel/2007-05/msg06964.html http://lists.opensuse.org/opensuse-bugs/2007-04/msg00978.html
Comment 5 Alan Cox 2007-06-04 11:00:59 EDT
Can you tell me which firmware the IT8212 in question has loaded (it'll say in the BIOS messages it displays)
Comment 6 Paul Smith 2007-06-04 11:30:57 EDT
> Can you tell me which firmware the IT8212 in question has loaded (it'll say in
> the BIOS messages it displays)
Could you please give me some details about how to obtain the information that you are asking for?
Paul
Comment 7 Rob Emanuele 2007-06-04 15:53:59 EDT
On my system with the same issue: GigaRAID BIOS v 1.41
Comment 8 Paul Smith 2007-06-04 16:31:55 EDT
Here: GigaRAID ATAPI BIOS v 1.71
Comment 9 Beat 2007-06-05 12:03:26 EDT
Hello, I have the same problem on an Asus P5GD2-Basic motherboard. Main BIOS V1007 Beta 2, ITE BIOS V184.108.40.2061. I have a Plextor PX-716A on master 1 and a hard disk on master 2. Everything works if the ITE8212 controller is disabled in the BIOS, but that is not a solution for me because I need the Plextor drive. Regards Beat
Comment 10 john d 2007-06-05 13:26:59 EDT
Hi, I have the same board as Beat. Disabling the ITE8212 is not an option for me either, as my only HD is connected through that controller... regards john
Comment 11 Paul Smith 2007-06-07 12:31:39 EDT
For people who want to install F7, there is the following workaround that I have found: Install F7 with yum, as described in: http://fedoraproject.org/wiki/YumUpgradeFaq In order to have F7 booting, I had to switch (in grub) to the previous kernel. Paul
Comment 12 Alan Cox 2007-06-07 12:45:43 EDT
Paul, can you attach an lspci -vvxxx and a dmesg of the F7 boot with the kernel that works (ie the old one) so I can collect more data on what is still a bit of a mystery
Comment 15 Paul Smith 2007-06-07 13:16:08 EDT
They are attached. The problem with the new kernel seems to occur at "starting udev" stage. Paul
Comment 16 john d 2007-06-11 07:03:05 EDT
One question: why is this bug classified as low severity and low priority? I'm not even able to use Fedora because of it, so at least for me this is more than just a minor bug... john
Comment 17 Alan Cox 2007-06-12 09:48:31 EDT
Because that is how the original reporter rated it
Comment 18 Paul Smith 2007-06-12 11:49:08 EDT
I cannot change priority; I can only change severity. I have changed the severity to "urgent".
Comment 19 Jim Dishaw 2007-06-27 08:28:35 EDT
Same problem here. Unable to do a new install of F7. Motherboard is ASUS P5LD2, CPU is Intel Core 2 Duo 3.4 GHz. Other OSes installed and running well are FC6 and StartCom Linux.
Comment 20 Chuck Ebbert 2007-06-28 16:10:20 EDT
*** Bug 245979 has been marked as a duplicate of this bug. ***
Comment 21 Alan Cox 2007-06-28 16:12:08 EDT
Been doing some testing on this one. On my test box it is now all working correctly in non-raid firmware mode with 2.6.22-rc6. RAID mode needs some further poking to deal with bugs in the emulation, but Tejun's latest patches should have that licked too.
Comment 22 Ciprian 2007-07-02 10:33:06 EDT
For anyone googling for a workaround to do a fresh installation from DVD (using the ISO available on 02 July 07), here are some tips. At least with a Gigabyte GA81945P, BIOS 1.75, do the following:
1. In the BIOS, disable "On-chip primary/secondary PCI IDE" and also RAID.
2. Physically disconnect any PATA device (e.g. DVD-ROM) from the motherboard (it doesn't matter whether it is on the ITE821x controller or the chipset).
3. Make sure you have only one HDD connected to the motherboard, on SATA0.
... problem "fixed"; you can now use an external USB DVD-ROM to install normally. As of 07 Jul '07 it still won't boot with any PATA device connected in parallel with the SATA drives (it freezes on starting udev), but you can use the external DVD-ROM unit until hopefully this bug gets fixed.
Comment 23 David Marsh 2007-07-02 11:19:24 EDT
Same problem here using an Abit GD8 motherboard with Intel 915P chipset and ITE 8211F IDE. In fact, this chipset became a problem with some update during FC5's life. The original FC5 always installed and booted fine (ignoring some later updates which break it). FC6 would also hang at boot after a clean install. Passing all-generic-ide to the kernel is a workaround in FC6 to get the system booted after install, but this workaround fails on the F7 DVD install and the F7 live CD, where the system hangs at the "loading pata_it821x" screen or hangs at loading the kernel in the live edition.
Comment 24 Alan Cox 2007-07-02 13:47:31 EDT
Ok with 2.6.22-rc7 all appears well with all the boards I can try and with both raid and non-raid mode.
Comment 25 john d 2007-07-12 12:33:50 EDT
(In reply to comment #24)
> Ok with 2.6.22-rc7 all appears well with all the boards I can try and with both
> raid and non-raid mode.
Is it possible to upgrade from FC6 in some way using this kernel? If so, is there a documented way to do so? Thanks a lot!
john
Comment 26 Paul Smith 2007-08-05 20:29:51 EDT
(In reply to comment #24)
> Ok with 2.6.22-rc7 all appears well with all the boards I can try and with both
> raid and non-raid mode.
The bug persists here. Hardware: GigaRAID ATAPI BIOS v 1.71.
Paul
Comment 27 Paul Smith 2007-08-05 20:34:21 EDT
(In reply to comment #26)
> (In reply to comment #24)
> > Ok with 2.6.22-rc7 all appears well with all the boards I can try and with both
> > raid and non-raid mode.
>
> The bug persists here. Hardware:
>
> GigaRAID ATAPI BIOS v 1.71.
With kernel 220.127.116.11-41.fc7.
Paul
Comment 28 Jeff Norden 2007-08-08 16:46:50 EDT
I'll confirm this same problem with a Supermicro PDSBA+ motherboard. The system reports an ITE BIOS version of 18.104.22.168. The only PATA device in the system is the cdrom. It does *not* seem to be fixed by the 2.6.22 kernel. I've confirmed this a couple of ways:
1) I used pungi to spin a minimal distro with the 22.214.171.124-41.fc7.x86_64 kernel. Booting the resulting CD still hangs at the "loading pata_it821x" step. The CD boots fine on other hardware, where I can confirm that the installer is using the new kernel.
2) The Supermicro system is running FC6 right now. Under FC6, I built a vanilla 126.96.36.199 kernel with pata_it821x included (i.e., just unpack the kernel, no patches; the only change made to the default settings is to add the pata_it821x module). With this kernel, the module does get loaded, but doesn't seem to do anything, since my cdrom still shows up as /dev/hda. However, the cdrom seems to work just fine.
3) I then rebuilt the kernel using the .config file from kernel-188.8.131.52-41.fc7.src.rpm. With this kernel, the system freezes at the "starting udev" step, which is when the module is loaded.
4) If I remove pata_it821x.ko from the appropriate subdirectory of /lib/modules/, then the kernel from (3) above boots fine, but with no way to access my PATA cdrom. I then copied the module back into its original location and did "modprobe pata_it821x". The module loaded and recognized the CD, and the system seemed to be OK for about 5 or 10 seconds, after which it froze up completely and needed a hard reboot.
The one possibly useful thing I noticed here is that the messages report that the cdrom is being set to use UDMA/33, while under the vanilla 184.108.40.206 kernel or the standard FC6 kernel, /proc/ide/hda/settings reports a speed of 66.
I'm attaching 3 files below. The first is an lspci output, the second is the relevant dmesg output from the vanilla kernel, and the third is the output that results from "modprobe pata_it821x" before the system freezes. I'll try to figure this out some more if I can find some time. Any pointers on what to try would be great.
Thanks, -Jeff
Comment 29 Jeff Norden 2007-08-08 16:49:59 EDT
Created attachment 160935 [details] lspci output for supermicro pdsba+
Comment 30 Jeff Norden 2007-08-08 16:53:59 EDT
Created attachment 160936 [details] portion of dmesg with vanilla 220.127.116.11 kernel
Comment 31 Jeff Norden 2007-08-08 16:56:33 EDT
Created attachment 160937 [details] messages after modprobe_it831x before crash
Comment 32 Chuck Ebbert 2007-08-08 17:04:02 EDT
(In reply to comment #28)
> 2) The supermicro system is running FC6 right now. Under FC6, I built a
> vanilla 18.104.22.168 kernel with pata_it821x included (i,e, just unpack the
> kernel, no patches, the only change made to default settings is to add the
> pata_it821x module). With this kernel, the module does get loaded, but
> doesn't seem to do anything since my cdrom still shows up as /dev/hda.
> However, the cdrom seems to work just fine.
Add the kernel parameter "combined_mode=libata" to try to make the IDE driver get out of the way of the new driver.
Comment 33 Jeff Norden 2007-08-17 17:03:18 EDT
Ok, here is some more info, based on my Supermicro MB setup.
1) The combined_mode option seems to have been removed in the 2.6.22 kernel.
2) I patched my kernel to add Alan Cox's pata_dma option, which I found here: http://www.redhat.com/archives/fedora-extras-commits/2007-July/msg00934.html Booting with libata.pata_dma=0 works fine, although with DMA disabled for the cdrom. Hopefully, this patch will make it to Fedora updates fairly quickly.
3) With DMA enabled: if I boot without the module in place, kill udevd, and then insert the module, this phase goes fine. The cdrom is recognized and reported correctly, and no crash occurs. If I then do MAKEDEV scd0, any attempt to read from scd0 (even just reading one block using dd) causes the system to freeze within a few seconds.
I added lines to pata_it821x.c in order to trace the subroutines that are called. The last one called before the system freezes is it821x_passthru_bmdma_start(), so I guess this confirms that a lost interrupt is the problem (the bmdma_start() subroutine does exit, so the system isn't freezing up in there). Interestingly, though, a successful pair of bmdma_start() and bmdma_stop() calls also occurs earlier, when the module is first inserted, right before the cdrom drive is identified.
I guess the real question is why this hardware seems to work fine with the older ide code, but fails under libata.
Hope this helps, -Jeff
Comment 34 Alan Cox 2007-08-19 18:34:21 EDT
Interesting that turning off DMA is helpful, and possibly very important information from your testing. bmdma_start will get used first time for the command and then for data. If you stick a printk in at it821x_passthru_qc_issue_prot and print qc->tf.command it will show which command is being issued each time. Also if your box has >= 1GB RAM try booting with mem=900M. I don't think that will have any effect but just check it. I'll try and find a similar DVD drive here and test that see if it shows anything.
Comment 35 Alan Cox 2007-08-21 10:04:32 EDT
Progress - testing with a CD drive I can get a stuck IRQ off my IT821x which I don't get off a disk. Doing some more investigating.
Comment 36 Jeff Norden 2007-08-22 18:56:24 EDT
Some more info: My system has 4GB of memory, but mem=900M doesn't have any effect on this problem (as you thought). I added the extra printout of qc->tf.command - the command being issued is 0xA0, which I guess is ATA_CMD_PACKET. One additional piece of info that might help, is that it seems to take exactly 30 seconds for the system to freeze up after the read from the device is tried. (This is longer than I thought it was.) I'll attach a log of the debugging output, which shows a trace of all the it821x subroutines that get called, as well as a copy of the modified pata_it821x.c which produced the output. Thanks, -Jeff
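For readers following the trace output, here is a small user-space sketch that names the command opcode printed from qc->tf.command. The opcode values are the standard ATA ones (they match the ATA_CMD_* constants in the kernel's linux/ata.h as far as I know), but the helper name ata_cmd_name() and the choice of which opcodes to list are assumptions made for illustration; it is not part of the driver or of the debugging patch attached to this bug.

```c
#include <stdint.h>

/*
 * Illustrative helper (hypothetical name): map a few ATA command
 * opcodes to readable names. Only a handful of opcodes are listed;
 * anything else is reported as "unknown".
 */
static const char *ata_cmd_name(uint8_t cmd)
{
    switch (cmd) {
    case 0xA0: return "PACKET";                 /* ATAPI packet command */
    case 0xA1: return "IDENTIFY PACKET DEVICE"; /* identify an ATAPI device */
    case 0xC8: return "READ DMA";               /* disk read using DMA */
    case 0xCA: return "WRITE DMA";              /* disk write using DMA */
    case 0xEC: return "IDENTIFY DEVICE";        /* identify an ATA disk */
    default:   return "unknown";
    }
}
```

Calling ata_cmd_name(0xA0) gives "PACKET", which agrees with the guess above that 0xA0 is ATA_CMD_PACKET, i.e. the command is an ATAPI packet wrapping a SCSI-style command for the cdrom.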
Comment 37 Jeff Norden 2007-08-22 19:03:27 EDT
Created attachment 164589 [details] Debugging messages from modprobe and reading the device
Comment 38 Jeff Norden 2007-08-22 19:04:35 EDT
Created attachment 164590 [details] My copy of the module source, with debugging lines added.
Comment 39 Jeff Norden 2007-08-29 15:07:21 EDT
Here is a fix that works on my hardware. Add the following lines at the start of the it821x_check_atapi_dma() subroutine:
    /* Only use dma for transfers to/from the media. */
    if (qc->nbytes < 2048)
        return -EOPNOTSUPP;
After poking around quite a bit, I discovered that libata is a lot more aggressive about using DMA than the older ide code is. The old code only uses DMA for reads or writes to or from the media in the drive, but libata always uses it, e.g. to read the name of the drive. In fact, libata calls bmdma_start() even when qc->nbytes is zero, which seems unnecessary. I first tried to cancel the DMA when qc->nbytes==0, but this isn't sufficient to fix the problem. An alternate fix would be to check qc->scsicmd, but there are several different possible read and write commands, so checking the number of bytes seems simpler. Anything destined for the media will be at least 2048 bytes, and I haven't come across any smaller transfers that cause a problem.
---
In the original code, the first DMA call is from a GPCMD_INQUIRY command. This one succeeds; the sequence of calls is:
    it821x_passthru_bmdma_start()
    ata_interrupt()
    ata_bmdma_status()
    it821x_passthru_bmdma_stop()
The second DMA call is from a GPCMD_TEST_UNIT_READY command, and it821x_passthru_bmdma_start() is never followed by the corresponding ata_interrupt(). Instead, after 30 seconds, a call to ata_bmdma_freeze() occurs, which then executes the line:
    iowrite8(ap->ctl, ioaddr->ctl_addr);
and the system immediately freezes up. (I don't think that is the intended effect of ata_bmdma_freeze(), but at least the subroutine is aptly named :-) I don't know why libata handles the missed interrupt so badly, but it might be worth trying to figure that out. When I first added the check for qc->nbytes==0, the system just froze in the same way at the first non-zero length transfer.
---
I'll attach my current debugging version of pata_it821x.c, which does more than the previous one. It prints both qc->tf.command and qc->scsicmd, so you can tell more about what is happening. You can control several aspects of the behavior with parameter arguments, which makes testing things out easier (including some other fixes that I tried but don't seem to work at all for me).
-Jeff
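The size check described in this comment can be modelled outside the kernel as follows. This is only a sketch under stated assumptions: struct qc_model and model_check_atapi_dma() are hypothetical stand-ins for libata's struct ata_queued_cmd and the driver's it821x_check_atapi_dma(), and the sketch keeps only the nbytes field that the check reads. It shows the decision the fix makes, not the driver code itself.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the single field of the queued command
 * that the check looks at; the real libata structure is much larger.
 */
struct qc_model {
    size_t nbytes;  /* length of the requested transfer, in bytes */
};

/* Threshold from the comment above: one 2048-byte data sector. */
#define MEDIA_XFER_MIN 2048

/*
 * Returns 0 when DMA may be used for this command and -EOPNOTSUPP
 * when the transfer is too small to be a media read or write.
 */
static int model_check_atapi_dma(const struct qc_model *qc)
{
    /* Only use DMA for transfers to/from the media. */
    if (qc->nbytes < MEDIA_XFER_MIN)
        return -EOPNOTSUPP;
    return 0;
}
```

With this model, a zero-length command and a short 36-byte reply are both refused DMA, while a 2048-byte sector read is allowed. That matches the observation that the freeze followed a small, non-media command rather than an actual read from the disc.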
Comment 40 Jeff Norden 2007-08-29 15:13:45 EDT
Created attachment 179641 [details] Module source with lots of debugging added.
Comment 41 Alan Cox 2007-08-29 15:25:55 EDT
Interesting - there must be more revs of the chip/firmware than I ever realised. The qc->nbytes = 0 case is a libata bug as I read the spec. I doubt any hardware cares about it, which I guess is why it's never been noticed, but it's most definitely a bug. With the fix you've got, does it hang if you rip or play an audio CD (that ends up with a strange DMA size > 2048)? I'm wondering if the needed check is something like length % 512, length % 2048 or >= as you have now. After that I can submit a change - or better yet you could mail me a diff with the OSDL Signed-off-by: line on it, so you get the full credit you deserve as the author of the fixes.
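To make the three candidate checks concrete, here is a user-space sketch that compares them. The function names are hypothetical, and treating a zero-length transfer as "no DMA" in the modulo variants is an assumption added here (a bare length % 512 test would accept zero). The 2352-byte figure used in the note below is the raw sector size of an audio CD, which is presumably the "strange DMA size" meant above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Candidate 1: the check from comment 39, any transfer of 2048 bytes or more. */
static bool dma_ok_min2048(size_t nbytes)
{
    return nbytes >= 2048;
}

/* Candidate 2: non-empty transfers that are a multiple of 512 bytes. */
static bool dma_ok_mult512(size_t nbytes)
{
    return nbytes != 0 && nbytes % 512 == 0;
}

/* Candidate 3: non-empty transfers that are a multiple of 2048 bytes. */
static bool dma_ok_mult2048(size_t nbytes)
{
    return nbytes != 0 && nbytes % 2048 == 0;
}
```

For a single 2352-byte audio sector, only the >= 2048 variant allows DMA; both modulo variants would push audio reads to PIO. A 512-byte transfer goes the other way: the % 512 variant allows DMA and the other two do not. Which behaviour is right for this hardware is exactly what the audio CD test is meant to find out.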
Comment 42 Jeff Norden 2007-08-31 12:34:53 EDT
I've tried out an audio CD with no problems. I think the problem has more to do with the underlying packet command than the size of the transfer. I wonder if the qc->nbytes=0 problem occurs with hard disks too, or just atapi devices. At some point I'm going to add a pata disk to the system, at which time I can try to see. I'll email you a patch shortly (I've located the Documentation/SubmittingPatches file, but haven't had time to read it through yet). Thanks -Jeff
Comment 43 Alan Cox 2007-09-10 11:29:49 EDT
*** Bug 242325 has been marked as a duplicate of this bug. ***
Comment 44 lance raymond 2007-09-10 21:57:24 EDT
Hey all, just looking for an update or help on this issue. Reading from post 1, there are some hardcore guys here, so I'm looking for the end-all solution or at least a workaround. I saw a kernel patch link which lost me, so how is this thing looking? I will gladly provide specs, logs, etc., and even try some things to test (just need to hold my hand a bit), as some of the above is, well, deep :) Thanks all. Lance
Comment 45 Chris Stofberg 2007-09-12 09:53:54 EDT
Just discovered this thread. I tried to install F7 from DVD as soon as it became available. Tried everything I could think of; sometimes the installation would complete normally, but failed to boot. Eventually I copied the F7 DVD iso to an external USB drive, burned the Rescue iso to CD, booted that, and installed from HDD (USB). This method has worked for me over several installs on different computers, including an install on an old Dell Latitude laptop (266MHZ, 128MB). Each install was a piece of cake, like Fedora installs up to F7 have always been. Sorry I am unable to suggest another solution, but from now on I plan to use the above method.
Comment 46 Martin von Gagern 2007-09-13 04:59:15 EDT
I've had problems with pata_it821x for a while now, documented in http://bugzilla.kernel.org/show_bug.cgi?id=7507. The fix from comment 39, which seems to be included in 2.6.23-rc6, does not solve the issue for me. libata.pata_dma=0 from comment 33 does "solve" the issue, although hard disks without DMA are unacceptable in production use. I've also had problems with the old it821x driver, and wrote about it in http://bugzilla.kernel.org/show_bug.cgi?id=7506. There seem to be some similarities. Both times it's DMA, although the old driver solves it by disabling DMA after waiting 30 seconds. This sounds a lot like the 30 seconds mentioned in comment 36. My hardware is an ASUS P5GDC-V Deluxe motherboard with an ITE8212. My system is not Red Hat, but as the discussion here seems more useful than anything I got on the kernel bugzilla so far, I'll cc here.
Comment 47 Martin von Gagern 2007-09-13 06:35:24 EDT
Created attachment 194461 [details] Console log from 2.6.23-rc6 including debug output
This is a verbose log from my system, including module loading and the 30 second pause, all with debug messages. It was captured by using a serial console, a null modem cable, and a second machine with screen used to log the session. The log explains why the 2048-byte fix can't help me: it looks like it821x_check_atapi_dma isn't called here at all. My revision is 0x13. The last command sent seems to be 0xc8.
Comment 48 Martin von Gagern 2007-09-13 06:40:06 EDT
Created attachment 194471 [details] Patch adding debugging code to the module
This is an adaptation of Jeff's debugging code from comment 40, now as a patch against 2.6.23-rc6, which means module version 0.3.8. This is what I used to generate the debugging messages in comment 47.
Comment 49 Chuck Ebbert 2007-09-13 17:06:43 EDT
(In reply to comment #46)
> I've has problems with pata_it821x for a while now, documented in
> http://bugzilla.kernel.org/show_bug.cgi?id=7507. The fix from comment 39 which
> seems to be included in 2.6.23-rc6 does not solve the issue for me.
> libata.pata_dma=0 from comment 33 does "solve" the issue, although hard disks
> without DMA are unacceptable in a production use.
>
Alan has a new patch that allows selectively disabling DMA for different device types (disks, ATAPI and CF). It is already in rawhide and will go into F7 and FC6 next.
Comment 50 Martin von Gagern 2007-09-14 03:16:56 EDT
(In reply to comment #49)
> selectively disabling DMA for different device types (disks, ATAPI and CF.)
That won't help me much, as it's hard disks I want to use the IT8212 for; my optical drives are on another controller. I would prefer to keep it that way if possible, with the two hard disks connected to two different channels of the IT8212 while my optical drives share the single other channel.
Comment 51 Chuck Ebbert 2007-09-14 15:07:51 EDT
Fix is in kernel-22.214.171.124-81.fc7, appearing soon in updates-testing.
Comment 52 Paul Smith 2007-09-27 16:28:14 EDT
I would like to report that today's F7 kernel update fixes the problem. Maybe this new kernel should be included in the F7 installation dvd, as otherwise some people will not be able to install F7. Paul
Comment 53 Alan Cox 2007-09-27 17:13:41 EDT
That's great news - although the it821x driver hasn't changed, so it must be something else involved. Is this true for the other people with similar boards on this bug?
Comment 54 Paul Smith 2007-09-27 17:22:46 EDT
Let me add that the successful kernel is the 126.96.36.199-85.fc7. Paul
Comment 55 Jeff Norden 2007-09-28 13:03:32 EDT
Alan: I checked the src rpm for 188.8.131.52-85.fc7, and it seems that the it821x driver *has* been fixed. There is a file named: linux-2.6-libata-pata_it821x-dma.patch which contains the patch. I did a Fedora7 re-spin last night which includes the new kernel. It boots fine on my problem system, although I haven't used it to do a full install yet. For anyone who needs it now, I've put it on our ftp server in: ftp://math.tntech.edu/fedora7-updated/ The bandwidth on our campus seems to vary from minute-to-minute so your luck in downloading the DVD iso may vary. The re-spin actually has updates of *all* the F7 packages through Sept 26 2007. It is pretty easy to do this now, but I couldn't find specific directions anywhere, so I wrote a short README file explaining how to use pungi to just create an updated install disk, and put it on our server also. I also "fixed" one other thing when I did the respin: I changed the setting for the emacs package from type="optional" to type="default". Just my two-cents :-)
Comment 56 Ian Gotenks 2007-09-29 07:38:24 EDT
Great! Works perfectly! At last... I was waiting for this fix for quite some time :( 184.108.40.206-85.fc7 works well, but has problems when the boot disk order has been changed in the BIOS. Fortunately 220.127.116.11-91.fc7 fixes that, and now everything works great... Many thanks.