Bug 151332

Summary: Boot hangs when loading sata_sil with connected NCQ drive
Product: [Fedora] Fedora
Reporter: Aaron Kurtz <a.kurtz>
Component: kernel
Assignee: Dave Jones <davej>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: Brian Brock <bbrock>
Severity: medium
Priority: medium
Version: 5
CC: a.tarasenko, jonstanley, lannet, nitind, pfrields, trevor, wtogami
Target Milestone: ---   
Target Release: ---   
Hardware: i386   
OS: Linux   
Whiteboard: MassClosed
Doc Type: Bug Fix
Last Closed: 2008-01-20 04:38:44 UTC
Attachments (description; flags):
Fedora Core 3 without irqpoll option - hangs; none
dmesg output for FC3 with irqpoll option; none
FC4t1 - hangs; none
2.6.15-1.1833_FC4 with irqpoll Oops; none

Description Aaron Kurtz 2005-03-17 01:01:13 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.6) Gecko/20050308 Firefox/1.0.1 Fedora/1.0.1-5

Description of problem:
I'm using an Intel board with an 845G chipset and an onboard Silicon Image 3112 SATA chip. I have a Seagate SATA drive that happens to have NCQ capability, although I realize NCQ is not currently supported. Under FC3's latest kernel, 2.6.10-1.770_FC3, the drive works when booted with the irqpoll option. Under FC4t1's packaged kernel, kernel-2.6.11-1.1177_FC4 (at this time there is no rawhide update), the boot hangs with or without the irqpoll option.

Version-Release number of selected component (if applicable):
kernel-2.6.11-1.1177_FC4

How reproducible:
Always

Steps to Reproduce:
1. Boot with SATA drive connected to SATA connector

Actual Results:  The boot hangs.

Expected Results:  The boot should not hang.

Additional info:

This isn't a cable issue - the drive will boot when booted into FC3 and then refuse to boot in FC4t1 without any hardware being touched.

I'll attempt to write down the logs of both kernels as they boot if needed.

The driver also borks on installation, but I could live with that if it'd work once installed.

Comment 1 Aaron Kurtz 2005-03-17 07:47:24 UTC
Created attachment 112076 [details]
Fedora Core 3 without irqpoll option - hangs

This was typed, but should be correct.

Comment 2 Aaron Kurtz 2005-03-17 07:48:50 UTC
Created attachment 112077 [details]
dmesg output for FC3 with irqpoll option

This works, although there are recurring
hdd: cdrom_pc_intr: The drive appears confused (ireason = 0x01)
messages.

Comment 3 Aaron Kurtz 2005-03-17 07:49:37 UTC
Created attachment 112078 [details]
FC4t1 - hangs

Comment 4 Aaron Kurtz 2005-03-17 07:54:52 UTC
While bug #151347 is similar, I don't have any problems installing or booting
FC4T1 when no hard drives are attached, even though sata_sil is loaded.

Comment 5 Aaron Kurtz 2005-03-20 00:08:23 UTC
I have updated to the latest rawhide kernel, kernel-2.6.11-1.1185_FC4 and no change.
I have added a PATA drive using a SATA adapter. It works in FC3, and does not
work in FC4t1 by itself or with the NCQ drive, so I guess NCQ is not the problem
here.

Comment 6 Trevor Cordes 2005-03-21 01:30:18 UTC
You mean you can install FC3 on a 3112 SATA-only system?  What's this irqpoll
option you mention?  I've not heard of that one before, not even in any of the
other sata_sil bugzillas.  Can I attempt FC3 install with "linux text irqpoll"
and you're saying I should get somewhere?  I'll try to test that later today.


Comment 7 Aaron Kurtz 2005-03-21 03:59:21 UTC
No, what I meant was when there are no SATA drives attached, sata_sil loads and
doesn't hang, unlike some other people who are always experiencing hangs. I
installed on PATA for both FC3 and 4t1.

As for irqpoll, I just realized this, but apparently FC3, at least in its latest
updated form, incorporates Alan Cox's -ac patches, which include special handling
of misbehaving IRQs; that is what makes my SATA controller work with drives attached.
The handling is turned on by the boot parameter "irqpoll" in the latest versions.
It might be included in the base install, but I'm not sure right now. If so, it's
not a boot option there but automatic. His patches also include the removed PWC driver,
which explains why my webcam stopped working in FC4t1, but that's easily taken
care of.

FC4t1 and current rawhide do not incorporate the -ac patches, so fixing the
bad behavior with 'irqpoll' is no longer an option. Unfortunately, the -ac
patch does not apply cleanly against the FC kernel or even 2.6.11.5. This seems
like something to annoy those at bugzilla.kernel.org with when I have time to
rebuild from stock and stock plus -ac patches.
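
Whether a patch applies cleanly to a given source tree can be checked without modifying the tree. The following is only a minimal sketch: it assumes GNU patch and a patch made with one leading path component (-p1), and patch_applies is a hypothetical helper name, not part of any Fedora tooling.

```shell
# patch_applies PATCHFILE SRCDIR
# Succeeds if PATCHFILE would apply cleanly to the tree in SRCDIR.
# Assumption: the patch paths need one leading component stripped (-p1).
patch_applies() {
    # --dry-run: report what would happen but change no files
    # -f: never prompt; -s: stay quiet unless an error occurs
    # -d: change into the source tree before applying
    patch -p1 -f -s --dry-run -d "$2" < "$1" >/dev/null 2>&1
}

# Example call (both paths are placeholders):
#   patch_applies /tmp/some.patch /usr/src/linux && echo "applies cleanly"
```

A non-zero exit status means at least one hunk was rejected, which is the situation described above.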

Comment 8 Dave Jones 2005-06-27 23:23:23 UTC
Mass update of -test bugs to update version to fc4.
(Please retest on final release, and report results if you have not already done
so).

Thanks.

Comment 9 Dave Jones 2005-07-15 21:16:09 UTC
[This comment has been added as a mass update for all FC4 kernel bugs.
 If you have migrated this bug from an FC3 bug today, ignore this comment.]

Please retest your problem with today's 2.6.12-1.1398_FC4 update.

If your problem involved being unable to boot, or some hardware not being
detected correctly, please make sure your /etc/modprobe.conf is correct *BEFORE*
installing any kernel updates.
If in doubt, you can recreate this file using:

mv /etc/sysconfig/hwconf /etc/sysconfig/hwconf.bak
mv /etc/modprobe.conf /etc/modprobe.conf.bak
kudzu


Thank you.


Comment 10 Dave Jones 2005-09-30 06:26:08 UTC
Mass update to all FC4 bugs:

An update has been released (2.6.13-1.1526_FC4) which rebases to a new upstream
kernel (2.6.13.2). As there were ~3500 changes upstream between this and the
previous kernel, it's possible your bug has been fixed already.

Please retest with this update, and update this bug if necessary.

Thanks.


Comment 11 Arnaldo Viegas de Lima 2005-10-07 15:04:52 UTC
So far I've narrowed the issue to the Sil 3112 controller and kernels 2.6.11 and
2.6.12, which do not "support" irqpoll. 2.6.13 does, so
2.6.13-1.1526_FC4 works (I'm using it right now on a machine with a Sil3112 and
irqpoll).

But the major issue is how to install FC4 on a machine with a Sil3112, since the
boot kernel on the install CD/DVD is 2.6.11. The trick is to re-create the
install CD/DVD with a newer kernel (2.6.13-x). I've done it and it works as of
today (it needs the "current" boot.iso file from the development branch, and that
changes on a daily basis).

Check: http://forums.fedoraforum.org/showthread.php?t=79116

Comment 12 Dave Jones 2005-11-10 19:25:47 UTC
2.6.14-1.1637_FC4 has been released as an update for FC4.
Please retest with this update, as a large amount of code has been changed in
this release, which may have fixed your problem.

Thank you.


Comment 13 Dave Jones 2006-02-03 05:30:05 UTC
This is a mass-update to all currently open kernel bugs.

A new kernel update has been released (Version: 2.6.15-1.1830_FC4)
based upon a new upstream kernel release.

Please retest against this new kernel, as a large number of patches
go into each upstream release, possibly including changes that
may address this problem.

This bug has been placed in NEEDINFO_REPORTER state.
Due to the large volume of inactive bugs in bugzilla, if this bug is
still in this state in two weeks' time, it will be closed.

Should this bug still be relevant after this period, the reporter
can reopen the bug at any time. Any other users on the Cc: list
of this bug can request that the bug be reopened by adding a
comment to the bug.

If this bug is a problem preventing you from installing the
release this version is filed against, please see bug 169613.

Thank you.


Comment 14 Artyom Tarasenko 2006-03-27 14:07:52 UTC
Created attachment 126815 [details]
2.6.15-1.1833_FC4 with irqpoll Oops

With the 2.6.15-1.1830_FC4 kernel it doesn't hang, but it still produces an "irq
10: nobody cared (try booting with the "irqpoll" option)" message. On my PC the
sata_sil driver works fine with the Seagate ST3200822AS drive and any kernel,
but if I plug in a WD3200JD drive (alone, or as a second drive), the boot
hangs on older kernels or produces the oops on the 2.6.15-1.1833_FC4 kernel
(see attachment). Also, if I try to unload the sata_sil module, the system
freezes completely.

The irqpoll option is specified in the kernel command line.
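
Whether the option actually reached the running kernel can be confirmed from /proc/cmdline. This is a minimal sketch: has_boot_param is a hypothetical helper name, and the optional second argument exists only so the check can be tried against a saved copy of a command line.

```shell
# has_boot_param PARAM [FILE]
# Succeeds if PARAM appears as a whole word on the kernel command line.
# FILE defaults to /proc/cmdline.
has_boot_param() {
    param=$1
    file=${2:-/proc/cmdline}
    # Put each space-separated token on its own line, then require an
    # exact whole-line match so "noirqpoll" does not count as "irqpoll".
    tr ' ' '\n' < "$file" | grep -qx -- "$param"
}

# Example call:
#   has_boot_param irqpoll && echo "booted with irqpoll"
```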

Comment 15 Aaron Kurtz 2006-04-12 20:40:07 UTC
On FC5, kernel version 2.6.16-1.2080_FC5, the Seagate SATA drive mounts without
issues. I haven't tried it during a FC5 install. If it works then, I'll close
this unless Artyom really wants to use this bug.

Comment 16 lannet 2006-05-02 00:49:38 UTC
I'm having this hang problem when trying to do an NFS install of FC5 via the
Rescue CD.

My config is a Silicon Image 3112 SATA card with a pair of Maxtor 80 GB drives
(6V080E0).

I have tried doing NFS installs of FC4 and FC3 via the Rescue CD and they all
present similar symptoms. The only detectable difference is that with FC3 the
kernel boot starts quicker than with FC4 and FC5, but it still hangs on loading the
sata_sil driver.

Comment 17 Artyom Tarasenko 2006-05-11 13:49:17 UTC
> On FC5, kernel version 2.6.16-1.2080_FC5, the Seagate SATA drive mounts without
> issues.

Two questions:

Are you sure that the module is loaded with no problems? On my machine with the
kernel 2.6.15-1.1830_FC4 and newer, as I wrote I get the  "nobody cared" Oops
during the boot, but the boot continues. One could easily miss this Oops, if not
looking at the console during the boot time.

I can also mount the disk, no problem, but the system is not stable. If I try to
unload the module, the system always hangs. No wonder: an Oops is an Oops...

Is the kernel naming the same for FC4 and FC5? Because with 2.6.16-1.2096_FC4 I
still have the same problem. Booting FC4 with FC5 kernel may be hard, but I'll
try this later.
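
Since the "nobody cared" message is easy to miss on the console, the kernel log can be searched for it after boot. This is a minimal sketch that assumes the message text quoted in comment 14; count_unhandled_irqs is a hypothetical helper name, and its optional argument is a saved log file (without it, the function reads the output of dmesg).

```shell
# count_unhandled_irqs [FILE]
# Prints the number of "nobody cared" lines in FILE, or in the output
# of dmesg when no FILE is given.
# The exit status is grep's: non-zero when no line matched.
count_unhandled_irqs() {
    if [ "$#" -gt 0 ]; then
        grep -c 'nobody cared' "$1"
    else
        dmesg | grep -c 'nobody cared'
    fi
}

# Example call:
#   count_unhandled_irqs /var/log/dmesg
```

A count above zero means the kernel reported an unhandled interrupt during that boot, even if the boot went on to complete.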


Comment 18 Artyom Tarasenko 2006-05-11 15:34:27 UTC
Actually the "nobody cared" (aka "screaming irq") problem was already known to
the driver maintainer and there's a patch from Jeff Garzik, fixing it.
http://lkml.org/lkml/2005/12/3/3

I applied this patch (with a trivial semantic change) to the 2.6.16-1.2096_FC4
kernel, and now everything works perfectly.

So, may we count on inclusion of this patch in the next kernel build?

@Aaron: if you don't need this bug I'd use it.

Comment 19 Dave Jones 2006-09-17 01:59:48 UTC
[This comment added as part of a mass-update to all open FC4 kernel bugs]

FC4 has now transitioned to the Fedora legacy project, which will continue to
release security related updates for the kernel.  As this bug is not security
related, it is unlikely to be fixed in an update for FC4, and has been migrated
to FC5.

Please retest with Fedora Core 5.

Thank you.

Comment 20 Dave Jones 2006-10-16 18:04:38 UTC
A new kernel update has been released (Version: 2.6.18-1.2200.fc5)
based upon a new upstream kernel release.

Please retest against this new kernel, as a large number of patches
go into each upstream release, possibly including changes that
may address this problem.

This bug has been placed in NEEDINFO state.
Due to the large volume of inactive bugs in bugzilla, if this bug is
still in this state in two weeks' time, it will be closed.

Should this bug still be relevant after this period, the reporter
can reopen the bug at any time. Any other users on the Cc: list
of this bug can request that the bug be reopened by adding a
comment to the bug.

In the last few updates, some users upgrading from FC4->FC5
have reported that installing a kernel update has left their
systems unbootable. If you have been affected by this problem
please check you only have one version of device-mapper & lvm2
installed.  See bug 207474 for further details.

If this bug is a problem preventing you from installing the
release this version is filed against, please see bug 169613.

If this bug has been fixed, but you are now experiencing a different
problem, please file a separate bug for the new problem.

Thank you.

Comment 21 Jon Stanley 2008-01-20 04:38:44 UTC
(this is a mass-close to kernel bugs in NEEDINFO state)

As indicated previously there has been no update on the progress of this bug
therefore I am closing it as INSUFFICIENT_DATA. Please re-open if the issue
still occurs for you and I will try to assist in its resolution. Thank you for
taking the time to report the initial bug.

If you believe that this bug was closed in error, please feel free to reopen
this bug.