Bug 118202 - [firewire] Badness in get_phy_reg due to mdelay() in irq
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: athlon
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Arjan van de Ven
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2004-03-13 12:42 UTC by keith adamson
Modified: 2007-11-30 22:10 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2004-05-16 09:13:33 UTC
Type: ---
Embargoed:



Description keith adamson 2004-03-13 12:42:41 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040217

Description of problem:
Since kernel 2.6.3-1.118 (wasn't happening with 2.6.3-1.116 and is
still happening with 2.6.3-2.1.253) I get the following error message
during boot:

kernel: ohci1394: $Rev: 1172 $ Ben Collins <bcollins>
kernel: ohci1394: fw-host0: OHCI-1394 1.1 (PCI): IRQ=[5]  MMIO=[e2084000-e20847ff]  Max Packet=[2048]
kernel: Badness in get_phy_reg at drivers/ieee1394/ohci1394.c:238
kernel: Call Trace:
kernel:  [<2099d109>] get_phy_reg+0x109/0x1db [ohci1394]
kernel:  [<2099df22>] ohci_devctl+0x48/0x798 [ohci1394]
kernel:  [<021d14b5>] __delay+0x9/0xa
kernel:  [<2099fe7c>] ohci_irq_handler+0x562/0x9e5 [ohci1394]
kernel:  [<0210f19e>] handle_IRQ_event+0x21/0x41
kernel:  [<0210f644>] do_IRQ+0x178/0x303
kernel:  [<0224317f>] ide_intr+0x2ff/0x450
kernel:  [<022490eb>] ide_dma_intr+0x0/0x7a
kernel:  [<0210f19e>] handle_IRQ_event+0x21/0x41
kernel:  [<0210f68a>] do_IRQ+0x1be/0x303
kernel:  =======================
kernel:  [<2099d3ac>] set_phy_reg+0x1d1/0x1d7 [ohci1394]
kernel:  [<2099dfae>] ohci_devctl+0xd4/0x798 [ohci1394]
kernel:  [<20a4a83c>] csr1212_generate_csr_image+0x1a5/0x1b4 [ieee1394]
kernel:  [<20a43950>] delayed_reset_bus+0x0/0xb6 [ieee1394]
kernel:  [<20a41112>] hpsb_reset_bus+0x17/0x22 [ieee1394]
kernel:  [<0212f2b2>] run_timer_softirq+0x217/0x32b
kernel:  [<022725c0>] net_rx_action+0x5d/0xcd
kernel:  [<0212b3c5>] __do_softirq+0x35/0x73
kernel:  [<021103d0>] do_softirq+0x46/0x4d
kernel:  =======================
kernel:  [<0210f7c3>] do_IRQ+0x2f7/0x303
kernel:  [<02107000>] _stext+0x0/0xa1
kernel:  [<02107000>] _stext+0x0/0xa1
kernel:  [<0210b03b>] default_idle+0x23/0x26
kernel:  [<0210b08c>] cpu_idle+0x1f/0x34
kernel:  [<0235b69b>] start_kernel+0x21d/0x220
kernel:
kernel: Badness in set_phy_reg at drivers/ieee1394/ohci1394.c:267
kernel: Call Trace:
kernel:  [<2099d2f3>] set_phy_reg+0x118/0x1d7 [ohci1394]
kernel:  [<2099dfae>] ohci_devctl+0xd4/0x798 [ohci1394]
kernel:  [<021d14b5>] __delay+0x9/0xa
kernel:  [<2099fe7c>] ohci_irq_handler+0x562/0x9e5 [ohci1394]
kernel:  [<0210f19e>] handle_IRQ_event+0x21/0x41
kernel:  [<0210f644>] do_IRQ+0x178/0x303
kernel:  [<0224317f>] ide_intr+0x2ff/0x450
kernel:  [<022490eb>] ide_dma_intr+0x0/0x7a
kernel:  [<0210f19e>] handle_IRQ_event+0x21/0x41
kernel:  [<0210f68a>] do_IRQ+0x1be/0x303
kernel:  =======================
kernel:  [<2099d3ac>] set_phy_reg+0x1d1/0x1d7 [ohci1394]
kernel:  [<2099dfae>] ohci_devctl+0xd4/0x798 [ohci1394]
kernel:  [<20a4a83c>] csr1212_generate_csr_image+0x1a5/0x1b4 [ieee1394]
kernel:  [<20a43950>] delayed_reset_bus+0x0/0xb6 [ieee1394]
kernel:  [<20a41112>] hpsb_reset_bus+0x17/0x22 [ieee1394]
kernel:  [<0212f2b2>] run_timer_softirq+0x217/0x32b
kernel:  [<022725c0>] net_rx_action+0x5d/0xcd
kernel:  [<0212b3c5>] __do_softirq+0x35/0x73
kernel:  [<021103d0>] do_softirq+0x46/0x4d
kernel:  =======================
kernel:  [<0210f7c3>] do_IRQ+0x2f7/0x303
kernel:  [<02107000>] _stext+0x0/0xa1
kernel:  [<02107000>] _stext+0x0/0xa1
kernel:  [<0210b03b>] default_idle+0x23/0x26
kernel:  [<0210b08c>] cpu_idle+0x1f/0x34
kernel:  [<0235b69b>] start_kernel+0x21d/0x220
kernel:
kernel: ohci1394: fw-host0: SelfID received outside of bus reset sequence

I don't have any firewire devices to plug into the port so I don't
know if the port is working or not.

# /sbin/lspci
00:00.0 Host bridge: nVidia Corporation nForce2 AGP (different version?) (rev a2)
00:00.1 RAM memory: nVidia Corporation nForce2 Memory Controller 1 (rev a2)
00:00.2 RAM memory: nVidia Corporation nForce2 Memory Controller 4 (rev a2)
00:00.3 RAM memory: nVidia Corporation nForce2 Memory Controller 3 (rev a2)
00:00.4 RAM memory: nVidia Corporation nForce2 Memory Controller 2 (rev a2)
00:00.5 RAM memory: nVidia Corporation nForce2 Memory Controller 5 (rev a2)
00:01.0 ISA bridge: nVidia Corporation nForce2 ISA Bridge (rev a3)
00:01.1 SMBus: nVidia Corporation nForce2 SMBus (MCP) (rev a2)
00:02.0 USB Controller: nVidia Corporation nForce2 USB Controller (rev a3)
00:02.1 USB Controller: nVidia Corporation nForce2 USB Controller (rev a3)
00:02.2 USB Controller: nVidia Corporation nForce2 USB Controller (rev a3)
00:04.0 Ethernet controller: nVidia Corporation nForce2 Ethernet Controller (rev a1)
00:05.0 Multimedia audio controller: nVidia Corporation nForce MultiMedia audio [Via VT82C686B] (rev a2)
00:06.0 Multimedia audio controller: nVidia Corporation nForce2 AC97 Audio Controler (MCP) (rev a1)
00:08.0 PCI bridge: nVidia Corporation nForce2 External PCI Bridge (rev a3)
00:09.0 IDE interface: nVidia Corporation nForce2 IDE (rev a2)
00:0d.0 FireWire (IEEE 1394): nVidia Corporation nForce2 FireWire (IEEE 1394) Controller (rev a3)
00:1e.0 PCI bridge: nVidia Corporation nForce2 AGP (rev a2)
01:08.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)
03:00.0 VGA compatible controller: nVidia Corporation NV18 [GeForce4 MX - nForce GPU] (rev a3)


Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Install FC2 test1
2. Update to rawhide

Actual Results:  get error on ieee1394

Expected Results:  no error

Additional info:

Comment 1 keith adamson 2004-03-17 03:27:28 UTC
still happening with kernel-2.6.3-2.1.253.2.1

Comment 2 keith adamson 2004-03-18 14:39:16 UTC
After update today ... I now get a kernel panic during boot unless I
disable the on-board 1394 chip in BIOS.

On a successful boot today, before installing today's updates, I
noticed that kudzu reported the 1394 as removed.  I chose to remove
it.  It booted OK.  I then ran /sbin/lspci and the 1394 wasn't listed.

After today's updates I rebooted and I got a panic while "Enabling
swap space":

kernel/timer.c:295:spin_lock(kernel/timer.c:0230e3a0) already locked by kernel/timer.c/392
Kernel panic: Fatal exception in interrupt
In interrupt handler - not syncing

This panic was repeatable.  After disabling the on-board 1394 in the
BIOS the system now boots OK.

There is some early message about /etc/fstab being read-only.  But I
think this is unrelated.

I'm also running the new Xorg X server from rawhide.


Comment 3 keith adamson 2004-03-18 15:00:42 UTC
This morning's updates:

system-config-services-0.8.8-2      Thu 18 Mar 2004 08:56:54 AM EST
switchdesk-4.0.1-1.1                Thu 18 Mar 2004 08:56:53 AM EST
sendmail-8.12.11-4                  Thu 18 Mar 2004 08:56:48 AM EST
prelink-0.3.1-2                     Thu 18 Mar 2004 08:56:48 AM EST
openmotif-devel-2.2.3-1.9.1         Thu 18 Mar 2004 08:56:44 AM EST
librsvg2-devel-2.6.1-2              Thu 18 Mar 2004 08:56:43 AM EST
libgnomeui-devel-2.5.92-1           Thu 18 Mar 2004 08:56:41 AM EST
initscripts-7.48-1                  Thu 18 Mar 2004 08:56:40 AM EST
gnome-vfs2-smb-2.5.91-2             Thu 18 Mar 2004 08:56:39 AM EST
gnome-themes-2.5.92-1               Thu 18 Mar 2004 08:56:35 AM EST
gdm-2.5.90.2-3                      Thu 18 Mar 2004 08:56:27 AM EST
firstboot-1.3.10-1                  Thu 18 Mar 2004 08:56:25 AM EST
autofs-4.1.0-8                      Thu 18 Mar 2004 08:56:24 AM EST
gaim-0.75.99-20040318cvs.2          Thu 18 Mar 2004 08:56:21 AM EST
logwatch-5.1-3                      Thu 18 Mar 2004 08:56:19 AM EST
gnome-utils-2.5.90-2                Thu 18 Mar 2004 08:56:15 AM EST
gtk2-engines-2.2.0-5                Thu 18 Mar 2004 08:56:14 AM EST
gnome-vfs2-devel-2.5.91-2           Thu 18 Mar 2004 08:56:13 AM EST
openmotif-2.2.3-1.9.1               Thu 18 Mar 2004 08:56:11 AM EST
redhat-artwork-0.94-1               Thu 18 Mar 2004 08:56:01 AM EST
librsvg2-2.6.1-2                    Thu 18 Mar 2004 08:55:58 AM EST
libgnomeui-2.5.92-1                 Thu 18 Mar 2004 08:55:56 AM EST
gnome-vfs2-2.5.91-2                 Thu 18 Mar 2004 08:55:42 AM EST

kudzu was updated yesterday:
kudzu-1.1.53-1                      Wed 17 Mar 2004 09:17:05 AM EST




Comment 4 mark 2004-03-18 20:11:43 UTC
I get the same kernel panic with kernel-2.6.3-2.1.253.2.1. I have an
Adaptec FireConnect 4300 PCI firewire card. This error seems to have
been introduced with yesterday's updates from rawhide; I did not get
this panic with any kernel prior to 2.6.3-2.1.253. By the way, my
platform is i686; I don't think this problem is limited to the athlon
platform.

Comment 5 Dave Jones 2004-03-19 11:30:57 UTC
Does it persist with the updated 2.6.4 kernel from
http://people.redhat.com/arjanv/2.6/RPMS.kernel/ ?

Comment 6 keith adamson 2004-03-19 14:55:13 UTC
I still get the same panic with both kernels with the 1394 on, and both
are OK with the 1394 off.  But I tried turning off kudzu:

# /sbin/chkconfig kudzu off

And both kernels boot with the 1394 enabled:

# uname -a
Linux family 2.6.4-1.275 #1 Mon Mar 15 13:48:30 EST 2004 i686 athlon i386 GNU/Linux

# uname -a
Linux family 2.6.3-2.1.253.2.1 #1 Fri Mar 12 14:01:55 EST 2004 i686 athlon i386 GNU/Linux

$ /sbin/lspci
00:00.0 Host bridge: nVidia Corporation nForce2 AGP (different version?) (rev a2)
00:00.1 RAM memory: nVidia Corporation nForce2 Memory Controller 1 (rev a2)
00:00.2 RAM memory: nVidia Corporation nForce2 Memory Controller 4 (rev a2)
00:00.3 RAM memory: nVidia Corporation nForce2 Memory Controller 3 (rev a2)
00:00.4 RAM memory: nVidia Corporation nForce2 Memory Controller 2 (rev a2)
00:00.5 RAM memory: nVidia Corporation nForce2 Memory Controller 5 (rev a2)
00:01.0 ISA bridge: nVidia Corporation nForce2 ISA Bridge (rev a3)
00:01.1 SMBus: nVidia Corporation nForce2 SMBus (MCP) (rev a2)
00:02.0 USB Controller: nVidia Corporation nForce2 USB Controller (rev a3)
00:02.1 USB Controller: nVidia Corporation nForce2 USB Controller (rev a3)
00:02.2 USB Controller: nVidia Corporation nForce2 USB Controller (rev a3)
00:04.0 Ethernet controller: nVidia Corporation nForce2 Ethernet Controller (rev a1)
00:05.0 Multimedia audio controller: nVidia Corporation nForce MultiMedia audio [Via VT82C686B] (rev a2)
00:06.0 Multimedia audio controller: nVidia Corporation nForce2 AC97 Audio Controler (MCP) (rev a1)
00:08.0 PCI bridge: nVidia Corporation nForce2 External PCI Bridge (rev a3)
00:09.0 IDE interface: nVidia Corporation nForce2 IDE (rev a2)
00:0d.0 FireWire (IEEE 1394): nVidia Corporation nForce2 FireWire (IEEE 1394) Controller (rev a3)
00:1e.0 PCI bridge: nVidia Corporation nForce2 AGP (rev a2)
01:08.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)
03:00.0 VGA compatible controller: nVidia Corporation NV18 [GeForce4 MX - nForce GPU] (rev a3)



Comment 7 keith adamson 2004-03-19 14:56:56 UTC
I forgot to mention: there is also no "Badness in get_phy_reg" message.

Comment 8 Dave Jones 2004-03-19 15:37:33 UTC
OK, that kernel is very close to what's upstream right now (it's based
on a kernel snapshot just before 2.6.5-rc1). Working with Ben Collins
(his email address is in the first bug post) on the upstream
ieee1394 tree is probably your best bet.


Comment 9 Arjan van de Ven 2004-03-21 17:44:56 UTC
The cause is that the firewire code calls mdelay() from the hardirq
handler, which is rather foul behavior with respect to the low-latency
goals of the kernel, and can even lead to clock skew with HZ=1000 in
2.6.
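
To make the HZ=1000 point concrete, here is a rough back-of-the-envelope
sketch in plain user-space C (it is not kernel code and not taken from
the ohci1394 driver). It assumes the timer tick fires every 1000/HZ
milliseconds and that a busy-wait in hardirq context keeps the timer
interrupt from being serviced for its whole duration. The helper name
tick_periods_spanned is made up for illustration.

```c
#include <assert.h>

/*
 * Hypothetical helper (not a kernel API): number of whole timer-tick
 * periods covered by a busy-wait of delay_ms milliseconds, assuming
 * one tick every 1000/hz milliseconds.
 */
unsigned long tick_periods_spanned(unsigned long delay_ms, unsigned long hz)
{
    assert(hz > 0);               /* a tick rate of zero makes no sense */
    return delay_ms * hz / 1000;  /* integer division drops partial ticks */
}
```

Under these assumptions, tick_periods_spanned(1, 100) is 0 while
tick_periods_spanned(1, 1000) is 1: a 1 ms busy-wait is well below one
tick period at HZ=100 but as long as a full tick period at HZ=1000, so a
polling loop that repeats such a delay can span several ticks.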

Comment 10 keith adamson 2004-03-23 07:34:29 UTC
No panic with kernel-2.6.4-1.279 (on-board 1394 enabled and kudzu
service on)


Comment 11 Sahil Verma 2004-03-29 03:00:08 UTC
Please close as RAWHIDE. This is the same issue as bug 118771, which is also closed.

http://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=118771

Comment 12 Arjan van de Ven 2004-03-29 09:08:13 UTC
Firewire isn't fixed, it's just turned off....

Comment 13 Florian Lagg 2004-04-09 23:58:04 UTC
Hello all!

Kindly tell me if I am in the wrong list. I'm new to Linux.
I think I have the same bug as described above.

My Kernel Version is 2.6.3-2.1.253.2.1
I have to upgrade to fix this problem, right?

How can I upgrade the kernel if I cannot boot the old one? Or is 
there a way to boot anyway?

Thanks 
Florian Lagg

Comment 14 mark 2004-05-05 04:31:14 UTC
I can no longer reproduce this with the following setup (on my i686):

fedora-release-1.92-1
initscripts-7.50-1
SysVinit-2.85-25
kernel-2.6.3-2.1.253.2.1
kudzu-1.1.58-1

and in /etc/modprobe.conf:

alias scsi-hostadapter sbp2
alias ieee1394-controller ohci1394

even with /sbin/chkconfig kudzu on

Could someone else please verify this?


Comment 15 Alexandre Oliva 2004-05-16 09:13:33 UTC
Firewire was broken after the 2.6.3 release, and remained severely
broken all the way up to 2.6.6.  2.6.6 may sort of work, but it still
has the problem mentioned in bug 119262 (the crash at boot time
mentioned above), which should be fixed in the next rawhide build.  The
bug originally reported here is fixed in FC2's kernel, if you enable
the Firewire modules.

Comment 16 Tim Wright 2004-07-07 19:11:48 UTC
Hmmm... I'm running 2.6.6-1.435.2.3, dated July 1. That is the latest
Fedora Core 2 kernel that I can find, and it is most definitely NOT
fixed. There are still calls to mdelay() in the interrupt routines, and
those lovely stack traces are still there when you boot.
Hardware: ASUS A7N8X Deluxe, attempting to load the ohci1394 driver.

