Bug 144201
Summary: | Badness in ohci_hw_csr_reg at drivers/ieee1394/ohci1394.c:3154 | ||||||
---|---|---|---|---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Sitsofe Wheeler <sitsofe> | ||||
Component: | kernel | Assignee: | Dave Jones <davej> | ||||
Status: | CLOSED RAWHIDE | QA Contact: | Brian Brock <bbrock> | ||||
Severity: | medium | Docs Contact: | |||||
Priority: | medium | ||||||
Version: | rawhide | CC: | aviro, jdunn, kare.nuorteva, krynos, pb, pfrields, pigetak178, roland, ruth, stefan-r-rhbz, wtogami | ||||
Target Milestone: | --- | ||||||
Target Release: | --- | ||||||
Hardware: | i686 | ||||||
OS: | Linux | ||||||
Whiteboard: | |||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | Environment: | ||||||
Last Closed: | 2007-11-07 00:57:21 UTC | Type: | --- | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Attachments: |
|
Description
Sitsofe Wheeler
2005-01-04 23:43:55 UTC
did this get fixed in todays 2.6.10 update ? Nope, although the output has changed ever so slightly: md: ... autorun DONE. Badness in ohci_hw_csr_reg at drivers/ieee1394/ohci1394.c:3154 [<d09b9ff3>] ohci_hw_csr_reg+0x69/0x82 [ohci1394] [<d0a0aba6>] host_reset+0x64/0x106 [ieee1394] [<d0a0a5fd>] highlevel_host_reset+0x27/0x34 [ieee1394] [<d09b9067>] ohci_irq_handler+0x8b7/0x90d [ohci1394] [<c013b776>] handle_IRQ_event+0x21/0x4a [<c013b8c7>] __do_IRQ+0x128/0x221 [<c0104c74>] do_IRQ+0x5e/0x7a ======================= [<c01034a6>] common_interrupt+0x1a/0x20 [<c01203e8>] __do_softirq+0x2c/0x79 [<c0104d4c>] do_softirq+0x38/0x3f ======================= [<c0104c86>] do_IRQ+0x70/0x7a [<c01034a6>] common_interrupt+0x1a/0x20 [<d09b636b>] set_phy_reg+0x1ae/0x1b3 [ohci1394] [<d09b6f29>] ohci_devctl+0xd4/0x71b [ohci1394] [<d0a10290>] csr1212_generate_csr_image+0x1ab/0x1bb [ieee1394] [<d0a07121>] hpsb_reset_bus+0x1c/0x22 [ieee1394] [<c012cb20>] worker_thread+0x1eb/0x2f0 [<d0a094c8>] delayed_reset_bus+0x0/0xb6 [ieee1394] [<c0117643>] default_wake_function+0x0/0xc [<c02f4ea9>] schedule+0x453/0x4d6 [<c0117643>] default_wake_function+0x0/0xc [<c012c935>] worker_thread+0x0/0x2f0 [<c01314fd>] kthread+0x69/0x91 [<c0131494>] kthread+0x0/0x91 [<c01011dd>] kernel_thread_helper+0x5/0xb ieee1394: Host added: ID:BUS[0-00:1023] GUID[00004c0100002bc5] SELinux: initialized (dev ramfs, type ramfs), uses genfs_contexts I don't know why another ======================= has appeared but that's about the only change outside of addresses. Hmm this is still needinfo. I'm going to try changing the status and see what happens... (my attempt to change the status using the beta bugzilla did nothing but then again the drop down box only had a white space in it anyway) I've just gotten a report from an ipod owner that the firewire appears to be still working but has funny disk labels. I verify that this problem still occurs with: Linux jupiter.acf.aquezada.com 2.6.10-1.741_FC3 #1 Thu Jan 13 16:38:22 EST 2005i686 i686 i386 GNU/Linux I'm also getting this message with: Linux huxley 2.6.10-1.766_FC3 #1 Wed Feb 9 23:06:42 EST 2005 i686 athlon i386 GNU/Linux lspci: 00:09.0 FireWire (IEEE 1394): NEC Corporation uPD72874 IEEE1394 OHCI 1.1 3-port PHY-Link Ctrlr (rev 01) usb 2-1: new low speed USB device using uhci_hcd and address 2 ieee1394: Initialized config rom entry `ip1394' input: USB HID v1.00 Gamepad [Microsoft SideWinder Game Pad Pro USB version 1.0] on usb-0000:00:10.0-1 ohci1394: $Rev: 1223 $ Ben Collins <bcollins> PCI: Found IRQ 10 for device 0000:00:09.0 PCI: Sharing IRQ 10 with 0000:00:0d.0 PCI: Sharing IRQ 10 with 0000:00:10.1 ohci1394: fw-host0: Unexpected PCI resource length of 1000! ohci1394: fw-host0: OHCI-1394 1.1 (PCI): IRQ=[10] MMIO=[e3800000-e38007ff] Max Packet=[2048] md: Autodetecting RAID arrays. md: autorun ... md: ... autorun DONE. Badness in ohci_hw_csr_reg at drivers/ieee1394/ohci1394.c:3154 [<e0a0b130>] ohci_hw_csr_reg+0x69/0x82 [ohci1394] [<e0a97c42>] host_reset+0x64/0x106 [ieee1394] [<e0a97699>] highlevel_host_reset+0x27/0x34 [ieee1394] [<e0a0a15a>] ohci_irq_handler+0x907/0x95d [ohci1394] [<c013d4e6>] handle_IRQ_event+0x21/0x4a [<c013d64b>] __do_IRQ+0x13c/0x249 [<c0104e04>] do_IRQ+0x5e/0x7a ======================= [<c01035b2>] common_interrupt+0x1a/0x20 [<c0120b50>] __do_softirq+0x2c/0x79 [<c0104edc>] do_softirq+0x38/0x3f ======================= [<c0104e16>] do_IRQ+0x70/0x7a [<c01035b2>] common_interrupt+0x1a/0x20 [<e0a07393>] set_phy_reg+0x1c2/0x1c7 [ohci1394] [<e0a07f68>] ohci_devctl+0xd4/0x757 [ohci1394] [<e0a9d3d8>] csr1212_generate_csr_image+0x1ab/0x1bb [ieee1394] [<e0a94121>] hpsb_reset_bus+0x1c/0x22 [ieee1394] [<c012dd4b>] worker_thread+0x1ff/0x31b [<e0a96564>] delayed_reset_bus+0x0/0xb6 [ieee1394] [<c01178ea>] default_wake_function+0x0/0xc [<c02fd6bb>] schedule+0x45d/0x4ea [<c01178ea>] default_wake_function+0x0/0xc [<c012db4c>] worker_thread+0x0/0x31b [<c0132b05>] kthread+0x69/0x91 [<c0132a9c>] kthread+0x0/0x91 [<c01011dd>] kernel_thread_helper+0x5/0xb ieee1394: Host added: ID:BUS[0-00:1023] GUID[00004c0100003b1d] EXT3 FS on sda6, internal journal I'm experiencing similar issue. I'm running FC3 (2.6.11-1.14_FC3) with latest patches. I don't have any firewire devices. The bug randomly freezes the whole system (numlock doesn't respond). I'm also using nvidia and nvsound drivers, but I don't think they have anything to do with this issue since it appears for others too. Possible workaround: I've set acpi=off and added ohci1394 and ieee1394 to /etc/hotplug/blacklist and I've had a whole day without a crash. ieee1394: Initialized config rom entry `ip1394' ohci1394: $Rev: 1223 $ Ben Collins <bcollins> PCI: Setting latency timer of device 0000:00:0d.0 to 64 ohci1394: fw-host0: OHCI-1394 1.1 (PCI): IRQ=[9] MMIO=[e5084000-e50847ff] Max Packet=[2048] md: Autodetecting RAID arrays. md: autorun ... md: ... autorun DONE. Badness in get_phy_reg at drivers/ieee1394/ohci1394.c:238 (Tainted: P ) [<f89a01e1>] get_phy_reg+0x1e1/0x27a [ohci1394] [<f89a164c>] ohci_devctl+0x4ad/0x8c8 [ohci1394] [<c0103a72>] common_interrupt+0x1a/0x20 [<f89a35a4>] ohci_irq_handler+0x5a7/0xbff [ohci1394] [<c010878f>] timer_interrupt+0x80/0x22e [<c0149670>] handle_IRQ_event+0x31/0x62 [<c014975f>] __do_IRQ+0xbe/0x31f [<c010564f>] do_IRQ+0x4c/0x85 ======================= [<c0103a72>] common_interrupt+0x1a/0x20 [<c0126c7c>] __do_softirq+0x2c/0x8a [<c010575b>] do_softirq+0x39/0x40 ======================= [<c0105656>] do_IRQ+0x53/0x85 [<c0105656>] do_IRQ+0x53/0x85 [<c0103a72>] common_interrupt+0x1a/0x20 [<f89a0354>] set_phy_reg+0xda/0x26d [ohci1394] [<f89a163b>] ohci_devctl+0x49c/0x8c8 [ohci1394] [<f8c3ccb7>] csr1212_fill_cache+0xbb/0xf4 [ieee1394] [<f8c3ce6d>] csr1212_generate_csr_image+0x17d/0x1ba [ieee1394] [<f8c32133>] hpsb_reset_bus+0x20/0x26 [ieee1394] [<c0136812>] worker_thread+0x1c4/0x415 [<f8c34d18>] delayed_reset_bus+0x0/0xba [ieee1394] [<c011bac8>] default_wake_function+0x0/0xc [<c011bac8>] default_wake_function+0x0/0xc [<c013664e>] worker_thread+0x0/0x415 [<c013c960>] kthread+0x7c/0xa4 [<c013c8e4>] kthread+0x0/0xa4 [<c0101291>] kernel_thread_helper+0x5/0xb Badness in set_phy_reg at drivers/ieee1394/ohci1394.c:267 (Tainted: P ) [<f89a040e>] set_phy_reg+0x194/0x26d [ohci1394] [<f89a163b>] ohci_devctl+0x49c/0x8c8 [ohci1394] [<c0103a72>] common_interrupt+0x1a/0x20 [<f89a35a4>] ohci_irq_handler+0x5a7/0xbff [ohci1394] [<c010878f>] timer_interrupt+0x80/0x22e [<c0149670>] handle_IRQ_event+0x31/0x62 [<c014975f>] __do_IRQ+0xbe/0x31f [<c010564f>] do_IRQ+0x4c/0x85 ======================= [<c0103a72>] common_interrupt+0x1a/0x20 [<c0126c7c>] __do_softirq+0x2c/0x8a [<c010575b>] do_softirq+0x39/0x40 ======================= [<c0105656>] do_IRQ+0x53/0x85 [<c0105656>] do_IRQ+0x53/0x85 [<c0103a72>] common_interrupt+0x1a/0x20 [<f89a0354>] set_phy_reg+0xda/0x26d [ohci1394] [<f89a163b>] ohci_devctl+0x49c/0x8c8 [ohci1394] [<f8c3ccb7>] csr1212_fill_cache+0xbb/0xf4 [ieee1394] [<f8c3ce6d>] csr1212_generate_csr_image+0x17d/0x1ba [ieee1394] [<f8c32133>] hpsb_reset_bus+0x20/0x26 [ieee1394] [<c0136812>] worker_thread+0x1c4/0x415 [<f8c34d18>] delayed_reset_bus+0x0/0xba [ieee1394] [<c011bac8>] default_wake_function+0x0/0xc [<c011bac8>] default_wake_function+0x0/0xc [<c013664e>] worker_thread+0x0/0x415 [<c013c960>] kthread+0x7c/0xa4 [<c013c8e4>] kthread+0x0/0xa4 [<c0101291>] kernel_thread_helper+0x5/0xb ohci1394: fw-host0: SelfID received outside of bus reset sequence ieee1394: Host added: ID:BUS[0-00:1023] GUID[00e0180000803bb8] There are a bunch of ieee1394 patches in the updates-testing kernel. The latest one is at http://people.redhat.com/davej/kernels/Fedora/FC3 I'll be interested to hear if they fix this for you. I would just like to say that I cannot confirm the random freeze symptoms in comment #7. I do not want this bug closed as a nvidia duplicate and I would appreciate it if people having random crashes while using nvidia binary drivers file their problems in separate bugs or test without the nvidia drivers loaded before adding comments here. davej: I've just tried kernel-2.6.11-1.24_FC3.i686.rpm but I still see the following: Badness in ohci_hw_csr_reg at drivers/ieee1394/ohci1394.c:3154 (Not tainted) [<d09cdee3>] ohci_hw_csr_reg+0xad/0xaf [ohci1394] [<d0a1481c>] host_reset+0x11a/0x126 [ieee1394] [<d0a1407c>] highlevel_host_reset+0x31/0x3a [ieee1394] [<d09cc7fd>] ohci_irq_handler+0x800/0xbff [ohci1394] [<c010878f>] timer_interrupt+0x80/0x22e [<c0148670>] handle_IRQ_event+0x31/0x62 [<c014875f>] __do_IRQ+0xbe/0x31f [<c010564f>] do_IRQ+0x4c/0x85 ======================= [<d09a0788>] stall_callback+0x0/0x36f [uhci_hcd] [<c0103a72>] common_interrupt+0x1a/0x20 [<c0125c7c>] __do_softirq+0x2c/0x8a [<c010575b>] do_softirq+0x39/0x40 ======================= [<c0105656>] do_IRQ+0x53/0x85 [<c0105656>] do_IRQ+0x53/0x85 [<c0103a72>] common_interrupt+0x1a/0x20 [<d09c9354>] set_phy_reg+0xda/0x26d [ohci1394] [<d09ca63b>] ohci_devctl+0x49c/0x8c8 [ohci1394] [<d0a1acb7>] csr1212_fill_cache+0xbb/0xf4 [ieee1394] [<d0a1ae6d>] csr1212_generate_csr_image+0x17d/0x1ba [ieee1394] [<d0a10133>] hpsb_reset_bus+0x20/0x26 [ieee1394] [<c0135812>] worker_thread+0x1c4/0x415 [<d0a12d18>] delayed_reset_bus+0x0/0xba [ieee1394] [<c011aac8>] default_wake_function+0x0/0xc [<c011aac8>] default_wake_function+0x0/0xc [<c013564e>] worker_thread+0x0/0x415 [<c013b960>] kthread+0x7c/0xa4 [<c013b8e4>] kthread+0x0/0xa4 [<c0101291>] kernel_thread_helper+0x5/0xb ieee1394: Host added: ID:BUS[0-00:1023] GUID[00004c0100002bc5] I've tryed kernel-2.6.11-1.25_FC3.i686.rpm ieee1394: Initialized config rom entry `ip1394' ohci1394: $Rev: 1223 $ Ben Collins <bcollins> PCI: Found IRQ 11 for device 0000:00:09.0 PCI: Sharing IRQ 11 with 0000:00:0d.0 PCI: Sharing IRQ 11 with 0000:00:10.1 ohci1394: fw-host0: Unexpected PCI resource length of 1000! ohci1394: fw-host0: OHCI-1394 1.1 (PCI): IRQ=[11] MMIO=[e3800000-e38007ff] Max P acket=[2048] md: Autodetecting RAID arrays. md: autorun ... md: ... autorun DONE. Badness in ohci_hw_csr_reg at drivers/ieee1394/ohci1394.c:3154 (Not tainted) [<e0a90ee3>] ohci_hw_csr_reg+0xad/0xaf [ohci1394] [<e0af081c>] host_reset+0x11a/0x126 [ieee1394] [<e0af007c>] highlevel_host_reset+0x31/0x3a [ieee1394] [<e0a8f7fd>] ohci_irq_handler+0x800/0xbff [ohci1394] [<c010878f>] timer_interrupt+0x80/0x22e [<c0148640>] handle_IRQ_event+0x31/0x62 [<c014872f>] __do_IRQ+0xbe/0x31f [<c010564f>] do_IRQ+0x4c/0x85 ======================= [<c0103a72>] common_interrupt+0x1a/0x20 [<c0125c4c>] __do_softirq+0x2c/0x8a [<c010575b>] do_softirq+0x39/0x40 ======================= [<c0105656>] do_IRQ+0x53/0x85 [<c0105656>] do_IRQ+0x53/0x85 [<c0103a72>] common_interrupt+0x1a/0x20 [<e0a8c354>] set_phy_reg+0xda/0x26d [ohci1394] [<e0a8d63b>] ohci_devctl+0x49c/0x8c8 [ohci1394] [<e0af6cb7>] csr1212_fill_cache+0xbb/0xf4 [ieee1394] [<e0af6e6d>] csr1212_generate_csr_image+0x17d/0x1ba [ieee1394] [<e0aec133>] hpsb_reset_bus+0x20/0x26 [ieee1394] [<c01357e2>] worker_thread+0x1c4/0x415 [<e0aeed18>] delayed_reset_bus+0x0/0xba [ieee1394] [<c011aac8>] default_wake_function+0x0/0xc [<c011aac8>] default_wake_function+0x0/0xc [<c013561e>] worker_thread+0x0/0x415 [<c013b930>] kthread+0x7c/0xa4 [<c013b8b4>] kthread+0x0/0xa4 [<c0101291>] kernel_thread_helper+0x5/0xb ieee1394: Host added: ID:BUS[0-00:1023] GUID[00004c0100003b1d] EXT3 FS on sda6, internal journal The following things are on IRQ 11 also: 11: 25698 XT-PIC 3w-xxxx, uhci_hcd, uhci_hcd, ohci1394, eth0 I just tried 2.6.11-1.29_FC3 and am still having this problem: ieee1394: Initialized config rom entry `ip1394' ohci1394: $Rev: 1223 $ Ben Collins <bcollins> PCI: Enabling device 0000:00:10.0 (0014 -> 0016) PCI: IRQ 0 for device 0000:00:10.0 doesn't match PIRQ mask - try pci=usepirqmask PCI: setting IRQ 9 as level-triggered PCI: Assigned IRQ 9 for device 0000:00:10.0 ohci1394: fw-host0: Unexpected PCI resource length of 1000! ohci1394: fw-host0: OHCI-1394 1.1 (PCI): IRQ=[9] MMIO=[dd000000-dd0007ff] Max Packet=[2048] Badness in ohci_hw_csr_reg at drivers/ieee1394/ohci1394.c:3154 (Not tainted) [<f0998ee3>] ohci_hw_csr_reg+0xad/0xaf [ohci1394] [<f09e181c>] host_reset+0x11a/0x126 [ieee1394] [<f09e107c>] highlevel_host_reset+0x31/0x3a [ieee1394] [<f09977fd>] ohci_irq_handler+0x800/0xbff [ohci1394] [<c010878f>] timer_interrupt+0x80/0x22e [<c0148690>] handle_IRQ_event+0x31/0x62 [<c014877f>] __do_IRQ+0xbe/0x31f [<c010564f>] do_IRQ+0x4c/0x85 ======================= [<c0103a72>] common_interrupt+0x1a/0x20 [<c0125c9c>] __do_softirq+0x2c/0x8a [<c010575b>] do_softirq+0x39/0x40 ======================= [<c0105656>] do_IRQ+0x53/0x85 [<c0105656>] do_IRQ+0x53/0x85 [<c0103a72>] common_interrupt+0x1a/0x20 [<f0994354>] set_phy_reg+0xda/0x26d [ohci1394] [<f099563b>] ohci_devctl+0x49c/0x8c8 [ohci1394] [<f09e7cb7>] csr1212_fill_cache+0xbb/0xf4 [ieee1394] [<f09e7e6d>] csr1212_generate_csr_image+0x17d/0x1ba [ieee1394] [<f09dd133>] hpsb_reset_bus+0x20/0x26 [ieee1394] [<c0135832>] worker_thread+0x1c4/0x415 [<f09dfd18>] delayed_reset_bus+0x0/0xba [ieee1394] [<f0998ee3>] ohci_hw_csr_reg+0xad/0xaf [ohci1394] [<f09e181c>] host_reset+0x11a/0x126 [ieee1394] [<f09e107c>] highlevel_host_reset+0x31/0x3a [ieee1394] [<f09977fd>] ohci_irq_handler+0x800/0xbff [ohci1394] [<c010878f>] timer_interrupt+0x80/0x22e [<c0148690>] handle_IRQ_event+0x31/0x62 [<c014877f>] __do_IRQ+0xbe/0x31f [<c010564f>] do_IRQ+0x4c/0x85 ======================= [<c0103a72>] common_interrupt+0x1a/0x20 [<c0125c9c>] __do_softirq+0x2c/0x8a [<c010575b>] do_softirq+0x39/0x40 ======================= [<c0105656>] do_IRQ+0x53/0x85 [<c0105656>] do_IRQ+0x53/0x85 [<c0103a72>] common_interrupt+0x1a/0x20 [<f0994354>] set_phy_reg+0xda/0x26d [ohci1394] [<f099563b>] ohci_devctl+0x49c/0x8c8 [ohci1394] [<f09e7cb7>] csr1212_fill_cache+0xbb/0xf4 [ieee1394] [<f09e7e6d>] csr1212_generate_csr_image+0x17d/0x1ba [ieee1394] [<f09dd133>] hpsb_reset_bus+0x20/0x26 [ieee1394] [<c0135832>] worker_thread+0x1c4/0x415 [<f09dfd18>] delayed_reset_bus+0x0/0xba [ieee1394] [<c011ab18>] default_wake_function+0x0/0xc [<c011ab18>] default_wake_function+0x0/0xc [<c013566e>] worker_thread+0x0/0x415 [<c013b980>] kthread+0x7c/0xa4 [<c013b904>] kthread+0x0/0xa4 [<c0101291>] kernel_thread_helper+0x5/0xb ieee1394: Node added: ID:BUS[0-00:1023] GUID[000a270002983659] ieee1394: Host added: ID:BUS[0-01:1023] GUID[00004c0100003fbf] No surprises but this is still here in kernel-2.6.11-1.1369_FC4 I'm currently running 2.6.12-1.1387_FC4 and I don't have any firewire devices, but found the following in the log: Jul 5 17:29:32 localhost kernel: input: USB HID v1.11 Mouse [Logitech USB Receiver] on usb-0000:00:02.1-2 Jul 5 17:29:32 localhost kernel: Badness in get_phy_reg at drivers/ieee1394/ohci1394.c:238 (Not tainted) Jul 5 17:29:32 localhost kernel: [<f8a53288>] get_phy_reg+0x288/0x28d [ohci1394] Jul 5 17:29:32 localhost kernel: [<c028fc26>] freed_request+0x16/0x66 Jul 5 17:29:32 localhost kernel: [<c0290984>] __blk_put_request+0x5a/0x74 Jul 5 17:29:32 localhost kernel: [<f8a54856>] ohci_devctl+0x618/0xaf6 [ohci1394] Jul 5 17:29:32 localhost kernel: [<c0103c0e>] common_interrupt+0x1a/0x20 Jul 5 17:29:32 localhost kernel: [<c011295b>] delay_pmtmr+0xb/0x13 Jul 5 17:29:32 localhost kernel: [<f8a569a6>] ohci_irq_handler+0x655/0xce2 [ohci1394] Jul 5 17:29:32 localhost kernel: [<c02351f7>] acpi_ev_gpe_detect+0xc8/0xd3 Jul 5 17:29:32 localhost kernel: [<c01507fd>] handle_IRQ_event+0x2e/0x5a Jul 5 17:29:32 localhost kernel: [<c01508dc>] __do_IRQ+0xb3/0x347 Jul 5 17:29:32 localhost kernel: [<c0105b1d>] do_IRQ+0x4a/0x82 Jul 5 17:29:32 localhost kernel: ======================= Jul 5 17:29:32 localhost kernel: [<c0138c45>] delayed_work_timer_fn+0x0/0xa Jul 5 17:29:32 localhost kernel: [<c0103c0e>] common_interrupt+0x1a/0x20 Jul 5 17:29:32 localhost kernel: [<c01282bc>] __do_softirq+0x2c/0x8a Jul 5 17:29:32 localhost kernel: [<c0105c29>] do_softirq+0x3e/0x42 Jul 5 17:29:32 localhost kernel: ======================= Jul 5 17:29:32 localhost kernel: [<c0105b24>] do_IRQ+0x51/0x82 Jul 5 17:29:32 localhost kernel: [<c0103c0e>] common_interrupt+0x1a/0x20 Jul 5 17:29:32 localhost kernel: [<f8a5336e>] set_phy_reg+0xe1/0x270 [ohci1394] Jul 5 17:29:32 localhost kernel: [<c0156d2b>] __free_pages_ok+0x9d/0xc4 Jul 5 17:29:32 localhost kernel: [<f8a54843>] ohci_devctl+0x605/0xaf6 [ohci1394] Jul 5 17:29:32 localhost kernel: [<f8ae1c33>] csr1212_fill_cache+0xdf/0x106 [ieee1394] Jul 5 17:29:32 localhost kernel: [<f8ae1da1>] csr1212_generate_csr_image+0x147/0x23e [ieee1394] Jul 5 17:29:32 localhost kernel: [<f8ad7133>] hpsb_reset_bus+0x20/0x26 [ieee1394] Jul 5 17:29:32 localhost kernel: [<c0138e52>] worker_thread+0x198/0x457 Jul 5 17:29:32 localhost kernel: [<c03701dd>] schedule+0x31d/0x7b3 Jul 5 17:29:32 localhost kernel: [<c011be46>] __wake_up_common+0x39/0x59 Jul 5 17:29:32 localhost kernel: [<f8ad9bd0>] delayed_reset_bus+0x0/0xc0 [ieee1394] Jul 5 17:29:32 localhost kernel: [<c011be01>] default_wake_function+0x0/0xc Jul 5 17:29:32 localhost kernel: [<c0138cba>] worker_thread+0x0/0x457 Jul 5 17:29:32 localhost kernel: [<c013f57b>] kthread+0x87/0x8b Jul 5 17:29:32 localhost kernel: [<c013f4f4>] kthread+0x0/0x8b Jul 5 17:29:32 localhost kernel: [<c01012ad>] kernel_thread_helper+0x5/0xb Jul 5 17:29:32 localhost kernel: Badness in set_phy_reg at drivers/ieee1394/ohci1394.c:267 (Not tainted) Jul 5 17:29:32 localhost kernel: [<f8a534f8>] set_phy_reg+0x26b/0x270 [ohci1394] Jul 5 17:29:32 localhost kernel: [<c028fc26>] freed_request+0x16/0x66 Jul 5 17:29:32 localhost kernel: [<c0290984>] __blk_put_request+0x5a/0x74 Jul 5 17:29:32 localhost kernel: [<f8a54868>] ohci_devctl+0x62a/0xaf6 [ohci1394] Jul 5 17:29:32 localhost kernel: [<c0103c0e>] common_interrupt+0x1a/0x20 Jul 5 17:29:32 localhost kernel: [<c011295b>] delay_pmtmr+0xb/0x13 Jul 5 17:29:32 localhost kernel: [<f8a569a6>] ohci_irq_handler+0x655/0xce2 [ohci1394] Jul 5 17:29:32 localhost kernel: [<c02351f7>] acpi_ev_gpe_detect+0xc8/0xd3 Jul 5 17:29:32 localhost kernel: [<c01507fd>] handle_IRQ_event+0x2e/0x5a Jul 5 17:29:32 localhost kernel: [<c01508dc>] __do_IRQ+0xb3/0x347 Jul 5 17:29:32 localhost kernel: [<c0105b1d>] do_IRQ+0x4a/0x82 Jul 5 17:29:32 localhost kernel: ======================= Jul 5 17:29:32 localhost kernel: [<c0138c45>] delayed_work_timer_fn+0x0/0xa Jul 5 17:29:32 localhost kernel: [<c0103c0e>] common_interrupt+0x1a/0x20 Jul 5 17:29:32 localhost kernel: [<c01282bc>] __do_softirq+0x2c/0x8a Jul 5 17:29:32 localhost kernel: [<c0105c29>] do_softirq+0x3e/0x42 Jul 5 17:29:32 localhost kernel: ======================= Jul 5 17:29:32 localhost kernel: [<c0105b24>] do_IRQ+0x51/0x82 Jul 5 17:29:32 localhost kernel: [<c0103c0e>] common_interrupt+0x1a/0x20 Jul 5 17:29:32 localhost kernel: [<f8a5336e>] set_phy_reg+0xe1/0x270 [ohci1394] Jul 5 17:29:32 localhost kernel: [<c0156d2b>] __free_pages_ok+0x9d/0xc4 Jul 5 17:29:32 localhost kernel: [<f8a54843>] ohci_devctl+0x605/0xaf6 [ohci1394] Jul 5 17:29:32 localhost kernel: [<f8ae1c33>] csr1212_fill_cache+0xdf/0x106 [ieee1394] Jul 5 17:29:32 localhost kernel: [<f8ae1da1>] csr1212_generate_csr_image+0x147/0x23e [ieee1394] Jul 5 17:29:32 localhost kernel: [<f8ad7133>] hpsb_reset_bus+0x20/0x26 [ieee1394] Jul 5 17:29:32 localhost kernel: [<c0138e52>] worker_thread+0x198/0x457 Jul 5 17:29:32 localhost kernel: [<c03701dd>] schedule+0x31d/0x7b3 Jul 5 17:29:32 localhost kernel: [<c011be46>] __wake_up_common+0x39/0x59 Jul 5 17:29:32 localhost kernel: [<f8ad9bd0>] delayed_reset_bus+0x0/0xc0 [ieee1394] Jul 5 17:29:32 localhost kernel: [<c011be01>] default_wake_function+0x0/0xc Jul 5 17:29:32 localhost kernel: [<c0138cba>] worker_thread+0x0/0x457 Jul 5 17:29:32 localhost kernel: [<c013f57b>] kthread+0x87/0x8b Jul 5 17:29:32 localhost kernel: [<c013f4f4>] kthread+0x0/0x8b Jul 5 17:29:32 localhost kernel: [<c01012ad>] kernel_thread_helper+0x5/0xb Jul 5 17:29:32 localhost kernel: ohci1394: fw-host0: SelfID received outside of bus reset sequence Jul 5 17:29:32 localhost kernel: ACPI: Power Button (FF) [PWRF] An update has been released for Fedora Core 3 (kernel-2.6.12-1.1372_FC3) which may contain a fix for your problem. Please update to this new kernel, and report whether or not it fixes your problem. If you have updated to Fedora Core 4 since this bug was opened, and the problem still occurs with the latest updates for that release, please change the version field of this bug to 'fc4'. Thank you. I see this on FC4, kernel-2.6.12-1.1398_FC4 (identical log to the one from 2.6.12-1.1387_FC4 above). I have a firewire controller but no firewire devices attached. Same problem here, FC4, kernel 2.6.12-1.1390, had the problem throughout all of FC3. The only 1394 devices on the bus are a pair of SBP2's (four drive enclosure with two controllers, one drive on each controller atm). From what I've seen on the net, this is a long running bug with the Linux kernel in general, probably with regards to multiple LUNs on a single device - when was the last time they updated the Firewire drivers, anyways? The funny part is that the drives seem to work fine anyways. Badness in get_phy_reg at drivers/ieee1394/ohci1394.c:238 (Not tainted) [<e0a69288>] get_phy_reg+0x288/0x28d [ohci1394] [<c028fda6>] freed_request+0x16/0x66 [<c0290b04>] __blk_put_request+0x5a/0x74 [<e0a6a856>] ohci_devctl+0x618/0xaf6 [ohci1394] [<c0103c0e>] common_interrupt+0x1a/0x20 [<c0112959>] delay_pmtmr+0x9/0x13 [<e0a6c9a6>] ohci_irq_handler+0x655/0xce2 [ohci1394] [<c0108c4a>] timer_interrupt+0x7c/0x1b5 [<c02ba58e>] ide_dma_intr+0x0/0xb5 [<c015096d>] handle_IRQ_event+0x2e/0x5a [<c0150a4c>] __do_IRQ+0xb3/0x347 [<c0105b1d>] do_IRQ+0x4a/0x82 ======================= [<c0138c45>] delayed_work_timer_fn+0x0/0xa [<c0103c0e>] common_interrupt+0x1a/0x20 [<c01282bc>] __do_softirq+0x2c/0x8a [<c0105c29>] do_softirq+0x3e/0x42 ======================= [<c0105b24>] do_IRQ+0x51/0x82 [<c0103c0e>] common_interrupt+0x1a/0x20 [<e0a6936e>] set_phy_reg+0xe1/0x270 [ohci1394] [<c01515b6>] free_irq+0x85/0x1f6 [<e0a6a843>] ohci_devctl+0x605/0xaf6 [ohci1394] [<e0ac1c33>] csr1212_fill_cache+0xdf/0x106 [ieee1394] [<e0ac1da1>] csr1212_generate_csr_image+0x147/0x23e [ieee1394] [<e0ab7133>] hpsb_reset_bus+0x20/0x26 [ieee1394] [<c0138e52>] worker_thread+0x198/0x457 [<c03705ad>] schedule+0x31d/0x7b3 [<c011be46>] __wake_up_common+0x39/0x59 [<e0ab9bd0>] delayed_reset_bus+0x0/0xc0 [ieee1394] [<c011be01>] default_wake_function+0x0/0xc [<c0138cba>] worker_thread+0x0/0x457 [<c013f57b>] kthread+0x87/0x8b [<c013f4f4>] kthread+0x0/0x8b [<c01012ad>] kernel_thread_helper+0x5/0xb Badness in set_phy_reg at drivers/ieee1394/ohci1394.c:267 (Not tainted) [<e0a694f8>] set_phy_reg+0x26b/0x270 [ohci1394] [<c028fda6>] freed_request+0x16/0x66 [<c0290b04>] __blk_put_request+0x5a/0x74 [<e0a6a868>] ohci_devctl+0x62a/0xaf6 [ohci1394] [<c0103c0e>] common_interrupt+0x1a/0x20 [<c0112959>] delay_pmtmr+0x9/0x13 [<e0a6c9a6>] ohci_irq_handler+0x655/0xce2 [ohci1394] [<c0108c4a>] timer_interrupt+0x7c/0x1b5 [<c02ba58e>] ide_dma_intr+0x0/0xb5 [<c015096d>] handle_IRQ_event+0x2e/0x5a [<c0150a4c>] __do_IRQ+0xb3/0x347 [<c0105b1d>] do_IRQ+0x4a/0x82 ======================= [<c0138c45>] delayed_work_timer_fn+0x0/0xa [<c0103c0e>] common_interrupt+0x1a/0x20 [<c01282bc>] __do_softirq+0x2c/0x8a [<c0105c29>] do_softirq+0x3e/0x42 ======================= [<c0105b24>] do_IRQ+0x51/0x82 [<c0103c0e>] common_interrupt+0x1a/0x20 [<e0a6936e>] set_phy_reg+0xe1/0x270 [ohci1394] [<c01515b6>] free_irq+0x85/0x1f6 [<e0a6a843>] ohci_devctl+0x605/0xaf6 [ohci1394] [<e0ac1c33>] csr1212_fill_cache+0xdf/0x106 [ieee1394] [<e0ac1da1>] csr1212_generate_csr_image+0x147/0x23e [ieee1394] [<e0ab7133>] hpsb_reset_bus+0x20/0x26 [ieee1394] [<c0138e52>] worker_thread+0x198/0x457 [<c03705ad>] schedule+0x31d/0x7b3 [<c011be46>] __wake_up_common+0x39/0x59 [<e0ab9bd0>] delayed_reset_bus+0x0/0xc0 [ieee1394] [<c011be01>] default_wake_function+0x0/0xc [<c0138cba>] worker_thread+0x0/0x457 [<c013f57b>] kthread+0x87/0x8b [<c013f4f4>] kthread+0x0/0x8b [<c01012ad>] kernel_thread_helper+0x5/0xb ohci1394: fw-host0: SelfID received outside of bus reset sequence I also confirm that this still occurs under: Linux jupiter.acf.aquezada.com 2.6.12-1.1372_FC3 #1 Fri Jul 15 00:59:10 EDT 2005 i686 i686 i386 GNU/Linux The following is the stack trace: Badness in ohci_hw_csr_reg at drivers/ieee1394/ohci1394.c:3185 (Not tainted) [<f0851070>] ohci_hw_csr_reg+0xad/0xaf [ohci1394] [<f08e64bc>] host_reset+0x11a/0x126 [ieee1394] [<f08e5d1c>] highlevel_host_reset+0x31/0x3a [ieee1394] [<f084f92d>] ohci_irq_handler+0x800/0xbff [ohci1394] [<c01088bc>] timer_interrupt+0x7a/0x1a1 [<c014e410>] handle_IRQ_event+0x31/0x62 [<c014e4ff>] __do_IRQ+0xbe/0x2ff [<c010590b>] do_IRQ+0x4c/0x85 ======================= [<c013750c>] queue_work+0x1e/0x31 [<c0103aea>] common_interrupt+0x1a/0x20 [<c012797c>] __do_softirq+0x2c/0x8a [<c0105a17>] do_softirq+0x39/0x40 ======================= [<c0105912>] do_IRQ+0x53/0x85 [<c0105912>] do_IRQ+0x53/0x85 [<c0103aea>] common_interrupt+0x1a/0x20 [<f084c354>] set_phy_reg+0xda/0x26d [ohci1394] [<f084d68f>] ohci_devctl+0x525/0xa1d [ohci1394] [<f08ec88b>] csr1212_fill_cache+0xb4/0xed [ieee1394] [<f08e2133>] hpsb_reset_bus+0x20/0x26 [ieee1394] [<c0137752>] worker_thread+0x1c4/0x415 [<f08e4a40>] delayed_reset_bus+0x0/0xba [ieee1394] [<c011bb8c>] default_wake_function+0x0/0xc [<c011bb8c>] default_wake_function+0x0/0xc [<c013758e>] worker_thread+0x0/0x415 [<c013da70>] kthread+0x7c/0xa4 [<c013d9f4>] kthread+0x0/0xa4 [<c01012c1>] kernel_thread_helper+0x5/0xb ieee1394: Host added: ID:BUS[0-00:1023] GUID[00004c0100003fbf] There are no firewire devices attached to the system when it first boots up. As the other reporters have mentioned this is still here in 2.6.12-1.1398_FC4. Changing version field to fc4 as requested. Same problem here with FC4 and latest updates: ohci1394: $Rev: 1250 $ Ben Collins <bcollins> ACPI: PCI Interrupt 0000:03:04.0[A] -> Link [LNKA] -> GSI 10 (level, low) -> IRQ 10 ohci1394: fw-host0: OHCI-1394 1.0 (PCI): IRQ=[10] MMIO=[dfefe800-dfefefff] Max Packet=[2048] ieee1394: SelfIDs failed root check ieee1394: Error in SelfID stage, resetting Badness in get_phy_reg at drivers/ieee1394/ohci1394.c:238 (Not tainted) [<e09d6288>] get_phy_reg+0x288/0x28d [ohci1394] [<e09d7856>] ohci_devctl+0x618/0xaf6 [ohci1394] [<c0121748>] printk+0x1b/0x1f [.....] [<c042070a>] start_kernel+0x15d/0x1b7 [<c04202f7>] unknown_bootoption+0x0/0x1b6 ieee1394: SelfIDs failed root check ieee1394: Error in SelfID stage, resetting ieee1394: Error parsing configrom for node 0-00:1023 ieee1394: SelfIDs failed root check ieee1394: Error in SelfID stage, resetting ieee1394: SelfIDs failed root check ieee1394: Error in SelfID stage, resetting ieee1394: SelfIDs failed root check [.....] lspci shows the following output for my firewire card: 03:04.0 FireWire (IEEE 1394): VIA Technologies, Inc. IEEE 1394 Host Controller (rev 46) (prog-if 10 [OHCI]) Subsystem: VIA Technologies, Inc.: Unknown device e8e9 Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B- Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- Latency: 32 (8000ns max), Cache Line Size 08 Interrupt: pin A routed to IRQ 10 Region 0: Memory at dfefe800 (32-bit, non-prefetchable) [size=2K] Region 1: I/O ports at c800 [size=128] Capabilities: [50] Power Management version 2 Flags: PMEClk- DSI- D1- D2+ AuxCurrent=0mA PME(D0-,D1-,D2+,D3hot+,D3cold+) Status: D0 PME-Enable- DSel=0 DScale=0 PME- It is really nasty, since with FC2, it was no problem at all to grab the DV stream from my digital camera. With FC3 and FC4, I only get those errors, and there are already three DV tapes waiting for being processed :-( Still occurring with: Linux jupiter.acf.aquezada.com 2.6.12-1.1447_FC4 #1 Fri Aug 26 20:29:51 EDT 2005 i686 i686 i386 GNU/Linux Mass update to all FC4 bugs: An update has been released (2.6.13-1.1526_FC4) which rebases to a new upstream kernel (2.6.13.2). As there were ~3500 changes upstream between this and the previous kernel, it's possible your bug has been fixed already. Please retest with this update, and update this bug if necessary. Thanks. Still occurring with: Linux jupiter.acf.aquezada.com 2.6.13-1.1526_FC4 #1 Wed Sep 28 19:15:10 EDT 2005 i686 i686 i386 GNU/Linux 2.6.14-1.1637_FC4 has been released as an update for FC4. Please retest with this update, as a large amount of code has been changed in this release, which may have fixed your problem. Thank you. Still occurring with: Linux jupiter.acf.aquezada.com 2.6.14-1.1637_FC4 #1 Wed Nov 9 18:19:37 EST 2005 i686 i686 i386 GNU/Linux Ding dong the witch is dead... I was about ready to ask dave to set this to some special state that only requires restesting once every two major kernel changes but unbelieveably this problem is no longer showing up here. Once again, all hail davej... Can I close this? it's still problematic for Julian (see comment #24), so I think this needs to remain open for a while. Fair enough. Bear in mind when you come to set this NEEDINFO next time around that I won't knock it back into the NEW state (often with this bug other people would test new changes before I did but the bug would not move out of NEEDINFO until I replied). Bug is still around in 2.6.14-1.1637_FC4. Is there *any* chance to get it fixed in decent time and/or to increase its severity and/or priority? Remember - FW DID work properly on the 2.4 series.... (First up if you are going to start asking questions it is customary to cc yourself on the bug in question so that people know you will see their replies) Christoph: As far as I know the warning itself is mostly harmless. I have reports of people still having working firewire despite seeing this (e.g. see Comment #16 ). Asking for ETAs on bug fixes to open big issues is a bit much... However you meant that you have developed a patch resolving firewire issues please disregard the last part of this comment. It's not clear from your comment if this is the case. I'm currently running 2.6.13-1.1532_FC4smp and was running 2.6.14-1.1637_FC4smp. I backed down one to see if the problem was there, too. I did get the badness message at first, but I also got a warning that the kernel was tainted with VMWare, so I shutdown VMWare, turned of the rc.init entries, and rebooted. I did *not* get the badness message after I shutdown VMWare, but my digital videocam is not recognized and I end with this from dmesg: ieee1394: Node changed: 0-02:1023 -> 0-01:1023 ieee1394: Node suspended: ID:BUS[0-00:1023] GUID[0800460101b304c7] ieee1394: SelfIDs failed monotony check with 1 ieee1394: Error in SelfID stage, resetting ohci1394: fw-host0: SelfID received, but NodeID invalid (probably new bus reset occurred): 0800FFC0 ohci1394: fw-host0: SelfID received, but NodeID invalid (probably new bus reset occurred): 0800FFC0 ohci1394: fw-host0: Error in reception of SelfID packets [0x80180004/0x0018bcfc] (count: 0) ohci1394: fw-host0: SelfID received, but NodeID invalid (probably new bus reset occurred): 0800FFC0 ohci1394: fw-host0: Error in reception of SelfID packets [0x80200004/0x0020bd3f] (count: 1) ieee1394: Node changed: 0-01:1023 -> 0-02:1023 ieee1394: Node resumed: ID:BUS[0-00:1023] GUID[0800460101b304c7] ieee1394: Node changed: 0-02:1023 -> 0-01:1023 ieee1394: Node suspended: ID:BUS[0-00:1023] GUID[0800460101b304c7] ieee1394: SelfIDs failed monotony check with 2 ieee1394: Error in SelfID stage, resetting ohci1394: fw-host0: SelfID received, but NodeID invalid (probably new bus reset occurred): 0800FFC0 ohci1394: fw-host0: SelfID received, but NodeID invalid (probably new bus reset occurred): 0800FFC0 ohci1394: fw-host0: Error in reception of SelfID packets [0x802e0004/0x002ec9cd] (count: 0) This is a Sony DCR-TRV310 digital videocamera. I also have an Epson 2450 scanner connected via Firewire and it *is* recognized with SCSI emulation. The camera works in my wife's iMac, but she chases me off that :-) In other words, the Firewire controller seems fine since other devices work with it and the camera seems fine since it works with my wife's iMac. After a reboot into 2.6.14-1.1637_FC4smp, I *am* getting the "badness" message when I plug in the firewire videocam. ieee1394: SelfIDs failed monotony check with 1 ieee1394: Error in SelfID stage, resetting Badness in get_phy_reg at drivers/ieee1394/ohci1394.c:238 (Not tainted) [<f8f3910e>] get_phy_reg+0x10e/0x113 [ohci1394] [<f8f3a274>] ohci_devctl+0x41f/0x5f7 [ohci1394] [<c0120c4f>] printk+0x1b/0x1f [<f8f893c2>] check_selfids+0x20f/0x216 [ieee1394] [<c0120c4f>] printk+0x1b/0x1f [<f8f8913d>] hpsb_reset_bus+0x20/0x26 [ieee1394] [<f8f3bfc3>] ohci_irq_handler+0x792/0x7b3 [ohci1394] [<c0111d31>] mark_offset_tsc+0x1cc/0x318 [<f8eb3851>] snd_intel8x0_interrupt+0x215/0x23c [snd_intel8x0] [<c01296bd>] do_timer+0x43/0xea [<c01438cd>] handle_IRQ_event+0x2e/0x5a [<c0143984>] __do_IRQ+0x8b/0xe7 [<c0105b1e>] do_IRQ+0x4e/0x86 ======================= [<c011625c>] smp_apic_timer_interrupt+0xc1/0xca [<c0104426>] common_interrupt+0x1a/0x20 [<c0215f75>] acpi_processor_idle+0x0/0x291 [<c0216069>] acpi_processor_idle+0xf4/0x291 [<c0101a24>] cpu_idle+0x4e/0x63 [<c03ec87f>] start_kernel+0x173/0x1ca [<c03ec304>] unknown_bootoption+0x0/0x1b6 ieee1394: Node changed: 0-01:1023 -> 0-02:1023 ieee1394: Node changed: 0-02:1023 -> 0-03:1023 ieee1394: sbp2: Error reconnecting to SBP-2 device - reconnect failed ieee1394: sbp2: Error logging into SBP-2 device - login timed-out ieee1394: sbp2: Failed to reconnect to sbp2 device! ieee1394: Node added: ID:BUS[0-00:1023] GUID[0800460101b304c7] scsi2 : SCSI emulation for IEEE-1394 SBP-2 Devices ieee1394: sbp2: Logged into SBP-2 device ieee1394: Node 0-02:1023: Max speed [S400] - Max payload [2048] Vendor: EPSON Model: GT-9700 Rev: 1.05 Type: Processor ANSI SCSI revision: 04 Attached scsi generic sg0 at scsi2, channel 0, id 0, lun 0, type 3 ieee1394: raw1394: /dev/raw1394 device initialized (In reply to comment #29) > As far as I know the warning itself is mostly harmless. I have reports of people > still having working firewire despite seeing this (e.g. see Comment #16 ). > Asking for ETAs on bug fixes to open big issues is a bit much... "Working" is a bit of an overstatement... it occasionally works. For example, with 2.6.13-1.1526_FC4 I was unable to get the kernel to recognize any firewire devices; it would simply say something like "root node is not bus master capable, select another root node" and fail to work. It's better with the new kernel (2.6.14-1.1637_FC4) but occasionally the machine will lock up. I'm not sure that the lockups are related to this problem, but seeing a kernel stack dump doesn't give me confidence about the stability of the kernel. *Some* firewire devices work. My videocamera does *not* even though in a previous release it did. Sorry, I can't pinpoint what release it did work in, but at one point I did use kino to retrieve video and control the camera and even earlier I used some of the command-line tools to do the same. At this point, there is nothing in the log to indicate the camera is recognized and kino pointedly says no camera is attached. I'd be happy to poke around with any tests someone cares to name, am I'm willing to believe this warning is not particularly deadly, but it would seem to indicate a real problem since the camera is not recognized or usable and the warning happens when I turn on the connected camera.... I'm going to have to retract my statement that the videocamera doesn't work. I needed to manually load module video1394 after which is does work. I've got some performance issues with kino, but I can't say where those are right now. This is a mass-update to all currently open kernel bugs. A new kernel update has been released (Version: 2.6.15-1.1830_FC4) based upon a new upstream kernel release. Please retest against this new kernel, as a large number of patches go into each upstream release, possibly including changes that may address this problem. This bug has been placed in NEEDINFO_REPORTER state. Due to the large volume of inactive bugs in bugzilla, if this bug is still in this state in two weeks time, it will be closed. Should this bug still be relevant after this period, the reporter can reopen the bug at any time. Any other users on the Cc: list of this bug can request that the bug be reopened by adding a comment to the bug. If this bug is a problem preventing you from installing the release this version is filed against, please see bug 169613. Thank you. *** Bug 179923 has been marked as a duplicate of this bug. *** still fails with kernel-2.6.15-1.1830_FC4 ohci1394: fw-host0: OHCI-1394 1.1 (PCI): IRQ=[3] MMIO=[ee084000-ee0847ff] Max Packet=[2048] Badness in get_phy_reg at drivers/ieee1394/ohci1394.c:238 (Not tainted) [<f89c110e>] get_phy_reg+0x10e/0x113 [ohci1394] [<f89c2272>] ohci_devctl+0x41f/0x5f7 [ohci1394] [<c01038ba>] common_interrupt+0x1a/0x20 [<c011098e>] delay_pmtmr+0xb/0x13 [<f89c3e08>] ohci_irq_handler+0x5e2/0x7a9 [ohci1394] [<c013cccd>] handle_IRQ_event+0x2e/0x5a [<c013cd77>] __do_IRQ+0x7e/0xd7 [<c0104f3a>] do_IRQ+0x4a/0x82 ======================= [<c01038ba>] common_interrupt+0x1a/0x20 [<c0316c3a>] _spin_unlock_irqrestore+0xa/0xc [<f89c225d>] ohci_devctl+0x40a/0x5f7 [ohci1394] [<f8a41a03>] csr1212_fill_cache+0xdf/0x106 [ieee1394] [<f8a41b50>] csr1212_generate_csr_image+0x126/0x249 [ieee1394] [<f8a38135>] hpsb_reset_bus+0x20/0x26 [ieee1394] [<c012beef>] worker_thread+0x182/0x22a [<f8a3a22c>] delayed_reset_bus+0x0/0xc0 [ieee1394] [<c0119a67>] default_wake_function+0x0/0xc [<c012bd6d>] worker_thread+0x0/0x22a [<c012f8eb>] kthread+0x87/0x8b [<c012f864>] kthread+0x0/0x8b [<c0101309>] kernel_thread_helper+0x5/0xb Badness in set_phy_reg at drivers/ieee1394/ohci1394.c:267 (Not tainted) [<f89c1221>] set_phy_reg+0x10e/0x113 [ohci1394] [<f89c2286>] ohci_devctl+0x433/0x5f7 [ohci1394] [<c01038ba>] common_interrupt+0x1a/0x20 [<c011098e>] delay_pmtmr+0xb/0x13 [<f89c3e08>] ohci_irq_handler+0x5e2/0x7a9 [ohci1394] [<c013cccd>] handle_IRQ_event+0x2e/0x5a [<c013cd77>] __do_IRQ+0x7e/0xd7 [<c0104f3a>] do_IRQ+0x4a/0x82 ======================= [<c01038ba>] common_interrupt+0x1a/0x20 [<c0316c3a>] _spin_unlock_irqrestore+0xa/0xc [<f89c225d>] ohci_devctl+0x40a/0x5f7 [ohci1394] [<f8a41a03>] csr1212_fill_cache+0xdf/0x106 [ieee1394] [<f8a41b50>] csr1212_generate_csr_image+0x126/0x249 [ieee1394] [<f8a38135>] hpsb_reset_bus+0x20/0x26 [ieee1394] [<c012beef>] worker_thread+0x182/0x22a [<f8a3a22c>] delayed_reset_bus+0x0/0xc0 [ieee1394] [<c0119a67>] default_wake_function+0x0/0xc [<c012bd6d>] worker_thread+0x0/0x22a [<c012f8eb>] kthread+0x87/0x8b [<c012f864>] kthread+0x0/0x8b [<c0101309>] kernel_thread_helper+0x5/0xb ohci1394: fw-host0: SelfID received outside of bus reset sequence ieee1394: Host added: ID:BUS[0-00:1023] GUID[00301b301bae557d] Created attachment 124143 [details]
dmesg with startup and stack trace of the error
my entire dmesg for reference
What the badness message and stackdump mean: Ohci1394 performs busy wait loops while it has interrupts disabled. It happens on different occasions, most notably on FireWire bus resets (basically, when the host adapter is initialized and when devices are plugged in or out). The "badness" of it is that other interrupt handlers cannot run while this is happening. This is against the general goal of achieving low latency in the kernel. Whether it directly influences e.g. the IEEE 1394 kernel subsystem's bus reset handling in a negative way is unknown. When this bug will be fixed: The badness message is only shown by RedHat/ FC kernels and newer -mm kernels. OTOH the offending code has been in all kernels in ohci1394 since day one AFAICT. The issue was first reported to the linux1394-devel or -user mailinglist in 12/2005 and was first discussed there in 01/2006. Right now I believe it is rather low on the todo lists of the few active linux1394 developers. Do not expect a fix soon. I have opened a bug at bugzilla.kernel.org to keep track of the issue ( http://bugzilla.kernel.org/show_bug.cgi?id=6070 ). I will certainly remember to report here once a patch becomes available. After updating the machine to FC5 this problem reappeared for me. Updating version (perhaps the version should be set to devel?) Also adding external kernel.org bugzilla reference. Seen this bug for FC5 kernel 2.6.17-1.2174_FC5: hub 5-0:1.0: 8 ports detected BUG: warning at drivers/ieee1394/ohci1394.c:3181/ohci_hw_csr_reg() (Not tainted) <f8b7c22b> ohci_hw_csr_reg+0x85/0xa3 [ohci1394] <f8b7c1a6> ohci_hw_csr_reg+0x0/0xa3 [ohci1394] <f8bb6a9f> host_reset+0x6e/0x114 [ieee1394] <f8bb5863> highlevel_host_reset+0x21/0x42 [ieee1394] <f8b7d285> ohci_irq_handler+0x6b9/0x73d [ohci1394] <c0418f85> scheduler_tick+0x1a/0x2af <c043db72> handle_IRQ_event+0x23/0x4c <c043dc17> __do_IRQ+0x7c/0xd1 <c0405035> do_IRQ+0x63/0x80 ======================= <c04036f2> common_interrupt+0x1a/0x20 <c0602af9> _spin_unlock_irqrestore+0xa/0xc <f8b79f31> ohci_devctl+0x5d/0x5de [ohci1394] <f8bb301c> hpsb_reset_bus+0x1c/0x22 [ieee1394] <c04296d1> run_workqueue+0x85/0xc5 <c042c424> remove_wait_queue+0xaf/0xb9 <f8bb4e58> delayed_reset_bus+0x0/0xca [ieee1394] <c0429af0> worker_thread+0x0/0x10c <c0429bcb> worker_thread+0xdb/0x10c <c04186d5> default_wake_function+0x0/0xc <c042c0b7> kthread+0x91/0xbd <c042c026> kthread+0x0/0xbd <c0401005> kernel_thread_helper+0x5/0xb ieee1394: Error parsing configrom for node 0-00:1023 ieee1394: Host added: ID:BUS[0-01:1023] GUID[00c0d00001f79969] The attached firewire disk does appear to work. However, it's worrying. Ruth Whether this is something to worry about depends on if you have hard requirements to system responsiveness. See comment #39. The cause of the warning (the busy wait loop in interrupt handler context) cannot cause data corruption AFAICT, except for e.g. gaps in fixed-datarate streams like uncompressed audio streams etc. (DaveJ, correct me if I missed something.) Aside from that, this usage of busy wait loops _may_ be related to the SelfID errors seen in some reports at this bug entry, thus be detrimental to recognition of FireWire devices, but I am not sure about that. I am sorry that nothing has been done about it since this was reported at linux1394-devel. It is a matter of resources and priorities. Cf. http://bugzilla.kernel.org/buglist.cgi?product=Drivers&component=IEEE1394&bug_status=NEW&bug_status=ASSIGNED&bug_status=NEEDINFO A new kernel update has been released (Version: 2.6.18-1.2200.fc5) based upon a new upstream kernel release. Please retest against this new kernel, as a large number of patches go into each upstream release, possibly including changes that may address this problem. This bug has been placed in NEEDINFO state. Due to the large volume of inactive bugs in bugzilla, if this bug is still in this state in two weeks time, it will be closed. Should this bug still be relevant after this period, the reporter can reopen the bug at any time. Any other users on the Cc: list of this bug can request that the bug be reopened by adding a comment to the bug. In the last few updates, some users upgrading from FC4->FC5 have reported that installing a kernel update has left their systems unbootable. If you have been affected by this problem please check you only have one version of device-mapper & lvm2 installed. See bug 207474 for further details. If this bug is a problem preventing you from installing the release this version is filed against, please see bug 169613. If this bug has been fixed, but you are now experiencing a different problem, please file a separate bug for the new problem. Thank you. The upstream bug hasn't been fixed yet. Linux yorky.danobrien.ws 2.6.18-1.2200.fc5 #1 Sat Oct 14 16:59:26 EDT 2006 i686 No goodness: BUG: warning at drivers/ieee1394/ohci1394.c:234/get_phy_reg() (Not tainted) [<c0403f10>] dump_trace+0x69/0x1af [<c040406e>] show_trace_log_lvl+0x18/0x2c [<c04045e9>] show_trace+0xf/0x11 [<c0404673>] dump_stack+0x15/0x17 [<f89a164e>] get_phy_reg+0x8b/0x105 [ohci1394] [<f89a2a7e>] ohci_devctl+0x49/0x5de [ohci1394] [<f89a4f03>] ohci_irq_handler+0x378/0x786 [ohci1394] [<c0440c82>] handle_IRQ_event+0x23/0x49 [<c0440d2a>] __do_IRQ+0x82/0xde [<c040536d>] do_IRQ+0x9a/0xb8 ======================= BUG: warning at drivers/ieee1394/ohci1394.c:263/set_phy_reg() (Not tainted) [<c0403f10>] dump_trace+0x69/0x1af [<c040406e>] show_trace_log_lvl+0x18/0x2c [<c04045e9>] show_trace+0xf/0x11 [<c0404673>] dump_stack+0x15/0x17 [<f89a1564>] set_phy_reg+0x99/0xf8 [ohci1394] [<f89a2a92>] ohci_devctl+0x5d/0x5de [ohci1394] [<f89a4f03>] ohci_irq_handler+0x378/0x786 [ohci1394] [<c0440c82>] handle_IRQ_event+0x23/0x49 [<c0440d2a>] __do_IRQ+0x82/0xde [<c040536d>] do_IRQ+0x9a/0xb8 ======================= ohci1394: fw-host0: SelfID received outside of bus reset sequence ieee1394: Host added: ID:BUS[0-00:1023] GUID[00301b301bae557d] A slightly different form... probably the same bug. This is on an add-on PCI card running on an Athlon XP/ VIA KT600 ACPI: PCI Interrupt 0000:00:0e.0[A] -> GSI 17 (level, low) -> IRQ 209 ohci1394: fw-host0: OHCI-1394 1.0 (PCI): IRQ=[209] MMIO=[f2000000-f20007ff] Max Packet=[1024] IR/IT contexts=[4/4] BUG: warning at drivers/ieee1394/ohci1394.c:3187/ohci_hw_csr_reg() (Not tainted) [<c0403f10>] dump_trace+0x69/0x1af [<c040406e>] show_trace_log_lvl+0x18/0x2c [<c04045e9>] show_trace+0xf/0x11 [<c0404673>] dump_stack+0x15/0x17 [<f89d111d>] ohci_hw_csr_reg+0x85/0xa3 [ohci1394] [<f89f94d7>] host_reset+0x6e/0x114 [ieee1394] [<f89f87e5>] highlevel_host_reset+0x24/0x48 [ieee1394] [<f89d228d>] ohci_irq_handler+0x702/0x786 [ohci1394] [<c0440c82>] handle_IRQ_event+0x23/0x49 [<c0440d2a>] __do_IRQ+0x82/0xde [<c040536d>] do_IRQ+0x9a/0xb8 ======================= ieee1394: Error parsing configrom for node 0-00:1023 ieee1394: Host added: ID:BUS[0-01:1023] GUID[00c0d00001f79969] floppy0: no floppy controllers found Yes, there are a few different code paths which are implemented similarly badly (i.e. using a busy-wait loop in an interrupt handler's context). The one in comment 46 is indeed the one which this bugzilla entry is named after. :-) I don't know yet whether all paths can be fixed in one go or if this will involve several parts to be rewritten. *** Bug 195826 has been marked as a duplicate of this bug. *** Still happen using kernel-2.6.20-1.2933.fc6. @reporter: could you please update version to "fc6"? Seen during booting: BUG: warning at drivers/ieee1394/ohci1394.c:233/get_phy_reg() (Not tainted) [<f89f361d>] get_phy_reg+0x88/0x105 [ohci1394] [<f89f4d4b>] ohci_devctl+0x49/0x5ae [ohci1394] [<c04049b0>] apic_timer_interrupt+0x28/0x30 [<c04ec4ef>] delay_tsc+0xb/0x13 [<f89f707c>] ohci_irq_handler+0x378/0x785 [ohci1394] [<c044f0ed>] handle_IRQ_event+0x1a/0x3f [<c045008e>] handle_fasteoi_irq+0x72/0xa7 [<c045001c>] handle_fasteoi_irq+0x0/0xa7 [<c04062be>] do_IRQ+0xb5/0xdb [<c04048f3>] common_interrupt+0x23/0x28 [<c042b8ff>] __do_softirq+0x54/0xba [<c04061b1>] do_softirq+0x59/0xb1 [<c0419df6>] smp_apic_timer_interrupt+0x76/0x80 [<c04049b0>] apic_timer_interrupt+0x28/0x30 [<c061ffca>] _spin_unlock_irqrestore+0xa/0xc [<f89f4d5f>] ohci_devctl+0x5d/0x5ae [ohci1394] [<f8a76ab5>] csr1212_generate_csr_image+0x310/0x365 [ieee1394] [<f8a6e01c>] hpsb_reset_bus+0x1c/0x22 [ieee1394] [<c0434842>] run_workqueue+0x85/0x125 [<c061ffaf>] _spin_lock_irqsave+0x9/0xd [<f8a6fe4e>] delayed_reset_bus+0x0/0xcd [ieee1394] [<c04351b4>] worker_thread+0xf9/0x124 [<c04226ab>] default_wake_function+0x0/0xc [<c04350bb>] worker_thread+0x0/0x124 [<c04377c7>] kthread+0xb0/0xd9 [<c0437717>] kthread+0x0/0xd9 [<c0404b33>] kernel_thread_helper+0x7/0x10 ======================= BUG: warning at drivers/ieee1394/ohci1394.c:262/set_phy_reg() (Not tainted) [<f89f3536>] set_phy_reg+0x97/0xf6 [ohci1394] [<f89f4d5f>] ohci_devctl+0x5d/0x5ae [ohci1394] [<c04049b0>] apic_timer_interrupt+0x28/0x30 [<c04ec4ef>] delay_tsc+0xb/0x13 [<f89f707c>] ohci_irq_handler+0x378/0x785 [ohci1394] [<c044f0ed>] handle_IRQ_event+0x1a/0x3f [<c045008e>] handle_fasteoi_irq+0x72/0xa7 [<c045001c>] handle_fasteoi_irq+0x0/0xa7 [<c04062be>] do_IRQ+0xb5/0xdb [<c04048f3>] common_interrupt+0x23/0x28 [<c042b8ff>] __do_softirq+0x54/0xba [<c04061b1>] do_softirq+0x59/0xb1 [<c0419df6>] smp_apic_timer_interrupt+0x76/0x80 [<c04049b0>] apic_timer_interrupt+0x28/0x30 [<c061ffca>] _spin_unlock_irqrestore+0xa/0xc [<f89f4d5f>] ohci_devctl+0x5d/0x5ae [ohci1394] [<f8a76ab5>] csr1212_generate_csr_image+0x310/0x365 [ieee1394] [<f8a6e01c>] hpsb_reset_bus+0x1c/0x22 [ieee1394] [<c0434842>] run_workqueue+0x85/0x125 [<c061ffaf>] _spin_lock_irqsave+0x9/0xd [<f8a6fe4e>] delayed_reset_bus+0x0/0xcd [ieee1394] [<c04351b4>] worker_thread+0xf9/0x124 [<c04226ab>] default_wake_function+0x0/0xc [<c04350bb>] worker_thread+0x0/0x124 [<c04377c7>] kthread+0xb0/0xd9 [<c0437717>] kthread+0x0/0xd9 [<c0404b33>] kernel_thread_helper+0x7/0x10 ======================= Peter: Since you asked so nicely... Done. (Can anyone confirm this on FC7?) I do wonder whether we would be better off just marking this against devel until the alleged rewrite of the entire firewire stack happens... Quick note: Linux localhost.localdomain 2.6.20-1.2933.fc6 #1 SMP Mon Mar 19 11:38:26 EDT 2007 i686 It has been awhile since I tested f/w. I have an external disk enclosure that does both USB2.0 and F/W and the F/W has never worked, but today when I tested it, it does. I still get the "badness" stack on boot as the comment above mine suggests. I run rsnapshot backups on this external disk. I'll run it awhile using the F/W connection instead of the USB2.0 and see how it goes. Dmesg when I plugged it in: ieee1394: Node added: ID:BUS[0-00:1023] GUID[00309995e055c082] ieee1394: Node changed: 0-00:1023 -> 0-01:1023 scsi16 : SBP-2 IEEE-1394 ieee1394: sbp2: Logged into SBP-2 device ieee1394: sbp2: Node 0-00:1023: Max speed [S400] - Max payload [2048] scsi 16:0:0:0: Direct-Access-RBC SAMSUNG SP0802N PQ: 0 ANSI: 4 SCSI device sdb: 156368016 512-byte hdwr sectors (80060 MB) sdb: Write Protect is off sdb: Mode Sense: 11 00 00 00 SCSI device sdb: write cache: enabled, read cache: enabled, doesn't support DPO or FUA SCSI device sdb: 156368016 512-byte hdwr sectors (80060 MB) sdb: Write Protect is off sdb: Mode Sense: 11 00 00 00 SCSI device sdb: write cache: enabled, read cache: enabled, doesn't support DPO or FUA sdb: sdb1 sd 16:0:0:0: Attached scsi disk sdb sd 16:0:0:0: Attached scsi generic sg4 type 14 kjournald starting. Commit interval 5 seconds EXT3 FS on sdb1, internal journal EXT3-fs: mounted filesystem with ordered data mode. SELinux: initialized (dev sdb1, type ext3), uses xattr Update to my comment #39 and comment #44: There is nobody working on the upstream bug at the moment. (See link at the bottom of this page.) In fact, there is nobody working on the IEEE 1394 drivers anymore in general, except me doing occasional work mostly outside of ohci1394, and Kristian Høgsberg doing a great job with his new replacement drivers. It is currently more likely that ohci1394 will be replaced by Kristian's fw-ohci, rather than that this bug in ohci1394 will be fixed. Either way, the bug will apparently stay with us for a while. Stefan: Thanks for the heads up. I thought I'd heard of a replacement ieee1394 project... Marking this against devel because after two years of updating this I think it's fair to say it could be a while before it's finally fixed... These replacement drivers seem to be included in FC test kernels (don't ask me where, I'm not a FC user) and they are in Andrew Morton's -mm patchset via kernel.org's linux1394-2.6.git development tree. It's almost feature-complete now. IP over 1394 support (called eth1394 in the old stack) is the only bigger missing feature. (In reply to comment #51) > Quick note: > > Linux localhost.localdomain 2.6.20-1.2933.fc6 #1 SMP Mon Mar 19 11:38:26 EDT > 2007 i686 Well, after a few hours, the F/W usage of this otherwise working external disk failed miserably: eth0: link down. ieee1394: Node changed: 0-01:1023 -> 0-00:1023 ieee1394: Node suspended: ID:BUS[0-00:1023] GUID[00309995e055c082] eth0: link up. Synchronizing SCSI cache for disk sdb: FAILED status = 0, message = 00, host = 1, driver = 00 <7>ieee1394: Node changed: 0-00:1023 -> 0-01:1023 scsi 16:0:0:0: rejecting I/O to dead device Buffer I/O error on device sdb1, logical block 1545 lost page write due to I/O error on sdb1 scsi 16:0:0:0: rejecting I/O to dead device Buffer I/O error on device sdb1, logical block 0 lost page write due to I/O error on sdb1 Well, this doesn't have anything to do with the ohci1394 "badness", but anyway: In what way has it failed? I can only see from these messages that the disk was unplugged. (In reply to comment #56) > Well, this doesn't have anything to do with the ohci1394 "badness", but anyway: > > In what way has it failed? I can only see from these messages that the disk was > unplugged. The disk is plugged in all the time for every four hour rsnapshots. There was some kind of failure at noon causing the disk to appear "unplugged" and the 4PM snapshot failed since the disk was unmounted. I wasn't even home at the time so there was no manual intervention. Ok, ignore my previous entry then. Thanks. Sorry, checking the log timestamps and my power connections, there must have been a power glitch and the external disk enclosure took a power dip (not on UPS). My bad... We don't ship these drivers any more. Chuck: Hurrah! All hail Chuck, the new DaveJ (I'm so fickle) for including the drivers! Stefan: Thanks for posting to this bug report. You might want to mention the new drivers in your upstream bug report (http://bugzilla.kernel.org/show_bug.cgi?id=6070 ). Yes, this is one of those bugs in the old drivers which will obviously end up as WILL_NOT_FIX. Or actually replacement by the new drivers is the fix. (Thanks to the Fedora developers and users by the way.) |