kernel-2.6.24.4-64.fc8.x86_64
kernel-2.6.24.5-85.fc8.x86_64

Neither of the above kernels boots on my box. The system freezes after activating /etc/fstab swaps and before entering non-interactive startup. INIT never enters runlevel 2, and I haven't even been able to get a console at runlevel 1. This happens on every boot. It never happened before, and I am currently using kernel-2.6.24.3-50.fc8.x86_64 without any issues.

There is at least one more person having this problem, as per the bodhi feedback:
https://admin.fedoraproject.org/updates/F8/FEDORA-2008-2871
https://admin.fedoraproject.org/updates/F8/FEDORA-2008-3260

This may be the same issue:
https://bugzilla.redhat.com/show_bug.cgi?id=441161
The only wireless hardware I have on my machine is a Bluetooth dongle. Any ideas?
Can you try adding "nmi_watchdog=1" or "nmi_watchdog=2" to the kernel boot options and see if you get a stack trace when it locks up? Let it sit for a few minutes when it freezes to give the watchdog time to activate.
No such luck. I tried both the kernels mentioned above as well as the F9-Preview 2008-04-17 live image. None of them do anything even with nmi_watchdog set to 1 or 2. At one point, I even left the desktop for about an hour with no stack trace, just a hard hang. No response from the keyboard or mouse and only a cold reboot back to kernel-2.6.24.3-50 works. What else can I try? Thanks for your help.
The problem persists with kernel-2.6.24.7-92.fc8.x86_64 and kernel-2.6.25-3.fc9.i686 from the live image released with Fedora 9. https://admin.fedoraproject.org/updates/F8/FEDORA-2008-3873
Did you try the workarounds? http://fedoraproject.org/wiki/KernelCommonProblems
No luck with the workarounds so far. I have been focusing on section 1.4 Crashes/Hangs and no set of options has yielded a different result except for acpi=off which causes a kernel panic (not syncing: videodev: bad unregister) even earlier than the usual hang.
An interesting thing I've noticed while rebooting to test all these times is that when the system hangs, the keyboard completely stops responding BUT the kernel seems to be alive to some extent, in that USB device attach and detach messages still appear on the console. This is without any special kernel options being used. I noticed this when trying to use a live USB image of Fedora 9: it hung on boot, and I pulled out the USB device before hard resetting the machine.
Possibly related issue - https://bugzilla.redhat.com/show_bug.cgi?id=446763 http://bugzilla.kernel.org/show_bug.cgi?id=10796
Same issue with kernel-2.6.25.4-10.fc8 which just hit stable. https://admin.fedoraproject.org/updates/F8/FEDORA-2008-4484
Hmm... can you try adding: pci=rom to the kernel boot options? The default was changed between -50 and -64...
pci=rom does not make any difference. What is the new default?
Based on helpful information in https://bugzilla.redhat.com/show_bug.cgi?id=446763 I was able to narrow down the problem to my ALi Corporation M5253 P1394 OHCI 1.1 Controller. This is a generic PCI card that provides both USB2 and FW1 ports in which the Firewire component has NEVER worked in Fedora and the USB worked fine. After disabling the Firewire OHCI driver by renaming /lib/modules/2.6.25.4-10.fc8/kernel/drivers/firewire/firewire-ohci.ko I was able to get past the "Enabling /etc/fstab swaps: OK" message and boot into runlevel 5. So it appears that something was changed in firewire-ohci between kernel-2.6.24.3-50 and kernel-2.6.24.4-64 that causes my nonfunctional (in Fedora) Firewire card to prevent the entire system from booting. I hope that at the end of this process I will have both a booting machine without renaming stock driver files AND my Firewire ports will work. One can dream, no? Thanks for all your help, Chuck!
I have since upgraded to Fedora 9, which needed a nofirewire kernel option to get anaconda going. With the latest released kernel-2.6.25.6-55.fc9, my keyboard stops responding during boot up. This happens when the system is "Starting udev" and before it says "[OK]" for that item. I cannot type anything afterwards, but can Ctrl-Alt-Delete on one of the ttys to trigger a reboot. Instead of renaming the firewire-ohci.ko file, I added 'blacklist firewire-ohci' to modprobe.conf and can boot successfully into the newest kernel. The 'ALi Corporation M5253 P1394 OHCI 1.1 Controller' of course, does not work.
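A quick way to double-check that the blacklist workaround described above is actually in place is to scan the modprobe configuration for it. The sketch below is only an illustration: the function names are made up for this example, and it understands nothing beyond plain "blacklist <module>" lines and "#" comments. Dashes and underscores are folded together because modprobe treats them as interchangeable in module names.

```python
def blacklisted_modules(conf_text):
    """Return the set of module names blacklisted in modprobe config text.

    Hypothetical helper: only handles simple 'blacklist <module>' lines.
    """
    found = set()
    for raw_line in conf_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        fields = line.split()
        if len(fields) == 2 and fields[0] == "blacklist":
            # Normalize firewire-ohci and firewire_ohci to the same name.
            found.add(fields[1].replace("-", "_"))
    return found


def is_blacklisted(conf_text, module):
    """True if the module (dash or underscore spelling) is blacklisted."""
    return module.replace("-", "_") in blacklisted_modules(conf_text)


if __name__ == "__main__":
    sample = "# workaround for the ALi M5253 hang\nblacklist firewire-ohci\n"
    print(is_blacklisted(sample, "firewire_ohci"))
```

To check a real system, read the contents of modprobe.conf into a string and pass it as conf_text.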
Maybe we should just add a temporary workaround, printing an error and skipping initialization when we hit one of these devices?
proposed fix for part of the problem: patch "firewire: deadline for PHY config transmission" http://marc.info/?l=linux1394-devel&m=121372642105480 see also https://bugzilla.redhat.com/show_bug.cgi?id=446763#c38
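The patch title describes the idea: instead of waiting without limit for the PHY config packet to finish transmitting, the sender gives up after a deadline. The patch itself is kernel C and is not reproduced here. What follows is only a user-space analogy in Python; the names send_with_deadline, working_controller and dead_controller are hypothetical, and the timeout values are arbitrary.

```python
import threading


def send_with_deadline(start_transmit, timeout_s):
    """Start a transmission and wait at most timeout_s for its completion.

    start_transmit is called with a completion callback. Returns True if
    the callback fired before the deadline, False if the wait timed out.
    """
    done = threading.Event()
    start_transmit(done.set)
    return done.wait(timeout_s)


def working_controller(on_complete):
    # Simulated hardware that reports completion shortly afterwards.
    threading.Timer(0.01, on_complete).start()


def dead_controller(on_complete):
    # Simulated hardware that never reports completion.
    pass


if __name__ == "__main__":
    print(send_with_deadline(working_controller, 1.0))
    print(send_with_deadline(dead_controller, 0.05))
```

With the dead controller the caller gets control back after the timeout instead of blocking forever, which is the behaviour a deadline is meant to guarantee.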
I will attempt to put a test kernel together this evening and post it somewhere for folks to try out. Also planning to acquire an ALi card to beat on myself...
x86_64 test kernel w/patch in comment #14 here: http://people.redhat.com/jwilson/kernels/2.6.25.7-64.fw.fc9/
I got a Belkin F5U508 PCI card with ALi M5271 now. lspci says:

05:00.4 FireWire (IEEE 1394) [0c00]: ALi Corporation M5253 P1394 OHCI 1.1 Controller [10b9:5253] (prog-if 10 [OHCI])
    Subsystem: Belkin Unknown device [1799:0519]

firewire-ohci starts fine with it. I have the deadline patch applied. When I plug something in, no bus reset IRQ happens. I.e. the PHY is completely dead.

However, ohci1394 does work; I can access an SBP-2 disk through it. When I then "modprobe firewire-ohci debug=7" with the disk already plugged in, I get:

firewire_ohci: Added fw-ohci device 0000:02:02.0, OHCI version 1.10
firewire_ohci: IRQ 00010010 selfID AR_req
firewire_ohci: 1 selfIDs, generation 1, local node ID ffc0
firewire_ohci: selfID 0: 807fcc56, phy 0 [---] beta gc=63 -3W Lci
firewire_ohci: AR evt_bus_reset, generation 1
(this was from a FireWire 800 card)

firewire_ohci: Added fw-ohci device 0000:05:00.4, OHCI version 1.10
firewire_ohci: IRQ 00000010 AR_req
firewire_ohci: AR evt_bus_reset, generation 1
firewire_ohci: IRQ 00010000 selfID
firewire_ohci: 2 selfIDs, generation 1, local node ID ffc0
firewire_ohci: selfID 0: 807f8c66, phy 0 [-p-] S400 gc=63 -3W Lci
firewire_ohci: selfID 0: 817f8470, phy 1 [-c.] S400 gc=63 -3W L
(this is the ALi card, i.e. selfID reception now works, probably thanks to ohci1394's twiddling with it)

firewire_ohci: Added fw-ohci device 0000:05:04.0, OHCI version 1.10
firewire_ohci: IRQ 00000010 AR_req
firewire_ohci: AR evt_bus_reset, generation 1
firewire_ohci: IRQ 00010000 selfID
firewire_ohci: 1 selfIDs, generation 1, local node ID ffc0
firewire_ohci: selfID 0: 807f8952, phy 0 [--.] S400 gc=63 +15W Lci
(this is an onboard VT6307)

firewire_core: created device fw0: GUID 080028560000319b, S800
firewire_core: created device fw1: GUID 0030bd051800064f, S400
firewire_ohci: AT spd 0 tl 17, ffc0 -> ffc1, pending/cancelled, QR req, fffff0000400
firewire_core: created device fw2: GUID 0010dc5600fed2d4, S400
firewire_ohci: AT spd 0 tl 18, ffc0 -> ffc1, pending/cancelled, QR req, fffff0000400
firewire_ohci: AT spd 0 tl 19, ffc0 -> ffc1, pending/cancelled, QR req, fffff0000400
firewire_ohci: AT spd 0 tl 1a, ffc0 -> ffc1, pending/cancelled, QR req, fffff0000400
firewire_ohci: AT spd 0 tl 1b, ffc0 -> ffc1, pending/cancelled, QR req, fffff0000400
firewire_ohci: AT spd 0 tl 1c, ffc0 -> ffc1, pending/cancelled, QR req, fffff0000400
firewire_ohci: AT spd 0 tl 1d, ffc0 -> ffc1, pending/cancelled, QR req, fffff0000400
firewire_ohci: AT spd 0 tl 1e, ffc0 -> ffc1, pending/cancelled, QR req, fffff0000400
firewire_ohci: AT spd 0 tl 1f, ffc0 -> ffc1, pending/cancelled, QR req, fffff0000400
firewire_ohci: AT spd 0 tl 00, ffc0 -> ffc1, pending/cancelled, QR req, fffff0000400
firewire_ohci: AT spd 0 tl 01, ffc0 -> ffc1, pending/cancelled, QR req, fffff0000400
firewire_core: giving up on config rom for node id ffc1
(these mean split transaction timeouts when fw-device.c attempts to read the SBP-2 disk's config ROM)

firewire_core: phy config: card 1, new root=ffc0, gap_count=5
(now fw-card.c tries to select an IRM capable root node and to perform gap count optimization)

------------[ cut here ]------------
WARNING: at drivers/firewire/fw-transaction.c:352 fw_card_bm_work+0x176/0x380 [firewire_core]()
Modules linked in: firewire_ohci firewire_core crc_itu_t i915 drm cpufreq_ondemand acpi_cpufreq freq_table snd_pcm_oss snd_mixer_oss snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device nfsd lockd sunrpc exportfs coretemp w83627ehf hwmon_vid sg sd_mod usbhid hid ehci_hcd ata_piix processor libata snd_hda_intel snd_pcm yenta_socket rsrc_nonstatic pcmcia_core thermal_sys hwmon uhci_hcd snd_timer e1000 usbcore snd snd_page_alloc dock rtc [last unloaded: ieee1394]
Pid: 9, comm: events/0 Tainted: G W 2.6.26-rc6 #26
 [<c0121e7f>] warn_on_slowpath+0x5f/0x90
 [<c02f62e5>] _spin_lock_irqsave+0x45/0x60
 [<c012b197>] lock_timer_base+0x27/0x60
 [<c012b197>] lock_timer_base+0x27/0x60
 [<c02f662a>] _spin_unlock_irqrestore+0x2a/0x50
 [<c012b218>] try_to_del_timer_sync+0x48/0x50
 [<c012b22e>] del_timer_sync+0xe/0x20
 [<c02f42d2>] schedule_timeout+0x52/0xd0
 [<c02f6255>] _spin_lock_irq+0x35/0x40
 [<c02f6676>] _spin_unlock_irq+0x26/0x40
 [<c02f38b0>] wait_for_common+0x120/0x180
 [<c011af50>] default_wake_function+0x0/0x10
 [<f92a1f4b>] fw_send_phy_config+0xab/0xf0 [firewire_core]
 [<f92a07d6>] fw_card_bm_work+0x176/0x380 [firewire_core]
 [<f92a18e0>] transmit_complete_callback+0x0/0xa0 [firewire_core]
 [<f92a2fb0>] fw_device_init+0x0/0x2a0 [firewire_core]
 [<c0119fde>] __wake_up+0x3e/0x60
 [<f92a2fb0>] fw_device_init+0x0/0x2a0 [firewire_core]
 [<f92a2f8b>] fw_device_release+0x5b/0x80 [firewire_core]
 [<f92a31a9>] fw_device_init+0x1f9/0x2a0 [firewire_core]
 [<c02f6255>] _spin_lock_irq+0x35/0x40
 [<f92a0660>] fw_card_bm_work+0x0/0x380 [firewire_core]
 [<c0131d79>] run_workqueue+0x159/0x1f0
 [<c0131d21>] run_workqueue+0x101/0x1f0
 [<c0135570>] autoremove_wake_function+0x0/0x50
 [<c0132818>] worker_thread+0x98/0xf0
 [<c0135570>] autoremove_wake_function+0x0/0x50
 [<c0132780>] worker_thread+0x0/0xf0
 [<c0135262>] kthread+0x42/0x70
 [<c0135220>] kthread+0x0/0x70
 [<c0103d6f>] kernel_thread_helper+0x7/0x18
 =======================
---[ end trace 03dad1d6d51fa423 ]---
(the PHY config packet transmission from fw-card.c timed out)
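For anyone trying to read the selfID lines in the log above, here is a minimal Python sketch that decodes self-ID packet zero using the field layout from IEEE 1394a (phy ID, link-active flag, gap count, speed, contender flag, power class, three port states, initiated-reset flag). The output format imitates the log lines in this report. The function names are made up for this example, and the label strings for values that do not appear in the log (S100, S200, and the power classes other than +0W, +15W and -3W) are assumptions.

```python
# Label tables indexed by field value; entries not seen in the log are assumed.
SPEEDS = ["S100", "S200", "S400", "beta"]
POWER = ["+0W", "+15W", "+30W", "+45W", "-3W", " ?W", "-3..-6W", "-3..-10W"]
PORTS = [".", "-", "p", "c"]  # not present, not connected, parent, child


def decode_selfid(quadlet):
    """Decode the first quadlet of a self-ID packet (packet zero)."""
    if quadlet >> 30 != 0b10:
        raise ValueError("not a self-ID packet")
    if (quadlet >> 23) & 1:
        raise ValueError("extended self-ID packet, not packet zero")
    flags = ""
    if (quadlet >> 22) & 1:
        flags += "L"  # link active
    if (quadlet >> 11) & 1:
        flags += "c"  # contender
    if (quadlet >> 1) & 1:
        flags += "i"  # this node initiated the bus reset
    return {
        "phy": (quadlet >> 24) & 0x3F,
        "ports": "".join(PORTS[(quadlet >> s) & 3] for s in (6, 4, 2)),
        "speed": SPEEDS[(quadlet >> 14) & 3],
        "gap_count": (quadlet >> 16) & 0x3F,
        "power": POWER[(quadlet >> 8) & 7],
        "flags": flags,
    }


def format_selfid(quadlet):
    """Render a decoded self-ID in the style of the debug log."""
    d = decode_selfid(quadlet)
    return "phy %d [%s] %s gc=%d %s %s" % (
        d["phy"], d["ports"], d["speed"], d["gap_count"], d["power"], d["flags"])


if __name__ == "__main__":
    for q in (0x807FCC56, 0x807F8C66, 0x817F8470, 0x807F8952):
        print("selfID 0: %08x, %s" % (q, format_selfid(q)))
```

Read this way, the second self-ID on the ALi card (817f8470) is a node with phy ID 1 that sees the card as its child, which is consistent with a single attached device.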
I posted an update of the patch which improves it for working controllers but is effectively unchanged for non-working controllers. http://marc.info/?l=linux1394-devel&m=121380606431945
From lspci -vvv output -

04:01.4 FireWire (IEEE 1394): ALi Corporation M5253 P1394 OHCI 1.1 Controller (prog-if 10 [OHCI])
    Subsystem: ALi Corporation M5253 P1394 OHCI 1.1 Controller
    Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
    Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 64 (750ns max), Cache Line Size: 64 bytes
    Interrupt: pin C routed to IRQ 9
    Region 0: Memory at dccf8800 (32-bit, non-prefetchable) [size=2K]
    Expansion ROM at dcc00000 [disabled] [size=64K]
    Capabilities: [80] Power Management version 2
        Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1+,D2+,D3hot+,D3cold-)
        Status: D0 PME-Enable- DSel=0 DScale=0 PME-
    Kernel modules: firewire-ohci
(In reply to comment #16)
> x86_64 test kernel w/patch in comment #14 here:
>
> http://people.redhat.com/jwilson/kernels/2.6.25.7-64.fw.fc9/

I cold booted with this kernel and 'options firewire-ohci debug=7' in modprobe.conf and nothing plugged into the firewire bus of the card. Here's what I see -

firewire_ohci: Added fw-ohci device 0000:04:01.4, OHCI version 1.10
firewire_ohci: IRQ 00000010 AR_req
firewire_ohci: IRQ 00010000 selfID
firewire_ohci: 1 selfIDs, generation 1, local node ID ffc0
firewire_core: created device fw0: GUID 0090e639000000f4, S400
firewire_ohci: IRQ 00200000 cycle64Seconds
firewire_ohci: IRQ 00200000 cycle64Seconds
firewire_ohci: IRQ 00200000 cycle64Seconds
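The hex value on each IRQ line in the debug output above is an interrupt event bitmask, and the names after it are the bits that were set. A small Python sketch of that decoding follows. The bit positions are taken from the OHCI 1394 IntEvent register layout as I understand it, the names follow the spelling used in this log, and only a handful of bits are covered. The function name decode_int_event is made up for this example.

```python
# (name as printed in the log, bit position) pairs, in the order the log
# prints them. Only a few IntEvent bits are listed; the rest are reported
# as a leftover hex value.
INT_EVENT_BITS = [
    ("selfID", 16),          # self-ID reception complete
    ("AR_req", 4),           # asynchronous request packet received
    ("busReset", 17),        # name assumed, does not appear in this log
    ("cycle64Seconds", 21),  # cycle timer seconds counter rolled over
]


def decode_int_event(value):
    """Return the list of known event names set in an IRQ event bitmask."""
    names = []
    remaining = value
    for name, bit in INT_EVENT_BITS:
        if value & (1 << bit):
            names.append(name)
            remaining &= ~(1 << bit)
    if remaining:
        names.append("unknown(0x%08x)" % remaining)
    return names


if __name__ == "__main__":
    for value in (0x00000010, 0x00010000, 0x00200000):
        print("IRQ %08x %s" % (value, " ".join(decode_int_event(value))))
```

The repeating 00200000 events are just the periodic cycle64Seconds tick, so on their own they only show that the controller's interrupt still works while the bus is idle.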
Another cold boot, now with an external Firewire hub already plugged into the card -

ACPI: PCI Interrupt 0000:04:01.4[C] -> GSI 19 (level, low) -> IRQ 19
firewire_ohci: Added fw-ohci device 0000:04:01.4, OHCI version 1.10
firewire_ohci: IRQ 00000010 AR_req
AR evt_bus_reset, generation 1
firewire_ohci: IRQ 00010000 selfID
firewire_ohci: 2 selfIDs, generation 1, local node ID ffc1
selfID 0: 803f8c64, phy 0 [-p-] S400 gc=63 -3W c
selfID 0: 817f8872, phy 1 [-c.] S400 gc=63 +0W Lci
firewire_core: created device fw0: GUID 0090e639000000f4, S400
firewire_core: phy config: card 0, new root=ffc1, gap_count=5
------------[ cut here ]------------
WARNING: at drivers/firewire/fw-transaction.c:350 fw_send_phy_config+0xdf/0xeb [firewire_core]() (Not tainted)
Modules linked in: v4l2_common tveeprom dcdbas parport_pc parport i2c_i801 i2c_core pcspkr iTCO_wdt iTCO_vendor_support sg snd_intel8x0 snd_ac97_codec ac97_bus firewire_ohci shpchp firewire_core snd_seq_dummy crc_itu_t pata_pdc2027x snd_seq_oss snd_seq_midi_event snd_seq snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc snd_usb_lib snd_rawmidi snd_seq_device snd_hwdep tg3 hci_usb button pwc joydev snd compat_ioctl32 videodev v4l1_compat bluetooth soundcore ahci dm_snapshot dm_zero dm_mirror dm_mod ata_piix pata_acpi ata_generic libata sd_mod scsi_mod ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded: scsi_wait_scan]
Pid: 9, comm: events/0 Not tainted 2.6.25.7-64.fw.fc9.x86_64 #1

Call Trace:
 [<ffffffff81033571>] warn_on_slowpath+0x60/0x91
 [<ffffffff8103cc79>] ? process_timeout+0x0/0xb
 [<ffffffff8128d5f9>] ? schedule_timeout+0x88/0xb4
 [<ffffffff8128d48c>] ? wait_for_common+0x10b/0x137
 [<ffffffff8102b55f>] ? default_wake_function+0x0/0xf
 [<ffffffff881fdc04>] :firewire_core:fw_send_phy_config+0xdf/0xeb
 [<ffffffff881fc52a>] :firewire_core:fw_card_bm_work+0x356/0x3c9
 [<ffffffff81043718>] ? insert_work+0x5b/0x5f
 [<ffffffff81043aff>] ? __queue_work+0x36/0x3f
 [<ffffffff81043b8b>] ? queue_work+0x47/0x50
 [<ffffffff81043ffd>] ? queue_delayed_work+0x33/0x4f
 [<ffffffff81044045>] ? schedule_delayed_work+0x2c/0x33
 [<ffffffff881ff2c2>] ? :firewire_core:fw_device_init+0x255/0x273
 [<ffffffff881fc1d4>] ? :firewire_core:fw_card_bm_work+0x0/0x3c9
 [<ffffffff8104351d>] run_workqueue+0x84/0x10c
 [<ffffffff81043682>] worker_thread+0xdd/0xee
 [<ffffffff81046b83>] ? autoremove_wake_function+0x0/0x38
 [<ffffffff810435a5>] ? worker_thread+0x0/0xee
 [<ffffffff81046863>] kthread+0x49/0x76
 [<ffffffff8100ccf8>] child_rip+0xa/0x12
 [<ffffffff8104681a>] ? kthread+0x0/0x76
 [<ffffffff8100ccee>] ? child_rip+0x0/0x12

---[ end trace 0f103775ac4c7d8a ]---
firewire_ohci: IRQ 00200000 cycle64Seconds
firewire_ohci: IRQ 00200000 cycle64Seconds
firewire_ohci: IRQ 00200000 cycle64Seconds
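To make the "phy config: card 0, new root=ffc1, gap_count=5" line above more concrete, the following Python sketch splits a 16-bit node ID into its bus and node parts and builds the two quadlets of a PHY configuration packet as laid out in IEEE 1394a (root ID, force-root flag R, gap-count flag T, gap count, followed by the bitwise inverse as a check quadlet). It assumes both R and T are set, which is what the log line suggests but is not confirmed here. The function names are made up for this example.

```python
def split_node_id(node_id):
    """Split a 16-bit node ID into (bus number, node number)."""
    return node_id >> 6, node_id & 0x3F


def phy_config_quadlets(root_node_id, gap_count):
    """Build the two quadlets of a PHY config packet.

    Assumes both the force-root (R) and gap-count (T) flags are set.
    """
    root_phy = root_node_id & 0x3F
    first = (
        (root_phy << 24)              # root_ID field
        | (1 << 23)                   # R: make root_ID the root
        | (1 << 22)                   # T: apply the gap count
        | ((gap_count & 0x3F) << 16)  # gap_cnt field
    )
    # The second quadlet is the bitwise inverse of the first.
    return first, first ^ 0xFFFFFFFF


if __name__ == "__main__":
    bus, node = split_node_id(0xFFC1)
    print("bus %d, node %d" % (bus, node))
    first, second = phy_config_quadlets(0xFFC1, 5)
    print("%08x %08x" % (first, second))
```

Bus number 1023 (all ones) is the local bus, so ffc0 and ffc1 are simply nodes 0 and 1 on the local bus.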
(In reply to comment #14)
> proposed fix for part of the problem:
> patch "firewire: deadline for PHY config transmission"
> http://marc.info/?l=linux1394-devel&m=121372642105480
>
> see also https://bugzilla.redhat.com/show_bug.cgi?id=446763#c38

Your patch does fix the locked up keyboard problem nicely. With kernel-2.6.25.7-64.fw.fc9 I no longer need to blacklist firewire-ohci. Thanks!
upstream bug status:
10796 - closed, fixed in v2.6.26-rc7
10935 - open, fw-ohci's bus reset tasklet is unable to finish the first bus reset event
kernel-2.6.25.9-76.fc9 has been submitted as an update for Fedora 9
kernel-2.6.25.9-76.fc9 has been pushed to the Fedora 9 testing repository. If problems still persist, please make note of it in this bug report. If you want to test the update, you can install it with su -c 'yum --enablerepo=updates-testing update kernel'. You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F9/FEDORA-2008-5893
kernel-2.6.25.9-76.fc9 has been pushed to the Fedora 9 stable repository. If problems still persist, please make note of it in this bug report.