Bug 446763 - PS/2 Keyboard dead on boot
Status: CLOSED CURRENTRELEASE
Product: Fedora
Classification: Fedora
Component: kernel
Version: 8
Hardware: x86_64 Linux
Priority: low
Severity: urgent
Assigned To: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
Reported: 2008-05-15 18:12 EDT by J
Modified: 2008-07-05 11:39 EDT
Fixed In Version: 2.6.25.9-40.fc8
Doc Type: Bug Fix
Last Closed: 2008-07-02 23:15:00 EDT


Attachments
dmesg for working kernel (30.81 KB, text/plain), 2008-05-24 06:50 EDT, J
hwconf file for working kernel (9.97 KB, application/octet-stream), 2008-05-24 06:50 EDT, J
kudzu -p for working kernel (9.97 KB, application/octet-stream), 2008-05-24 06:52 EDT, J
lsmod for working kernel (3.89 KB, application/octet-stream), 2008-05-24 06:52 EDT, J
lspci for working kernel (1.91 KB, application/octet-stream), 2008-05-24 06:52 EDT, J
dmesg for NOT working kernel (29.10 KB, text/plain), 2008-05-24 06:53 EDT, J
hwconf file for NOT working kernel (9.97 KB, application/octet-stream), 2008-05-24 06:54 EDT, J
lsmod for NOT working kernel (3.77 KB, text/plain), 2008-05-24 06:54 EDT, J
lspci for NOT working kernel (1.91 KB, application/octet-stream), 2008-05-24 06:55 EDT, J
diff between 2.6.24.7 stock kernel and 2.6.24.7-92.fc8 config files (164.10 KB, application/octet-stream), 2008-05-28 16:33 EDT, J
lspci -vvv on firewire patched kernel 2.6.25.4-10 (17.09 KB, text/plain), 2008-06-19 15:53 EDT, J


External Trackers: Linux Kernel 10796
Description J 2008-05-15 18:12:16 EDT
Description of problem:
PS/2 keyboard becomes dead on boot with kernel > 2.6.24.3-50.fc8.

When booting with a new kernel, the keyboard works in the BIOS, in grub, and
through the boot process, but then stops working just before the switch to the
console.  This is also true with init 3 as the setting in inittab.  If you type
continuously (not using i) through the boot, you can still use the mouse
afterwards but not the keyboard.

Ctrl + Alt + Del DOES STILL WORK though?

Version-Release number of selected component (if applicable):

This behaviour is not seen in 2.6.24.3-50 but has been present in every kernel
update since then, including 2.6.24.7-92.fc8.

How reproducible:
Every time.

Steps to Reproduce:
1. Install new kernel
2. Reboot
3. Select new kernel in grub
4. Watch kernel boot
5. Stop being able to type
6. Press Ctrl + Alt + Del to reboot

Actual results:
No working keyboard

Expected results:
A working keyboard and mouse

Additional info:

Although the LEDs go out, if you hit Caps Lock during the boot process before
the system loses the keyboard, the Caps Lock light remains lit and cannot be
turned off.

The problem is also made harder to chase by an intermittent failure at the swap
section of the boot process.

Have tried booting with kernel parameters of rhgb and quiet removed along with
acpi=off with no luck.

Using an ASUS NCCH-DL motherboard, dual Xeon 3 GHz processors, PS/2 mouse and
keyboard.

Similar bug reported here:
https://bugs.launchpad.net/ubuntu/+source/linux-source-2.6.20/+bug/106289
Comment 1 J 2008-05-16 02:38:13 EDT
Booting into interactive mode by pressing I revealed that no particular service
starts before the loss of keyboard functionality: I made it as far as loading
network on one run and restorecond on a second.

It would appear to be after a certain period has elapsed during the boot process.
Comment 2 J 2008-05-24 04:57:10 EDT
The keyboard stops working after a finite period of time!

Today, whilst trying to get this bug sorted, I booted and fsck started on one of
my drives.  I continued to press the caps key until it stopped responding. Clearly
at this point no other services had been able to start.

Could this be a timing issue with the clock becoming unstable? Shot in the dark!

Work so far:
See above +
disabled legacy usb support in bios - did not help
added to kernel boot - did not help

I was able to get onto another machine and ssh into the "keyboard less" machine
today and get a dmesg dump, kudzu dump and lsmod; these are attached along with
the same from a working kernel.
Comment 3 J 2008-05-24 06:50:34 EDT
Created attachment 306566
dmesg for working kernel
Comment 4 J 2008-05-24 06:50:58 EDT
Created attachment 306567
hwconf file for working kernel
Comment 5 J 2008-05-24 06:52:02 EDT
Created attachment 306568
kudzu -p for working kernel. Note kudzu -p DOES NOT work for kernel
2.6.24.7-92.fc8
Comment 6 J 2008-05-24 06:52:24 EDT
Created attachment 306569
lsmod for working kernel
Comment 7 J 2008-05-24 06:52:44 EDT
Created attachment 306570
lspci for working kernel
Comment 8 J 2008-05-24 06:53:31 EDT
Created attachment 306571
dmesg for NOT working kernel-2.6.24.7-92.fc8
Comment 9 J 2008-05-24 06:54:33 EDT
Created attachment 306572
hwconf file for NOT working kernel. kudzu -p just hangs when booted into "not
working kernel". Likewise kudzu -s
Comment 10 J 2008-05-24 06:54:59 EDT
Created attachment 306573
lsmod for NOT working kernel
Comment 11 J 2008-05-24 06:55:29 EDT
Created attachment 306574
lspci for NOT working kernel
Comment 12 J 2008-05-24 09:44:16 EDT
Built kernel 2.6.25 from kernel.org from scratch using the default .config,
barring a change to x86_64 to support older processors.

Installed and booted! No keyboard issues???
Comment 13 J 2008-05-25 19:00:05 EDT
Building kernel 2.6.24.4 directly from the kernel.org website using the make
instructions from the readme and the config info inherited from 2.6.24.3-50 -
not altering anything in the menuconfig stage - gave me a kernel which I could
boot with no keyboard issues and make nvidia drivers for - a usable kernel.

I will try kernel 2.6.24.7 from the website and if that works it must be down
to a Fedora-related patch/mod.
Comment 14 J 2008-05-26 06:58:25 EDT
I have now built the 2.6.24.7 kernel from kernel.org and booted successfully.
This would therefore point the problem at one of the fedora patches added to the
2.6.24.7-90 kernel.

I will now try removing the patches from the fedora 2.6.24.7-90 src rpm until I
can find the one(s) which causes the problem. If I can?

Does anyone have a suggestion as to which patches are most likely to have caused
the problem?
Comment 15 Naveed Hasan 2008-05-27 13:43:30 EDT
I am having a similar issue, however I cannot even Ctrl-Alt-Del as the PS/2
keyboard is completely unresponsive after "Enabling /etc/fstab swaps: OK" and
can only hard reset the box. See https://bugzilla.redhat.com/show_bug.cgi?id=444694
Comment 16 Naveed Hasan 2008-05-27 14:29:22 EDT
This problem started occurring for me with kernel-2.6.24.4-64.fc8 and I suspect
you would also see it with that release. If that is the case, you can start with
source patches / configuration changes introduced between
kernel-2.6.24.3-50.fc8 and kernel-2.6.24.4-64.fc8 to locate the culprit.
Comment 17 Naveed Hasan 2008-05-27 16:46:01 EDT
After looking at your attachments and comparing it to my system, two devices pop
out as potential suspects -

1) ALi Corporation M5253 P1394 OHCI 1.1 Controller / ALi Corporation USB 2.0
Controller / ALi Corporation USB 1.1 Controller

I have this exact device listed. My PCI card is a generic combo USB2 / FW1 in
which the Firewire component has NEVER worked in Fedora. The USB ports work fine.

2) AVerTV A801

I have a WinTV PVR 500 PCI card which shows up as device "Internext Compression
Inc iTVC16 (CX23416) MPEG-2 Encoder" and works perfectly with the built in
kernel modules and a downloaded firmware. This device has some module overlap
with your DVB device, i.e. i2c_i801, i2c_core, nvidia.
Comment 18 J 2008-05-28 16:31:51 EDT
I think you're possibly on to something Naveed. I've tried building the kernel
with I2C_PIIX4 and that made no difference (the first change after 50.fc8 rpm).

So I've had a look through the two config files I've used. See attached - there
is at least one difference for the OHCI:

# IEEE 1394 (FireWire) support
CONFIG_FIREWIRE_OHCI_DEBUG=y

Fedora have debug turned on?
Comment 19 J 2008-05-28 16:33:00 EDT
Created attachment 306983
diff between 2.6.24.7 stock kernel and 2.6.24.7-92.fc8 config files
Comment 20 J 2008-05-28 16:34:00 EDT
I'll try turning the option on and building a new kernel and see if it breaks it!
Comment 21 J 2008-05-31 04:48:26 EDT
(In reply to comment #20)
> I'll try turning the option on and building a new kernel and see if it breaks it!

The option isn't in the stock kernel! Now modifying the latest Fedora src kernel
and removing the patch relating to the firewire git update.
Comment 22 J 2008-06-07 10:51:16 EDT
Tried many build options but can't track down the problem. It is something with
Fedora as it isn't in the stock kernel.

I've just tried using kernel-2.6.25.4-10. The first boot failed, jammed at the
swap issue; the second boot worked after hanging for ages at the HAL startup.

I had a working keyboard and was able to log in as root to build the nvidia
drivers, but they failed to build - something I had experienced with the same
kernel from the kernel.org website. BUT just after that the keyboard stopped
responding again.

This:

Jun  7 15:16:13 xray acpid: client connected from 3087[68:68]
Jun  7 15:16:56 xray kernel: 
Jun  7 15:16:56 xray kernel: floppy driver state
Jun  7 15:16:56 xray kernel: -------------------
Jun  7 15:16:56 xray kernel: now=4294777408 last interrupt=4294697384 diff=80024
last called handler=ffffffff880cda82
Jun  7 15:16:56 xray kernel: timeout_message=lock fdc
Jun  7 15:16:56 xray kernel: last output bytes:
Jun  7 15:16:56 xray kernel: 8 80 4294687064
Jun  7 15:16:56 xray kernel: 8 80 4294687064
Jun  7 15:16:56 xray kernel: 8 80 4294687064
Jun  7 15:16:56 xray kernel: 8 80 4294687064
Jun  7 15:16:56 xray kernel: 12 80 4294697378
Jun  7 15:16:56 xray kernel: 0 90 4294697378
Jun  7 15:16:56 xray kernel: 13 90 4294697378
Jun  7 15:16:56 xray kernel: 0 90 4294697378
Jun  7 15:16:56 xray kernel: 1a 90 4294697378
Jun  7 15:16:56 xray kernel: 0 90 4294697378
Jun  7 15:16:56 xray kernel: 3 90 4294697378
Jun  7 15:16:56 xray kernel: c1 90 4294697378
Jun  7 15:16:56 xray kernel: 10 90 4294697378
Jun  7 15:16:56 xray kernel: 7 90 4294697378
Jun  7 15:16:56 xray kernel: 0 90 4294697378
Jun  7 15:16:56 xray kernel: 8 81 4294697378
Jun  7 15:16:56 xray kernel: f 80 4294697379
Jun  7 15:16:56 xray kernel: 0 90 4294697379
Jun  7 15:16:56 xray kernel: 1 91 4294697379
Jun  7 15:16:56 xray kernel: 8 81 4294697384
Jun  7 15:16:56 xray kernel: last result at 4294697384
Jun  7 15:16:56 xray kernel: last redo_fd_request at 4294697404
Jun  7 15:16:56 xray kernel: 20  1 
Jun  7 15:16:56 xray kernel: status=80
Jun  7 15:16:56 xray kernel: fdc_busy=1
Jun  7 15:16:56 xray kernel: floppy_work.func=ffffffff880cb871
Jun  7 15:16:56 xray kernel: cont=ffffffff880d2b40
Jun  7 15:16:56 xray kernel: current_req=0000000000000000
Jun  7 15:16:56 xray kernel: command_status=-1
Jun  7 15:16:56 xray kernel: 
Jun  7 15:16:56 xray kernel: floppy0: floppy timeout called
Jun  7 15:16:56 xray kernel: floppy.c: no request in request_done

Preceded the crash by about 13 minutes judging by the shutdown stamp.

What is going on with Fedora kernels? I now have two headaches: a crappy Fedora
kernel and, if I use the 25 series kernel, the inability to build my nvidia drivers!
Comment 23 J 2008-06-07 11:02:09 EDT
"25 series kernel the inability to build my nvidia drivers!"

This may be resolved with the new Nvidia download, assuming my keyboard works long
enough to type?
Comment 24 Naveed Hasan 2008-06-07 16:10:14 EDT
One way to rule out an ALi Corporation M5253 P1394 OHCI 1.1 Controller issue is
to boot a Fedora 2.6.24.4 or newer kernel with the card out of the machine... I
haven't had a chance to try this yet. I couldn't figure out a kernel option like
nousb for Firewire (Turbolinux appears to have no1394) to do this without
altering hardware.
Comment 25 J 2008-06-08 15:05:33 EDT
I appreciate the comment but rather perversely I don't have the time to strip
the card out as I use the PC all the time and it is easy to leave a kernel
building whilst working on other stuff.

I think but I'm NOT 100% certain that I've tracked it down to the following patches:
# Various low-impact patches to aid debugging.
# ApplyPatch linux-2.6-debug-sizeof-structs.patch
# ApplyPatch linux-2.6-debug-nmi-timeout.patch
# ApplyPatch linux-2.6-debug-taint-vm.patch
# ApplyPatch linux-2.6-debug-spinlock-taint.patch
%if !%{debugbuildsenabled}
ApplyPatch linux-2.6-debug-no-quiet.patch
%endif
# ApplyPatch linux-2.6-debug-vm-would-have-oomkilled.patch

I've taken more out and I've been slowly putting them back in. I'm currently
testing the lirc patch, but so far removing the above patches has always
allowed my machine to boot with the 2.6.25.4-10.fc8-local kernel.  The last
kernel I made included:

ApplyPatch linux-2.6-firewire-git-update.patch
ApplyPatch linux-2.6-firewire-git-pending.patch

Which I thought was going to be the killer judging by our previous communications
but it would appear NOT to be the case.  Although adding them did slow down the
HAL startup and gave me strange PID failures on reboot?

The stock kernel.org (vanilla) kernel works fine, I've tested that previously, so
hopefully I can narrow it down some more and have an answer once and for all.

To recap, the stock Fedora 2.6.25.4-10.fc8 kernel does NOT work, and if I can get
it past hanging at enabling fstab SWAP I lose the keyboard shortly after
logging in.
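
For anyone repeating this kind of patch elimination, the comment-out-and-rebuild step can be scripted. This is only a minimal sketch and not something used in this report: `disable_patch` is a hypothetical helper, and it assumes the spec file lists each patch on its own line as `ApplyPatch <name>`, as in the excerpt above.

```shell
#!/bin/sh
# Hypothetical helper: comment out one ApplyPatch line in a kernel spec file.
# Assumes the line reads exactly "ApplyPatch <patch-name>" with nothing else on it.
disable_patch() {
    patch=$1
    spec=$2
    tmp=$(mktemp) || return 1
    # Prefix the exactly matching line with "# "; copy every other line unchanged.
    awk -v p="ApplyPatch $patch" '$0 == p { print "# " $0; next } { print }' \
        "$spec" > "$tmp" && cat "$tmp" > "$spec"
    rm -f "$tmp"
}
```

Usage would be something like `disable_patch linux-2.6-lirc.patch kernel.spec`, followed by a rebuild and a test boot, removing one patch per build so that a change in behaviour points at a single patch.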
Comment 26 Chuck Ebbert 2008-06-08 23:39:39 EDT
(In reply to comment #25)
> I appreciate the comment but rather perversely I don't have the time to strip
> the card out as I use the PC all the time and it is easy to leave a kernel
> building whilst working on other stuff.
> 

You can just disable the driver for any card by renaming it.

1. Find the driver under /lib/modules/<version>
2. Rename it, for example rename fw-ohci.ko to fw-ohci.ko.disabled

If you remove the kernel with rpm or yum that file will be left behind, so
renaming everything back before deleting/upgrading the kernel would be a good idea.
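
The rename trick and its undo could be wrapped up roughly as follows. This is a sketch under assumptions: `disable_module` and `restore_module` are hypothetical helper names, the optional second argument (a module tree to search) exists only so the functions can be tried on a scratch directory, and running them against /lib/modules needs root.

```shell
#!/bin/sh
# Hypothetical helpers: disable a driver by renaming its .ko file, and undo it.
# $1 = module file name without ".ko" (for example fw-ohci)
# $2 = module tree to search (assumed default: the running kernel's tree)
disable_module() {
    name=$1
    root=${2:-/lib/modules/$(uname -r)}
    find "$root" -type f -name "$name.ko" | while IFS= read -r f; do
        mv -- "$f" "$f.disabled" && echo "disabled: $f"
    done
}

# Rename everything back, e.g. before removing or upgrading the kernel package.
restore_module() {
    name=$1
    root=${2:-/lib/modules/$(uname -r)}
    find "$root" -type f -name "$name.ko.disabled" | while IFS= read -r f; do
        mv -- "$f" "${f%.disabled}" && echo "restored: $f"
    done
}
```

For example, `disable_module fw-ohci` before a test boot and `restore_module fw-ohci` afterwards, so that no stray `.disabled` file is left behind when the kernel package is removed.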
Comment 27 J 2008-06-09 18:17:22 EDT
Unfortunately adding the LIRC patch back into the kernel caused the kernel to
behave badly.  Then going to some of the earlier kernels I found that the
issue did seem to be present?  I'll need to check my notes to find which kernel
had which patches and then try different boot-ups to see whether it is related
to cold or warm reboots.
Comment 28 Naveed Hasan 2008-06-09 23:55:04 EDT
(In reply to comment #26)
> You can just disable the driver for any card by renaming it.
> 
> 1. Find the driver under /lib/modules/<version>
> 2. Rename it, for example rename fw-ohci.ko to fw-ohci.ko.disabled

Great! This gets me past the initial hang that was occurring with any released
kernel newer than kernel-2.6.24.3-50. As I suspected, the ALi Corporation M5253
P1394 OHCI 1.1 Controller is (at least one of) the culprit(s) on my desktop
machine. The offending driver file is
/lib/modules/2.6.25.4-10.fc8/kernel/drivers/firewire/firewire-ohci.ko and if
I rename it, the system gets past "Enabling /etc/fstab swaps: OK" and completes
booting into runlevel 5.

The very first time I did this, however, the keyboard stopped responding
completely in X within a matter of minutes. The mouse could be moved, but the
already logged in session would not accept clicks or any sort of keystrokes. I
had to login from another machine and telinit 3 -> telinit 5. So I'm not sure if
the problem is completely gone because this issue was probably not just a
coincidence. As of now, I am still able to type. I will monitor to see what
might be triggering the input problem in newer kernels.

Thanks for your help Chuck & J.
Comment 29 Jarod Wilson 2008-06-10 15:41:52 EDT
One thing that could help is to enable firewire debugging messages, to see what
sort of state the firewire driver/controller is in. Adding 'options
firewire-ohci debug=7' to /etc/modprobe.conf should turn on the relevant bits.
However, if you're still getting lockups without the firewire drivers even
loading, I'd hesitate to blame firewire. It could certainly help expose the
problem sooner though, if the card is particularly chatty on initialization...
Comment 30 Stefan Richter 2008-06-10 16:40:13 EDT
Re comment #18:  This is a blind Kconfig variable which is always turned on,
unless manually disabled by means other than the usual configuration generators.
The option adds some code to the firewire-ohci driver but does not influence its
default runtime behavior.  It adds the implementation of the option which Jarod
mentions in comment #29.
Comment 31 Naveed Hasan 2008-06-13 17:10:38 EDT
(In reply to comment #29)
> However, if you're still getting lockups without the firewire drivers even
> loading, I'd hesitate to blame firewire. It could certainly help expose the
> problem sooner though, if the card is particularly chatty on initialization...

This was a false alarm related to X and nVidia binary driver setup. There have
been no lockups since I renamed the firewire-ohci.ko file.
Comment 32 Naveed Hasan 2008-06-13 17:20:14 EDT
(In reply to comment #29)
> One thing that could help is to enable firewire debugging messages, to see what
> sort of state the firewire driver/controller is in. Adding 'options
> firewire-ohci debug=7' to /etc/modprobe.conf should turn on the relevant bits.

I have since upgraded to Fedora 9 (which required a nofirewire kernel argument,
btw) and am now experiencing exactly the same symptoms as the original bug report.
With the latest unmodified released kernel 2.6.25.6-55.fc9.x86_64, my PS/2
keyboard stops responding during boot up. This happens when the system is
"Starting udev" and before it says "[OK]" for that. I cannot type anything after
that, BUT I can Ctrl-Alt-Delete on one of the ttys to trigger a reboot. Which is
very strange.

Adding the specified line to /etc/modprobe.conf has not yielded any new
information in /var/log/messages. I get one of two things re: firewire during
bootup -

firewire_ohci: Added fw-ohci device 0000:04:01.4, OHCI version 1.10
firewire_core: created device fw0: GUID 0090e639000000f4, S400
firewire_core: phy config: card 0, new root=ffc1, gap_count=5

OR

firewire_ohci: Added fw-ohci device 0000:04:01.4, OHCI version 1.10
firewire_core: IRM has link off, making local node (ffc0) root.
firewire_core: phy config: card 0, new root=ffc0, gap_count=5

In both cases, the keyboard is not working when it reaches the login phase.

So now, instead of renaming the firewire-ohci.ko file, I have added 'blacklist
firewire-ohci' to my modprobe.conf file and can boot successfully in the newest
kernel without messing around with renaming. The 'ALi Corporation M5253 P1394
OHCI 1.1 Controller' of course, does not work.
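
The blacklist line can be added without creating duplicates on repeated runs. A minimal sketch, assuming the config file is /etc/modprobe.conf as in this comment (other setups in this thread use /etc/modprobe.d/blacklist); `blacklist_module` is a hypothetical helper, not an existing tool, and writing to the real file needs root.

```shell
#!/bin/sh
# Hypothetical helper: append "blacklist <module>" to a modprobe config file
# unless that exact line is already present.
# $1 = module name, $2 = config file (assumed default: /etc/modprobe.conf)
blacklist_module() {
    mod=$1
    conf=${2:-/etc/modprobe.conf}
    if grep -qxF "blacklist $mod" "$conf" 2>/dev/null; then
        echo "already blacklisted: $mod"
    else
        echo "blacklist $mod" >> "$conf"
        echo "blacklisted: $mod"
    fi
}
```

For example, `blacklist_module firewire-ohci` as root. Removing the line again restores automatic loading of the driver.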
Comment 33 Stefan Richter 2008-06-13 18:10:15 EDT
I suppose you don't get any extra messages despite fw-ohci's debug=7 because the
syslogd or/and console aren't set up to capture kern.debug level messages.

Just an observation though:  There are two nodes on the FireWire bus right from
the start. Is there anything connected, e.g. an internal hub?  (Or maybe the
card has got two PHYs.)
Comment 34 Stefan Richter 2008-06-14 03:03:05 EDT
To capture kernel debug messages at least on the console,
# dmesg -n 8
is necessary.  I don't know what the equivalent for Fedora log files is and
whether Fedora boot screens show console messages.
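
Whether debug-level kernel messages reach the console can be checked before and after running `dmesg -n 8`. A small sketch, assuming the first field of /proc/sys/kernel/printk is the current console log level and that debug messages are level 7, so they are only shown when that field is 8 or higher; `debug_on_console` is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: report whether kernel debug messages (level 7)
# would be printed on the console.
# $1 = contents of /proc/sys/kernel/printk, e.g. "4 4 1 7"
debug_on_console() {
    # Split the argument on whitespace; the first field is the console log level.
    set -- $1
    if [ "$1" -gt 7 ]; then
        echo yes
    else
        echo no
    fi
}
```

Usage: `debug_on_console "$(cat /proc/sys/kernel/printk)"`. If it prints `no`, run `dmesg -n 8` as root and check again.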
Comment 35 Stefan Richter 2008-06-15 17:13:52 EDT
My guess would be that the AT-request DMA of the controller is inoperable,
fw_send_phy_config() waits indefinitely, and thus blocks the events kernel thread.
Comment 36 J 2008-06-17 11:53:40 EDT
Why is this only present in the FC8 version of the kernel though. The vanilla
versions from kernel.org work perfectly?
Comment 37 Jarod Wilson 2008-06-17 12:10:18 EDT
Newer firewire stack in play. Fedora kernels have been tracking the upstream
linux1394-2.6.git tree, which in the majority of cases, performs better than
what's been pushed to linus' tree, but alas, not always...
Comment 38 Stefan Richter 2008-06-17 13:07:33 EDT
>> Why is this only present in the FC8 version of the kernel though.
>> The vanilla versions from kernel.org work perfectly?
>
> Newer firewire stack in play. Fedora kernels have been tracking the
> upstream linux1394-2.6.git tree, which in the majority of cases,
> performs better than what's been pushed to linus' tree, but alas,
> not always...

I suspect that the reason for this regression is the patch "firewire: wait until
PHY configuration packet was transmitted (fix bus reset loop)",
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=2a0a2590498be7b92e3e76409c9b8ee722e23c8f
which, as you can see, is now present in vanilla too (since 2.6.26-rc1 IIRC).

I think we have to
  1. improve fw_send_phy_config() to time out if necessary,
  2. fix AT DMA (or probably PHY initialization) on ALi M5253.

Patch for 1. comes in an hour or so.  Patch for 2. will depend on Jarod or me
getting our hands on an ALi controller with the same problem and being able to
spend some time for experiments with it.  This controller has also been
problematic with the older linux1394 stack in kernel 2.4 and 2.6.  Today we had
another report which is presumably about ALi M5253 too:
http://marc.info/?l=linux1394-user&m=121366127230497
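
Point 1 comes down to putting a deadline on a wait that may never complete. The real fix is kernel C code in the patch referenced in this bug; purely as a loose userspace analogy, and not the actual patch, the same idea can be sketched in shell with the coreutils `timeout` command. `wait_with_deadline` is a hypothetical helper, and exit status 124 as the timed-out marker is GNU `timeout` behaviour.

```shell
#!/bin/sh
# Hypothetical helper: run a command that might block forever, but give up
# after a deadline instead of waiting indefinitely.
# $1 = deadline in seconds; remaining arguments = the command to run
wait_with_deadline() {
    secs=$1
    shift
    rc=0
    timeout "$secs" "$@" || rc=$?
    if [ "$rc" -eq 124 ]; then
        # GNU timeout exits with 124 when the deadline expired.
        echo "gave up after ${secs}s"
    else
        echo "finished with status $rc"
    fi
}
```

With this, `wait_with_deadline 1 sleep 5` gives up after one second instead of stalling its caller for the full five, which is the behaviour wanted from the PHY config transmission on a controller that never completes it.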
Comment 39 Naveed Hasan 2008-06-17 13:23:05 EDT
(In reply to comment #33)
> Just an observation though:  There are two nodes on the FireWire bus right from
> the start. Is there anything connected, e.g. an internal hub?  (Or maybe the
> card has got two PHYs.)

Those two sets of messages are from *different* bootups. I just pulled them out
of /var/log/messages later when I had firewire-ohci blacklisted. So it's
interesting to note that the current state of the driver in Fedora yields
different messages on separate powerons with the M5253. It always messes up the
keyboard unless the driver is blacklisted.

My card has two Firewire 400 ports and two USB 2 ports. One of the Firewire
ports is connected to an external Firewire / USB hub, if that helps. Nothing on
that bus works, however the USB parts of the card work fine.
Comment 40 Naveed Hasan 2008-06-17 13:27:49 EDT
(In reply to comment #25)
> The last kernel I made included:
> 
> ApplyPatch linux-2.6-firewire-git-update.patch
> ApplyPatch linux-2.6-firewire-git-pending.patch
> 
> Which I thought was going to be the killer judging by our previous communications
> but it would appear NOT to be the case.  Although adding them did slow down the
> HAL startup and gave me strange PID failures on reboot?

J, does blacklisting firewire-ohci with the Fedora kernel resolve your problems?
Comment 41 Stefan Richter 2008-06-17 14:16:26 EDT
proposed fix for part of the problem:
patch "firewire: deadline for PHY config transmission"
http://marc.info/?l=linux1394-devel&m=121372642105480
Comment 42 Jarod Wilson 2008-06-17 16:01:53 EDT
I will attempt to put a test kernel together this evening and post it somewhere
for folks to try out. Also planning to acquire an ALi card to beat on myself...
Comment 43 J 2008-06-17 18:13:33 EDT
(In reply to comment #40)
> J, does blacklisting firewire-ohci with the Fedora kernel resolve your problems?

Yep.  With the line 

blacklist firewire-ohci 

added to my /etc/modprobe.d/blacklist file I am now able to boot with keyboard
into the stock FC8 kernel version 2.6.25.4-10.fc8 which is the first time.

Going through my tests I found that I was able to achieve the same boot when I
commented out the following patches and built my own version.

# Various low-impact patches to aid debugging.
# 1 ApplyPatch linux-2.6-debug-sizeof-structs.patch
# 2 ApplyPatch linux-2.6-debug-nmi-timeout.patch
# 3 ApplyPatch linux-2.6-debug-taint-vm.patch
# 4 ApplyPatch linux-2.6-debug-spinlock-taint.patch
%if !%{debugbuildsenabled}
ApplyPatch linux-2.6-debug-no-quiet.patch
%endif
# 5 ApplyPatch linux-2.6-debug-vm-would-have-oomkilled.patch

# http://www.lirc.org/
# 6 ApplyPatch linux-2.6-lirc.patch

# FireWire updates and fixes
# snap from http://me.in-berlin.de/~s5r6/linux1394/updates/
# 7 ApplyPatch linux-2.6-firewire-git-update.patch
# 8 ApplyPatch linux-2.6-firewire-git-pending.patch

As it shows, the firewire patches are commented out, and in later versions where I
put them back in I got "intermittent" results.

Thanks guys for your help in resolving this.  Give me a URL and I'm happy to
test out the kernel. I've got about 5 versions of the fc8 one and a few vanilla
ones; another isn't going to hurt.  I can start removing them now!

I also noticed differences between cold and warm boots. Is this in line with the
suggested bug?  Again things were intermittent.

All the best.

J
Comment 44 Stefan Richter 2008-06-17 19:13:57 EDT
> I also noticed differences between cold and warm boots. Is this in line with
> the suggested bug?  Again things were intermittent.

Interesting observation.  Yes, cold boot vs. warm boot may make a difference
with controllers like this, as one of the FireWire old-timers reminded us yesterday:
http://marc.info/?l=linux1394-user&m=121366306632205

(But since this line of controllers seems to work with Windows, we may get it
working in Linux too if we are persistent enough.)

Since you already compile your own kernels, you could reactivate all the
firewire patches and then apply the new one from comment #41 on top of it.  OTOH
Jarod will surely supply you with a test kernel soon.  Either way, keep
firewire-ohci blacklisted for now and only load it manually with modprobe when
you are ready to crash your PC (i.e. after running "sync" and all that).

To be clear about it, the patch in comment #41 is only an attempt to fix the
keyboard lock-up.  It in itself will probably not be sufficient to get the
FireWire functionality fully working.
Comment 45 Jarod Wilson 2008-06-18 00:15:33 EDT
Sorry for the delay, just got an x86_64 test build w/the patch from comment #41
together and stashed up here:

http://people.redhat.com/jwilson/kernels/2.6.25.7-64.fw.fc9/

Should work equally well with F8, though it might complain about mkinitrd
versioning... If so, ought to be perfectly safe to override and ignore that.
Comment 46 Stefan Richter 2008-06-18 12:33:34 EDT
I verified the patch with ALi M5271.  (lspci says M5253.)

I also posted an update which improves the patch for working controllers but is
effectively unchanged for non-working controllers.
http://marc.info/?l=linux1394-devel&m=121380606431945
Comment 47 J 2008-06-19 15:53:26 EDT
Created attachment 309878
lspci -vvv on firewire patched kernel 2.6.25.4-10

This is the lspci -vvv taken upon booting into the stock Fedora 8 2.6.25.4-10
kernel with the patch from comment #46 included after the two existing Fedora
git firewire patches.
Comment 48 J 2008-06-19 15:58:41 EDT
The patched version of kernel-2.6.25.4-10, using the existing fc8 src rpm and
adding the patch from #46 after the existing Fedora firewire git patches, works
great in that the system is able to boot fully with a fully functioning keyboard
without recourse to blacklisting the firewire-ohci module.

lspci -vvv posted in #47.
03:03.4 FireWire (IEEE 1394): ALi Corporation M5253 P1394 OHCI 1.1 Controller
(prog-if 10 [OHCI])

Great work.

The F9 kernel indeed did moan about mkinitrd and since I had the FC8 source I
just patched that. Will this make it into the next FC8 update-testing kernel?

Cheers for all your help.
J
Comment 49 Stefan Richter 2008-06-21 05:30:03 EDT
Fixed upstream in kernel 2.6.26-rc7.  Thanks J and Naveed, your debugging work
made the fix possible.

The FireWire part of ALi M5253 and other M52xx is still inoperable
(http://bugzilla.kernel.org/show_bug.cgi?id=10935), but at least it doesn't kill
other functionality of the kernel anymore.
Comment 50 J 2008-06-22 08:53:51 EDT
Hi,

I applied the patch to the latest Fedora kernel 2.6.25.6-27 src and compiled a
new kernel. It worked with no keyboard problems but today I noticed this in my
system log.  Everything seems to be working that I normally use so I'm not
completely certain what triggered this?

Jun 22 13:43:30 xray kernel: firewire_core: giving up on config rom for node id ffc1
Jun 22 13:43:30 xray kernel: firewire_core: phy config: card 0, new root=ffc0,
gap_count=5
Jun 22 13:43:30 xray kernel: ------------[ cut here ]------------
Jun 22 13:43:30 xray kernel: WARNING: at drivers/firewire/fw-transaction.c:350
fw_send_phy_config+0xe4/0xf0 [firewire_core]() (Tainted: P        )
Jun 22 13:43:30 xray kernel: Modules linked in: autofs4 w83627hf hwmon_vid
w83792d hwmon fuse sunrpc bonding nf_conntrack_ipv4 ipt_REJECT iptable_filter
ip_tables nf_conntrack_netbios_ns nf_conntrack_ipv6 xt_state nf_conntrack
xt_tcpudp ip6t_ipv6header ip6t_REJECT ip6table_filter ip6_tables x_tables ipv6
dm_mirror dm_multipath dm_mod snd_intel8x0 snd_ac97_codec dvb_pll ac97_bus
nvidia(P)(U) snd_seq_dummy firewire_ohci firewire_core mt2060 snd_seq_oss
snd_seq_midi_event snd_seq snd_pcm_oss snd_mixer_oss snd_mpu401 crc_itu_t
dvb_usb_a800 snd_mpu401_uart dvb_usb_dibusb_common dib3000mc snd_rawmidi snd_pcm
iTCO_wdt e1000 dibx000_common snd_seq_device iTCO_vendor_support via_rhine mii
dvb_usb snd_timer parport_pc button snd i2c_i801 ns558 parport i6300esb dvb_core
snd_page_alloc pcspkr sg soundcore gameport i2c_core shpchp sr_mod floppy cdrom
sata_promise ata_piix pata_acpi ata_generic libata sd_mod scsi_mod ext3 jbd
mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded: microcode]
Jun 22 13:43:30 xray kernel: Pid: 15, comm: events/0 Tainted: P        
2.6.25.6-27.1.fc8 #1
Jun 22 13:43:30 xray kernel: 
Jun 22 13:43:30 xray kernel: Call Trace:
Jun 22 13:43:30 xray kernel: [<ffffffff81032e6c>] warn_on_slowpath+0x60/0x73
Jun 22 13:43:30 xray kernel: [<ffffffff8128b0fb>] ? schedule_timeout+0x98/0xb4
Jun 22 13:43:30 xray kernel: [<ffffffff8103c281>] ? process_timeout+0x0/0xb
Jun 22 13:43:30 xray kernel: [<ffffffff8128b0eb>] ? schedule_timeout+0x88/0xb4
Jun 22 13:43:30 xray kernel: [<ffffffff8128ae2d>] ? wait_for_common+0xff/0x129
Jun 22 13:43:30 xray kernel: [<ffffffff8102ad22>] ? default_wake_function+0x0/0xf
Jun 22 13:43:30 xray kernel: [<ffffffff8823ca8f>]
:firewire_core:fw_send_phy_config+0xe4/0xf0
Jun 22 13:43:30 xray kernel: [<ffffffff8823b8d7>]
:firewire_core:fw_card_bm_work+0x356/0x37b
Jun 22 13:43:30 xray kernel: [<ffffffff811c3d29>] ? usb_start_wait_urb+0xa4/0xb3
Jun 22 13:43:30 xray kernel: [<ffffffff8103c5a9>] ? lock_timer_base+0x26/0x4a
Jun 22 13:43:30 xray kernel: [<ffffffff811c3f32>] ? usb_control_msg+0xd3/0xe5
Jun 22 13:43:30 xray kernel: [<ffffffff810433dd>] ? queue_delayed_work_on+0xae/0xbe
Jun 22 13:43:30 xray kernel: [<ffffffff8104344e>] ? queue_delayed_work+0x46/0x4e
Jun 22 13:43:30 xray kernel: [<ffffffff81043482>] ? schedule_delayed_work+0x2c/0x31
Jun 22 13:43:30 xray kernel: [<ffffffff8817b384>] ?
:dvb_usb:dvb_usb_read_remote_control+0x0/0xc4
Jun 22 13:43:30 xray kernel: [<ffffffff8823b581>] ?
:firewire_core:fw_card_bm_work+0x0/0x37b
Jun 22 13:43:30 xray kernel: [<ffffffff81042bb0>] run_workqueue+0x7d/0x106
Jun 22 13:43:30 xray kernel: [<ffffffff8104355f>] worker_thread+0xd8/0xe5
Jun 22 13:43:30 xray kernel: [<ffffffff81045fcb>] ?
autoremove_wake_function+0x0/0x38
Jun 22 13:43:30 xray kernel: [<ffffffff81043487>] ? worker_thread+0x0/0xe5
Jun 22 13:43:30 xray kernel: [<ffffffff81045e93>] kthread+0x49/0x76
Jun 22 13:43:30 xray kernel: [<ffffffff8100cc78>] child_rip+0xa/0x12
Jun 22 13:43:30 xray kernel: [<ffffffff81045e4a>] ? kthread+0x0/0x76
Jun 22 13:43:30 xray kernel: [<ffffffff8100cc6e>] ? child_rip+0x0/0x12
Jun 22 13:43:30 xray kernel: 
Jun 22 13:43:30 xray kernel: ---[ end trace 00cca12dcbbffe14 ]---
Comment 51 Stefan Richter 2008-06-22 09:28:44 EDT
This is the WARN_ON() in my patch.  It will go away once we figured out how to
make ALi M52xx fully functional.  (bugzilla.kernel.org 10935)

I asked myself whether to log only a single log message or this scary-looking
call trace.  But then I decided to stick with the latter to let it be obvious
that there is still a bug in the driver.  The bug does not reduce the rest of
the system stability though.  Its only effects are:  a) The card simply doesn't
work, b) a small data structure which the firewire-core allocated for temporary
purposes won't be freed anymore until reboot.
Comment 52 Fedora Update System 2008-06-30 12:37:08 EDT
kernel-2.6.25.9-40.fc8 has been submitted as an update for Fedora 8
Comment 53 Fedora Update System 2008-07-01 01:25:30 EDT
kernel-2.6.25.9-40.fc8 has been pushed to the Fedora 8 testing repository.  If problems still persist, please make note of it in this bug report.
 If you want to test the update, you can install it with 
 su -c 'yum --enablerepo=updates-testing update kernel'.  You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F8/FEDORA-2008-5880
Comment 54 Fedora Update System 2008-07-02 23:14:41 EDT
kernel-2.6.25.9-40.fc8 has been pushed to the Fedora 8 stable repository.  If problems still persist, please make note of it in this bug report.
Comment 55 J 2008-07-05 11:39:22 EDT
2.6.25.9-40.fc8 - the first fedora kernel I have been able to use in the last 6
months without any modifications to my modconf or recompiling it myself.

Cheers for all the help.

The kernel looks to be working ok now.
