From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4.1) Gecko/20031114

Description of problem:
With kernel-2.4.22-1.2166.nptl, the firewire initialization procedure hangs the machine solid on a Dell Latitude D800 (Centrino 1.7 GHz). No issue whatsoever is present with 2.4.22-1.2149.nptl.

Version-Release number of selected component (if applicable):
kernel-2.4.22-1.2166.nptl

How reproducible:
Always

Steps to Reproduce:
1. Boot under 2.4.22-1.2166.
2. Errors about sbp2 are printed almost immediately.
3. When the boot reaches the initialization of the firewire modules, the machine hangs solid.

Actual Results:
The machine hangs.

Expected Results:
The boot should continue.

Additional info:
There was no problem with 2.4.22-1.2149, but clearly something has changed: with 2166 the kernel seems to "see" the firewire devices almost immediately, and then fails miserably.

If I disconnect all firewire devices (two LaCie hard drives and one Fujitsu magneto-optical drive), the boot doesn't hang. If I reconnect them later, the machine often (not always) hangs when scanning the SCSI bus for device recognition (I use rescan-scsi-bus.sh).

If I put

alias ieee1394-controller off

inside /etc/modules.conf, the boot doesn't hang (it skips the initialization where it would hang), but it still prints error messages about sbp2 at the very beginning, and further scanning of the devices is again often troublesome.
Further details: setting "alias ieee1394-controller off" doesn't really help much. 9 out of 10 boots hang at "checking for new hardware" instead of at firewire initialization, which is not much of an improvement. Clearly kudzu scans the firewire bus and hangs.

The only reliable way to boot is:
a) physically disconnect the firewire cable
b) connect the cable again after boot completes (timed-out and failed-reconnect errors appear all over the place, as at the very beginning of the boot process)
c) rmmod sbp2 (otherwise rescan-scsi-bus.sh hangs the machine)
d) modprobe sbp2
e) manually echo "scsi add-single-device x y z w" > /proc/scsi/scsi, because the order in which the 3 firewire disks show up is random. (With 2149 they always showed up in the "right" order: sda was the hd connected first, sdb the second, and so on. With 2166, after all the mess of booting, there is a random permutation of the three disks.)

A (private) reverse patch, or some indication of how to bring 2166 back to the 2149 state as far as firewire is concerned, would be EXTREMELY WELCOME while waiting for an answer (I have bugzilla reports on previous firewire problems still NEW after years...). I can manage to patch and compile a new kernel myself; 2166 is clearly a no-go for my hardware.
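For reference, steps c) through e) above can be sketched as a small shell helper. This is only a sketch under assumptions: the names run, fw_readd and DRY_RUN are made up for this example, each argument is a "host channel id lun" tuple that has to be taken from the actual system, and whether the registration order really fixes the /dev/sdX naming is assumed from the description above, not verified. By default it only prints the commands; it has to be run as root with DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Sketch of workaround steps c) to e): reload sbp2, then register each
# firewire disk by hand in a chosen order.
# run, fw_readd and DRY_RUN are hypothetical names for this example.

run() {
    # Print the command in dry-run mode (the default), execute it otherwise.
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        sh -c "$*"
    fi
}

fw_readd() {
    # Each argument is one "host channel id lun" tuple, listed in the
    # order in which the disks should be registered.
    run "rmmod sbp2"
    run "modprobe sbp2"
    for hcil in "$@"; do
        run "echo \"scsi add-single-device $hcil\" > /proc/scsi/scsi"
    done
}
```

Calling fw_readd "1 0 0 0" "1 0 1 0" "1 0 2 0" in dry-run mode prints the rmmod, the modprobe and three echo commands without touching the system; the tuples here are placeholders.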
A 'reverse patch' isn't quite so simple, as nothing changed in ieee1394 between 2149 and 2166.
I had already realized that the issue must lie in some other change (I diffed all the sources). I am not really an expert: setting aside the video-related changes between 2149 and 2166, I do not know which other change could indirectly trigger the problem. If you have suggestions about which patches are candidates to be removed and/or selectively brought back to their 2149 state, I can try to narrow down the problem. I have no problem compiling and building modified kernel rpm's, provided it doesn't involve patching code by hand.

I would like to stress that this is a fully deterministic situation: with the firewire HDs connected, there is no chance of booting the machine. Connecting them after boot and doing gymnastics with sbp2 and /proc/scsi/scsi results in a perfectly stable and working system. There were firewire issues all along with psyche/shrike, but never this serious. 2149 was perfect; 2166 is a nightmare...
I have a similar problem with kernel-smp-2.4.22-1.2166.nptl.i686 (on an Intel 860 chipset motherboard with dual P4 Xeon 1.7 GHz processors). It hangs on boot if something is attached to the firewire card. It also hangs (hard lock) if the SCSI bus is scanned during operation, whether or not something is attached to the firewire bus. Switching back to kernel-smp-2.4.22-1.2149.nptl.i686 fixes the problem. I have the proprietary nvidia drivers installed for a GeForce 5200 FX video card.
A few comments on 2.4.22-1.2173.nptl and 2.4.22-1.2174.nptl.

2.4.22-1.2173.nptl: suddenly, booting with the firewire devices connected works again. There are no "time-out" or "failed to reconnect" messages, and all three HDs are recognized and configured. However, numbering the devices 0-1-2 in the order they are physically connected (0 being the one connected to the computer, 1 the next one, and so on), they are recognized as 2-1-0 (2 gets /dev/sda, 1 gets /dev/sdb, 0 gets /dev/sdc). Not a major hassle.

Summarizing:

2.4.22-1.2149: no problem at boot, devices configured as 0-1-2. I did not note whether the usb scsi cdrom appeared on scsi bus 1 and the firewire devices on scsi bus 0 or vice versa (I could check if it is of interest).

2.4.22-1.2166: no chance to boot when the firewire devices are connected. Connecting them afterwards (with various error messages), they get randomly configured as 2-0-1 or 2-1-0 (never 0-1-2). The external usb cdrom is on scsi bus 0 (0 0 0 0), while the firewire devices are on bus 1 (1 0 x 0, x=0,1,2).

2.4.22-1.2173: no problem at boot, devices always configured (5 boots so far) as 2-1-0. The external usb cdrom is on scsi bus 1 (1 0 0 0), while the firewire devices are on bus 0 (0 0 x 0, x=0,1,2).

2.4.22-1.2174: the machine doesn't boot at all; kernel panic, apparently in acpi (see bug 116232).
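To compare the enumeration order between kernels, the host, channel, id and lun of each attached device can be read back from /proc/scsi/scsi. The following is a minimal sketch, assuming the usual 2.4 layout of that file (a "Host: scsiN Channel: NN Id: NN Lun: NN" line followed by a "Vendor:" line). The name list_scsi is made up for this example, and a device with an empty vendor field would be reported incorrectly.

```shell
#!/bin/sh
# list_scsi (hypothetical helper): read /proc/scsi/scsi-style text on stdin
# and print one line per device: host channel id lun vendor.
list_scsi() {
    awk '
        # Remember the address from the "Host:" line of each entry.
        /^Host:/  { host = $2; chan = $4; id = $6; lun = $8 }
        # Emit the entry when its "Vendor:" line arrives; "+ 0" turns the
        # zero-padded fields (for example "02") into plain numbers.
        /Vendor:/ { print host, chan + 0, id + 0, lun + 0, $2 }
    '
}
```

Typical use would be list_scsi < /proc/scsi/scsi after each boot, keeping the output so that the order under different kernels can be compared side by side.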
Thanks for the bug report. However, Red Hat no longer maintains this version of the product. Please upgrade to the latest version and open a new bug if the problem persists. The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, and if you believe this bug is interesting to them, please report the problem in the bug tracker at: http://bugzilla.fedora.us/