Bug 271801 - new firewire stack doesn't recognize onboard controller nor external drives
Summary: new firewire stack doesn't recognize onboard controller nor external drives
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 7
Hardware: i686
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Assignee: Jarod Wilson
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2007-08-31 15:26 UTC by Roberto Malinverni
Modified: 2008-02-25 19:46 UTC
CC List: 8 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2008-02-25 18:23:56 UTC
Type: ---
Embargoed:


Attachments
dmesg output when drives are recognised (2.25 KB, text/plain)
2007-08-31 15:26 UTC, Roberto Malinverni
Full dmesg output from Fedora8 test2 (33.09 KB, text/plain)
2007-09-20 19:56 UTC, Ed Lally
dmesg output from removing/reloading firewire kernel modules (6.52 KB, text/plain)
2007-10-07 06:01 UTC, Ed Lally
Output from ls commands showing corrupted directory listings (4.53 KB, text/plain)
2007-10-07 06:02 UTC, Ed Lally
newer dmesg output from removing/reloading firewire kernel modules (36.92 KB, text/plain)
2008-02-01 02:23 UTC, Ed Lally
lsmod showing loaded modules before/after reloading firewire_ohci (626 bytes, text/plain)
2008-02-01 02:24 UTC, Ed Lally
Buffer IO errors under load (3.06 KB, text/plain)
2008-02-01 12:50 UTC, Ed Lally
dmesg output with koji kernel (73.13 KB, text/plain)
2008-02-03 00:40 UTC, Ed Lally

Description Roberto Malinverni 2007-08-31 15:26:26 UTC
Description of problem:
When using the new firewire stack, at boot time I see a message:
"firewire_core: giving up on config rom for node ..."
No firewire controller appears in the system, and when I plug in my external
firewire drives they aren't recognised (dmesg shows absolutely nothing and no
device node is created).

Version-Release number of selected component (if applicable):
kernel 2.6.22.4-65.fc7
 

Additional info:
I installed the "old" firewire kernel module provided by ATRPMS for my kernel.
After blacklisting the new modules and rebooting no error message is shown and
the drives work fine.
The MB is an ASUS P5GD2 Premium, with an onboard firewire controller by Texas
Instruments (hardware browser identifies it as "TI TSB82AA2-1394B link layer
controller", using driver ohci1394).
I attach below the output of dmesg when the drives are plugged in.

Comment 1 Roberto Malinverni 2007-08-31 15:26:27 UTC
Created attachment 183541
dmesg output when drives are recognised

Comment 2 Ed Lally 2007-09-19 04:14:48 UTC
Still seeing this problem with latest kernel.  This is on a Gigabyte GA-P35-DQ6
motherboard with on-board Firewire to two external drives (a hard drive and DVD
burner).

Output from `uname -a`:
Linux strauss 2.6.22.5-76.fc7 #1 SMP Thu Aug 30 13:08:59 EDT 2007 x86_64 x86_64
x86_64 GNU/Linux

Output from `dmesg | grep firewire` showing connect/reconnect attempts:
firewire_ohci: Added fw-ohci device 0000:05:00.0, OHCI version 1.10
firewire_ohci: Added fw-ohci device 0000:05:06.0, OHCI version 1.10
firewire_core: created new fw device fw0 (0 config rom retries)
firewire_core: created new fw device fw1 (0 config rom retries)
firewire_core: giving up on config rom for node id ffc0
firewire_core: giving up on config rom for node id ffc1
firewire_core: phy config: card 1, new root=ffc0, gap_count=5
firewire_core: giving up on config rom for node id ffc2
firewire_core: phy config: card 0, new root=ffc0, gap_count=63
firewire_core: giving up on config rom for node id ffc1
firewire_core: giving up on config rom for node id ffc2
firewire_core: phy config: card 0, new root=ffc0, gap_count=63
firewire_core: phy config: card 1, new root=ffc2, gap_count=7
firewire_core: giving up on config rom for node id ffc1
firewire_core: giving up on config rom for node id ffc0
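
A log excerpt like the one above can be summarised per node to see which node IDs fail repeatedly. The following is a minimal Python sketch, not part of the original report; the function name `count_config_rom_failures` and the regular expression are assumptions based only on the message format shown here.

```python
import re
from collections import Counter

# Pattern assumed from the dmesg lines quoted in this comment.
GIVE_UP = re.compile(
    r"firewire_core: giving up on config rom for node id ([0-9a-f]+)"
)

def count_config_rom_failures(dmesg_text):
    """Return a Counter mapping node id -> number of 'giving up' messages."""
    failures = Counter()
    for line in dmesg_text.splitlines():
        match = GIVE_UP.search(line)
        if match:
            failures[match.group(1)] += 1
    return failures

# Shortened sample built from lines in the excerpt above.
SAMPLE = """\
firewire_core: created new fw device fw0 (0 config rom retries)
firewire_core: giving up on config rom for node id ffc0
firewire_core: giving up on config rom for node id ffc1
firewire_core: phy config: card 1, new root=ffc0, gap_count=5
firewire_core: giving up on config rom for node id ffc1
"""

if __name__ == "__main__":
    for node_id, count in sorted(count_config_rom_failures(SAMPLE).items()):
        print(node_id, count)
```

Feeding it the full output of `dmesg` instead of `SAMPLE` would give the same per-node summary for a real boot.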


My smolt page is
http://smolt.fedoraproject.org/show?UUID=c413b36f-7ba0-405c-ad84-98d4ae3bfb52

Comment 3 Chuck Ebbert 2007-09-19 19:55:51 UTC
Can anyone test with the Fedora8 test2 live CD? This will tell us if the problem
is fixed in kernel 2.6.23.

Comment 4 Pete Zaitcev 2007-09-19 23:16:36 UTC
Another option is to throw a kernel from Rawhide on top of FC-7 (with
rpm --force if needed: should be ok for testing purposes).

Comment 5 Ed Lally 2007-09-20 19:56:58 UTC
Created attachment 201251
Full dmesg output from Fedora8 test2

Comment 6 Ed Lally 2007-09-20 19:58:03 UTC
Verified problem exists under Fedora8 test2.  uname -a reports:

Linux localhost.localdomain 2.6.23-0.164.rc5.fc8 #1 SMP Tue Sep 4 18:24:12 EDT
2007 x86_64 x86_64 x86_64 GNU/Linux

Please see prior attachment for full dmesg output.


Comment 7 Christopher Brown 2007-10-01 14:45:25 UTC
Hello,

I'm reviewing this bug as part of the kernel bug triage project, an attempt to
isolate current bugs in the fedora kernel.

http://fedoraproject.org/wiki/KernelBugTriage

I am CC'ing myself to this bug and will try and assist you in resolving it if I can.

I'm re-assigning to the firewire maintainers who may wish to review it and add
comments. I have also elevated the priority and severity as this could
potentially prevent a successful install of F8 if using external firewire drives.

Cheers
Chris

Comment 8 Stefan Richter 2007-10-01 16:44:02 UTC
Re comment #2 and comment #5:  Two PCI/OHCI-1394 devices appear in the
log.  Is this correct, i.e. are there two controllers in your machine?

Re opening comment:  TSB82AA2 is supported in principle.  I'm successfully using
a PCIe card with this chip and the new driver stack (with a different distro and
various kernel.org prerelease kernels though).  Right now I have no idea what
could cause the local ROM reads to fail.

What if you "modprobe -r firewire-ohci && modprobe firewire-ohci" later after boot?

Comment 9 Ed Lally 2007-10-02 11:36:06 UTC
Hi Chris,

I have your request for add'l info.  I'm currently traveling so it will likely
take a day or so until I can get the info requested.

Regards,

Ed


Comment 10 Ed Lally 2007-10-07 05:58:43 UTC
Hi Chris,

There are indeed two controllers on my machine.  One is the motherboard's
(Gigabyte GA-P35-DQ6 mobo) on-board controller and the other is a PCI card
supporting firewire 800.

I ran the commands you requested.  The commands locked up my USB keyboard for
several minutes, although control finally returned.  Oddly, my USB mouse was not
affected.  I am attaching dmesg output from the modprobe command forward.

After running the command, it looks like I can mount the external hard drive and
view contents, although all accesses (ls on a directory, fdisk -l on the
partition table) are extremely slow -- usually over a minute for initial access.
In addition, after moving down a few levels into the mounted directory, the
directory appears to be corrupted (please see attachment for that as well).

Regards,

Ed Lally


Comment 11 Ed Lally 2007-10-07 06:01:33 UTC
Created attachment 218561
dmesg output from removing/reloading firewire kernel modules

Comment 12 Ed Lally 2007-10-07 06:02:29 UTC
Created attachment 218571
Output from ls commands showing corrupted directory listings

Comment 13 Stefan Richter 2007-10-07 12:51:19 UTC
Ed, you are bitten by a whole swarm of bugs.  Where do I begin?


1.) firewire-core unable to access the cards if the modules are loaded early in
the boot sequence

I don't know why this is and what difference it makes to reload firewire-ohci
later.  I am using Gentoo Linux and they load firewire-ohci (or/and ohci1394,
depending on how I configured the kernel) in one of the init scripts based on
module aliases matching the PCI IDs or whatever.  Works for me with 1, 2, or 3
cards present (onboard 1394a, PCIe 1394b, CardBus 1394a).

So that's still a mystery to me.


2.) firewire-sbp2 blocking keyboard input when trying to add an SBP-2 device

This is fixed in -mm kernels by a patch pending for inclusion into mainline
2.6.24-rc1, "firewire: fw-sbp2: use an own workqueue (fix system responsiveness)".
http://marc.info/?l=linux1394-devel&m=118691816130507
http://me.in-berlin.de/~s5r6/linux1394/updates/2.6.22.y/patches/549-firewire-fw-sbp2-use-an-own-workqueue-fix-system-responsiveness.patch

Besides, not only keyboard input but many other kernel functions, even in
firewire-core, are negatively affected by firewire-sbp2's usage of the shared
workqueue.


3.) "status write for unknown orb" errors

This is fixed in mainline Linux 2.6.23-rc4 by patch "firewire: Add ref-counting
for sbp2 orbs (fix command abortion)"
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=e57d2011a6276d55a87f26653a0395f302ce0d51

These errors probably cause the corruption you saw with ls, the "FAT: Filesystem
panic", and perhaps several other errors in your dmesg log.


4.) "scsi scan: 96 byte inquiry failed"

Maybe this too was caused by the "unknown orb" error, or maybe it is a firmware
bug.  In the latter case, the driver can be instructed to use a different flavor
of inquiry: "modprobe firewire-sbp2 workarounds=2" before firewire-sbp2 is
auto-loaded or simply after a "modprobe -r firewire-sbp2".

The workarounds parameter is AFAIK not available in the kernel you are using. 
It is available in -mm kernels by a patch scheduled for 2.6.24-rc1:  "firewire:
fw-sbp2: expose module parameter for workarounds"
http://marc.info/?l=linux1394-devel&m=118691807906588
http://me.in-berlin.de/~s5r6/linux1394/updates/2.6.22.y/patches/548a-firewire-fw-sbp2-expose-module-parameter-for-workarounds.patch

If it is a firmware bug, i.e. only the workarounds parameter suppresses it, then
we should add a respective entry to firewire-sbp2's built-in device blacklist,
based on the dmesg output you would get with the parameter activated.

Jay, it may be appropriate to pull all the firewire-sbp2 fixes which went into
mainline during the 2.6.23-rc phase as well as those scheduled for 2.6.24-rc1
into the FC8 kernel, as far as they aren't already in there.  They may even be
OK for an FC7 kernel update, as long as you are still releasing those.  I didn't
push those 2.6.24-rc1 fixes to Linus already before 2.6.23 because their line
count and age seemed inappropriate to me for the late 2.6.23-rc phase. 
Maintaining the old ieee1394 drivers made me somewhat cautious about the speed
of mainline inclusion of bug fixes.  Have a look at
http://me.in-berlin.de/~s5r6/linux1394/updates/ for series of pending patches.

Comment 14 Stefan Richter 2007-10-07 12:56:23 UTC
PS, about "scsi scan: 96 byte inquiry failed":  If this error does not occur
with the old drivers, then it is not a firmware bug but a mere I/O error, i.e.
fallout from "status write for unknown orb".

Comment 15 Ed Lally 2007-10-21 01:54:56 UTC
Hi Chris and Stefan,

My apologies for the delayed reply to your notes.  Thanks for taking the time to
help identify my "swarm" -- I believe the metaphor "stung to death by gnats" may
be apropos here ;-)

At this point they are an inconvenience but not a showstopper.  I am content to
hold until the 2.6.24 kernels make it into FC7 or FC8 (as appropriate).

Thanks again for all your help.

Best regards,

Ed


Comment 16 Stefan Richter 2007-10-21 09:45:21 UTC
> At this point they are an inconvenience but not a showstopper.
> I am content to hold until the 2.6.24 kernels make it into FC7
> or FC8 (as appropriate).

Well, as mentioned, I recommend to the Fedora kernel package managers that they
pull all of the 2.6.24-rc1 changes to the firewire drivers (except 2.6.24
specific kernel API changes of course) over into the 2.6.23 based Fedora
kernels.  I would have sent almost all of those changes to Linus before his
2.6.23 release if I had anticipated how long the 2.6.23-rc phase would stretch.

Comment 17 Roberto Malinverni 2007-12-05 21:26:43 UTC
The issue persists with F8, kernel 2.6.23.1-49.fc8; I can access my firewire HD
only with the modules from ATRPMS.

Comment 18 Christopher Brown 2008-01-03 22:54:35 UTC
(In reply to comment #17)
> The issue persists with F8, kernel 2.6.23.1-49.fc8; I can access my firewire HD
> only with the modules from ATRPMS.

2.6.24 is almost upon us and, as Stefan has indicated, contains a raft of
updates. Please could you test with this when it arrives (or even with a rawhide
kernel if you are able) and report back.

Regards
Chris

Comment 19 Ed Lally 2008-01-10 22:18:18 UTC
Chris,

I'm now on Fedora 8 with kernel 2.6.23.9-85.fc8.  I will test with 2.6.24 as 
soon as I can get it.

Cheers,

Ed


Comment 20 Jarod Wilson 2008-01-18 19:50:00 UTC
The latest koji F8 kernel is also worth trying, and carries the same firewire
updates as rawhide kernels. As of right now, that would be:

http://koji.fedoraproject.org/packages/kernel/2.6.23.14/111.fc8/

Comment 21 Bastien Montagne 2008-01-30 14:50:42 UTC
I have the same kinds of problems with Fedora 8 ("DVD" release, i386):
Nothing at boot time, but here are "dmesg" logs when plugging in my camera:

#When plugging in the camera on my Pinnacle DV500+ (dmesg):
[fedora@fedora ~]$ dmesg 
[...]
firewire_ohci: node ID not valid, new bus reset in progress
firewire_ohci: node ID not valid, new bus reset in progress
firewire_core: created new fw device fw2 (0 config rom retries, S100)
firewire_core: phy config: card 1, new root=ffc1, gap_count=5
firewire_core: BM lock failed, making local node (ffc0) root.
firewire_core: phy config: card 1, new root=ffc0, gap_count=5
firewire_core: phy config: card 1, new root=ffc1, gap_count=5
firewire_core: giving up on config rom for node id ffc1

#When plugging in the camera on my Asus Nvidia Nforce2 board (dmesg):
[fedora@fedora ~]$ dmesg 
[...]
firewire_core: BM lock failed, making local node (ffc0) root.
firewire_core: phy config: card 0, new root=ffc0, gap_count=5
firewire_core: BM lock failed, making local node (ffc0) root.
firewire_core: phy config: card 0, new root=ffc0, gap_count=5
firewire_core: BM lock failed, making local node (ffc0) root.
firewire_core: phy config: card 0, new root=ffc0, gap_count=5
firewire_core: BM lock failed, making local node (ffc0) root.
firewire_core: phy config: card 0, new root=ffc0, gap_count=5
firewire_core: BM lock failed, making local node (ffc0) root.
firewire_core: phy config: card 0, new root=ffc0, gap_count=5
firewire_core: BM lock failed, making local node (ffc0) root.
firewire_core: phy config: card 0, new root=ffc0, gap_count=5
firewire_core: BM lock failed, making local node (ffc0) root.
firewire_core: phy config: card 0, new root=ffc0, gap_count=5
firewire_core: BM lock failed, making local node (ffc0) root.
firewire_core: phy config: card 0, new root=ffc0, gap_count=5
#... and so on, ending with a system freeze!
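
A repeating pattern like the one above can be spotted automatically in a saved log. The following is a minimal Python sketch, not part of the original report; the helper names and the `threshold` default are hypothetical, and the heuristic is only a rough indicator of the loop described here.

```python
# Marker text taken from the dmesg lines quoted in this comment.
BM_LOCK_MARKER = "firewire_core: BM lock failed"

def count_bm_lock_failures(dmesg_text):
    """Count 'BM lock failed' messages in dmesg text."""
    return sum(1 for line in dmesg_text.splitlines() if BM_LOCK_MARKER in line)

def looks_like_reset_loop(dmesg_text, threshold=5):
    """Heuristic only: many failures suggest the endless loop described here.

    The default threshold is an arbitrary assumption, not a kernel constant.
    """
    return count_bm_lock_failures(dmesg_text) >= threshold

# Eight repetitions of the message pair seen in the excerpt above.
SAMPLE = (
    "firewire_core: BM lock failed, making local node (ffc0) root.\n"
    "firewire_core: phy config: card 0, new root=ffc0, gap_count=5\n"
) * 8

if __name__ == "__main__":
    print(count_bm_lock_failures(SAMPLE), looks_like_reset_loop(SAMPLE))
```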


Now, various tests with both dvgrab and a GStreamer setting:

#First test, with dvgrab:

[root@fedora fedora]# dvgrab -i
Found AV/C device with GUID 0x00008500008cec1e
ioctl call failed, retval = -1
ieee1394io.cc:460: In function "virtual bool iec61883Reader::StartReceive()":
"iec61883_dv_fb_start( m_iec61883.dv, channel )" evaluated to -1
"Loading Medium" ff:ff:ff:ff ""          sec      

#and dvgrab can't capture anything nor quit (I have to close the console...),
#It even freezes all the system when I unplug the camera (I have to do a RESET!).


#I also tested with GStreamer (after a reset, with same dmesgs):

[root@fedora fedora]# gst-launch dv1394src ! decodebin name=d ! queue !
audioconvert ! audioresample ! alsasink d. ! ffmpegcolorspace ! xvimagesink
Setting pipeline to PAUSED ...
ioctl call failed, retval = -1
ERROR: Pipeline doesn't want to pause.
ERROR: from element /pipeline0/dv1394src0: Could not read from resource.
Additional debug info:
gstdv1394src.c(866): gst_dv1394src_start (): /pipeline0/dv1394src0:
can't start 1394 iso receive
Setting pipeline to NULL ...
FREEING pipeline ...

#No exit problems here...

I will test with the new kernel...

Bastien.

Comment 22 Jarod Wilson 2008-01-30 15:11:24 UTC
Hi Bastien (and others!),

The 'giving up on config rom' problem should be fixed in rawhide now. I still
need to backport the fixes to F8 and F7 though. Not sure what the deal is with
the nforce2 system, but it would definitely be worth testing the latest Fedora 8
kernel (or even better, a rawhide kernel), as well as updating misc userspace
bits (particularly dvgrab and libraw1394). We've done a lot of work on this
front just recently to greatly improve the situation, need to know if we've
still got more work to do here or if we've already fixed your problem...

Comment 23 Stefan Richter 2008-01-30 16:17:59 UTC
nForce2: bug 244576

Comment 24 Jarod Wilson 2008-01-30 16:29:38 UTC
Ah yes, I thought I'd seen that before... :)

Comment 25 Ed Lally 2008-02-01 02:22:14 UTC
I've updated to a newer kernel: Linux strauss 2.6.23.14-107.fc8 #1 SMP Mon Jan
14 22:07:11 EST 2008 x86_64 x86_64 x86_64 GNU/Linux

Three of the errors in comment #13 have been resolved -- "firewire-sbp2 blocking
keyboard input when trying to add an SBP-2 device", "status write for unknown
orb", and "scsi scan: 96 byte inquiry failed".

Per Jarod's suggestion, I tried the koji F8 kernel, but realized that the
problems were already resolved with the earlier kernel from fedora-updates, so
I went back.

The only issue remaining is the first one -- "firewire-core unable to access the
cards if the modules are loaded early in the boot sequence", which impacts an
external hard drive and external CD-RW drive.

I put the command "modprobe -r firewire-ohci && modprobe firewire-ohci" in
/etc/rc.local to no effect.  However, if I run the same command after logging in
to GNOME, it works just fine -- the drives are recognized and mounted under
/media.  
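
For anyone wanting to script the reload workaround described above, here is a minimal Python sketch. It only wraps the two `modprobe` invocations quoted in this report; the helper names `reload_commands` and `reload_module` and the injectable `run` parameter are hypothetical, and actually running the reload requires root privileges.

```python
import subprocess

def reload_commands(module="firewire-ohci"):
    """Build the two commands from the workaround: unload, then load."""
    return [["modprobe", "-r", module], ["modprobe", module]]

def reload_module(module="firewire-ohci", run=subprocess.run):
    """Run the unload/load pair, stopping at the first failure (like '&&').

    `run` is injectable so the sequence can be exercised without root.
    """
    for command in reload_commands(module):
        result = run(command)
        if result.returncode != 0:
            return False
    return True

if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for command in reload_commands():
        print(" ".join(command))
```

Calling `reload_module()` as root would perform the actual reload; the dry run above only prints the two commands.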

I am attaching outputs from lsmod before and after reloading the firewire
modules.  I am also attaching dmesg output from startup through accessing the
drives.  Also, if it helps, my smolt page is at
http://www.smolts.org/show?UUID=c413b36f-7ba0-405c-ad84-98d4ae3bfb52

Please let me know if there's anything else I can try.

Thanks!

- Ed


Comment 26 Ed Lally 2008-02-01 02:23:33 UTC
Created attachment 293681
newer dmesg output from removing/reloading firewire kernel modules

Comment 27 Ed Lally 2008-02-01 02:24:38 UTC
Created attachment 293682
lsmod showing loaded modules before/after reloading firewire_ohci

Comment 28 Ed Lally 2008-02-01 12:49:21 UTC
Take back my earlier report...  I did some load testing by rsync'ing a directory
from another computer and ran into a bunch of buffer IO errors within a few
seconds.  I've attached dmesg output.  I'll try moving back up to the latest
koji kernel to see if that fixes the problem.

Comment 29 Ed Lally 2008-02-01 12:50:06 UTC
Created attachment 293720
Buffer IO errors under load

Comment 30 Ed Lally 2008-02-03 00:38:56 UTC
I'm having problems even with koji kernel "Linux strauss 2.6.23.14-123.fc8 #1
SMP Fri Jan 25 19:54:41 EST 2008 x86_64 x86_64 x86_64 GNU/Linux".  

I'm testing the drive by rsyncing a directory from "bach" to the server
"strauss" (the one that has the firewire drive) over the LAN.  The rsync moves
along just fine for a while, but then pauses for about 30 seconds with no
apparent LAN or disk activity.  I get I/O errors followed by the message
"kernel: bad page state in process 'swapper'" appearing on the console. 
Sometime later, the computer with the drive will invariably crash (screen,
keyboard, and network all go dead) and require a reboot.

Also, the drives are still not recognized at boot -- I have to execute "modprobe
-r firewire-ohci && modprobe firewire-ohci" to have them detected.

Dmesg output is attached.

Comment 31 Ed Lally 2008-02-03 00:40:01 UTC
Created attachment 293810
dmesg output with koji kernel

Comment 32 Jarod Wilson 2008-02-04 15:01:58 UTC
Hi Ed,

From your dmesg output, it looks like the latest rawhide/devel kernel might get
your disks working on boot, as you're hitting the 'giving up on config rom'
problem, detailed in bug 429598. Please give that a spin and report back, and/or
wait until I get the backports to the F8 kernel done...

Comment 33 Stefan Richter 2008-02-04 17:28:33 UTC
Re attachment 293810:
> Feb  2 19:25:20 strauss kernel: sd 15:0:0:0: [sde] Result:
> hostbyte=DID_BUS_BUSY driverbyte=DRIVER_OK,SUGGEST_OK
> Feb  2 19:25:20 strauss kernel: end_request: I/O error, dev sde,
> sector 38971935
> Feb  2 19:25:20 strauss kernel: sd 15:0:0:0: rejecting I/O to offline device
> Feb  2 19:25:20 strauss kernel: sd 15:0:0:0: [sde] Result:
> hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK

DID_BUS_BUSY typically happens when a bus reset occurs.
DID_NO_CONNECT happens when the device was unplugged.

Well, you apparently did not unplug it, but there might have been noise on the
bus which inspired the controller to send a "self ID complete" event to the
drivers, without self ID of the disk --- or with firewire-core misinterpreting
the self ID buffer.

I saw something similar infrequently happen on my test setup:  When I plugged
something in to a bus with already a few nodes present, firewire-core
misinterpreted this as an existing device going away, rather than a new one
joining the bunch.
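
To separate the two conditions in a long log, the hostbyte values can be extracted and counted. This is a minimal Python sketch, not part of the original comment; the `HOSTBYTE_MEANING` table only restates the explanations given above, and the function name and regular expression are assumptions based on the quoted log lines.

```python
import re
from collections import Counter

# Meanings restated from this comment, not an exhaustive SCSI reference.
HOSTBYTE_MEANING = {
    "DID_BUS_BUSY": "typically a bus reset occurred",
    "DID_NO_CONNECT": "device was (or appeared to be) unplugged",
}

HOSTBYTE = re.compile(r"hostbyte=(DID_[A-Z_]+)")

def summarize_hostbytes(log_text):
    """Count each hostbyte value found in kernel log text."""
    return Counter(HOSTBYTE.findall(log_text))

# Sample built from the quoted lines, with timestamps and hostname dropped.
SAMPLE = """\
sd 15:0:0:0: [sde] Result: hostbyte=DID_BUS_BUSY driverbyte=DRIVER_OK,SUGGEST_OK
end_request: I/O error, dev sde, sector 38971935
sd 15:0:0:0: rejecting I/O to offline device
sd 15:0:0:0: [sde] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
"""

if __name__ == "__main__":
    for code, count in summarize_hostbytes(SAMPLE).items():
        print(code, count, HOSTBYTE_MEANING.get(code, "not covered here"))
```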

Comment 34 Ed Lally 2008-02-08 04:23:56 UTC
Hi folks,

I loaded up the latest rawhide kernel and the drives were detected on boot --
woohoo!

Unfortunately the other problems with I/O buffers, etc., are still there.

Regarding Stefan's suggestion, both the drives in question (external CD burner
and external HD) are on two separate firewire buses, and each is the only device
on its bus.  The HD is attached to the motherboard's bus; the burner is attached
to a TI firewire 800 PCI card.

Please let me know if there's anything else I can try to work around or
troubleshoot this.

Cheers,

Ed


Comment 35 Jarod Wilson 2008-02-08 18:59:18 UTC
Ed, exactly what kernel version was that with? I suspect some additional patches
we have queued up for rawhide, which haven't yet been in a build due to some
issues with gcc 4.3, might further help your situation.

Comment 36 Ed Lally 2008-02-08 20:20:16 UTC
Jarod -- it's 2.6.24-17.fc9.  Architecture is x86_64.

Comment 37 Stefan Richter 2008-02-08 20:44:20 UTC
Patches "firewire: fw-sbp2: fix I/O errors during reconnect" and "firewire:
fw-sbp2: preemptively block sdev" may be beneficial to Ed's setup.

I suspect the ultimate problem is electrically unstable hardware here, but the
patches should make things smoother even for unreliable hardware.

The issue described in http://marc.info/?l=linux1394-devel&m=120237058319592
needs to be addressed eventually as well.  It is hopefully not of immediate
importance to Ed's setup though.

Comment 38 Stefan Richter 2008-02-08 20:48:26 UTC
Reference for comment #37:  http://lkml.org/lkml/2008/2/3/195

Comment 39 Chuck Ebbert 2008-02-08 21:16:11 UTC
(In reply to comment #36)
> Jarod -- it's 2.6.24-17.fc9.  Architecture is x86_64.

-23 is the latest.

Comment 40 Bastien Montagne 2008-02-13 18:56:53 UTC
Hi everybody!

I tried with kernel 2.6.24-7.fc9:

#at start-up (dmesg):

firewire_ohci: Added fw-ohci device 0000:00:0d.0, OHCI version 1.10-
firewire_ohci: Added fw-ohci device 0000:02:0c.0, OHCI version 1.0
firewire_core: created new fw device fw0 (0 config rom retries, S400)
firewire_core: created new fw device fw1 (0 config rom retries, S400)

#when plugging in the camera (motherboard, nForce2), I still have the
#same problem (endless messages, and final freeze)

#when plugging in the camera (DV500+, after a "reset", dmesg):

firewire_ohci: node ID not valid, new bus reset in progress
firewire_ohci: node ID not valid, new bus reset in progress
firewire_core: created new fw device fw2 (0 config rom retries, S100)
firewire_core: phy config: card 1, new root=ffc1, gap_count=5
firewire_core: BM lock failed, making local node (ffc0) root.
firewire_core: phy config: card 1, new root=ffc0, gap_count=5
firewire_core: phy config: card 1, new root=ffc1, gap_count=5
firewire_core: giving up on config rom for node id ffc1

#and after running dvgrab:

firewire_ohci: context_stop: still active (0x40000411)
dvgrab[3128]: segfault at b0bd6008 eip 0069119e esp bfced110 error 4


#everything seems to work here, but I have problems when exiting the
#capture apps:

#dvgrab:

[root@fedora fedora]# dvgrab -i ./ttt.avi
Found AV/C device with GUID 0x00008500008cec1e
Going interactive. Press '?' for help.

"stdout": buffer underrun near: timecode 00:4875865:-1993832910.00 date
????.??.?? ??:??:??
This error means that the frames could not be written fast enough.
q=quit, p=play, c=capture, Esc=stop, h=reverse, j=backward scan, k=pause        
l=forward scan, a=rewind, z=fast forward, 0-9=trickplay, <space>=play/pause
Capture Started" ff:ff:ff:ff ""          sec                                    
"./ttt001.avi":    39.40 MiB 277 frames timecode 00:250000000:-1076964892.03
date 2008.02.03 18:15:08
Capture Stopped
Warning: 1 dropped frames.
Erreur de segmentation:ff:ff ""          sec                                    

#except for the "segfault" when exiting, everything seems okay!


#gstreamer:

[root@fedora fedora]# gst-launch dv1394src ! decodebin name=d ! queue ! 
                                 audioconvert ! audioresample ! alsasink d. !
                                 ffmpegcolorspace ! xvimagesink
Setting pipeline to PAUSED ...
Pipeline is live and does not need PREROLL ...
Setting pipeline to PLAYING ...
New clock: GstSystemClock
Caught interrupt -- handling interrupt.
Interrupt: Setting pipeline to PAUSED ...
Execution ended after 33318467000 ns.
Setting pipeline to PAUSED ...
Setting pipeline to READY ...
Caught SIGSEGV accessing address 0xb6886004
#0  0x00110402 in _start () from /lib/ld-linux.so.2
#1  0x0065647b in ?? ()
#2  0x009c1218 in ?? ()
#3  0x00000000 in ?? ()
Spinning.  Please run 'gdb gst-launch 3145' to continue debugging, Ctrl-C to
quit, or Ctrl-\ to dump core.

#Here everything works well too, until the exit (and another "segfault"!)
#I can't quit by closing the video window: I have to do 'ctrl-C' in the terminal
#I can't do a dump (french keyboard: '\' with 'altgr', doesn't work...)
#The video window doesn't even close until I close the terminal!

#Running gdb:

[root@fedora sdb]# gdb gst-launch 3318
GNU gdb Red Hat Linux (6.6-35.fc8rh)
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...

warning: Missing the separate debug info file:
/usr/lib/debug/.build-id/ec/a38595da00301898debe867d96a6c3b13a0201.debug
(no debugging symbols found)
Using host libthread_db library "/lib/libthread_db.so.1".
Attaching to program: /usr/bin/gst-launch, process 3318

warning: Missing the separate debug info file:
/usr/lib/debug/.build-id/ac/2eeb206486bb7315d6ac4cd64de0cb50838ff6.debug
(no debugging symbols found)
(no debugging symbols found)
0x00110402 in _start () from /lib/ld-linux.so.2
(gdb) bt
#0  0x00110402 in _start () from /lib/ld-linux.so.2
#1  0x00655d26 in ?? ()
#2  0x00000000 in ?? ()

#With no "debug" version, nothing new... even if the back trace isn't
#exactly the same ?!?
#But killing gst-launch here deletes the video window (as expected...)


I hope this might help... and I'll continue to test with newest versions
when I can...

Comment 41 Jarod Wilson 2008-02-25 18:23:56 UTC
So we have a few different bugs that have ended up in here... Here's what I'd
like to do:

1) The original bug reported by Roberto, "giving up on config rom", should be
resolved -- this was actually tracked in bug 429598. Roberto, please confirm if
you would, though. Ed hit this too, initially, but has confirmed it to be
resolved for him. Given that I believe the original problem is fixed, I'm going
to close this bug.

2) Ed's additional issues are listed in comment #13, all of which have been
resolved, save the I/O buffer problems. I'd like to open a new bug for this
issue, if it's still a problem with the latest rawhide kernel.

3) Bastien's non-working nForce 2 controller is already being tracked separately
in bug 244576.

4) Bastien's segfault-on-exit of dvgrab seems to be similar to bug 243081, would
like to track it over there.


Comment 42 Roberto Malinverni 2008-02-25 19:46:28 UTC
I'll try the new kernel as soon as possible and report back if it won't work.
Thanks all for your interest in addressing this issue.

