Description of problem: Version-Release number of selected component (if applicable): dvgrab 3.1 as shipped with F9 How reproducible: completely Steps to Reproduce: 1. connect camcorder via 1394 2. execute (as root, else it fails to even find the camera - that's bug 445100) "dvgrab -d 2s" Actual results: rom1394_0 warning: read failed: 0x0000fffff000040c Found AV/C device with GUID 0x00000000b0f1b832 send oops "" 0.00 MiB 0 frames Capture Stopped Error: no DV Expected results: What we get with the workaround, shown below. Workaround: dmesg -n8 ; echo 1 > /sys/module/firewire_ohci/parameters/debug Result of "dvgrab -d 2s" after workaround: Found AV/C device with GUID 0x00804580b0f1b832 Capture Started "dvgrab-015.dv": 6.87 MiB 60 frames timecode 00:05:31.29 date 2000.01.15 21:00:00 Capture Stopped Additional info: Kernel: 2.6.25.3-18.fc9.i686 #1 SMP Tue May 13 05:38:53 EDT 2008 i686 i686 Running under strace (and without enabling the debug output) also makes it work. Anyone smell a timing issue? :-/ I'd be happy to provide additional info and/or test patches.
"lspci -tv" and kernel .config may be good to know.
lspci -tv: -[0000:00]-+-00.0 Intel Corporation 82830 830 Chipset Host Bridge +-01.0-[0000:01]----00.0 ATI Technologies Inc Radeon Mobility M6 LY +-1d.0 Intel Corporation 82801CA/CAM USB Controller #1 +-1d.1 Intel Corporation 82801CA/CAM USB Controller #2 +-1e.0-[0000:02-06]--+-04.0 Conexant HSF 56k HSFi Modem | +-05.0 Texas Instruments TSB43AB22/A IEEE-1394a-2000 Controller (PHY/Link) | +-06.0 Texas Instruments PCI1410 PC card Cardbus Controller | \-08.0 Intel Corporation 82801CAM (ICH3) PRO/100 VE (LOM) Ethernet Controller +-1f.0 Intel Corporation 82801CAM ISA Bridge (LPC) +-1f.1 Intel Corporation 82801CAM IDE U100 Controller +-1f.3 Intel Corporation 82801CA/CAM SMBus Controller \-1f.5 Intel Corporation 82801CA/CAM AC'97 Audio Controller The kernel I'm running is the Fedora released update kernel: 2.6.25.3-18.fc9.i686 Should I dig out and attach the .config for it?
Oh! Another TSB43AB22/A... Chris, how much memory do you have in this PC? pg
Created attachment 308227 [details] /boot/config-2.6.25.3-18.fc9.i686 redundant attachment for those weirdos who don't have Fedora, or net even RPM
Thanks, Stefan. "free" says: total used free shared buffers cached Mem: 254588 239288 15300 0 17412 85132 -/+ buffers/cache: 136744 117844 Swap: 2228208 79632 2148576
Is there anything I can do to help track this down (in less time than the several days it would take me to come up to speed on the code)? I'm happy to test patches, collect debugging info, etc. This is a significant issue potentially blocking upgrades from FC6 (the last release where firewire worked properly) to F9 for the university researchers I support.
It's a pity that the immediately available debug facility lets the problem vanish. For now I have no idea where to start looking (and things other than Linux keep me busy). First things you could try though would be different kernel config options, like group scheduling off, tickless system off, preemptible kernel (low latency desktop), ...
I think it did fail a couple of times with debug output enabled - I can keep trying. What debug setting would be best, if I can get it to fail and capture the output?
I would say try reproduce it first with 7, if this works too well test with 3, if it is still not reproducible test with 1. A log of a failing session for comparison with a log of a working session would be good to have.
Maybe the bug is in the libraw1394's juju code. AFAIU, after it sent a FW_CDEV_IOC_SEND_REQUEST ioctl(), it has to wait for and read() an fw_cdev_event_response with a matching .closure from the device file. fw_cdev_event_*s could of course arrive out of order, and libraw1394 has to cope with that.
Created attachment 308644 [details] fw-cdev-log-request-response.patch This patch adds a runtime option to firewire-core to log send-request ioctls and response events which go through the character device file interface. AFAICS "send oops" is an error message from libavc1394 which is emitted if a write transaction to the FCP_COMMAND register of the camera went wrong (for any reason).
PS: Please apply the patch and test with firewire-ohci's debug logging off and firewire-core's debug logging on.
(In reply to comment #3) > Oh! Another TSB43AB22/A... > Chris, how much memory do you have in this PC? I *finally* found a system here in the office with this controller, and have borrowed some memory for it to knock its total up to 3GB. Will beat on it some tomorrow...
Created attachment 308959 [details] fw-cdev debug log dump from tsb43ab22/a in machine w/8G RAM Here's a fw-cdev debug log dump during a few seconds of attempting to capture video from a dv camera attached to a TSB43AB22/A firewire controller onboard a Tyan Thunder n6650W motherboard w/8G RAM. # dvgrab Found AV/C device with GUID 0x0000850000667a8d ^C"" 134217432.00 MiB 0 frames Capture Stopped Error: no DV
To make sure we've got our cross-references right, just want to add a note that this is starting to look like its actually a duplicate of bug 435550.
Re comment #15: According to the Intel 830 chipset datasheet, main system SDRAM memory ranges from 1MB to 1GB. Hence Chris does not get physical addresses above 2G which seem to be the trigger of bug 435550. Furthermore, comment #0 looks like already asynchronous communication with the camera fails, i.e. that FCP requests could not be sent. (The error output from dvgrab/libavc could be misleading though.)
Oh, crap, I read Chris' free output as saying 2.5G of RAM, rather than 256MB.
Created attachment 309400 [details] debug output with firewire_core patch Sorry for the delay. I've attached the output with /sys/module/firewire_core/parameters/debug=1 and /sys/module/firewire_ohci/parameters/debug=0 when trying to run dvgrab with the same camera I've been using (a Panasonic PV-DV402D). I have borrowed another camera, a Sony DVR-TRV27; with this camera, dvgrab works fine even without the patch. I should have both cameras for the next week or so, so please let me know ASAP if you'd like any additional information gathered.
(In reply to comment #18) > I have borrowed another camera, a Sony DVR-TRV27; with this camera, dvgrab > works fine even without the patch. s/even without the patch/even with all debugging turned off/
In case it is a clue: I notice that sometimes the GUID that dvgrab mentions has the high 32 bits zeroed rather than matching the GUID shown by the kernel module; also sometimes it works (successfully grabs video) and sometimes it doesn't. However, the two things do not correlate; I've seen it fail with the real GUID, fail with the half-right GUID, and succeed both ways as well. Example: Jun 15 10:53:04 localhost kernel: firewire_core: created device fw1: GUID 00804580b0f1b832, S100 ... dvgrab says: Found AV/C device with GUID 0x00804580b0f1b832 send oops Capture Started "dvgrab-028.dv": 6.87 MiB 60 frames timecode 00:03:56.23 date 2000.01.15 21:00:00 Capture Stopped (that's a success, despite the "send oops", which can appear or not in various quantities) and other times (with camera still plugged in, so above kernel module output still applies), dvgrab says: send oops send oops send oops send oops rom1394_0 warning: read failed: 0x0000fffff000040c Found AV/C device with GUID 0x00000000b0f1b832 Warning: Cannot set RR-scheduler Warning: Cannot disable swapping send oops Capture Started "dvgrab-029.dv": 6.87 MiB 60 frames timecode 00:04:00.06 date 2000.01.15 21:00:00 Capture Stopped
(In reply to comment #20) > Warning: Cannot set RR-scheduler > Warning: Cannot disable swapping Those warnings are not related, by the way - I happened to run that one without sudo, as I was trying to test for bug 445100.
(In reply to comment #18) > I have borrowed another camera, a Sony DVR-TRV27; with this camera, dvgrab > works fine ... and does not print all the "send oops", "rom1394_0 warning", etc. messages. It also seems to consistently get the right GUID.
Do problems persist with the latest rawhide kernel?
Ah, a Panasonic PV-DV402D. And attachment 309400 [details] shows lots of "busy" ACKs. What if you start the camcorder manually and run "dvgrab -noavc"? I suspect this is the same as Fedora bug 477279 (Panasonic PV-DV401) and as dvgrab bug https://sourceforge.net/tracker/index.php?func=detail&aid=2492640&group_id=14103&atid=114103 (Panasonic NV-MX500EG). The latter bug is actually with an Ubuntu kernel which uses the old ieee1394 drivers.
Proposed fix: http://marc.info/?l=linux1394-devel&m=123247509617629
Patch from comment 25 was merged into Linus' linux-2.6.git a week ago, mailed to kernel.org's stable list just now. Jarod, is there a Fedora kernel with patch "firewire: ohci: increase AT req. retries, fix ack_busy_X from Panasonic camcorders and others" included? Chris, does it fix this issue for you?
I too have been having problems with firewire_core and would be willing to test the updated kernel on the following system: Panasonic PV-DV73D camera + Panasonic AG-DV2500 tape deck Belkin F5U506 ExpressCard (104c:8235) Lenovo ThinkPad T60p Stock Fedora 10 with latest updates. Linux machine 2.6.27.12-170.2.5.fc10.x86_64 #1 SMP Wed Jan 21 01:33:24 EST 2009 x86_64 x86_64 x86_64 GNU/Linux []# free total used free shared buffers cached Mem: 2059360 785412 1273948 0 19128 223652 -/+ buffers/cache: 542632 1516728 Swap: 2249092 0 2249092 Strace and firewire_ohci in debug mode don't seem to affect the problem. Cheers, George
The patch from comment 25 is available in upstream released kernels 2.6.29-rc4, 2.6.28.5, and 2.6.27.16.
On 2.6.27.19-170.2.35.fc10.x86_64 the PV-DV73D camera works as expected. However, dvgrab fails to capture from a Panasonic AG-DV2500 tape deck, producing "giving up on config rom" and "error reading config rom directory" messages.
I have a Sony DCR-TRV19 miniDV camera that is having similar issues under 2.6.27.19-170.2.35.fc10.x86_64, dvgrab-3.2-1.fc10.x86_64, libraw1394-2.0.1-1.fc10.x86_64, libavc1394-0.5.3-4.fc10.x86_64, and libiec61883-1.2.0-1.fc10.x86_64. AV/C controls work, it will rewind and attempt to capture video, but gives up with: [dave@obelisk video]$ dvgrab -showstatus -rewind Found AV/C device with GUID 0x08004601021db4d7 Warning: Cannot set RR-scheduler Warning: Cannot disable swapping "" 0.00 MiB 0 frames Capture Stopped Error: no DV The machine has 6 GB of memory, and the 1394 controller is a "Texas Instruments TSB43AB23 IEEE-1394a-2000 Controller" according to lspci. I can capture logs, etc., but it will be about a week before I can circle back to this.
Re comment #30: This is unlikely to be bug 449252, which is about failure of asynchronous I/O (reading the device's configuration ROM, setting up the AV/C output plug, controlling the tape and starting the stream). Please check whether it works with RAM limited to 2 GB. See https://bugzilla.redhat.com/show_bug.cgi?id=435550#c82 and https://bugzilla.redhat.com/show_bug.cgi?id=435550#c84 (Bug 435550 was about TSB43AB22(A); maybe TSB43AB23 or some revisions of it are affected too.)
Re comment #29, remaining failures with AG-DV2500 tape deck: Please try the following: # echo 3 > /sys/module/firewire_ohci/parameters/debug then connect the tape deck, run dvgrab, and /if/ it fails, post the output of dvgrab and the contents of the kernel log here, starting at the point at which the tape deck was connected. (You can switch off the log noise with echo 0 > ... again. If we are unlucky, the failures vanish under debug logging load like in comment 0.)
Created attachment 336233 [details] dvgrab output + kernel messages after attachment of AG-DV2500 tape deck
Taking a look at my own debug log (#33), I think the GUID may have been misidentified. Using the old firewire stack, the tape deck is accessible from 0080458280021A70 (although 0102030400000c24 is also listed).
This message is a reminder that Fedora 9 is nearing its end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 9. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '9'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 9's end of life. Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 9 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug to the applicable version. If you are unable to change the version, please add a comment here and someone will do it for you. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Hm... We do have multiple open firewire/dvgrab bug reports that include mention of TSB43AB23 chips now... (well, two anyway -- bug 471708 mentions this chip). Moving bug forward to F10, since it still exists there (and likely in F11 too).
Would an ssh account to a machine with the tape deck attached make fixing this bug possible? (The bug has been present starting from Fedora 7.)
This message is a reminder that Fedora 10 is nearing its end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 10. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '10'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 10's end of life. Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 10 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug to the applicable version. If you are unable to change the version, please add a comment here and someone will do it for you. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Evidently the affected devices are not spec compliant as they react in strange ways on (1.) config ROM read requests by libraw1394 clients even though firewire-core successfully got the ROM initially, or on (2.) FCP write requests which they might not tolerate in a frequency as dvgrab etc. send them. One idea to improve things here is to change libraw1394 to redirect all config ROM read requests into the kernel's config ROM cache. That would be a useful optimization anyway. I take not to try to implement this when I get some spare time. I.e. in ten years or so. :-(
I wasn't aware there were hardware-level problems... Is there a list of known-good devices available?
Fedora 10 changed to end-of-life (EOL) status on 2009-12-17. Fedora 10 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. Thank you for reporting this bug and we are sorry it could not be fixed.
Re AG-DV2500 tape deck and comment 39: Grr, no, the failure mode per attachment 336233 [details] (comment 33) is different from what was described in the other comments about various camcorders. As can be seen clearly in the log, the kernel successfully reads 148 bytes from the ROM (offsets 0x400...0x490), then starts over at offset 0x400 for no obvious reason, does that a few times, then gives up and does not create a /dev/fw* for the tape deck. To debug this, someone has to sit down with attachment 336233 [details] and the IEEE 1212 spec and check the config ROM for something that irritates firewire-core. I'm afraid that someone is me. BTW, the GUID part of the config ROM at offsets 0x40c and 0x410 matches the GUID that the old stack reads (comment 34). The other GUID and the associated device fw0 belong to the FireWire controller in the PC. George, this bug is different from the initial bug description. Please open a new bug, copy the information from comment 33, attachment 336233 [details], comment 34, comment 42 there, and add me to the CC list. If you are not permitted to do the latter, leave a comment here with the number of the new bug.
Created attachment 394655 [details] config ROM of AG-DV2500 as seen by firewire-core on 2009-03-22 George, I finally checked the debug output which you attached a year ago (comment 33). There are several blocks in the configuration ROM data which are read as zero although they need to be non-zero. And this even though there were repeated attempts to read this. Perhaps the AG-DV2500 is confused by the order in which data is read. Do you still have this device and can access it through the older ieee1394 kernel driver stack?
Yes on both counts. I'm also happy to ship it off, too. -George
George, shipping it to me would be too costly and rather slow. More analysis: According to attachment 336233 [details] (shows which quadlets were read when, and that there were no I/O errors) and to what I gather from linux/drivers/firewire/core-device.c::read_bus_info_block, the read order is not the issue. Rather, the root directory of AG-DV2500's config ROM contains an invalid entry, shown as 430 80021a70 --> leaf at 86df0 in attachment 394655 [details]. firewire-core only accepts leaves and directories which are located at offsets between 400 and 800 exclusive (or to be precise: between 0xfffff0000400 and 0xfffff0000800 exclusive). IEEE 1212 also has a provision for leaves and directories that live at larger offsets; these are not implemented in firewire-core but the AG-DV2500 does not actually make use of that special provision. (IEEE 1212 clause 7.7.18 for the curious.) The high byte of 0x80, meaning "type = leaf, key_ID = 0", indicates that this is not deliberately but a mistake, because key_ID 0 is not defined in IEEE 1212:1994 nor 1212:2001 but also not allowed to be used for vendor-specific data. Currently, firewire-core will reject any leaf or directory pointer that is too large, fail device creation, and consequently block userspace from accessing the device. If it did at least emit a log message about this kind of failure, somebody might have fixed this issue in a much more timely manner... Let me browse the firewire-core sources a bit more to see what needs to be done to switch it from "fail" to "ignore".
Proposed fix: http://marc.info/?l=linux1394-devel&m=126645431512497 George, do you have the means to patch and recompile the kernel (modules) and give it a go? http://fedoraproject.org/wiki/Building_a_custom_kernel The patch was generated against a recent vanilla kernel; if you need something different, please tell me.
George, the fix from comment has been released with mainline kernel 2.6.34-rc1. Would you please try to install kernel-2.6.34-0.10.rc1.git0.fc14 (http://koji.fedoraproject.org/koji/buildinfo?buildID=160780) or any newer 2.6.34 based kernel (http://koji.fedoraproject.org/koji/packageinfo?packageID=8) and test whether the Panasonic AG-DV2500 is accepted by the kernel and made accessible to dvgrab now? If the fix works like intended, I would submit a backport of the fix to mainline 2.6.33.y and also to 2.6.32.y. If you want it to show up in a Fedora 12 kernel update, you should probably open a new bug entry against F12 then and restate the problem and give a pointer to the patch, so that the Fedora kernel maintainers can act on it.
^ the fix from comment 46 [...]
Many thanks for the patch. I was able to compile a patched kernel based on comment 45, but unable to boot it due to grub2 issues. I'll try the tape deck with 2.6.34-rc1, but doing so will take at least a week due to travel. Sincerely, George
On a self-compiled 2.6.34-rc5 x86_64 kernel with Debian patches, I boot the machine, run 'echo 3 > /sys/module/firewire_ohci/parameters/debug', plug the FireWire cable into the deck in an 'off' state, turn 'on' the deck, and attempt a dvgrab as root: [ 136.473540] firewire_ohci: AR spd 0 tl 00, 0000 -> 0000, evt_no_status, QW req, 000000000000 = 00000000 [ 136.473701] firewire_ohci: inconsistent self IDs echo:/home/gchriss# dvgrab -f raw test_ Error: no camera exists ...then turn off the deck: [ 215.454531] firewire_ohci: AR spd 0 tl 00, 0000 -> 0000, evt_no_status, QW req, 000000000000 = 00000000 [ 215.454542] firewire_ohci: inconsistent self IDs ==== The FireWire PCI Card and related dmesg: FireWire (IEEE 1394) [0c00]: VIA Technologies, Inc. VT6306/7/8 [Fire II(M)] IEEE 1394 OHCI Controller [1106:3044] (rev 46) [ 0.848538] firewire_ohci 0000:01:0a.0: PCI INT A -> Link[LNKC] -> GSI 17 (level, low) -> IRQ 17 Apr 21 10:36:44 echo kernel: [ 0.923674] firewire_ohci: Added fw-ohci device 0000:01:0a.0, OHCI v1.10, 4 IR + 8 IT contexts, quirks 0x1 [ 0.923863] firewire_ohci: inconsistent self IDs [ 0.958445] firewire_ohci: inconsistent self IDs [ 0.976002] firewire_ohci: inconsistent self IDs [ 71.244490] firewire_ohci: inconsistent self IDs
OK, works with a little cable jiggling on a Shuttle SG31G2 v2 + Linux 2.6.34 x86_64 derived from Fedora 12 .config setings. Thanks!