When running test-dv or dvgrab, the program aborts when raw1394_iso_shutdown() is called.

[root@brinstar ~]# test-dv
*** glibc detected *** test-dv: double free or corruption (top): 0x08f27d20 ***
======= Backtrace: =========
/lib/libc.so.6[0x1a5a31]
/lib/libc.so.6(cfree+0x90)[0x1a9060]
/usr/lib/libraw1394.so.8(raw1394_iso_shutdown+0x55)[0x350c35]
/usr/lib/libiec61883.so.0(iec61883_dv_recv_stop+0x3b)[0x13606b]
/usr/lib/libiec61883.so.0(iec61883_dv_close+0x2d)[0x13612d]
/usr/lib/libiec61883.so.0(iec61883_dv_fb_close+0x28)[0x1361a8]
test-dv[0x8048c3f]
test-dv[0x8049195]
/lib/libc.so.6(__libc_start_main+0xe0)[0x152390]
test-dv[0x8048961]
======= Memory map: ========
00110000-0012b000 r-xp 00000000 fd:00 5106832 /lib/ld-2.6.90.so
0012b000-0012c000 r-xp 0001a000 fd:00 5106832 /lib/ld-2.6.90.so
0012c000-0012d000 rwxp 0001b000 fd:00 5106832 /lib/ld-2.6.90.so
0012d000-0012e000 r-xp 0012d000 00:00 0 [vdso]
0012e000-0013b000 r-xp 00000000 fd:00 9278079 /usr/lib/libiec61883.so.0.1.0
0013b000-0013c000 rwxp 0000c000 fd:00 9278079 /usr/lib/libiec61883.so.0.1.0
0013c000-0028e000 r-xp 00000000 fd:00 5107044 /lib/libc-2.6.90.so
0028e000-00290000 r-xp 00152000 fd:00 5107044 /lib/libc-2.6.90.so
00290000-00291000 rwxp 00154000 fd:00 5107044 /lib/libc-2.6.90.so
00291000-00294000 rwxp 00291000 00:00 0
0034d000-00353000 r-xp 00000000 fd:00 9270235 /usr/lib/libraw1394.so.8.1.1
00353000-00354000 rwxp 00005000 fd:00 9270235 /usr/lib/libraw1394.so.8.1.1
04b8c000-04b97000 r-xp 00000000 fd:00 5106916 /lib/libgcc_s-4.1.2-20070925.so.1
04b97000-04b98000 rwxp 0000a000 fd:00 5106916 /lib/libgcc_s-4.1.2-20070925.so.1
08048000-0804a000 r-xp 00000000 fd:00 9284743 /usr/bin/test-dv
0804a000-0804b000 rw-p 00001000 fd:00 9284743 /usr/bin/test-dv
08f22000-08f47000 rw-p 08f22000 00:00 0
b7d00000-b7d21000 rw-p b7d00000 00:00 0
b7d21000-b7e00000 ---p b7d21000 00:00 0
b7ef8000-b7f1e000 rw-p b7ef8000 00:00 0
bfbb0000-bfbc5000 rw-p bffea000 00:00 0 [stack]
Aborted
Also see bug #240771, where this has been mentioned a few times. (see comment #24 and later)
Has also been reported on linux1394-user at least once, by an Ubuntu user, running mythtv. FWIW, I presume Ubuntu uses the old ieee1394 kernel drivers. http://marc.info/?l=linux1394-user&m=119118398024958
Dan Dennedy released libraw1394-1.3.0 yesterday. It contains a fix that looks relevant. http://wiki.linux1394.org/ReleaseNotesLibraries
Re comment #3: I wonder though, was Fedora's juju-fied libraw1394 based on libraw1394-1.2.0/.1 or already on a more recent SVN version? The fix in question was committed to SVN in October 2006.
The fix in question also does not fix the error. The line that's causing the abort is actually in the juju layer, at line 521 of raw1394-iso.c:

void raw1394_iso_shutdown(raw1394handle_t handle)
{
        munmap(handle->iso.buffer,
               handle->iso.buf_packets * handle->iso.max_packet_size);
        close(handle->iso.fd);
        free(handle->iso.packets); /* This is line 521. */
}

handle->iso.packets is *not* set to NULL in the juju raw1394_new_handle() function, so I added a patch to set it to NULL when initialized. But that doesn't seem to solve the problem. I did some debugging in gdb and confirmed that, when test-dv aborts, 'handle->iso.packets' is set to a reasonable value. So yeah, that's probably not it.

Therefore, I'm guessing that handle->iso.packets has already been freed when this free() is called. But I'm not sure where the first free() would happen. Hopefully someone more familiar with juju could figure that out quickly. (cough cough adding krh)
This is some lovely fun. I've got one box (my powerpc laptop) that doesn't crash, dvgrab actually connects to the camera, which starts playing, but then stops after a few seconds, and dvgrab exits cleanly, saying "no DV found". Same camera hooked up to an x86_64 tower blows up, similarly-but-not-the-same as Will's report:

# dvgrab
Found AV/C device with GUID 0x0000850000961567
ieee1394io.cc:456: In function "virtual bool iec61883Reader::StartReceive()":
"iec61883_dv_fb_start( m_iec61883.dv, channel )" evaluated to -1
ieee1394io.cc:456: errno: 22 (Invalid argument)
*** glibc detected *** dvgrab: double free or corruption (top): 0x000000000218d7a0 ***
======= Backtrace: =========
/lib64/libc.so.6[0x2aaaabb1f832]
/lib64/libc.so.6(cfree+0x8c)[0x2aaaabb22f2c]
/usr/lib64/libiec61883.so.0(iec61883_dv_close+0x15)[0x2aaaaacced45]
/usr/lib64/libiec61883.so.0(iec61883_dv_fb_close+0x11)[0x2aaaaacced91]
dvgrab[0x421a6c]
dvgrab[0x420cbb]
dvgrab[0x40d7d4]
dvgrab[0x42451e]
/lib64/libc.so.6(__libc_start_main+0xf4)[0x2aaaabacb074]
dvgrab(__gxx_personality_v0+0x1f9)[0x405579]
======= Memory map: ========
00400000-0043c000 r-xp 00000000 fd:02 3697428 /usr/bin/dvgrab
0063c000-0063d000 rw-p 0003c000 fd:02 3697428 /usr/bin/dvgrab
0063d000-02198000 rw-p 0063d000 00:00 0 [heap]
40000000-40001000 ---p 40000000 00:00 0
40001000-40a01000 rw-p 40001000 00:00 0
3a1e000000-3a1e006000 r-xp 00000000 fd:02 3722410 /usr/lib64/libraw1394.so.8.1.1
3a1e006000-3a1e205000 ---p 00006000 fd:02 3722410 /usr/lib64/libraw1394.so.8.1.1
3a1e205000-3a1e206000 rw-p 00005000 fd:02 3722410 /usr/lib64/libraw1394.so.8.1.1
3a1e400000-3a1e403000 r-xp 00000000 fd:02 3722694 /usr/lib64/librom1394.so.0.3.0
3a1e403000-3a1e602000 ---p 00003000 fd:02 3722694 /usr/lib64/librom1394.so.0.3.0
3a1e602000-3a1e603000 rw-p 00002000 fd:02 3722694 /usr/lib64/librom1394.so.0.3.0
3a1e800000-3a1e804000 r-xp 00000000 fd:02 3702447 /usr/lib64/libavc1394.so.0.3.0
3a1e804000-3a1ea03000 ---p 00004000 fd:02 3702447 /usr/lib64/libavc1394.so.0.3.0
3a1ea03000-3a1ea04000 rw-p 00003000 fd:02 3702447 /usr/lib64/libavc1394.so.0.3.0
3a26e00000-3a26e21000 r-xp 00000000 fd:02 3712252 /usr/lib64/libjpeg.so.62.0.0
3a26e21000-3a27020000 ---p 00021000 fd:02 3712252 /usr/lib64/libjpeg.so.62.0.0
3a27020000-3a27021000 rw-p 00020000 fd:02 3712252 /usr/lib64/libjpeg.so.62.0.0
2aaaaaaab000-2aaaaaac6000 r-xp 00000000 fd:02 5069181 /lib64/ld-2.7.so
2aaaaaac6000-2aaaaac04000 rw-p 2aaaaaac6000 00:00 0
2aaaaacc5000-2aaaaacc6000 r--p 0001a000 fd:02 5069181 /lib64/ld-2.7.so
2aaaaacc6000-2aaaaacc7000 rw-p 0001b000 fd:02 5069181 /lib64/ld-2.7.so
2aaaaacc7000-2aaaaacd3000 r-xp 00000000 fd:02 3702222 /usr/lib64/libiec61883.so.0.1.0
2aaaaacd3000-2aaaaaed3000 ---p 0000c000 fd:02 3702222 /usr/lib64/libiec61883.so.0.1.0
2aaaaaed3000-2aaaaaed4000 rw-p 0000c000 fd:02 3702222 /usr/lib64/libiec61883.so.0.1.0
2aaaaaed4000-2aaaaaeef000 r-xp 00000000 fd:02 3702082 /usr/lib64/libdv.so.4.0.3
2aaaaaeef000-2aaaab0ee000 ---p 0001b000 fd:02 3702082 /usr/lib64/libdv.so.4.0.3
2aaaab0ee000-2aaaab0f1000 rw-p 0001a000 fd:02 3702082 /usr/lib64/libdv.so.4.0.3
2aaaab0f1000-2aaaab100000 rw-p 2aaaab0f1000 00:00 0
2aaaab100000-2aaaab116000 r-xp 00000000 fd:02 5069183 /lib64/libpthread-2.7.so
2aaaab116000-2aaaab315000 ---p 00016000 fd:02 5069183 /lib64/libpthread-2.7.so
2aaaab315000-2aaaab316000 r--p 00015000 fd:02 5069183 /lib64/libpthread-2.7.so
2aaaab316000-2aaaab317000 rw-p 00016000 fd:02 5069183 /lib64/libpthread-2.7.so
2aaaab317000-2aaaab31b000 rw-p 2aaaab317000 00:00 0
2aaaab31b000-2aaaab400000 r-xp 00000000 fd:02 3699636 /usr/lib64/libstdc++.so.6.0.8
2aaaab400000-2aaaab600000 ---p 000e5000 fd:02 3699636 /usr/lib64/libstdc++.so.6.0.8
2aaaab600000-2aaaab606000 r--p 000e5000 fd:02 3699636 /usr/lib64/libstdc++.so.6.0.8
2aaaab606000-2aaaab609000 rw-p 000eb000 fd:02 3699636 /usr/lib64/libstdc++.so.6.0.8
2aaaab609000-2aaaab61b000 rw-p 2aaaab609000 00:00 0
2aaaab61b000-2aaaab69d000 r-xp 00000000 fd:02 5069160 /lib64/libm-2.7.so
2aaaab69d000-2aaaab89c000 ---p 00082000 fd:02 5069160 /lib64/libm-2.7.so
2aaaab89c000-2aaaab89d000 r--p 00081000 fd:02 5069160 /lib64/libm-2.7.so
2aaaab89d000-2aaaab89e000 rw-p 00082000 fd:02 5069160 /lib64/libm-2.7.so
2aaaab89e000-2aaaab89f000 rw-p 2aaaab89e000 00:00 0
2aaaab89f000-2aaaab8ac000 r-xp 00000000 fd:02 5069188 /lib64/libgcc_s-4.1.2-20070925.so.1
2aaaab8ac000-2aaaabaac000 ---p 0000d000 fd:02 5069188 /lib64/libgcc_s-4.1.2-20070925.so.1
2aaaabaac000-2aaaabaad000 rw-p 0000d000 fd:02 5069188 /lib64/libgcc_s-4.1.2-20070925.so.1
2aaaabaad000-2aaaabbfa000 r-xp 00000000 fd:02 5069144 /lib64/libc-2.7.so
2aaaabbfa000-2aaaabdfa000 ---p 0014d000 fd:02 5069144 /lib64/libc-2.7.so
2aaaabdfa000-2aaaabdfe000 r--p 0014d000 fd:02 5069144 /lib64/libc-2.7.so
2aaaabdfe000-2aaaabdff000 rw-p 00151000 fd:02 5069144 /lib64/libc-2.7.so
2aaaabdff000-2aaab22ab000 rw-p 2aaaabdff000 00:00 0
2aaab4000000-2aaab4021000 rw-p 2aaab4000000 00:00 0
2aaab4021000-2aaab8000000 ---p 2aaab4021000 00:00 0
7fffb3de4000-7fffb3df9000 rw-p 7ffffffea000 00:00 0 [stack]
7fffb3dfe000-7fffb3e00000 r-xp 7fffb3dfe000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
Aborted
Forgot to mention: I can confirm that libraw1394-1.3.0 doesn't make a lick of difference. For those that want to play along at home: http://koji.fedoraproject.org/packages/libraw1394/1.3.0/1.fc8/
Ah, and my crash in comment #6 is pretty much the same as the ones Will referred to in comment #1, so I guess that much is good... Anyhow, been poking around the code for a while this afternoon, and at least I see why the double-free is happening... In iso_init() in raw1394-iso.c, the 'retval = ioctl(blah);' call is failing, which results in a call to 'free(handle->iso.packets);'. Because the init failed, we wind up calling iec61883_dv_close() in libiec61883, which in turn, calls raw1394_iso_shutdown, where a second 'free(handle->iso.packets);' call is made. Oops. (along with a second attempt to 'close(handle->iso.fd);'). Still attempting to figure out why the ioctl fails, but it looks like we could certainly use some checks in raw1394_iso_shutdown before trying to free memory that's already been freed...
Created attachment 233121
Fix for the double-free problem...

This patch fixes the double-free issue by setting handle->iso.packets to NULL after it gets freed the first time, so that we don't try to free it again in raw1394_iso_shutdown(). Also adds a debugging printf that shows where we're actually failing that led to the situation where we got the double-free... Here's what I get for output now:

# dvgrab
Found AV/C device with GUID 0x0000850000961567
ioctl call failed, retval = -1
ieee1394io.cc:457: In function "virtual bool iec61883Reader::StartReceive()":
"iec61883_dv_fb_start( m_iec61883.dv, channel )" evaluated to -1
ieee1394io.cc:457: errno: 22 (Invalid argument)
"" 0.00 MB 0 frames
Capture Stopped

The camera actually starts rolling for a bit, but as you can see, we never actually capture any video. For some reason, dvgrab is also hanging after the "Capture Stopped" message. Progress though...
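For anyone following along, the failure path and the fix described above can be sketched in a few lines of C. This is only an illustration under stated assumptions, not the libraw1394 source: struct demo_iso, demo_iso_init() and demo_iso_shutdown() are hypothetical stand-ins, the failing ioctl is simulated with a flag, and the fd guard is one possible way to avoid the second close(), not something taken from the attached patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical stand-in for the iso part of a raw1394 handle. */
struct demo_iso {
    int   fd;       /* -1 when no device file is open */
    void *packets;  /* NULL when nothing is allocated */
};

/* Init path: allocate, then clean up if the (simulated) ioctl fails. */
int demo_iso_init(struct demo_iso *iso, int ioctl_fails)
{
    assert(iso != NULL);
    iso->fd = -1;                  /* no real device in this sketch */
    iso->packets = malloc(64);
    if (iso->packets == NULL)
        return -1;

    if (ioctl_fails) {
        free(iso->packets);
        iso->packets = NULL;       /* the point of the fix: no dangling pointer */
        return -1;
    }
    return 0;
}

/* Shutdown path: safe to call after a failed init, and safe to call twice. */
void demo_iso_shutdown(struct demo_iso *iso)
{
    assert(iso != NULL);
    if (iso->fd >= 0) {            /* assumed guard against a second close() */
        close(iso->fd);
        iso->fd = -1;
    }
    free(iso->packets);            /* free(NULL) is defined to do nothing */
    iso->packets = NULL;
}
```

With the pointer reset in the error path, a later shutdown call after a failed init no longer hands an already-freed pointer to free().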
Aha. Okay, now I see why I'm not capturing any video... The problem traces back down to the kernel level, and I'm finally recalling a mention somewhere that OHCI 1.0 controllers don't play nice with the new stack yet. Apparently, all I've got are 1.0 controllers (definitely the case in the box I'm testing on right now, anyhow). It ultimately boils down to calling ohci_allocate_iso_context() in drivers/firewire/fw-ohci.c, where we have:

        /* FIXME: We need a fallback for pre 1.1 OHCI. */
        if (callback == handle_ir_dualbuffer_packet &&
            ohci->version < OHCI_VERSION_1_1)
                return ERR_PTR(-EINVAL);

So I lose... I'll have to see if I can find a 1.1 controller around here to see if with that I can actually get some video, but we really really really could use that pre-1.1 fallback, eh?...
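The effect of that check can be mimicked in user space to show where the "errno: 22 (Invalid argument)" in the dvgrab output comes from. The sketch below is an illustration only: every DEMO_/demo_ name is made up for this example, the version encoding is an assumption rather than a copy of what fw-ohci.c uses, and a plain negative errno return stands in for the kernel's ERR_PTR().

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Assumed encoding for this sketch: major in the high bits, minor in the low byte. */
#define DEMO_OHCI_VERSION(major, minor) (((unsigned)(major) << 16) | (unsigned)(minor))
#define DEMO_OHCI_VERSION_1_0 DEMO_OHCI_VERSION(1, 0x00)
#define DEMO_OHCI_VERSION_1_1 DEMO_OHCI_VERSION(1, 0x10)

/* Hypothetical receive modes; only the dual-buffer one needs OHCI 1.1. */
enum demo_rx_mode { DEMO_RX_OTHER, DEMO_RX_DUAL_BUFFER };

/* Returns 0 on success or a negative errno, mirroring the quoted check. */
int demo_allocate_iso_context(unsigned version, enum demo_rx_mode mode)
{
    if (mode == DEMO_RX_DUAL_BUFFER && version < DEMO_OHCI_VERSION_1_1)
        return -EINVAL;            /* no fallback for pre-1.1 controllers */
    return 0;
}

/* Turns a 0-or-negative-errno result into the message user space would print. */
const char *demo_describe(int ret)
{
    assert(ret <= 0);
    return ret < 0 ? strerror(-ret) : "ok";
}
```

On Linux, EINVAL is 22, so a failed allocation on a 1.0 controller surfaces in user space as exactly the errno that dvgrab reports.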
I'm inclined to close this bug once I push updates for F7 and F8, since the double-free is taken care of by the patch attached to this bug, then open a new bug for the kernel side of things (ohci pre-1.1 support) and any other issues that turn up with either libraw1394 or dvgrab...
> I'm finally recalling a mention somewhere that OHCI 1.0 controllers
> don't play nice with the new stack yet.

This could also be worded the other way around. I now added some more details on this at http://wiki.linux1394.org/JujuMigration. The issue is also listed at http://wiki.linux1394.org/ToDo as a top priority item.

Kristian and I were both surprised at how many OHCI 1.0:1997 controllers are still in active use and even newly sold. This also has to do with OHCI 1.1:2000 not being explicit at all about what's new relative to OHCI 1.0. Also, while the OHCI 1.1:2000 spec is available as a gratis download from Intel and Microsoft (and this is well known among developers because you have to pay up for most other FireWire specs), the 1.0 spec is nowhere officially available anymore.

Clearly, the Two Other OS vendors never made use of OHCI 1.1 features, otherwise silicon vendors would have moved to OHCI 1.1 like everybody moved from IEEE 1394:1995 to IEEE 1394a:2000.
PS:
> The issue is also listed at
> http://wiki.linux1394.org/ToDo as a top priority item.

which shouldn't be misunderstood as "somebody is doing something about it". I wouldn't have bothered to create that ToDo page if there were active driver developers.
Created attachment 233591
firewire: fw-ohci: log a note about unsupported features

You can carry this patch over if you open an OHCI 1.0<->1.1 bug.
Fix confirmed - libraw1394-1.3.0-2.fc8 does not abort when running test-dv or dvgrab. As you noted, though, it still doesn't work - the ioctl() returns -1, because the Agere FW323 in the Mac Mini and in the x86_64 test machine doesn't implement OHCI 1.1. Definitely recommend filing a new bug against 'kernel' for that issue. I'm gonna close this bug RAWHIDE, since the updated libraw1394 will be available there shortly.. but don't forget to push an update for F7. Thanks!
(In reply to comment #15)
> Fix confirmed - libraw1394-1.3.0-2.fc8 does not abort when running test-dv or
> dvgrab.
>
> As you noted, though, it still doesn't work - the ioctl() returns -1, because
> the Agere FW323 in the Mac Mini and in the x86_64 test machine doesn't
> implement OHCI 1.1.

Hrm... I'm thinking most (all?) OHCI 1.1+ controllers are the ones that include 800Mbps firewire support, based on the fact my powerbook is the only thing I've got with an 800Mbps firewire port, and the only machine that reports as OHCI 1.1... Of course, this is only a sampling of less than half a dozen machines, but it would seem to make some sense.

> Definitely recommend filing a new bug against 'kernel' for that issue.

Yeah, planning to do so in a sec...

> I'm gonna close this bug RAWHIDE, since the updated libraw1394 will be available
> there shortly.. but don't forget to push an update for F7.

Done. (Still 1.2.1-based, only added the double-free fix).

> Thanks!

NP, sorry for letting it sit for so long... :\
I've filed bug 344851 to cover (the lack of) ohci 1.0 isochronous I/O support, and bug 345221 to cover dvgrab not actually capturing any video, even on a system with an ohci 1.1 controller.
> I'm thinking most (all?) OHCI 1.1+ controllers are the ones
> that include 800Mbps firewire support

Work on a successor to the OHCI 1.1:2000 spec was apparently started but not finished. Hence vendors of IEEE 1394b:2002 compliant cards (among them FireWire 800 cards) often market them as "OHCI 1.1+ compliant" or something like that, but there is no such spec. There are also IEEE 1394a:2000/FireWire 400 cards which implement OHCI 1.1.
libraw1394-1.2.1-10.fc7 has been pushed to the Fedora 7 stable repository. If problems still persist, please make note of it in this bug report.
I get the same thing:

dvgrab
Found AV/C device with GUID 0x0000850000961567
ieee1394io.cc:456: In function "virtual bool iec61883Reader::StartReceive()":
"iec61883_dv_fb_start( m_iec61883.dv, channel )" evaluated to -1
ieee1394io.cc:456: errno: 22 (Invalid argument)

It captures for 5-10 seconds and then stops.
(In reply to comment #20)
> I have that
>
> dvgrab
> Found AV/C device with GUID 0x0000850000961567
> ieee1394io.cc:456: In function "virtual bool iec61883Reader::StartReceive()":
> "iec61883_dv_fb_start( m_iec61883.dv, channel )" evaluated to -1
> ieee1394io.cc:456: errno: 22 (Invalid argument)
>
> and captured 5-10 second and Stopping

You appear to have an OHCI 1.0 card, which won't work at the moment, so the above is the expected behavior right now.
And when will OHCI 1.0 start working?