Bug 328011 - glibc detected double free or corruption in libraw1394 (running dvgrab)
Summary: glibc detected double free or corruption in libraw1394 (running dvgrab)
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: libraw1394
Version: rawhide
Hardware: All
OS: Linux
Priority: low
Severity: low
Target Milestone: ---
Assignee: Jarod Wilson
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2007-10-11 17:22 UTC by Will Woods
Modified: 2007-11-30 22:12 UTC

Fixed In Version: 1.2.1-10.fc7
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2007-10-24 07:05:24 UTC
Type: ---
Embargoed:


Attachments
Fix for the double-free problem... (1.11 KB, patch)
2007-10-19 21:23 UTC, Jarod Wilson
firewire: fw-ohci: log a note about unsupported features (1.72 KB, patch)
2007-10-21 08:45 UTC, Stefan Richter

Description Will Woods 2007-10-11 17:22:47 UTC
When running test-dv or dvgrab, when raw1394_iso_shutdown() is called, the
program aborts.

[root@brinstar ~]# test-dv
*** glibc detected *** test-dv: double free or corruption (top): 0x08f27d20 ***
======= Backtrace: =========
/lib/libc.so.6[0x1a5a31]
/lib/libc.so.6(cfree+0x90)[0x1a9060]
/usr/lib/libraw1394.so.8(raw1394_iso_shutdown+0x55)[0x350c35]
/usr/lib/libiec61883.so.0(iec61883_dv_recv_stop+0x3b)[0x13606b]
/usr/lib/libiec61883.so.0(iec61883_dv_close+0x2d)[0x13612d]
/usr/lib/libiec61883.so.0(iec61883_dv_fb_close+0x28)[0x1361a8]
test-dv[0x8048c3f]
test-dv[0x8049195]
/lib/libc.so.6(__libc_start_main+0xe0)[0x152390]
test-dv[0x8048961]
======= Memory map: ========
00110000-0012b000 r-xp 00000000 fd:00 5106832    /lib/ld-2.6.90.so
0012b000-0012c000 r-xp 0001a000 fd:00 5106832    /lib/ld-2.6.90.so
0012c000-0012d000 rwxp 0001b000 fd:00 5106832    /lib/ld-2.6.90.so
0012d000-0012e000 r-xp 0012d000 00:00 0          [vdso]
0012e000-0013b000 r-xp 00000000 fd:00 9278079    /usr/lib/libiec61883.so.0.1.0
0013b000-0013c000 rwxp 0000c000 fd:00 9278079    /usr/lib/libiec61883.so.0.1.0
0013c000-0028e000 r-xp 00000000 fd:00 5107044    /lib/libc-2.6.90.so
0028e000-00290000 r-xp 00152000 fd:00 5107044    /lib/libc-2.6.90.so
00290000-00291000 rwxp 00154000 fd:00 5107044    /lib/libc-2.6.90.so
00291000-00294000 rwxp 00291000 00:00 0 
0034d000-00353000 r-xp 00000000 fd:00 9270235    /usr/lib/libraw1394.so.8.1.1
00353000-00354000 rwxp 00005000 fd:00 9270235    /usr/lib/libraw1394.so.8.1.1
04b8c000-04b97000 r-xp 00000000 fd:00 5106916    /lib/libgcc_s-4.1.2-20070925.so.1
04b97000-04b98000 rwxp 0000a000 fd:00 5106916    /lib/libgcc_s-4.1.2-20070925.so.1
08048000-0804a000 r-xp 00000000 fd:00 9284743    /usr/bin/test-dv
0804a000-0804b000 rw-p 00001000 fd:00 9284743    /usr/bin/test-dv
08f22000-08f47000 rw-p 08f22000 00:00 0 
b7d00000-b7d21000 rw-p b7d00000 00:00 0 
b7d21000-b7e00000 ---p b7d21000 00:00 0 
b7ef8000-b7f1e000 rw-p b7ef8000 00:00 0 
bfbb0000-bfbc5000 rw-p bffea000 00:00 0          [stack]
Aborted

Comment 1 Will Woods 2007-10-11 17:27:36 UTC
Also see bug #240771, where this has been mentioned a few times. (see comment
#24 and later)

Comment 2 Stefan Richter 2007-10-14 00:25:08 UTC
Has also been reported on linux1394-user at least once, by an Ubuntu user,
running mythtv.  FWIW, I presume Ubuntu uses the old ieee1394 kernel drivers.
http://marc.info/?l=linux1394-user&m=119118398024958

Comment 3 Stefan Richter 2007-10-14 11:20:29 UTC
Dan Dennedy released libraw1394-1.3.0 yesterday.  It contains a fix that looks
relevant.  http://wiki.linux1394.org/ReleaseNotesLibraries

Comment 4 Stefan Richter 2007-10-17 06:01:45 UTC
Re comment #3:  I wonder though, was Fedora's juju-fied libraw1394 based on
libraw1394-1.2.0/.1 or already on a more recent SVN version?  The fix in
question was committed to SVN in October 2006.

Comment 5 Will Woods 2007-10-17 15:22:46 UTC
The fix in question also does not fix the error. The line that's causing the
abort is actually in the juju layer, at line 521 of raw1394-iso.c:

void raw1394_iso_shutdown(raw1394handle_t handle)
{
    munmap(handle->iso.buffer,
           handle->iso.buf_packets * handle->iso.max_packet_size);
    close(handle->iso.fd);
    free(handle->iso.packets); /* This is line 521. */
}

handle->iso.packets is *not* set to NULL in the juju raw1394_new_handle()
function, so I added a patch to set it to NULL when initialized. But that
doesn't seem to solve the problem.

I did some debugging in gdb and confirmed that, when test-dv aborts,
'handle->iso.packets' is set to a reasonable value. So yeah, that's probably not it.

Therefore, I'm guessing that handle->iso.packets has already been freed when
this free() is called. But I'm not sure where the first free() would happen.
Hopefully someone more familiar with juju could figure that out quickly. (cough
cough adding krh)

Comment 6 Jarod Wilson 2007-10-19 15:18:57 UTC
This is some lovely fun. I've got one box (my powerpc laptop) that doesn't
crash, dvgrab actually connects to the camera, which starts playing, but then
stops after a few seconds, and dvgrab exits cleanly, saying "no DV found".

Same camera hooked up to an x86_64 tower blows up, similarly-but-not-the-same as
Will's report:

# dvgrab
Found AV/C device with GUID 0x0000850000961567
ieee1394io.cc:456: In function "virtual bool iec61883Reader::StartReceive()":
"iec61883_dv_fb_start( m_iec61883.dv, channel )" evaluated to -1
ieee1394io.cc:456: errno: 22 (Invalid argument)
*** glibc detected *** dvgrab: double free or corruption (top): 0x000000000218d7a0 ***
======= Backtrace: =========
/lib64/libc.so.6[0x2aaaabb1f832]
/lib64/libc.so.6(cfree+0x8c)[0x2aaaabb22f2c]
/usr/lib64/libiec61883.so.0(iec61883_dv_close+0x15)[0x2aaaaacced45]
/usr/lib64/libiec61883.so.0(iec61883_dv_fb_close+0x11)[0x2aaaaacced91]
dvgrab[0x421a6c]
dvgrab[0x420cbb]
dvgrab[0x40d7d4]
dvgrab[0x42451e]
/lib64/libc.so.6(__libc_start_main+0xf4)[0x2aaaabacb074]
dvgrab(__gxx_personality_v0+0x1f9)[0x405579]
======= Memory map: ========
00400000-0043c000 r-xp 00000000 fd:02 3697428    /usr/bin/dvgrab
0063c000-0063d000 rw-p 0003c000 fd:02 3697428    /usr/bin/dvgrab
0063d000-02198000 rw-p 0063d000 00:00 0          [heap]
40000000-40001000 ---p 40000000 00:00 0
40001000-40a01000 rw-p 40001000 00:00 0
3a1e000000-3a1e006000 r-xp 00000000 fd:02 3722410    /usr/lib64/libraw1394.so.8.1.1
3a1e006000-3a1e205000 ---p 00006000 fd:02 3722410    /usr/lib64/libraw1394.so.8.1.1
3a1e205000-3a1e206000 rw-p 00005000 fd:02 3722410    /usr/lib64/libraw1394.so.8.1.1
3a1e400000-3a1e403000 r-xp 00000000 fd:02 3722694    /usr/lib64/librom1394.so.0.3.0
3a1e403000-3a1e602000 ---p 00003000 fd:02 3722694    /usr/lib64/librom1394.so.0.3.0
3a1e602000-3a1e603000 rw-p 00002000 fd:02 3722694    /usr/lib64/librom1394.so.0.3.0
3a1e800000-3a1e804000 r-xp 00000000 fd:02 3702447    /usr/lib64/libavc1394.so.0.3.0
3a1e804000-3a1ea03000 ---p 00004000 fd:02 3702447    /usr/lib64/libavc1394.so.0.3.0
3a1ea03000-3a1ea04000 rw-p 00003000 fd:02 3702447    /usr/lib64/libavc1394.so.0.3.0
3a26e00000-3a26e21000 r-xp 00000000 fd:02 3712252    /usr/lib64/libjpeg.so.62.0.0
3a26e21000-3a27020000 ---p 00021000 fd:02 3712252    /usr/lib64/libjpeg.so.62.0.0
3a27020000-3a27021000 rw-p 00020000 fd:02 3712252    /usr/lib64/libjpeg.so.62.0.0
2aaaaaaab000-2aaaaaac6000 r-xp 00000000 fd:02 5069181    /lib64/ld-2.7.so
2aaaaaac6000-2aaaaac04000 rw-p 2aaaaaac6000 00:00 0
2aaaaacc5000-2aaaaacc6000 r--p 0001a000 fd:02 5069181    /lib64/ld-2.7.so
2aaaaacc6000-2aaaaacc7000 rw-p 0001b000 fd:02 5069181    /lib64/ld-2.7.so
2aaaaacc7000-2aaaaacd3000 r-xp 00000000 fd:02 3702222    /usr/lib64/libiec61883.so.0.1.0
2aaaaacd3000-2aaaaaed3000 ---p 0000c000 fd:02 3702222    /usr/lib64/libiec61883.so.0.1.0
2aaaaaed3000-2aaaaaed4000 rw-p 0000c000 fd:02 3702222    /usr/lib64/libiec61883.so.0.1.0
2aaaaaed4000-2aaaaaeef000 r-xp 00000000 fd:02 3702082    /usr/lib64/libdv.so.4.0.3
2aaaaaeef000-2aaaab0ee000 ---p 0001b000 fd:02 3702082    /usr/lib64/libdv.so.4.0.3
2aaaab0ee000-2aaaab0f1000 rw-p 0001a000 fd:02 3702082    /usr/lib64/libdv.so.4.0.3
2aaaab0f1000-2aaaab100000 rw-p 2aaaab0f1000 00:00 0
2aaaab100000-2aaaab116000 r-xp 00000000 fd:02 5069183    /lib64/libpthread-2.7.so
2aaaab116000-2aaaab315000 ---p 00016000 fd:02 5069183    /lib64/libpthread-2.7.so
2aaaab315000-2aaaab316000 r--p 00015000 fd:02 5069183    /lib64/libpthread-2.7.so
2aaaab316000-2aaaab317000 rw-p 00016000 fd:02 5069183    /lib64/libpthread-2.7.so
2aaaab317000-2aaaab31b000 rw-p 2aaaab317000 00:00 0
2aaaab31b000-2aaaab400000 r-xp 00000000 fd:02 3699636    /usr/lib64/libstdc++.so.6.0.8
2aaaab400000-2aaaab600000 ---p 000e5000 fd:02 3699636    /usr/lib64/libstdc++.so.6.0.8
2aaaab600000-2aaaab606000 r--p 000e5000 fd:02 3699636    /usr/lib64/libstdc++.so.6.0.8
2aaaab606000-2aaaab609000 rw-p 000eb000 fd:02 3699636    /usr/lib64/libstdc++.so.6.0.8
2aaaab609000-2aaaab61b000 rw-p 2aaaab609000 00:00 0
2aaaab61b000-2aaaab69d000 r-xp 00000000 fd:02 5069160    /lib64/libm-2.7.so
2aaaab69d000-2aaaab89c000 ---p 00082000 fd:02 5069160    /lib64/libm-2.7.so
2aaaab89c000-2aaaab89d000 r--p 00081000 fd:02 5069160    /lib64/libm-2.7.so
2aaaab89d000-2aaaab89e000 rw-p 00082000 fd:02 5069160    /lib64/libm-2.7.so
2aaaab89e000-2aaaab89f000 rw-p 2aaaab89e000 00:00 0
2aaaab89f000-2aaaab8ac000 r-xp 00000000 fd:02 5069188    /lib64/libgcc_s-4.1.2-20070925.so.1
2aaaab8ac000-2aaaabaac000 ---p 0000d000 fd:02 5069188    /lib64/libgcc_s-4.1.2-20070925.so.1
2aaaabaac000-2aaaabaad000 rw-p 0000d000 fd:02 5069188    /lib64/libgcc_s-4.1.2-20070925.so.1
2aaaabaad000-2aaaabbfa000 r-xp 00000000 fd:02 5069144    /lib64/libc-2.7.so
2aaaabbfa000-2aaaabdfa000 ---p 0014d000 fd:02 5069144    /lib64/libc-2.7.so
2aaaabdfa000-2aaaabdfe000 r--p 0014d000 fd:02 5069144    /lib64/libc-2.7.so
2aaaabdfe000-2aaaabdff000 rw-p 00151000 fd:02 5069144    /lib64/libc-2.7.so
2aaaabdff000-2aaab22ab000 rw-p 2aaaabdff000 00:00 0 
2aaab4000000-2aaab4021000 rw-p 2aaab4000000 00:00 0 
2aaab4021000-2aaab8000000 ---p 2aaab4021000 00:00 0 
7fffb3de4000-7fffb3df9000 rw-p 7ffffffea000 00:00 0                      [stack]
7fffb3dfe000-7fffb3e00000 r-xp 7fffb3dfe000 00:00 0                      [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]
Aborted

Comment 7 Jarod Wilson 2007-10-19 15:28:36 UTC
Forgot to mention: I can confirm that libraw1394-1.3.0 doesn't make a lick of
difference.

For those that want to play along at home:

http://koji.fedoraproject.org/packages/libraw1394/1.3.0/1.fc8/

Comment 8 Jarod Wilson 2007-10-19 20:59:27 UTC
Ah, and my crash in comment #6 is pretty much the same as the ones Will referred
to in comment #1, so I guess that much is good...

Anyhow, been poking around the code for a while this afternoon, and at least I
see why the double-free is happening...

In iso_init() in raw1394-iso.c, the 'retval = ioctl(blah);' call is failing,
which results in a call to 'free(handle->iso.packets);'. Because the init
failed, we wind up calling iec61883_dv_close() in libiec61883, which in turn,
calls raw1394_iso_shutdown, where a second 'free(handle->iso.packets);' call is
made. Oops. (along with a second attempt to 'close(handle->iso.fd);').

Still attempting to figure out why the ioctl fails, but it looks like we could
certainly use some checks in raw1394_iso_shutdown before trying to free memory
that's already been freed...
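
For anyone who wants to see the call sequence in isolation, here is a minimal model of it. This is only a sketch, not the libraw1394 code: struct block, model_free(), model_iso_init(), model_iso_shutdown() and run_path() are all hypothetical names, and the "heap" is faked with a flag so the second release can be counted instead of aborting the process.

```c
#include <assert.h>
#include <stddef.h>

/* Fake heap block: 'live' drops to 0 once the block has been released. */
struct block { int live; };

/* Hypothetical stand-in for the iso part of the handle. */
struct iso_model { struct block *packets; };

static int double_frees; /* counts releases of an already-released block */

static void model_free(struct block *b)
{
    if (b == NULL)
        return;             /* same as free(NULL): nothing to do */
    if (!b->live) {
        double_frees++;     /* this is where glibc would abort */
        return;
    }
    b->live = 0;
}

/* Allocate packets, then run the (simulated) ioctl. On failure the error
 * path releases packets but leaves the stale pointer in the handle. */
static int model_iso_init(struct iso_model *iso, struct block *storage,
                          int ioctl_ret)
{
    assert(iso != NULL && storage != NULL);
    storage->live = 1;
    iso->packets = storage;
    if (ioctl_ret < 0) {
        model_free(iso->packets);   /* first release */
        return -1;
    }
    return 0;
}

/* Shutdown releases packets unconditionally. */
static void model_iso_shutdown(struct iso_model *iso)
{
    model_free(iso->packets);       /* second release if init had failed */
}

/* Init followed by shutdown, as the close path does either way.
 * Returns the number of double frees observed. */
static int run_path(int ioctl_ret)
{
    struct block storage = { 0 };
    struct iso_model iso = { NULL };

    double_frees = 0;
    (void)model_iso_init(&iso, &storage, ioctl_ret);
    model_iso_shutdown(&iso);
    return double_frees;
}
```

Calling run_path(-1) models the failing ioctl and reports one double free; run_path(0) models a successful init and reports none.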

Comment 9 Jarod Wilson 2007-10-19 21:23:51 UTC
Created attachment 233121
Fix for the double-free problem...

This patch fixes the double-free issue by setting handle->iso.packets to NULL
after it gets freed the first time, so that we don't try to free it again in
raw1394_iso_shutdown(). Also adds a debugging printf that shows where we're
actually failing that led to the situation where we got the double-free...
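
The idea behind the patch, sketched outside of libraw1394: clear the pointer right after freeing it, so a later cleanup pass sees NULL and free() becomes a no-op. The names below (struct iso_sketch, iso_release(), iso_init_sketch()) are made up for illustration, the ioctl result is passed in as a parameter rather than coming from a real call, and resetting the descriptor to -1 is an extra assumption that goes beyond what the attached patch does.

```c
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical stand-in for the iso fields of the handle. */
struct iso_sketch {
    int   fd;       /* -1 when no descriptor is open */
    void *packets;  /* NULL when nothing is allocated */
};

/* Idempotent cleanup: safe to call any number of times. */
static void iso_release(struct iso_sketch *iso)
{
    assert(iso != NULL);
    free(iso->packets);     /* free(NULL) is defined to do nothing */
    iso->packets = NULL;    /* the actual fix: forget the freed pointer */
    if (iso->fd >= 0) {
        close(iso->fd);
        iso->fd = -1;       /* assumption: also guard the second close() */
    }
}

/* Init with a simulated ioctl result; cleans up after itself on failure. */
static int iso_init_sketch(struct iso_sketch *iso, size_t bytes,
                           int simulated_ioctl_ret)
{
    assert(iso != NULL);
    iso->fd = -1;
    iso->packets = malloc(bytes);
    if (iso->packets == NULL)
        return -1;
    if (simulated_ioctl_ret < 0) {
        iso_release(iso);   /* first cleanup, pointer is cleared */
        return -1;
    }
    return 0;
}
```

With this shape, a caller that gets -1 back from iso_init_sketch() can still run iso_release() on the same struct, even more than once, without tripping glibc's double-free check.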

Here's what I get for output now:
# dvgrab
Found AV/C device with GUID 0x0000850000961567
ioctl call failed, retval = -1
ieee1394io.cc:457: In function "virtual bool iec61883Reader::StartReceive()":
"iec61883_dv_fb_start( m_iec61883.dv, channel )" evaluated to -1
ieee1394io.cc:457: errno: 22 (Invalid argument)
""     0.00 MB 0 frames
Capture Stopped


The camera actually starts rolling for a bit, but as you can see, we never
actually capture any video. For some reason, dvgrab is also hanging after the
"Capture Stopped" message. Progress though...

Comment 10 Jarod Wilson 2007-10-21 05:01:53 UTC
Aha. Okay, now I see why I'm not capturing any video... The problem traces back down to the kernel 
level, and I'm finally recalling a mention somewhere that OHCI 1.0 controllers don't play nice with the 
new stack yet. Apparently, all I've got are 1.0 controllers (definitely the case in the box I'm testing on 
right now, anyhow). It ultimately boils down to calling ohci_allocate_iso_context() in
drivers/firewire/fw-ohci.c, where we have:

        /* FIXME: We need a fallback for pre 1.1 OHCI. */
        if (callback == handle_ir_dualbuffer_packet &&
            ohci->version < OHCI_VERSION_1_1)
                return ERR_PTR(-EINVAL);

So I lose... I'll have to see if I can find a 1.1 controller around here to see if with that I can actually get 
some video, but we really really really could use that pre-1.1 fallback, eh?...
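
To connect this to the dvgrab output: the -EINVAL returned by that check is what shows up in userspace as "errno: 22 (Invalid argument)", since EINVAL is 22 on Linux. A rough userspace model of the gate follows. It is a sketch only: check_iso_context(), enum ir_mode and the SKETCH_ constants are invented here, and the version encoding (0x010000 for 1.0, 0x010010 for 1.1) is an assumption, not something taken from this bug.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumed encodings of the controller version, for illustration only. */
#define SKETCH_OHCI_VERSION_1_0 0x010000u
#define SKETCH_OHCI_VERSION_1_1 0x010010u

/* Hypothetical stand-in for "which receive callback was requested". */
enum ir_mode { IR_OTHER, IR_DUAL_BUFFER };

/* Mirrors the shape of the kernel check: dual-buffer receive on a
 * pre-1.1 controller is refused with -EINVAL, everything else passes. */
static int check_iso_context(uint32_t controller_version, enum ir_mode mode)
{
    if (mode == IR_DUAL_BUFFER &&
        controller_version < SKETCH_OHCI_VERSION_1_1)
        return -EINVAL;
    return 0;
}
```

In this model a 1.0 controller asking for dual-buffer receive gets -EINVAL back, while a 1.1 controller gets 0, which matches the behaviour seen on the two kinds of hardware in this bug.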

Comment 11 Jarod Wilson 2007-10-21 05:04:48 UTC
I'm inclined to close this bug once I push updates for F7 and F8, since the double-free is taken care of by 
the patch attached to this bug, then open a new bug for the kernel side of things (ohci pre-1.1 support) 
and any other issues that turn up with either libraw1394 or dvgrab...

Comment 12 Stefan Richter 2007-10-21 08:12:11 UTC
> I'm finally recalling a mention somewhere that OHCI 1.0 controllers
> don't play nice with the new stack yet.

This could also be worded the other way around.  I now added some more details
on this at http://wiki.linux1394.org/JujuMigration.  The issue is also listed at
http://wiki.linux1394.org/ToDo as a top priority item.

Kristian and I were both surprised at how many OHCI 1.0:1997 controllers are
still in active use and even newly sold.  This also has to do with OHCI 1.1:2000
not being explicit at all about what's new relative to OHCI 1.0.  Also, while
the OHCI 1.1:2000 spec is available as a gratis download from Intel and
Microsoft (and this is well known among developers because you have to pay up
for most other FireWire specs), the 1.0 spec is nowhere officially available
anymore.  Clearly, the Two Other OS vendors never made use of OHCI 1.1 features,
otherwise silicon vendors would have moved to OHCI 1.1 like everybody moved from
IEEE 1394:1995 to IEEE 1394a:2000.

Comment 13 Stefan Richter 2007-10-21 08:15:41 UTC
PS:
> The issue is also listed at
> http://wiki.linux1394.org/ToDo as a top priority item.

which shouldn't be misunderstood as "somebody is doing something about it".  I
wouldn't have bothered to create that ToDo page if there were active driver
developers.

Comment 14 Stefan Richter 2007-10-21 08:45:55 UTC
Created attachment 233591
firewire: fw-ohci: log a note about unsupported features

You can carry this patch over if you open an OHCI 1.0<->1.1 bug.

Comment 15 Will Woods 2007-10-21 22:36:46 UTC
Fix confirmed - libraw1394-1.3.0-2.fc8 does not abort when running test-dv or
dvgrab. 

As you noted, though, it still doesn't work - the ioctl() returns -1, because
the Agere FW323 in the Mac Mini and in the x86_64 test machine doesn't implement
OHCI 1.1. Definitely recommend filing a new bug against 'kernel' for that issue.

I'm gonna close this bug RAWHIDE, since the updated libraw1394 will be available
there shortly.. but don't forget to push an update for F7.

Thanks!

Comment 16 Jarod Wilson 2007-10-22 01:59:32 UTC
(In reply to comment #15)
> Fix confirmed - libraw1394-1.3.0-2.fc8 does not abort when running test-dv or
> dvgrab. 
> 
> As you noted, though, it still doesn't work - the ioctl() returns -1, because
> the Agere FW323 in the Mac Mini and in the x86_64 test machine doesn't
> implement OHCI 1.1.

Hrm... I'm thinking most (all?) OHCI 1.1+ controllers are the ones that include
800Mbps firewire support, based on the fact my powerbook is the only thing I've
got with an 800Mbps firewire port, and the only machine that reports as OHCI
1.1... Of course, this is only a sampling of less than half a dozen machines,
but it would seem to make some sense.

> Definitely recommend filing a new bug against 'kernel' for that issue.

Yeah, planning to do so in a sec...


> I'm gonna close this bug RAWHIDE, since the updated libraw1394 will be available
> there shortly.. but don't forget to push an update for F7.

Done. (Still 1.2.1-based, only added the double-free fix).

> Thanks!

NP, sorry for letting it sit for so long... :\


Comment 17 Jarod Wilson 2007-10-22 14:10:58 UTC
I've filed bug 344851 to cover (the lack of) ohci 1.0 isochronous I/O support,
and bug 345221 to cover dvgrab not actually capturing any video, even on a
system with an ohci 1.1 controller.

Comment 18 Stefan Richter 2007-10-22 15:38:34 UTC
> I'm thinking most (all?) OHCI 1.1+ controllers are the ones
> that include 800Mbps firewire support

Work on a successor to the OHCI 1.1:2000 spec was apparently started but not
finished.  Hence vendors of IEEE 1394b:2002 compliant cards (among them FireWire
800 cards) often market them as "OHCI 1.1+ compliant" or something like that,
but there is no such spec.

There are also IEEE 1394a:2000/ FireWire 400 cards which implement OHCI 1.1.

Comment 19 Fedora Update System 2007-10-24 07:05:22 UTC
libraw1394-1.2.1-10.fc7 has been pushed to the Fedora 7 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 20 Mateusz Kurtas 2007-10-26 12:42:49 UTC
I get this:

dvgrab
Found AV/C device with GUID 0x0000850000961567
ieee1394io.cc:456: In function "virtual bool iec61883Reader::StartReceive()":
"iec61883_dv_fb_start( m_iec61883.dv, channel )" evaluated to -1
ieee1394io.cc:456: errno: 22 (Invalid argument)

It captures for 5-10 seconds and then stops.

Comment 21 Jarod Wilson 2007-10-26 13:33:40 UTC
(In reply to comment #20)
> I get this:
> 
> dvgrab
> Found AV/C device with GUID 0x0000850000961567
> ieee1394io.cc:456: In function "virtual bool iec61883Reader::StartReceive()":
> "iec61883_dv_fb_start( m_iec61883.dv, channel )" evaluated to -1
> ieee1394io.cc:456: errno: 22 (Invalid argument)
> 
> It captures for 5-10 seconds and then stops.

You appear to have an OHCI 1.0 card, which won't work at the moment, so the
above is the expected behavior right now.


Comment 22 Mateusz Kurtas 2007-11-15 10:29:14 UTC
And when will OHCI 1.0 start working?

