Bug 132043 - x86_64 dvgrab issue
Status: CLOSED CURRENTRELEASE
Product: Fedora
Classification: Fedora
Component: dvgrab
Version: rawhide
Hardware: x86_64 Linux
Priority: medium, Severity: medium
Assigned To: Warren Togami
Reported: 2004-09-08 03:48 EDT by Dean Kolosiek
Modified: 2007-11-30 17:10 EST (History)
Last Closed: 2005-05-31 00:37:45 EDT
Description Dean Kolosiek 2004-09-08 03:48:40 EDT
Description of problem:

Running dvgrab immediately results in a segmentation fault.

Version-Release number of selected component (if applicable):

dvgrab-1.5

How reproducible:

always

Steps to Reproduce:
1. dvgrab foo
  
Actual results:

Segmentation fault

Expected results:

Capture some video from camera.

Additional info:

I've tried building dvgrab-1.6 to debug it. It builds with a warning
on a type cast, and a fault occurs in the first fprintf in
raw1394util.c in raw1394_get_num_ports. I added some debugging lines

int error_number;
char * error_string;
...
error_number = errno;
error_string =  strerror(error_number);

The second line results in the warning

raw1394util.c:30: warning: assignment makes pointer from integer
without a cast

but this is not the warning from the original build.

While stepping through the assignments I added, error_number is
initially 13; after the call to strerror(), error_number became
1655725376 and error_string was 0x1, as if invoking a function like

int strerror_r(int errnum, char *buf, size_t n);

instead of invoking

char *strerror(int errnum);

There must be something wrong with how dvgrab is built, but I don't
see the problem. I wonder if it is specific to x86_64.
Comment 1 Dean Kolosiek 2004-09-12 17:34:19 EDT
I made a simple test case. I get different results by adding -static
to the link:

#include <stdio.h>
#include <errno.h>
 
int main(int argc, char *argv[])
{
        int error_number;
        char * error_string;
 
        error_number = 13;
        error_string =  strerror(error_number);
        printf("error_string: %s\n", error_string);
 
        return 0;
}

[kolosiek@plato test]$ gcc -c -g -o test.o test.c 
test.c: In function `main':
test.c:11: warning: assignment makes pointer from integer without a cast
[kolosiek@plato test]$ gcc -g -o test test.o 
[kolosiek@plato test]$ ./test 
Segmentation fault
[kolosiek@plato test]$ gcc -g -static -o test test.o
[kolosiek@plato test]$ ./test
error_string: Permission denied

strerror() is supposed to return a char* but the warning implies it
returns an integer. In the version linked without -static I get this
symbol for strerror from nm:

test:                 U strerror@@GLIBC_2.2.5

rpm lists glibc twice:

[kolosiek@plato test]$ rpm -q glibc
glibc-2.3.3-27
glibc-2.3.3-27

There seems to be a fundamental mismatch between includes, libraries,
rpms and/or man pages.
Comment 2 Warren Togami 2004-09-12 17:49:29 EDT
rpm listing glibc twice is probably because x86_64 installs i386 stuff
for compatibility.  You can check for sure with:
rpm -q glibc --qf '%{name}-%{version}-%{release}(%{arch})\n'
Comment 3 Dean Kolosiek 2004-09-12 18:01:12 EDT
Yes, that explains that part.

[kolosiek@plato test]$ rpm -q glibc --qf '%{name}-%{version}-%{release}(%{arch})\n'
glibc-2.3.3-27(i686)
glibc-2.3.3-27(x86_64)
Comment 4 Warren Togami 2004-09-13 05:43:44 EDT
<foo> warren: There are two different strerror_r implementations in
glibc.  one returns a string, the other an int.  If the char* return
value is expected but the compiler sees the int return variant, the
program might crash
<arjan> but... he wasn't calling strerror() without _r ?
<foo> arjan: yeah, maybe
<foo> warren: if error_string =  strerror(error_number); leads to that
warning it simply means <string.h> has not been included
<foo> the "but this is not the warning from the original build." is
irritating, though
<foo> if the program really uses strerror (run nm to find out) then
the return value is a pointer.  If the compiler generates the warning
shown in the bug then the strerror prototype has not been seen, which
means <string.h> has not been included
<foo> but this guy also talks about the warning not being printed when
the real package is built

Warren is adding this to his TODO for this week to test locally on x86_64.
In the meantime, please report any new findings.
Comment 5 Dean Kolosiek 2004-09-13 18:15:27 EDT
<string.h> was not included. I added the include and it works now,
without linking -static, for both my test case and my modified dvgrab.
However, I still don't understand why -static made a difference to the
seg fault when <string.h> was not included.

By "but this is not the warning from the original build." I meant that
my debugging code added another warning. There are several calls to
strerror() in raw1394util.c but they are inside fprintf() calls, so
there is no type checking on them. My debugging code moved the
strerror() outside of fprintf() which resulted in it being type
checked. The original line was:
fprintf( stderr, "raw1394 - failed to get handle: %s.\n", strerror(
errno ) );

I haven't even gotten to the warning in the original build, but it's
another type check on a separate line of code.

I tried stepping through strerror() once, but it got into a lot of
localization routines to find the error message and I got confused.

I noticed the man page for strerror implies that strerror is not
thread-safe, but I think dvgrab is multithreaded. My test case above
is definitely not multithreaded, however.
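
On the thread-safety point: POSIX offers strerror_r() as a reentrant alternative, and glibc ships two variants of it (one returning int, one returning char *), as the IRC excerpt in comment 4 mentions. A minimal sketch of a wrapper that copes with either, assuming the glibc feature-test macros decide which variant is declared. error_text is a hypothetical helper, not something in dvgrab:

```c
/* Ask for the POSIX (XSI) strerror_r unless the build already chose a feature set. */
#if !defined(_POSIX_C_SOURCE) && !defined(_GNU_SOURCE)
#define _POSIX_C_SOURCE 200112L
#endif

#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write the message for errnum into buf and return buf. */
static const char *error_text(int errnum, char *buf, size_t len)
{
    if (len == 0)
        return "";
#if defined(__GLIBC__) && defined(_GNU_SOURCE)
    /* GNU variant: returns char *, which may or may not point into buf. */
    const char *msg = strerror_r(errnum, buf, len);
    if (msg != buf)
        snprintf(buf, len, "%s", msg);
#else
    /* XSI variant: returns int, 0 on success. */
    if (strerror_r(errnum, buf, len) != 0)
        snprintf(buf, len, "Unknown error %d", errnum);
#endif
    return buf;
}
```

A caller would pass its own buffer, for example char msg[128]; error_text(errno, msg, sizeof msg), so no shared static storage is involved.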
Comment 6 Dean Kolosiek 2004-09-13 18:28:20 EDT
By "it works now", I mean that the line of code with strerror() works.
I still don't have dvgrab grabbing video.
Comment 7 Ulrich Drepper 2004-09-15 18:07:03 EDT
Not a glibc issue.  I changed the summary.  Maybe even the x86-64
should be removed.
Comment 8 Dean Kolosiek 2004-09-16 22:11:59 EDT
I got back to dvgrab today. Now I'm really confused - even with
<string.h> included in dvgrab, I still see the behavior in variables I
originally reported while stepping through the code in the debugger,
where the parameter error_number appears to be clobbered by strerror()
and became 1655725376, and error_string was set to 0x01. The code
prints an error string, so it must be confusion in the debugger. I was
going nuts for a while trying to figure out why it broke again, and
why strerror() clobbered the parameter.

I figured out why -static makes it work without <string.h>. Without
<string.h> the return type of strerror in the assignment is defaulted
to int which is 32 bits. Without -static, the messages are at an
address that is bigger than 32 bits that gets truncated in the
assignment. With -static the messages are at an address that fits
within 32 bits. This problem must only appear on 64 bit machines.
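
The truncation described above can be modeled without relying on undefined behavior. A rough sketch, assuming the usual x86_64 behavior where the 32-bit int result is sign-extended when it is converted back to a 64-bit pointer. through_int32 and the addresses in the note below are made up for illustration:

```c
#include <stdint.h>

/* Hypothetical helper: model a 64-bit pointer value passing through a
 * 32-bit int, as happens when strerror() is called without a prototype. */
static uint64_t through_int32(uint64_t address)
{
    uint64_t low = address & 0xFFFFFFFFULL;   /* upper 32 bits are lost */

    /* If bit 31 is set, the int is negative and sign-extends to all ones
     * in the upper half when widened to 64 bits. */
    if (low & 0x80000000ULL)
        return low | 0xFFFFFFFF00000000ULL;
    return low;
}
```

For example, through_int32(0x00007f3a62b0d2c0) gives 0x62b0d2c0, which no longer points at the message, while a low address such as 0x4a1b20 comes back unchanged, which matches the -static case.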

The warning in the original build is another 64 bit pointer/32 bit int
warning. The cast is from int 32 to pointer 64, so the data fits, but
there's still a warning. As far as I can tell the offending line has
no functional use, except perhaps debugging. It stores a value that I
don't see retrieved.

My brain hurts.
Comment 9 Dean Kolosiek 2004-11-03 20:42:22 EST
I got it working on my machine by installing the 2.6.8-1.521 kernel.

I wrote problem reports DVGRAB-39 and DVGRAB-40 at
http://kino.schirmacher.de/
Comment 10 A. Folger 2005-01-01 18:08:28 EST
[afolger@localhost projects]$ uname -r 
2.6.8-1.603 
[afolger@localhost projects]$ rpm -q dvgrab 
dvgrab-1.6-1 
[afolger@localhost projects]$ dvgrab foo 
Segmentation fault 
======================== 
 
So, even in 1.6, this issue seems not to have been fully worked out. 
I am not a C programmer, so I can't help out in that sense, but will 
gladly test the result. 
Comment 11 A. Folger 2005-01-01 18:10:14 EST
Let me hurry to add: 
[afolger@localhost projects]$ cat /etc/fedora-release 
Fedora Core release 3 (Heidelberg) 
============================ 
 
So that this remained even after fc2. 
Comment 12 Dean Kolosiek 2005-01-01 19:05:44 EST
I should have said that the seg fault occurs instead of printing an
error message that starts out   "raw1394 - failed to get handle: "
because it can't access the camera. Fixing the seg fault in the code
gets the user a better error message.

After upgrading the kernel I still had to start the Firewire stuff
running:

su
/sbin/modprobe ohci1394
/sbin/modprobe ieee1394
/sbin/modprobe raw1394
chmod 666 /dev/raw1394
Verify with more /proc/modules.

The fix for the seg fault is easy, they just haven't made a release
with it. It just needs #include <string.h> in raw1394util.c. They
released 1.7 without fixing it.
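
A minimal sketch of the fix described above, assuming the failing call looks like the fprintf line quoted in comment 5. report_handle_failure is a hypothetical wrapper used only to make the snippet self-contained; it is not a function from raw1394util.c:

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>   /* declares char *strerror(int), so the full pointer is kept */

/* Hypothetical wrapper around the error message quoted in this report. */
static int report_handle_failure(FILE *out, int errnum)
{
    /* With the prototype visible, strerror() is known to return char *,
     * so the %s argument is a valid 64-bit pointer. */
    return fprintf(out, "raw1394 - failed to get handle: %s.\n",
                   strerror(errnum));
}
```

Building with -Wall (and, on current GCC, -Werror=implicit-function-declaration) would flag the missing prototype at compile time instead of leaving it to a crash at run time.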
Comment 13 A. Folger 2005-01-02 03:57:29 EST
Well, in the mean time, I uninstalled dvgrab.x86_64 and installed the 
i386 version instead (long live bi-arch!), and ... it outputs 
"raw1394 - failed to get handle: ". So I modprobed raw1394 and 
dv1394, and now it no longer complains about that, ... but complains 
that "raw1394 - failed to get handle: Invalid argument". 
 
Googling didn't quite bring up useful, up to date discussions (it's 
mostly about migration problems from 2.4 to 2.6, which is no longer 
very relevant). So, I still have a dvgrab issue. What now? (need I 
post a separate bug report for this new thing?) 
 
Should the issue be reclassified as fc3, instead of fc2? 
Comment 14 Warren Togami 2005-05-31 00:37:45 EDT
This bug became too confusing with multiple different issues reported.  If you
still have problems, report that SINGLE problem in a new report.
Comment 15 Martin Ellison 2006-01-31 02:16:54 EST
Anyway the prob seems to go away if you do your modprobes.