Bug 132043
| Summary: | x86_64 dvgrab issue | | |
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Dean Kolosiek <kolosiek> |
| Component: | dvgrab | Assignee: | Warren Togami <wtogami> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | rawhide | | |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2005-05-31 04:37:45 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Dean Kolosiek
2004-09-08 07:48:40 UTC
I made a simple test case. I get different results by adding -static
to the link:
#include <stdio.h>
#include <errno.h>
int main(int argv[], int argc)
{
int error_number;
char * error_string;
error_number = 13;
error_string = strerror(error_number);
printf("error_string: %s\n", error_string);
return 0;
}
[kolosiek@plato test]$ gcc -c -g -o test.o test.c
test.c: In function `main':
test.c:11: warning: assignment makes pointer from integer without a cast
[kolosiek@plato test]$ gcc -g -o test test.o
[kolosiek@plato test]$ ./test
Segmentation fault
[kolosiek@plato test]$ gcc -g -static -o test test.o
[kolosiek@plato test]$ ./test
error_string: Permission denied
strerror() is supposed to return a char* but the warning implies it
returns an integer. In the version linked without -static I get this
symbol for strerror from nm:
test: U strerror@@GLIBC_2.2.5
rpm lists glibc twice:
[kolosiek@plato test]$ rpm -q glibc
glibc-2.3.3-27
glibc-2.3.3-27
There seems to be a fundamental mismatch between includes, libraries,
rpms and/or man pages.
rpm listing glibc twice is probably because x86_64 installs i386 stuff
for compatibility. You can check for sure with:
rpm -q glibc --qf '%{name}-%{version}-%{release}(%{arch})\n'
Yes, that explains that part.
[kolosiek@plato test]$ rpm -q glibc --qf '%{name}-%{version}-%{release}(%{arch})\n'
glibc-2.3.3-27(i686)
glibc-2.3.3-27(x86_64)
<foo> warren: There are two different strerror_r implementations in glibc. One returns a string, the other an int. If the char* return value is expected but the compiler sees the int return variant, the program might crash
<arjan> but... he wasn't calling strerror() without _r ?
<foo> arjan: yeah, maybe
<foo> warren: if error_string = strerror(error_number); leads to that warning it simply means <string.h> has not been included
<foo> the "but this is not the warning from the original build." is irritating, though
<foo> if the program really uses strerror (run nm to find out) then the return value is a pointer. If the compiler generates the warning shown in the bug then the strerror prototype has not been seen, which means <string.h> has not been included
<foo> but this guy also talks about the warning not being printed when the real package is built

Warren is adding this to his TODO for this week, to test locally on x86_64. In the meantime please report any new findings.

<string.h> was not included. I added the include and it works now, without linking -static, for both my test case and my modified dvgrab. However, I still don't understand why -static made a difference to the seg fault when <string.h> was not included.

By "but this is not the warning from the original build." I meant that my debugging code added another warning. There are several calls to strerror() in raw1394util.c, but they are inside fprintf() calls, so there is no type checking on them. My debugging code moved the strerror() call outside of fprintf(), which resulted in it being type checked. The original line was:
fprintf( stderr, "raw1394 - failed to get handle: %s.\n", strerror( errno ) );
I haven't even gotten to the warning in the original build, but it's another type check on a separate line of code.

I tried stepping through strerror() once, but it got into a lot of localization routines to find the error message and I got confused.
I noticed the man page for strerror implies that strerror is not thread-safe, but I think dvgrab is multithreaded. My test case above is definitely not multithreaded, however.

By "it works now", I mean that line of code with strerror(). I still don't have dvgrab grabbing video.

Not a glibc issue.

I changed the summary. Maybe even the x86-64 should be removed.

I got back to dvgrab today. Now I'm really confused: even with <string.h> included in dvgrab, I still see the behavior in variables I originally reported while stepping through the code in the debugger, where the parameter error_number appears to be clobbered by strerror() and becomes 1655725376, and error_string is set to 0x01. The code prints an error string, so it must be confusion in the debugger. I was going nuts for a while trying to figure out why it broke again, and why strerror() clobbered the parameter.

I figured out why -static makes it work without <string.h>. Without <string.h>, the return type of strerror in the assignment defaults to int, which is 32 bits. Without -static, the messages are at an address that does not fit in 32 bits, so it gets truncated in the assignment. With -static, the messages are at an address that fits within 32 bits. This problem must only appear on 64-bit machines.

The warning in the original build is another 64-bit pointer/32-bit int warning. The cast is from a 32-bit int to a 64-bit pointer, so the data fits, but there's still a warning. As far as I can tell the offending line has no functional use, except perhaps debugging. It stores a value that I don't see retrieved. My brain hurts.

I got it working on my machine by installing the 2.6.8-1.521 kernel.
I wrote problem reports DVGRAB-39 and DVGRAB-40 at http://kino.schirmacher.de/

[afolger@localhost projects]$ uname -r
2.6.8-1.603
[afolger@localhost projects]$ rpm -q dvgrab
dvgrab-1.6-1
[afolger@localhost projects]$ dvgrab foo
Segmentation fault
========================
So, even in 1.6, this issue seems not to have been fully worked out. I am not a C programmer, so I can't help out in that sense, but will gladly test the result.

Let me hurry to add:
[afolger@localhost projects]$ cat /etc/fedora-release
Fedora Core release 3 (Heidelberg)
============================
So this remained even after fc2.

I should have said that the seg fault occurs instead of printing an error message that starts out "raw1394 - failed to get handle: " because it can't access the camera. Fixing the seg fault in the code gets the user a better error message. After upgrading the kernel I still had to start the Firewire stuff running:
su
/sbin/modprobe ohci1394
/sbin/modprobe ieee1394
/sbin/modprobe raw1394
chmod 666 /dev/raw1394
Verify with more /proc/modules.

The fix for the seg fault is easy; they just haven't made a release with it. It just needs #include <string.h> in raw1394util.c.

They released 1.7 without fixing it.

Well, in the meantime, I uninstalled dvgrab.x86_64 and installed the i386 version instead (long live bi-arch!), and ... it outputs "raw1394 - failed to get handle: ". So I modprobed raw1394 and dv1394, and now it no longer complains about that, ... but complains that "raw1394 - failed to get handle: Invalid argument". Googling didn't bring up useful, up-to-date discussions (it's mostly about migration problems from 2.4 to 2.6, which is no longer very relevant). So, I still have a dvgrab issue. What now? (Do I need to post a separate bug report for this new thing?) Should the issue be reclassified as fc3, instead of fc2?

This bug became too confusing with multiple different issues reported. If you still have problems, report that SINGLE problem in a new report.

Anyway the prob seems to go away if you do your modprobes.