Red Hat Bugzilla – Bug 877187
oprofile finds mismatched CRC between many runtime binaries and their respective debuginfo file
Last modified: 2013-02-12 23:27:08 EST
Created attachment 646013 [details]
testcase for reproducing the problem
Description of problem:
OProfile reports don't show symbol names for most samples, even though debuginfo packages are installed. When I was initially asked to help look at this, I noted that Fedora18/ppc64 was still using oprofile 0.9.7, which had a number of known debuginfo bugs. But even after upgrading to 0.9.8, we got nearly the same results. A system-wide profile showed we weren't getting symbol info for samples in various packages: mesa, gnome-shell, libX11, cairo, to name a few. When I generated a "--verbose=all" report, I saw that in each failing case, oprofile found that the CRC value stored in the runtime binary did not match the CRC it calculated from the debuginfo file's contents. This algorithm has worked for oprofile for 10+ years, so it's a mystery why it would be failing now.
Version-Release number of selected component (if applicable): 0.9.8
How reproducible: Consistent. Note that I've also seen this same issue on my RHEL 6.3 Intel laptop. But the problem is even worse on the laptop, since the glibc-debuginfo fails the CRC check. At least on Fedora18/ppc64, the glibc-debuginfo is processed OK by oprofile.
Steps to Reproduce:
1. Install openssl, openssl-devel, and openssl-debuginfo packages.
2. Compile the attached testcase (ssl_test.c): gcc -g ssl_test.c -lcrypto -o ssl
3. Assuming you have oprofile 0.9.8 installed, run 'operf ./ssl'.
4. Generate a debug info report: opreport --symbols --debug-info
5. There should be some samples for libcrypto, but no linenr or symbol information.
6. Re-run opreport and add "--verbose=all" and redirect the output to a file.
7. Open the report file and search for "found /usr/lib/debug/usr/lib64/libcrypto.so.1.0.1c.debug". Note the lines before and after. It should appear something like the following:
looking for debugging file libcrypto.so.1.0.1c.debug with crc32 = 33de6fcf
found /usr/lib/debug/usr/lib64/libcrypto.so.1.0.1c.debug with crc32 = 79848060
failed to process separate debug file
It fails to process the separate debug file due to the mis-match in the CRC.
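The comparison that fails here can be reproduced outside oprofile. The following is a minimal Python sketch, assuming the debuglink CRC is the standard CRC-32 that zlib.crc32 computes over the entire debug file (which is how the GDB documentation describes it). The function names are hypothetical and are not part of oprofile or GDB.

```python
import zlib


def crc32_of_chunks(chunks):
    """Fold an iterable of byte strings into one running CRC-32."""
    crc = 0
    for chunk in chunks:
        crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


def read_chunks(path, size=65536):
    """Yield the contents of a file in fixed-size binary chunks."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(size)
            if not chunk:
                return
            yield chunk


def debug_file_matches(expected_crc, debug_file_path):
    """Compare the CRC recorded in the runtime binary with the CRC of
    the separate debug file's full contents."""
    return crc32_of_chunks(read_chunks(debug_file_path)) == expected_crc
```

On the affected system, one would call debug_file_matches(0x33de6fcf, "/usr/lib/debug/usr/lib64/libcrypto.so.1.0.1c.debug"), where 0x33de6fcf is the value oprofile reported it was looking for. If the assumption above holds, the computed value should agree with the 79848060 that oprofile printed, and the comparison would fail in the same way.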
I did some testing to see if GDB might be encountering similar issues when using separate debuginfo files. My first test of running 'gdb ./ssl' (where 'ssl' is the attached testcase used to reproduce the oprofile problem) showed that gdb was able to successfully obtain the debug info for libcrypto. In digging into how gdb obtains debug info, I found that it can use two different techniques: the build-id technique and the CRC technique. I found the specific build-id files related to libcrypto, and I moved them to a different location so that gdb would not be able to find them. Then I re-ran the 'gdb ./ssl' test, setting a breakpoint for a function in libcrypto. Here are my results:
[mpj@dhcp-9-5-170-12 ~]$ gdb ./ssl
GNU gdb (GDB) Fedora (184.108.40.20620926-25.fc18)
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "ppc64-redhat-linux-gnu".
For bug reporting instructions, please see:
Reading symbols from /home/mpj/ssl...(no debugging symbols found)...done.
(gdb) b EVP_add_cipher
Function "EVP_add_cipher" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (EVP_add_cipher) pending.
Starting program: /home/mpj/ssl
warning: the debug information found in "/usr/lib/debug/usr/lib64/libcrypto.so.1.0.1c.debug" does not match "/lib64/libcrypto.so.10" (CRC mismatch).
warning: the debug information found in "/usr/lib/debug//usr/lib64/libcrypto.so.1.0.1c.debug" does not match "/lib64/libcrypto.so.10" (CRC mismatch).
warning: the debug information found in "/usr/lib/debug/usr/lib64//libcrypto.so.1.0.1c.debug" does not match "/lib64/libcrypto.so.10" (CRC mismatch).
Missing separate debuginfo for /lib64/libcrypto.so.10
Try: yum --disablerepo='*' --enablerepo='*debug*' install /usr/lib/debug/.build-id/d3/cc0484f7c0c1b5459eb5a74e75d5ab69c8ee57.debug
Breakpoint 1, 0x00000080c9f5f3a4 in .EVP_add_cipher () from /lib64/libcrypto.so.10
As can be seen above, when gdb is forced to fall back to its CRC technique for matching up separate debuginfo files with their corresponding runtime binary, it fails too, just like oprofile. This implies to me that the debuginfo files are either stale or have not been built correctly.
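The build-id file that gdb asks for in the "Try: yum ..." line follows a fixed naming scheme: the first two hex digits of the build-id name a subdirectory under .build-id, and the remaining digits plus ".debug" name the file. A small Python sketch of that mapping follows. The function name is hypothetical, and the default debug root is taken from the paths shown in the gdb output above.

```python
def build_id_debug_path(build_id_hex, debug_root="/usr/lib/debug"):
    """Map a hex build-id to the path of its separate debug file."""
    build_id_hex = build_id_hex.lower()
    is_hex = all(c in "0123456789abcdef" for c in build_id_hex)
    if len(build_id_hex) < 3 or not is_hex:
        raise ValueError("build-id must be a hex string of 3+ digits")
    # First byte (two hex digits) is the directory, the rest is the file name.
    return "{}/.build-id/{}/{}.debug".format(
        debug_root, build_id_hex[:2], build_id_hex[2:]
    )
```

Applied to the build-id implied by the gdb message, d3cc0484f7c0c1b5459eb5a74e75d5ab69c8ee57, this gives the same path gdb suggested installing.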
(In reply to comment #0)
> Created attachment 646013 [details]
> testcase for reproducing the problem
> Description of problem:
> OProfile reports don't show symbol names for most samples, even though
> debuginfo packages are installed.
> How reproducible: Consistent. Note that I've also seen this same issue on
> my RHEL 6.3 Intel laptop.
I have to retract this statement about also seeing this issue on my RHEL 6.3 laptop. When I was installing openssl packages (runtime, devel, and debuginfo) to set up my laptop to reproduce the "ssl_test.c" testcase (attached), I didn't notice that when I installed the devel package, yum picked the most recent version of devel package and then automatically upgraded the runtime package, thus making the runtime package a newer version than the openssl-debuginfo package I had installed. So the CRC mis-match oprofile later reported was actually correct in that case.
I wrote a patch for oprofile to add support for using the build-id to find/validate the debuginfo file as the primary mechanism, retaining the CRC method as a fallback (as GDB does). See the attachment to oprofile bug http://sourceforge.net/tracker/?func=detail&atid=116191&aid=3591165&group_id=16191. With this patch, oprofile works properly on Fedora 18, basically making the CRC mis-match issue moot.
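The lookup order described here (build-id first, CRC as fallback) can be illustrated with a short Python sketch. This is not the actual oprofile patch: the function name, its parameters, and the injected exists/crc_of_file callables are all hypothetical. The fallback directories are the three locations the GDB documentation lists for debuglink files. A real implementation would also confirm that the build-id stored inside the debug file matches, which this sketch skips.

```python
import os
import posixpath


def find_debug_file(build_id_hex, debuglink_name, expected_crc, binary_dir,
                    crc_of_file, debug_root="/usr/lib/debug",
                    exists=os.path.exists):
    """Return (path, method) for the separate debug file, or (None, None)."""
    # 1. Primary mechanism: look the file up by build-id.
    if build_id_hex:
        candidate = "{}/.build-id/{}/{}.debug".format(
            debug_root, build_id_hex[:2], build_id_hex[2:]
        )
        if exists(candidate):
            return candidate, "build-id"

    # 2. Fallback: look up by debuglink file name and validate with the CRC.
    if debuglink_name:
        search_dirs = (
            binary_dir,                            # next to the binary
            posixpath.join(binary_dir, ".debug"),  # .debug subdirectory
            debug_root + binary_dir,               # global debug directory
        )
        for directory in search_dirs:
            candidate = posixpath.join(directory, debuglink_name)
            if exists(candidate) and crc_of_file(candidate) == expected_crc:
                return candidate, "crc"

    return None, None
```

With this ordering, a stale or mismatched CRC only matters when no build-id file is installed, which is why the CRC mis-match stops being visible once the build-id path is tried first.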
Maynard, can you list the exact rpm versions (including arch field) for all the rpms that are related to your comment #0 situation?
(In reply to comment #4)
> Maynard, can you list the exact rpm versions (including arch field) for all
> the rpms that are related to your comment #0 situation?
I just reproduced this on "Fedora release 18 (Spherical Cow)". Here are the openssl RPMs I used:
oprofile-0.9.8-3.fc18 has been submitted as an update for Fedora 18.
* should fix your issue,
* was pushed to the Fedora 18 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing oprofile-0.9.8-3.fc18'
as soon as you are able to.
Please go to the following url:
then log in and leave karma (feedback).
oprofile-0.9.8-3.fc18 has been pushed to the Fedora 18 stable repository. If problems still persist, please make note of it in this bug report.