Description of problem:
oprofile does not seem to understand separate debuginfo. For example, even after I install scim-bridge-debuginfo, opreport -l does not show detailed symbol information from scim-bridge.

There seems to be a related fix (the patch changing bfd_support.*) in oprofile 0.9.6. However, the resulting opreport does not work at all on my kernel, whether I rebuild and install the entire oprofile-0.9.6-2.fc13.src.rpm or backport just the bfd_support patches to 0.9.5-4.fc12.

Overall, the support of separate debuginfo in libbfd-based applications still seems to be quite messy. I can't even persuade addr2line to make use of the information, although eu-addr2line does.

Version-Release number of selected component (if applicable):
kernel-2.6.31.9-174.fc12.x86_64
oprofile-0.9.5-4.fc12.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Install the debuginfo package of some user-space application or library, e.g. scim-bridge
2. Use oprofile to collect some samples in the application.
3. Run opreport -l

Actual results:
Only the executable or library names are shown in the "symbol name" column, even when the relevant debuginfo has been installed.

Expected results:
The relevant symbol names should be shown.

Additional info:
Created attachment 386296
The aforementioned patch in 0.9.6

This is the aforementioned patch in 0.9.6. It seems to assume that corresponding sections have the same number in the executable/library and in the debuginfo file. However, this is not true for executables on Fedora 12. For example, in scim-bridge-0.4.16-2.fc12.x86_64, .dynstr is section 30 in /usr/bin/scim-bridge according to readelf -SW, but it is section 6 in /usr/lib/debug/usr/bin/scim-bridge.debug.
Created attachment 386297
Workaround that simply ignores SEC_LOAD

This patch to 0.9.5 simply removes the SEC_LOAD check in interesting_symbol(). It seems to work for me, although I'm not sure of its implications.

(SEC_LOAD is usually not set for sections in the .debug file because the sections are NOBITS there; see bfd/elf.c:_bfd_elf_make_section_from_shdr(). This makes opreport etc. ignore all symbols from the .debug file. For certain libraries such as libc, the library itself is not stripped, so this inability to load separate debuginfo may not be as apparent.)
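To illustrate the effect described in this comment, here is a minimal, self-contained sketch that does not use libbfd. The flag constants, the Section and Symbol structs, and keep_symbol() are stand-ins invented for this example; they are not oprofile's interesting_symbol() nor the real SEC_* values from bfd.h. It assumes the debug file's .text still carries a code flag but no load flag, as happens when the section is NOBITS.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Stand-in flag bits for this sketch only; the real SEC_* values come from bfd.h.
constexpr unsigned kSecLoad = 1u << 0;
constexpr unsigned kSecCode = 1u << 1;

struct Section {
    std::string name;
    unsigned flags;
};

struct Symbol {
    std::string name;
    const Section* section;
};

// Hypothetical filter: keep code symbols, optionally requiring a loadable section.
bool keep_symbol(const Symbol& sym, bool require_load) {
    assert(sym.section != nullptr);
    if (!(sym.section->flags & kSecCode)) return false;
    if (require_load && !(sym.section->flags & kSecLoad)) return false;
    return true;
}

// Count how many symbols survive the filter.
std::size_t count_kept(const std::vector<Symbol>& syms, bool require_load) {
    std::size_t kept = 0;
    for (const Symbol& sym : syms)
        if (keep_symbol(sym, require_load)) ++kept;
    return kept;
}
```

With require_load set, every symbol whose section lacks the load flag is dropped, which is the symptom seen with the .debug file. With it cleared, the same symbols are kept, which is roughly what the workaround does.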
oprofile-0.9.6-2.fc12 has been submitted as an update for Fedora 12. http://admin.fedoraproject.org/updates/oprofile-0.9.6-2.fc12
oprofile-0.9.6-2.fc12 is broken in the way described in Comment 1. opreport simply dies with "opreport error: profile_t::samples_range(): start > end something wrong with kernel or module layout ?" whenever the profile includes any executable with separate debuginfo, so the end-user experience is even worse.

For example, to reproduce this bug with gcc:
1. Install oprofile-0.9.6-2.fc12
2. Install gcc-4.4.3-4.fc12.x86_64 and gcc-debuginfo-4.4.3-4.fc12.x86_64
3. Run the following commands:

# readelf -SW /usr/libexec/gcc/x86_64-redhat-linux/4.4.3/cc1 | grep dynstr
  [19] .dynstr STRTAB 0000000000c5459c 85459c 000849 00 A 0 0 1
# readelf -SW /usr/lib/debug/usr/libexec/gcc/x86_64-redhat-linux/4.4.3/cc1.debug | grep dynstr
  [ 6] .dynstr NOBITS 0000000000401818 000248 00082d 00 A 0 0 1
# opcontrol --reset && opcontrol --start && sleep 20 && opcontrol --stop
Signalling daemon... done
Profiler running.
(Run gcc in the meantime, e.g. by building something)
Stopping profiling.
# opreport -l | less
...
opreport error: profile_t::samples_range(): start > end something wrong with kernel or module layout ? please report problem to oprofile-list.net
# opreport.my -l | less
(This is the 0.9.5-4 opreport with the patch in comment 2, which appears to work)
...
samples  %       image name      app name  symbol name
9568     1.5946  libc-2.11.1.so  cc1       _int_malloc
8816     1.4693  cc1             cc1       _cpp_lex_direct
8687     1.4478  cc1             cc1       ht_lookup_with_hash
6518     1.0863  cc1             cc1       htab_find_slot_with_hash
...
Eliminating the SEC_LOAD check on 0.9.5 seems to remove the symptom. However, the same patch doesn't help with oprofile-0.9.6, so it is likely not dealing with the root cause.
Well, since the section numbers can differ between the image and the debuginfo files, I guess we have to fix bfd_info::translate_debuginfo_syms() so that it compares section names instead. An alternative is to remove the call to translate_debuginfo_syms() together with the SEC_LOAD check, thus reverting to 0.9.5 behavior. Since translate_debuginfo_syms() seems to be used elsewhere, though, it isn't a thorough fix.
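A rough sketch of the name-based matching suggested here, written without libbfd so that it stands alone. The Section struct and the helpers find_by_name(), find_by_index() and map_section_index() are hypothetical names made up for this illustration; this is not the code in translate_debuginfo_syms().

```cpp
#include <cassert>
#include <string>
#include <vector>

struct Section {
    int index;         // section number as printed by readelf -SW
    std::string name;  // e.g. ".dynstr"
};

// Return the section with the given name, or nullptr if there is none.
const Section* find_by_name(const std::vector<Section>& sections,
                            const std::string& name) {
    for (const Section& s : sections)
        if (s.name == name) return &s;
    return nullptr;
}

// Return the section with the given number, or nullptr if there is none.
const Section* find_by_index(const std::vector<Section>& sections, int index) {
    for (const Section& s : sections)
        if (s.index == index) return &s;
    return nullptr;
}

// Translate a section number in the image into the number of the section
// with the same name in the debuginfo file; -1 if no counterpart exists.
int map_section_index(const std::vector<Section>& image,
                      const std::vector<Section>& debug,
                      int image_index) {
    assert(image_index >= 0);
    const Section* in_image = find_by_index(image, image_index);
    if (!in_image) return -1;
    const Section* in_debug = find_by_name(debug, in_image->name);
    return in_debug ? in_debug->index : -1;
}
```

With the cc1 numbers quoted earlier in this report (.dynstr is section 19 in the image and section 6 in cc1.debug), mapping image section 19 gives 6, whereas reusing the number 19 directly would point at an unrelated section or at nothing.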
The difference in the section numbers is unlikely to be the problem. I compiled the stock oprofile 0.9.4 with the oprofile-basename.patch configuration patch. That oprofile 0.9.4 is able to read separate debuginfo and display the results.

I ran opreport with "--verbose=all" to get a better idea of what is going on. For the good run (oprofile 0.9.4), the output contains the following:

...
bfd_info::get_symbols() for /usr/lib/debug/bin/rm.debug
bfd_get_symtab_upper_bound: 2688
bfd_canonicalize_symtab: 335
number of symbols before filtering 134
number of symbols now 134
...
symbol atexit, value 7270 start 8d90, end 8db0 in section .text, filepos 1b20
symbol __do_global_ctors_aux, value 7290 start 8db0, end 8de8 in section .text, filepos 1b20
symbol _fini, value 0 start 8de8, end e058 in section .fini, filepos 8de8

For the bad run (upstream oprofile), the output contains the following:

...
now loading: /usr/lib/debug/bin/rm.debug
bfd_info::get_symbols() for /usr/lib/debug/bin/rm.debug
bfd_get_symtab_upper_bound: 2688
bfd_canonicalize_symtab: 335
number of symbols before filtering 151
number of symbols now 151
...
symbol __libc_csu_init, value 71e0 start 88a8, end 8938 in section .plt, filepos 16c8
symbol atexit, value 7270 start 8938, end 8958 in section .plt, filepos 16c8
symbol __do_global_ctors_aux, value 7290 start 8958, end 1b18 in section .plt, filepos 16c8

The end value of 1b18 for symbol __do_global_ctors_aux looks pretty suspect, since it is smaller than the start value of 8958.
Created attachment 405000
Reverted a patch for overlay symbols for Cell SPE applications

Reverting this patch allows oprofile to work with the separate debuginfo file. The patch changes how the end and the resulting size are computed. It adds the following function:

unsigned long op_bfd_symbol::symbol_endpos(void) const
{
        return bfd_symbol->section->filepos + bfd_symbol->section->size;
}

The use of this function is causing oprofile to fail.
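As a worked check of that arithmetic against the bad-run log in the previous comment, here is a small standalone sketch. SectionInfo, symbol_startpos() and range_ok() are names invented for this example, and symbol_endpos() here only mirrors the sum in the function quoted above. The .plt size is not shown in the log; 0x450 is an assumption, chosen because it is the value that makes filepos + size equal the logged end of 1b18. The start formula (filepos + value) is likewise inferred from the logged numbers, not taken from the oprofile source.

```cpp
#include <cassert>
#include <cstdint>

struct SectionInfo {
    std::uint64_t filepos;  // file offset of the section
    std::uint64_t size;     // section size in bytes
};

// Same sum as the quoted function: end of the symbol = end of its section.
std::uint64_t symbol_endpos(const SectionInfo& s) {
    assert(s.size <= UINT64_MAX - s.filepos);  // guard against wraparound
    return s.filepos + s.size;
}

// Inferred from the log lines: start = section filepos + symbol value.
std::uint64_t symbol_startpos(const SectionInfo& s, std::uint64_t value) {
    return s.filepos + value;
}

// A sample range is only usable when start does not exceed end.
bool range_ok(std::uint64_t start, std::uint64_t end) {
    return start <= end;
}
```

For __do_global_ctors_aux in the bad run, filepos 0x16c8 plus value 0x7290 gives a start of 0x8958, while filepos 0x16c8 plus the assumed size 0x450 gives an end of 0x1b18. The start is past the end, which is consistent with the "start > end" error from samples_range().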
oprofile-0.9.6-2.fc12 has been pushed to the Fedora 12 testing repository. If problems still persist, please make note of it in this bug report. If you want to test the update, you can install it with su -c 'yum --enablerepo=updates-testing update oprofile'. You can provide feedback for this update here: http://admin.fedoraproject.org/updates/oprofile-0.9.6-2.fc12
There has been discussion on the oprofile mailing list about this problem. A pointer to the oprofile email containing a proposed patch for this: http://marc.info/?l=oprofile-list&m=127119391506908&w=2
oprofile-0.9.6-5.fc12 has been submitted as an update for Fedora 12. http://admin.fedoraproject.org/updates/oprofile-0.9.6-5.fc12
oprofile-0.9.6-5.fc12 has been pushed to the Fedora 12 testing repository. If problems still persist, please make note of it in this bug report. If you want to test the update, you can install it with su -c 'yum --enablerepo=updates-testing update oprofile'. You can provide feedback for this update here: http://admin.fedoraproject.org/updates/oprofile-0.9.6-5.fc12
oprofile-0.9.6-5.fc12 has been pushed to the Fedora 12 stable repository. If problems still persist, please make note of it in this bug report.