Bug 169672

Summary: libdwfl kernel_report succeeds even if no debuginfo found
Product: [Fedora] Fedora
Reporter: Frank Ch. Eigler <fche>
Component: elfutils
Assignee: Roland McGrath <roland>
Status: CLOSED RAWHIDE
Severity: medium
Priority: medium    
Version: rawhide   
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2006-01-08 20:42:33 UTC

Description Frank Ch. Eigler 2005-09-30 21:06:11 UTC
systemtap uses the "dwfl_linux_kernel_report_kernel" entry point to ask elfutils
to locate the kernel-debuginfo.  Formerly, this used to fail by returning an errno
value or -1 if the debuginfo was not found.  But now it returns *zero* even in
case of failure.

This is because:
- report_kernel uses try_kernel_name to locate /vmlinux and, around line 91,
reports failure by returning errno, but
- try_kernel_name fails the open64 branch, and falls back to ...
- dwfl_standard_find_debuginfo, which sets errno to 0 upon failure
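
The chain above can be reproduced in miniature.  The sketch below is not the
elfutils source, and report_fixed is not the fix that went into any elfutils
release; every function name in it (fallback_find_debuginfo, try_name,
report_buggy, report_fixed) is a hypothetical stand-in.  It only shows how
"return errno" turns into "return 0" once a later helper has cleared errno.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical stand-in for a fallback lookup that fails and, as a side
   effect, leaves errno set to 0.  */
static int fallback_find_debuginfo(const char *name)
{
    (void) name;
    errno = 0;          /* the original failure reason is lost here */
    return -1;
}

/* Hypothetical stand-in for try_kernel_name: try a plain open first,
   then fall back to the debuginfo lookup.  */
static int try_name(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        fd = fallback_find_debuginfo(path);
    return fd;
}

/* Buggy pattern: trusts errno to still describe the failure.  */
int report_buggy(const char *path)
{
    int fd = try_name(path);
    if (fd < 0)
        return errno;   /* 0 if the fallback cleared errno */
    close(fd);
    return 0;
}

/* One possible guard (an assumption, not the actual library fix):
   never let a failure path return 0.  */
int report_fixed(const char *path)
{
    int fd = try_name(path);
    if (fd < 0)
        return errno != 0 ? errno : ENOENT;
    close(fd);
    return 0;
}
```

Called with a path that does not exist, report_buggy returns 0 (which the
caller reads as success) while report_fixed returns ENOENT.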

Until this bug is fixed, can you suggest an API call sequence against this
elfutils version that lets systemtap detect after the fact that debuginfo was
not in fact found?

Version-Release number of selected component (if applicable):
0.115-0.1

Comment 1 Roland McGrath 2005-10-01 03:24:04 UTC
(Please set "version" to "devel" when using the systemtap-elfutils.repo
elfutils, which is really the rawhide elfutils, not fc4 elfutils.)

It is indeed a bug that dwfl_linux_kernel_report_kernel returns 0 when it found
no kernel.  But note that actual success does not, in theory, mean there is debug
info.  (One could have an installation with a stripped vmlinux in /boot and a
/usr/lib/debug/boot/vmlinux-*.debug file, for example.)  For each Dwfl_Module
(the kernel and each .ko, in the kernel case), we can know about it, then we
might find an ELF file, and then we might find debug info (i.e. four total
states from "never heard of it" to complete success).  For each module, if
you want to know for sure that its debuginfo was found and is not grossly
corrupted, call dwfl_module_getdwarf: it either gives you the debug info or
fails with a detailed dwfl_err* error specific to that particular module.
Note you don't want to call that too eagerly: debuginfo is loaded only on
demand, so calling it on a module you are never actually going to examine
just slows you down.  (AFAIK systemtap is at the moment not prepared to avoid
examining all modules anyway, but it is something to keep in mind.)
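
The per-module states described above can be modeled as a small state
machine.  This is a toy model only: none of these types or functions
(mod_state, toy_module, toy_report, toy_getdwarf) exist in libdwfl, and the
two "available" flags merely pretend to be files on disk.  It is meant to
show why a successful report step says nothing about debug info, and why the
lookup is worth deferring until a module is actually examined.

```c
#include <stdbool.h>

/* Hypothetical model of the four per-module states; not libdwfl types.  */
enum mod_state {
    MOD_UNKNOWN,      /* never heard of it */
    MOD_REPORTED,     /* module is known, nothing located yet */
    MOD_HAVE_ELF,     /* ELF file found, debug info not found */
    MOD_HAVE_DEBUG    /* debug info found: complete success */
};

struct toy_module {
    const char *name;
    bool elf_available;    /* pretend an ELF file exists on disk */
    bool debug_available;  /* pretend a .debug file exists on disk */
    enum mod_state state;
    bool looked_up;        /* has the on-demand lookup run yet? */
    int lookups;           /* how many times the lookup really ran */
};

/* Reporting only records that the module exists; it locates no files.  */
void toy_report(struct toy_module *m)
{
    if (m->state == MOD_UNKNOWN)
        m->state = MOD_REPORTED;
}

/* On-demand check: the lookup runs once per module, on first use, and
   later calls reuse the cached state.  Returns true only when debug
   info was found.  */
bool toy_getdwarf(struct toy_module *m)
{
    if (m->state == MOD_UNKNOWN)
        return false;                 /* never reported */
    if (!m->looked_up) {
        m->looked_up = true;
        m->lookups++;
        if (m->elf_available)
            m->state = MOD_HAVE_ELF;
        if (m->elf_available && m->debug_available)
            m->state = MOD_HAVE_DEBUG;
    }
    return m->state == MOD_HAVE_DEBUG;
}
```

In this model a module with an ELF file but no debug file is reported
successfully and still ends in MOD_HAVE_ELF once toy_getdwarf is called,
which is the case a caller has to check for per module.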

Comment 2 Frank Ch. Eigler 2005-10-01 17:28:31 UTC
Thanks for the information.  systemtap now treats the dwfl_module_getdwarf
failures more specifically.  Fixing the _report return-code bug is not at all
urgent.

Comment 3 Roland McGrath 2005-11-02 19:52:42 UTC
The library bug ought to be fixed in 0.116.  Please verify.

Comment 4 Roland McGrath 2006-01-08 20:42:33 UTC
Frank is never going to test this.