Description of problem:
The module bias returned by dwfl_module_getdwarf() is incorrect for a prelinked "/usr/bin/find", causing that app to segfault when probed by systemtap due to badly aligned breakpoint addresses. Running "prelink -u /usr/bin/find" avoids the issue.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
With "find" being the binary after "prelink -u", and "find.prelink" being the original prelinked binary:
$ eu-addr2line -e find 0x403ea0
$ eu-addr2line -e find.prelink 0x403ea0
That address is main(). Prelink didn't move .text, so it should find the same source:line both times.
Should be fixed upstream as of commit 3a44c9a (there are three consecutive commits that add up to the whole fix).
For the purposes of a test case, two issues were relevant here (which turned up three bugs, in fact).
First was that .interp changed address after prelink. This was not correctly handled by libdwfl, though it didn't actually matter in this case. libdwfl was failing to decode PT_INTERP from .gnu.prelink_undo at all for 64-bit files. libdwfl was also using PT_INTERP from .gnu.prelink_undo when considering the prelinked file's sections, which was wrong--it had to consider the file's PT_INTERP as set by prelink to match it up with the .interp as moved by prelink. Those issues all matter only for the purpose of excluding .interp from consideration as a synchronization address. This would not come up unless .interp was the only PROGBITS/NOBITS section in the file, or the highest-addressed one. Those are things that could happen in a valid ELF file, but will never happen in practice.
The second issue was that prelink split the .bss section into .dynbss and .bss, thus changing the sh_addr of .bss. So, while the actual contents at the same addresses continued to match up, the sh_addr of .bss did not. Since .bss was the highest-addressed PROGBITS/NOBITS section, its sh_addr was chosen as the synchronization address, and because that value differed between the prelinked and original section layouts, the resulting bias was wrong. The new code uses the highest sh_addr+sh_size (section ending address) instead, which the split leaves unchanged.
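The difference between the two choices of synchronization address can be illustrated with a small sketch. The section layouts below are made up for illustration (the addresses and sizes are not taken from /usr/bin/find), and the function names are hypothetical rather than libdwfl's API. It only shows why the highest section ending address survives a .bss split while the highest sh_addr does not.

```python
# Illustrative only: hypothetical section tables as (name, sh_addr, sh_size).
# The numbers are invented, not read from a real binary.

UNDO_SECTIONS = [          # layout as recorded before prelinking
    (".text", 0x403000, 0x2000),
    (".data", 0x606000, 0x400),
    (".bss",  0x606400, 0x300),
]

PRELINKED_SECTIONS = [     # prelink carved .dynbss out of the start of .bss
    (".text",   0x403000, 0x2000),
    (".data",   0x606000, 0x400),
    (".dynbss", 0x606400, 0x100),
    (".bss",    0x606500, 0x200),
]


def highest_start(sections):
    """Old approach: the highest sh_addr of any section."""
    return max(addr for _name, addr, _size in sections)


def highest_end(sections):
    """New approach: the highest sh_addr + sh_size (section ending address)."""
    return max(addr + size for _name, addr, size in sections)


# Mismatch between the two layouts' synchronization addresses.
old_mismatch = highest_start(PRELINKED_SECTIONS) - highest_start(UNDO_SECTIONS)
new_mismatch = highest_end(PRELINKED_SECTIONS) - highest_end(UNDO_SECTIONS)

print(hex(old_mismatch))  # 0x100: .bss start moved, so the layouts disagree
print(hex(new_mismatch))  # 0x0: the end of .bss did not move
```

In this made-up layout the old choice is off by exactly the size of .dynbss, while the ending address is identical in both layouts.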
I was trying to write a test for this, and 'prelink -u /usr/bin/find' does not fix things for me (still segfaults under systemtap). Am I doing something wrong here?
I have package versions identical to those in the initial comment, and I see the same eu-addr2line output for both the prelinked and un-prelinked binaries.
You probably need to clear your ~/.systemtap/cache directory. With the same script, stap will see the same build ID for /usr/bin/find and reuse the module compiled the first time.
Actually, stap should still use a different cache id, as the file size will have changed. Besides, caching doesn't come into play until pass 3/4, and it is in pass 2 that it would (incorrectly) come up with the different module bias.
Do you by chance have /bin in your PATH before /usr/bin? "find" exists in both places, the latter actually being a symlink. Running prelink on the symlink will separate the two, so we should probably run everything against /bin/find instead.
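One way to check which copy of "find" a PATH lookup picks up, and whether that copy is a symlink, is sketched below. The helper name describe_command is hypothetical; it relies only on Python's standard shutil.which and os.path functions.

```python
import os
import shutil


def describe_command(name, path=None):
    """Return (found_path, real_path, is_symlink) for the first PATH match.

    `path` is an optional PATH-style string; when None, the PATH
    environment variable is searched. Returns None if nothing is found.
    """
    found = shutil.which(name, path=path)
    if found is None:
        return None
    return found, os.path.realpath(found), os.path.islink(found)


if __name__ == "__main__":
    print(describe_command("find"))
```

If the first element and the second element of the result differ, the command found first on PATH is a symlink (or sits behind one), and prelink should be pointed at the resolved path.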
elfutils-0.152-1.fc13 has been submitted as an update for Fedora 13.
elfutils-0.152-1.fc14 has been submitted as an update for Fedora 14.
elfutils-0.152-1.fc14 has been pushed to the Fedora 14 testing repository. If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with
su -c 'yum --enablerepo=updates-testing update elfutils'. You can provide feedback for this update here: https://admin.fedoraproject.org/updates/elfutils-0.152-1.fc14
elfutils-0.152-1.fc14 has been pushed to the Fedora 14 stable repository. If problems still persist, please make note of it in this bug report.
elfutils-0.152-1.fc13 has been pushed to the Fedora 13 stable repository. If problems still persist, please make note of it in this bug report.