Bug 1130781

Summary: Symbols of deleted libraries do not resolve
Product: [Fedora] Fedora
Reporter: Jan Kratochvil <jan.kratochvil>
Component: elfutils
Assignee: Jan Kratochvil <jan.kratochvil>
Status: CLOSED UPSTREAM
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified
Version: rawhide
CC: aoliva, fche, jakub, jan.kratochvil, mjw, pmachata, roland
Hardware: x86_64   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2014-08-29 21:06:21 UTC
Type: Bug

Description Jan Kratochvil 2014-08-17 17:04:23 UTC
Tested with trunk: 9d29ed2989b6691457bbd602de740c4423ac8781

jankratochvil/deletedtest
with trunk it produces:
        #0  0x00007fc4d83c2970 __nanosleep
        #1  0x00007fc4d83c2824 sleep
        #2  0x00007fc4d86c56d6
        #3  0x0000000000400938 main
        #4  0x00007fc4d8327d65 __libc_start_main
        #5  0x0000000000400799 _start
with jankratochvil/deletedfix it produces:
        #0  0x00007f6109184970 __nanosleep
        #1  0x00007f6109184824 sleep
        #2  0x00007f61094876d6 libfunc
        #3  0x0000000000400938 main
        #4  0x00007f61090e9d65 __libc_start_main
        #5  0x0000000000400799 _start

jankratochvil/deletedfix
    Original fix as posted by Jan Kratochvil
    It works regardless of whether jankratochvil/deletedrevert is applied.
    It is merged:
    [patch 1/3] Extend __libdw_open_file and elf_begin as *_at_offset
    [patch 3/3] Access deleted files by /proc/PID/mem

jankratochvil/deletedrevert
    It reverts:
    commit 4b9e1433d2272f5f68b3227abdd9cf6817a0afd3
    Author: Mark Wielaard <mjw>
    Date:   Tue Mar 4 11:27:15 2014 +0100
        libdwfl: dwfl_linux_proc_find_elf use elf_from_remote_memory for (deleted).
    It has no effect on jankratochvil/deletedfix.
This upstream commit had no testcase, so it was never discovered that it does not work.

Comment 1 Mark Wielaard 2014-08-18 08:54:55 UTC
Thanks for the testcase. It does show that unwinding works: the "deleted" ELF image is recovered through elf_from_remote_memory and its .eh_frame is found, which lets the unwinder go through the "deleted" module. But it doesn't handle address/function name resolving.

That isn't too surprising. elf_from_remote_memory is conservative and probably only retrieved the memory mapped regions of the ELF image. In your example I saw elf_from_remote_memory explicitly clears the shdrs from the image. Which makes the symtab symbols unreachable.

But since your elf_begin_at_offset seems to be able to provide them, it might be that elf_from_remote_memory is too conservative and could be fixed to map in more of the ELF image, so that not just unwinding but also address/function name resolving works. The best approach is probably to trace which parts of the "deleted .so" are actually mmapped in (maybe using /proc/PID/maps) and match them with the regions that elf_from_remote_memory picked up from the PT_LOAD map.
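
As a rough illustration of the tracing step suggested above, here is a minimal Python sketch that lists the mappings of a deleted file from /proc/PID/maps text. It is not elfutils code: the names Mapping, parse_maps_line and deleted_mappings are hypothetical, and the sample text (addresses, device, inode, paths) is made up for the example. It assumes the usual Linux behaviour of appending " (deleted)" to the pathname of an unlinked file.

```python
from typing import List, NamedTuple, Optional


class Mapping(NamedTuple):
    start: int    # first address of the mapping
    end: int      # one past the last address
    perms: str    # e.g. "r-xp"
    offset: int   # file offset the mapping starts at
    path: str     # pathname column, may be empty


# Assumption: the kernel marks unlinked files with this suffix.
DELETED_SUFFIX = " (deleted)"


def parse_maps_line(line: str) -> Optional[Mapping]:
    """Parse one line of /proc/PID/maps; return None if it is malformed."""
    # Columns: address perms offset dev inode [pathname]
    fields = line.rstrip("\n").split(None, 5)
    if len(fields) < 5:
        return None
    start_s, end_s = fields[0].split("-")
    path = fields[5].strip() if len(fields) == 6 else ""
    return Mapping(int(start_s, 16), int(end_s, 16), fields[1],
                   int(fields[2], 16), path)


def deleted_mappings(maps_text: str) -> List[Mapping]:
    """Return the mappings whose backing file has been deleted."""
    found = []
    for line in maps_text.splitlines():
        mapping = parse_maps_line(line)
        if mapping is not None and mapping.path.endswith(DELETED_SUFFIX):
            found.append(mapping)
    return found


# Hypothetical sample; a real run would read open(f"/proc/{pid}/maps").
SAMPLE_MAPS = """\
00400000-00401000 r-xp 00000000 fd:01 100 /tmp/deleted
7f6109487000-7f6109488000 r-xp 00000000 fd:01 101 /tmp/deleted-lib.so (deleted)
7f6109687000-7f6109688000 rw-p 00000000 fd:01 101 /tmp/deleted-lib.so (deleted)
7ffd1000-7ffd2000 rw-p 00000000 00:00 0 [stack]
"""

for m in deleted_mappings(SAMPLE_MAPS):
    print(f"{m.start:#x}-{m.end:#x} {m.perms} file offset {m.offset:#x}")
```

The resulting (start, end, offset) triples could then be compared by hand against the PT_LOAD ranges that elf_from_remote_memory picked up, to see which parts of the file are mapped but missing from the recovered image.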

Note there are still some open issues with the patches you propose, as discussed on the list:
https://lists.fedorahosted.org/pipermail/elfutils-devel/2014-February/003836.html
https://lists.fedorahosted.org/pipermail/elfutils-devel/2014-February/003838.html
https://lists.fedorahosted.org/pipermail/elfutils-devel/2014-March/003865.html
https://lists.fedorahosted.org/pipermail/elfutils-devel/2014-March/003863.html

Comment 2 Jan Kratochvil 2014-08-18 11:45:24 UTC
(In reply to Mark Wielaard from comment #1)
> Thanks for the testcase.

The testcase was already part of the original posting (although it had a bug):
  [patch 3/3] Access deleted files by /proc/PID/mem


> That isn't too surprising. elf_from_remote_memory is conservative and
> probably only retrieved the memory mapped regions of the ELF image. In your
> example I saw elf_from_remote_memory explicitly clears the shdrs from the
> image. Which makes the symtab symbols unreachable.

All the needed symbols are in memory mapped .dynsym, that is intentional by the testcase:
deleted_lib_so_LDFLAGS = -shared -rdynamic

Comment 3 Mark Wielaard 2014-08-19 07:56:25 UTC
(In reply to Jan Kratochvil from comment #2)
> (In reply to Mark Wielaard from comment #1)
> > That isn't too surprising. elf_from_remote_memory is conservative and
> > probably only retrieved the memory mapped regions of the ELF image. In your
> > example I saw elf_from_remote_memory explicitly clears the shdrs from the
> > image. Which makes the symtab symbols unreachable.
> 
> All the needed symbols are in memory mapped .dynsym, that is intentional by
> the testcase:
> deleted_lib_so_LDFLAGS = -shared -rdynamic

In that case they should be found through the phdrs in libdwfl/dwfl_module_getdwarf.c (find_dynsym). But apparently they are not in your testcase. I think the issue is that find_dynsym calls find_offsets, which translates the addresses into offsets but seems to get things wrong with the in-memory ELF image phdrs.
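
For readers unfamiliar with the address-to-offset step mentioned above, the following Python sketch shows the general calculation over PT_LOAD program headers. It is not the find_offsets implementation from elfutils: the LoadSegment type, the vaddr_to_offset function and the segment values are all hypothetical, chosen only to make the arithmetic visible.

```python
from typing import NamedTuple, Optional, Sequence


class LoadSegment(NamedTuple):
    """The three PT_LOAD program header fields needed for the translation."""
    p_offset: int   # where the segment starts in the file
    p_vaddr: int    # where the segment starts in the address space
    p_filesz: int   # number of bytes backed by the file


def vaddr_to_offset(segments: Sequence[LoadSegment],
                    vaddr: int) -> Optional[int]:
    """Translate a virtual address to a file offset, or None if unmapped."""
    for seg in segments:
        if seg.p_vaddr <= vaddr < seg.p_vaddr + seg.p_filesz:
            return vaddr - seg.p_vaddr + seg.p_offset
    return None


# Hypothetical two-segment shared library (text, then data).
SEGMENTS = [
    LoadSegment(p_offset=0x0, p_vaddr=0x0, p_filesz=0x8ac),
    LoadSegment(p_offset=0xdf8, p_vaddr=0x200df8, p_filesz=0x240),
]

print(hex(vaddr_to_offset(SEGMENTS, 0x200e00)))  # address inside the data segment
print(vaddr_to_offset(SEGMENTS, 0x1000))         # address in the gap between segments
```

The result is only meaningful if the buffer being indexed is laid out the way the p_offset values describe. Whether the image rebuilt by elf_from_remote_memory violates that assumption, and whether that is the actual cause of the failure here, is not established in this comment.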

Comment 4 Jan Kratochvil 2014-08-28 20:13:09 UTC
[patch] Fix resolving ELF symbols for live PIDs with deleted files
https://lists.fedorahosted.org/pipermail/elfutils-devel/2014-August/004121.html