Bug 852390 - Unable to extract build-ids from ARM coredumps
Status: CLOSED CURRENTRELEASE
Product: Fedora
Classification: Fedora
Component: kernel
Version: 17
Hardware: arm Linux
Priority: medium
Severity: medium
Assigned To: Peter Robinson
QA Contact: Fedora Extras Quality Assurance
Blocks: ARMTracker
Reported: 2012-08-28 07:13 EDT by Michal Toman
Modified: 2015-03-22 20:41 EDT

Doc Type: Bug Fix
Last Closed: 2013-03-31 14:54:09 EDT
Type: Bug

Attachments
eu-unstrip output from binary, libraries, live process and core (1.33 KB, text/plain)
2012-08-28 07:13 EDT, Michal Toman

Description Michal Toman 2012-08-28 07:13:47 EDT
Created attachment 607478
eu-unstrip output from binary, libraries, live process and core

Description of problem:
As mentioned a few months ago on the ARM list (http://lists.fedoraproject.org/pipermail/arm/2012-May/003253.html), eu-unstrip is unable to extract from ARM coredumps the build-id information that ABRT requires for debugging. I'm not sure whether the information is even present in the coredump itself.

Version-Release number of selected component (if applicable):
elfutils-0.154-2.fc17.armv7hl
but the behaviour has been the same since the F15 bootstrap

How reproducible:
always

Steps to Reproduce:
1. run "ulimit -c unlimited"
2. crash any application (e.g. "sleep 100 & kill -11 %")
3. run "eu-unstrip -n --core core.pid"
  
Actual results:
very few build-ids (or none) are shown

Expected results:
all build-ids are listed correctly

Additional info:
After installing all relevant debuginfos manually, the debug process works and I am able to get a full backtrace.

Results are not influenced by the presence of the ABRT coredump hook.

eu-unstrip works for binaries, libraries and live processes (sample output for sleep is attached).
Comment 1 Petr Machata 2013-02-24 00:27:54 EST
Looking into the core dump taken at
  http://mtoman.fedorapeople.org/arm/core.12331

Of the segments that are physically present in the file, none seem to be ELF files.  E.g. consider this:

  Type           Offset   VirtAddr   PhysAddr   FileSiz  MemSiz   Flg Align
  NOTE           0x0002f4 0x00000000 0x00000000 0x00037c 0x000000     0x0
  LOAD           0x001000 0x00008000 0x00000000 0x000000 0x005000 R E 0x1000
  LOAD           0x001000 0x00014000 0x00000000 0x001000 0x001000 R   0x1000

The first loadable segment is not in the file.  The second one (at offset 0x1000) contains the following data:

00001000: 0020 a0e3 0030 a0e3 0000 00eb f1ff ffea  . ...0..........
00001010: f040 2de9 14d0 4de2 0210 90e9 0060 a0e1  .@-...M......`..
00001020: 2840 9de5 0100 5ce1 0400 000a 2840 8de5  (@....\.....(@..
00001030: 0600 a0e1 14d0 8de2 f040 bde8 1bf4 ffea  .........@......
[...]

That's not an ELF header.  The ELF header was probably in the previous segment, and that one was elided.  The same story repeats with all R E segments, i.e. those that would probably contain the ELF header.
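
A minimal sketch of how one could list the elided segments programmatically, assuming a 32-bit little-endian core such as this ARM one.  The function name elided_load_segments is made up for illustration; it is not part of elfutils.  It reads the program header table and reports every PT_LOAD entry whose FileSiz is zero while MemSiz is not:

```python
import struct

PT_LOAD = 1  # program header type of a loadable segment


def elided_load_segments(data):
    """Return (vaddr, memsz, flags) for each PT_LOAD segment with no bytes in the file.

    Hypothetical helper; assumes a 32-bit little-endian ELF image.
    """
    if data[:4] != b"\x7fELF" or data[4] != 1 or data[5] != 1:
        raise ValueError("not a 32-bit little-endian ELF file")
    # Elf32_Ehdr: e_phoff is at byte 28, e_phentsize at 42, e_phnum at 44.
    (e_phoff,) = struct.unpack_from("<I", data, 28)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 42)
    missing = []
    for i in range(e_phnum):
        # Elf32_Phdr: type, offset, vaddr, paddr, filesz, memsz, flags, align.
        (p_type, _offset, p_vaddr, _paddr,
         p_filesz, p_memsz, p_flags, _align) = struct.unpack_from(
            "<8I", data, e_phoff + i * e_phentsize)
        if p_type == PT_LOAD and p_filesz == 0 and p_memsz > 0:
            missing.append((p_vaddr, p_memsz, p_flags))
    return missing
```

For the three headers quoted above, this would report only the R E segment at 0x8000 (flags value 5 means R plus E).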

As a backup plan, libdwfl looks for DT_DEBUG.  For that it needs to load a PHDR, which is at address 0x8034, i.e. in the elided segment.  I don't have an ARM machine handy at this moment, but I'll experiment with it more on Monday.  For now it seems libdwfl simply doesn't have enough information to figure out what was loaded.
Comment 2 Petr Machata 2013-02-24 16:18:52 EST
Looking into some other core dumps, it seems that the build ID bits tend to be exactly in those R E segments that the linked core dump lacks.  Any such note would be preceded by a NOTE header with the word GNU in it, e.g. like this:

0000f1c0: 0100 0000 0000 0000 0400 0000 1400 0000  ................
0000f1d0: 0300 0000 474e 5500 2eb7 febf a4e0 3eb0  ....GNU.......>.
0000f1e0: b46d 3fe6 3021 431a f376 8f77 0000 0000  .m?.0!C..v.w....

Here, the note starts at 0xf1c8; it's a GNU note of type NT_GNU_BUILD_ID.  There are no such patterns visible in the ARM core dump.  So there really is no way whatsoever to deduce what modules that core dump consists of; the information simply is absent.  It might be a kernel bug that those segments are elided, but I'll need to look more into why exactly the kernel does that.
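
A rough sketch of the pattern search described above, assuming little-endian data.  The helper name find_build_ids is hypothetical, and this is a brute-force byte scan, not what libdwfl actually does.  It looks for the name "GNU\0" and checks that the 12 bytes before it form a note header with namesz 4 and type 3 (NT_GNU_BUILD_ID):

```python
import struct

NT_GNU_BUILD_ID = 3  # note type of a GNU build ID


def find_build_ids(data):
    """Scan raw bytes for GNU build-ID notes; return a list of (offset, hex_id).

    Hypothetical helper; assumes little-endian note headers.
    """
    found = []
    pos = data.find(b"GNU\x00")
    while pos != -1:
        start = pos - 12  # namesz, descsz and type precede the name
        if start >= 0:
            namesz, descsz, ntype = struct.unpack_from("<3I", data, start)
            desc = data[pos + 4:pos + 4 + descsz]
            # Reject implausible sizes so a stray "GNU\0" is not misread.
            if (namesz == 4 and ntype == NT_GNU_BUILD_ID
                    and 0 < descsz <= 64 and len(desc) == descsz):
                found.append((start, desc.hex()))
        pos = data.find(b"GNU\x00", pos + 1)
    return found
```

Fed the 48 bytes from the hexdump above, this finds one note 8 bytes in (0xf1c8 in the original file) with a 20-byte ID beginning 2eb7febf.  A scan of the ARM core dump would be expected to come back empty.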
Comment 3 Roland McGrath 2013-02-25 14:10:26 EST
It's probably a kernel configuration problem.
Check the /proc/PID/coredump_filter value while running the kernel that produced the core dump.  If it's not 0x33 then the configuration is not what it should be.
Check CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS.
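
A small sketch for decoding the value, going by the bit meanings given in the kernel's coredump_filter documentation (bits 0 to 6 only; newer kernels define more).  The function names are made up for illustration.  0x33 sets bits 0, 1, 4 and 5, and bit 4 is the one that keeps the ELF header pages in the dump; as far as I know, a kernel built without CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS defaults to 0x23 instead:

```python
COREDUMP_FILTER_BITS = {
    0: "anonymous private mappings",
    1: "anonymous shared mappings",
    2: "file-backed private mappings",
    3: "file-backed shared mappings",
    4: "ELF header pages of file-backed private mappings",
    5: "private huge pages",
    6: "shared huge pages",
}


def decode_coredump_filter(value):
    """Return the descriptions of all known bits set in value."""
    return [name for bit, name in sorted(COREDUMP_FILTER_BITS.items())
            if value & (1 << bit)]


def dumps_elf_headers(value):
    """True if ELF header pages end up in the core (bit 4, or bit 2 which covers them)."""
    return bool(value & ((1 << 4) | (1 << 2)))


def read_coredump_filter(pid="self"):
    """Read /proc/PID/coredump_filter; the kernel prints the value in hex."""
    with open(f"/proc/{pid}/coredump_filter") as f:
        return int(f.read().strip(), 16)
```

On the affected machine, dumps_elf_headers(read_coredump_filter()) would be expected to return False.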
Comment 4 Michal Toman 2013-02-26 11:29:30 EST
That's it. I've rebuilt the kernel with CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS=y and everything works fine.
Comment 5 Peter Robinson 2013-02-26 11:35:26 EST
This issue is unique to the omap config for historical reasons; it'll be fixed for 3.8.0+ on f17/18 and 3.9+ on rawhide. It's not an issue on tegra/unified kernels.
Comment 6 Mark Wielaard 2013-02-26 13:29:06 EST
(In reply to comment #5)
> This issue unique to the omap config for historical reasons, it'll be fixed
> for 3.8.0+ on f17/18 and 3.9+ on rawhide. It's not an issue on tegra/unified
> kernels.

Should we close this bug or move it to the kernel till it is updated?
Or is the kernel already updated?
Comment 7 Peter Robinson 2013-03-31 14:54:09 EDT
Fixed in 3.8+
Comment 8 Jan Kratochvil 2013-07-23 14:29:32 EDT
Just reassigning this closed bug, as it was fixed in the kernel, not in elfutils.
