Bug 1566639 - "debuginfo reader: ensure_valid failed" on libglvnd-glx-debuginfo
Summary: "debuginfo reader: ensure_valid failed" on libglvnd-glx-debuginfo
Alias: None
Product: Fedora
Classification: Fedora
Component: valgrind
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: Mark Wielaard
QA Contact: Fedora Extras Quality Assurance
Reported: 2018-04-12 16:28 UTC by Adam Jackson
Modified: 2018-04-17 00:18 UTC (History)

Fixed In Version: valgrind-3.13.0-18.fc28
Last Closed: 2018-04-17 00:18:41 UTC
Type: Bug

System: KDE Software Compilation
ID: 393062
Priority: None
Status: None
Summary: None
Last Updated: 2018-04-12 20:03:27 UTC

Description Adam Jackson 2018-04-12 16:28:38 UTC
With F28, trying to valgrind the X server explodes when processing the debuginfo for one of its loaded libraries:

desoxy:~/git/xserver% valgrind /usr/bin/Xvfb :10
==7018== Memcheck, a memory error detector
==7018== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==7018== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==7018== Command: ./build/hw/vfb/Xvfb :10
==7018== Valgrind: debuginfo reader: ensure_valid failed:
==7018== Valgrind:   during call to ML_(img_get)
==7018== Valgrind:   request for range [460632, +12) exceeds
==7018== Valgrind:   valid image size of 333064 for image:
==7018== Valgrind:   "/usr/lib/debug/.build-id/3e/30f2307639da3a66b4c72c310049c659461253.debug"
==7018== Valgrind: debuginfo reader: Possibly corrupted debuginfo file.
==7018== Valgrind: I can't recover.  Giving up.  Sorry.
desoxy:~/git/xserver% rpm -qf /usr/lib/debug/.build-id/3e/30f2307639da3a66b4c72c310049c659461253.debug
desoxy:~/git/xserver% rpm -q valgrind

Filing this as a valgrind bug as I don't think we're doing anything
special in the libglvnd build that would emit broken dwarf.
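
For reference, the failure in the log above comes down to simple range arithmetic: the reader asked for 12 bytes at offset 460632, but the .debug file is only 333064 bytes long. A minimal sketch of such a check in C follows. The helper name range_is_valid is hypothetical and this is not Valgrind's actual ensure_valid code, only an illustration of the comparison the message describes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not Valgrind's ensure_valid): returns true when the
   half-open range [offset, offset + size) lies entirely inside an image
   that is image_size bytes long. */
static bool range_is_valid(uint64_t image_size, uint64_t offset, uint64_t size)
{
    /* Compare against the remaining bytes instead of computing
       offset + size, so the check cannot overflow. */
    return offset <= image_size && size <= image_size - offset;
}
```

With the numbers from the log, range_is_valid(333064, 460632, 12) is false, because the offset alone is already past the end of the file.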

Comment 1 Adam Jackson 2018-04-12 17:58:24 UTC
Moving this to rpm. Rebuilding libglvnd with %global debug_package %{nil} produces debuggable libraries, so this has to be a problem in find-debuginfo.sh or something it calls.

Comment 2 Mark Wielaard 2018-04-12 19:05:52 UTC
Replicated, but not yet investigated.

This is the valgrind backtrace (run under gdb) when ensure_valid_failed is hit:

#0  ensure_valid_failed (offset=460632, size=12, 
    caller=caller@entry=0x582221e0 "ML_(img_get)", img=<optimized out>, 
    img=<optimized out>) at m_debuginfo/image.c:1052
#1  0x00000000580d876e in ensure_valid (caller=0x582221e0 "ML_(img_get)", 
    size=12, offset=460632, img=0x1002976b30) at m_debuginfo/image.c:1076
#2  vgModuleLocal_img_get (dst=dst@entry=0x1002eace74, 
    img=img@entry=0x1002976b30, offset=offset@entry=460632, size=size@entry=12)
    at m_debuginfo/image.c:1085
#3  0x0000000058001522 in find_buildid (img=img@entry=0x1002976b30, 
    rel_ok=rel_ok@entry=0 '\000', search_shdrs=search_shdrs@entry=1 '\001')
    at m_debuginfo/readelf.c:1150
#4  0x00000000580017c6 in open_debug_file (
    name=name@entry=0x1002a4e1e0 "/usr/lib/debug/.build-id/3e/30f2307639da3a66b4c72c310049c659461253.debug", 
    buildid=buildid@entry=0x10028847b0 "3e30f2307639da3a66b4c72c310049c659461253", crc=crc@entry=0, rel_ok=rel_ok@entry=0 '\000', 
    serverAddr=serverAddr@entry=0x0) at m_debuginfo/readelf.c:1252
#5  0x000000005800192a in find_debug_file (di=di@entry=0x1002b8b960, 
    objpath=0x1002c8bec0 "/usr/lib64/libGL.so.1.7.0", 
    buildid=buildid@entry=0x10028847b0 "3e30f2307639da3a66b4c72c310049c659461253", 
    debugname=debugname@entry=0x1002a4e180 "libGL.so.1.7.0-1.0.1-0.1.20180226gitb029c24.fc28.x86_64.debug", crc=crc@entry=1951608855, 
    rel_ok=rel_ok@entry=0 '\000') at m_debuginfo/readelf.c:1308

Comment 3 Mark Wielaard 2018-04-12 19:13:39 UTC
This does look like a valgrind issue. The backtrace in comment 2 shows that the failure comes from "find_buildid". Looking at the source, it tries to get the build-id first through the phdrs and, if that fails, it should fall back on trying to get it through the shdrs. But the phdrs in a .debug file aren't reliable, which is why getting the PT_NOTE fails. That really shouldn't trigger the "I can't recover" path; it should fall back to trying to find the build-id through the shdrs.

Comment 4 Mark Wielaard 2018-04-12 19:30:18 UTC
This seems to resolve the issue:

diff --git a/coregrind/m_debuginfo/readelf.c b/coregrind/m_debuginfo/readelf.c
index 70c28e629..8bd3e049c 100644
--- a/coregrind/m_debuginfo/readelf.c
+++ b/coregrind/m_debuginfo/readelf.c
@@ -1137,7 +1137,11 @@ HChar* find_buildid(DiImage* img, Bool rel_ok, Bool search_shdrs)
       ElfXX_Ehdr ehdr;
       ML_(img_get)(&ehdr, img, 0, sizeof(ehdr));
-      for (i = 0; i < ehdr.e_phnum; i++) {
+      /* Skip the phdrs when we have to search the shdrs. In separate
+         .debug files the phdrs might not be valid (they are a copy of
+         the main ELF file) and might trigger assertions when getting
+        image notes based on them. */
+      for (i = 0; !search_shdrs && i < ehdr.e_phnum; i++) {
          ElfXX_Phdr phdr;
          ML_(img_get)(&phdr, img,
                       ehdr.e_phoff + i * ehdr.e_phentsize, sizeof(phdr));

I'll report this upstream and build new Fedora valgrind packages.
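
To make the effect of the changed loop condition easier to see, here is a toy model in C. It is only a sketch under stated assumptions: ToyNote, ToyImage, ToyResult, toy_in_bounds and toy_find_buildid are hypothetical names invented for illustration, the "patched" flag exists only to compare the two behaviours side by side, and real ELF parsing is replaced by arrays of precomputed note locations. It is not Valgrind's find_buildid.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins for illustration only; these are NOT Valgrind's types. */
typedef struct {
    uint64_t offset;      /* file offset the header claims the note is at */
    uint64_t size;        /* number of bytes that would be read there */
    bool     is_build_id; /* whether the note would be a build-id note */
} ToyNote;

typedef struct {
    uint64_t       image_size;   /* size of the file on disk */
    const ToyNote *phdr_notes;   /* notes reachable through program headers */
    size_t         n_phdr_notes;
    const ToyNote *shdr_notes;   /* notes reachable through section headers */
    size_t         n_shdr_notes;
} ToyImage;

typedef enum { VIA_PHDRS, VIA_SHDRS, NOT_FOUND, FATAL_BAD_RANGE } ToyResult;

/* True when the note's byte range fits inside the image. */
static bool toy_in_bounds(const ToyImage *img, const ToyNote *n)
{
    return n->offset <= img->image_size && n->size <= img->image_size - n->offset;
}

/* patched == false models the old loop condition, patched == true the new one. */
static ToyResult toy_find_buildid(const ToyImage *img, bool search_shdrs, bool patched)
{
    /* With the patch, the phdrs are skipped whenever the shdrs will be searched. */
    bool skip_phdrs = patched && search_shdrs;

    for (size_t i = 0; !skip_phdrs && i < img->n_phdr_notes; i++) {
        const ToyNote *n = &img->phdr_notes[i];
        if (!toy_in_bounds(img, n))
            return FATAL_BAD_RANGE;  /* models the unrecoverable abort */
        if (n->is_build_id)
            return VIA_PHDRS;
    }

    if (!search_shdrs)
        return NOT_FOUND;

    for (size_t i = 0; i < img->n_shdr_notes; i++) {
        const ToyNote *n = &img->shdr_notes[i];
        if (!toy_in_bounds(img, n))
            return FATAL_BAD_RANGE;
        if (n->is_build_id)
            return VIA_SHDRS;
    }
    return NOT_FOUND;
}
```

Given an image of 333064 bytes whose only phdr note claims to live at offset 460632, the unpatched model with search_shdrs set returns FATAL_BAD_RANGE before the shdrs are ever consulted, while the patched model goes straight to the shdrs and can return VIA_SHDRS if a valid build-id note is found there.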

Comment 5 Adam Jackson 2018-04-12 20:02:00 UTC
Oh very cool. Thanks for figuring it out so quickly!

Comment 6 Fedora Update System 2018-04-12 21:14:28 UTC
valgrind-3.13.0-18.fc28 has been submitted as an update to Fedora 28. https://bodhi.fedoraproject.org/updates/FEDORA-2018-6e2b5f0c1e

Comment 7 Fedora Update System 2018-04-15 02:22:37 UTC
valgrind-3.13.0-18.fc28 has been pushed to the Fedora 28 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-6e2b5f0c1e

Comment 8 Fedora Update System 2018-04-17 00:18:41 UTC
valgrind-3.13.0-18.fc28 has been pushed to the Fedora 28 stable repository. If problems still persist, please make note of it in this bug report.
