Bug 1170810

Summary: Fuzzing elfutils -- various badness
Product: [Fedora] Fedora
Reporter: Alexander Cherepanov <cherepan>
Component: elfutils
Assignee: Mark Wielaard <mjw>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified
Version: rawhide
CC: aoliva, fche, jakub, jan.kratochvil, mjw, mnewsome, roland
Target Milestone: ---
Keywords: Security, Tracking
Target Release: ---
Hardware: x86_64
OS: Linux
Fixed In Version: elfutils-0.163-1.fc21
Doc Type: Bug Fix
Last Closed: 2015-06-30 20:11:16 UTC
Type: Bug
Attachments (description: flags):
Crashers for `objdump -rs`: none
Crashes for `readelf -aAdehIlnrsSVcp -w`: none
Aborts for `readelf -aAdehIlnrsSVcp -w`: none
More crashes for `readelf -aAdehIlnrsSVcp -w`: none
More crashers for `readelf -aAdehIlnrsSVcp -w`: none
Problems with `nm -DsSCp --mark-special`: none
Problems with `strings`: none
Problems with `elflint --strict`: none
More problems with `elflint --strict`: none
Problems with `elfcmp -l`: none
Problems with `stack -abdilmsv --core=`: none
Problems with `readelf -aAdehIlnrsSVcp -Nw`: none
Problems with `strings`: none
Problems with `strip`: none
Problems with `findtextrel`: none
Problems with `readelf -aAdehIlnrsSVcp -Nw` (32-bit): none
Problems with `addr2line -e @@ -- ...` (32-bit): none
Problems with `elflint --strict` (32-bit): none
Problems with `nm -DsSp --mark-special` (32-bit): none
Problems with `ar -tv` (32-bit): none
Problems with `strip -o /dev/null` (32-bit): none

Description Alexander Cherepanov 2014-12-04 22:23:21 UTC
This is a bug to track the results of (some) fuzzing of elfutils, as suggested here: https://lists.fedorahosted.org/pipermail/elfutils-devel/2014-December/004351.html . The start of the thread is here: https://lists.fedorahosted.org/pipermail/elfutils-devel/2014-December/004346.html .

Comment 1 Alexander Cherepanov 2014-12-04 22:27:02 UTC
Created attachment 964861 [details]
Crashers for `objdump -rs`

Files: 11
Errors:
     10 Invalid read of size ...
      7 Process terminating with default action of signal 11 (SIGSEGV)
      3 Process terminating with default action of signal 8 (SIGFPE)

The same as in https://lists.fedorahosted.org/pipermail/elfutils-devel/2014-December/004346.html .

Comment 2 Alexander Cherepanov 2014-12-04 22:28:23 UTC
Created attachment 964862 [details]
Crashes for `readelf -aAdehIlnrsSVcp -w`

Files: 11
Errors:
      1 Argument 'size' of function malloc has a fishy (possibly negative) value: ...
      1 Invalid free() / delete / delete[] / realloc()
      9 Invalid read of size ...
      6 Process terminating with default action of signal 11 (SIGSEGV)
      1 Process terminating with default action of signal 8 (SIGFPE)

The same as in https://lists.fedorahosted.org/pipermail/elfutils-devel/2014-December/004346.html .

Comment 3 Alexander Cherepanov 2014-12-04 22:29:16 UTC
Created attachment 964863 [details]
Aborts for `readelf -aAdehIlnrsSVcp -w`

Files: 2
Errors:
      1 No assertion info in gdb backtrace.
      1 readelf.c:7731: print_debug_exception_table: Assertion `readp == action_table' failed.

Comment 5 Alexander Cherepanov 2014-12-08 02:58:49 UTC
Created attachment 965676 [details]
More crashes for `readelf -aAdehIlnrsSVcp -w`

crashes

Files: 24
Errors:
      1 Argument 'size' of function malloc has a fishy (possibly negative) value: ...
     30 Invalid read of size ...
      3 Invalid write of size ...
     17 Process terminating with default action of signal 11 (SIGSEGV)

----------------------------------------------------------------------

asserts

Files: 2
Errors:
      1 readelf.c:4456: notice_listptr: Assertion `p->offset == offset' failed.
      1 readelf.c:7751: print_debug_exception_table: Assertion `readp == action_table' failed.

----------------------------------------------------------------------

catchsegv

Files: 2
Errors:
      2 *** Segmentation fault

Comment 6 Mark Wielaard 2014-12-15 12:11:28 UTC
I believe the "Argument 'size' of function malloc has a fishy (possibly negative) value" warning in dwarf_begin_elf.c (check_section) is correct, but harmless. We do check that the value doesn't actually overflow. The allocation will likely fail, but that is also checked.
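
For illustration, a minimal sketch of the general pattern described here: check that the byte count cannot overflow, then still treat a failed allocation as an error. The helper name alloc_items is hypothetical and this is not the code in dwarf_begin_elf.c.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate room for COUNT items of SIZE bytes,
   refusing requests whose byte count would overflow size_t.  */
static void *
alloc_items (size_t count, size_t size)
{
  if (size != 0 && count > SIZE_MAX / size)
    return NULL;                /* count * size would wrap around.  */

  /* A huge but representable request may still fail, so callers
     must check for NULL either way.  */
  return malloc (count * size);
}
```

A very large but representable size is what valgrind reports as "fishy"; with both checks in place it only results in a handled allocation failure.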

Comment 7 Mark Wielaard 2014-12-20 22:51:01 UTC
elfutils 0.161 was released with patches that solve various crashers.
https://lists.fedorahosted.org/pipermail/elfutils-devel/2014-December/004481.html
None of the above samples crash anymore.

But one sample shows there is an invalid memory access when manipulating a Dwarf_Abbrev with dwarf_getabbrevattr. This is one of three areas that I know of that still need improvement, but for which we haven't seen any crashers yet:

1) When creating a Dwarf_Abbrev we don't keep track of the memory buffer it came from, so we are unable to bounds check any accesses made through it.

2) read_encoded_value () used to parse CFI data doesn't do any bounds checking yet.

3) There are various places in the code that allocate memory on the stack with alloca that might depend on untrusted input values. There might be a possibility that such allocations blow up the stack. They need to be carefully audited.
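
The first two items come down to carrying the end of the buffer along with the read position. A minimal sketch of that idea follows; struct cursor and read_u32 are hypothetical names for illustration, not existing libdw interfaces, and the sketch assumes pos never exceeds end.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical cursor that remembers where its buffer ends.  */
struct cursor
{
  const unsigned char *pos;
  const unsigned char *end;
};

/* Read a 4-byte value in host byte order.  Returns false, leaving the
   cursor untouched, when fewer than 4 bytes remain.  */
static bool
read_u32 (struct cursor *c, uint32_t *out)
{
  if ((size_t) (c->end - c->pos) < sizeof *out)
    return false;
  memcpy (out, c->pos, sizeof *out);    /* Safe for unaligned data.  */
  c->pos += sizeof *out;
  return true;
}
```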

I would like to keep this bug open until at least those three issues have been fixed.

Of course any new fuzzed data files that cause crashes are also very welcome.

Comment 8 Alexander Cherepanov 2014-12-20 23:49:36 UTC
Created attachment 971587 [details]
More crashers for `readelf -aAdehIlnrsSVcp -w`

crashes

Files: 9
Errors:
      1 Argument 'size' of function malloc has a fishy (possibly negative) value: ...
      5 Conditional jump or move depends on uninitialised value(s)
      7 Invalid read of size ...
      7 Process terminating with default action of signal 11 (SIGSEGV)
      3 Use of uninitialised value of size ...

----------------------------------------------------------------------

catchsegv

Files: 2
Errors:
      2 *** Segmentation fault

Comment 9 Mark Wielaard 2014-12-21 22:13:24 UTC
(In reply to Alexander Cherepanov from comment #8)
> Created attachment 971587 [details]
> More crashers for `readelf -aAdehIlnrsSVcp -w`

Thanks. I posted two patches (also on the mjw/pending branch) to solve these issues:

      readelf: Add more sanity checks to print_debug_exception_table.
      readelf: Don't try to read macinfo cus sentinel or beyond.

> Errors:
>       1 Argument 'size' of function malloc has a fishy (possibly negative)

See comment #6.

Comment 10 Alexander Cherepanov 2014-12-25 17:30:55 UTC
Created attachment 973055 [details]
Problems with `nm -DsSCp --mark-special`

valgrind

Files: 1
Errors:
      1 Process terminating with default action of signal 8 (SIGFPE)

----------------------------------------------------------------------

hang

Files: 1
Errors:
      1 Hangs

Comment 11 Alexander Cherepanov 2014-12-25 21:13:55 UTC
Created attachment 973086 [details]
Problems with `strings`

valgrind

Files: 2
Errors:
      2 Process terminating with default action of signal 7 (SIGBUS)

----------------------------------------------------------------------

gdb

Files: 3
Errors:
      1 strings.c:636: read_block: Assertion `from >= (off64_t) elfmap_off && from < (off64_t) (elfmap_off + elfmap_size)' failed.
      1 strings.c:644: read_block: Assertion `elfmap_size >= keep_area + ps' failed.
      1 strings.c:672: read_block: Assertion `handled_to % ps == 0' failed.

Comment 12 Alexander Cherepanov 2014-12-25 21:48:24 UTC
Created attachment 973087 [details]
Problems with `elflint --strict`

(Deduplication of valgrind errors by last frame only, not full stacktrace. To reduce number of samples.)

valgrind

Files: 39
Errors:
     33 Conditional jump or move depends on uninitialised value(s)
     26 Invalid read of size ...
      3 Invalid write of size ...
     28 Process terminating with default action of signal 11 (SIGSEGV)
      6 Process terminating with default action of signal 8 (SIGFPE)
      1 Stack overflow in thread ...
     13 Use of uninitialised value of size ...

----------------------------------------------------------------------

gdb

Files: 4
Errors:
      2 No assertion info in gdb backtrace.
      2 elflint.c:696: check_symtab: Assertion `name != ((void *)0) || strshdr->sh_type != 3' failed.

----------------------------------------------------------------------

catchsegv

Files: 3
Errors:
      3 *** Segmentation fault

Comment 13 Mark Wielaard 2014-12-26 19:28:53 UTC
(In reply to Alexander Cherepanov from comment #10)
> Created attachment 973055 [details]
> Problems with `nm -DsSCp --mark-special`

There are 2 commits on the mjw/pending branch posted to the list that solve these issues:

    nm: Guard against divide by zero in error check.
    nm: Stop processing ar members on first invalid offset.

Comment 14 Mark Wielaard 2014-12-26 22:04:20 UTC
(In reply to Alexander Cherepanov from comment #11)
> Created attachment 973086 [details]
> Problems with `strings`

The following commit on mjw/pending branch, posted to the list, solves these issues:

    strings: Produce error when section data falls outside file.

Comment 15 Mark Wielaard 2014-12-31 10:18:22 UTC
(In reply to Alexander Cherepanov from comment #12)
> Created attachment 973087 [details]
> Problems with `elflint --strict`

The following commit on the mjw/pending branch, also posted to the list, solves these issues:

    elflint: Add various low-level checks.

Some additional fuzzing found two more issues, also fixed on that branch:

    backends: Check sh_entsize is not zero in ppc_symbol.c (find_dyn_got).
    libelf: gelf_getphdr should check phdr index is valid.
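
Both extra fixes are instances of validating a value before using it as a divisor or an index. A rough sketch of the two checks, with hypothetical helper names (this is not the code from the commits):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of entries in a section of SH_SIZE bytes
   holding SH_ENTSIZE-byte records.  Returns false for a zero entry
   size instead of dividing by it (which would raise SIGFPE).  */
static bool
entry_count (uint64_t sh_size, uint64_t sh_entsize, uint64_t *count)
{
  if (sh_entsize == 0)
    return false;
  *count = sh_size / sh_entsize;
  return true;
}

/* An index is valid only if it is below the entry count.  */
static bool
index_valid (uint64_t idx, uint64_t count)
{
  return idx < count;
}
```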

Comment 16 Alexander Cherepanov 2015-01-02 15:17:52 UTC
Created attachment 975293 [details]
More problems with `elflint --strict`

valgrind

Files: 2
Errors:
     25 Conditional jump or move depends on uninitialised value(s)
     24 Use of uninitialised value of size ...

----------------------------------------------------------------------

hang

Files: 1
Errors:
      1 Long loop?

Comment 17 Alexander Cherepanov 2015-01-02 15:22:53 UTC
Created attachment 975294 [details]
Problems with `elfcmp -l`

(Comparing a file with itself.)

valgrind

Files: 4
Errors:
      1 Conditional jump or move depends on uninitialised value(s)
      2 Invalid read of size ...
      2 Process terminating with default action of signal 11 (SIGSEGV)
      1 Process terminating with default action of signal 8 (SIGFPE)

Comment 18 Alexander Cherepanov 2015-01-02 18:33:56 UTC
Created attachment 975325 [details]
Problems with `stack -abdilmsv --core=`

(Deduplicated by only one stack frame.)

valgrind

Files: 34
Errors:
     50 Conditional jump or move depends on uninitialised value(s)
     27 Invalid read of size ...
     14 Invalid write of size ...
     29 Process terminating with default action of signal 11 (SIGSEGV)
      1 Source and destination overlap in memcpy...
     34 Use of uninitialised value of size ...

----------------------------------------------------------------------

catchsegv

Files: 19
Errors:
     19 *** Segmentation fault

Comment 19 Mark Wielaard 2015-01-03 22:19:06 UTC
(In reply to Alexander Cherepanov from comment #16)
> Created attachment 975293 [details]
> More problems with `elflint --strict`
> 
> valgrind
> 
> Files: 2
> Errors:
>      25 Conditional jump or move depends on uninitialised value(s)
>      24 Use of uninitialised value of size ...
> 
> ----------------------------------------------------------------------
> 
> hang
> 
> Files: 1
> Errors:
>       1 Long loop?


There were some more low-level (version offset) checks missing which I have incorporated into the original fix on the mjw/pending branch. See "elflint: Add various low-level checks."

The usage of undefined memory comes from the libelf translation functions which cannot convert the full version section data because the version data definitions are corrupt/incomplete. This could lead to a data leak because the buffer is malloced, but not fully filled. I have a patch for that also on the mjw/pending branch "libelf: Make sure version xlate dest buffer is fully defined." but I am not too happy about it. In general it is inefficient and should be unnecessary. But there is currently no way for the translation functions to signal "failure" (all other translations are always well defined, only version data seems special).
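
To make the data-leak concern concrete, here is a sketch of the "fully defined destination" idea under simplified assumptions. convert_prefix is a hypothetical stand-in for a translation routine that stops early on corrupt input; it is not the libelf xlate code.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a translation routine that may stop early:
   it copies at most VALID bytes from SRC and returns how many bytes
   of DEST it actually wrote.  */
static size_t
convert_prefix (unsigned char *dest, const unsigned char *src,
                size_t len, size_t valid)
{
  size_t n = valid < len ? valid : len;
  memcpy (dest, src, n);
  return n;
}

/* Allocate LEN bytes, convert what can be converted, and zero the
   rest so no uninitialised heap contents reach the caller.  */
static unsigned char *
convert_defined (const unsigned char *src, size_t len, size_t valid)
{
  unsigned char *dest = malloc (len);
  if (dest == NULL)
    return NULL;
  size_t done = convert_prefix (dest, src, len, valid);
  memset (dest + done, 0, len - done);
  return dest;
}
```

Zeroing only the unconverted tail keeps the extra cost proportional to the corrupt part, though it still does work that a failure return from the translation function would make unnecessary.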

Comment 20 Mark Wielaard 2015-01-03 23:40:46 UTC
(In reply to Alexander Cherepanov from comment #17)
> Created attachment 975294 [details]
> Problems with `elfcmp -l`

Some NULL and zero checks were missing (plus a ehdr/phdr typo). Fixed on mjw/pending branch and posted to the list as "elfcmp: Add some NULL and zero checks."

Comment 21 Mark Wielaard 2015-01-07 22:52:03 UTC
(In reply to Mark Wielaard from comment #7)
> 2) read_encoded_value () used to parse CFI data doesn't do any bounds
> checking yet.

There is a patch posted, and on the mjw/pending branch, that adds bounds checking:

    libdw: Robustify eh_frame_hdr and encoded-values reading.
    
    Sanity check and keep track of binary_search_table data buffer length.
    Add bounds check to encoded value reading. Also fix a bug when reading
    the eh_frame header data from an other endian ELF image. Add a testcase
    that would fail the new sanity checks because of the endian bug.

The other FDE and CIE parsing code should also be reviewed for any missing bounds checks.

Comment 22 Alexander Cherepanov 2015-01-11 20:51:24 UTC
Created attachment 978856 [details]
Problems with `readelf -aAdehIlnrsSVcp -Nw`

Files: 1
Errors:
      1 Process terminating with default action of signal 11 (SIGSEGV)

Comment 23 Mark Wielaard 2015-01-11 22:12:58 UTC
(In reply to Alexander Cherepanov from comment #22)
> Created attachment 978856 [details]
> Problems with `readelf -aAdehIlnrsSVcp -Nw`
> 
> Files: 1
> Errors:
>       1 Process terminating with default action of signal 11 (SIGSEGV)

Thanks. I was just pondering all these unbounded array allocations on the stack in the code. This is an example of one. In this case descsz=8388624 and we have the following in ebl_object_note (eblobjnote.c):

 uint32_t buf[descsz / 4];

This causes a stack overflow that looks like a crash while calling a function.

BTW, the valgrind output already hints that something fishy is going on here:

==24364== Warning: client switching stacks?  SP change: 0xffefff5f0 --> 0xffe7ff5d0

When compiling with gcc -Wstack-usage=65536 these issues are easy to spot as:

/home/mark/src/elfutils/libebl/eblobjnote.c:220:1: error: stack usage might be unbounded [-Werror=stack-usage=]
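
One common way to bound such stack usage is a small fixed stack buffer with a heap fallback for large sizes. The following is only a sketch of that pattern; MAX_STACK_WORDS and sum_desc_words are hypothetical and this is not the change made in eblobjnote.c.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_STACK_WORDS 64      /* Hypothetical limit.  */

/* Sum the 32-bit words of a note descriptor of DESCSZ bytes.  Small
   descriptors use a fixed stack buffer; larger ones fall back to
   malloc.  Returns -1 on allocation failure, otherwise 0.  */
static int
sum_desc_words (const unsigned char *desc, size_t descsz, uint64_t *sum)
{
  size_t nwords = descsz / 4;
  uint32_t stack_buf[MAX_STACK_WORDS];
  uint32_t *buf = stack_buf;

  if (nwords > MAX_STACK_WORDS)
    {
      buf = malloc (nwords * sizeof *buf);
      if (buf == NULL)
        return -1;
    }

  memcpy (buf, desc, nwords * sizeof *buf);
  *sum = 0;
  for (size_t i = 0; i < nwords; i++)
    *sum += buf[i];

  if (buf != stack_buf)
    free (buf);
  return 0;
}
```

With a descsz of 8388624 as in this sample, the buffer comes from the heap and stack usage stays constant, which is the property -Wstack-usage checks for.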

Comment 24 Alexander Cherepanov 2015-01-15 23:08:44 UTC
Created attachment 980692 [details]
Problems with `strings`

Files: 1
Errors:
      1 Invalid read of size ...

Comment 25 Alexander Cherepanov 2015-01-18 22:55:56 UTC
Created attachment 981241 [details]
Problems with `strip`

(If the arguments of malloc are OK, then please ignore the "fishy" errors.)

Files: 18
Errors:
      3 Argument 'size' of function malloc has a fishy (possibly negative) value: ...
     14 Invalid read of size ...
      5 Process terminating with default action of signal 11 (SIGSEGV)
      4 Process terminating with default action of signal 8 (SIGFPE)

Comment 26 Mark Wielaard 2015-01-22 12:02:21 UTC
(In reply to Alexander Cherepanov from comment #24)
> Created attachment 980692 [details]
> Problems with `strings`
> 
> Files: 1
> Errors:
>       1 Invalid read of size ...

That was an interesting case. It isn't specific to eu-strings. It is a generic issue with elf_strptr which could return a non-NUL terminated string. Most callers weren't prepared for that. So the fix is to add a check to elf_strptr:

  libelf: Make sure string returned by elf_strptr is NUL terminated.

But before I could fix that I had to fix some other issues:

  libelf: elf_strptr should fetch the shdr for the section if not yet known.
  libelf: Fix elf_newdata when raw ELF file/image data is available.
  libelf: elf_strptr should use datalist when data has been added to section.

All have been posted to the mailing list and are on the mjw/pending git branch.
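
As a sketch of what such a termination check can look like, the helper below only returns a string if a NUL byte exists between the offset and the end of the table. checked_strptr is a hypothetical name and a simplification; the real elf_strptr works on sections and data lists.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return a pointer to the string at OFFSET inside
   a string table of SIZE bytes, or NULL when the offset is out of
   range or no NUL terminator exists before the end of the table.  */
static const char *
checked_strptr (const char *table, size_t size, size_t offset)
{
  if (offset >= size)
    return NULL;
  if (memchr (table + offset, '\0', size - offset) == NULL)
    return NULL;
  return table + offset;
}
```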

Comment 27 Alexander Cherepanov 2015-01-23 23:08:19 UTC
Created attachment 983597 [details]
Problems with `findtextrel`

Files: 5
Errors:
      5 Process terminating with default action of signal 8 (SIGFPE)

Comment 28 Alexander Cherepanov 2015-01-23 23:16:58 UTC
(In reply to Mark Wielaard from comment #26)
> That was an interesting case.

I'm glad that you like it:-)

> It isn't specific to eu-strings. It is a
> generic issue with elf_strptr which could return a non-NUL terminated
> string. Most callers weren't prepared for that.

It's surprising to hit such a bug after the previous fuzzing.

Comment 29 Alexander Cherepanov 2015-02-10 18:28:46 UTC
Created attachment 990196 [details]
Problems with `readelf -aAdehIlnrsSVcp -Nw` (32-bit)

I've built elfutils for 32-bit with some hardening turned on (exact configure is in configure.txt). The crashes on the attached samples are somewhat elusive. LD_PRELOAD=/lib/i386-linux-gnu/libSegFault.so makes crashes more deterministic. Stacktraces are in the catchsegv/ subdir.

There are also a number of cases of undefined behavior caught with gcc 4.9 -fsanitize=undefined (exact configure is in configure-ubsan.txt). More or less the same samples.

----------------------------------------------------------------------

ubsan

Files: 8
Errors:
      1 ../../../source/src/../libdw/memory-access.h:103:5: runtime error: left shift of ... by ... places cannot be represented in type 'long long int'
      1 ../../../source/src/readelf.c:1133:28: runtime error: left shift of ... by ... places cannot be represented in type 'int'
      1 ../../../source/src/readelf.c:5626:11: runtime error: signed integer overflow: ... - ... cannot be represented in type 'int'
      1 ../../../source/src/readelf.c:8021:25: runtime error: signed integer overflow: ... - ... cannot be represented in type 'int'
      1 ../../../source/src/readelf.c:8043:25: runtime error: signed integer overflow: ... - ... cannot be represented in type 'int'
      1 ../../../source/src/readelf.c:8069:27: runtime error: signed integer overflow: ... - ... cannot be represented in type 'int'
      1 ../../../source/src/readelf.c:8098:26: runtime error: signed integer overflow: ... - ... cannot be represented in type 'int'
      1 ../../../source/src/readelf.c:8116:8: runtime error: signed integer overflow: ... - ... cannot be represented in type 'int'

----------------------------------------------------------------------

catchsegv

Files: 6
Errors:
      6 *** Segmentation fault

Comment 30 Alexander Cherepanov 2015-02-11 23:44:53 UTC
Created attachment 990700 [details]
Problems with `addr2line -e @@ -- ...` (32-bit)

The exact (long) command line is in cmd.txt.

----------------------------------------------------------------------

valgrind

Files: 4
Errors:
      2 Conditional jump or move depends on uninitialised value(s)
      7 Invalid read of size ...
      3 Process terminating with default action of signal 11 (SIGSEGV)
      1 Use of uninitialised value of size ...

Comment 31 Alexander Cherepanov 2015-02-12 21:09:53 UTC
Created attachment 991147 [details]
Problems with `elflint --strict` (32-bit)

valgrind

Files: 6
Errors:
      2 Conditional jump or move depends on uninitialised value(s)
      5 Invalid read of size ...
      5 Process terminating with default action of signal 11 (SIGSEGV)

----------------------------------------------------------------------

ubsan

Files: 6
Errors:
      1 ../../../source/src/elflint.c:2204:30: runtime error: shift exponent ... is negative
      1 ../../../source/src/elflint.c:2204:30: runtime error: shift exponent ... is too large for 32-bit type 'unsigned int'
      1 ../../../source/src/elflint.c:2211:30: runtime error: shift exponent ... is negative
      1 ../../../source/src/elflint.c:2211:30: runtime error: shift exponent ... is too large for 32-bit type 'unsigned int'
      1 ../../../source/src/elflint.c:3035:3: runtime error: signed integer overflow: ... + ... cannot be represented in type 'int'
      1 ../../../source/src/elflint.c:3194:3: runtime error: signed integer overflow: ... + ... cannot be represented in type 'int'

Comment 32 Alexander Cherepanov 2015-02-15 22:03:58 UTC
Created attachment 991985 [details]
Problems with `nm -DsSp --mark-special` (32-bit)

valgrind

Files: 2
Errors:
      1 Argument 'size' of function malloc has a fishy (possibly negative) value: ...
      1 Invalid read of size ...
      1 Process terminating with default action of signal 11 (SIGSEGV)

----------------------------------------------------------------------

ubsan

Files: 2
Errors:
      1 ../../../source/libdw/dwarf_getsrclines.c:575:13: runtime error: signed integer overflow: ... + ... cannot be represented in type 'int'
      1 ../../../source/libdw/memory-access.h:103:5: runtime error: left shift of ... by ... places cannot be represented in type 'long long int'

Comment 33 Alexander Cherepanov 2015-02-15 22:26:02 UTC
Created attachment 991986 [details]
Problems with `ar -tv` (32-bit)

ubsan

Files: 1
Errors:
      1 ../../../source/src/ar.c:463:8: runtime error: variable length array bound evaluates to non-positive value ...

Comment 34 Alexander Cherepanov 2015-02-19 00:23:09 UTC
Created attachment 993359 [details]
Problems with `strip -o /dev/null` (32-bit)

valgrind

Files: 3
Errors:
      1 Argument 'size' of function malloc has a fishy (possibly negative) value: ...
      1 Invalid read of size ...
      1 Process terminating with default action of signal 11 (SIGSEGV)
      1 Process terminating with default action of signal 8 (SIGFPE)

Comment 35 Jaroslav Reznik 2015-03-03 16:34:07 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 22 development cycle.
Changing version to '22'.

More information and reason for this action is here:
https://fedoraproject.org/wiki/Fedora_Program_Management/HouseKeeping/Fedora22

Comment 36 Mark Wielaard 2015-04-21 09:10:43 UTC
(In reply to Alexander Cherepanov from comment #29)
> Created attachment 990196 [details]
> Problems with `readelf -aAdehIlnrsSVcp -Nw` (32-bit)
> 
> I've built elfutils for 32-bit with some hardening turned on (exact configure
> is in configure.txt). The crashes on the attached samples are somewhat
> elusive. LD_PRELOAD=/lib/i386-linux-gnu/libSegFault.so makes crashes more
> deterministic. Stacktraces are in the catchsegv/ subdir.

Thanks. These are all fixed by the following patch currently on the mjw/pending branch:

commit 3e4d5ccd97f597fef9e4053bd58ae54ba815499f
Author: Mark Wielaard <mjw@redhat.com>
Date:   Fri Apr 17 20:03:44 2015 +0200

    readelf: Add overflow checking to print_gdb_index_section dataend checks.
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1170810#c29
    
    Signed-off-by: Mark Wielaard <mjw@redhat.com>
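
For readers unfamiliar with why dataend checks need overflow checking: a test of the form "offset + len > size" can wrap around, especially with 32-bit sizes, and then pass for bogus input. A minimal sketch of a wrap-free formulation follows; range_ok is a hypothetical helper, not the code from the commit above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true when the LEN bytes starting at OFFSET lie
   entirely inside a buffer of SIZE bytes.  Written with subtraction so
   that OFFSET + LEN is never computed and cannot wrap around.  */
static bool
range_ok (size_t size, size_t offset, size_t len)
{
  return offset <= size && len <= size - offset;
}
```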

Comment 37 Mark Wielaard 2015-04-21 09:24:11 UTC
(In reply to Alexander Cherepanov from comment #29)
> There are also a number of cases of undefined behavior caught with gcc 4.9
> -fsanitize=undefined (exact configure is in configure-ubsan.txt). More or
> less the same samples.
> 
>       1 ../../../source/src/readelf.c:1133:28: runtime error: left shift of
> ... by ... places cannot be represented in type 'int'

This one really is a bug in glibc elf.h which has a simple one character fix, but has for some reason been somewhat hard to get applied. Most recent attempt:
https://sourceware.org/ml/libc-alpha/2015-04/msg00255.html

Comment 38 Mark Wielaard 2015-04-22 10:53:55 UTC
(In reply to Alexander Cherepanov from comment #29)
> There are also a number of cases of undefined behavior caught with gcc 4.9
> -fsanitize=undefined (exact configure is in configure-ubsan.txt). More or
> less the same samples.
> 
> ----------------------------------------------------------------------
> 
> ubsan
> 
> Files: 8
> Errors:
>       1 ../../../source/src/../libdw/memory-access.h:103:5: runtime error:
> left shift of ... by ... places cannot be represented in type 'long long int'
>       1 ../../../source/src/readelf.c:1133:28: runtime error: left shift of
> ... by ... places cannot be represented in type 'int'
>       1 ../../../source/src/readelf.c:5626:11: runtime error: signed integer
> overflow: ... - ... cannot be represented in type 'int'
>       1 ../../../source/src/readelf.c:8021:25: runtime error: signed integer
> overflow: ... - ... cannot be represented in type 'int'
>       1 ../../../source/src/readelf.c:8043:25: runtime error: signed integer
> overflow: ... - ... cannot be represented in type 'int'
>       1 ../../../source/src/readelf.c:8069:27: runtime error: signed integer
> overflow: ... - ... cannot be represented in type 'int'
>       1 ../../../source/src/readelf.c:8098:26: runtime error: signed integer
> overflow: ... - ... cannot be represented in type 'int'
>       1 ../../../source/src/readelf.c:8116:8: runtime error: signed integer
> overflow: ... - ... cannot be represented in type 'int'

Several patches, currently on the mjw/pending branch, were needed to correct all of these:

commit 87ad3a2ae15fc7b259dc2028d9a4c11c41199a4f
Author: Mark Wielaard <mjw@redhat.com>
Date:   Wed Apr 22 12:47:46 2015 +0200

    readelf: Fix cie_offset calculation comparison on 32bit.
    
    gcc -fsanitize=undefined pointed out that on 32bit systems the calculation
    to match the cie_offset to the cie_id could be undefined because a cie_id
    could be an unsigned 64bit value while ptrdiff_t is only 32bits. Correct
    the calculation to use 64bit values.
    
    Signed-off-by: Mark Wielaard <mjw@redhat.com>

commit b09d39ea4817ed66d302af219523af4e3b992513
Author: Mark Wielaard <mjw@redhat.com>
Date:   Wed Apr 22 12:28:30 2015 +0200

    libdw: Undefined behavior in get_sleb128_step.
    
    gcc -fsanitize=undefined pointed out that for too big sleb128 values we
    could shift into the sign bit. So for sleb128 values that have to fit
    in a (signed) int64_t variable reduce the max number of steps by one.
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1170810#c29
    
    Signed-off-by: Mark Wielaard <mjw@redhat.com>

commit 257cbc32eabacb038e76fa3ca779e1b99abef7d3
Author: Mark Wielaard <mjw@redhat.com>
Date:   Wed Apr 22 11:44:32 2015 +0200

    readelf: Check all offsets used in print_gdb_index_section against d_size.
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1170810#c29
    
    Signed-off-by: Mark Wielaard <mjw@redhat.com>

commit 3454ca233c3c8bcbc3e139a32b6d1a72cc855215
Author: Mark Wielaard <mjw@redhat.com>
Date:   Tue Apr 21 15:56:06 2015 +0200

    libdwfl: Fix wrong type to make gcc -fsanitize=undefined happy.
    
    gcc -fsanitize=undefined would warn about calling dwfl_error accessing
    the error string array: elfutils/libdwfl/dwfl_error.c:156:60:
        runtime error: index 385 out of bounds for type 'char [9]'
    
    This is because it thought we went beyond error string zero "no error",
    which is indeed 9 chars. "Correct" the type of the pointer to make
    ubsan happy.
    
    Signed-off-by: Mark Wielaard <mjw@redhat.com>
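
To illustrate the sleb128 issue, here is a sketch of a decoder that avoids shifting into the sign bit altogether by collecting the bits in an unsigned accumulator. Note that this is a different technique from the commit above, which limits the number of steps; read_sleb128 is a hypothetical function, not the get_sleb128_step macro.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one signed LEB128 value from [*P, END).  Bits are gathered in
   an unsigned accumulator, so no shift ever touches the sign bit of a
   signed type.  Returns false on truncated or overlong input.  */
static bool
read_sleb128 (const unsigned char **p, const unsigned char *end,
              int64_t *out)
{
  uint64_t acc = 0;
  unsigned int shift = 0;
  const unsigned char *q = *p;

  while (q < end && shift < 64)
    {
      unsigned char byte = *q++;
      acc |= (uint64_t) (byte & 0x7f) << shift;
      shift += 7;
      if ((byte & 0x80) == 0)
        {
          /* Sign-extend from the last payload bit, still unsigned.  */
          if (shift < 64 && (byte & 0x40) != 0)
            acc |= ~(uint64_t) 0 << shift;
          /* Conversion is implementation-defined, not undefined;
             gcc wraps modulo 2^64.  */
          *out = (int64_t) acc;
          *p = q;
          return true;
        }
    }
  return false;
}
```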

Comment 39 Mark Wielaard 2015-05-05 08:42:59 UTC
(In reply to Alexander Cherepanov from comment #30)
> Created attachment 990700 [details]
> Problems with `addr2line -e @@ -- ...` (32-bit)
> 
> The exact (long) command line is in cmd.txt.
> 
> ----------------------------------------------------------------------
> 
> valgrind
> 
> Files: 4
> Errors:
>       2 Conditional jump or move depends on uninitialised value(s)
>       7 Invalid read of size ...
>       3 Process terminating with default action of signal 11 (SIGSEGV)
>       1 Use of uninitialised value of size ...

These were caused by three different issues. Posted patches upstream and on the mjw/pending git branch:

commit 019e4e17bf7705f5adfd896fb8fbe3e0417fa61a
Author: Mark Wielaard <mjw@redhat.com>
Date:   Tue May 5 10:26:56 2015 +0200

    libdwfl: Sanity check cu offset before trying to intern.
    
    We need to check the cuoff points to a real Dwarf_Die before trying to
    intern the cu with tsearch. Otherwise bogus keys might end up in the
    search tree with NULL cus. That will cause crashes in compare_cukey
    during next insertion or deletion of cus.
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1170810#c30
    
    Signed-off-by: Mark Wielaard <mjw@redhat.com>

commit 91f219ca5be8fb5f054fada1fe3b333e6325fa4e
Author: Mark Wielaard <mjw@redhat.com>
Date:   Tue May 5 10:16:42 2015 +0200

    libdw: dwarf_getaranges check there is enough data before reading.
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1170810#c30
    
    Signed-off-by: Mark Wielaard <mjw@redhat.com>

commit 6abdb1c54fdf0ac258618a282c24bc9e7d2a91e3
Author: Mark Wielaard <mjw@redhat.com>
Date:   Tue May 5 10:05:01 2015 +0200

    libdwfl: Bounds check Dwarf_Fileinfo file number in dwfl_lineinfo.
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1170810#c30
    
    Signed-off-by: Mark Wielaard <mjw@redhat.com>

Comment 40 Mark Wielaard 2015-05-06 10:51:09 UTC
(In reply to Alexander Cherepanov from comment #32)
> Created attachment 991985 [details]
> Problems with `nm -DsSp --mark-special` (32-bit)
> 
> valgrind
> 
> Files: 2
> Errors:
>       1 Argument 'size' of function malloc has a fishy (possibly negative)
> value: ...

This is inside an xmalloc and the result is that the malloc just fails.

>       1 Invalid read of size ...
>       1 Process terminating with default action of signal 11 (SIGSEGV)

These are the same, fixed by the following patch submitted upstream and on mjw/pending:

commit 53d0023ca901da072653bd6837300958736d8676
Author: Mark Wielaard <mjw@redhat.com>
Date:   Wed May 6 12:45:49 2015 +0200

    nm: Handle dwarf_linesrc returning NULL.
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1170810#32
    
    Signed-off-by: Mark Wielaard <mjw@redhat.com>

Comment 41 Mark Wielaard 2015-05-06 11:01:07 UTC
(In reply to Alexander Cherepanov from comment #32)
> ubsan
> 
> Files: 2
> Errors:
>       1 ../../../source/libdw/dwarf_getsrclines.c:575:13: runtime error:
> signed integer overflow: ... + ... cannot be represented in type 'int'
>       1 ../../../source/libdw/memory-access.h:103:5: runtime error: left
> shift of ... by ... places cannot be represented in type 'long long int'

These are fixed by the following proposed fix currently on the mjw/pending branch:

commit 0a4e09387bdbe62a3a9d9d5b3045f05df48ef0e5
Author: Mark Wielaard <mjw@redhat.com>
Date:   Wed May 6 12:55:21 2015 +0200

    libdw: Detect line number overflow in dwarf_getsrclines on 32bit.
    
    We do check whether the values we store for the line fit our data
    representation in add_new_line, but on 32bit systems we would fail
    to notice line overflowing.
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1170810#c32
    
    Signed-off-by: Mark Wielaard <mjw@redhat.com>
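
A sketch of how a signed addition can be checked before it is performed, so that the overflow ubsan reports never happens. advance_line is a hypothetical helper for illustration and not the code in dwarf_getsrclines.c.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: add ADVANCE to LINE, returning false instead of
   overflowing when the result does not fit in an int.  */
static bool
advance_line (int line, int advance, int *result)
{
  if ((advance > 0 && line > INT_MAX - advance)
      || (advance < 0 && line < INT_MIN - advance))
    return false;
  *result = line + advance;
  return true;
}
```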

Comment 42 Mark Wielaard 2015-05-06 16:04:59 UTC
(In reply to Alexander Cherepanov from comment #31)
> Created attachment 991147 [details]
> Problems with `elflint --strict` (32-bit)
> 
> valgrind
> 
> Files: 6
> Errors:
>       2 Conditional jump or move depends on uninitialised value(s)
>       5 Invalid read of size ...
>       5 Process terminating with default action of signal 11 (SIGSEGV)
> 
> ----------------------------------------------------------------------
> 
> ubsan
> 
> Files: 6
> Errors:
>       1 ../../../source/src/elflint.c:2204:30: runtime error: shift exponent
> ... is negative
>       1 ../../../source/src/elflint.c:2204:30: runtime error: shift exponent
> ... is too large for 32-bit type 'unsigned int'
>       1 ../../../source/src/elflint.c:2211:30: runtime error: shift exponent
> ... is negative
>       1 ../../../source/src/elflint.c:2211:30: runtime error: shift exponent
> ... is too large for 32-bit type 'unsigned int'
>       1 ../../../source/src/elflint.c:3035:3: runtime error: signed integer
> overflow: ... + ... cannot be represented in type 'int'
>       1 ../../../source/src/elflint.c:3194:3: runtime error: signed integer
> overflow: ... + ... cannot be represented in type 'int'

These were caused by several issues. Proposed fixes on mjw/pending branch:

commit 6356deaac3cf10254c25907b3cd690404bac716a
Author: Mark Wielaard <mjw@redhat.com>
Date:   Wed May 6 18:02:10 2015 +0200

    elflint: Check gnu_hash has enough data and bitmask_words is not zero.
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1170810#c31
    
    Signed-off-by: Mark Wielaard <mjw@redhat.com>

commit 215328072eddfa048079c8fcb1b7791440ead699
Author: Mark Wielaard <mjw@redhat.com>
Date:   Wed May 6 17:38:18 2015 +0200

    elflint: Add sanity checks to check_attributes.
    
    This is similar to commit 9644aa for readelf print_attributes.
    Bail out when the vendor name isn't terminated and add overflow check
    for subsection_len.
    
    Note that readelf does handle non-gnu attributes, while elflint doesn't.
    
    Signed-off-by: Mark Wielaard <mjw@redhat.com>
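
For illustration only (not the code from the commit above): the two checks named in the message, a terminated vendor name and a subsection length that fits the remaining data, could be sketched as below. Both helper names and the [p, end) buffer convention are assumptions made for this example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: true if a NUL-terminated name starts at p and
   its terminator lies before end.  On success *name_len is the length
   of the name without the NUL.  */
static bool
name_is_terminated (const unsigned char *p, const unsigned char *end,
                    size_t *name_len)
{
  if (p >= end)
    return false;
  const unsigned char *nul = memchr (p, '\0', (size_t) (end - p));
  if (nul == NULL)
    return false;
  *name_len = (size_t) (nul - p);
  return true;
}

/* Hypothetical helper: true if a subsection with claimed length len,
   starting at p, fits inside [p, end).  The length is compared with
   the bytes that remain, so p + len is never computed and cannot
   overflow.  */
static bool
subsection_fits (const unsigned char *p, const unsigned char *end,
                 uint64_t len)
{
  return p <= end && len <= (uint64_t) (end - p);
}
```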

commit 378fb6c86cc192d4701e584bd250135b13de0bf6
Author: Mark Wielaard <mjw@redhat.com>
Date:   Wed May 6 16:01:55 2015 +0200

    elflint: Use Elf64_Word for shdr->sh_info cnt.
    
    On 32bit using int might overflow.
    https://bugzilla.redhat.com/show_bug.cgi?id=1170810#c31
    
    Signed-off-by: Mark Wielaard <mjw@redhat.com>

commit 1290c63739f6be8d96c5226b9d9361b3fb913cce
Author: Mark Wielaard <mjw@redhat.com>
Date:   Wed May 6 13:09:23 2015 +0200

    elflint: Stop checking section when 2nd hash function shift too big.
    
    Nothing good comes from trying to continue with a bogus hash function.
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1170810#c31
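
For illustration only (not the code from the commits above): the ubsan "shift exponent" errors come from shifting a 32-bit value by a count read from the file. A minimal sketch of validating the count first follows; the helper name shift_right_checked is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of elfutils.
   Shifts value right by shift bits and stores the result in *out.
   Returns false, leaving *out untouched, when shift is not smaller
   than the width of the type, which would be undefined behaviour.  */
static bool
shift_right_checked (uint32_t value, uint32_t shift, uint32_t *out)
{
  if (shift >= 32)
    return false;
  *out = value >> shift;
  return true;
}
```

Keeping the count unsigned means a value that would be negative as an int simply shows up as a very large count and is rejected by the same test.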

Comment 43 Mark Wielaard 2015-05-12 14:07:10 UTC
(In reply to Alexander Cherepanov from comment #34)
> Created attachment 993359 [details]
> Problems with `strip -o /dev/null` (32-bit)
> 
> valgrind
> 
> Files: 3
> Errors:
>       1 Argument 'size' of function malloc has a fishy (possibly negative)
> value: ...

I don't believe this is a real issue.

>       1 Invalid read of size ...
>       1 Process terminating with default action of signal 11 (SIGSEGV)
>       1 Process terminating with default action of signal 8 (SIGFPE)

This (and a lot more) has been fixed by the following commit on the mjw/pending branch:

commit d447095e373f67d8a13d1cd969456b4b6e51e2d7
Author: Mark Wielaard <mjw@redhat.com>
Date:   Tue May 12 15:59:04 2015 +0200

    strip: Harden against bogus input files. Don't leak tmp debug file on error.
    
    There were various places where a bogus/unexpected input file would cause
    eu-strip to crash. Also on an unexpected error eu-strip would leak the temp
    debug file it was writing.
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1170810#c34
    
    Signed-off-by: Mark Wielaard <mjw@redhat.com>

You might need some other fixes from the mjw/pending branch too.

Comment 44 Mark Wielaard 2015-05-23 22:17:39 UTC
(In reply to Mark Wielaard from comment #23)
> (In reply to Alexander Cherepanov from comment #22)
> > Created attachment 978856 [details]
> > Problems with `readelf -aAdehIlnrsSVcp -Nw`
> > 
> > Files: 1
> > Errors:
> >       1 Process terminating with default action of signal 11 (SIGSEGV)
> 
> Thanks. I was just pondering about all these unbounded array allocations on
> the stack in the code. This is an example of one. In this case
> descsz=8388624 and we have the following in ebl_object_note.c:
> 
>  uint32_t buf[descsz / 4];
> 
> Causing a stack overflow that looks like a crash while calling a function.
> 
> BTW. The valgrind output already hints there is something fishy going on
> here:
> 
> ==24364== Warning: client switching stacks?  SP change: 0xffefff5f0 -->
> 0xffe7ff5d0
> 
> When compiling with gcc -Wstack-usage=65536 these issues are easy to spot as:
> 
> /home/mark/src/elfutils/libebl/eblobjnote.c:220:1: error: stack usage might
> be unbounded [-Werror=stack-usage=]

This one, and hopefully all other stack overflow issues in the library code (but sadly not yet in the tools code), has been fixed by the following patch series (17 patches):
https://lists.fedorahosted.org/pipermail/elfutils-devel/2015-May/004899.html

All those patches are also on the mjw/pending branch.
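
For illustration only (not taken from the patch series): one common way to avoid an unbounded stack array such as "uint32_t buf[descsz / 4]" is to use a small fixed stack buffer and fall back to the heap for large sizes. The threshold MAX_STACK_BYTES and the function sum_desc_words are hypothetical; summing the words just stands in for whatever processing the real code does.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical threshold: anything larger goes on the heap.  */
#define MAX_STACK_BYTES 256

/* Hypothetical example: copy the 32-bit words of a note descriptor
   into a work buffer and add them up.  Returns 0 on success and -1
   if the heap allocation fails.  */
static int
sum_desc_words (const void *desc, size_t descsz, uint64_t *sum)
{
  uint32_t stack_buf[MAX_STACK_BYTES / sizeof (uint32_t)];
  uint32_t *buf = stack_buf;
  size_t nwords = descsz / sizeof (uint32_t);

  /* Stack usage is now bounded by MAX_STACK_BYTES no matter what
     descsz the input file claims.  */
  if (nwords > sizeof stack_buf / sizeof stack_buf[0])
    {
      buf = malloc (nwords * sizeof (uint32_t));
      if (buf == NULL)
        return -1;
    }

  memcpy (buf, desc, nwords * sizeof (uint32_t));

  uint64_t total = 0;
  for (size_t i = 0; i < nwords; i++)
    total += buf[i];

  if (buf != stack_buf)
    free (buf);
  *sum = total;
  return 0;
}
```

With a layout like this the function has a fixed, known stack footprint, so a gcc -Wstack-usage= limit of the kind quoted above can be enforced for it.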

I believe with those patches all crashes for samples posted in this bug report have finally been fixed.

So please bring on new samples :)

Comment 45 Mark Wielaard 2015-06-08 12:30:14 UTC
All patches are now on master and I haven't found any new crashers myself using afl. So I intend to close this issue when we release 0.162 later this week.

Comment 46 Fedora Update System 2015-06-11 13:16:07 UTC
elfutils-0.162-1.fc22 has been submitted as an update for Fedora 22.
https://admin.fedoraproject.org/updates/elfutils-0.162-1.fc22

Comment 47 Fedora Update System 2015-06-13 06:35:35 UTC
Package elfutils-0.162-1.fc22:
* should fix your issue,
* was pushed to the Fedora 22 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing elfutils-0.162-1.fc22'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2015-9857/elfutils-0.162-1.fc22
then log in and leave karma (feedback).

Comment 48 Fedora Update System 2015-06-19 14:53:55 UTC
elfutils-0.163-1.fc22 has been submitted as an update for Fedora 22.
https://admin.fedoraproject.org/updates/elfutils-0.163-1.fc22

Comment 49 Fedora Update System 2015-06-30 20:11:16 UTC
elfutils-0.163-1.fc22 has been pushed to the Fedora 22 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 50 Fedora Update System 2015-07-08 15:06:48 UTC
elfutils-0.163-1.fc21 has been submitted as an update for Fedora 21.
https://admin.fedoraproject.org/updates/elfutils-0.163-1.fc21

Comment 51 Fedora Update System 2015-07-29 01:35:30 UTC
elfutils-0.163-1.fc21 has been pushed to the Fedora 21 stable repository.  If problems still persist, please make note of it in this bug report.