Bug 1420551 - non-deterministic errors/ICE building dyninst
Summary: non-deterministic errors/ICE building dyninst
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: gcc
Version: 26
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Jakub Jelinek
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-02-08 23:51 UTC by Josh Stone
Modified: 2018-05-29 12:39 UTC (History)

Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-05-29 12:39:35 UTC
Type: Bug
Embargoed:


Attachments
pre-processed Ident.C (224.84 KB, text/plain)
2017-02-09 00:23 UTC, Josh Stone
pre-processed Ident.C (no pch) (1.72 MB, text/plain)
2017-02-09 00:50 UTC, Josh Stone
Valgrind log of cc1plus (315.29 KB, text/plain)
2017-02-09 21:18 UTC, Josh Stone

Description Josh Stone 2017-02-08 23:51:26 UTC
Description of problem:
Two attempts to build dyninst in koji have failed with different errors each time: sometimes an internal compiler error (ICE), sometimes "invalid preprocessing directive" errors, even on plain #include lines.

I'm not able to reproduce these locally, though I instead hit a later "error: ISO C++ forbids comparison between pointer and integer" that looks like a real error.

Version-Release number of selected component (if applicable):
gcc-7.0.1-0.6.fc26
trying to build dyninst-9.3.0-2.fc26

How reproducible:
???

Steps to Reproduce:
1. fedpkg clone dyninst && cd dyninst
2. fedpkg build

Actual results:
Failed to build on two occasions:
https://koji.fedoraproject.org/koji/taskinfo?taskID=17679787
https://koji.fedoraproject.org/koji/taskinfo?taskID=17680734

Expected results:
Successful builds, or at least relevant errors.

Additional info:

The errors appear to be focused around common/src/Ident.C.

In the first attempt, the errors are:

x86_64:
/builddir/build/BUILD/dyninst-9.3.0/dyninst-9.3.0/common/src/Ident.C:36:2: error: invalid preprocessing directive #include
 #include "common/src/Ident.h"
  ^~~~~~~
/builddir/build/BUILD/dyninst-9.3.0/dyninst-9.3.0/common/src/Ident.C:37:2: error: invalid preprocessing directive #include
 #include <stdio.h>
  ^~~~~~~
/builddir/build/BUILD/dyninst-9.3.0/dyninst-9.3.0/common/src/Ident.C:38:2: error: invalid preprocessing directive #include
 #include <string>
  ^~~~~~~
[and more]

ppc64:
In file included from /builddir/build/BUILD/dyninst-9.3.0/dyninst-9.3.0/common/src/Ident.C:36:0:
/builddir/build/BUILD/dyninst-9.3.0/dyninst-9.3.0/common/src/Ident.h:37:2: error: invalid preprocessing directive #if
 #if !defined(_Ident_h_)
  ^~
/builddir/build/BUILD/dyninst-9.3.0/dyninst-9.3.0/common/src/Ident.h:38:2: error: invalid preprocessing directive #define
 #define _Ident_h_
  ^~~~~~
[and more]

i686:
In file included from /usr/include/c++/7/iostream:38:0,
                 from /builddir/build/BUILD/dyninst-9.3.0/dyninst-9.3.0/common/src/Ident.h:44,
                 from /builddir/build/BUILD/dyninst-9.3.0/dyninst-9.3.0/common/src/Ident.C:36:
/usr/include/c++/7/i686-redhat-linux/bits/c++config.h:1:0: internal compiler error: Segmentation fault
 // Predefined symbols and macros -*- C++ -*-
 
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://bugzilla.redhat.com/bugzilla> for instructions.
Preprocessed source stored into /tmp/ccWwJ0N2.out file, please attach this to your bugreport.
[but this being in koji, I can't grab that file...]


On the second attempt, the errors are:

x86_64:
In file included from /builddir/build/BUILD/dyninst-9.3.0/dyninst-9.3.0/common/h/dyntypes.h:172:0,
                 from /builddir/build/BUILD/dyninst-9.3.0/dyninst-9.3.0/common/src/Types.h:168,
                 from /builddir/build/BUILD/dyninst-9.3.0/dyninst-9.3.0/common/src/headers.h:53,
                 from /builddir/build/BUILD/dyninst-9.3.0/dyninst-9.3.0/common/src/Ident.h:47,
                 from /builddir/build/BUILD/dyninst-9.3.0/dyninst-9.3.0/common/src/Ident.C:36:
/builddir/build/BUILD/dyninst-9.3.0/dyninst-9.3.0/common/h/dyn_regs.h:53:9: internal compiler error: Segmentation fault
         Arch_none   = 0x00000000,
         ^~~~~~~~~
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://bugzilla.redhat.com/bugzilla> for instructions.
Preprocessed source stored into /tmp/cccayAil.out file, please attach this to your bugreport.

ppc64 and i686:
/builddir/build/BUILD/dyninst-9.3.0/testsuite-9.3.0/src/JUnitOutputDriver.cpp:38:39: error: ISO C++ forbids comparison between pointer and integer [-fpermissive]
             if(last_group->mutatee != '\0') suitename << "." << last_group->mutatee;
                                       ^~~~
[this one looks like a real error, whew!]

In my local x86_64 "fedpkg mockbuild", I only got that last legitimate error.
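
For reference, the JUnitOutputDriver.cpp error is what GCC 7 reports when a pointer is compared with a character literal: in C++11 and later, '\0' is a char and no longer counts as a null pointer constant. The sketch below shows one possible rewrite. The Group struct and both helper functions are hypothetical stand-ins, and whether the original code meant a null check, an emptiness check, or both is an assumption; this version does both.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical stand-in for the testsuite's group record; only the field
// quoted in the error message is modelled here.
struct Group {
    const char *mutatee;
};

// True when the pointer is non-null and the string it points to is non-empty.
// The original comparison `mutatee != '\0'` compared the pointer itself with
// a char literal, which C++11 and later reject.
static bool has_mutatee(const Group *g) {
    return g->mutatee != nullptr && *g->mutatee != '\0';
}

// Builds "<prefix>.<mutatee>", or just "<prefix>" when there is no mutatee.
static std::string suite_name(const std::string &prefix, const Group *g) {
    std::ostringstream suitename;
    suitename << prefix;
    if (has_mutatee(g)) suitename << "." << g->mutatee;
    return suitename.str();
}
```

If only the null check was intended, `g->mutatee != nullptr` (or `!= NULL`) alone would compile cleanly as well.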

Comment 1 Josh Stone 2017-02-08 23:58:55 UTC
Third time is not the charm...
https://koji.fedoraproject.org/koji/taskinfo?taskID=17680868

x86_64:
make[2]: *** Deleting file 'common/CMakeFiles/common.dir/src/Ident.C.o'
c++: internal compiler error: Segmentation fault (program cc1plus)
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://bugzilla.redhat.com/bugzilla> for instructions.

ppc64:
In file included from /builddir/build/BUILD/dyninst-9.3.0/dyninst-9.3.0/common/src/headers.h:37:0,
                 from /builddir/build/BUILD/dyninst-9.3.0/dyninst-9.3.0/common/src/Ident.h:47,
                 from /builddir/build/BUILD/dyninst-9.3.0/dyninst-9.3.0/common/src/Ident.C:36:
/usr/include/sys/types.h:27:1: internal compiler error: Segmentation fault
 __BEGIN_DECLS
 ^~~~~~~~~~~~~
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://bugzilla.redhat.com/bugzilla> for instructions.
Preprocessed source stored into /tmp/ccrPA72F.out file, please attach this to your bugreport.

Comment 2 Josh Stone 2017-02-09 00:11:34 UTC
Running locally under valgrind produces complaints, though the compile still succeeds.

==3646== Conditional jump or move depends on uninitialised value(s)
==3646==    at 0x4C35860: putenv (vg_replace_strmem.c:1988)
==3646==    by 0x465870: UnknownInlinedFun (gcc.c:2673)
==3646==    by 0x465870: driver::maybe_putenv_OFFLOAD_TARGETS() const (gcc.c:7707)
==3646==    by 0x447449: driver::main(int, char**) (gcc.c:7238)
==3646==    by 0x4474D3: main (gcc-main.c:46)
==3646==
==3649== Syscall param execve(envp[i]) points to uninitialised byte(s)
==3649==    at 0x5220B57: execve (in /usr/lib64/libc-2.24.90.so)
==3649==    by 0x4649FB: pex_unix_exec_child (pex-unix.c:675)
==3649==    by 0x464495: pex_run_in_environment (pex-common.c:344)
==3649==    by 0x464495: pex_run (pex-common.c:374)
==3649==    by 0x454174: execute() (gcc.c:3092)
==3649==    by 0x452CD7: driver::do_spec_on_infiles() const (gcc.c:8125)
==3649==    by 0x447475: driver::main(int, char**) (gcc.c:7248)
==3649==    by 0x4474D3: main (gcc-main.c:46)
==3649==  Address 0x5557748 is 104 bytes inside a block of size 4,064 alloc'd
==3649==    at 0x4C2DB8D: malloc (vg_replace_malloc.c:299)
==3649==    by 0x4650CD: xmalloc (xmalloc.c:147)
==3649==    by 0x4770CA: call_chunkfun (obstack.c:94)
==3649==    by 0x4770CA: _obstack_begin_worker (obstack.c:141)
==3649==    by 0x4654A1: driver::putenv_COLLECT_GCC(char const*) const (gcc.c:7659)
==3649==    by 0x447439: driver::main(int, char**) (gcc.c:7236)
==3649==    by 0x4474D3: main (gcc-main.c:46)
==3649==
==3650== Syscall param execve(envp[i]) points to uninitialised byte(s)
==3650==    at 0x5220B57: execve (in /usr/lib64/libc-2.24.90.so)
==3650==    by 0x522144A: execvpe (in /usr/lib64/libc-2.24.90.so)
==3650==    by 0x464A51: pex_unix_exec_child (pex-unix.c:670)
==3650==    by 0x46437A: pex_run_in_environment (pex-common.c:344)
==3650==    by 0x46437A: pex_run (pex-common.c:374)
==3650==    by 0x40A39F: execute() [clone .cold.68] (gcc.c:3092)
==3650==    by 0x452CD7: driver::do_spec_on_infiles() const (gcc.c:8125)
==3650==    by 0x447475: driver::main(int, char**) (gcc.c:7248)
==3650==    by 0x4474D3: main (gcc-main.c:46)
==3650==  Address 0x5557748 is 104 bytes inside a block of size 4,064 alloc'd
==3650==    at 0x4C2DB8D: malloc (vg_replace_malloc.c:299)
==3650==    by 0x4650CD: xmalloc (xmalloc.c:147)
==3650==    by 0x4770CA: call_chunkfun (obstack.c:94)
==3650==    by 0x4770CA: _obstack_begin_worker (obstack.c:141)
==3650==    by 0x4654A1: driver::putenv_COLLECT_GCC(char const*) const (gcc.c:7659)
==3650==    by 0x447439: driver::main(int, char**) (gcc.c:7236)
==3650==    by 0x4474D3: main (gcc-main.c:46)

Comment 3 Josh Stone 2017-02-09 00:23:20 UTC
Created attachment 1248702 [details]
pre-processed Ident.C

I can reproduce the valgrind errors on this Ident.ii.
$ valgrind g++ -fPIC -c Ident.ii

Comment 4 Jakub Jelinek 2017-02-09 00:35:53 UTC
There is indeed a bug in the maybe_putenv_OFFLOAD_TARGETS function, but I don't see how it could cause any non-determinism when you don't use -fopenmp target offloading. That env var is only used during linking, to determine whether failing to find the gcc-offload-nvptx files should result in an error or not.

diff --git a/gcc7-foffload-default.patch b/gcc7-foffload-default.patch
index 74f8b4a..646db4e 100644
--- a/gcc7-foffload-default.patch
+++ b/gcc7-foffload-default.patch
@@ -45,7 +45,7 @@ libgomp/
 +      if (offload_targets_default)
 +      {
 +        obstack_grow (&collect_obstack, "OFFLOAD_TARGET_DEFAULT=1",
-+                      sizeof ("OFFLOAD_TARGET_DEFAULT=1") - 1);
++                      sizeof ("OFFLOAD_TARGET_DEFAULT=1"));
 +        xputenv (XOBFINISH (&collect_obstack, char *));
 +      }
      }
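
The one-character change in the patch matters because obstack_grow copies exactly the number of bytes it is given and adds no terminator, while sizeof on a string literal counts the trailing NUL. A minimal sketch of the difference follows, with a std::vector standing in for the obstack; the grow and is_terminated helpers and the kVar name are hypothetical and not part of GCC.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for obstack_grow: appends exactly `len` bytes and
// adds nothing of its own (in particular, no terminator).
static void grow(std::vector<char> &buf, const char *data, std::size_t len) {
    buf.insert(buf.end(), data, data + len);
}

// True when the grown bytes end in a NUL, i.e. form a complete C string.
static bool is_terminated(const std::vector<char> &buf) {
    return !buf.empty() && buf.back() == '\0';
}

static const char kVar[] = "OFFLOAD_TARGET_DEFAULT=1";

// sizeof counts the literal's trailing NUL; strlen would report 24.
static_assert(sizeof(kVar) == 25, "24 characters plus the terminator");
```

Growing with `sizeof(kVar) - 1` leaves 24 unterminated bytes, so whatever happens to follow them in memory becomes part of the string handed to putenv; growing with `sizeof(kVar)` includes the terminator.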

Comment 5 Josh Stone 2017-02-09 00:50:57 UTC
Created attachment 1248709 [details]
pre-processed Ident.C (no pch)

Here's the same pre-processed source with cotire disabled, so no PCH is involved.  It still gives the same valgrind errors, but completes without issue.

Comment 6 Josh Stone 2017-02-09 02:39:47 UTC
I ran with MALLOC_PERTURB_=42, and at exec time I can see that the environment contains:

> OFFLOAD_TARGET_DEFAULT=1<D5><D5><D5><D5><D5><D5><D5><D5>COLLECT_GCC_OPTIONS='-fPIC' '-c' '-shared-libgcc' '-mtune=generic' '-march=x86-64'
> COLLECT_GCC_OPTIONS='-fPIC' '-c' '-shared-libgcc' '-mtune=generic' '-march=x86-64'

(where every <D5> is the escaped byte value)

But would this cause any problem in practice?
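
The listing above can be reproduced in a small simulation. With MALLOC_PERTURB_=42 (0x2A), glibc fills freshly allocated memory with the complement byte, 0xD5, so an unterminated string runs through the filler bytes and into whatever was written next. The Chunk type, the helpers, and the offsets used in the usage note are illustrative assumptions chosen to match the 8 filler bytes shown; none of this is the driver's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Simulates a freshly malloc'd chunk under MALLOC_PERTURB_=42: every byte
// starts out as 0xD5, the complement of 0x2A.
struct Chunk {
    unsigned char bytes[128];
    Chunk() { std::memset(bytes, 0xD5, sizeof bytes); }
};

// Copies `len` bytes of `s` to `offset`; the caller decides whether the
// terminating NUL is included in `len`.
static void put(Chunk &c, std::size_t offset, const char *s, std::size_t len) {
    std::memcpy(c.bytes + offset, s, len);
}

// Returns the bytes from `offset` up to the first NUL (or the end of the
// chunk), which is what a consumer of a C string would see.
static std::string as_env_string(const Chunk &c, std::size_t offset) {
    const unsigned char *start = c.bytes + offset;
    const std::size_t limit = sizeof c.bytes - offset;
    const void *nul = std::memchr(start, 0, limit);
    const std::size_t len =
        nul ? static_cast<std::size_t>(
                  static_cast<const unsigned char *>(nul) - start)
            : limit;
    return std::string(reinterpret_cast<const char *>(start), len);
}
```

Writing the 24-byte variable at offset 0 without its NUL and the next string at offset 32 with its NUL makes `as_env_string(c, 0)` return the first string, eight 0xD5 bytes, and then the second string, as in the listing. Writing the first string with its NUL yields just "OFFLOAD_TARGET_DEFAULT=1".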

Comment 7 Josh Stone 2017-02-09 21:18:11 UTC
Created attachment 1248904 [details]
Valgrind log of cc1plus

In my previous valgrind run, I forgot --trace-children=yes, so the errors from just the driver were not very interesting.  With children included, the driver's errors are the same as before; /usr/bin/as was clean; but the attached log from cc1plus has *much* to say:

ERROR SUMMARY: 4794 errors from 117 contexts (suppressed: 0 from 0)

Comment 8 Jakub Jelinek 2017-02-09 21:28:24 UTC
Those shouldn't be a problem; they all look like the usual sparseset* cases, which valgrind doesn't understand.  GCC needs to be configured with --enable-valgrind-annotations to be fully usable under valgrind (but that slows it down a tiny bit, so it is intentionally not enabled in the rpm builds).
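
For readers unfamiliar with the sparseset cases mentioned above: a sparse set keeps two arrays that are deliberately never initialised, and its membership test reads a possibly garbage index before validating it. That read is harmless by construction, but it is presumably the kind of access valgrind reports. The class below is a minimal sketch of the general technique and not GCC's implementation; it fills the arrays with a caller-supplied junk value instead of leaving them uninitialised, so the example stays well-defined C++.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sparse set over the universe [0, size). Real implementations leave both
// arrays uninitialised; the `junk` fill simulates that here.
class SparseSet {
public:
    SparseSet(std::size_t size, std::size_t junk)
        : sparse_(size, junk), dense_(size, junk), members_(0) {}

    // The index read from sparse_ may be junk. It is trusted only if it is
    // in range and dense_ points back at `e`.
    bool contains(std::size_t e) const {
        const std::size_t idx = sparse_[e];
        return idx < members_ && dense_[idx] == e;
    }

    void insert(std::size_t e) {
        if (contains(e)) return;
        sparse_[e] = members_;
        dense_[members_] = e;
        ++members_;
    }

    // Constant-time clear: the old contents simply become junk again.
    void clear() { members_ = 0; }

    std::size_t size() const { return members_; }

private:
    std::vector<std::size_t> sparse_;
    std::vector<std::size_t> dense_;
    std::size_t members_;
};
```

The answers do not depend on the junk value: `SparseSet s(8, 0xD5D5D5D5)` and `SparseSet s(8, 0)` behave identically, which is why the uninitialised reads in the real data structure do not affect correctness.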

Anyway, the driver bug with OFFLOAD_TARGET_DEFAULT=1 should be fixed in gcc-7.0.1-0.7.fc26 that is currently building.
Are you only seeing these bugs when using PCH?  If so, that of course makes it hard to reproduce.  Are those PCH files at least built by the very same compiler that is used to consume them?

Comment 9 Josh Stone 2017-02-09 21:36:44 UTC
Yes, the PCH files are built with the same compiler.  It appears to make no difference locally with or without PCH, but then I haven't really reproduced the same issues as koji anyway.

I'll wait for the 0.7 release and try on koji again, and if that still fails I'll do a koji build without PCH.

Comment 10 Josh Stone 2017-02-14 23:44:35 UTC
New attempt with gcc-7.0.1-0.8.fc26:
https://koji.fedoraproject.org/koji/taskinfo?taskID=17864361

i686:
make[2]: *** Deleting file 'common/CMakeFiles/common.dir/src/Ident.C.o'
c++: internal compiler error: Segmentation fault (program cc1plus)


x86_64:
/builddir/build/BUILD/dyninst-9.3.0/dyninst-9.3.0/common/src/Ident.C:36:2: error: invalid preprocessing directive #include
 #include "common/src/Ident.h"
  ^~~~~~~
/builddir/build/BUILD/dyninst-9.3.0/dyninst-9.3.0/common/src/Ident.C:37:2: error: invalid preprocessing directive #include
 #include <stdio.h>
  ^~~~~~~
/builddir/build/BUILD/dyninst-9.3.0/dyninst-9.3.0/common/src/Ident.C:38:2: error: invalid preprocessing directive #include
 #include <string>
  ^~~~~~~
/builddir/build/BUILD/dyninst-9.3.0/dyninst-9.3.0/common/src/Ident.C:39:2: error: invalid preprocessing directive #include
 #include <iostream>
  ^~~~~~~
/builddir/build/BUILD/dyninst-9.3.0/dyninst-9.3.0/common/src/Ident.C:40:2: error: invalid preprocessing directive #include
 #include <ostream>
  ^~~~~~~
/builddir/build/BUILD/dyninst-9.3.0/dyninst-9.3.0/common/src/Ident.C:47: confused by earlier errors, bailing out


ppc64 got through to the known error in the testsuite sources.

Comment 11 Josh Stone 2017-02-15 20:15:46 UTC
I disabled cotire (PCH) and fixed the testsuite error, and got a good build!
https://koji.fedoraproject.org/koji/taskinfo?taskID=17888674

Of course, it would still be nice to know what was failing in GCC.  I've archived the "bad" srpm using PCH here:
https://jistone.fedorapeople.org/bz1420551/dyninst-9.3.0-2.fc26.src.rpm

Comment 12 Jakub Jelinek 2017-02-16 15:51:22 UTC
Can't really reproduce even in koji:
https://koji.fedoraproject.org/koji/taskinfo?taskID=17900080
using your "bad" srpm.

Comment 13 Fedora End Of Life 2017-02-28 11:13:33 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 26 development cycle.
Changing version to '26'.

Comment 14 Fedora End Of Life 2018-05-03 08:03:47 UTC
This message is a reminder that Fedora 26 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 26. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora 'version'
of '26'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 26 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.

Comment 15 Fedora End Of Life 2018-05-29 12:39:35 UTC
Fedora 26 changed to end-of-life (EOL) status on 2018-05-29. Fedora 26
is no longer maintained, which means that it will not receive any
further security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

