Bug 2171888 - ocaml-dune segfault on i386 when inlining
Summary: ocaml-dune segfault on i386 when inlining
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: gcc
Version: rawhide
Hardware: i386
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Jakub Jelinek
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: AcceptedFreezeException
Depends On:
Blocks: F38BetaBlocker F38BetaFreezeException
Reported: 2023-02-20 18:15 UTC by Jerry James
Modified: 2023-02-23 00:45 UTC (History)

Fixed In Version: gcc-13.0.1-0.5.fc38
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-02-23 00:45:23 UTC
Type: Bug
Embargoed:

Links
System: GNU Compiler Collection
ID: 108868
Private: 0
Priority: P1
Status: NEW
Summary: [13 Regression] ocaml-dune miscompilation on i686 since r13-5965
Last Updated: 2023-02-20 23:03:07 UTC

Description Jerry James 2023-02-20 18:15:32 UTC
Description of problem:
The ocaml-dune package started failing to build from source, on i386 only, when gcc-13.0.1-0.4 landed in the repos (see bug 2168932).  The immediate problem is a segfault inside a call to pthread_sigmask:

pthread_sigmask(SIG_SETMASK, &saved_procmask, NULL);

Although saved_procmask is a local variable, gdb shows that pthread_sigmask receives the number 9 as its second argument.  The first and third arguments are passed correctly.  Downgrading to gcc-13.0.1-0.3 makes the issue go away.  Adding -fno-inline-functions to the build flags also makes the issue go away.
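
For readers unfamiliar with the pattern, here is a minimal sketch of the general shape involved. It is not the code from vendor/spawn/src/spawn_stubs.c. It assumes the usual sequence of saving the signal mask in a local, calling vfork (which a later comment names as the affected construct), and restoring the mask in the parent. The function name spawn_and_restore_mask and its exit-code parameter are hypothetical.

```c
#define _DEFAULT_SOURCE   /* make glibc declare vfork() */
#include <signal.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, NOT the code from spawn_stubs.c: block all signals,
   vfork a child that exits at once with `code`, then restore the saved
   mask in the parent.  Returns the child's exit status, or -1 on error. */
int spawn_and_restore_mask(int code)
{
    sigset_t all, saved_procmask;   /* saved_procmask lives in this frame */
    int status;
    pid_t pid;

    sigfillset(&all);
    if (pthread_sigmask(SIG_SETMASK, &all, &saved_procmask) != 0)
        return -1;

    pid = vfork();
    if (pid == 0)
        _exit(code);                /* child: only _exit or exec is safe */

    /* The parent resumes here after the child has exited.  The second
       argument must still be the address of the local variable above. */
    pthread_sigmask(SIG_SETMASK, &saved_procmask, NULL);

    if (pid < 0 || waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

With a correctly working compiler, spawn_and_restore_mask(7) should return 7. This sketch is not known to reproduce the segfault; it only shows where the address of saved_procmask is passed after vfork returns.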

I tried to make a reduced test case to illustrate the problem.  The code in question (vendor/spawn/src/spawn_stubs.c) has calls into the OCaml runtime, mostly for memory handling.  I tried to replace all such calls with generic C library functionality.  The resulting program does not segfault.  I suspect that means that the OCaml C API practice of converting back and forth between pointers and integers, and bit shifting in the process, somehow plays a role.  At this point, I don't know how to produce a small reproducer.
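
To illustrate the kind of pointer/integer conversion meant here, the following sketch mimics the tagged-integer encoding used by the OCaml runtime, where a word with its low bit set holds an integer shifted left by one and a word with its low bit clear is an aligned pointer. The names sketch_value, sketch_val_long, sketch_long_val and sketch_is_long are hypothetical stand-ins, not the macros from the OCaml headers, and whether this encoding actually contributes to the miscompilation is only a suspicion.

```c
#include <stdint.h>

/* Hypothetical stand-in for the runtime's machine-word value type. */
typedef intptr_t sketch_value;

/* Encode an integer: shift left by one and set the low tag bit. */
static inline sketch_value sketch_val_long(intptr_t n)
{
    return (sketch_value)(((uintptr_t)n << 1) + 1);
}

/* Decode an integer: an arithmetic right shift drops the tag bit.
   (Right-shifting a negative value is implementation-defined in C;
   gcc performs an arithmetic shift.) */
static inline intptr_t sketch_long_val(sketch_value v)
{
    return v >> 1;
}

/* A word with the low bit set is an integer; aligned pointers have it clear. */
static inline int sketch_is_long(sketch_value v)
{
    return (v & 1) != 0;
}
```

A C stub that receives such a word must decode it before use, so the same variable can hold either a shifted integer or a pointer cast to an integer, depending on its low bit.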

Version-Release number of selected component (if applicable):
gcc-13.0.1-0.4.fc39.i686

How reproducible:
Always

Steps to Reproduce:
1. Build the ocaml-dune package for Rawhide on i386

Actual results:
The bootstrap dune binary segfaults.

Expected results:
No segfaults.

Additional info:

Comment 1 Richard W.M. Jones 2023-02-20 21:15:20 UTC
I can't even get dune to compile on x86-64.  It fails at:

+ /usr/bin/make -O -j24 V=1 VERBOSE=1 release
ocamlc -output-complete-exe -w -24 -g -o .duneboot.exe -I boot unix.cma boot/libs.ml boot/duneboot.ml
./.duneboot.exe
Internal error, please report upstream including the contents of _build/log.
Description:
  ("Unexpected find result", { found = Not_found; lib.name = "pp" })
Raised at Stdune__code_error.raise in file
  "otherlibs/stdune/src/code_error.ml", line 11, characters 30-62
Called from Fiber__scheduler.exec in file "otherlibs/fiber/src/scheduler.ml",
  line 73, characters 8-11
-> required by ("<unnamed>", ())
-> required by ("<unnamed>", ())
-> required by ("<unnamed>", ())
-> required by ("load-dir", In_build_dir "default/src/fsevents/bin")
-> required by ("toplevel", ())

I must not crash.  Uncertainty is the mind-killer. Exceptions are the
little-death that brings total obliteration.  I will fully express my cases. 
Execution will pass over me and through me.  And when it has gone past, I
will unwind the stack along its path.  Where the cases are handled there will
be nothing.  Only I will remain.
make: *** [Makefile:47: release] Error 1

Comment 2 Richard W.M. Jones 2023-02-20 21:19:35 UTC
As for the original bug, I don't think there's an easy way to reproduce it
without a full i686 environment.  In theory this might work:

$ setarch i386 fedpkg local

but in practice it fails because of lack of OCaml dependencies compiled for
i686 like ocaml-pp.

Comment 3 Fedora Update System 2023-02-22 08:43:45 UTC
FEDORA-2023-213a7f0317 has been submitted as an update to Fedora 38. https://bodhi.fedoraproject.org/updates/FEDORA-2023-213a7f0317

Comment 4 Fedora Blocker Bugs Application 2023-02-22 10:05:44 UTC
Proposed as a Blocker for 38-beta by Fedora user jakub using the blocker tracking app because:

 An important wrong-code bug has been introduced in gcc-13.0.1-0.4.fc38 (tagged into stable Feb 17), which results in likely miscompilation of all code that uses vfork.  I'm afraid every day the fixed gcc-13.0.1-0.5.fc38 isn't tagged into f38 means risk of further miscompiled packages.

Comment 5 Jakub Jelinek 2023-02-22 10:08:08 UTC
Proposed as F38 beta blocker.
Details in https://gcc.gnu.org/PR108868 and https://gcc.gnu.org/PR108691

Comment 6 Fedora Blocker Bugs Application 2023-02-22 10:33:03 UTC
Proposed as a Freeze Exception for 38-beta by Fedora user frantisekz using the blocker tracking app because:

 Proposing as a FE, this will have the same effect as Blocker in this case (we have a fix), but makes things wording-wise easier procedurally.

Comment 7 Fedora Update System 2023-02-22 13:32:33 UTC
FEDORA-2023-213a7f0317 has been pushed to the Fedora 38 testing repository.

You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2023-213a7f0317

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 8 Adam Williamson 2023-02-22 17:15:49 UTC
+4 FE in https://pagure.io/fedora-qa/blocker-review/issue/1045 , marking accepted FE.

Comment 9 Fedora Update System 2023-02-23 00:45:23 UTC
FEDORA-2023-213a7f0317 has been pushed to the Fedora 38 stable repository.
If the problem persists, please make a note of it in this bug report.

