Bug 2430184 - cca 30 % python3.15 aarch64 builds fail with internal compiler error: Segmentation fault
Keywords:
Status: NEW
Alias: None
Product: Fedora
Classification: Fedora
Component: gcc
Version: rawhide
Hardware: aarch64
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Jakub Jelinek
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2026-01-15 21:45 UTC by Miro Hrončok
Modified: 2026-03-11 11:09 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed:
Type: Bug
Embargoed:


Attachments
preprocessed sources (1.26 MB, text/x-csrc)
2026-01-15 21:45 UTC, Miro Hrončok
/tmp/cc*.out *.gcda files (4.01 MB, application/x-tar)
2026-01-16 12:28 UTC, Miro Hrončok


Links
System | ID | Private | Priority | Status | Summary | Last Updated
GNU Compiler Collection | 123634 | 0 | P3 | UNCONFIRMED | ICE in cancel_negative_cycle on aarch64 | 2026-01-16 13:57:33 UTC
Github | python cpython issues 145801 | 0 | None | open | Python should be built with "gcc -fprofile-update=atomic" for PGO to fix random GCC crashes | 2026-03-11 11:09:05 UTC
Github | python cpython pull 145802 | 0 | None | open | gh-145801: Use gcc -fprofile-update=atomic for PGO builds | 2026-03-11 11:09:05 UTC

Description Miro Hrončok 2026-01-15 21:45:27 UTC
Created attachment 2122283
preprocessed sources

At least since gcc 16.0.0-0.4.fc44, the build of python3.15 intermittently fails in Koji on aarch64 (circa 30 % of the time) with:

*** WARNING *** there are active plugins, do not report this as a bug unless you can reproduce it without enabling any plugins.
Event                            | Plugins
PLUGIN_FINISH_UNIT               | annobin: Generate final annotations
PLUGIN_START_UNIT                | annobin: Generate global annotations
PLUGIN_ALL_PASSES_START          | annobin: Generate per-function annotations
PLUGIN_ALL_PASSES_END            | annobin: Register per-function end symbols
during IPA pass: profile
/builddir/build/BUILD/python3.15-3.15.0_a3-build/Python-3.15.0a3/Modules/_opcode.c: In function ‘_opcode_has_arg’:
/builddir/build/BUILD/python3.15-3.15.0_a3-build/Python-3.15.0a3/Modules/_opcode.c:452:1: internal compiler error: Segmentation fault
  452 | }
      | ^

We were unable to reproduce the problem outside of Koji, so we don't know if it happens without enabling any plugins.

To reproduce, run e.g.

for i in {1..10}; do koji build rawhide --fail-fast --nowait --scratch 'git+https://src.fedoraproject.org/forks/ksurma/rpms/python3.15.git#3914aa0174b9589bee4b441a524dc4af3d4dadcb' --arch-override=aarch64; done


Some of the builds will fail in 15-30 minutes, depending on the beefiness of the builder.

Comment 1 Miro Hrončok 2026-01-16 10:23:26 UTC
Still happens with gcc 16.0.1-0.2.fc44

Comment 2 Jakub Jelinek 2026-01-16 11:05:35 UTC
Ah, so this is a PGO -fprofile-use ICE. That could explain some of the non-reproducibility, but it also means that the preprocessed source isn't the only thing needed to reproduce: the corresponding *.gcda file (or a tarball of all the *.gcda files in the tree after the ICE) plus the /tmp/cc*.out files will be needed.

Comment 3 Jakub Jelinek 2026-01-16 12:20:49 UTC
I have done
--- python3.15.spec	2026-01-16 12:16:57.152258435 +0100
+++ python3.15.spec	2026-01-16 12:47:50.623486885 +0100
@@ -31,6 +31,8 @@
 # we build Python 2x, it's better to just run it once with the "full" build
 %bcond tests %{without bootstrap}
 
+%undefine _annotated_build
+
 # ==================
 # Top-level metadata
 # ==================
@@ -1106,11 +1108,11 @@ EOF
 
 %if %{without bootstrap}
   # Regenerate generated files (needs python3)
-  %make_build %{flags_override} regen-all PYTHON_FOR_REGEN="python%{pybasever}"
+  %make_build %{flags_override} regen-all PYTHON_FOR_REGEN="python%{pybasever}" || ( tar cf - /tmp/cc*.out `find /builddir/build -name \*.gcda` | base64; exit 1 )
 %endif
 
   # Invoke the build
-  %make_build %{flags_override}
+  %make_build %{flags_override} || ( tar cf - /tmp/cc*.out `find /builddir/build -name \*.gcda` | base64; exit 1 )
 
   popd
   echo FINISHED: BUILD OF PYTHON FOR CONFIGURATION: $ConfName
to disable the annobin plugin and grab the cc*.out as well as *.gcda files (should have used | xz -9 too to make it smaller).
Now to find some native aarch64 machine where I can try to reproduce it in mock; unfortunately it doesn't reproduce in a cross compiler.

Comment 4 Jakub Jelinek 2026-01-16 12:21:18 UTC
https://kojipkgs.fedoraproject.org//work/tasks/2558/141152558/build.log is the log file with the base64 encoded tarball.
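
A minimal sketch of how a payload produced by `tar cf - ... | base64` can be turned back into a file listing. It is a round trip on throwaway data only: the directory layout and the dummy `_opcode.gcda` content are assumptions for illustration, and in a real build.log the base64 lines would first have to be separated from the surrounding log output, which this sketch does not attempt.

```shell
# Round trip on throwaway data; nothing here touches a real build tree.
work=$(mktemp -d)
mkdir -p "$work/src/Modules"
printf 'dummy profile data\n' > "$work/src/Modules/_opcode.gcda"

# Encode the same way as the modified spec: tar to stdout, then base64.
( cd "$work/src" && tar cf - $(find . -name '*.gcda') ) | base64 > "$work/payload.b64"

# Decode the payload and list the archive members.
members=$(base64 -d < "$work/payload.b64" | tar tf -)
printf '%s\n' "$members"

rm -rf "$work"
```

Replacing `tar tf -` with `tar xf -` would unpack the files instead of listing them.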

Comment 5 Miloš Komarčević 2026-01-16 12:23:23 UTC
> We were unable to reproduce the problem outside of Koji

I don't think this is restricted to Python packages, I saw the same error (intermittently) w/ e.g. darktable builds, including x86_64:

https://koji.fedoraproject.org/koji/taskinfo?taskID=141148367

And on OBS as well for Fedora Rawhide:

https://build.opensuse.org/package/live_build_log/graphics:darktable:master/darktable/Fedora_Rawhide/aarch64

Comment 6 Miro Hrončok 2026-01-16 12:24:01 UTC
I actually have the buildroot tarred up with all the *.gcda files, but I don't have the tmp files. So if you are able to get it your way, even better.

Comment 7 Miro Hrončok 2026-01-16 12:28:38 UTC
Created attachment 2122340
/tmp/cc*.out *.gcda files

Comment 8 Jakub Jelinek 2026-01-16 13:28:14 UTC
Ok, finally managed to reproduce with the Modules/_opcode.gcda file from the failed build.
The segfault is
#0  cancel_negative_cycle (fixup_graph=0xffffffffe330, pi=<optimized out>, d=<optimized out>, cycle=<optimized out>) at ../../gcc/mcf.cc:890
#1  find_minimum_cost_flow (fixup_graph=0xffffffffe330) at ../../gcc/mcf.cc:1332
#2  mcf_smooth_cfg () at ../../gcc/mcf.cc:1381
#3  0x000000000090b4ac in compute_branch_probabilities (cfg_checksum=<optimized out>, lineno_checksum=<optimized out>) at ../../gcc/profile.cc:663
#4  branch_prob (thunk=<optimized out>) at ../../gcc/profile.cc:1605
#5  0x00000000009a00ec in tree_profiling () at ../../gcc/tree-profile.cc:1945
#6  (anonymous namespace)::pass_ipa_tree_profile::execute (this=<optimized out>) at ../../gcc/tree-profile.cc:2075
#7  0x0000000001070934 in execute_one_pass (pass=pass@entry=0x2c5cd30) at ../../gcc/passes.cc:2656
#8  0x0000000001070654 in execute_ipa_pass_list (pass=0x2c5cd30) at ../../gcc/passes.cc:3118
#9  0x0000000001d0cf40 in ipa_passes () at ../../gcc/cgraphunit.cc:2241
#10 symbol_table::compile (this=0xfffff7806000) at ../../gcc/cgraphunit.cc:2364
#11 0x0000000001d07674 in symbol_table::finalize_compilation_unit (this=0xfffff7806000) at ../../gcc/cgraphunit.cc:2623
#12 0x0000000001ce2edc in compile_file () at ../../gcc/toplev.cc:485
#13 0x0000000001c60c4c in do_compile () at ../../gcc/toplev.cc:2228
#14 toplev::main (this=this@entry=0xffffffffe968, argc=<optimized out>, argv=<optimized out>) at ../../gcc/toplev.cc:2395
#15 0x0000000001c5ff68 in main (argc=<optimized out>, argv=<optimized out>) at ../../gcc/main.cc:39
where on line 889 pfedge is set to NULL and on line 890 dereferenced:
889           pfedge = find_fixup_edge (fixup_graph, cycle[k + 1], cycle[k]);
890           cycle_flow = MIN (cycle_flow, pfedge->rflow);

Comment 9 Jakub Jelinek 2026-01-16 13:56:58 UTC
Unfortunately I'm not familiar with mcf.cc at all (I didn't even know it existed until today).  If you want a quick workaround, I guess you could somehow filter out the -fprofile-use option on aarch64 when compiling Modules/_opcode.c (if that is the only source on which it segfaults).
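
A rough sketch of what "filtering out" the option could look like at the shell level. The flag list is a made-up example, and how such a filtered list would be wired into Python's build system for a single source file is not covered here; that part is an open assumption.

```shell
# Hypothetical flag list; the real flags used by the python3.15 build differ.
cflags='-O2 -fprofile-use -fprofile-correction -g'

# Keep every flag except -fprofile-use (with or without a =path argument).
filtered=
for flag in $cflags; do
  case $flag in
    -fprofile-use|-fprofile-use=*) ;;                # drop the profile-use flag
    *) filtered="${filtered:+$filtered }$flag" ;;    # keep everything else
  esac
done
printf '%s\n' "$filtered"
```

The loop relies on word splitting of `$cflags`, so it only suits flag lists without embedded spaces or quoting.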

Comment 10 Jakub Jelinek 2026-01-16 14:04:25 UTC
Could you check whether using -fprofile-update=atomic fixes the gcda corruption?

Comment 11 Jakub Jelinek 2026-01-16 16:07:59 UTC
(In reply to Miloš Komarčević from comment #5)
> > We were unable to reproduce the problem outside of Koji
> 
> I don't think this is restricted to Python packages, I saw the same error
> (intermittently) w/ e.g. darktable builds, including x86_64:
> 
> https://koji.fedoraproject.org/koji/taskinfo?taskID=141148367
> 
> And on OBS as well for Fedora Rawhide:
> 
> https://build.opensuse.org/package/live_build_log/graphics:darktable:master/
> darktable/Fedora_Rawhide/aarch64

I believe darktable is completely unrelated, I think it is
https://gcc.gnu.org/PR122852 instead, and so should be fixed in gcc-16.0.1-0.3.fc44 which is
currently building in rawhide.

Comment 12 Miro Hrončok 2026-01-16 18:36:45 UTC
Trying 10 builds with: sed -i 's/-fprofile-use /-fprofile-use=atomic /' configure.ac

Comment 13 Jakub Jelinek 2026-01-16 18:52:16 UTC
That is not it.  The argument of -fprofile-use= is a directory in which to look for the gcda files.
I meant the -fprofile-update=atomic option, so
sed -i 's/-fprofile-generate /& -fprofile-update=atomic /' configure.ac
The -fprofile-update= option chooses how the code emitted by -fprofile-generate updates the profile feedback counters.  The default is prefer-atomic, which behaves like single if -pthread is not specified and like atomic if -pthread is specified.  With single, counters are updated non-atomically; that is faster, but can corrupt counters when multiple threads try to increment them concurrently.  atomic is slower because it uses atomic instructions, but shouldn't result in corrupted gcda files.  If python (or its libraries) is threaded during the profile feedback phase and some or all of the source files weren't compiled with -pthread, then -fprofile-update=atomic is desirable (if the compiler supports it; very old gcc versions didn't).
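
To see what the sed expression does without editing configure.ac, it can be run on sample input. The two sample lines below are assumptions modelled on the PGO_PROF_GEN_FLAG assignment mentioned in this bug, not copies of the real configure.ac. The sketch also shows that the pattern only matches when `-fprofile-generate` is followed by a space.

```shell
# Hypothetical sample lines; the real configure.ac may be laid out differently.
with_space='PGO_PROF_GEN_FLAG="-fprofile-generate "'
no_space='PGO_PROF_GEN_FLAG="-fprofile-generate"'

# Same sed expression as in the comment; & re-inserts the matched text.
add_atomic() {
  printf '%s\n' "$1" | sed 's/-fprofile-generate /& -fprofile-update=atomic /'
}

patched=$(add_atomic "$with_space")     # flag is appended after the match
untouched=$(add_atomic "$no_space")     # no trailing space, so no match
printf '%s\n%s\n' "$patched" "$untouched"
```

If the flag in the file is immediately followed by a closing quote, the pattern would need adjusting or the assignment would have to be edited directly.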

Comment 14 Jakub Jelinek 2026-01-16 18:56:05 UTC
So, going from -fprofile-generate -fprofile-update=single to the same with =atomic causes, e.g. on x86_64, differences like
-	addq	$1, __gcov0.foo(%rip)
+	lock addq	$1, __gcov0.foo(%rip)
and
-	call	__gcov_indirect_call_profiler_v4
+	call	__gcov_indirect_call_profiler_v4_atomic
and
-	movq	__gcov_time_profiler_counter(%rip), %rax
+	movl	$1, %eax
+	lock xaddq	%rax, __gcov_time_profiler_counter(%rip)
 	addq	$1, %rax
-	movq	%rax, __gcov_time_profiler_counter(%rip)

Comment 15 Miro Hrončok 2026-01-16 19:58:25 UTC
I must have confused -fprofile-update with -fprofile-use in my head, sorry about that :/ Trying with PGO_PROF_GEN_FLAG="-fprofile-generate" -> PGO_PROF_GEN_FLAG="-fprofile-generate -fprofile-update=atomic"

Comment 16 Miro Hrončok 2026-01-16 22:29:15 UTC
10/10 builds succeeded with that.
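
A quick back-of-the-envelope check of how strong that result is. It assumes the roughly 30 % failure rate from the description and treats the ten builds as independent, which is a simplification since builders and load differ.

```shell
# Assumed inputs: 30 % per-build failure rate, 10 independent builds.
p_fail=0.30
builds=10

# Chance that all builds pass by luck if nothing had changed: (1 - p)^n.
p_all_pass=$(LC_ALL=C awk -v p="$p_fail" -v n="$builds" 'BEGIN { printf "%.3f", (1 - p) ^ n }')
printf 'P(all %s builds pass by chance) = %s\n' "$builds" "$p_all_pass"
```

Under those assumptions ten clean builds in a row would happen by chance only about 3 % of the time, so the result is suggestive but not proof.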

Comment 17 Jakub Jelinek 2026-01-19 13:37:22 UTC
Note, the switch to -fprofile-update=atomic is not just some kind of workaround, it is the right fix.  Of course, the compiler should try harder to recover even from bogus gcda and that is what the upstream bug is about.

Comment 18 Miro Hrončok 2026-01-19 13:42:28 UTC
> the switch to -fprofile-update=atomic is not just some kind of workaround, it is the right fix

To clarify, this applies to all architectures, not just aarch64, correct?

Comment 19 Jakub Jelinek 2026-01-19 13:45:55 UTC
Yes.  If the instrumented program(s) or their libraries executed between the -fprofile-generate and -fprofile-use builds are multi-threaded, then either -fprofile-update=atomic or -pthread should have been used next to -fprofile-generate during the compilation.
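
The rule above can be expressed as a small check over a flag string. `pgo_flags_ok` is a hypothetical helper written for this illustration, not part of any existing tool. It only does simple substring matching and ignores cases such as a later -fprofile-update=single overriding an earlier option.

```shell
# Hypothetical helper: succeeds if the flags look safe for a multi-threaded
# training run, i.e. -fprofile-generate is accompanied by
# -fprofile-update=atomic or -pthread.
pgo_flags_ok() {
  case " $1 " in
    *" -fprofile-generate"*) ;;   # instrumented build: keep checking
    *) return 0 ;;                # not an instrumented build: nothing to check
  esac
  case " $1 " in
    *" -fprofile-update=atomic "*|*" -pthread "*) return 0 ;;
    *) return 1 ;;
  esac
}

for flags in "-O2 -fprofile-generate" "-O2 -fprofile-generate -fprofile-update=atomic"; do
  if pgo_flags_ok "$flags"; then
    printf 'ok:    %s\n' "$flags"
  else
    printf 'risky: %s\n' "$flags"
  fi
done
```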

Comment 20 Miro Hrončok 2026-03-10 19:43:43 UTC
Interestingly, I now see this (or something similar) on an x86_64 build.

https://koji.fedoraproject.org/koji/taskinfo?taskID=143216355

*** WARNING *** there are active plugins, do not report this as a bug unless you can reproduce it without enabling any plugins.
Event                            | Plugins
PLUGIN_FINISH_UNIT               | annobin: Generate final annotations
PLUGIN_START_UNIT                | annobin: Generate global annotations
PLUGIN_ALL_PASSES_START          | annobin: Generate per-function annotations
PLUGIN_ALL_PASSES_END            | annobin: Register per-function end symbols
during IPA pass: profile
/builddir/build/BUILD/python3.15-3.15.0_a7-build/Python-3.15.0a7/Modules/_opcode.c: In function ‘_opcode_has_arg’:
/builddir/build/BUILD/python3.15-3.15.0_a7-build/Python-3.15.0a7/Modules/_opcode.c:449:1: internal compiler error: Segmentation fault
  449 | }
      | ^

Comment 21 Miro Hrončok 2026-03-11 11:09:06 UTC
Victor Stinner opened https://github.com/python/cpython/pull/145802 to use gcc -fprofile-update=atomic for PGO builds

