I am trying to update rubygem-libffi and the test suite fails on ppc64le / s390x:

https://koji.fedoraproject.org/koji/taskinfo?taskID=123659137

This is likely similar / same issue as bug 2040380, but this time only ppc64le / s390x specific. IOW this is the reproducer:

~~~
$ rpm -q ruby
ruby-3.3.5-15.fc42.s390x

$ cat fiddle_fork.rb
require 'fiddle/import'

module Fiddle
  module LIBC
    extend Importer
    dlload "libc.so.6", "libm.so.6"

    CallCallback = bind("void call_callback(void*, void*)"){ | ptr1, ptr2| }
  end
end

error, pid, status = IO.pipe do |r, w|
  pid = fork {}
  w.close
  [r.read, *Process.wait2(pid)]
end

$ gdb --args ruby-mri fiddle_fork.rb
GNU gdb (Fedora Linux) 15.1-2.fc42
Copyright (C) 2024 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "s390x-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ruby-mri...

This GDB supports auto-downloading debuginfo from the following URLs:
  <https://debuginfod.fedoraproject.org/>
Enable debuginfod for this session? (y or [n]) y
Debuginfod has been enabled.
To make this setting permanent, add 'set debuginfod enabled on' to .gdbinit.
Reading symbols from /root/.cache/debuginfod_client/a15aa9502c1ef56cabe920ca9b2880cc36ecd9e9/debuginfo...
(gdb) r
Starting program: /usr/bin/ruby-mri fiddle_fork.rb
Downloading separate debug info for system-supplied DSO at 0x3ffffffe000
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[New Thread 0x3ffde18f8c0 (LWP 2823)]
[Thread 0x3ffde18f8c0 (LWP 2823) exited]
[Detaching after fork from child process 2824]
[New Thread 0x3ffde18f8c0 (LWP 2825)]

Thread 1 "ruby-mri" received signal SIGABRT, Aborted.
__pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
44	  return INTERNAL_SYSCALL_ERROR_P (ret) ? INTERNAL_SYSCALL_ERRNO (ret) : 0;
(gdb) bt
#0  __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
#1  0x000003fff7839506 in __pthread_kill_internal (threadid=<optimized out>, signo=6) at pthread_kill.c:78
#2  0x000003fff77e0210 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3  0x000003fff77c058c in __GI_abort () at abort.c:79
#4  0x000003ffdc465494 in dlfree (mem=<optimized out>) at ../src/dlmalloc.c:4355
#5  0x000003ffdc468c46 in ffi_closure_free (ptr=<optimized out>) at ../src/closures.c:1055
#6  0x000003ffdc47112c in dealloc (ptr=0x2aa00242e80) at /usr/src/debug/ruby-3.3.5-15.fc42.s390x/ext/fiddle/closure.c:33
#7  0x000003fff7bda7f8 in rb_data_free (objspace=0x2aa00009aa0, obj=4397447004680) at /usr/src/debug/ruby-3.3.5-15.fc42.s390x/gc.c:3500
#8  obj_free (objspace=objspace@entry=0x2aa00009aa0, obj=obj@entry=4397447004680) at /usr/src/debug/ruby-3.3.5-15.fc42.s390x/gc.c:3659
#9  0x000003fff7bdd4d8 in rb_objspace_call_finalizer (objspace=0x2aa00009aa0) at /usr/src/debug/ruby-3.3.5-15.fc42.s390x/gc.c:4704
#10 0x000003fff7bc7e90 in rb_ec_finalize (ec=0x2aa0000aa90) at /usr/src/debug/ruby-3.3.5-15.fc42.s390x/eval.c:168
#11 rb_ec_cleanup (ec=ec@entry=0x2aa0000aa90, ex=RUBY_TAG_NONE) at /usr/src/debug/ruby-3.3.5-15.fc42.s390x/eval.c:260
#12 0x000003fff7bc83b8 in ruby_run_node (n=0x3ffdc50fa20) at /usr/src/debug/ruby-3.3.5-15.fc42.s390x/eval.c:328
#13 0x000002aa000011f6 in rb_main (argc=2, argv=0x3ffffff9e68) at /usr/src/debug/ruby-3.3.5-15.fc42.s390x/main.c:39
#14 main (argc=<optimized out>, argv=<optimized out>) at /usr/src/debug/ruby-3.3.5-15.fc42.s390x/main.c:58
(gdb)
~~~

I wonder what is the reason for the difference between s390x / ppc64le and the other platforms.

Reproducible: Always

Actual Results:
The closure fails after fork on s390x / ppc64le

Expected Results:
The closure after fork works just fine on all platforms.
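To check for the abort without gdb, the reproducer could be run as a child process and its termination status inspected. This is only a sketch: `describe_status` and `run_and_wait` are hypothetical helper names, and it assumes `fiddle_fork.rb` from the report sits in the current directory.

```ruby
require 'rbconfig'

# Hypothetical helper: turn a Process::Status into a short description.
def describe_status(status)
  if status.signaled?
    "killed by SIG#{Signal.signame(status.termsig)}"
  else
    "exited with status #{status.exitstatus}"
  end
end

# Hypothetical helper: run a command and return its Process::Status.
def run_and_wait(*cmd)
  pid = Process.spawn(*cmd)
  Process.wait2(pid).last
end

# Assumption: fiddle_fork.rb is the reproducer from the report.
if File.exist?('fiddle_fork.rb')
  status = run_and_wait(RbConfig.ruby, 'fiddle_fork.rb')
  puts "fiddle_fork.rb #{describe_status(status)}"
end
```

On an affected system the report would presumably name SIGABRT, matching the backtrace; on an unaffected one the script should exit with status 0.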
If you're on a platform that uses shared memory (anonymous or file-backed) for its closures, note that fork() does *not* unshare that memory. Freeing a closure in one thread frees it for all threads! As each platform may have a different way of storing closures, this problem may not exist on all platforms.
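A rough illustration of the fork() behaviour described above, written for Linux and not taken from libffi's allocator: a page mapped with MAP_SHARED stays shared between parent and child, so a write made by the child (standing in for "freeing") is visible to the parent. Strictly speaking fork() creates a new process, not a thread, so the sharing here is between processes. The numeric constants are assumed Linux values, and the one-byte "in use" flag is purely hypothetical.

```ruby
require 'fiddle'

# Assumption: Linux values from <sys/mman.h>; other systems differ.
PROT_READ     = 0x1
PROT_WRITE    = 0x2
MAP_SHARED    = 0x01
MAP_ANONYMOUS = 0x20
PAGE_SIZE     = 4096

libc = Fiddle.dlopen(nil) # symbols already loaded into this process
mmap = Fiddle::Function.new(
  libc['mmap'],
  [Fiddle::TYPE_VOIDP, Fiddle::TYPE_SIZE_T, Fiddle::TYPE_INT,
   Fiddle::TYPE_INT, Fiddle::TYPE_INT, Fiddle::TYPE_LONG],
  Fiddle::TYPE_VOIDP
)

# Map one anonymous page that remains shared across fork().
addr = mmap.call(nil, PAGE_SIZE, PROT_READ | PROT_WRITE,
                 MAP_SHARED | MAP_ANONYMOUS, -1, 0).to_i
map_failed = (1 << (8 * Fiddle::SIZEOF_VOIDP)) - 1 # (void *) -1
raise 'mmap failed' if [0, -1, map_failed].include?(addr)

shared = Fiddle::Pointer.new(addr, PAGE_SIZE)
shared[0] = 1 # parent marks the byte as "in use"

pid = fork do
  shared[0] = 0 # child "frees" it, writing to the same physical page
  exit!(0)      # leave without running finalizers in the child
end
Process.wait(pid)

# With a truly shared mapping the parent now sees the child's write.
puts "byte seen by parent after child exit: #{shared[0]}"
```

With a private mapping (MAP_PRIVATE) the parent would still see its own value, which is why the problem depends on how each platform stores closures.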
(In reply to DJ Delorie from comment #1)
> If you're on a platform that uses shared memory (anonymous or file-backed)
> for its closures, note that fork() does *not* unshare that memory. Freeing
> a closure in one thread frees it for all threads!
> As each platform may have a different way of storing closures, this problem
> may not exist on all platforms.

Do I understand correctly that this is a feature, not a bug, and that there is no way to improve the situation?

Actually, looking at [1, 2], it seems that static trampolines were not implemented for ppc64le / s390x, so that is likely part of the issue.

[1]: https://github.com/libffi/libffi/pull/624
[2]: https://github.com/libffi/libffi/blob/084f36903f56b280283f3f4473a80b8e77727a29/configure.ac#L375-L390
(In reply to Vít Ondruch from comment #2)
> Actually, looking at [1, 2], it seems that static trampolines were not
> implemented for ppc64le / s390x, so that is likely part of the issue.

I'm going to raise this with our IBM colleagues to see what we can do to improve the implementation.
(In reply to Carlos O'Donell from comment #3) Thx a lot 👍
The IBM teams fixed this by implementing static trampolines for s390x:

https://github.com/libffi/libffi/commit/458b2ae2829f1916ea3a3e07c944b4668732290f

So once we include this commit, it will fix the issue for s390x.

I'll have to check back on the ppc64le status.
This bug appears to have been reported against 'rawhide' during the Fedora Linux 42 development cycle. Changing version to 42.
Created attachment 2079176 [details]
Patch to add static trampoline support to powerpc*-linux

Here's a WIP patch that adds static trampoline support for powerpc-linux, powerpc64-linux and powerpc64le-linux. It passes the libffi testsuite with no errors. Can someone please give this a try on the Ruby issue reported in this bugzilla? If that comes back clean, I'll work on getting it upstream.
This patch seems to be missing src/powerpc/internal.h
Created attachment 2079219 [details]
Updated patch to add static trampoline support to powerpc*-linux, with internal.h

Sorry about that. Here's an updated patch which does include internal.h as well as a few other cleanups.

bergner@begna:~$ diffstat libffi-ppc64.v5.diff
 Makefile.am                   |    3 ++-
 configure.ac                  |    3 ++-
 src/powerpc/ffi.c             |   13 +++++++++++++
 src/powerpc/ffi_linux64.c     |   45 ++++++++++++++++++++++++++-------------------
 src/powerpc/ffi_sysv.c        |   42 ++++++++++++++++++++++++++----------------
 src/powerpc/internal.h        |    6 ++++++
 src/powerpc/linux64_closure.S |   39 +++++++++++++++++++++++++++++++++++++++
 src/powerpc/ppc_closure.S     |   24 ++++++++++++++++++++++++
 8 files changed, 138 insertions(+), 37 deletions(-)
(In reply to Peter Bergner from comment #8)
> Can someone please give this a try on the Ruby issue reported in
> this bugzilla?

To test rubygem-ffi, it should be enough to remove this line from the .spec file:

https://src.fedoraproject.org/rpms/rubygem-ffi/blob/c3e875b6594ace69c2ce4798748d4c6dbe75e666/f/rubygem-ffi.spec#_49
DJ has a Rawhide build up with the changes: https://bodhi.fedoraproject.org/updates/FEDORA-2025-c66c3c672e
Tried a scratch build as suggested, still fails:

build/BUILD/rubygem-ffi-1.17.0-build/ffi-1.17.0/usr/share/gems/gems/ffi-1.17.0/spec/ffi/async_callback_spec.rb:57: warning: Ractor is experimental, and the behavior may change in future versions of Ruby! Also there are many implementation issues.
..........[...]
/var/tmp/rpm-tmp.LKXb6s: line 55:  3424 Aborted                 (core dumped) RUBYOPT="-I$(dirs +1)/usr/lib64/gems/ruby/ffi-1.17.0" rspec spec

https://koji.fedoraproject.org/koji/taskinfo?taskID=129947312

According to root.log, the new build was used:

DEBUG util.py:461:  libffi  ppc64le  3.4.7-2.fc43  build  90.3 KiB
Looks like configure.ac was patched, but configure was not regenerated. The old configure doesn't enable static trampolines for powerpc64-linux-gnu.
(In reply to Florian Weimer from comment #14)
> Looks like configure.ac was patched, but configure was not regenerated. The
> old configure doesn't enable static trampolines for powerpc64-linux-gnu.

I created the patch against the current upstream libffi sources, which don't ship a configure script. Instead, you run autogen.sh, which creates it for you.
(In reply to Florian Weimer from comment #14)
> Looks like configure.ac was patched, but configure was not regenerated. The
> old configure doesn't enable static trampolines for powerpc64-linux-gnu.

Can someone regenerate configure, or just apply the configure.ac changes to configure, and rerun the test? If it passes, I'll work on pushing the changes to upstream libffi.
My bad, please try https://koji.fedoraproject.org/koji/taskinfo?taskID=130825673

If that works I'll push it officially.
Created attachment 2082475 [details]
Final patch to add static trampoline support to powerpc*-linux, with internal.h

(In reply to DJ Delorie from comment #17)
> My bad, please try https://koji.fedoraproject.org/koji/taskinfo?taskID=130825673
>
> If that works I'll push it officially.

I don't really have the ability to run that Ruby test case against the updated libffi, so hopefully one of you can do that for us.

That said, I've updated the patch again and attached it here. This is the "final" version of the patch. It adds pcrel support in the case you're compiling for power10 and it cleans up the newly added asm code to use relocations against the offsets, rather than just using the offsets as is, in the event someone changes them and they need real relocation handling.
I did another scratch build with the "final" patch: https://koji.fedoraproject.org/koji/taskinfo?taskID=130860329
(In reply to DJ Delorie from comment #19)
> I did another scratch build with the "final" patch:
> https://koji.fedoraproject.org/koji/taskinfo?taskID=130860329

We need a non-scratch build before we can run a rubygem-ffi scratch build.
(In reply to Peter Bergner from comment #18)
> That said, I've updated the patch again and attached it here. This is the
> "final" version of the patch. It adds pcrel support in the case you're
> compiling for power10 and it cleans up the newly added asm code to use
> relocations against the offsets, rather than just using the offsets as is,
> in the event someone changes them and they need real relocation handling.

I'll note that I built and regtested libffi using the patch on powerpc64le-linux using --with-gcc-arch= from power8 thru power10, and on powerpc64-linux and powerpc-linux from power4 thru power10, with no regressions.
libffi-3.4.7-3 is in rawhide and has the latest patch.
Is there any status on doing a rubygem-ffi scratch build with the updated libffi-3.4.7-3 in rawhide?
Sorry, commented on the wrong bug (bug 2344471 comment 5):

A scratch build of rubygem-ffi succeeded:
https://koji.fedoraproject.org/koji/taskinfo?taskID=130949254

I even removed the compatibility patch.
(In reply to Florian Weimer from comment #24)
> Sorry, commented on the wrong bug (bug 2344471 comment 5):
>
> A scratch build of rubygem-ffi succeeded:
> https://koji.fedoraproject.org/koji/taskinfo?taskID=130949254
>
> I even removed the compatibility patch.

Excellent, thanks for confirming it fixes the rubygem-ffi issue!

DJ, did you say you could push the patch to upstream libffi? If not, I can work on doing that.
(In reply to Peter Bergner from comment #25)
> DJ, did you say you could push the patch to upstream libffi?
> If not, I can work on doing that.

I talked with DJ offline and his "push officially" comment above meant pushing into Fedora officially. I'll work on pushing the patch into the upstream libffi sources.
The patch was just pulled into the upstream libffi sources.