Bug 2305877 - Install ISOs die when run on an arm64 machine with BTI
Summary: Install ISOs die when run on an arm64 machine with BTI
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: libffi
Version: rawhide
Hardware: aarch64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Carlos O'Donell
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: ARMTracker
Reported: 2024-08-19 19:01 UTC by Jeremy Linton
Modified: 2024-08-26 16:28 UTC (History)

Fixed In Version: libffi-3.4.6-3.fc42, libffi-3.4.6-3.fc41
Clone Of:
Environment:
Last Closed: 2024-08-26 16:28:48 UTC
Type: ---
Embargoed:



Description Jeremy Linton 2024-08-19 19:01:08 UTC
Using: Fedora-Server-dvd-aarch64-Rawhide-20240819.n.0.iso
HW: Initially NVIDIA Grace, but reproduced with virt-manager/cpu=max

The installer starts and reports a dead pane (see below). When the kernel is passed 'arm64.nobti' anaconda starts and allows the user to begin the installation process.

At first glance this looks to be less 'python' and more glib/ffi related, but I'm opening it against python since that is the executable that crashed.



Reproducible: Always

Steps to Reproduce:
1. In virt-manager, download the latest aarch64 install ISO
2. In TCG mode, set cputype=max
3. Boot the install ISO
4. Note the failure:
  `Pane is dead`
5. Reboot with arm64.nobti and everything works
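
For anyone triaging a similar report, here is a minimal sketch for confirming the BTI state of the booted system. It assumes the usual Linux conventions that /proc/cpuinfo lists `bti` on its `Features` lines when BTI is exposed to userspace and that /proc/cmdline holds the kernel arguments. The helper name `bti_status` is made up for illustration and is not part of any tool mentioned here.

```python
def bti_status(cpuinfo: str, cmdline: str) -> dict:
    """Summarise BTI state from the text of /proc/cpuinfo and /proc/cmdline."""
    features = set()
    for line in cpuinfo.splitlines():
        key, _, value = line.partition(":")
        # Each CPU block carries its own "Features" line; merge them all.
        if key.strip() == "Features":
            features.update(value.split())
    return {
        "cpu_reports_bti": "bti" in features,
        "nobti_on_cmdline": "arm64.nobti" in cmdline.split(),
    }


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f_cpu, open("/proc/cmdline") as f_cmd:
            print(bti_status(f_cpu.read(), f_cmd.read()))
    except OSError as exc:
        print(f"could not read /proc: {exc}")
```

Run from the installer shell on TTY2, this would show whether the failing boot has BTI visible and whether the workaround argument is present.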
Actual Results:  
Starting installer, one moment...
anaconda 41.29-1.fc41 for Fedora Rawhide (pre-release) started.
 * installation log files are stored in /tmp during the installation
 * shell is available on TTY2 and in second TMUX pane (ctrl+b, then press 2)
 * when reporting a bug add logs from /tmp as separate text/plain attachments
18:53:41 Message recipient disconnected from message bus without replying

Pane is dead (status 1, Mon Aug 19 18:54:20 2024)

Expected Results:  
Anaconda starts and allows the installation to complete.



[anaconda root@localhost ~]# coredumpctl debug
           PID: 2232 (python3)
           UID: 0 (root)
           GID: 0 (root)
        Signal: 4 (ILL)
     Timestamp: Mon 2024-08-19 18:04:41 UTC (7min ago)
  Command Line: python3 -m pyanaconda.modules.boss
    Executable: /usr/bin/python3.13
 Control Group: /system.slice/anaconda.service
          Unit: anaconda.service
         Slice: system.slice
       Boot ID: 127aad7ec3c9430d8220c11860d05133
    Machine ID: dc027d7aaf0c4c5785697b9f130d38cd
      Hostname: localhost.localdomain
       Storage: /var/lib/systemd/coredump/core.python3.0.127aad7ec3c9430d8220c11860d05133.2232.1724090681000000.zst (present)
  Size on Disk: 5.9M
       Package: python3.13/3.13.0~rc1-2.fc41
      build-id: a51fc7b22b0d36c9a16b145e2a92ed9fac81078a
       Message: Process 2232 (python3) of user 0 dumped core.
...
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `python3 -m pyanaconda.modules.boss'.
Program terminated with signal SIGILL, Illegal instruction.
Downloading source file /usr/src/debug/glibc-2.40-3.fc41.aarch64/nptl/pthread_kill.c
Python Exception <class 'NameError'>: Installation error: gdb._execute_unwinders function is missing
#0  __pthread_kill_implementation (Python Exception <class 'NameError'>: Installation error: gdb._execute_unwinders function is missing
threadid=281473881239584, 
    signo=signo@entry=4, no_tid=no_tid@entry=0) at pthread_kill.c:44
44            return INTERNAL_SYSCALL_ERROR_P (ret) ? INTERNAL_SYSCALL_ERRNO (ret) : 0;
[Current thread is 1 (Thread 0xffffbeb47020 (LWP 2232))]
(gdb) bt
Python Exception <class 'ModuleNotFoundError'>: No module named 'gdb.frames'
#0  __pthread_kill_implementation (threadid=281473881239584, 
    signo=signo@entry=4, no_tid=no_tid@entry=0) at pthread_kill.c:44
#1  0x0000ffffbe2b8810 [PAC] in __pthread_kill_internal (
    threadid=<optimized out>, signo=4) at pthread_kill.c:78
#2  0x0000ffffbe265a00 in __GI_raise (Python Exception <class 'NameError'>: Installation error: gdb._execute_unwinders function is missing
sig=4) at ../sysdeps/posix/raise.c:26
#3  <signal handler called>
Python Exception <class 'NameError'>: Installation error: gdb._execute_unwinders function is missing
#4  ffi_closure_SYSV_alt () at ../src/aarch64/sysv.S:545
Python Exception <class 'NameError'>: Installation error: gdb._execute_unwinders function is missing
#5  0x0000ffffaee97460 in g_timeout_dispatch (Python Exception <class 'NameError'>: Installation error: gdb._execute_unwinders function is missing
source=0xaaaae84a2e80, 
    callback=0xffffbeb13fa0, user_data=0xffff98003c00) at ../glib/gmain.c:5070
#6  0x0000ffffbeb4a0a0 [PAC] in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

Comment 1 Miro Hrončok 2024-08-19 20:38:40 UTC
Assigning to libblockdev for awareness. Is this a dupe of bz2247319?

Comment 2 Jeremy Linton 2024-08-20 21:48:11 UTC
So this appears to be broken because it's using libffi v3.4.6, which has commit:

98881ec aarch64: add BTI flag to ELF notes (#822)

so BTI is on, but it's missing the landing pad in ../src/aarch64/sysv.S:545

which is fixed by commit:

f64141e Fix bti support (#830)
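
To illustrate what "missing the landing pad" means, here is a rough sketch, not taken from libffi. It decodes the first instruction word at a branch target and reports whether it is one of the AArch64 hint encodings that can act as a landing pad. The encodings are derived from the architectural HINT space (`HINT #imm` is `0xd503201f | imm << 5`). The function name `landing_pad_at` is hypothetical. Whether PACIASP/PACIBSP are accepted for a particular branch type depends on system configuration, so treat them as "possibly valid".

```python
import struct
from typing import Optional

HINT_BASE = 0xD503201F  # HINT #0, i.e. NOP


def hint(imm7: int) -> int:
    """Encoding of the AArch64 instruction HINT #imm7."""
    return HINT_BASE | (imm7 << 5)


# Plain "bti" (HINT #32) is deliberately absent: it is not a valid
# target for any indirect branch in a guarded page.
LANDING_PADS = {
    hint(34): "bti c",
    hint(36): "bti j",
    hint(38): "bti jc",
    hint(25): "paciasp",
    hint(27): "pacibsp",
}


def landing_pad_at(code: bytes, offset: int = 0) -> Optional[str]:
    """Return the landing-pad mnemonic at code[offset], or None if absent."""
    # AArch64 instructions are 32-bit little-endian words.
    (insn,) = struct.unpack_from("<I", code, offset)
    return LANDING_PADS.get(insn)
```

On a BTI-guarded page, an indirect branch to an address where this returns `None` is the kind of target that raises a Branch Target exception, which Linux delivers as SIGILL.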

Comment 3 Jeremy Linton 2024-08-20 22:05:39 UTC
@Bill, I'm guessing that if we pick f64141e the rest of the PAC/BTI patches should also be picked?

9c9e836 aarch64: Add a missing no-op define of SIGN_LR_LINUX_ONLY (#838)
45d284f aarch64: support pointer authentication (#834)
3873224 ffi: fix spelling mistake (#833)

Comment 4 Carlos O'Donell 2024-08-21 00:26:14 UTC
Jeremy, thanks for filing this and noting the libffi commits that are missing for BTI.

Comment 5 Carlos O'Donell 2024-08-21 12:26:04 UTC
Jeremy,

Are you able to test out the latest rawhide build?

https://koji.fedoraproject.org/koji/taskinfo?taskID=122261690

I've pulled in the 6 commits from upstream libffi that should fix the aarch64 issues.

All the testsuite results are clean so I'm moving this forward in Rawhide first, but if your feedback is positive I can backport this into F41.

Note that F40 used libffi 3.4.4 and so doesn't have BTI enablement AFAICT.
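
One way to check whether a given libffi build is marked as BTI-enabled is to look at its `.note.gnu.property` section. The sketch below parses the raw bytes of that section (for example, extracted with `objcopy -O binary --only-section=.note.gnu.property`) for a little-endian ELF64 object. The constants follow the AArch64 ELF ABI. The function name `aarch64_features` is made up for this example, and the parser is deliberately minimal rather than a full ELF reader.

```python
import struct

NT_GNU_PROPERTY_TYPE_0 = 5
GNU_PROPERTY_AARCH64_FEATURE_1_AND = 0xC0000000
GNU_PROPERTY_AARCH64_FEATURE_1_BTI = 1 << 0
GNU_PROPERTY_AARCH64_FEATURE_1_PAC = 1 << 1


def aarch64_features(note: bytes) -> set:
    """Return the subset of {"BTI", "PAC"} flagged in .note.gnu.property bytes."""
    features = set()
    off = 0
    while off + 12 <= len(note):
        namesz, descsz, ntype = struct.unpack_from("<III", note, off)
        off += 12
        name = note[off:off + namesz]
        off = (off + namesz + 3) & ~3   # name is padded to 4 bytes
        desc = note[off:off + descsz]
        off = (off + descsz + 7) & ~7   # ELF64 property notes are 8-byte aligned
        if ntype != NT_GNU_PROPERTY_TYPE_0 or name != b"GNU\0":
            continue
        pos = 0
        while pos + 8 <= len(desc):
            pr_type, pr_datasz = struct.unpack_from("<II", desc, pos)
            pos += 8
            data = desc[pos:pos + pr_datasz]
            pos = (pos + pr_datasz + 7) & ~7  # each property is padded to 8 bytes
            if pr_type == GNU_PROPERTY_AARCH64_FEATURE_1_AND and len(data) >= 4:
                (mask,) = struct.unpack("<I", data[:4])
                if mask & GNU_PROPERTY_AARCH64_FEATURE_1_BTI:
                    features.add("BTI")
                if mask & GNU_PROPERTY_AARCH64_FEATURE_1_PAC:
                    features.add("PAC")
    return features
```

If the set includes "BTI", the object has declared itself BTI-compatible, so any hand-written assembly entry point reached by an indirect branch needs a landing pad.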

Comment 6 Jeremy Linton 2024-08-21 13:13:10 UTC
Let me try anaconda standalone, because I'm likely to mess something up with my custom composes.

Comment 7 Bill Roberts 2024-08-21 18:43:27 UTC
@Jeremy Linton

> @Bill, I'm guessing that if we pick f64141e the rest of the pac/bti patches should also be picked?


> 9c9e836 aarch64: Add a missing no-op define of SIGN_LR_LINUX_ONLY (#838)
> 45d284f aarch64: support pointer authentication (#834)
> 3873224 ffi: fix spelling mistake (#833)

Yes and No.

If you just want the BTI fix, you only need f64141e.
If you want the PAC support, then 9c9e836 and 45d284f.

You don't need the spelling mistake fix for a comment, which is commit 3873224.

Comment 8 Carlos O'Donell 2024-08-21 19:30:11 UTC
(In reply to Bill Roberts from comment #7)
> You don't need the spelling mistake fix for a comment, which is commit
> 3873224.

I pulled all commits except the texinfo ones (since they trigger BuildRequires: texinfo which I want to avoid).

There are only 6 commits ahead of v3.4.6.

Fixes are in Rawhide right now https://bodhi.fedoraproject.org/updates/FEDORA-2024-10ad3978d1

Comment 9 Jeremy Linton 2024-08-22 18:53:22 UTC
I ended up just waiting for last night's compose to run and testing that. It looks like it's working fine with anaconda+BTI now.

Thanks for the quick fix!

Comment 10 Carlos O'Donell 2024-08-22 19:56:34 UTC
(In reply to Jeremy Linton from comment #9)
> I ended up just waiting for last night's compose to run and testing that. It
> looks like it's working fine with anaconda+BTI now.
> 
> Thanks for the quick fix!

Awesome. Marking closed/rawhide then.

Comment 11 Carlos O'Donell 2024-08-22 19:59:35 UTC
Reopening to fix f41.

Comment 12 Carlos O'Donell 2024-08-26 16:28:48 UTC
Fixed in f41.

https://bodhi.fedoraproject.org/updates/FEDORA-2024-450c2d5e28

