Bug 2130009 - BTI (Branch Target Identification) fault in llint_program_prologue
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: webkit2gtk3
Version: 36
Hardware: aarch64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Michael Catanzaro
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2022-09-26 21:21 UTC by Linus Torvalds
Modified: 2023-05-08 10:28 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2023-04-20 13:50:43 UTC
Type: Bug
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Fedora Package Sources webkitgtk pull-request 3 0 None None None 2023-03-07 15:58:12 UTC
WebKit Project 245697 0 None None None 2022-09-26 22:28:49 UTC

Description Linus Torvalds 2022-09-26 21:21:17 UTC
Description of problem:

The wifi login process uses webkit to log into open networks with a captive portal. That crashes on arm64 with BTI enabled.

Version-Release number of selected component (if applicable):

webkit2gtk3-jsc-2.36.7-1.fc36.aarch64

How reproducible:

Every time, assuming you have one of those captive portal login things. The workaround then is to just fire up firefox manually, which can also handle the captive portal login.


Steps to Reproduce:
1. Find a captive portal wifi
2. Connect to it
3. Don't profit!

Actual results:

Process 2338 (WebKitWebProces) of user 1000 dumped core.

Expected results:

Login screen for the captive portal

Additional info:

This requires CONFIG_ARM64_BTI=y for the kernel, and hardware that actually supports BTI (e.g. an Apple M2 MacBook Air running Linux).
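
For anyone trying to reproduce this, one rough way to check the hardware side of that requirement is to look for a `bti` entry in the `Features` line of /proc/cpuinfo on an arm64 Linux system. The Python below is only a sketch under that assumption; the helper names `cpu_has_bti` and `read_cpuinfo` are made up for illustration, and a positive result says nothing about whether a particular library is actually mapped with BTI guarding.

```python
def cpu_has_bti(cpuinfo_text: str) -> bool:
    """Return True if any 'Features' line in the given cpuinfo text lists 'bti'."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Only the arm64 'Features' line is considered; flags are space-separated.
        if sep and key.strip() == "Features" and "bti" in value.split():
            return True
    return False


def read_cpuinfo(path: str = "/proc/cpuinfo") -> str:
    """Read the cpuinfo pseudo-file (Linux only; the path is the usual default)."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return handle.read()
```

On the affected machine, `cpu_has_bti(read_cpuinfo())` would be expected to return True; treat a False result as "BTI not reported here" and nothing more.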

Then WebKitWebProcess will take a BTI fault in
libjavascriptcoregtk-4.0.so.18.20.11[ffff56fe0000+11f0000]

with the backtrace being

   Module libvulkan.so.1 with build-id 67d50cfbcd9385a604b088608e38177128818e19
   Stack trace of thread 2:
   #0  0x0000ffff5711b8b0 llint_program_prologue (libjavascriptcoregtk-4.0.so.18 + 0x13b8b0)
   #1  0x0000ffff5711844c vmEntryToJavaScript (libjavascriptcoregtk-4.0.so.18 + 0x13844c)
   #2  0x0000ffff57dcf7d8 _ZN3JSC11Interpreter14executeProgramERKNS_10SourceCodeEPNS_14JSGlobalObje$
   #3  0x0000ffff11600000 n/a (n/a + 0x0)
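
As a cross-check on the numbers above, the per-frame offsets in the backtrace can be recomputed from the fault message, assuming the bracketed `[ffff56fe0000+11f0000]` means "mapping base + mapping size". The Python below is a small sketch of that arithmetic; `module_offset` is a hypothetical helper, not part of any tool mentioned in this bug.

```python
def module_offset(frame_address: int, module_base: int, module_size: int) -> int:
    """Return the offset of frame_address inside a mapping of module_size bytes at module_base."""
    if not module_base <= frame_address < module_base + module_size:
        raise ValueError("address is outside the module mapping")
    return frame_address - module_base


# Values copied from the report; the base+size reading of the bracket is an assumption.
BASE = 0xFFFF56FE0000
SIZE = 0x11F0000

print(hex(module_offset(0x0000FFFF5711B8B0, BASE, SIZE)))  # frame #0 -> 0x13b8b0
print(hex(module_offset(0x0000FFFF5711844C, BASE, SIZE)))  # frame #1 -> 0x13844c
```

Both results agree with the `+ 0x13b8b0` and `+ 0x13844c` offsets printed for frames #0 and #1, so the faulting address does fall inside libjavascriptcoregtk. Frame #3 lies below the mapping base and would raise ValueError.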

Comment 1 Michael Catanzaro 2022-09-26 22:28:49 UTC
Hi Linus, I've created upstream bug https://bugs.webkit.org/show_bug.cgi?id=245697 and will ask around to see if the upstream developers who understand this code might be interested in investigating it.

Some context: llint_program_prologue() is an assembly language function implemented in LowLevelInterpreter.asm. This is a custom assembly language implemented by Source/JavaScriptCore/llint, so it is not ARM assembly. Unfortunately, everything under Source/JavaScriptCore is gobbledygook to me. There's really no chance of making any serious progress here on Red Hat Bugzilla because I'm the only WebKit developer here, so I will close this with UPSTREAM resolution to indicate that is where work will need to happen. But if patches are eventually posted upstream, then I can build test RPMs for you to try to see if they actually work, and handle backports if something lands.

It's *probably* possible to work around this issue by using a custom build of WebKitGTK with the JIT disabled and the low level interpreter forced to use a slower C implementation rather than asm. That would not be good to do in Fedora proper, but I can provide an unofficial build if you want one. If you're content with using Firefox and don't use other applications that require WebKitGTK, Firefox is certainly the easier workaround.

Comment 2 Linus Torvalds 2022-09-26 22:40:41 UTC
Heh. Thanks. I didn't even know what the real upstream project for that particular code was, so going by the package I have installed was the easiest way.

Comment 3 Michael Catanzaro 2022-09-26 23:16:20 UTC
CC: Jeremy as change owner of https://fedoraproject.org/wiki/Changes/Aarch64_PointerAuthentication

(In reply to Linus Torvalds from comment #2)
> Heh. Thanks. I didn't even know what the real upstream project for that
> particular code was, so going by the package I have installed was the
> easiest way.

It's WebKit. Unfortunately Apple is telling me that the implementation of BTI is going to be somehow Linux-specific, so that's not great news. The intersection of JSC developers with M2 Macbooks and JSC developers using Linux is probably zero.

Comment 4 Gustavo Noronha Silva 2022-09-27 11:45:46 UTC
Instead of a custom package, a potential work-around is to run with JavaScriptCoreUseJIT=0 in the environment, I believe?
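
A sketch of what trying this could look like from Python, for anyone who wants to test it. The variable name `JavaScriptCoreUseJIT` is taken from the comment above; the helper names and whatever command is passed in are hypothetical, and whether the setting avoids the fault at all is not established here.

```python
import os
import subprocess
from typing import Mapping, Optional, Sequence


def env_with_jit_disabled(base: Optional[Mapping[str, str]] = None) -> dict:
    """Return a copy of the environment with JavaScriptCoreUseJIT set to "0"."""
    env = dict(os.environ if base is None else base)
    env["JavaScriptCoreUseJIT"] = "0"  # name as given in the comment above
    return env


def run_with_jit_disabled(command: Sequence[str]) -> int:
    """Run a command (any WebKitGTK-based program) with the variable set; return its exit code."""
    return subprocess.run(list(command), env=env_with_jit_disabled()).returncode
```

The copy is deliberate: the variable is set only for the child process, so the calling environment is left unchanged.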

Comment 5 Michael Catanzaro 2022-09-27 13:25:11 UTC
(In reply to Gustavo Noronha Silva from comment #4)
> Instead of a custom package, a potential work-around is to run with
> JavaScriptCoreUseJIT=0 in the environment, I believe?

Not if my understanding is correct. Doesn't hurt to try, but based on the backtrace, I don't think it's trying to use JIT yet, but rather the baseline interpreter tier. To make this work, we would need to disable that at build time and use the llint cloop instead of the asm interpreter, which we already do in RHEL. There's actually a RHEL-specific condition for this in our spec file already. This is not a desirable way to go as the performance will not be satisfactory, but I believe that would be enough to make this work today without requiring anybody to dive into JavaScriptCore and figure out how to make BTI work.

I've posted some more comments in the upstream bug.

Comment 6 Michael Catanzaro 2023-01-06 17:18:00 UTC
A package with a workaround is being prepared here: https://github.com/leifliddy/asahi-fedora-builder/issues/10

Comment 7 Michael Catanzaro 2023-03-07 15:58:13 UTC
Apparently it's possible to resolve this by changing build flags. Eric Curtin has a pull request incoming so you won't need to use a special package from the Asahi repo anymore.

Comment 8 Michael Catanzaro 2023-04-20 13:50:43 UTC
Oops, I forgot to close this bug. This should be fixed in both Fedora 38 and 37.

Comment 10 ecurtin 2023-05-08 10:28:12 UTC
It is fixed (well, worked around) in Fedora: we basically built without BTI. I'd like to see it fixed properly, but I do not want users to suffer because it's unfixed. The upstream bug should remain open, as it needs to be fixed properly by someone who cares and has the cycles to do it.

Hopefully we can turn on BTI protection for this package again when it is fixed upstream.

