Bug 1940964 - FTBFS: LLVM JIT related tests fail miserably on s390x: incompatible data layouts
Summary: FTBFS: LLVM JIT related tests fail miserably on s390x: incompatible data layouts
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: postgresql
Version: 34
Hardware: s390x
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Filip Januš
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: ZedoraTracker 1571215 1883001
Reported: 2021-03-19 16:24 UTC by Honza Horak
Modified: 2023-09-15 01:03 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-01-10 23:01:27 UTC
Type: Bug
Embargoed:



Description Honza Horak 2021-03-19 16:24:51 UTC
Description of problem:
https://koji.fedoraproject.org/koji/taskinfo?taskID=64122369

ERROR:  failed to JIT module: Added modules have incompatible data layouts: E-m:e-i1:8:16-i8:8:16-i64:64-f128:64-a:8:16-n32:64 (module) vs E-m:e-i1:8:16-i8:8:16-i64:64-f128:64-v128:64-a:8:16-n32:64 (jit)


Version-Release number of selected component (if applicable):
postgresql-13.2-3.fc34

How reproducible:
Consistently, for the last few weeks

Steps to Reproduce:
1. rebuild the postgresql package
2.
3.

Actual results:
Build fails with this error:
ERROR:  failed to JIT module: Added modules have incompatible data layouts: E-m:e-i1:8:16-i8:8:16-i64:64-f128:64-a:8:16-n32:64 (module) vs E-m:e-i1:8:16-i8:8:16-i64:64-f128:64-v128:64-a:8:16-n32:64 (jit)

Expected results:
Build succeeds

Additional info:
This looks like it was caused by the LLVM 12 rebase, as the last build that succeeded was the one with LLVM 11.x.

Comment 1 Honza Horak 2021-03-19 16:28:10 UTC
@sguelton @tstellar Does it sound familiar?

Comment 2 Honza Horak 2021-03-19 16:28:57 UTC
Any tips on how to debug or fix this? (I'm personally clueless so far.)

Comment 3 Tom Lane 2021-03-19 19:13:57 UTC
FWIW, this is thought to work upstream, though not with PG releases older than 13.2.  Our resident expert thinks it sounds like a mismatch between libllvm and clang versions:

https://www.postgresql.org/message-id/20210319190047.7o4bwhbp5dzkqif3%40alap3.anarazel.de

Comment 4 Honza Horak 2021-03-22 16:53:50 UTC
(In reply to Tom Lane from comment #3)
> FWIW, this is thought to work upstream, though not with PG releases older
> than 13.2.  Our resident expert thinks it sounds like a mismatch between
> libllvm and clang versions:
> 
> https://www.postgresql.org/message-id/20210319190047.7o4bwhbp5dzkqif3%40alap3.anarazel.de

Thanks for the pointer, Tom.

However, I see it with these versions that do not seem to be in a mismatch:
$> rpm -q clang llvm
clang-12.0.0-0.7.rc3.fc35.s390x
llvm-12.0.0-0.7.rc3.fc35.s390x

As F34 is getting close and the plpython2 removal (https://src.fedoraproject.org/rpms/postgresql/pull-request/28) is now blocked by this, I think we can disable llvmjit for s390x until this is solved, since removing plpython2 later will not be possible.

Comment 5 Honza Horak 2021-03-22 17:55:00 UTC
(In reply to Honza Horak from comment #4)
> (In reply to Tom Lane from comment #3)
> > FWIW, this is thought to work upstream, though not with PG releases older
> > than 13.2.  Our resident expert thinks it sounds like a mismatch between
> > libllvm and clang versions:
> > 
> > https://www.postgresql.org/message-id/20210319190047.7o4bwhbp5dzkqif3%40alap3.anarazel.de
> 
> Thanks for the pointer, Tom.
> 
> However, I see it with these versions that do not seem to be in a mismatch:
> $> rpm -q clang llvm
> clang-12.0.0-0.7.rc3.fc35.s390x
> llvm-12.0.0-0.7.rc3.fc35.s390x

Actually, I do see an LLVM 11 artifact left in the buildroot:
llvm11-libs

annobin pulls it in, and annobin is in turn pulled in by redhat-rpm-config. I haven't investigated properly yet, but hopefully a successful rebuild of annobin will help.

Comment 6 Patrik Novotný 2021-03-23 13:25:02 UTC
I will test this in copr. Disabling JIT until this is fixed seems like a reasonable idea to me. I'll update here when I have this tested.

Comment 7 Honza Horak 2021-03-24 07:24:19 UTC
I tried to rebuild annobin in copr to get rid of llvm11-libs, and while postgresql is still failing on s390x, the failures look different: https://copr.fedorainfracloud.org/coprs/hhorak/test-pgsql-llvmjit/build/2092931/

FAILED (test process exited with exit code 2)

https://download.copr.fedorainfracloud.org/results/hhorak/test-pgsql-llvmjit/fedora-34-s390x/02092931-postgresql/build.log.gz

Comment 8 Honza Horak 2021-04-19 08:42:43 UTC
Even after getting rid of llvm11-libs from the buildroot (it is not pulled in in F34 any more) it does not work, still same error.

Comment 9 Honza Horak 2021-04-19 08:43:24 UTC
(In reply to Honza Horak from comment #8)
> Even after getting rid of llvm11-libs from the buildroot (it is not pulled
> in in F34 any more) it does not work, still same error.

Visible on the scratch build:
https://koji.fedoraproject.org/koji/taskinfo?taskID=66082182

Comment 10 Tom Stellard 2021-04-19 16:52:14 UTC
From what I can tell, this is a bug in postgresql. At runtime, it creates a JIT instance using the host CPU target, which has the DataLayout of the host.  However, when compiling JIT code, it is pulling the DataLayout from a bitcode file that is compiled at build time with no specific CPU target and thus a different DataLayout.
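
The mismatch described above comes down to a single component of the two layout strings in the error message. As a minimal sketch (not part of the original report; the function names are hypothetical), the following Python snippet splits both strings and lists the specs that appear in only one of them. On s390x the extra `v128:64` entry is understood to be the vector alignment LLVM adds when the target CPU has the vector facility, which would fit the host-CPU versus generic-target explanation, but that reading is an assumption and is not confirmed in this thread.

```python
from __future__ import annotations

# Layout strings copied verbatim from the error message in this bug.
MODULE_LAYOUT = "E-m:e-i1:8:16-i8:8:16-i64:64-f128:64-a:8:16-n32:64"
JIT_LAYOUT = "E-m:e-i1:8:16-i8:8:16-i64:64-f128:64-v128:64-a:8:16-n32:64"


def layout_components(layout: str) -> list[str]:
    """Split an LLVM data layout string into its '-'-separated specs."""
    return [spec for spec in layout.split("-") if spec]


def layout_diff(module: str, jit: str) -> tuple[list[str], list[str]]:
    """Return (specs only in module, specs only in jit), preserving order."""
    module_specs = layout_components(module)
    jit_specs = layout_components(jit)
    only_module = [s for s in module_specs if s not in jit_specs]
    only_jit = [s for s in jit_specs if s not in module_specs]
    return only_module, only_jit


if __name__ == "__main__":
    only_module, only_jit = layout_diff(MODULE_LAYOUT, JIT_LAYOUT)
    print("only in module:", only_module)
    print("only in jit:   ", only_jit)
```

Run against the two strings from the error, this reports no spec unique to the bitcode module and `v128:64` as the only spec unique to the JIT side.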

Comment 11 Tom Stellard 2021-04-19 20:10:02 UTC
Proposed fix for Fedora: https://src.fedoraproject.org/rpms/postgresql/pull-request/29

Comment 12 Honza Horak 2021-04-22 16:20:23 UTC
Related upstream discussion on bugs list:
https://www.postgresql.org/message-id/20210420225228.qr4x6zv3hqjorh5t%40alap3.anarazel.de

Comment 13 Filip Januš 2021-05-21 06:46:29 UTC
Same issue with postgresql 12.7:
+ERROR:  failed to JIT module: Added modules have incompatible data layouts: E-m:e-i1:8:16-i8:8:16-i64:64-f128:64-a:8:16-n32:64 (module) vs E-m:e-i1:8:16-i8:8:16-i64:64-f128:64-v128:64-a:8:16-n32:64 (jit)

https://koji.fedoraproject.org/koji/taskinfo?taskID=68313568

The proposed workaround[1] needs to be updated to apply to postgresql 12.7.

[1] https://src.fedoraproject.org/rpms/postgresql/blob/41cd60000b91c121e1286c194284bffec770081b/f/postgresql-datalayout-mismatch-on-s390.patch

Comment 14 Honza Horak 2022-01-10 23:01:27 UTC
The postgresql package has been building fine on s390x for some time now, even with JIT:
https://koji.fedoraproject.org/koji/buildinfo?buildID=1866221

Closing this for now, as it looks like it's fixed.

Comment 15 Red Hat Bugzilla 2023-09-15 01:03:44 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days

