Description of problem:
Both clinfo and rocm-clinfo crash if mesa-libOpenCL and rocm-opencl are installed in parallel.

Version-Release number of selected component (if applicable):
clinfo-3.0.21.02.21-4.fc37.x86_64
mesa-libOpenCL-22.3.1-1.fc37.x86_64
rocm-clinfo-5.3.2-1.fc37.x86_64
rocm-opencl-5.3.2-1.fc37.x86_64

How reproducible:
Always.

Steps to Reproduce:
1. dnf install clinfo mesa-libOpenCL rocm-clinfo rocm-opencl
2. clinfo

Actual results:
mesa: CommandLine Error: Option 'h' registered more than once!
LLVM ERROR: inconsistency in registered CommandLine options
Aborted (core dumped)

Expected results:
Number of platforms 1
Platform Name AMD Accelerated Parallel Processing
Platform Vendor Advanced Micro Devices, Inc.
Platform Version OpenCL 2.1 AMD-APP (3486.0)
Platform Profile FULL_PROFILE
...
Same issue still present with the more recent rocm-opencl-5.4.1-1.fc37.x86_64
*** Bug 2143687 has been marked as a duplicate of this bug. ***
This error:

> mesa: CommandLine Error: Option 'h' registered more than once!
> LLVM ERROR: inconsistency in registered CommandLine options
> Aborted (core dumped)

is fixed in this Fedora 38 update:
https://bodhi.fedoraproject.org/updates/FEDORA-2023-05720f124e

If you have already upgraded to Fedora 38, please test. I'll see if I can backport it to Fedora 37.
It's showing another error, but the problem is still present:

sudo dnf install mesa-libOpenCL
rocm-clinfo
: CommandLine Error: Option 'abort-on-max-devirt-iterations-reached' registered more than once!
LLVM ERROR: inconsistency in registered CommandLine options
fish: Job 1, 'rocm-clinfo' terminated by signal SIGABRT (Abort)

sudo dnf rm mesa-libOpenCL
rocm-clinfo
Number of platforms: 2
Platform Profile: FULL_PROFILE
Platform Version: OpenCL 2.1 AMD-APP (3513.0)
Platform Name: AMD Accelerated Parallel Processing
Platform Vendor: Advanced Micro Devices, Inc.
Platform Extensions: cl_khr_icd cl_amd_event_callback

Platform Profile: FULL_PROFILE
Platform Version: OpenCL 3.0 PoCL 3.1 Linux, Release, RELOC, SPIR, LLVM 16.0.0, SLEEF, FP16, DISTRO, POCL_DEBUG
Platform Name: Portable Computing Language
Platform Vendor: The pocl project
Platform Extensions: cl_khr_icd cl_pocl_content_size

Platform Name: AMD Accelerated Parallel Processing
Number of devices: 2
Device Type: CL_DEVICE_TYPE_GPU
Vendor ID: 1002h
Board name: AMD Radeon RX 6900 XT
...

Versions:

dnf list --installed | grep rocm
rocm-clinfo.x86_64 5.4.3-2.fc38 @updates
rocm-comgr.x86_64 16.0-2.fc38 @updates
rocm-comgr-debuginfo.x86_64 5.3.0-1.fc37 @updates-debuginfo
rocm-comgr-devel.x86_64 16.0-2.fc38 @updates
rocm-compilersupport-debugsource.x86_64 5.3.0-1.fc37 @updates-debuginfo
rocm-device-libs.x86_64 16.0-1.fc38 @fedora
rocm-opencl.x86_64 5.4.3-2.fc38 @updates
rocm-opencl-devel.x86_64 5.4.3-2.fc38 @updates
rocm-runtime.x86_64 5.4.1-3.fc38 @fedora
rocm-runtime-devel.x86_64 5.4.1-3.fc38 @fedora
rocm-smi.noarch 4.0.0-8.fc38 @fedora
rocminfo.x86_64 5.4.1-2.fc38 @fedora
Thanks for the feedback, I'll contact upstream with this info.
So I back-ported the fix to f37, and I can't reproduce any error right now with this update:
https://bodhi.fedoraproject.org/updates/FEDORA-2023-994e29c721

It's possible that the LLVM 16 upgrade in Fedora 38 causes a regression (compared to Fedora 37's LLVM 15), or maybe there's something unique to your system that keeps it from reproducing on my end. I think I might have an AMD Radeon RX 6700 available that I can test with, as the current HW in my system is from the RX 5xxx series.

I also spoke to the upstream developers, and the fix they suggested might require major packaging changes in other Fedora packages. Either way, I'll need to reproduce the problem before I can proceed with any fix.
I've tested with f37 in toolbox:

toolbox create --release 37
sudo dnf install 'rocm-*'
rocm-clinfo
Number of platforms: 1
Platform Profile: FULL_PROFILE
Platform Version: OpenCL 2.1 AMD-APP (3513.0)
Platform Name: AMD Accelerated Parallel Processing
Platform Vendor: Advanced Micro Devices, Inc.
Platform Extensions: cl_khr_icd cl_amd_event_callback

Platform Name: AMD Accelerated Parallel Processing
Number of devices: 2

sudo dnf install mesa-libOpenCL
rocm-clinfo
Segmentation fault (core dumped)

backtrace:

Thread 1 "rocm-clinfo" received signal SIGSEGV, Segmentation fault.
0x0000000000000000 in ?? ()
(gdb) bt
#0 0x0000000000000000 in ?? ()
#1 0x00007ffff72cc2de in clover::device::supports_ir (ir=PIPE_SHADER_IR_NATIVE, this=0x55555569dc10) at ../src/gallium/frontends/clover/core/device.cpp:502
#2 clover::device::device (this=this@entry=0x55555569dc10, platform=..., ldev=0x5555556a1210) at ../src/gallium/frontends/clover/core/device.cpp:165
#3 0x00007ffff72db5cf in clover::create<clover::device, clover::platform&, pipe_loader_device*&> () at ../src/gallium/frontends/clover/util/pointer.hpp:240
#4 clover::platform::platform (this=this@entry=0x7ffff7537100 <(anonymous namespace)::_clover_platform>) at ../src/gallium/frontends/clover/core/platform.cpp:41
#5 0x00007ffff729f7fd in __static_initialization_and_destruction_0 (__priority=65535, __initialize_p=1) at ../src/gallium/frontends/clover/api/platform.cpp:34
#6 0x00007ffff7fcccde in call_init (env=0x7fffffffe1d8, argv=0x7fffffffe1c8, argc=1, l=<optimized out>) at dl-init.c:70
#7 call_init (l=<optimized out>, argc=1, argv=0x7fffffffe1c8, env=0x7fffffffe1d8) at dl-init.c:26
#8 0x00007ffff7fccdcc in _dl_init (main_map=0x555555614f20, argc=1, argv=0x7fffffffe1c8, env=0x7fffffffe1d8) at dl-init.c:117
#9 0x00007ffff7ca8f14 in __GI__dl_catch_exception (exception=<optimized out>, operate=<optimized out>, args=<optimized out>) at /usr/src/debug/glibc-2.36-9.fc37.x86_64/elf/dl-error-skeleton.c:182
#10 0x00007ffff7fd3736 in dl_open_worker (a=a@entry=0x7fffffffd7c0) at dl-open.c:808
#11 0x00007ffff7ca8ebe in __GI__dl_catch_exception (exception=<optimized out>, operate=<optimized out>, args=<optimized out>) at /usr/src/debug/glibc-2.36-9.fc37.x86_64/elf/dl-error-skeleton.c:208
#12 0x00007ffff7fd3acc in _dl_open (file=0x555555613970 "libMesaOpenCL.so.1", mode=<optimized out>, caller_dlopen=0x7ffff7f9789f <_open_driver+303>, nsid=<optimized out>, argc=1, argv=0x7fffffffe1c8, env=0x7fffffffe1d8) at dl-open.c:884
#13 0x00007ffff7be123c in dlopen_doit (a=a@entry=0x7fffffffda30) at dlopen.c:56
#14 0x00007ffff7ca8ebe in __GI__dl_catch_exception (exception=exception@entry=0x7fffffffd990, operate=<optimized out>, args=<optimized out>) at /usr/src/debug/glibc-2.36-9.fc37.x86_64/elf/dl-error-skeleton.c:208
#15 0x00007ffff7ca8f73 in __GI__dl_catch_error (objname=0x7fffffffd9e8, errstring=0x7fffffffd9f0, mallocedp=0x7fffffffd9e7, operate=<optimized out>, args=<optimized out>) at /usr/src/debug/glibc-2.36-9.fc37.x86_64/elf/dl-error-skeleton.c:227
#16 0x00007ffff7be0d0f in _dlerror_run (operate=operate@entry=0x7ffff7be11e0 <dlopen_doit>, args=args@entry=0x7fffffffda30) at dlerror.c:138
#17 0x00007ffff7be12f1 in dlopen_implementation (dl_caller=<optimized out>, mode=<optimized out>, file=<optimized out>) at dlopen.c:71
#18 ___dlopen (file=<optimized out>, mode=<optimized out>) at dlopen.c:81
#19 0x00007ffff7f9789f in _load_icd (lib_path=0x555555613970 "libMesaOpenCL.so.1", num_icds=1) at /usr/src/debug/ocl-icd-2.3.1-2.fc37.x86_64/ocl_icd_loader.c:208
#20 _open_driver (num_icds=num_icds@entry=1, dir_path=dir_path@entry=0x7ffff7fac0a4 "/etc/OpenCL/vendors", file_path=file_path@entry=0x555555578f43 "mesa.icd") at /usr/src/debug/ocl-icd-2.3.1-2.fc37.x86_64/ocl_icd_loader.c:261
#21 0x00007ffff7f9ad16 in _open_drivers (dir_path=<optimized out>, dir=<optimized out>) at /usr/src/debug/ocl-icd-2.3.1-2.fc37.x86_64/ocl_icd_loader.c:274
#22 __initClIcd () at /usr/src/debug/ocl-icd-2.3.1-2.fc37.x86_64/ocl_icd_loader.c:767
#23 _initClIcd_real () at /usr/src/debug/ocl-icd-2.3.1-2.fc37.x86_64/ocl_icd_loader.c:824
#24 0x00007ffff7f9ce14 in _initClIcd () at /usr/src/debug/ocl-icd-2.3.1-2.fc37.x86_64/ocl_icd_loader.c:853
#25 clGetPlatformIDs (num_entries=0, platforms=0x0, num_platforms=0x7fffffffdc14) at /usr/src/debug/ocl-icd-2.3.1-2.fc37.x86_64/ocl_icd_loader.c:1018
#26 0x000055555555e547 in cl::Platform::get (platforms=platforms@entry=0x7fffffffdd90) at /usr/src/debug/rocm-opencl-5.4.3-1.fc37.x86_64/tools/clinfo/../../khronos/headers/opencl2.2/CL/../CL/cl2.hpp:2474
#27 0x0000555555556f58 in main (argc=<optimized out>, argv=<optimized out>) at /usr/src/debug/rocm-opencl-5.4.3-1.fc37.x86_64/tools/clinfo/clinfo.cpp:75
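The backtrace shows the ocl-icd loader picking up mesa.icd from /etc/OpenCL/vendors and then opening libMesaOpenCL.so.1 via dlopen, so one quick check when triaging this kind of crash is to list which registration files are present and which library each one names. What follows is only a minimal sketch, assuming the common layout in which the first line of each .icd file is the library to load. The function name list_icds is a hypothetical helper and is not part of ocl-icd or any package mentioned here.

```shell
#!/bin/sh
# Print each OpenCL ICD registration file in a vendors directory together
# with the library name it tells the ICD loader to open.
# list_icds is a hypothetical helper name; the default directory is the one
# that appears in the backtrace (/etc/OpenCL/vendors).
list_icds() {
    dir="${1:-/etc/OpenCL/vendors}"
    for icd in "$dir"/*.icd; do
        # With no matches the glob stays unexpanded, so skip missing paths.
        [ -e "$icd" ] || continue
        # Assumption: the first line of the file is the library to load.
        lib=$(head -n 1 "$icd")
        printf '%s -> %s\n' "${icd##*/}" "$lib"
    done
}

# Inspect the default vendors directory; pass a path to inspect another one.
list_icds
```

If mesa.icd is listed next to an entry installed by rocm-opencl, both libraries will be loaded into the same process, which is the configuration in which the crashes in this report occur.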
*** Bug 2149162 has been marked as a duplicate of this bug. ***
*** Bug 2157619 has been marked as a duplicate of this bug. ***
Some observations:
- I can't reproduce this on an up-to-date Fedora 37 system
- I can reproduce it with an RX 6750 XT on Fedora 38
- I can't reproduce it on Fedora 37 with a Fedora 38 toolbox on the same HW

Seems strange. I'll update this if I ever figure it out.
Note: I can't reproduce this on any other HW, period.
I believe this update fixes the issue:
https://bodhi.fedoraproject.org/updates/FEDORA-2023-68012d0819

I can no longer reproduce it. Can anyone confirm?
It's working now, with and without mesa. Good job!
FEDORA-2023-68012d0819 has been submitted as an update to Fedora 38. https://bodhi.fedoraproject.org/updates/FEDORA-2023-68012d0819
No problem! I tagged it on the update.
FEDORA-2023-68012d0819 has been pushed to the Fedora 38 stable repository. If the problem still persists, please make a note of it in this bug report.