Description of problem:

System Version : Fedora 38
Kernel : Linux 6.3.2-200.fc38.x86_64
CPU & GPU : AMD Ryzen 5 5600H with Radeon Graphics
Component Version : 5.5.0-1.fc38

Behavior : any application that uses OpenCL reports no error on the client side, but the program is practically unusable. E.g. in DaVinci Resolve I cannot play any file format as video/audio, I don't get a timeline preview, etc. The program does not show the error, but a message appears in its logs.

The workaround is to downgrade to the 5.4.3 version from F37. (F38 also has this version, but there is a problem with the dependencies.)

```
sudo dnf downgrade --releasever=37 rocm-opencl
```

Resolve Error :

```
DVIP Exception: OpenCL error
- API: OpenCL
- API Error Code: CL_OUT_OF_HOST_MEMORY (-6)
- Call stack:
1 resolve 0x71b4152
2 resolve 0x723a1e2
3 resolve 0x7232614
4 resolve 0x723684f
5 resolve 0x5d96067
6 resolve 0x5d98831
7 resolve 0x5d98d2a
8 resolve 0x5d9efbf
9 libc.so.6 0x7ff0c16ae907
10 libc.so.6 0x7ff0c1734870
```

############
CLINFO 5.5.x:
############

```
Platform Name AMD Accelerated Parallel Processing
Platform Vendor Advanced Micro Devices, Inc.
Platform Version OpenCL 2.1 AMD-APP (3558.0)
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_icd cl_amd_event_callback
Platform Extensions function suffix AMD
Platform Host timer resolution 1ns

Platform Name AMD Accelerated Parallel Processing
Number of devices 1
Device Name gfx90c:xnack-
Device Vendor Advanced Micro Devices, Inc.
Device Vendor ID 0x1002 Device Version OpenCL 2.0 Driver Version 3558.0 (HSA1.1,LC) Device OpenCL C Version OpenCL C 2.0 Device Type GPU Device Board Name (AMD) AMD Radeon Graphics Device PCI-e ID (AMD) 0x1638 Device Topology (AMD) PCI-E, 0000:03:00.0 Device Profile FULL_PROFILE Device Available Yes Compiler Available Yes Linker Available Yes Max compute units 7 SIMD per compute unit (AMD) 4 SIMD width (AMD) 16 SIMD instruction width (AMD) 1 Max clock frequency 1800MHz Graphics IP (AMD) 9.0 Device Partition (core) Max number of sub-devices 7 Supported partition types None Supported affinity domains (n/a) Max work item dimensions 3 Max work item sizes 1024x1024x1024 Max work group size 256 Preferred work group size (AMD) 256 Max work group size (AMD) 1024 === CL_PROGRAM_BUILD_LOG === ERROR: linking module flags 'amdgpu_code_object_version': IDs have conflicting values in '' and 'llvm-link' Error: Linking bitcode failed: linking source & IR libraries. Preferred work group size multiple (kernel) <getWGsizes:1504: create kernel : error -6> Wavefront width (AMD) 64 Preferred / native vector sizes char 4 / 4 short 2 / 2 int 1 / 1 long 1 / 1 half 1 / 1 (cl_khr_fp16) float 1 / 1 double 1 / 1 (cl_khr_fp64) Half-precision Floating-point support (cl_khr_fp16) Denormals No Infinity and NANs No Round to nearest No Round to zero No Round to infinity No IEEE754-2008 fused multiply-add No Support is emulated in software No Single-precision Floating-point support (core) Denormals Yes Infinity and NANs Yes Round to nearest Yes Round to zero Yes Round to infinity Yes IEEE754-2008 fused multiply-add Yes Support is emulated in software No Correctly-rounded divide and sqrt operations Yes Double-precision Floating-point support (cl_khr_fp64) Denormals Yes Infinity and NANs Yes Round to nearest Yes Round to zero Yes Round to infinity Yes IEEE754-2008 fused multiply-add Yes Support is emulated in software No Address bits 64, Little-Endian Global memory size 2147483648 (2GiB) Global free 
memory (AMD) 2048000 (1.953GiB) 2048000 (1.953GiB) Global memory channels (AMD) 4 Global memory banks per channel (AMD) 4 Global memory bank width (AMD) 256 bytes Error Correction support No Max memory allocation 1825361096 (1.7GiB) Unified memory for Host and Device No Shared Virtual Memory (SVM) capabilities (core) Coarse-grained buffer sharing Yes Fine-grained buffer sharing Yes Fine-grained system sharing No Atomics No Minimum alignment for any data type 128 bytes Alignment of base address 1024 bits (128 bytes) Preferred alignment for atomics SVM 0 bytes Global 0 bytes Local 0 bytes Max size for global variable 1825361096 (1.7GiB) Preferred total size of global vars 2147483648 (2GiB) Global Memory cache type Read/Write Global Memory cache size 16384 (16KiB) Global Memory cache line size 64 bytes Image support Yes Max number of samplers per kernel 5688 Max size for 1D images from buffer 134217728 pixels Max 1D or 2D image array size 8192 images Base address alignment for 2D image buffers 256 bytes Pitch alignment for 2D image buffers 256 pixels Max 2D image size 16384x16384 pixels Max 3D image size 16384x16384x8192 pixels Max number of read image args 128 Max number of write image args 8 Max number of read/write image args 64 Max number of pipe args 16 Max active pipe reservations 16 Max pipe packet size 1825361096 (1.7GiB) Local memory type Local Local memory size 65536 (64KiB) Local memory size per CU (AMD) 65536 (64KiB) Local memory banks (AMD) 32 Max number of constant args 8 Max constant buffer size 1825361096 (1.7GiB) Preferred constant buffer size (AMD) 16384 (16KiB) Max size of kernel argument 1024 Queue properties (on host) Out-of-order execution No Profiling Yes Queue properties (on device) Out-of-order execution Yes Profiling Yes Preferred size 262144 (256KiB) Max size 8388608 (8MiB) Max queues on device 1 Max events on device 1024 Prefer user sync for interop Yes Number of P2P devices (AMD) 0 Profiling timer resolution 1ns Profiling timer offset 
since Epoch (AMD) 0ns (Thu Jan 1 01:00:00 1970) Execution capabilities Run OpenCL kernels Yes Run native kernels No Thread trace supported (AMD) No Number of async queues (AMD) 8 Max real-time compute queues (AMD) 8 Max real-time compute units (AMD) 7 printf() buffer size 4194304 (4MiB) Built-in kernels (n/a) Device Extensions cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_media_ops cl_amd_media_ops2 cl_khr_image2d_from_buffer cl_khr_subgroups cl_khr_depth_images cl_amd_copy_buffer_p2p cl_amd_assembly_program NULL platform behavior clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...) AMD Accelerated Parallel Processing clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...) Success [AMD] clCreateContext(NULL, ...) [default] Success [AMD] clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT) Success (1) Platform Name AMD Accelerated Parallel Processing Device Name gfx90c:xnack- clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU) No devices found in platform clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU) Success (1) Platform Name AMD Accelerated Parallel Processing Device Name gfx90c:xnack- clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR) No devices found in platform clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM) No devices found in platform clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL) Success (1) Platform Name AMD Accelerated Parallel Processing Device Name gfx90c:xnack- ICD loader properties ICD loader Name OpenCL ICD Loader ICD loader Vendor OCL Icd free software ICD loader Version 2.3.1 ICD loader Profile OpenCL 3.0 ``` ############## CLINFO 5.4.X : ############## ``` Number of platforms 1 Platform Name AMD Accelerated Parallel Processing Platform Vendor Advanced Micro Devices, Inc. 
Platform Version OpenCL 2.1 AMD-APP (3513.0) Platform Profile FULL_PROFILE Platform Extensions cl_khr_icd cl_amd_event_callback Platform Extensions function suffix AMD Platform Host timer resolution 1ns Platform Name AMD Accelerated Parallel Processing Number of devices 1 Device Name gfx90c:xnack- Device Vendor Advanced Micro Devices, Inc. Device Vendor ID 0x1002 Device Version OpenCL 2.0 Driver Version 3513.0 (HSA1.1,LC) Device OpenCL C Version OpenCL C 2.0 Device Type GPU Device Board Name (AMD) AMD Radeon Graphics Device PCI-e ID (AMD) 0x1638 Device Topology (AMD) PCI-E, 0000:03:00.0 Device Profile FULL_PROFILE Device Available Yes Compiler Available Yes Linker Available Yes Max compute units 7 SIMD per compute unit (AMD) 4 SIMD width (AMD) 16 SIMD instruction width (AMD) 1 Max clock frequency 1800MHz Graphics IP (AMD) 9.0 Device Partition (core) Max number of sub-devices 7 Supported partition types None Supported affinity domains (n/a) Max work item dimensions 3 Max work item sizes 1024x1024x1024 Max work group size 256 Preferred work group size (AMD) 256 Max work group size (AMD) 1024 Preferred work group size multiple (kernel) 64 Wavefront width (AMD) 64 Preferred / native vector sizes char 4 / 4 short 2 / 2 int 1 / 1 long 1 / 1 half 1 / 1 (cl_khr_fp16) float 1 / 1 double 1 / 1 (cl_khr_fp64) Half-precision Floating-point support (cl_khr_fp16) Denormals No Infinity and NANs No Round to nearest No Round to zero No Round to infinity No IEEE754-2008 fused multiply-add No Support is emulated in software No Single-precision Floating-point support (core) Denormals Yes Infinity and NANs Yes Round to nearest Yes Round to zero Yes Round to infinity Yes IEEE754-2008 fused multiply-add Yes Support is emulated in software No Correctly-rounded divide and sqrt operations Yes Double-precision Floating-point support (cl_khr_fp64) Denormals Yes Infinity and NANs Yes Round to nearest Yes Round to zero Yes Round to infinity Yes IEEE754-2008 fused multiply-add Yes Support is 
emulated in software No Address bits 64, Little-Endian Global memory size 2147483648 (2GiB) Global free memory (AMD) 2097152 (2GiB) 2097152 (2GiB) Global memory channels (AMD) 4 Global memory banks per channel (AMD) 4 Global memory bank width (AMD) 256 bytes Error Correction support No Max memory allocation 1825361096 (1.7GiB) Unified memory for Host and Device No Shared Virtual Memory (SVM) capabilities (core) Coarse-grained buffer sharing Yes Fine-grained buffer sharing Yes Fine-grained system sharing No Atomics No Minimum alignment for any data type 128 bytes Alignment of base address 1024 bits (128 bytes) Preferred alignment for atomics SVM 0 bytes Global 0 bytes Local 0 bytes Max size for global variable 1825361096 (1.7GiB) Preferred total size of global vars 2147483648 (2GiB) Global Memory cache type Read/Write Global Memory cache size 16384 (16KiB) Global Memory cache line size 64 bytes Image support Yes Max number of samplers per kernel 5688 Max size for 1D images from buffer 134217728 pixels Max 1D or 2D image array size 8192 images Base address alignment for 2D image buffers 256 bytes Pitch alignment for 2D image buffers 256 pixels Max 2D image size 16384x16384 pixels Max 3D image size 16384x16384x8192 pixels Max number of read image args 128 Max number of write image args 8 Max number of read/write image args 64 Max number of pipe args 16 Max active pipe reservations 16 Max pipe packet size 1825361096 (1.7GiB) Local memory type Local Local memory size 65536 (64KiB) Local memory size per CU (AMD) 65536 (64KiB) Local memory banks (AMD) 32 Max number of constant args 8 Max constant buffer size 1825361096 (1.7GiB) Preferred constant buffer size (AMD) 16384 (16KiB) Max size of kernel argument 1024 Queue properties (on host) Out-of-order execution No Profiling Yes Queue properties (on device) Out-of-order execution Yes Profiling Yes Preferred size 262144 (256KiB) Max size 8388608 (8MiB) Max queues on device 1 Max events on device 1024 Prefer user sync for 
interop Yes Number of P2P devices (AMD) 0 Profiling timer resolution 1ns Profiling timer offset since Epoch (AMD) 0ns (Thu Jan 1 01:00:00 1970) Execution capabilities Run OpenCL kernels Yes Run native kernels No Thread trace supported (AMD) No Number of async queues (AMD) 8 Max real-time compute queues (AMD) 8 Max real-time compute units (AMD) 7 printf() buffer size 4194304 (4MiB) Built-in kernels (n/a) Device Extensions cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_media_ops cl_amd_media_ops2 cl_khr_image2d_from_buffer cl_khr_subgroups cl_khr_depth_images cl_amd_copy_buffer_p2p cl_amd_assembly_program NULL platform behavior clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...) AMD Accelerated Parallel Processing clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...) Success [AMD] clCreateContext(NULL, ...) [default] Success [AMD] clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT) Success (1) Platform Name AMD Accelerated Parallel Processing Device Name gfx90c:xnack- clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU) No devices found in platform clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU) Success (1) Platform Name AMD Accelerated Parallel Processing Device Name gfx90c:xnack- clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR) No devices found in platform clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM) No devices found in platform clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL) Success (1) Platform Name AMD Accelerated Parallel Processing Device Name gfx90c:xnack- ICD loader properties ICD loader Name OpenCL ICD Loader ICD loader Vendor OCL Icd free software ICD loader Version 2.3.1 ICD loader Profile OpenCL 3.0 ```
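The only functional difference between the two dumps above is the "Preferred work group size multiple (kernel)" entry: 5.4.x reports 64, while 5.5.x prints a build log and `create kernel : error -6`. A minimal sketch for spotting that symptom in captured clinfo text follows. The function name and the regular expression are illustrative assumptions based only on the message format shown in this report, not part of clinfo itself.

```python
import re
from typing import Optional

# Pattern modelled on the failing line in the 5.5.x dump:
#   <getWGsizes:1504: create kernel : error -6>
# The exact wording is assumed from this report only.
KERNEL_ERROR_RE = re.compile(r"<\w+:\d+: create kernel : error (-?\d+)>")


def find_kernel_build_failure(clinfo_text: str) -> Optional[int]:
    """Return the OpenCL error code if clinfo failed to create its test kernel."""
    match = KERNEL_ERROR_RE.search(clinfo_text)
    return int(match.group(1)) if match else None


# Excerpts copied from the two dumps in this report.
BROKEN = "Preferred work group size multiple (kernel) <getWGsizes:1504: create kernel : error -6>"
WORKING = "Preferred work group size multiple (kernel) 64"

print(find_kernel_build_failure(BROKEN))
print(find_kernel_build_failure(WORKING))
```

On a live system the text could come from running clinfo through `subprocess.run(["clinfo"], capture_output=True, text=True)`, assuming clinfo is installed and on the PATH.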
Same problem here after upgrading from F37 to F38.
So to be clear, installing 5.5.0-1.fc38 on an up-to-date Fedora 38 system does not work, but taking the same system and installing the rocm-opencl-5.4.3-2.fc37 package works?

Just to confirm as well, can you get me the output of:

sudo yum list installed rocm*
(In reply to Jeremy Newton from comment #2)
> So to be clear, installing 5.5.0-1.fc38 on an up-to-date Fedora 38 system
> does not work, but taking the same system and installing the
> rocm-opencl-5.4.3-2.fc37 package works?
>
> Just to confirm as well, can you get me the output of:
>
> sudo yum list installed rocm*

Right, this package (rocm-opencl-5.4.3-2.fc37) works on the same system. I didn't change the system version, I just installed one package from the F37 repo.

I can't give you the output of that command right now, but I can tell you with certainty that after this command:

sudo dnf downgrade --releasever=37 rocm-opencl

only the selected package (rocm-opencl) was replaced, so the rest of the package's dependencies are still the same as before. Nothing else was changed, just rocm-opencl-5.5.0-1.fc38 to rocm-opencl-5.4.3-2.fc37.

I hope that is clear now.
BEFORE

sudo yum list installed rocm*
Installed Packages
rocm-comgr.x86_64      16.1-2.fc38     @updates
rocm-opencl.x86_64     5.5.0-1.fc38    @updates
rocm-runtime.x86_64    5.5.0-1.fc38    @updates

AFTER

sudo yum list installed rocm*
Installed Packages
rocm-comgr.x86_64      16.1-2.fc38     @updates
rocm-opencl.x86_64     5.4.3-2.fc37    @updates
rocm-runtime.x86_64    5.5.0-1.fc38    @updates
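To make the difference between the two listings explicit, here is a small sketch that diffs them. The helper names are hypothetical, and the parsing assumes the three-column row layout shown in this comment (name.arch, version-release, repo).

```python
def parse_installed(listing: str) -> dict:
    """Map 'name.arch' to 'version-release' for each package row in a listing."""
    packages = {}
    for line in listing.splitlines():
        fields = line.split()
        # Package rows have three columns and a repo column starting with '@';
        # the command line and the 'Installed Packages' header are skipped.
        if len(fields) == 3 and fields[2].startswith("@"):
            packages[fields[0]] = fields[1]
    return packages


def changed_packages(before: str, after: str) -> dict:
    """Return {package: (old_version, new_version)} for packages that differ."""
    old, new = parse_installed(before), parse_installed(after)
    return {
        name: (old.get(name), new.get(name))
        for name in sorted(old.keys() | new.keys())
        if old.get(name) != new.get(name)
    }


# Listings copied from this comment.
BEFORE = """Installed Packages
rocm-comgr.x86_64 16.1-2.fc38 @updates
rocm-opencl.x86_64 5.5.0-1.fc38 @updates
rocm-runtime.x86_64 5.5.0-1.fc38 @updates"""

AFTER = """Installed Packages
rocm-comgr.x86_64 16.1-2.fc38 @updates
rocm-opencl.x86_64 5.4.3-2.fc37 @updates
rocm-runtime.x86_64 5.5.0-1.fc38 @updates"""

print(changed_packages(BEFORE, AFTER))
```

Only rocm-opencl differs between the two states, which matches the claim that rocm-comgr and rocm-runtime were left untouched by the downgrade.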
There is a similar issue on github: https://github.com/RadeonOpenCompute/ROCm-CompilerSupport/issues/53 Following the procedure in that issue I get the following output for clinfo AMD_COMGR_SAVE_TEMPS=1 AMD_COMGR_REDIRECT_LOGS=stdout AMD_COMGR_EMIT_VERBOSE_LOGS=1 clinfo amd_comgr_do_action: ActionKind: AMD_COMGR_ACTION_ADD_PRECOMPILED_HEADERS IsaName: amdgcn-amd-amdhsa--gfx1010:xnack- Options: "-O3" "-cl-kernel-arg-info" "-D__OPENCL_VERSION__=200" "-D__IMAGE_SUPPORT__=1" "-Xclang" "-cl-ext=+cl_khr_fp64,+cl_khr_global_int32_base_atomics,+cl_khr_global_int32_extended_atomics,+cl_khr_local_int32_base_atomics,+cl_khr_local_int32_extended_atomics,+cl_khr_int64_base_atomics,+cl_khr_int64_extended_atomics,+cl_khr_3d_image_writes,+cl_khr_byte_addressable_store,+cl_khr_fp16,+cl_khr_gl_sharing,+cl_amd_device_attribute_query,+cl_amd_media_ops,+cl_amd_media_ops2,+cl_khr_image2d_from_buffer,+cl_khr_subgroups,+cl_amd_copy_buffer_p2p,+cl_amd_assembly_program" "-mllvm" "-amdgpu-prelink" "-mcode-object-version=5" Path: Language: AMD_COMGR_LANGUAGE_OPENCL_1_2 ReturnStatus: AMD_COMGR_STATUS_SUCCESS amd_comgr_do_action: ActionKind: AMD_COMGR_ACTION_COMPILE_SOURCE_TO_BC IsaName: amdgcn-amd-amdhsa--gfx1010:xnack- Options: "-O3" "-cl-kernel-arg-info" "-D__OPENCL_VERSION__=200" "-D__IMAGE_SUPPORT__=1" "-Xclang" "-cl-ext=+cl_khr_fp64,+cl_khr_global_int32_base_atomics,+cl_khr_global_int32_extended_atomics,+cl_khr_local_int32_base_atomics,+cl_khr_local_int32_extended_atomics,+cl_khr_int64_base_atomics,+cl_khr_int64_extended_atomics,+cl_khr_3d_image_writes,+cl_khr_byte_addressable_store,+cl_khr_fp16,+cl_khr_gl_sharing,+cl_amd_device_attribute_query,+cl_amd_media_ops,+cl_amd_media_ops2,+cl_khr_image2d_from_buffer,+cl_khr_subgroups,+cl_amd_copy_buffer_p2p,+cl_amd_assembly_program" "-mllvm" "-amdgpu-prelink" "-mcode-object-version=5" Path: Language: AMD_COMGR_LANGUAGE_OPENCL_1_2 COMGR::executeInProcessDriver argv: clang "-cc1" "-mcode-object-version=5" "-mllvm" "--amdhsa-code-object-version=5" 
"-triple" "amdgcn-amd-amdhsa" "-emit-llvm-bc" "-emit-llvm-uselists" "-clear-ast-before-backend" "-disable-llvm-verifier" "-discard-value-names" "-main-file-name" "CompileSource" "-mrelocation-model" "pic" "-pic-level" "1" "-fhalf-no-semantic-interposition" "-mframe-pointer=none" "-ffp-contract=on" "-fno-rounding-math" "-mconstructor-aliases" "-fvisibility=hidden" "-fapply-global-visibility-to-externs" "-target-cpu" "gfx1010" "-target-feature" "-xnack" "-mllvm" "-treat-scalable-fixed-error-as-warning" "-debugger-tuning=gdb" "-resource-dir" "lib64/clang/16" "-include-pch" "/tmp/comgr-b43a8d/include/opencl1.2-c.pch" "-I" "/tmp/comgr-b43a8d/include" "-D" "__OPENCL_VERSION__=200" "-D" "__IMAGE_SUPPORT__=1" "-O3" "-std=cl1.2" "-fdebug-compilation-dir=/home/Philipp/Test/2020-11-22_ROCM/ROCclr" "-ferror-limit" "19" "-cl-kernel-arg-info" "-nogpulib" "-fgnuc-version=4.2.1" "-fno-threadsafe-statics" "-fcolor-diagnostics" "-vectorize-loops" "-vectorize-slp" "-fno-validate-pch" "-cl-ext=+cl_khr_fp64,+cl_khr_global_int32_base_atomics,+cl_khr_global_int32_extended_atomics,+cl_khr_local_int32_base_atomics,+cl_khr_local_int32_extended_atomics,+cl_khr_int64_base_atomics,+cl_khr_int64_extended_atomics,+cl_khr_3d_image_writes,+cl_khr_byte_addressable_store,+cl_khr_fp16,+cl_khr_gl_sharing,+cl_amd_device_attribute_query,+cl_amd_media_ops,+cl_amd_media_ops2,+cl_khr_image2d_from_buffer,+cl_khr_subgroups,+cl_amd_copy_buffer_p2p,+cl_amd_assembly_program" "-mllvm" "-amdgpu-prelink" "-faddrsig" "-o" "/tmp/comgr-b43a8d/output/CompileSource.bc" "-x" "cl" "/tmp/comgr-b43a8d/input/CompileSource" ReturnStatus: AMD_COMGR_STATUS_SUCCESS amd_comgr_do_action: ActionKind: AMD_COMGR_ACTION_ADD_DEVICE_LIBRARIES IsaName: amdgcn-amd-amdhsa--gfx1010:xnack- Options: "code_object_v5" Path: Language: AMD_COMGR_LANGUAGE_OPENCL_1_2 ReturnStatus: AMD_COMGR_STATUS_SUCCESS amd_comgr_do_action: ActionKind: AMD_COMGR_ACTION_LINK_BC_TO_BC IsaName: amdgcn-amd-amdhsa--gfx1010:xnack- Options: "code_object_v5" Path: 
Language: AMD_COMGR_LANGUAGE_OPENCL_1_2
ERROR: linking module flags 'amdgpu_code_object_version': IDs have conflicting values in '' and 'llvm-link'
ReturnStatus: AMD_COMGR_STATUS_ERROR

=== CL_PROGRAM_BUILD_LOG ===
Error: Linking bitcode failed: linking source & IR libraries.

It looks like most stuff is compiled as a code object v5, but linking with device libs fails.

Interestingly llvm itself does not make v5 the default (at least according to the docs):
https://www.llvm.org/docs/AMDGPUUsage.html#code-object-v5-metadata

But even with v5 code objects built, comgr should in theory link in matching v5 device libs:
https://github.com/RadeonOpenCompute/ROCm-CompilerSupport/commit/5a1beb6417b7680c29ac131933dd99791141995e

However, if I run

llvm-dis -o - /usr/lib64/amdgcn/bitcode/oclc_abi_version_500.bc

I get

; ModuleID = '/usr/lib64/amdgcn/bitcode/oclc_abi_version_500.bc'
source_filename = "llvm-link"
target datalayout = "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5-G1-ni:7"
target triple = "amdgcn-amd-amdhsa"

@__oclc_ABI_version = linkonce_odr protected local_unnamed_addr addrspace(4) constant i32 500, align 4

!opencl.ocl.version = !{!0}
!llvm.ident = !{!1}
!llvm.module.flags = !{!2, !3, !4}

!0 = !{i32 2, i32 0}
!1 = !{!"clang version 16.0.0 (Fedora 16.0.0-2.fc38)"}
!2 = !{i32 1, !"amdgpu_code_object_version", i32 400}
!3 = !{i32 1, !"wchar_size", i32 4}
!4 = !{i32 8, !"PIC Level", i32 1}

Wondering where this goes wrong that amdgpu_code_object_version is set to 400 even for the v5 file.

I suspect that clang-16 does not support the -mcode-object-version=none option, which is used in
https://github.com/RadeonOpenCompute/ROCm-Device-Libs/blob/087bef746cd0422b0bfef2f3f713e4deb38803d1/cmake/OCL.cmake#L42

That's at least what the koji build logs suggest:
https://kojipkgs.fedoraproject.org//packages/rocm-device-libs/16.1/1.fc38/data/logs/x86_64/build.log

Until a workaround for -mcode-object-version=none becomes generally available, I tried to patch the .bc files and strip the amdgpu_code_object_version line with something like

for file in oclc*; do llvm-dis -o - $file | sed 's/^!2.*//g' | sed 's/!{!2, !3/!{!3/g' | llvm-as -o $file -; done

and editing the remaining bc files likewise. However, the issue persists. Maybe somebody else has an idea?
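The mismatch described in this comment (a file named oclc_abi_version_500.bc carrying the module flag value 400) can be checked mechanically on the text that llvm-dis prints. The sketch below is only an illustration: the function names are hypothetical, and it assumes that the numeric suffix in the file name is meant to equal the `amdgpu_code_object_version` flag, which is what the comment implies but which upstream does not state here.

```python
import re
from typing import Optional

# Matches the module flag entry as printed by llvm-dis, e.g.
#   !2 = !{i32 1, !"amdgpu_code_object_version", i32 400}
FLAG_RE = re.compile(r'!"amdgpu_code_object_version",\s*i32\s+(\d+)')
# Matches the numeric suffix of names such as oclc_abi_version_500.bc
ABI_FILE_RE = re.compile(r"oclc_abi_version_(\d+)\.bc$")


def code_object_flag(ir_text: str) -> Optional[int]:
    """Return the amdgpu_code_object_version module flag, if present."""
    match = FLAG_RE.search(ir_text)
    return int(match.group(1)) if match else None


def expected_abi_version(path: str) -> Optional[int]:
    """Return the version encoded in an oclc_abi_version_*.bc file name."""
    match = ABI_FILE_RE.search(path)
    return int(match.group(1)) if match else None


def flag_matches_filename(path: str, ir_text: str) -> bool:
    """Assumed consistency rule: the flag value equals the file name suffix."""
    return code_object_flag(ir_text) == expected_abi_version(path)


# Relevant lines copied from the llvm-dis output quoted in this comment.
SAMPLE_PATH = "/usr/lib64/amdgcn/bitcode/oclc_abi_version_500.bc"
SAMPLE_IR = '''!llvm.module.flags = !{!2, !3, !4}
!2 = !{i32 1, !"amdgpu_code_object_version", i32 400}
!3 = !{i32 1, !"wchar_size", i32 4}
'''

print(code_object_flag(SAMPLE_IR), expected_abi_version(SAMPLE_PATH))
print(flag_matches_filename(SAMPLE_PATH, SAMPLE_IR))
```

The same functions could be applied to the output of `llvm-dis -o -` for each file under /usr/lib64/amdgcn/bitcode/ to see which device libraries carry the flag at all.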
Hi. I have had this problem on my fedora box too. I grabbed the source rpm and hardcoded the 2 references to "-mcode-object-version=" in ROCclr-rocm-5.5.0/device/devprogram.cpp to be 4, and built it and it works.

// driverOptions.push_back("-mcode-object-version=" + std::to_string(options->oVariables->LCCodeObjectVersion));
driverOptions.push_back("-mcode-object-version=" + std::to_string(4));

and

// codegenOptions.push_back("-mcode-object-version=" + std::to_string(options->oVariables->LCCodeObjectVersion));
codegenOptions.push_back("-mcode-object-version=" + std::to_string(4));
Actually, it seems to work if you hardcode the version in both places to '5'.
Sorry, I missed a step in rebuilding the rpm.. it does not work with both set to 5.
Wow thanks! That really helped narrow down the issue for me. I was busy trying to get HIP working in Fedora (RHBZ#2209759), so I wasn't paying enough attention to this. I'll push an update later today and link this bug. I believe I just need to cherry-pick this to comgr: https://github.com/RadeonOpenCompute/ROCm-CompilerSupport/commit/79948e1807bca7108722982b9018d61dde9420f2 Mike, if you can, can you try applying that? Or wait for my bodhi update and test that?
It's building: https://koji.fedoraproject.org/koji/taskinfo?taskID=101538720 If that doesn't work, I'll revert and just hardcode to 4 in rocclr until I can contact upstream for suggestions.
FEDORA-2023-f4164e5c06 has been submitted as an update to Fedora 39. https://bodhi.fedoraproject.org/updates/FEDORA-2023-f4164e5c06
FEDORA-2023-f4164e5c06 has been pushed to the Fedora 39 stable repository. If problem still persists, please make note of it in this bug report.
Hi. I can test, but I need a fedora 38 build.
I installed rocm-comgr-16.1-3.fc39.x86_64 and still am getting the errors.
OK, I am not sure what I need to do here either. In the spec file for opencl, I see these dependencies:

BuildRequires: rocm-comgr-devel
BuildRequires: rocm-runtime-devel

I assume for your fix to work, opencl needs to be rebuilt with the fixed comgr?
I don't think you need to rebuild OpenCL on a comgr change. I might need to just sit down and experiment with a few combinations to understand the real cause. I'm curious why upstream is not seeing this issue, but maybe opencl from 5.5.0 is just buggy with code object version 5, and we need to force it to use 4.

I think for now I'll roll back comgr, since that seems to have done nothing, and just revert this change in my next update:
https://github.com/ROCm-Developer-Tools/ROCclr/commit/041c00465b7adcee78085dc42253d42d1bb1f250

It does effectively the same thing as hardcoding to 4.
FEDORA-2023-68012d0819 has been submitted as an update to Fedora 38. https://bodhi.fedoraproject.org/updates/FEDORA-2023-68012d0819
I pushed the F38 update first, please try that and let me know if that resolves your issue.
Looks good to me! I installed this over the one I built and everything is still working.
FEDORA-2023-80eb7f41de has been submitted as an update to Fedora 39. https://bodhi.fedoraproject.org/updates/FEDORA-2023-80eb7f41de
FEDORA-2023-80eb7f41de has been pushed to the Fedora 39 stable repository. If problem still persists, please make note of it in this bug report.
FEDORA-2023-68012d0819 has been pushed to the Fedora 38 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2023-68012d0819` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2023-68012d0819 See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
Thanks! I pushed the fix to rawhide, but the F38 update will be in testing for the next week.
FEDORA-2023-68012d0819 has been pushed to the Fedora 38 stable repository. If problem still persists, please make note of it in this bug report.
FEDORA-2023-d024444040 has been submitted as an update to Fedora 39. https://bodhi.fedoraproject.org/updates/FEDORA-2023-d024444040
FEDORA-2023-d024444040 has been pushed to the Fedora 39 stable repository. If problem still persists, please make note of it in this bug report.