There is a patch in the rocclr package which, according to its comments, was added to help with building blender. It turns out that the patch interferes with the current packaging effort for rocfft: building the rocfft kernel cache doesn't work.

The upstream issue is https://github.com/ROCmSoftwarePlatform/rocFFT/issues/422

The patch in question is https://src.fedoraproject.org/rpms/rocclr/blob/rawhide/f/0001-add-uint64_t-variant-for-__ffsll.patch

Reproducing is simple: build the upstream project on a Fedora system using Fedora packaged dependencies and the build process will fail. If rocclr is rebuilt without the uint64_t-variant patch, the rocfft build finishes eventually; the kernel cache step takes over an hour on my system (Ryzen 7 5700X, 64GB memory).

Disabling the kernel cache isn't a realistic solution, because any kernels that aren't cached will be built at runtime. While building the kernels at runtime wouldn't take that long, it still seems like an unreasonable demand of users.

Additionally, I have been trying to triage an issue with rocfft that was exclusive to Fedora, where it was built without the cached kernels. Simple code from the documentation (https://rocfft.readthedocs.io/en/rocm-5.6.0/#example) would throw errors and 100% of the test suite would fail with rocfft built against rocclr as packaged in Fedora. It isn't filed anywhere because I was still triaging the issue.

When I built rocfft against the de-patched rocclr, the example code no longer errored out and produced the same results as when it is built and run on Debian sid with Debian packaged dependencies and on RHEL 9.2 with AMD supplied dependencies. Additionally, the test suite passes when rocfft is built against rocclr without the patch.

I can provide details on the runtime errors I was seeing if desired.
This bug appears to have been reported against 'rawhide' during the Fedora Linux 39 development cycle. Changing version to 39.
I believe Tom Rix landed this patch and I pulled it in. I can revert it, but I would need to look into this a bit more to understand the issue. Sorry, I've been sick, so I'm still digging myself out from under unanswered emails.
This change was to fix a build error with blender. Reverting it will likely break blender again. IIRC, the default handler was a template for any int type, while the rocm handler handled just 2 types. A better solution would be for the rocm handler to be more like the template handler and handle any int type.
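To illustrate the "handle any int type" idea, here is a minimal host-side sketch. It is not the rocclr or HIP code: the name `ffs_any` is hypothetical, and it relies on the GCC/Clang builtin `__builtin_ffsll` in place of the device intrinsic. A real fix would need to live in the HIP device headers and carry the appropriate device attributes.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical helper (not the rocclr/HIP implementation): find-first-set for
// any integer type. Returns the 1-based index of the least significant set
// bit, or 0 when no bit is set, matching the POSIX ffs() convention.
template <typename T>
int ffs_any(T value) {
    static_assert(std::is_integral<T>::value,
                  "ffs_any requires an integer type");
    // Widening to long long keeps the low bits intact. Sign extension only
    // adds bits above the original width, so the lowest set bit is unchanged.
    return __builtin_ffsll(static_cast<long long>(value));
}

// A few sanity checks across signed and unsigned types of different widths.
inline void ffs_any_examples() {
    assert(ffs_any(0) == 0);
    assert(ffs_any(std::uint64_t{8}) == 4);
    assert(ffs_any(std::int64_t{-2}) == 2);
    assert(ffs_any(std::uint64_t{1} << 63) == 64);
}
```

With a single template, callers passing `int`, `long long`, or `uint64_t` all resolve to the same definition, which would avoid adding a separate overload for each type.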