Bug 2227061 - uint64_t-variant patch interferes with rocfft build
Summary: uint64_t-variant patch interferes with rocfft build
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: rocclr
Version: 39
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Jeremy Newton
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2023-07-27 16:00 UTC by Tim Flink
Modified: 2023-09-15 18:38 UTC (History)

Fixed In Version: rocclr-5.5.1-10.fc38 rocclr-5.6.0-4.fc39
Clone Of:
Environment:
Last Closed: 2023-09-01 01:29:02 UTC
Type: Bug
Embargoed:



Description Tim Flink 2023-07-27 16:00:50 UTC
There is a patch in the rocclr package which, according to its comments, was added to help with building blender. It turns out that the patch interferes with the current packaging effort for rocfft: the rocfft kernel cache build doesn't work. The upstream issue is https://github.com/ROCmSoftwarePlatform/rocFFT/issues/422

The patch in question is https://src.fedoraproject.org/rpms/rocclr/blob/rawhide/f/0001-add-uint64_t-variant-for-__ffsll.patch

Reproducing is simple: build upstream rocFFT on a Fedora system using Fedora-packaged dependencies and the build will fail. If rocclr is rebuilt without the uint64_t-variant patch, the rocfft build finishes (eventually; the kernel cache step takes a long time, at least an hour on my system).

Disabling the kernel cache isn't a realistic solution: if the kernels aren't cached, they will be built at runtime. On my system (Ryzen 7 5700X, 64GB memory), the kernel cache build takes over an hour. While building the kernels at runtime wouldn't take that long, it still seems like an unreasonable demand of users.

Additionally, I have been trying to triage an issue that was exclusive to Fedora, seen with rocfft built without the cached kernels. Simple code from the documentation (https://rocfft.readthedocs.io/en/rocm-5.6.0/#example) would throw errors and 100% of the test suite would fail with rocfft built against rocclr as packaged in Fedora. It isn't filed anywhere because I was still triaging the issue.

When I built rocfft against the de-patched rocclr, the example code no longer errored out and produced the same results as on Debian sid (built with Debian-packaged dependencies) and on RHEL 9.2 (built with AMD-supplied dependencies). Additionally, the test suite passes when rocfft is built against rocclr without the patch.

I can provide details on the runtime errors I was seeing if desired.

Comment 1 Fedora Release Engineering 2023-08-16 08:06:01 UTC
This bug appears to have been reported against 'rawhide' during the Fedora Linux 39 development cycle.
Changing version to 39.

Comment 2 Jeremy Newton 2023-08-16 19:42:22 UTC
I believe Tom Rix landed this patch and I pulled it in.
I can revert it, but I would need to look into this a bit more to understand the issue.

Sorry, I've been sick, so I'm still digging myself out from under unanswered emails.

Comment 3 Tom Rix 2023-08-16 21:30:16 UTC
This change was to fix a build error with blender.
Reverting it will likely break blender again.
IIRC - the default handler was a template for any int type, while the rocm handler handled just 2 types.
A better solution would be for the rocm handler to be more like the template handler and handle any int type.
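
As a rough illustration of the distinction described above, here is a minimal C++ sketch. It is not the HIP/ROCm source and not the actual patch: the names ffs_any and ffs_fixed are hypothetical, and the two fixed overload types (long long, unsigned long long) are an assumption made for the example. It only shows why a handler written for two exact types can reject a std::uint64_t argument that a template handler accepts.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical names throughout; this is not the HIP/ROCm source.

// Template style: accepts any integer type (other than bool).
// Returns the 1-based position of the lowest set bit, or 0 when x is 0.
template <typename T>
int ffs_any(T x) {
    static_assert(std::is_integral<T>::value, "ffs_any needs an integer type");
    using U = typename std::make_unsigned<T>::type;
    U v = static_cast<U>(x);  // unsigned copy, so right shifts are well defined
    if (v == 0) return 0;
    int pos = 1;
    while ((v & 1) == 0) {
        v = static_cast<U>(v >> 1);
        ++pos;
    }
    return pos;
}

// Fixed-overload style: only two exact types are covered (assumed types).
inline int ffs_fixed(long long x) { return ffs_any(x); }
inline int ffs_fixed(unsigned long long x) { return ffs_any(x); }

// Where std::uint64_t is `unsigned long` (typical on 64-bit Linux), the call
// ffs_fixed(std::uint64_t{8}) is ambiguous: each overload needs an integral
// conversion of equal rank. ffs_any(std::uint64_t{8}) compiles on any platform.
```

Adding a third overload for std::uint64_t resolves the ambiguity on platforms where it is a distinct type, but the result still depends on how std::uint64_t is defined in each compilation environment. The template form avoids that dependency.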

Comment 4 Tom Rix 2023-08-20 20:43:49 UTC
I have an update to the blender patch here.
https://src.fedoraproject.org/rpms/rocclr/pull-request/3

Comment 5 Jeremy Newton 2023-08-21 14:44:34 UTC
Thanks Tom

@tflink Can you confirm if this fixes it?
If so I can merge it in.

Comment 6 Tim Flink 2023-08-22 18:52:03 UTC
The build does complete against rocclr built using the new patch and the example code from the rocFFT docs does run without error.

The test suite does not pass (see https://github.com/ROCmSoftwarePlatform/rocFFT/issues/439) but I can't say if that's due to the patch or not. My gut says that something else is the problem but I'm testing anyways.

I'm currently running a rocFFT build against rocclr that was built without any of the blender patches. The rocFFT build takes about 1.2 hours on my dev machine and the test suite takes almost 4, so it'll be a little while before I have any results.

Comment 7 Tim Flink 2023-08-23 00:55:54 UTC
I ran the rocFFT test suite against rocclr built without any of the blender patches and I'm seeing the same level of failure.

Comment 8 Jeremy Newton 2023-08-23 14:26:51 UTC
Ok thanks Tim!

I'll pull in the patch to replace the old one. Once we get FFT working, I can offer the patch upstream.

Comment 9 Fedora Update System 2023-08-23 14:51:55 UTC
FEDORA-2023-52ccdb8487 has been submitted as an update to Fedora 39. https://bodhi.fedoraproject.org/updates/FEDORA-2023-52ccdb8487

Comment 10 Jeremy Newton 2023-08-23 14:54:21 UTC
Update pushed to f38/f39/rawhide.

Rawhide should be already pushed, but expect some delay for f39 due to package freeze.

Comment 11 Jeremy Newton 2023-08-23 14:54:57 UTC
Also thanks Tom!

Comment 12 Fedora Update System 2023-08-24 01:22:58 UTC
FEDORA-2023-52ccdb8487 has been pushed to the Fedora 39 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2023-52ccdb8487`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2023-52ccdb8487

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 13 Fedora Update System 2023-08-24 01:37:46 UTC
FEDORA-2023-7bd0e27742 has been pushed to the Fedora 38 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2023-7bd0e27742`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2023-7bd0e27742

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 14 Fedora Update System 2023-09-01 01:29:02 UTC
FEDORA-2023-7bd0e27742 has been pushed to the Fedora 38 stable repository.
If the problem still persists, please make note of it in this bug report.

Comment 15 Fedora Update System 2023-09-15 18:38:53 UTC
FEDORA-2023-52ccdb8487 has been pushed to the Fedora 39 stable repository.
If the problem still persists, please make note of it in this bug report.

