Spec URL: https://trix.fedorapeople.org/rocblas.spec SRPM URL: https://trix.fedorapeople.org/rocblas-5.6.0-1.fc40.src.rpm rocBLAS is the AMD library for Basic Linear Algebra Subprograms (BLAS) on the ROCm platform. It is implemented in the HIP programming language and optimized for AMD GPUs. Size of *.so was causing relocation errors, so scaled back the gpus to build. Reproducible: Always
Spec URL: https://trix.fedorapeople.org/rocblas.spec SRPM URL: https://trix.fedorapeople.org/rocblas-5.6.0-2.fc40.src.rpm Reworked to split *.so into gpu family's and to load the appropriate one at install time. Add an option --with test to create a rocblas-test.rpm with the tests and benchmarks Testing is ok on my gfx1102 Building is ok with the WIP rocsparse.spec here https://github.com/trixirt/rocSPARSE/blob/fedora/rocm-rel-5.6/rocsparse.spec
So I'm not totally opposed to the idea of splitting up the SO libs... but I'm a bit concerned with the logic in %post. My gut says we shouldn't copy, but use symlinks instead, but if you're going to use symlinks, I would suggest "alternatives". It's not 100% a traditional use for alternatives, but it at least allows the user/system maintainer to switch easily if the detect logic doesn't work. It also reduces some possible packaging pitfalls. I also suggest to have a bunch of rocblas subpackages. E.g rocblas-gfx11, rocblas-gfx10, etc, with a metapackage of "rocblas" that just pulls them all in with the detection logic. This allows easier selection by the user installing the packages. My gut also says to default to gfx11, but this is completely subjective and I have no good reason to feel this way. Last point, please update to 5.7, I just pushed the update to hip this morning.
I am wondering if rocm shouldn't be handled like mpi by using modules. We are always going to switch a set of libs from multiple packages at the same time. instead of the mpi, we would have rocm gpu family backends. ex/ setenv ROCM_LIB /usr/lib64/rocm/gfx11/lib prepend-path LD_LIBRARY_PATH /usr/lib64/rocm/gfx11/lib and there would be new package like rocm-module to set this up. Here is an example of what Benson suggested for gloo https://bugzilla.redhat.com/show_bug.cgi?id=2240302 https://copr-dist-git.fedorainfracloud.org/cgit/fed500/gloo/gloo.git/tree/gloo.spec?id=5f4092aba4f411ac9a10fc05d14aeab8909df3d9
(In reply to Tom Rix from comment #3) > I am wondering if rocm shouldn't be handled like mpi by using modules. > We are always going to switch a set of libs from multiple packages at the > same time. > instead of the mpi, we would have rocm gpu family backends. > ex/ > setenv ROCM_LIB /usr/lib64/rocm/gfx11/lib > prepend-path LD_LIBRARY_PATH /usr/lib64/rocm/gfx11/lib > > and there would be new package like rocm-module to set this up. > > Here is an example of what Benson suggested for gloo > https://bugzilla.redhat.com/show_bug.cgi?id=2240302 > https://copr-dist-git.fedorainfracloud.org/cgit/fed500/gloo/gloo.git/tree/ > gloo.spec?id=5f4092aba4f411ac9a10fc05d14aeab8909df3d9 I'm not sure that this would work. As Cordell pointed out many users have multiple different GPUs installed in their system. This won't be as much of a problem in the enterprise space where graphics aren't really a thing but many systems are built with APUs (combined CPU+GPU) packages and then they install a DGPU like a Radeon 7900 XT. How would that work with your alternative's ligic? You would need two to be active at once. I agree, that you should roll everything forward to 5.7 and in the back of your mind consider how to maintain compatibility when rocm 6 rolls out. It is supposed to be a break API release.
Everyone is not going to be happy. There is 10 kilos flour going into a single 5 kilo bag. This is better than the first option where the older gpus were dropped.
Most coverage of gpu will happen in the default, see https://github.com/trixirt/rocm-rpm-macros/blob/main/modules.rocm/default I think this is the best we can do, a combo of the last 2 families, if you have a gfx8 or gfx9, then you have to use that specific environment. There will be no combo support for gfx8 or gfx9. When gfx12 rolls out the default will be gfx11 and gfx12 combo
The new build infra has been linked as a depends-on package reviews for module infra that splits out these gpu families rpm macros a general improvement, not specifically needed, but I am tired of cut-n-pasting boiler plate An example of use for rocBLAS is https://github.com/trixirt/rocBLAS/commit/deba70490ff6af3f866553fe703f4e00e6da3067 And then the followup rocSPARSE that uses rocBLAS is https://github.com/trixirt/rocSPARSE/commit/969e09adf17d5c1993fb409904743bf443cb8d92
Spec URL: https://trix.fedorapeople.org/rocblas.spec SRPM URL: https://trix.fedorapeople.org/rocblas-5.7.1-1.fc40.src.rpm Uses the new infra.
So it seems pretty sane, except you put the .so files in the regular package, and the .so.* in the devel. A suggestion could be to add a macro for %files so you don't need to specify all the families. I'm just building it in copr to check the output: https://copr.fedorainfracloud.org/coprs/mystro256/playground/build/6552139/
Spec URL: https://trix.fedorapeople.org/rocblas.spec SRPM URL: https://trix.fedorapeople.org/rocblas-5.7.1-2.fc40.src.rpm fixed the so location. Use a glob instead of macro to simplify the gfx locations
Ok so after looking into this further, I see some remaining issues: - please change the license field to "MIT AND BSD-3-Clause" to match SPDX style (capitalise AND) - I'd much prefer you use the URL https://github.com/ROCmSoftwarePlatform/rocBLAS as the upstream, as it makes it easier to find the clone-able code. I need to fix a few of the other ROCm packages to match. - There's a bunch of orphaned directories and we should make sure they're owned by a package as per the guidelines. I'd suggest either add them to the rocm-rpm-macros-modules package, or add a filesystem package to own these directories: /usr/lib64/rocm/gfx11 /usr/lib64/rocm/gfx11/lib /usr/lib64/rocm/gfx11/lib/cmake /usr/lib64/rocm/gfx10 /usr/lib64/rocm/gfx10/lib /usr/lib64/rocm/gfx10/lib/cmake /usr/lib64/rocm/gfx9 /usr/lib64/rocm/gfx9/lib /usr/lib64/rocm/gfx9/lib/cmake /usr/lib64/rocm/gfx8 /usr/lib64/rocm/gfx8/lib /usr/lib64/rocm/gfx8/lib/cmake I'd much rather we fix this before approving this package, or else we're run into orphaned directories.
Spec URL: https://trix.fedorapeople.org/rocblas.spec SRPM URL: https://trix.fedorapeople.org/rocblas-5.7.1-3.fc40.src.rpm I added the dirs here https://src.fedoraproject.org/rpms/rocm-rpm-macros/c/d3ed3ad508df78409a909b30ad3a2693d7398a6c?branch=rawhide license and url changes made
LGTM, although you put AMD instead of AND in the changelog. Approved
The Pagure repository was created at https://src.fedoraproject.org/rpms/rocblas