Bug 2295820 - Review Request: hipblaslt - ROCm general matrix operations beyond BLAS
Summary: Review Request: hipblaslt - ROCm general matrix operations beyond BLAS
Keywords:
Status: RELEASE_PENDING
Alias: None
Product: Fedora
Classification: Fedora
Component: Package Review
Version: rawhide
Hardware: Unspecified
OS: Linux
unspecified
medium
Target Milestone: ---
Assignee: Tim Flink
QA Contact: Fedora Extras Quality Assurance
URL: https://github.com/ROCmSoftwarePlatfo...
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2024-07-04 15:43 UTC by Tom Rix
Modified: 2024-08-18 21:27 UTC (History)
6 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Type: ---
Embargoed:
tflink: fedora-review+


Attachments (Terms of Use)
specfile changes for discussion (1.66 KB, patch)
2024-07-29 18:18 UTC, Tim Flink
no flags Details | Diff
The .spec file difference from Copr build 7794839 to 7809096 (4.13 KB, patch)
2024-07-31 18:07 UTC, Fedora Review Service
no flags Details | Diff
clean up test buildrequires, update libomp buildrequires (622 bytes, patch)
2024-08-01 23:31 UTC, Tim Flink
no flags Details | Diff
clean up test buildrequires, update libomp buildrequires correctly (628 bytes, patch)
2024-08-02 01:07 UTC, Tim Flink
no flags Details | Diff
The .spec file difference from Copr build 7809096 to 7819168 (597 bytes, patch)
2024-08-02 14:19 UTC, Fedora Review Service
no flags Details | Diff

Description Tom Rix 2024-07-04 15:43:25 UTC
Spec URL: https://trix.fedorapeople.org/hipblaslt.spec
SRPM URL: https://trix.fedorapeople.org/hipblaslt-6.1.2-1.fc41.src.rpm

hipblastlt is a requirement for pytorch. See here for its use
https://src.fedoraproject.org/rpms/python-torch/blob/rawhide/f/python-torch.spec#_358

This should resolve this upstream issue
https://github.com/pytorch/pytorch/pull/120551

Reproducible: Always

Comment 1 Fedora Review Service 2024-07-04 18:14:04 UTC
Copr build:
https://copr.fedorainfracloud.org/coprs/build/7708413
(succeeded)

Review template:
https://download.copr.fedorainfracloud.org/results/@fedora-review/fedora-review-2295820-hipblaslt/fedora-rawhide-x86_64/07708413-hipblaslt/fedora-review/review.txt

Found issues:

- No gcc, gcc-c++ or clang found in BuildRequires
  Read more: https://docs.fedoraproject.org/en-US/packaging-guidelines/C_and_C++/

Please know that there can be false-positives.

---
This comment was created by the fedora-review-service
https://github.com/FrostyX/fedora-review-service

If you want to trigger a new Copr build, add a comment containing new
Spec and SRPM URLs or [fedora-review-service-build] string.

Comment 2 Tim Flink 2024-07-25 18:29:58 UTC
Overall, it looks good to me but there are 2 bigger things I'm concerned about:
1. Do the files in tensilite/Tensile/CustomKernels/ count as pre-generated code or content?
   - Is it possible to delete the CustomKernels directory and still build for gfx941 and gfx942 albeit with a less optimized setup? If so, I'd lean more towards content

2. How do we handle the "bundled" tensile. It's never installed by hipblaslt and only used during the build process and I don't know off hand how to handle that. I don't think that it needs to be added to the provides but I also don't know if there is anything preventing this from happening.

There are a few minor things like BuildRequires and Provides that needed to be changed to build in mock but I'll submit a patch for that once the major questions are figured out.

Also, are there any plans (upstream or just in Fedora) to start building and offering kernels for something beyond gfx90a?

Comment 3 Jeremy Newton 2024-07-25 20:37:46 UTC
So some more information from a ROCm developer, we were in a meeting that included Tom and me:

The bundled tensile is a fork, and is expected to diverge from tensile (if it hasn't already) and become a replacement for tensile in rocm (other libs are supposed to eventually call rocblaslt).
I'd rather just move forward, and we can drop tensile later if need be.

> How do we handle the "bundled" tensile ... I don't think that it needs to be added to the provides but I also don't know if there is anything preventing this from happening

A bundle is a bundle, you need to add it regardless. The provides is more of a flag. I believe the intention is if they a security issue or similar critical issue in the library, the maintainer can look for the provides to notify the other maintainer that there's a critical bug that needs fixing. This could also be a CVE, a license issue, a copyright issue, a legal issue, etc. Providing a version is pretty important too for tracking.

E.g. say library A has a CVE and package B bundles A, then it's easy to query for "provides: A" to see what packages need updating, which would be A and B.

Comment 4 Jeremy Newton 2024-07-25 20:47:50 UTC
Other comments:

BuildRequires:  clang%{llvm_maj_ver}
BuildRequires:  lld%{llvm_maj_ver}

If you're trying to use an LLVM version as hipcc, this is not necessary, since hipcc requires this already.

If so, I suggest that we could add a metapackage in rocm-compilersupport.spec called hipcc-tools-extra to require the correct clang package to reduce packaging complexity in this package. I would rather keep all the llvm version information in rocm-compilersupport.spec if possible, since LLVM is a bit complicated to deal with in the rocm stack.

And as well, assuming everything above makes sense, I would add:
BuildRequires:  rocm-compilersupport-macros

and replace all %{llvm_maj_ver} with %{rocmllvm_version}

Comment 5 Jeremy Newton 2024-07-25 20:53:26 UTC
To be clear, I meant to say:

> If you're trying to use an LLVM version THE SAME as hipcc

Comment 6 Tim Flink 2024-07-25 23:35:32 UTC
(In reply to Jeremy Newton from comment #3)
> So some more information from a ROCm developer, we were in a meeting that
> included Tom and me:
> 
> The bundled tensile is a fork, and is expected to diverge from tensile (if
> it hasn't already) and become a replacement for tensile in rocm (other libs
> are supposed to eventually call rocblaslt).
> I'd rather just move forward, and we can drop tensile later if need be.

Yeah, that's mostly what I had understood when I talked to trix about it.

> > How do we handle the "bundled" tensile ... I don't think that it needs to be added to the provides but I also don't know if there is anything preventing this from happening
> 
> A bundle is a bundle, you need to add it regardless. The provides is more of
> a flag. I believe the intention is if they a security issue or similar
> critical issue in the library, the maintainer can look for the provides to
> notify the other maintainer that there's a critical bug that needs fixing.
> This could also be a CVE, a license issue, a copyright issue, a legal issue,
> etc. Providing a version is pretty important too for tracking.
> 
> E.g. say library A has a CVE and package B bundles A, then it's easy to
> query for "provides: A" to see what packages need updating, which would be A
> and B.

Sure, bundled libs should be handled like bundled libs and there's a reason that process exists but that's not my argument here. I don't think that the Tensile in tensilelite _is_ a bundle in the sense that we care about for packaging.

As near as I can tell, this package never installs the forked, bundled Tensile - it just installs it in a source dir subdirectory, adds that subdirectory to PATH and PYTHONPATH before doing the actual hipblaslt build where I think it's used to generate the platform-specific kernels. The Fedora package doesn't even provide Tensile which is why I don't think it needs to have a bundled provides in the spec.

Even if there were a CVE in this bundled Tensile, it's never distributed in a Fedora package - it just exists on builders for a short time before that disk is reclaimed post-build. We can have a discussion about whether that's a good practice or not but unless there's something I'm missing here, this package in its current state isn't bundling Tensile.

Comment 7 Tim Flink 2024-07-25 23:36:50 UTC
(In reply to Tim Flink from comment #6)

> As near as I can tell, this package never installs the forked, bundled
> Tensile

Never installs the forked, bundled Tensile into the Buildroot

Comment 8 Jeremy Newton 2024-07-26 00:33:05 UTC
Isn't it technically a static bundling though?

Honestly having the provides does no harm, but if you feel strongly, no strong feelings from me. I'm much more concerned about the hipcc/llvm related issues.

Comment 9 Tom Rix 2024-07-26 12:51:52 UTC
(In reply to Jeremy Newton from comment #4)
> Other comments:
> 
> BuildRequires:  clang%{llvm_maj_ver}
> BuildRequires:  lld%{llvm_maj_ver}
> 
> If you're trying to use an LLVM version as hipcc, this is not necessary,
> since hipcc requires this already.
> 
> If so, I suggest that we could add a metapackage in
> rocm-compilersupport.spec called hipcc-tools-extra to require the correct
> clang package to reduce packaging complexity in this package. I would rather
> keep all the llvm version information in rocm-compilersupport.spec if
> possible, since LLVM is a bit complicated to deal with in the rocm stack.
> 
> And as well, assuming everything above makes sense, I would add:
> BuildRequires:  rocm-compilersupport-macros
> 
> and replace all %{llvm_maj_ver} with %{rocmllvm_version}

Yes.
This is cleaner.

Comment 10 Tom Rix 2024-07-26 13:11:51 UTC
(In reply to Tim Flink from comment #2)
> Overall, it looks good to me but there are 2 bigger things I'm concerned
> about:
> 1. Do the files in tensilite/Tensile/CustomKernels/ count as pre-generated
> code or content?
>    - Is it possible to delete the CustomKernels directory and still build
> for gfx941 and gfx942 albeit with a less optimized setup? If so, I'd lean
> more towards content

No it does not build when the kernels are removed.

Comment 11 Tom Rix 2024-07-26 13:21:28 UTC
(In reply to Tim Flink from comment #6)
> (In reply to Jeremy Newton from comment #3)
> > So some more information from a ROCm developer, we were in a meeting that
> > included Tom and me:
> > 
> > The bundled tensile is a fork, and is expected to diverge from tensile (if
> > it hasn't already) and become a replacement for tensile in rocm (other libs
> > are supposed to eventually call rocblaslt).
> > I'd rather just move forward, and we can drop tensile later if need be.
> 
> Yeah, that's mostly what I had understood when I talked to trix about it.
> 
> > > How do we handle the "bundled" tensile ... I don't think that it needs to be added to the provides but I also don't know if there is anything preventing this from happening
> > 
> > A bundle is a bundle, you need to add it regardless. The provides is more of
> > a flag. I believe the intention is if they a security issue or similar
> > critical issue in the library, the maintainer can look for the provides to
> > notify the other maintainer that there's a critical bug that needs fixing.
> > This could also be a CVE, a license issue, a copyright issue, a legal issue,
> > etc. Providing a version is pretty important too for tracking.
> > 
> > E.g. say library A has a CVE and package B bundles A, then it's easy to
> > query for "provides: A" to see what packages need updating, which would be A
> > and B.
> 
> Sure, bundled libs should be handled like bundled libs and there's a reason
> that process exists but that's not my argument here. I don't think that the
> Tensile in tensilelite _is_ a bundle in the sense that we care about for
> packaging.
> 
> As near as I can tell, this package never installs the forked, bundled
> Tensile - it just installs it in a source dir subdirectory, adds that
> subdirectory to PATH and PYTHONPATH before doing the actual hipblaslt build
> where I think it's used to generate the platform-specific kernels. The
> Fedora package doesn't even provide Tensile which is why I don't think it
> needs to have a bundled provides in the spec.
> 
> Even if there were a CVE in this bundled Tensile, it's never distributed in
> a Fedora package - it just exists on builders for a short time before that
> disk is reclaimed post-build. We can have a discussion about whether that's
> a good practice or not but unless there's something I'm missing here, this
> package in its current state isn't bundling Tensile.

I will replace bundled: with a strongly worded comment :P

Comment 12 Tom Rix 2024-07-26 13:41:25 UTC
> 
> Also, are there any plans (upstream or just in Fedora) to start building and
> offering kernels for something beyond gfx90a?

I would not hold your breath on this. I raised this issue over a year ago.
Here is a recent ask.
https://github.com/ROCm/hipBLASLt/issues/312

Comment 13 Tim Flink 2024-07-26 17:42:43 UTC
(In reply to Tom Rix from comment #11)
> (In reply to Tim Flink from comment #6)
> > (In reply to Jeremy Newton from comment #3)

<snip>
 
> I will replace bundled: with a strongly worded comment :P

eh, now that I'm re-reading this today and looking into it more, I just misunderstood what bundled(<libname>) does. For some reason, I thought it would show up as a regular provides and someone could install it instead of Tensile by accident. I withdraw my concern, it's not worth worrying about for now and we can deal with any Tensile replacement if/when it ever happens.

(In reply to Tom Rix from comment #10)
> (In reply to Tim Flink from comment #2)
> > Overall, it looks good to me but there are 2 bigger things I'm concerned
> > about:
> > 1. Do the files in tensilite/Tensile/CustomKernels/ count as pre-generated
> > code or content?
> >    - Is it possible to delete the CustomKernels directory and still build
> > for gfx941 and gfx942 albeit with a less optimized setup? If so, I'd lean
> > more towards content
> 
> No it does not build when the kernels are removed.

Did we ever find an answer to what we can do about these kinds of binary files and whether they're package-able? I remember it coming up with miopen but I don't remember what the conclusion was. The review went through, so I assume that it was determined to be kosher-enough?

Comment 14 Tom Rix 2024-07-26 18:37:18 UTC
(In reply to Tim Flink from comment #13)
> (In reply to Tom Rix from comment #11)
> > (In reply to Tim Flink from comment #6)
> > > (In reply to Jeremy Newton from comment #3)
> 
> <snip>
>  
> > I will replace bundled: with a strongly worded comment :P
> 
> eh, now that I'm re-reading this today and looking into it more, I just
> misunderstood what bundled(<libname>) does. For some reason, I thought it
> would show up as a regular provides and someone could install it instead of
> Tensile by accident. I withdraw my concern, it's not worth worrying about
> for now and we can deal with any Tensile replacement if/when it ever happens.

Then i will keep my strongly worded comment and add bundled: back.
> 
> (In reply to Tom Rix from comment #10)
> > (In reply to Tim Flink from comment #2)
> > > Overall, it looks good to me but there are 2 bigger things I'm concerned
> > > about:
> > > 1. Do the files in tensilite/Tensile/CustomKernels/ count as pre-generated
> > > code or content?
> > >    - Is it possible to delete the CustomKernels directory and still build
> > > for gfx941 and gfx942 albeit with a less optimized setup? If so, I'd lean
> > > more towards content
> > 
> > No it does not build when the kernels are removed.
> 
> Did we ever find an answer to what we can do about these kinds of binary
> files and whether they're package-able? I remember it coming up with miopen
> but I don't remember what the conclusion was. The review went through, so I
> assume that it was determined to be kosher-enough?

You asked for them to be removed, I removed them. this crippled miopen optimizations. Now it is on Fedora to regenerate the results which needs hardware to do that. Fedora does not have this hw.  Eventually someone will notice and ask for it to be put back.

So I would rather not do the same other places.

Comment 15 Tom Rix 2024-07-26 20:41:12 UTC
Spec URL: https://trix.fedorapeople.org/hipblaslt.spec
SRPM URL: https://trix.fedorapeople.org/hipblaslt-6.1.2-1.fc40.src.rpm

Please see update with I believe all changes.

Comment 16 Fedora Review Service 2024-07-27 01:44:13 UTC
Copr build:
https://copr.fedorainfracloud.org/coprs/build/7794839
(failed)

Build log:
https://download.copr.fedorainfracloud.org/results/@fedora-review/fedora-review-2295820-hipblaslt/fedora-rawhide-x86_64/07794839-hipblaslt/builder-live.log.gz

Please make sure the package builds successfully at least for Fedora Rawhide.

- If the build failed for unrelated reasons (e.g. temporary network
  unavailability), please ignore it.
- If the build failed because of missing BuildRequires, please make sure they
  are listed in the "Depends On" field


---
This comment was created by the fedora-review-service
https://github.com/FrostyX/fedora-review-service

If you want to trigger a new Copr build, add a comment containing new
Spec and SRPM URLs or [fedora-review-service-build] string.

Comment 17 Jeremy Newton 2024-07-27 13:32:14 UTC
I don't see any of the clang related changes. My suggestion was something like this:

--- hipblaslt.spec	2024-07-27 09:07:58.103294755 -0400
+++ hipblaslt-new.spec	2024-07-27 09:29:35.897124803 -0400
@@ -6,7 +6,6 @@
 %global toolchain rocm
 # hipcc does not support some clang flags
 %global build_cxxflags %(echo %{optflags} | sed -e 's/-fstack-protector-strong/-Xarch_host -fstack-protector-strong/' -e 's/-fcf-protection/-Xarch_host -fcf-protection/')
-%global llvm_maj_ver 17
 
 # It is necessary to use this with a local build
 # export QA_RPATHS=0xff
@@ -25,18 +24,17 @@
 Source0:        %{url}/archive/rocm-%{rocm_version}.tar.gz#/%{upstreamname}-%{rocm_version}.tar.gz
 
 BuildRequires:  cmake
-BuildRequires:  clang%{llvm_maj_ver}
-BuildRequires:  clang%{llvm_maj_ver}-tools-extra
 BuildRequires:  git
 BuildRequires:  hipblas-devel
 BuildRequires:  hipcc
-BuildRequires:  lld%{llvm_maj_ver}
+BuildRequires:  hipcc-tools-extra
 BuildRequires:  msgpack-devel
 BuildRequires:  ninja-build
 BuildRequires:  rocblas-devel
 BuildRequires:  rocminfo
 BuildRequires:  rocm-cmake
 BuildRequires:  rocm-comgr-devel
+BuildRequires:  rocm-compilersupport-macros
 BuildRequires:  rocm-hip-devel
 BuildRequires:  rocm-runtime-devel
 BuildRequires:  rocm-rpm-macros
@@ -87,8 +85,8 @@
 
 # rocm path
 sed -i -e 's@rocm_path=/opt/rocm@rocm_path=/usr@'                              tensilelite/Tensile/Ops/gen_assembly.sh
-# No llvm/bin/clang, use clang++-17
-sed -i -e 's@toolchain=${rocm_path}/llvm/bin/clang++@toolchain=clang++-17@'    tensilelite/Tensile/Ops/gen_assembly.sh
+# No llvm/bin/clang, use clang++ from ROCm
+sed -i -e 's@toolchain=${rocm_path}/llvm/bin/clang++@toolchain=clang++-%{rocmllvm_version}@'    tensilelite/Tensile/Ops/gen_assembly.sh
 # Remove venv
 sed -i -e 's@. ${venv}/bin/activate@@'                                         tensilelite/Tensile/Ops/gen_assembly.sh
 sed -i -e 's@deactivate@@'                                                     tensilelite/Tensile/Ops/gen_assembly.sh
@@ -96,9 +94,9 @@
 # change rocm path from /opt/rocm to /usr
 # need to be able to find hipcc, rocm-smi, extractkernel, rocm_agent_enumerator
 sed -i -e 's@opt/rocm@usr@'                                                    tensilelite/Tensile/Common.py
-# look for clang things is 'usr' + '/lib64/llv17/bin'  or similar
+# look for clang things is 'usr' + '/lib64/llvm*/bin'  or similar
 # need to be able to find clang++, ld.lld, clang-offload-bundler
-sed -i -e 's@llvm/bin@lib64/llvm%{llvm_maj_ver}/bin@'                          tensilelite/Tensile/Common.py
+sed -i -e 's@llvm/bin@lib64/llvm%{rocmllvm_version}/bin@'                          tensilelite/Tensile/Common.py
 # Use PATH to find where TensileGetPath and other tensile bins are
 sed -i -e 's@${Tensile_PREFIX}/bin/TensileGetPath@TensileGetPath@g'            tensilelite/Tensile/cmake/TensileConfig.cmake
 
@@ -157,7 +155,7 @@
        -DCMAKE_BUILD_TYPE=RelWithDebInfo \
        -DCMAKE_C_COMPILER=hipcc \
        -DCMAKE_CXX_COMPILER=hipcc \
-       -DCMAKE_CXX_FLAGS="-fuse-ld=/usr/bin/ld.lld-%{llvm_maj_ver}" \
+       -DCMAKE_CXX_FLAGS="-fuse-ld=/usr/bin/ld.lld-%{rocmllvm_version}" \
        -DHIP_PLATFORM=amd \
        -DROCM_SYMLINK_LIBS=OFF \
        -DBUILD_WITH_TENSILE=ON \

Comment 18 Tom Rix 2024-07-29 15:58:18 UTC
Spec URL: https://trix.fedorapeople.org/hipblaslt.spec
SRPM URL: https://trix.fedorapeople.org/hipblaslt-6.1.2-1.fc40.src.rpm

Sorry, I gooffed up the scp for the last update, please see this update

Comment 19 Tim Flink 2024-07-29 18:18:53 UTC
Created attachment 2041182 [details]
specfile changes for discussion

I've attached an initial patch for the hipblaslt specfile that contains a few things, three are pretty minor:

1. I changed a few of the BuildRequires to use pkgconfig where appropriate
2. I added a different way to disable the rpath check disable when building --with test that works with mock. I'd like to have this in all of the rocm packages that have -test subpackages to make building them easier but this can wait for a more general discussion, it doesn't have to be added now if there are objections.
3. I added some missing deps from the -test subpackage

The last one is more of a discussion:

One of the BuildRequires for the -test subpackage (libomp) needs to match the llvm version AFAIK. I understand why that information was removed from the specfile and I agree with it but I'm left with a problem - the BuildRequires are evaluated before rocm-compilersupport-macros is installed and I can't use %{rocmllvm_version} macro to specify the llvm version for a BuildRequires package.

Any thoughts on how to address this? I can't think of much more than local macro for something like %{rocm_llvmversion_test} that wouldn't impact the non-test builds but whoever is doing the test builds would need to be aware of the potential/eventual sync break WRT the locally specified version.

Comment 20 Jeremy Newton 2024-07-29 20:09:22 UTC
I noticed you dropped the extra-tools package, is this on purpose?

Regarding libomp, we can make another metapackage if that's ok with you.

I'd prefer more hipcc related metapackages than maintaining llvm version in multiple specs

Comment 21 Tim Flink 2024-07-29 20:35:08 UTC
(In reply to Jeremy Newton from comment #20)
> Regarding libomp, we can make another metapackage if that's ok with you.
> 
> I'd prefer more hipcc related metapackages than maintaining llvm version in
> multiple specs

That works for me, I have no desire to have the llvm version recorded in more places than it needs to be; it's going to cause problems at some point. I only suggested putting it back in for test only because I'm not sure if anything else uses libomp.

Comment 22 Tom Rix 2024-07-29 21:34:23 UTC
(In reply to Jeremy Newton from comment #20)
> I noticed you dropped the extra-tools package, is this on purpose?
It did not exist as of this morning, i went ahead without it and it seems to work ok.

> 
> Regarding libomp, we can make another metapackage if that's ok with you.
> 
> I'd prefer more hipcc related metapackages than maintaining llvm version in
> multiple specs

libomp and all the compat vs normal clang things should be done all in one place, imo hipcc.

Comment 23 Tom Rix 2024-07-29 21:42:34 UTC
(In reply to Tim Flink from comment #19)
> Created attachment 2041182 [details]
> specfile changes for discussion
> 
> I've attached an initial patch for the hipblaslt specfile that contains a
> few things, three are pretty minor:
> 
> 1. I changed a few of the BuildRequires to use pkgconfig where appropriate
Can you explain (likely again) why pkgconfig ?  I would prefer to keep the BuildRequires as-is unless there is a MUST item somewhere you can point to.

> 2. I added a different way to disable the rpath check disable when building
> --with test that works with mock. I'd like to have this in all of the rocm
> packages that have -test subpackages to make building them easier but this
> can wait for a more general discussion, it doesn't have to be added now if
> there are objections.

Having a mockable set of -test subpackages is generally good, then we can talk about getting them into the distro.  Did you want to do this for the set ?
 
> 3. I added some missing deps from the -test subpackage
> 

I'll add these if there is another revision, but these would fall out of #2 if you wanted to do #2.

Comment 24 Tim Flink 2024-07-29 21:52:11 UTC
(In reply to Tom Rix from comment #23)
> (In reply to Tim Flink from comment #19)
> > Created attachment 2041182 [details]
> > specfile changes for discussion
> > 
> > I've attached an initial patch for the hipblaslt specfile that contains a
> > few things, three are pretty minor:
> > 
> > 1. I changed a few of the BuildRequires to use pkgconfig where appropriate
> Can you explain (likely again) why pkgconfig ?  I would prefer to keep the
> BuildRequires as-is unless there is a MUST item somewhere you can point to.

It's a SHOULD according to the packaging guidelines. I understand why its a thing but I don't personally have any strong feelings about it.

https://docs.fedoraproject.org/en-US/packaging-guidelines/PkgConfigBuildRequires/

> > 2. I added a different way to disable the rpath check disable when building
> > --with test that works with mock. I'd like to have this in all of the rocm
> > packages that have -test subpackages to make building them easier but this
> > can wait for a more general discussion, it doesn't have to be added now if
> > there are objections.
> 
> Having a mockable set of -test subpackages is generally good, then we can
> talk about getting them into the distro.  Did you want to do this for the
> set ?

Beyond getting the -test packages into the distro, it makes them easier to build in copr and makes them easier to build for me personally since I almost always use mock for local package builds.

Like I said, I don't care a ton if it's done right now. It's something that I'm definitely going to be proposing for all the ROCm packages once I'm able to to spend more time testing them - the builtin packages seem to be the best place to start.

> > 3. I added some missing deps from the -test subpackage
> > 
> 
> I'll add these if there is another revision, but these would fall out of #2
> if you wanted to do #2.

I don't understand what those deps have to do with making -test buildable in mock. The current package doesn't build --with test even if  %global __brp_check_rpaths %{nil} is used instead of QA_RPATHS=0xff. The build deps I added fix that build problem.

Comment 25 Tim Flink 2024-07-29 21:58:24 UTC
This feels like a dumb question but I'll ask it anyways - do we care if hipblaslt is usable when installed on its own?

I assume that we do but at the same time, it sounds like the only know use case for it right now is using new pytorch on datacenter cards.

I know that this package at least needs hipblas and I think it needs python3-tensile but neither of those are installed if you install hipblaslt (or hipblaslt-test) on its own.

On a related note, environment-modules (needed for 'module load rocm/gfx11') isn't installed either but I think that's an issue with more rocm packages than just hipblaslt. That's on my todo list but I haven't seen complaints about it so it's not terribly high on the priority list.

Comment 26 Tom Rix 2024-07-31 01:44:48 UTC
(In reply to Tim Flink from comment #25)
> This feels like a dumb question but I'll ask it anyways - do we care if
> hipblaslt is usable when installed on its own?
> 
> I assume that we do but at the same time, it sounds like the only know use
> case for it right now is using new pytorch on datacenter cards.
The main reason that i know of is that pytorch assumes it is there.
Yes afaik this is only usable by data center cards.

A middle term use would be to run fedora pytorch containers on rhel datacenter cards.
Another middle term use would be this would be the set needed to run on rhel.
 
> 
> I know that this package at least needs hipblas and I think it needs
> python3-tensile but neither of those are installed if you install hipblaslt
> (or hipblaslt-test) on its own.

python-tensile is needed only i think to generate blas routines
hibblaslt has its own fork to do similar.
it would be hard for me to check hipbaslt-test now.
> 
> On a related note, environment-modules (needed for 'module load rocm/gfx11')
> isn't installed either but I think that's an issue with more rocm packages
> than just hipblaslt. That's on my todo list but I haven't seen complaints
> about it so it's not terribly high on the priority list.

Comment 27 Tom Rix 2024-07-31 12:03:52 UTC
Spec URL: https://trix.fedorapeople.org/hipblaslt.spec
SRPM URL: https://trix.fedorapeople.org/hipblaslt-6.1.2-1.fc40.src.rpm

I took changes for -test mostly as-is.

Comment 28 Fedora Review Service 2024-07-31 18:07:51 UTC
Created attachment 2043178 [details]
The .spec file difference from Copr build 7794839 to 7809096

Comment 29 Fedora Review Service 2024-07-31 18:07:54 UTC
Copr build:
https://copr.fedorainfracloud.org/coprs/build/7809096
(failed)

Build log:
https://download.copr.fedorainfracloud.org/results/@fedora-review/fedora-review-2295820-hipblaslt/fedora-rawhide-x86_64/07809096-hipblaslt/builder-live.log.gz

Please make sure the package builds successfully at least for Fedora Rawhide.

- If the build failed for unrelated reasons (e.g. temporary network
  unavailability), please ignore it.
- If the build failed because of missing BuildRequires, please make sure they
  are listed in the "Depends On" field


---
This comment was created by the fedora-review-service
https://github.com/FrostyX/fedora-review-service

If you want to trigger a new Copr build, add a comment containing new
Spec and SRPM URLs or [fedora-review-service-build] string.

Comment 30 Jeremy Newton 2024-08-01 17:06:03 UTC
> It did not exist as of this morning, i went ahead without it and it seems to work ok.

OK I'll add the hipcc-libomp package, and drop the extra tools subpackage

Comment 31 Tim Flink 2024-08-01 23:31:36 UTC
Created attachment 2043248 [details]
clean up test buildrequires, update libomp buildrequires

I changed the buildrequires for -test to match the rest of the buildrequires and updated the libomp buildrequires to match the change that Jeremy is making. This won't build --with test until hipcc-libomp package is available.

Comment 32 Jeremy Newton 2024-08-02 00:25:35 UTC
Ok, added hipcc-libomp-devel here:
https://koji.fedoraproject.org/koji/taskinfo?taskID=121371311

Note that Tim's patch excludes the "devel" (I realise my last comment was confusing).

Comment 33 Tim Flink 2024-08-02 01:07:37 UTC
Created attachment 2043251 [details]
clean up test buildrequires, update libomp buildrequires correctly

libomp-devel is important; patch is updated

Thanks for the catch, Jeremy.

Comment 34 Tom Rix 2024-08-02 12:14:07 UTC
Spec URL: https://trix.fedorapeople.org/hipblaslt.spec
SRPM URL: https://trix.fedorapeople.org/hipblaslt-6.1.2-1.fc40.src.rpm

changes to test taken as-is

Comment 35 Fedora Review Service 2024-08-02 14:19:55 UTC
Created attachment 2043325 [details]
The .spec file difference from Copr build 7809096 to 7819168

Comment 36 Fedora Review Service 2024-08-02 14:19:58 UTC
Copr build:
https://copr.fedorainfracloud.org/coprs/build/7819168
(succeeded)

Review template:
https://download.copr.fedorainfracloud.org/results/@fedora-review/fedora-review-2295820-hipblaslt/fedora-rawhide-x86_64/07819168-hipblaslt/fedora-review/review.txt

Found issues:

- No gcc, gcc-c++ or clang found in BuildRequires
  Read more: https://docs.fedoraproject.org/en-US/packaging-guidelines/C_and_C++/

Please know that there can be false-positives.

---
This comment was created by the fedora-review-service
https://github.com/FrostyX/fedora-review-service

If you want to trigger a new Copr build, add a comment containing new
Spec and SRPM URLs or [fedora-review-service-build] string.

Comment 37 Jeremy Newton 2024-08-02 14:41:05 UTC
Looks good from my side, Tim do you want to proceed with review?

Comment 38 Tim Flink 2024-08-02 17:01:48 UTC
I'm not crazy about the generated bits being in /usr/lib instead of being in /usr/libexec but that does seem to be how both Debian and Fedora handle rocblas, so I conclude that it's not really an issue for review since it would take upstream changes to fix that in any realistic way.

The rest of the package is fine, though.

Comment 39 Fedora Admin user for bugzilla script actions 2024-08-02 17:29:14 UTC
The Pagure repository was created at https://src.fedoraproject.org/rpms/hipblaslt

Comment 40 djip007 2024-08-18 21:27:01 UTC
Is it possible to add gfx11+ (gfx1103 in my case)?


Note You need to log in before you can comment on or make changes to this bug.