Bug 2277002 - rocm-opencl 6.0.2 does not work with gfx803 cards currently
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: rocclr
Version: 40
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Jeremy Newton
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2024-04-24 20:41 UTC by Mike Hedman
Modified: 2024-10-06 10:38 UTC

Fixed In Version: rocclr-6.2.1-3.fc42 rocclr-6.1.2-2.fc40
Clone Of:
Environment:
Last Closed: 2024-09-27 18:17:37 UTC
Type: ---
Embargoed:


Attachments
unified diff for the fix. (1.66 KB, patch)
2024-04-24 23:13 UTC, Mike Hedman

Description Mike Hedman 2024-04-24 20:41:59 UTC
I upgraded to Fedora 40 yesterday, and clinfo no longer shows my GPU as available
(gfx803, AMD RX 570, Polaris). The spec file in the source RPM still has a sed command to enable the parameter (ROC_ENABLE_PRE_VEGA, restored in the diff below), but the parameter no longer exists in device.hpp.
I downloaded and debugged the source RPM, and I suggest the following diff to fix the issue:

diff -r clr-rocm-6.0.2/rocclr/device/device.hpp clr-rocm-6.0.2.fixed/rocclr/device/device.hpp
1390c1390
<     if (!IS_HIP && (versionMajor_ == 8)) {
---
>     if (!IS_HIP && !ROC_ENABLE_PRE_VEGA && (versionMajor_ == 8)) {
diff -r clr-rocm-6.0.2/rocclr/utils/flags.hpp clr-rocm-6.0.2.fixed/rocclr/utils/flags.hpp
193a194,195
> release(bool, ROC_ENABLE_PRE_VEGA, false,                                     \
>         "Enable support of pre-vega ASICs in ROCm path")                      \
238a241
> 


(I have built and tested this on my own desktop and it now works).
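
The effect of the one-line change in device.hpp can be illustrated with a small standalone model. This is only a sketch: the `Isa` struct, the `gate_sketch` namespace, and the function names are hypothetical stand-ins, and the real `runtimeRocSupported()` is a member function that reads the globals `IS_HIP` and `ROC_ENABLE_PRE_VEGA` rather than taking them as arguments.

```cpp
#include <cassert>

namespace gate_sketch {

// Hypothetical stand-in for the ISA record; only the two values the check
// reads are modelled here.
struct Isa {
  int versionMajor;          // 8 for gfx803 (Polaris)
  bool runtimeRocSupported;  // what the ISA table reports for the ROCm path
};

// Check as shipped in 6.0.2 (the "<" line of the diff): under OpenCL
// (not HIP), every gfx8 device is rejected unconditionally.
inline bool supportedBeforePatch(const Isa& isa, bool isHip) {
  if (!isHip && isa.versionMajor == 8) return false;
  return isa.runtimeRocSupported;
}

// Check after the proposed patch (the ">" line): the rejection applies only
// while the pre-Vega flag is off.
inline bool supportedAfterPatch(const Isa& isa, bool isHip, bool enablePreVega) {
  if (!isHip && !enablePreVega && isa.versionMajor == 8) return false;
  return isa.runtimeRocSupported;
}

// Small usage example for an OpenCL process on a gfx803 card.
inline void example() {
  const Isa polaris = {8, true};
  assert(!supportedBeforePatch(polaris, /*isHip=*/false));
  assert(supportedAfterPatch(polaris, /*isHip=*/false, /*enablePreVega=*/true));
}

}  // namespace gate_sketch
```

Note that the patch declares the flag with a default of false, so in this model the card only becomes visible once the flag ends up true, for example through the sed step in the spec file mentioned above.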


Mike Hedman

Reproducible: Always

Steps to Reproduce:
1. Run clpeak
2. Run clinfo
3. Run darktable-cltest
Actual Results:
None of them shows my card as available.

Expected Results:
clpeak runs its performance tests on the GPU, and all three tools show the GPU as available.

Comment 1 Mike Hedman 2024-04-24 22:44:42 UTC
This context diff is probably more useful:

MATH16-16 >> diff -cr clr-rocm-6.0.2/ clr-rocm-6.0.2.fixed/ 
diff -cr clr-rocm-6.0.2/rocclr/device/device.hpp clr-rocm-6.0.2.fixed/rocclr/device/device.hpp
*** clr-rocm-6.0.2/rocclr/device/device.hpp	2024-01-04 21:16:33.000000000 -0600
--- clr-rocm-6.0.2.fixed/rocclr/device/device.hpp	2024-04-24 14:41:20.762112175 -0500
***************
*** 1387,1393 ****
  
    /// @returns If the ROCm runtime supports the ISA.
    bool runtimeRocSupported() const {
!     if (!IS_HIP && (versionMajor_ == 8)) {
        return false;
      }
      return runtimeRocSupported_;
--- 1387,1393 ----
  
    /// @returns If the ROCm runtime supports the ISA.
    bool runtimeRocSupported() const {
!     if (!IS_HIP && !ROC_ENABLE_PRE_VEGA && (versionMajor_ == 8)) {
        return false;
      }
      return runtimeRocSupported_;
diff -cr clr-rocm-6.0.2/rocclr/utils/flags.hpp clr-rocm-6.0.2.fixed/rocclr/utils/flags.hpp
*** clr-rocm-6.0.2/rocclr/utils/flags.hpp	2024-01-04 21:16:33.000000000 -0600
--- clr-rocm-6.0.2.fixed/rocclr/utils/flags.hpp	2024-04-24 14:26:38.483569254 -0500
***************
*** 191,196 ****
--- 191,198 ----
          "Enable system scope for signals (uses interrupts).")                 \
  release(bool, GPU_FORCE_QUEUE_PROFILING, false,                               \
          "Force command queue profiling by default")                           \
+ release(bool, ROC_ENABLE_PRE_VEGA, false,                                     \
+         "Enable support of pre-vega ASICs in ROCm path")                      \
  release(bool, HIP_MEM_POOL_SUPPORT, false,                                    \
          "Enables memory pool support in HIP")                                 \
  release(bool, HIP_MEM_POOL_USE_VM, IS_WINDOWS,                                \
***************
*** 237,242 ****
--- 239,245 ----
  release(cstring, HIPRTC_LINK_OPTIONS_APPEND, "",                              \
          "Set link options needed for hiprtc compilation")                     \
  
+ 
  namespace amd {
  
  extern bool IS_HIP;
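
The `release(bool, NAME, default, "help")` entries in flags.hpp follow an X-macro pattern: one list of flags is expanded several times with different macro definitions. The sketch below is a simplified model of that pattern, not the rocclr implementation. The `DEMO_FLAGS` list, the `flag_sketch` namespace, `parseBool`, and `loadFlagsFromEnvironment` are all hypothetical names, and the assumption that each flag can be overridden by an environment variable of the same name should be checked against the rocclr sources.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical two-entry flag list in the same shape as the flags.hpp entries.
#define DEMO_FLAGS(release)                                       \
  release(bool, GPU_FORCE_QUEUE_PROFILING, false,                 \
          "Force command queue profiling by default")             \
  release(bool, ROC_ENABLE_PRE_VEGA, false,                       \
          "Enable support of pre-vega ASICs in ROCm path")

namespace flag_sketch {

// First expansion: define one global variable per flag, set to its default.
#define DEFINE_FLAG(type, name, value, help) type name = value;
DEMO_FLAGS(DEFINE_FLAG)
#undef DEFINE_FLAG

// Interpret an environment value as a bool: unset or empty keeps the
// fallback, "0" means false, anything else means true.
inline bool parseBool(const char* text, bool fallback) {
  if (text == nullptr || *text == '\0') return fallback;
  return std::strcmp(text, "0") != 0;
}

// Second expansion: reload every flag from the environment variable that
// shares its name (assumed behaviour, see the note above).
inline void loadFlagsFromEnvironment() {
#define LOAD_FLAG(type, name, value, help) \
  name = parseBool(std::getenv(#name), value);
  DEMO_FLAGS(LOAD_FLAG)
#undef LOAD_FLAG
}

// Quick sanity checks for the parsing helper.
inline void example() {
  assert(parseBool(nullptr, true));
  assert(!parseBool("0", true));
  assert(parseBool("1", false));
}

}  // namespace flag_sketch
```

In this model, adding the two `ROC_ENABLE_PRE_VEGA` lines to the list is enough to get both the variable and its environment lookup, which is why the flags.hpp half of the patch is so small.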

Comment 2 Mike Hedman 2024-04-24 23:13:30 UTC
Created attachment 2028965
unified diff for the fix.

Comment 3 Mike Hedman 2024-04-25 15:21:48 UTC
I probably should have demonstrated this bug better. Here it is with the Fedora-built rocm-opencl, and then with the one I built with the fixes:

MATH16-16 >> sudo dnf erase rocm-opencl
Dependencies resolved.
================================================================================
 Package        Architecture   Version         Repository       Size
================================================================================
Removing:
 rocm-opencl    x86_64         6.0.2-2.fc40    @@commandline    1.7 M

Transaction Summary
================================================================================
Remove  1 Package

Freed space: 1.7 M
Is this ok [y/N]: y
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
  Preparing        :                                    1/1
  Erasing          : rocm-opencl-6.0.2-2.fc40.x86_64    1/1

Removed:
  rocm-opencl-6.0.2-2.fc40.x86_64

Complete!
MATH16-16 >> sudo dnf install rocm-opencl
Last metadata expiration check: 1:21:17 ago on Thu 25 Apr 2024 08:50:45 AM CDT.
Dependencies resolved.
================================================================================
 Package        Architecture   Version         Repository    Size
================================================================================
Installing:
 rocm-opencl    x86_64         6.0.2-2.fc40    updates       560 k

Transaction Summary
================================================================================
Install  1 Package

Total download size: 560 k
Installed size: 1.7 M
Is this ok [y/N]: y
Downloading Packages:
rocm-opencl-6.0.2-2.fc40.x86_64.rpm        1.7 MB/s | 560 kB     00:00
--------------------------------------------------------------------------------
Total                                      636 kB/s | 560 kB     00:00
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
  Preparing        :                                    1/1
  Installing       : rocm-opencl-6.0.2-2.fc40.x86_64    1/1
  Running scriptlet: rocm-opencl-6.0.2-2.fc40.x86_64    1/1

Installed:
  rocm-opencl-6.0.2-2.fc40.x86_64

Complete!
MATH16-16 >> clpeak

Platform: AMD Accelerated Parallel Processing
clCreateContextFromType (-1)
MATH16-16 >> sudo dnf erase rocm-opencl
Dependencies resolved.
================================================================================
 Package        Architecture   Version         Repository    Size
================================================================================
Removing:
 rocm-opencl    x86_64         6.0.2-2.fc40    @updates      1.7 M

Transaction Summary
================================================================================
Remove  1 Package

Freed space: 1.7 M
Is this ok [y/N]: y
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
  Preparing        :                                    1/1
  Erasing          : rocm-opencl-6.0.2-2.fc40.x86_64    1/1

Removed:
  rocm-opencl-6.0.2-2.fc40.x86_64

Complete!
MATH16-16 >> sudo su
root@math16-16:/home/mejh# sudo dnf install /root/rpmbuild/RPMS/x86_64/rocm-opencl-6.0.2-2.fc40.x86_64.rpm
Last metadata expiration check: 1:23:22 ago on Thu 25 Apr 2024 08:50:45 AM CDT.
Dependencies resolved.
================================================================================
 Package        Architecture   Version         Repository      Size
================================================================================
Installing:
 rocm-opencl    x86_64         6.0.2-2.fc40    @commandline    558 k

Transaction Summary
================================================================================
Install  1 Package

Total size: 558 k
Installed size: 1.7 M
Is this ok [y/N]: y
Downloading Packages:
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
  Preparing        :                                    1/1
  Installing       : rocm-opencl-6.0.2-2.fc40.x86_64    1/1
  Running scriptlet: rocm-opencl-6.0.2-2.fc40.x86_64    1/1

Installed:
  rocm-opencl-6.0.2-2.fc40.x86_64

Complete!
root@math16-16:/home/mejh# clpeak

Platform: AMD Accelerated Parallel Processing
  Device: gfx803
    Driver version  : 3602.0 (HSA1.1,LC) (Linux x64)
    Compute units   : 32
    Clock frequency : 1280 MHz

    Global memory bandwidth (GBPS)
      float   : 182.74
      float2  : 184.54
      float4  : 173.95
      float8  : 175.74
      float16 : 155.37

    Single-precision compute (GFLOPS)
      float   : 5235.18
      float2  : 5224.56
      float4  : 5216.28
      float8  : 5190.12
      float16 : 5094.68

    Half-precision compute (GFLOPS)
      half   : 5247.79
      half2  : 4977.14
      half4  : 4750.65
      half8  : 4745.49
      half16 : 4687.70

    Double-precision compute (GFLOPS)
      double   : 332.45
      double2  : 332.25
      double4  : 331.75
      double8  : 330.94
      double16 : 329.69

    Integer compute (GIOPS)
      int   : 1060.33
      int2  : 1059.92
      int4  : 1058.45
      int8  : 1052.36
      int16 : 1051.34

    Integer compute Fast 24bit (GIOPS)
      int   : 5093.88
      int2  : 4697.76
      int4  : 4505.15
      int8  : 4464.24
      int16 : 4144.24

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer              : 18.61
      enqueueReadBuffer               : 4.78
      enqueueWriteBuffer non-blocking : 18.67
      enqueueReadBuffer non-blocking  : 4.78
      enqueueMapBuffer(for read)      : 776749.38
        memcpy from mapped ptr        : 4.78
      enqueueUnmap(after write)       : 1404123.75
        memcpy to mapped ptr          : 18.73

    Kernel launch latency : 7.84 us

root@math16-16:/home/mejh#

Comment 4 Mike Hedman 2024-04-25 15:24:01 UTC
MATH16-16 >> ./rocm_agent_enumerator 
gfx000
gfx803
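
The agent name reported here ties back to the `versionMajor_ == 8` test in the patch. As a rough illustration, the sketch below splits a name such as gfx803 into major, minor, and stepping parts. It assumes the usual AMDGPU target naming, where the last two characters are the minor version and the stepping (each a single hex digit) and everything between "gfx" and those two characters is the decimal major version. The `gfx_sketch` namespace, `GfxVersion`, and `parseGfxName` are hypothetical names and are not part of rocclr.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

namespace gfx_sketch {

struct GfxVersion {
  bool valid;        // false when the name does not match the assumed layout
  int majorVersion;  // 8 for gfx803
  int minorVersion;  // 0 for gfx803
  int stepping;      // 3 for gfx803
};

// Value of a lowercase hex digit, or -1 if the character is not one.
inline int hexDigitValue(char c) {
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  return -1;
}

// Parse "gfx" + decimal major + one hex minor digit + one hex stepping digit.
inline GfxVersion parseGfxName(const std::string& name) {
  const GfxVersion invalid = {false, 0, 0, 0};
  if (name.size() < 6 || name.compare(0, 3, "gfx") != 0) return invalid;

  const std::size_t minorPos = name.size() - 2;
  int majorValue = 0;
  for (std::size_t i = 3; i < minorPos; ++i) {
    if (name[i] < '0' || name[i] > '9') return invalid;
    majorValue = majorValue * 10 + (name[i] - '0');
  }

  const int minorValue = hexDigitValue(name[minorPos]);
  const int steppingValue = hexDigitValue(name[minorPos + 1]);
  if (minorValue < 0 || steppingValue < 0) return invalid;

  const GfxVersion parsed = {true, majorValue, minorValue, steppingValue};
  return parsed;
}

// Usage example: the RX 570 from this report falls under major version 8.
inline void example() {
  const GfxVersion v = parseGfxName("gfx803");
  assert(v.valid && v.majorVersion == 8);
}

}  // namespace gfx_sketch
```

Under that assumption, gfx803 has major version 8, which is exactly the case the unpatched check rejects for OpenCL.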

Comment 5 Jeremy Newton 2024-05-07 19:24:36 UTC
Thanks. I was talking with upstream, and they're OK with re-enabling GFX8 by default (I think?).

I'm working on upgrading to 6.1 in rawhide right now. I can put this on my to-do list and backport any fixes to Fedora 40.

Comment 6 Mike Hedman 2024-05-07 19:42:35 UTC
Very cool!  Thanks!

Comment 7 Mike Hedman 2024-07-28 23:30:22 UTC
The 6.1.2-1.fc40.x86_64 update to the rocm packages breaks my fix, but after applying the patch to the new source RPM and rebuilding, I am able to use my gfx803 card (RX 570) at this patch level.
It would be great to have this card enabled.

Comment 8 Jeremy Newton 2024-08-01 20:48:42 UTC
Sorry, I forgot about this.

You applied attachment 2028965 and it works, right?

Comment 9 Mike Hedman 2024-08-01 22:30:13 UTC
Yes.  I downloaded the source rpm and applied the patch/rebuilt.  I have a working version on my desktop.

Comment 10 Fedora Admin user for bugzilla script actions 2024-08-22 18:49:01 UTC
This package has changed maintainer in Fedora. Reassigning to the new maintainer of this component.

Comment 11 Jeremy Newton 2024-09-27 18:01:36 UTC
Sorry, I missed this because rocm-opencl moved to the rocclr source package.

It looks like the fix required a bit more investigation:
https://github.com/ROCm/clr/pull/97/commits/909fa3dcb644f7ca422ed1a980a54ac426d831b1

I have an update coming down the pipe for f40/41/rawhide.

Comment 12 Fedora Update System 2024-09-27 18:10:56 UTC
FEDORA-2024-6d3ba84c4e (rocclr-6.1.2-2.fc40) has been submitted as an update to Fedora 40.
https://bodhi.fedoraproject.org/updates/FEDORA-2024-6d3ba84c4e

Comment 13 Fedora Update System 2024-09-27 18:12:44 UTC
FEDORA-2024-84b3f56e16 (rocclr-6.2.1-3.fc42) has been submitted as an update to Fedora 42.
https://bodhi.fedoraproject.org/updates/FEDORA-2024-84b3f56e16

Comment 14 Fedora Update System 2024-09-27 18:17:37 UTC
FEDORA-2024-84b3f56e16 (rocclr-6.2.1-3.fc42) has been pushed to the Fedora 42 stable repository.
If problem still persists, please make note of it in this bug report.

Comment 15 Jeremy Newton 2024-09-27 19:53:53 UTC
Fixed in rawhide. For the Fedora 40 fix, see:
https://bodhi.fedoraproject.org/updates/FEDORA-2024-6d3ba84c4e

Comment 16 Fedora Update System 2024-09-28 02:34:14 UTC
FEDORA-2024-6d3ba84c4e has been pushed to the Fedora 40 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2024-6d3ba84c4e`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2024-6d3ba84c4e

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 17 Mike Hedman 2024-09-28 12:52:59 UTC
Hi.  It looks good!

Comment 18 Fedora Update System 2024-10-06 02:11:29 UTC
FEDORA-2024-6d3ba84c4e (rocclr-6.1.2-2.fc40) has been pushed to the Fedora 40 stable repository.
If problem still persists, please make note of it in this bug report.

Comment 19 Germano Massullo (Thetra) 2024-10-06 10:38:04 UTC
On F40 I am testing the patch from
https://github.com/ROCm/ROCm/issues/3664#issuecomment-2395030931
I have not had time to test it much, but it seems to work well with darktable.

