Bug 1433632

Summary: clpeak hangs trying to access AMD GPU using mesa-libOpenCL
Product: [Fedora] Fedora
Reporter: M. Edward (Ed) Borasky <znmeb>
Component: clpeak
Assignee: Fabian Deutsch <fabian.deutsch>
Status: CLOSED EOL
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: unspecified
Version: 25
CC: abrahm.scully, fabian.deutsch
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Last Closed: 2017-12-12 10:42:01 UTC
Type: Bug
Attachments:
Description | Flags
clinfo for the system | none
log of successful clpeak run on platform 1 (pocl - uses CPU, not GPU) | none

Description M. Edward (Ed) Borasky 2017-03-19 00:00:33 UTC
Created attachment 1264468: clinfo for the system

Description of problem: clpeak hangs with 100% CPU trying to access "Clover" GPU


Version-Release number of selected component (if applicable):
Name        : clpeak
Arch        : x86_64
Epoch       : 0
Version     : 0.1
Release     : 11.20160207git1f90347.fc24
Size        : 135 k
Repo        : @System
From repo   : fedora

Name        : mesa-libOpenCL
Arch        : x86_64
Epoch       : 0
Version     : 13.0.4
Release     : 2.fc25
Size        : 1.9 M
Repo        : @System
From repo   : updates


How reproducible: always happens


Steps to Reproduce:
1. Install 'clinfo', 'clpeak', 'pocl' and 'mesa-libOpenCL'
2. Run clinfo. On my system, it shows the GPU and "pocl", which uses the CPU. Output file is attached.
3. Run clpeak against each platform. On my system 'Clover' (the GPU) is platform 0 and 'pocl' is platform 1. 'clpeak -p 1' runs fine (log attached), but 'clpeak -p 0' starts up and then sits there at 100% CPU.
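
Because the failure is a hang rather than a crash, step 3 is easier to script with a timeout. The following is only a sketch: the helper names `run_with_timeout` and `check_platforms`, the 120-second limit, and the result labels are made up for illustration; only the `clpeak -p N` invocation comes from this report.

```python
import subprocess


def run_with_timeout(cmd, timeout_s):
    """Run cmd and classify the outcome.

    Returns 'completed' (exit code 0), 'failed' (non-zero exit code),
    'hung' (still running after timeout_s seconds; the child is killed),
    or 'not found' (the executable could not be started).
    """
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return "hung"
    except FileNotFoundError:
        return "not found"
    return "completed" if result.returncode == 0 else "failed"


def check_platforms(platforms=(0, 1), timeout_s=120):
    """Run 'clpeak -p N' for each platform index and collect the outcomes.

    The platform indices and the timeout are assumptions; take the real
    indices from the clinfo output on the machine being tested.
    """
    return {p: run_with_timeout(["clpeak", "-p", str(p)], timeout_s)
            for p in platforms}
```

On the system described here, `check_platforms()` would be expected to report platform 1 as completed and platform 0 as hung, provided the timeout is longer than a normal clpeak run.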

Actual results:
$ clpeak -p 0

Platform: Clover
  Device: AMD BONAIRE (DRM 2.48.0 / 4.9.14-200.fc25.x86_64, LLVM 3.9.1)
    Driver version  : 13.0.4 (Linux x64)
    Compute units   : 14
    Clock frequency : 1075 MHz

followed by nothing happening


Expected results: a full run of clpeak against the AMD Bonaire GPU


Additional info: I don't think this is a clpeak problem but I couldn't find mesa-libOpenCL in the bug report interface.

Comment 1 M. Edward (Ed) Borasky 2017-03-19 00:03:00 UTC
Created attachment 1264469 [details]
log of successful clpeak run on platform 1 (pocl - uses CPU, not GPU)

Comment 2 M. Edward (Ed) Borasky 2017-04-25 08:12:59 UTC
I've got some free time this week - is there a how-to on tracing OpenCL / AMD GPU bugs like this with GDB?
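
For reference, one general way to see where a process spinning at 100% CPU is stuck is to attach gdb to it and dump a backtrace of every thread. This is a rough sketch, not an OpenCL-specific how-to: the helper names are hypothetical, attaching may require root depending on the system's ptrace restrictions, and without debug symbols installed many frames will be unnamed.

```python
import subprocess


def gdb_backtrace_cmd(pid):
    """Build a gdb command line that attaches to pid, prints a backtrace
    for every thread, and then exits (batch mode)."""
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt"]


def dump_backtraces(pid):
    """Run gdb against pid and return its combined output as text.

    Assumes gdb is installed and that the caller is allowed to attach
    to the target process.
    """
    result = subprocess.run(
        gdb_backtrace_cmd(pid),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    return result.stdout
```

Calling `dump_backtraces(pid)` with the PID of the hung `clpeak -p 0` process a few times in a row and comparing the top frames should show whether it is looping in the same place.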

Comment 3 M. Edward (Ed) Borasky 2017-04-25 22:23:53 UTC
I'm moving ahead with the GDB stuff ... in the process I've run into a gdb bug, which I'll search for. I'm also going to dual-boot the machine with Fedora 26 and see if this is still there; not much point in troubleshooting it on 25 when 26 is in alpha. ;-)

Comment 4 M. Edward (Ed) Borasky 2017-04-26 06:55:24 UTC
The gdb issue is bug 1367131.

Comment 5 M. Edward (Ed) Borasky 2017-04-27 22:29:37 UTC
This upstream bug looks like the same issue: https://bugs.freedesktop.org/show_bug.cgi?id=96897

Comment 6 M. Edward (Ed) Borasky 2017-05-08 06:02:02 UTC
Also on GitHub: https://github.com/krrishnarraj/clpeak/issues/32

Comment 7 Fedora End Of Life 2017-11-16 18:58:26 UTC
This message is a reminder that Fedora 25 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 25. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora 'version'
of '25'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 25 reaches end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.

Comment 8 Fedora End Of Life 2017-12-12 10:42:01 UTC
Fedora 25 changed to end-of-life (EOL) status on 2017-12-12. Fedora 25 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.