Bug 1368500 - random radeon drivers fail. drivers/gpu/drm/radeon/radeon_object.c:84 radeon_ttm_bo_destroy+0xe3/0xf0 [NEEDINFO]
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 23
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-08-19 15:00 UTC by Andrej Manduch
Modified: 2016-10-26 16:58 UTC (History)

Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-10-26 16:58:14 UTC
Type: Bug
labbott: needinfo? (amanduch)


Description Andrej Manduch 2016-08-19 15:00:25 UTC
Description of problem:

Sometimes the radeon driver triggers a kernel warning at drivers/gpu/drm/radeon/radeon_object.c:84 and renders the machine unusable.

Example stack trace:
Aug 19 10:38:40 borg kernel: ------------[ cut here ]------------
Aug 19 10:38:40 borg kernel: WARNING: CPU: 7 PID: 3384 at drivers/gpu/drm/radeon/radeon_object.c:84 radeon_ttm_bo_destroy+0xe3/0xf0 [radeon]
Aug 19 10:38:40 borg kernel: Modules linked in: fuse tun ip6t_REJECT nf_reject_ipv6 ip6t_rpfilter xt_conntrack ip_set nfnetlink ebtable_broute bridge stp llc ebtable_nat ip6table_mangle ip6table_raw ip6table_nat nf_conntrack_ipv6 nf_defr
Aug 19 10:38:40 borg kernel:  snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd mei_me soundcore mei tpm_tis wmi e1000e ptp tpm shpchp lpc_ich i2c_i801 pps_core nfsd auth_rpcgss nfs_acl lockd grace binfmt_misc sunrpc btrfs xor raid6
Aug 19 10:38:40 borg kernel: CPU: 7 PID: 3384 Comm: ImgDecoder #9 Tainted: G        W       4.6.6-200.fc23.x86_64 #1
Aug 19 10:38:40 borg kernel: Hardware name: MSI MS-7885/X99S SLI PLUS (MS-7885), BIOS 1.80 03/20/2015
Aug 19 10:38:40 borg kernel:  0000000000000286 00000000197134c1 ffff8803f37bf950 ffffffff813d93fe
Aug 19 10:38:40 borg kernel:  0000000000000000 0000000000000000 ffff8803f37bf990 ffffffff810a730b
Aug 19 10:38:40 borg kernel:  00000054c00fcd99 ffff8803f59d0868 ffff8803f59d0800 ffffffffffffffff
Aug 19 10:38:40 borg kernel: Call Trace:
Aug 19 10:38:40 borg kernel:  [<ffffffff813d93fe>] dump_stack+0x63/0x85
Aug 19 10:38:40 borg kernel:  [<ffffffff810a730b>] __warn+0xcb/0xf0
Aug 19 10:38:40 borg kernel:  [<ffffffff810a743d>] warn_slowpath_null+0x1d/0x20
Aug 19 10:38:40 borg kernel:  [<ffffffffc00fcdb3>] radeon_ttm_bo_destroy+0xe3/0xf0 [radeon]
Aug 19 10:38:40 borg kernel:  [<ffffffffc008e194>] ttm_bo_release_list+0xa4/0x140 [ttm]
Aug 19 10:38:40 borg kernel:  [<ffffffffc008e59b>] ttm_bo_release+0x19b/0x230 [ttm]
Aug 19 10:38:40 borg kernel:  [<ffffffffc008e654>] ttm_bo_unref+0x24/0x30 [ttm]
Aug 19 10:38:40 borg kernel:  [<ffffffffc00fd2e9>] radeon_bo_unref+0x39/0x70 [radeon]
Aug 19 10:38:40 borg kernel:  [<ffffffffc01105e7>] radeon_gem_object_free+0x57/0x70 [radeon]
Aug 19 10:38:40 borg kernel:  [<ffffffffc003c2b0>] drm_gem_object_free+0x30/0x50 [drm]
Aug 19 10:38:40 borg kernel:  [<ffffffffc003cc74>] drm_gem_object_handle_unreference_unlocked+0xc4/0x110 [drm]
Aug 19 10:38:40 borg kernel:  [<ffffffffc003cd15>] drm_gem_object_release_handle+0x55/0xa0 [drm]
Aug 19 10:38:40 borg kernel:  [<ffffffff813d9e7e>] idr_for_each+0xae/0x110
Aug 19 10:38:40 borg kernel:  [<ffffffffc003ccc0>] ? drm_gem_object_handle_unreference_unlocked+0x110/0x110 [drm]
Aug 19 10:38:40 borg kernel:  [<ffffffffc003d3e0>] drm_gem_release+0x20/0x30 [drm]
Aug 19 10:38:40 borg kernel:  [<ffffffffc003c195>] drm_release+0x385/0x470 [drm]
Aug 19 10:38:40 borg kernel:  [<ffffffff81249e8f>] __fput+0xdf/0x1f0
Aug 19 10:38:40 borg kernel:  [<ffffffff81249fde>] ____fput+0xe/0x10
Aug 19 10:38:40 borg kernel:  [<ffffffff810c5121>] task_work_run+0x81/0xa0
Aug 19 10:38:40 borg kernel:  [<ffffffff810ab082>] do_exit+0x2d2/0xb50
Aug 19 10:38:40 borg kernel:  [<ffffffff810ab987>] do_group_exit+0x47/0xb0
Aug 19 10:38:40 borg kernel:  [<ffffffff810b6ca1>] get_signal+0x291/0x610
Aug 19 10:38:40 borg kernel:  [<ffffffff8102e137>] do_signal+0x37/0x710
Aug 19 10:38:40 borg kernel:  [<ffffffff81129619>] ? do_futex+0x2d9/0xb40
Aug 19 10:38:40 borg kernel:  [<ffffffff8100320c>] exit_to_usermode_loop+0x8c/0xd0
Aug 19 10:38:40 borg kernel:  [<ffffffff81003d21>] syscall_return_slowpath+0xa1/0xb0
Aug 19 10:38:40 borg kernel:  [<ffffffff817dac7a>] entry_SYSCALL_64_fastpath+0xa2/0xa4
Aug 19 10:38:40 borg kernel: ---[ end trace b6515ba81564391a ]---


After the machine has been running for some time, the kernel prints many (100+) of the stack traces shown above in a row. This causes the X server to fail and leaves the machine unusable until I reboot it.

Version-Release number of selected component (if applicable):
burlak@borg ~ $ uname -r
4.6.6-200.fc23.x86_64
burlak@borg ~ $ lspci | grep -i radeon
06:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Hawaii XT / Grenada XT [Radeon R9 290X/390X] (rev 80)


How reproducible:
Low reproducibility; it usually happens 1-3 times per day under normal use of the computer.

Additional info:
When this happens, the kernel usually prints a lot of stack traces, but I noticed a small pattern: there are usually only two different PIDs mentioned in those stack traces.
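
The PID pattern described above could be checked by tallying the WARNING header lines in the kernel log. This is a minimal sketch in Python, not a tool used in this report. The function name `count_warning_pids` and its `source_filter` parameter are hypothetical, and the regular expression assumes the log lines have the same shape as the "WARNING: CPU: ... PID: ... at ..." line in the trace above.

```python
import re
from collections import Counter

# Matches the header line of each kernel warning, for example:
#   "... WARNING: CPU: 7 PID: 3384 at drivers/gpu/drm/radeon/radeon_object.c:84 ..."
# The later "CPU: 7 PID: 3384 Comm: ..." line has no "WARNING: " prefix,
# so each trace is counted only once.
WARNING_RE = re.compile(r"WARNING: CPU: (\d+) PID: (\d+) at (\S+)")


def count_warning_pids(lines, source_filter="radeon_object.c"):
    """Return a Counter mapping PID -> number of matching WARNING traces.

    `lines` is any iterable of log lines. `source_filter` (hypothetical
    parameter) keeps only warnings whose source location contains it.
    """
    counts = Counter()
    for line in lines:
        match = WARNING_RE.search(line)
        if match and source_filter in match.group(3):
            counts[int(match.group(2))] += 1
    return counts
```

One possible use is to call `count_warning_pids(sys.stdin)` in a small script fed from `journalctl -k`, and then look at `most_common()` on the result to see whether only two PIDs dominate.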

Comment 1 Laura Abbott 2016-09-23 19:51:27 UTC
*********** MASS BUG UPDATE **************
 
We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 23 kernel bugs.
 
Fedora 23 has now been rebased to 4.7.4-100.fc23.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.
 
If you have moved on to Fedora 24 or 25, and are still experiencing this issue, please change the version to Fedora 24 or 25.
 
If you experience different issues, please open a new bug report for those.

Comment 2 Laura Abbott 2016-10-26 16:58:14 UTC
*********** MASS BUG UPDATE **************
This bug is being closed with INSUFFICIENT_DATA as there has not been a response in 4 weeks. If you are still experiencing this issue, please reopen and attach the relevant data from the latest kernel you are running and any data that might have been requested previously.

