Bug 892233 - radeon modeset crashes on A4-3400 HD6410D with kernel 3.7.1
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 18
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2013-01-06 02:34 UTC by Mikko Tiihonen
Modified: 2013-03-13 17:24 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2013-03-13 17:24:43 UTC
Type: Bug
Embargoed:


Attachments
kernel divide error oops (4.56 KB, text/plain)
2013-01-06 02:34 UTC, Mikko Tiihonen
Patch that works around the division by zero (1.01 KB, patch)
2013-01-11 18:35 UTC, Mikko Tiihonen

Description Mikko Tiihonen 2013-01-06 02:34:18 UTC
Created attachment 673228 [details]
kernel divide error oops

Description of problem:
Enabling radeon modeset on my HD6410D causes a kernel crash.
Not a regression, since it has not worked with previous kernel versions.

Version-Release number of selected component (if applicable):
kernel-3.7.1-2.fc18.x86_64

How reproducible:
always

Steps to Reproduce:
1. boot with nomodeset (seems to use vesafb)
2. rmmod radeon
3. modprobe radeon modeset=1
  
Actual results:
kernel crashes with divide error in evergreen_startup and machine pretty much locks up.
Call Trace:
 [<ffffffffa047ed02>] evergreen_startup+0x632/0x1660 [radeon]
 [<ffffffffa047feb3>] evergreen_init+0x183/0x2a0 [radeon]
 [<ffffffffa041fbf4>] radeon_device_init+0x554/0x640 [radeon]
 [<ffffffffa042164d>] radeon_driver_load_kms+0x9d/0x1a0 [radeon]
 [<ffffffffa0027d16>] drm_get_pci_dev+0x186/0x2d0 [drm]
 [<ffffffff8117fae9>] ? kfree+0x49/0x170
 [<ffffffffa0496b99>] radeon_pci_probe+0xb1/0xb9 [radeon]
 [<ffffffff81314cb9>] local_pci_probe+0x79/0x100
 [<ffffffff81314e61>] pci_device_probe+0x121/0x130
 [<ffffffff813d2feb>] driver_probe_device+0x8b/0x390
 [<ffffffff813d339b>] __driver_attach+0xab/0xb0
 [<ffffffff813d32f0>] ? driver_probe_device+0x390/0x390
 [<ffffffff813d1075>] bus_for_each_dev+0x55/0x90
 [<ffffffff813d295e>] driver_attach+0x1e/0x20
 [<ffffffff813d2590>] bus_add_driver+0x1a0/0x290
 [<ffffffffa04e6000>] ? 0xffffffffa04e5fff
 [<ffffffffa04e6000>] ? 0xffffffffa04e5fff
 [<ffffffff813d3a67>] driver_register+0x77/0x170
 [<ffffffffa04e6000>] ? 0xffffffffa04e5fff
 [<ffffffff81313be8>] __pci_register_driver+0x48/0x50
 [<ffffffffa0027f7a>] drm_pci_init+0x11a/0x130 [drm]
 [<ffffffffa04e6000>] ? 0xffffffffa04e5fff
 [<ffffffffa04e6000>] ? 0xffffffffa04e5fff
 [<ffffffffa04e60ec>] radeon_init+0xec/0x1000 [radeon]
 [<ffffffff8100216a>] do_one_initcall+0x12a/0x180
 [<ffffffff810c2ac0>] sys_init_module+0xc0/0x220
 [<ffffffff8163d9d9>] system_call_fastpath+0x16/0x1b
Code: 09 c5 45 89 e9 66 0f 1f 44 00 00 44 89 cb 41 d1 e9 83 e3 01 41 01 db 83 ee 01 75 ef 89 d1 44 29 d9 41 39 cf 72 70 31 d2 44 89 f8 <f7> f1 0f af c8 41 89 c0 44 89 f8 29 c8 83 bf c0 00 00 00 27 19 
RIP  [<ffffffffa0464e98>] r6xx_remap_render_backend+0x78/0xf0 [radeon]
 RSP <ffff88020c755b20>

Expected results:
radeon driver initializes itself correctly

Additional info:
lspci: 0300: 1002:9644
full kernel oops attached

Comment 1 Mikko Tiihonen 2013-01-07 20:43:27 UTC
Just tried with 3.8.0-0.rc2.git1.1.fc19.x86_64 - still fails at the same place.

Comment 2 Mikko Tiihonen 2013-01-07 21:34:52 UTC
The attachment seems to point to r600.c:r6xx_remap_render_backend

The function contains only one divide:

u32 r6xx_remap_render_backend(struct radeon_device *rdev,
                              u32 tiling_pipe_num,
                              u32 max_rb_num,
                              u32 total_max_rb_num,
                              u32 disabled_rb_mask)
{
        u32 rendering_pipe_num, rb_num_width, req_rb_num;
...
        /* mask out the RBs that don't exist on that asic */
        disabled_rb_mask |= (0xff << max_rb_num) & 0xff;

        rendering_pipe_num = 1 << tiling_pipe_num;
        req_rb_num = total_max_rb_num - r600_count_pipe_bits(disabled_rb_mask);
        BUG_ON(rendering_pipe_num < req_rb_num);

        pipe_rb_ratio = rendering_pipe_num / req_rb_num;

I added a printk to see what actual parameters are passed in:

r6xx_remap_render_backend: tiling_pipe_num=2, max_rb_num=1, total_max_rb_num=8, disabled_rb_mask=253

Plugging those values into the code above shows where the divide by zero comes from:
disabled_rb_mask |= 254; -> 255
req_rb_num = 8 - 8;
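
The arithmetic above can be checked outside the kernel. The following is a minimal userspace sketch, not driver code: count_bits32() is a hypothetical stand-in for r600_count_pipe_bits() (assumed here to count set bits), and compute_req_rb_num() just repeats the two statements quoted from r6xx_remap_render_backend.

```c
#include <stdint.h>

/* Hypothetical stand-in for r600_count_pipe_bits(): assumed to
 * return the number of set bits in the value. */
static uint32_t count_bits32(uint32_t val)
{
        uint32_t i, ret = 0;

        for (i = 0; i < 32; i++) {
                ret += val & 1;
                val >>= 1;
        }
        return ret;
}

/* Repeats the mask update and subtraction from the quoted snippet
 * and returns the value that is later used as the divisor. */
static uint32_t compute_req_rb_num(uint32_t max_rb_num,
                                   uint32_t total_max_rb_num,
                                   uint32_t disabled_rb_mask)
{
        /* mask out the RBs that don't exist on that asic */
        disabled_rb_mask |= (0xffu << max_rb_num) & 0xffu;

        return total_max_rb_num - count_bits32(disabled_rb_mask);
}
```

Calling compute_req_rb_num(1, 8, 253) with the logged parameters gives 0: the mask becomes 253 | 254 = 255, all eight bits are counted as disabled, and the later division by req_rb_num faults.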

Comment 3 Mikko Tiihonen 2013-01-11 18:35:43 UTC
Created attachment 677052 [details]
Patch that works around the division by zero

The attached patch works and allows me to boot with kernel modeset enabled without errors. The patch is very safe since it only changes the functionality in the cases that would have resulted in division by zero.

I think the proper fix would be to not modify the given disabled_rb_mask unless it has more than max_rb_num zero bits. My guess is that the mask modification has been added as a workaround for some other cases, but it seems that it can disable RBs that should be active - such as in my case the only available RB.
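
The idea in the previous paragraph could be sketched roughly as follows. This is only an illustration of the suggestion, not the attached patch and not the change that went upstream. adjust_disabled_rb_mask() and popcount32() are hypothetical names, and the fixed width of 8 RBs is an assumption taken from the 0xff mask and total_max_rb_num=8 in the log above.

```c
#include <stdint.h>

/* Hypothetical helper: number of set bits in v. */
static uint32_t popcount32(uint32_t v)
{
        uint32_t n = 0;

        while (v) {
                n += v & 1;
                v >>= 1;
        }
        return n;
}

/* Hypothetical sketch: only mask out the "non-existent" RBs when the
 * incoming mask still leaves more than max_rb_num RBs enabled.
 * Assumes an 8-bit-wide RB mask. */
static uint32_t adjust_disabled_rb_mask(uint32_t max_rb_num,
                                        uint32_t disabled_rb_mask)
{
        uint32_t enabled = 8 - popcount32(disabled_rb_mask & 0xffu);

        if (enabled > max_rb_num)
                disabled_rb_mask |= (0xffu << max_rb_num) & 0xffu;

        return disabled_rb_mask;
}
```

With the logged values (max_rb_num=1, disabled_rb_mask=253) exactly one RB is enabled, so the mask is left at 253 and req_rb_num would be 8 - 7 = 1 instead of 0.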

Comment 4 Mikko Tiihonen 2013-02-05 21:16:47 UTC
The fix is now in the stable queue: http://git.kernel.org/?p=linux/kernel/git/stable/stable-queue.git;a=commitdiff;h=ea8a8e923e4108853590c7d5b6ea6765b4585839

So this bug can be closed once the 3.7.7 kernel is available in Fedora.

Comment 5 Josh Boyer 2013-02-05 21:23:10 UTC
Earlier you tested 3.8.0-0.rc2.git1.1.  This commit you mention should be in 3.8.0-0.rc6.git2.1.  Can you test that to see if it solves the issue for you?

If so, we can bring those patches in before 3.7.7 is released.

Comment 6 Mikko Tiihonen 2013-02-06 16:48:20 UTC
just tested kernel-3.8.0-0.rc6.git3.1.fc19.x86_64 - works very nicely

Comment 7 Josh Boyer 2013-02-11 13:11:10 UTC
3.7.7 should be released early this week.  We'll pick those patches up from there.  We wouldn't have been able to get an update ready and pushed before then anyway.

