Bug 906296

Summary: BUG: sleeping function called from invalid context at mm/slub.c:925
Product: [Fedora] Fedora
Reporter: Yanko Kaneti <yaneti>
Component: xorg-x11-drv-ati
Assignee: X/OpenGL Maintenance List <xgl-maint>
Status: CLOSED RAWHIDE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified
Version: rawhide
CC: bugs.michael, clydekunkel7734, gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda, nicolas.mailhot, schaiba, xgl-maint
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2013-02-11 08:30:50 UTC
Type: Bug
Attachments:
  Description                           Flags
  Don't alloc while holding spinlock    none
  Fix                                   none

Description Yanko Kaneti 2013-01-31 11:57:56 UTC
Description of problem:
[  120.093741] BUG: sleeping function called from invalid context at mm/slub.c:925
[  120.093754] in_atomic(): 1, irqs_disabled(): 0, pid: 941, name: Xorg
[  120.093758] 3 locks held by Xorg/941:
[  120.093761]  #0:  (&mm->mmap_sem){++++++}, at: [<ffffffff816db7c3>] __do_page_fault+0xf3/0x570
[  120.093782]  #1:  (&rdev->pm.mclk_lock){.+.+.+}, at: [<ffffffffa011661b>] radeon_ttm_fault+0x3b/0x70 [radeon]
[  120.093839]  #2:  (&(&bdev->fence_lock)->rlock){+.+.+.}, at: [<ffffffffa007dd67>] ttm_bo_move_accel_cleanup+0x57/0x330 [ttm]
[  120.093861] Pid: 941, comm: Xorg Not tainted 3.8.0-0.rc5.git2.1.fc19.x86_64 #1
[  120.093864] Call Trace:
[  120.093877]  [<ffffffff8109f899>] __might_sleep+0x179/0x230
[  120.093884]  [<ffffffff811b115e>] kmem_cache_alloc_trace+0x4e/0x300
[  120.093897]  [<ffffffffa007dee3>] ? ttm_bo_move_accel_cleanup+0x1d3/0x330 [ttm]
[  120.093915]  [<ffffffffa007dee3>] ttm_bo_move_accel_cleanup+0x1d3/0x330 [ttm]
[  120.094024]  [<ffffffffa0116912>] radeon_move_blit.isra.7+0xc2/0x160 [radeon]
[  120.094054]  [<ffffffffa0117130>] radeon_bo_move+0xb0/0x200 [radeon]
[  120.094080]  [<ffffffffa007bed5>] ttm_bo_handle_move_mem+0x275/0x640 [ttm]
[  120.094098]  [<ffffffffa007c909>] ? ttm_bo_mem_space+0x179/0x360 [ttm]
[  120.094114]  [<ffffffffa007d117>] ttm_bo_move_buffer+0x127/0x150 [ttm]
[  120.094126]  [<ffffffffa007d1e2>] ttm_bo_validate+0xa2/0x130 [ttm]
[  120.094150]  [<ffffffffa01187dd>] radeon_bo_fault_reserve_notify+0x9d/0xd0 [radeon]
[  120.094163]  [<ffffffffa007f140>] ttm_bo_vm_fault+0x60/0x370 [ttm]
[  120.094185]  [<ffffffffa011662c>] radeon_ttm_fault+0x4c/0x70 [radeon]
[  120.094190]  [<ffffffff81184450>] __do_fault+0x70/0x560
[  120.094196]  [<ffffffff81187579>] handle_pte_fault+0x89/0x9d0
[  120.094200]  [<ffffffff81188c45>] handle_mm_fault+0x2a5/0x5c0
[  120.094204]  [<ffffffff816db832>] __do_page_fault+0x162/0x570
[  120.094211]  [<ffffffff811dff05>] ? do_vfs_ioctl+0x305/0x530
[  120.094217]  [<ffffffff816e0945>] ? sysret_check+0x22/0x5d
[  120.094226]  [<ffffffff8135099d>] ? trace_hardirqs_off_thunk+0x3a/0x3c
[  120.094230]  [<ffffffff816dbc4e>] do_page_fault+0xe/0x10
[  120.094234]  [<ffffffff816d7e48>] page_fault+0x28/0x30


After the first such occurrence the

Version-Release number of selected component (if applicable):
3.8.0-0.rc5.git2.1.fc19.x86_64

# cat /proc/cmdline 
BOOT_IMAGE=/boot/vmlinuz-3.8.0-0.rc5.git2.1.fc19.x86_64 root=/dev/sda1 ro SYSFONT=latarcyrheb-sun16 LANG=en_US.UTF-8 KEYTABLE=us slub_debug=-


Booting with slub_debug=- because otherwise the rawhide kernels are unusable...

Comment 1 Josh Boyer 2013-01-31 14:32:14 UTC
Dave, do you happen to have a fix for this upstream?

Comment 2 Yanko Kaneti 2013-01-31 15:35:35 UTC
*** Bug 906400 has been marked as a duplicate of this bug. ***

Comment 3 Clyde E. Kunkel 2013-02-01 16:20:48 UTC
Still present in 3.8.0-0.rc5.git3.1.fc19.x86_64

From dmesg:

[   77.774524] BUG: sleeping function called from invalid context at mm/slub.c:925
[   77.774530] in_atomic(): 1, irqs_disabled(): 0, pid: 1525, name: Xorg
[   77.774533] 2 locks held by Xorg/1525:
[   77.774535]  #0:  (&rdev->exclusive_lock){.+.+.+}, at: [<ffffffffa00f4a89>] radeon_cs_ioctl+0x39/0xa10 [radeon]
[   77.774566]  #1:  (&(&bdev->fence_lock)->rlock){+.+.+.}, at: [<ffffffffa006cd67>] ttm_bo_move_accel_cleanup+0x57/0x330 [ttm]
[   77.774581] Pid: 1525, comm: Xorg Not tainted 3.8.0-0.rc5.git3.1.fc19.x86_64 #1
[   77.774584] Call Trace:
[   77.774592]  [<ffffffff8109f8b9>] __might_sleep+0x179/0x230
[   77.774597]  [<ffffffff811b11be>] kmem_cache_alloc_trace+0x4e/0x300
[   77.774611]  [<ffffffffa006cee3>] ? ttm_bo_move_accel_cleanup+0x1d3/0x330 [ttm]
[   77.774630]  [<ffffffffa006cee3>] ttm_bo_move_accel_cleanup+0x1d3/0x330 [ttm]
[   77.774658]  [<ffffffffa00de912>] radeon_move_blit.isra.7+0xc2/0x160 [radeon]
[   77.774680]  [<ffffffffa00df130>] radeon_bo_move+0xb0/0x200 [radeon]
[   77.774694]  [<ffffffffa006aed5>] ttm_bo_handle_move_mem+0x275/0x640 [ttm]
[   77.774707]  [<ffffffffa006b909>] ? ttm_bo_mem_space+0x179/0x360 [ttm]
[   77.774721]  [<ffffffffa006c117>] ttm_bo_move_buffer+0x127/0x150 [ttm]
[   77.774734]  [<ffffffffa006feed>] ? ttm_eu_list_ref_sub+0x3d/0x60 [ttm]
[   77.774748]  [<ffffffffa006c1e2>] ttm_bo_validate+0xa2/0x130 [ttm]
[   77.774771]  [<ffffffffa00e0379>] radeon_bo_list_validate+0x79/0xc0 [radeon]
[   77.774795]  [<ffffffffa00f4da4>] radeon_cs_ioctl+0x354/0xa10 [radeon]
[   77.774815]  [<ffffffffa0014321>] drm_ioctl+0x501/0x5c0 [drm]
[   77.774845]  [<ffffffffa00f4a50>] ? radeon_cs_finish_pages+0xf0/0xf0 [radeon]
[   77.774850]  [<ffffffff81024811>] ? init_fpu+0x51/0xb0
[   77.774854]  [<ffffffff81025938>] ? __restore_xstate_sig+0x258/0x590
[   77.774859]  [<ffffffff816d70cc>] ? _raw_spin_unlock_irq+0x2c/0x50
[   77.774865]  [<ffffffff811dff65>] do_vfs_ioctl+0x305/0x530
[   77.774869]  [<ffffffff811ebebc>] ? fget_light+0x3ac/0x520
[   77.774873]  [<ffffffff811e0211>] sys_ioctl+0x81/0xa0
[   77.774878]  [<ffffffff816e0999>] system_call_fastpath+0x16/0x1b
[   94.515423] fuse init (API version 7.20)
[  139.315339] BUG: sleeping function called from invalid context at mm/slub.c:925
[  139.315346] in_atomic(): 1, irqs_disabled(): 0, pid: 1525, name: Xorg
[  139.315349] 2 locks held by Xorg/1525:
[  139.315351]  #0:  (&dev->mode_config.mutex){+.+.+.}, at: [<ffffffffa0023d05>] drm_mode_cursor_ioctl+0x55/0x150 [drm]
[  139.315374]  #1:  (&(&bdev->fence_lock)->rlock){+.+.+.}, at: [<ffffffffa006cd67>] ttm_bo_move_accel_cleanup+0x57/0x330 [ttm]
[  139.315389] Pid: 1525, comm: Xorg Not tainted 3.8.0-0.rc5.git3.1.fc19.x86_64 #1
[  139.315392] Call Trace:
[  139.315399]  [<ffffffff8109f8b9>] __might_sleep+0x179/0x230
[  139.315405]  [<ffffffff811b11be>] kmem_cache_alloc_trace+0x4e/0x300
[  139.315419]  [<ffffffffa006cee3>] ? ttm_bo_move_accel_cleanup+0x1d3/0x330 [ttm]
[  139.315438]  [<ffffffffa006cee3>] ttm_bo_move_accel_cleanup+0x1d3/0x330 [ttm]
[  139.315470]  [<ffffffffa00de912>] radeon_move_blit.isra.7+0xc2/0x160 [radeon]
[  139.315493]  [<ffffffffa00df130>] radeon_bo_move+0xb0/0x200 [radeon]
[  139.315507]  [<ffffffffa006aed5>] ttm_bo_handle_move_mem+0x275/0x640 [ttm]
[  139.315520]  [<ffffffffa006b909>] ? ttm_bo_mem_space+0x179/0x360 [ttm]
[  139.315534]  [<ffffffffa006c117>] ttm_bo_move_buffer+0x127/0x150 [ttm]
[  139.315548]  [<ffffffffa006c1e2>] ttm_bo_validate+0xa2/0x130 [ttm]
[  139.315572]  [<ffffffffa00dff40>] radeon_bo_pin_restricted+0xf0/0x1b0 [radeon]
[  139.315596]  [<ffffffffa00eda14>] radeon_crtc_cursor_set+0x94/0x4c0 [radeon]
[  139.315615]  [<ffffffffa0023da5>] drm_mode_cursor_ioctl+0xf5/0x150 [drm]
[  139.315632]  [<ffffffffa0014321>] drm_ioctl+0x501/0x5c0 [drm]
[  139.315656]  [<ffffffffa0023cb0>] ? drm_mode_setcrtc+0x5a0/0x5a0 [drm]
[  139.315660]  [<ffffffff8109707f>] ? up_read+0x1f/0x40
[  139.315666]  [<ffffffff816db964>] ? __do_page_fault+0x214/0x570
[  139.315671]  [<ffffffff811dff65>] do_vfs_ioctl+0x305/0x530
[  139.315676]  [<ffffffff811ebebc>] ? fget_light+0x3ac/0x520
[  139.315680]  [<ffffffff811e0211>] sys_ioctl+0x81/0xa0
[  139.315685]  [<ffffffff816e0999>] system_call_fastpath+0x16/0x1b

Comment 4 Jérôme Glisse 2013-02-04 16:33:35 UTC
Created attachment 692871
Don't alloc while holding spinlock

Can you verify that the attached patch fixes the issue?
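
For readers unfamiliar with the pattern named in the patch title, here is a minimal userspace sketch in C. It is not the attached patch, whose contents are not reproduced in this report, and it is not kernel code. The names demo_bo, demo_copy, demo_bo_init and demo_bo_snapshot are hypothetical, and a C11 atomic_flag stands in for a kernel spinlock. The idea shown is only the general one: do the allocation, which may block, before taking the spinlock, and hold the lock just for the short copy.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for an object whose state is guarded by a spinlock. */
struct demo_bo {
    atomic_flag lock;   /* plays the role of a kernel spinlock (assumption) */
    int fence_seq;      /* state that must only be read under the lock */
};

/* Hypothetical snapshot of the guarded state. */
struct demo_copy {
    int fence_seq;
};

static void demo_spin_lock(atomic_flag *lock)
{
    /* Busy-wait until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire))
        ;
}

static void demo_spin_unlock(atomic_flag *lock)
{
    atomic_flag_clear_explicit(lock, memory_order_release);
}

void demo_bo_init(struct demo_bo *bo, int seq)
{
    assert(bo != NULL);
    atomic_flag_clear(&bo->lock);   /* start unlocked */
    bo->fence_seq = seq;
}

/*
 * Allocate first, outside the lock, because the allocator may block.
 * Then take the spinlock only for the short, non-blocking copy.
 */
struct demo_copy *demo_bo_snapshot(struct demo_bo *bo)
{
    struct demo_copy *copy = malloc(sizeof(*copy));

    if (copy == NULL)
        return NULL;

    demo_spin_lock(&bo->lock);
    copy->fence_seq = bo->fence_seq;
    demo_spin_unlock(&bo->lock);

    return copy;
}
```

In the traces above, the allocation (kmem_cache_alloc_trace) is reached from ttm_bo_move_accel_cleanup while bdev->fence_lock is held, which is why __might_sleep complains. The sketch simply reverses that order.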

Comment 5 Jérôme Glisse 2013-02-04 18:38:49 UTC
Created attachment 692950
Fix

This one actually builds.

Comment 6 Yanko Kaneti 2013-02-04 22:45:10 UTC
Attachment 692950 on top of kernel-3.8.0-0.rc6.git1.1.fc19.x86_64 seems to fix the issue here, again with slub_debug=-

Comment 7 Jérôme Glisse 2013-02-05 19:08:59 UTC
*** Bug 908015 has been marked as a duplicate of this bug. ***

Comment 8 Nicolas Mailhot 2013-02-06 11:59:25 UTC
It is not fixed in abrt-gui-2.1.0-1.fc19.x86_64 and the abrt storm continues

Comment 9 Nicolas Mailhot 2013-02-06 11:59:55 UTC
It is not fixed in 3.8.0-0.rc6.git2.1.fc19.x86_64 and the abrt storm continues

Comment 10 Yanko Kaneti 2013-02-11 08:30:50 UTC
Fixed in rc7 upstream and in 3.8.0-0.rc7.git0.1.fc19