Bug 730679 - BUG: unable to handle kernel NULL pointer dereference at 0000000000000038, set_next_entity [NEEDINFO]
Summary: BUG: unable to handle kernel NULL pointer dereference at 0000000000000038, se...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Version: 6.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Red Hat Kernel Manager
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2011-08-15 11:09 UTC by yangyi
Modified: 2014-12-17 23:24 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-12-17 23:24:19 UTC
kzhang: needinfo? (yiyang.specific)


Attachments
config files (63.18 KB, application/octet-stream)
2011-08-15 11:09 UTC, yangyi

Description yangyi 2011-08-15 11:09:12 UTC
Created attachment 518252 [details]
config files

Description of problem:
The kernel reports a NULL pointer dereference in set_next_entity().

Version-Release number of selected component (if applicable):
2.6.32-71.7.1.el6

How reproducible:
Not clear

Steps to Reproduce:
1.
2.
3.
  
Actual results:


Expected results:


Additional info:
panic log
BUG: sleeping function called from invalid context at arch/x86/mm/fault.c:1083
in_atomic(): 0, irqs_disabled(): 1, pid: 11934, name: perl
Pid: 11934, comm: perl Tainted: G        W  ----------------  2.6.32_1-1-0-0-kdump #4
Call Trace:
 [<ffffffff81039467>] ? __might_sleep+0xc6/0xcb
 [<ffffffff8136eedc>] ? do_page_fault+0x1ea/0x414
 [<ffffffff810d4ab2>] ? mnt_want_write+0x3a/0x65
 [<ffffffff810d2b73>] ? touch_atime+0x12c/0x133
 [<ffffffff810c6d49>] ? pipe_read+0x37a/0x38e
 [<ffffffff8136d07f>] ? page_fault+0x1f/0x30
 [<ffffffff8102fc71>] ? set_next_entity+0x9/0x33
 [<ffffffff810306c1>] ? pick_next_task_fair+0x70/0x96
 [<ffffffff8136af7c>] ? schedule+0x53c/0x804
 [<ffffffff810c074b>] ? vfs_read+0x133/0x162
 [<ffffffff810c0a2e>] ? sys_read+0x45/0x6e
 [<ffffffff8100c589>] ? retint_careful+0xd/0x21
BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
IP: [<ffffffff8102fc71>] set_next_entity+0x9/0x33
PGD 2da468067 PUD 1ce812067 PMD 0 
Oops: 0000 [#1] SMP 
last sysfs file: /sys/kernel/kexec_crash_loaded
CPU 3 
Modules linked in: dm_mirror dm_region_hash dm_log dm_mod button battery ac ehci_hcd uhci_hcd usbcore shpchp igb e1000 sata_promise [last unloaded: x_tables]

Modules linked in: dm_mirror dm_region_hash dm_log dm_mod button battery ac ehci_hcd uhci_hcd usbcore shpchp igb e1000 sata_promise [last unloaded: x_tables]
Pid: 11934, comm: perl Tainted: G        W  ----------------  2.6.32_1-1-0-0-kdump #4 ProLiant DL180 G6  
RIP: 0010:[<ffffffff8102fc71>]  [<ffffffff8102fc71>] set_next_entity+0x9/0x33
RSP: 0000:ffff880785cd1e90  EFLAGS: 00010046
RAX: 0000000000000000 RBX: 0000000000000000 RCX: fffffffffffffff0
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8800b814d380
RBP: ffff8800b814d380 R08: 0000000000000000 R09: ffff8800b814d450
R10: 0000000000000000 R11: 00000001094f767b R12: 0000000000000000
R13: ffff880044873200 R14: 0000000000000003 R15: 0000000006060006
FS:  00007f891d0696e0(0000) GS:ffff880044860000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000038 CR3: 0000000575cdf000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process perl (pid: 11934, threadinfo ffff880785cd0000, task ffff8800b4b49c40)
Stack:
 0000000000000000 ffff8800b814d380 0000000000000000 ffffffff810306c1
<0> 0000000000000002 ffff880044873200 0000000000000000 ffffffff8137bf40
<0> 000000000224f138 ffffffff8136af7c 0000000000000001 0000000000001000
Call Trace:
 [<ffffffff810306c1>] ? pick_next_task_fair+0x70/0x96
 [<ffffffff8136af7c>] ? schedule+0x53c/0x804
 [<ffffffff810c074b>] ? vfs_read+0x133/0x162
 [<ffffffff810c0a2e>] ? sys_read+0x45/0x6e
 [<ffffffff8100c589>] ? retint_careful+0xd/0x21
Code: 60 00 00 00 00 48 85 f6 74 06 48 39 70 58 75 08 48 c7 40 58 00 00 00 00 48 8b b6 90 00 00 00 eb c5 c3 55 48 89 fd 53 48 89 f3 50 <83> 7e 38 00 74 05 e8 5a fd ff ff 48 8b 45 70 48 8b 80 30 08 00 
RIP  [<ffffffff8102fc71>] set_next_entity+0x9/0x33
 RSP <ffff880785cd1e90>
CR2: 0000000000000038

Comment 2 Zhang Kexin 2011-08-25 08:52:05 UTC
Hi, did you hit this bug only once? Thanks!

Comment 3 yangyi 2011-08-25 09:57:32 UTC
No, about 10+ times.

Comment 4 Zhang Kexin 2011-08-30 09:59:04 UTC
Thanks for the reply. Can you reproduce it reliably? If so, could you describe the steps? Could you also check whether the bug can be triggered on the 6.1 kernel, 2.6.32-131.0.15? Many thanks!

Comment 5 RHEL Product and Program Management 2011-10-07 15:44:37 UTC
Since RHEL 6.2 External Beta has begun, and this bug remains
unresolved, it has been rejected as it is not proposed as
exception or blocker.

Red Hat invites you to ask your support representative to
propose this request, if appropriate and relevant, in the
next release of Red Hat Enterprise Linux.

Comment 6 Brian Maly 2012-01-18 17:59:52 UTC
More info on this...

We were seeing this bug constantly (daily) in older kernels (on HS21/HS22 hardware). 

In addition to set_next_entity(), the NULL sched_entity was also observed frequently in put_prev_task_fair(), task_tick_fair() and enqueue_task_fair().

The issue seems to be resolved in linux-2.6.32-220.el6. We have not seen a single instance of the NULL sched_entity on the latest kernel, so it is resolved for us at least.

Comment 7 Jiri Benc 2014-12-17 23:24:19 UTC
Closing per comment 6.

