Bug 132463

Summary: NFS client in 1-549 kernel causes kernel oops
Product: [Fedora] Fedora
Reporter: H.J. Lu <hongjiu.lu>
Component: kernel
Assignee: Dave Jones <davej>
Status: CLOSED CURRENTRELEASE
QA Contact: Brian Brock <bbrock>
Severity: medium
Priority: medium
Version: rawhide
CC: pfrields, steved, wtogami
Hardware: i686   
OS: Linux   
Fixed In Version: RHEL4
Doc Type: Bug Fix
Last Closed: 2005-10-05 23:33:24 UTC

Description H.J. Lu 2004-09-13 18:07:06 UTC
While using the NFS client in the 1-549 SMP kernel on a P4 HT machine, I got:

kernel BUG at include/asm/spinlock.h:135!
invalid operand: 0000 [#1]
SMP
Modules linked in: nls_utf8 loop nfs nfsd exportfs lockd md5 ipv6
parport_pc lp parport autofs4 sunrpc e100 mii floppy sg scsi_mod
microcode dm_mod uhci_hcd ehci_hcd ohci_hcd button battery asus_acpi
ac ext3 jbd
CPU:    0
EIP:    0060:[<c02b8d71>]    Not tainted VLI
EFLAGS: 00010002   (2.6.8-1.549.hjl.0.0smp)
EIP is at _spin_lock+0x1f/0x39
eax: c02b8d52   ebx: c1104f58   ecx: c02cc5f5   edx: c02cc5f5
esi: deeaeb80   edi: c125e3a8   ebp: 20001024   esp: db2e2f68
ds: 007b   es: 007b   ss: 0068
Process master (pid: 1827, threadinfo=db2e2000 task=db31af10)
Stack: c125e398 c125e398 c125e398 c013fb15 c125e398 c1104e98 c125e398
deeaeb80
       c125e3a8 00000282 c013ff38 db2e2000 deeaeb80 dfe76280 db2e2000
c012c4fd
       dfe76280 00000000 00000000 c012c5c4 00000001 0000000b c02ba1df
00000001
Call Trace:
 [<c013fb15>] cache_flusharray+0x1a/0x9c
 [<c013ff38>] kfree+0x43/0x51
 [<c012c4fd>] set_current_groups+0x5f/0x65
 [<c012c5c4>] sys_setgroups+0x49/0x64
 [<c02ba1df>] syscall_call+0x7/0xb
Code: f0 81 02 00 00 00 01 30 c9 89 c8 c3 53 89 c3 52 52 81 78 04 ad
4e ad de 74 19 68 52 8d 2b c0 68 f5 c5 2c c0 e8 1f 71 e6 ff 59 58 <0f>
0b 87 00 ae b5 2c c0 f0 fe 0b 79 09 f3 90 80 3b 00 7e f9 eb
 <1>Unable to handle kernel NULL pointer dereference at virtual
address 0000000c printing eip:
c013fa5d
*pde = 1c2e5001
Oops: 0000 [#2]
SMP
Modules linked in: nls_utf8 loop nfs nfsd exportfs lockd md5 ipv6
parport_pc lp parport autofs4 sunrpc e100 mii floppy sg scsi_mod
microcode dm_mod uhci_hcd ehci_hcd ohci_hcd button battery asus_acpi
ac ext3 jbd
CPU:    1
EIP:    0060:[<c013fa5d>]    Not tainted VLI
EFLAGS: 00010006   (2.6.8-1.549.hjl.0.0smp)
EIP is at free_block+0x38/0xd6
eax: 00800000   ebx: 00000008   ecx: 00000000   edx: c1000000
esi: dfe96580   edi: 00000004   ebp: 00000000   esp: dfe5cf28
ds: 007b   es: 007b   ss: 0068
Process events/1 (pid: 7, threadinfo=dfe5c000 task=dfe5b6f0)
Stack: dbae8090 dbae8090 00000004 dbae8080 dfe96580 c01401ef c1411920
dfe96580
       dfe96118 dfe96674 c014028d 00000001 dfe06080 c1411920 c1411924
00000283
       dfe06080 c012d687 00000000 c014020b ffffffff ffffffff 00000001
00000000
Call Trace:
 [<c01401ef>] drain_array_locked+0x59/0x75
 [<c014028d>] cache_reap+0x82/0x1a1
 [<c012d687>] worker_thread+0x168/0x1d5
 [<c014020b>] cache_reap+0x0/0x1a1
 [<c011c669>] default_wake_function+0x0/0xc
 [<c011c669>] default_wake_function+0x0/0xc
 [<c012d51f>] worker_thread+0x0/0x1d5
 [<c0130b55>] kthread+0x73/0x9b
 [<c0130ae2>] kthread+0x0/0x9b
 [<c01041f1>] kernel_thread_helper+0x5/0xb
Code: 24 01 88 a0 00 00 00 39 fd 0f 8d b4 00 00 00 8b 04 24 8b 15 10
b8 41 c0 8b 0c a8 8d 81 00 00 00 40 c1 e8 0c c1 e0 05 8b 5c 10 1c <8b>
53 04 8b 03 89 50 04 89 02 31 d2 2b 4b 0c c7 03 00 01 10 00
 ------------[ cut here ]------------
kernel BUG at mm/rmap.c:473!
invalid operand: 0000 [#3]
SMP
Modules linked in: nls_utf8 loop nfs nfsd exportfs lockd md5 ipv6
parport_pc lp parport autofs4 sunrpc e100 mii floppy sg scsi_mod
microcode dm_mod uhci_hcd ehci_hcd ohci_hcd button battery asus_acpi
ac ext3 jbd
CPU:    1
EIP:    0060:[<c014a3e0>]    Not tainted VLI
EFLAGS: 00010286   (2.6.8-1.549.hjl.0.0smp)
EIP is at page_remove_rmap+0x23/0x4a
eax: ffffffff   ebx: 00002ed0   ecx: c140f100   edx: c105da00
esi: c105da00   edi: df74e030   ebp: 00000000   esp: db6bbebc
ds: 007b   es: 007b   ss: 0068
Process queryrpm (pid: 14285, threadinfo=db6bb000 task=dfe5b6f0)
Stack: c01443e5 02ed0067 00000000 00006000 b7a00000 c140f100 c030c080
00000002
       b7a00000 b7a71000 dab93df0 c140f100 c01444e1 00071000 00000000
b79f1000
       dfe247d8 b7a71000 c140f100 c0144540 00080000 00000000 db6bbf74
b79f1000
Call Trace:
 [<c01443e5>] zap_pte_range+0x206/0x2a9
 [<c01444e1>] zap_pmd_range+0x59/0x7c
 [<c0144540>] unmap_page_range+0x3c/0x5f
 [<c0144654>] unmap_vmas+0xf1/0x205
 [<c01481e2>] unmap_region+0x7e/0xef
 [<c0148485>] do_munmap+0x119/0x137
 [<c01484f5>] sys_munmap+0x52/0x6a
 [<c02ba1df>] syscall_call+0x7/0xb
 [<c02b007b>] xfrm_alloc_spi+0x118/0x15d
Code: 3a c0 ff 42 10 51 9d c3 89 c2 8b 00 f6 c4 08 74 08 0f 0b d6 01
1f dd 2c c0 f0 83 42 08 ff 0f 98 c0 84 c0 74 2c 8b 42 08 40 79 08 <0f>
0b d9 01 1f dd 2c c0 9c 59 fa b8 00 f0 ff ff 21 e0 8b 40 10
 <3>Debug: sleeping function called from invalid context at
include/linux/rwsem.h:43
in_atomic():1[expected: 0], irqs_disabled():0
 [<c011d9b2>] __might_sleep+0x7d/0x87
 [<c01204ae>] profile_task_exit+0x1a/0x4a
 [<c0121ab8>] do_exit+0x18/0x3bd
 [<c0106073>] do_divide_error+0x0/0xe6
 [<c0106345>] do_invalid_op+0x0/0xd5
 [<c0106345>] do_invalid_op+0x0/0xd5
 [<c0106411>] do_invalid_op+0xcc/0xd5
 [<c014a3e0>] page_remove_rmap+0x23/0x4a
 [<c013bf67>] free_hot_cold_page+0xbc/0xee
 [<c011c652>] scheduler_tick+0x3ce/0x3e5
 [<c02bacbb>] error_code+0x2f/0x38
 [<c014a3e0>] page_remove_rmap+0x23/0x4a
 [<c01443e5>] zap_pte_range+0x206/0x2a9
 [<c01444e1>] zap_pmd_range+0x59/0x7c
 [<c0144540>] unmap_page_range+0x3c/0x5f
 [<c0144654>] unmap_vmas+0xf1/0x205
 [<c01481e2>] unmap_region+0x7e/0xef
 [<c0148485>] do_munmap+0x119/0x137
 [<c01484f5>] sys_munmap+0x52/0x6a
 [<c02ba1df>] syscall_call+0x7/0xb
 [<c02b007b>] xfrm_alloc_spi+0x118/0x15d
note: queryrpm[14285] exited with preempt_count 1
bad: scheduling while atomic!
 [<c02b7cad>] schedule+0x2d/0x86b
 [<c01efdec>] poke_blanked_console+0x8f/0x9a
 [<c01ef1b2>] vt_console_print+0x294/0x2a5
 [<c01eef1e>] vt_console_print+0x0/0x2a5
 [<c011fc7f>] __call_console_drivers+0x36/0x40
 [<c02b8ba6>] rwsem_down_read_failed+0x143/0x162
 [<c01230d4>] .text.lock.exit+0x6b/0xcb
 [<c0106073>] do_divide_error+0x0/0xe6
 [<c0106345>] do_invalid_op+0x0/0xd5
 [<c0106345>] do_invalid_op+0x0/0xd5
 [<c0106411>] do_invalid_op+0xcc/0xd5
 [<c014a3e0>] page_remove_rmap+0x23/0x4a
 [<c013bf67>] free_hot_cold_page+0xbc/0xee
 [<c011c652>] scheduler_tick+0x3ce/0x3e5
 [<c02bacbb>] error_code+0x2f/0x38
 [<c014a3e0>] page_remove_rmap+0x23/0x4a
 [<c01443e5>] zap_pte_range+0x206/0x2a9
 [<c01444e1>] zap_pmd_range+0x59/0x7c
 [<c0144540>] unmap_page_range+0x3c/0x5f
 [<c0144654>] unmap_vmas+0xf1/0x205
 [<c01481e2>] unmap_region+0x7e/0xef
 [<c0148485>] do_munmap+0x119/0x137
 [<c01484f5>] sys_munmap+0x52/0x6a
 [<c02ba1df>] syscall_call+0x7/0xb
 [<c02b007b>] xfrm_alloc_spi+0x118/0x15d

Comment 1 Dave Jones 2004-11-27 21:49:20 UTC
Is this repeatable in the *unmodified* 2.6.9-based update?

Comment 2 H.J. Lu 2004-11-28 22:18:49 UTC
It didn't happen with the current 2.6.9-based kernel.