Bug 189689 - Dell SC1420 x86_64 dual core server crashes frequently.
Summary: Dell SC1420 x86_64 dual core server crashes frequently.
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 4
Hardware: x86_64
OS: Linux
Priority: medium
Severity: urgent
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2006-04-23 03:55 UTC by Phil
Modified: 2007-11-30 22:11 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2006-04-27 03:08:19 UTC
Type: ---
Embargoed:



Description Phil 2006-04-23 03:55:10 UTC
Description of problem:

A Dell SC1420 server running FC4 crashes frequently, usually under heavy load.
The server's primary function is to back up several other systems using Bacula.
It crashes most often during full backups, but the problem also occurs in other
situations not yet identified. It can usually be reproduced by running stress
0.18.8 (with a variety of command-line options) or memtester-4.0.5 (with 2 as
its only argument, which is the number of megabytes to test; testing 1 MB causes
no problem, but with anything larger the server usually crashes rather quickly).

The SC1420 has a dual-core x86_64 Xeon CPU, 2 GB RAM, and two 233 GB SATA drives
in a JBOD configuration. Although this server is under warranty, Dell will no
longer support this problem because FC4 isn't a supported OS. They have already
swapped out the memory three times, the motherboard twice, and the video card
once, which would seem to rule out any obvious hardware-related problem.

Version-Release number of selected component (if applicable):

I've tried several kernels including:

kernel-2.6.15-1.1830_FC4
kernel-2.6.16-1.2069_FC4
kernel-2.6.16-1.2096_FC4
kernel-smp-2.6.16-1.2069_FC4
kernel-smp-2.6.16-1.2096_FC4
(also earlier kernels that are no longer installed).

Currently running: 2.6.16-1.2096_FC4smp #1 SMP Wed Apr 19 16:01:54 EDT 2006
x86_64 x86_64 x86_64 GNU/Linux

How reproducible:

Often; it seems to depend on server activity.


Steps to Reproduce:
1. Run memtester-4.0.5 with 2 as its argument, i.e. "memtester 2"
2. Or run stress 0.18.8 with either the --vm or the --hdd parameter
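
For anyone trying to reproduce this, the steps above can be collected in one place. The following is only a sketch: it assumes `memtester` and `stress` are installed and on the PATH, and it uses the exact arguments mentioned in this report (`2`, `--vm 4`, `--hdd 2`). The names `REPRO_COMMANDS`, `format_commands` and `run` are made up for illustration, and `run` is deliberately not called here because it may crash the machine.

```python
import shlex
import subprocess

# Commands as described in this report; nothing else is assumed about them.
REPRO_COMMANDS = [
    ["memtester", "2"],        # memtester-4.0.5 testing 2 MB
    ["stress", "--vm", "4"],   # arguments used for crash #1
    ["stress", "--hdd", "2"],  # arguments used for crash #2
]


def format_commands(commands):
    """Return each command as a shell-quoted string for copy and paste."""
    return [shlex.join(cmd) for cmd in commands]


def run(cmd):
    """Run one command and return its exit status (may crash an affected box)."""
    return subprocess.run(cmd).returncode


# Only print the commands; running them is left to the reader.
for line in format_commands(REPRO_COMMANDS):
    print(line)
```

Calling `run(REPRO_COMMANDS[0])` would start the memtester case; on the affected server this is the quickest of the three to trigger a crash according to the description.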

  
Actual results:

Crash #1 using stress 0.18.8 with --vm 4


Apr 22 08:04:10 sdwbackup kernel: Unable to handle kernel paging request at
ffffabab31111136 RIP:
Apr 22 08:04:10 sdwbackup kernel: <ffffffff80177fae>{get_vma_policy+62}
Apr 22 08:04:10 sdwbackup kernel: PGD 0
Apr 22 08:04:10 sdwbackup kernel: Oops: 0002 [1] SMP
Apr 22 08:04:10 sdwbackup kernel: last sysfs file: /class/vc/vcsa7/dev
Apr 22 08:04:10 sdwbackup kernel: CPU 1
Apr 22 08:04:10 sdwbackup kernel: Modules linked in: parport_pc lp parport
autofs4 sunrpc acpi_cpufreq ipv6 dm_mod video button battery ac uhci_hcd
ehci_hcd e752x_edac edac_mc i2c_i801 i2c_core e1000 floppy ext3 jbd ata_piix
libata sd_mod scsi_mod
Apr 22 08:04:10 sdwbackup kernel: Pid: 3144, comm: stress Not tainted
2.6.16-1.2096_FC4smp #1
Apr 22 08:04:10 sdwbackup kernel: RIP: 0010:[<ffffffff80177fae>]
<ffffffff80177fae>{get_vma_policy+62}
Apr 22 08:04:10 sdwbackup kernel: RSP: 0000:ffff81006a363d78  EFLAGS: 00010246
Apr 22 08:04:10 sdwbackup kernel: RAX: 00002aaaba0da010 RBX: ffff81006e5d6550
RCX: 0000000000000000
Apr 22 08:04:10 sdwbackup kernel: RDX: 00002aaaba0da010 RSI: 0000000000000000
RDI: ffff8100770370c0
Apr 22 08:04:10 sdwbackup kernel: RBP: 0000000000000000 R08: ffff81006a8c2870
R09: 000000316d70c160
Apr 22 08:04:10 sdwbackup kernel: R10: 0000000000000000 R11: 0000000000000246
R12: ffff81006a8c2870
Apr 22 08:04:10 sdwbackup kernel: R13: 00000000000280d2 R14: ffff81006a8c2870
R15: ffff81006887e6d0
Apr 22 08:04:10 sdwbackup kernel: FS:  00002aaaaaabf720(0000)
GS:ffff81007fe26ec0(0000) knlGS:0000000000000000
Apr 22 08:04:10 sdwbackup kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Apr 22 08:04:10 sdwbackup kernel: CR2: ffffabab31111136 CR3: 000000006e4da000
CR4: 00000000000006e0
Apr 22 08:04:10 sdwbackup kernel: Process stress (pid: 3144, threadinfo
ffff81006a362000, task ffff8100770370c0)
Apr 22 08:04:10 sdwbackup kernel: Stack: ffff81006a8c2870 ffffffff80178a91
ffff81006e5d6550 ffff81006e5d6550
Apr 22 08:04:10 sdwbackup kernel:        0000000000000000 00003ffffffff000
00002aaaba0da010 ffffffff8016ca46
Apr 22 08:04:10 sdwbackup kernel:        ffff81006a363f58 0000000100000000
Apr 22 08:04:10 sdwbackup kernel: Call Trace:
<ffffffff80178a91>{alloc_page_vma+33} <ffffffff8016ca46>{__handle_mm_fault+439}
Apr 22 08:04:10 sdwbackup kernel:        <ffffffff803541d5>{do_page_fault+1029}
<ffffffff8012babd>{__wake_up+56}
Apr 22 08:04:10 sdwbackup kernel:       
<ffffffff8012ed18>{default_wake_function+0} <ffffffff80352315>{_spin_unlock_irq+9}
Apr 22 08:04:10 sdwbackup kernel:        <ffffffff8035105d>{thread_return+171}
<ffffffff8010b93d>{error_exit+0}
Apr 22 08:04:10 sdwbackup kernel:
Apr 22 08:04:10 sdwbackup kernel: Code: c0 74 07 66 83 78 04 00 75 0e 48 c7 c0
60 d1 3d 80 48 85 f6
Apr 22 08:04:10 sdwbackup kernel: RIP <ffffffff80177fae>{get_vma_policy+62} RSP
<ffff81006a363d78>
Apr 22 08:04:10 sdwbackup kernel: CR2: ffffabab31111136
Apr 22 08:04:10 sdwbackup kernel:  <1>Unable to handle kernel paging request at
ffffabab2347d846 RIP:
Apr 22 08:04:10 sdwbackup kernel: <ffffffff80177fae>{get_vma_policy+62}
Apr 22 08:04:10 sdwbackup kernel: PGD 0
Apr 22 08:04:10 sdwbackup kernel: Oops: 0002 [2] SMP
Apr 22 08:04:10 sdwbackup kernel: last sysfs file: /class/vc/vcsa7/dev
Apr 22 08:04:10 sdwbackup kernel: CPU 0
Apr 22 08:04:10 sdwbackup kernel: Modules linked in: parport_pc lp parport
autofs4 sunrpc acpi_cpufreq ipv6 dm_mod video button battery ac uhci_hcd
ehci_hcd e752x_edac edac_mc i2c_i801 i2c_core e1000 floppy ext3 jbd ata_piix
libata sd_mod scsi_mod
Apr 22 08:04:10 sdwbackup kernel: Pid: 2100, comm: nifd Not tainted
2.6.16-1.2096_FC4smp #1
Apr 22 08:04:10 sdwbackup kernel: RIP: 0010:[<ffffffff80177fae>]
<ffffffff80177fae>{get_vma_policy+62}
Apr 22 08:04:10 sdwbackup kernel: RSP: 0018:ffff8100781a5c08  EFLAGS: 00010246
Apr 22 08:04:10 sdwbackup kernel: RAX: 00002aaaaaaac000 RBX: ffff81007cfc9550
RCX: 0000000000000000
Apr 22 08:04:10 sdwbackup kernel: RDX: 00002aaaaaaac000 RSI: 0000000000000000
RDI: ffff8100789d17e0
Apr 22 08:04:10 sdwbackup kernel: RBP: 0000000000000000 R08: ffff81007d017420
R09: 2020202020202065
Apr 22 08:04:10 sdwbackup kernel: R10: 2020202020202020 R11: 207c2d7265746e49
R12: ffff81007d017420
Apr 22 08:04:10 sdwbackup kernel: R13: 00000000000280d2 R14: ffff81007d017420
R15: ffff810078999560
Apr 22 08:04:10 sdwbackup kernel: FS:  00002aaaaaabf7e0(0000)
GS:ffffffff8051e000(0000) knlGS:0000000000000000
Apr 22 08:04:10 sdwbackup kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Apr 22 08:04:10 sdwbackup kernel: CR2: ffffabab2347d846 CR3: 0000000078991000
CR4: 00000000000006e0
Apr 22 08:04:10 sdwbackup kernel: Process nifd (pid: 2100, threadinfo
ffff8100781a4000, task ffff8100789d17e0)
Apr 22 08:04:10 sdwbackup kernel: Stack: ffff81007d017420 ffffffff80178a91
ffff81007cfc9550 ffff81007cfc9550
Apr 22 08:04:10 sdwbackup kernel:        0000000000000000 00003ffffffff000
00002aaaaaaac000 ffffffff8016ca46
Apr 22 08:04:10 sdwbackup kernel:        0000000000000000 00000001801999d1
Apr 22 08:04:10 sdwbackup kernel: Call Trace:
<ffffffff80178a91>{alloc_page_vma+33} <ffffffff8016ca46>{__handle_mm_fault+439}
Apr 22 08:04:10 sdwbackup kernel:       
<ffffffff8019264a>{__link_path_walk+3503} <ffffffff803541d5>{do_page_fault+1029}
Apr 22 08:04:10 sdwbackup kernel:        <ffffffff80210b5c>{vsnprintf+1487}
<ffffffff801a282f>{seq_printf+102}
Apr 22 08:04:10 sdwbackup kernel:        <ffffffff80163346>{__alloc_pages+112}
<ffffffff8010b93d>{error_exit+0}
Apr 22 08:04:10 sdwbackup kernel:       
<ffffffff80211a5f>{copy_user_generic+63} <ffffffff801a2dcb>{seq_read+589}
Apr 22 08:04:10 sdwbackup kernel:        <ffffffff8018335d>{vfs_read+206}
<ffffffff80183e08>{sys_read+69}
Apr 22 08:04:10 sdwbackup kernel:        <ffffffff8010ab71>{tracesys+209}
Apr 22 08:04:10 sdwbackup kernel:
Apr 22 08:04:10 sdwbackup kernel: Code: c0 74 07 66 83 78 04 00 75 0e 48 c7 c0
60 d1 3d 80 48 85 f6
Apr 22 08:04:10 sdwbackup kernel: RIP <ffffffff80177fae>{get_vma_policy+62} RSP
<ffff8100781a5c08>
Apr 22 08:04:10 sdwbackup kernel: CR2: ffffabab2347d846
Apr 22 08:04:10 sdwbackup kernel:  <1>Unable to handle kernel paging request at
ffffabab2c687896 RIP:
Apr 22 08:04:10 sdwbackup kernel: <ffffffff80177fae>{get_vma_policy+62}
Apr 22 08:04:10 sdwbackup kernel: PGD 0
Apr 22 08:04:10 sdwbackup kernel: Oops: 0002 [3] SMP
Apr 22 08:04:10 sdwbackup kernel: last sysfs file: /class/vc/vcsa7/dev
Apr 22 08:04:10 sdwbackup kernel: CPU 0
Apr 22 08:04:10 sdwbackup kernel: Modules linked in: parport_pc lp parport
autofs4 sunrpc acpi_cpufreq ipv6 dm_mod video button battery ac uhci_hcd
ehci_hcd e752x_edac edac_mc i2c_i801 i2c_core e1000 floppy ext3 jbd ata_piix
libata sd_mod scsi_mod
Apr 22 08:04:10 sdwbackup kernel: Pid: 3147, comm: stress Not tainted
2.6.16-1.2096_FC4smp #1
Apr 22 08:04:10 sdwbackup kernel: RIP: 0010:[<ffffffff80177fae>]
<ffffffff80177fae>{get_vma_policy+62}
Apr 22 08:04:10 sdwbackup kernel: RSP: 0000:ffff81006a313d78  EFLAGS: 00010246
Apr 22 08:04:10 sdwbackup kernel: RAX: 00002aaab55c7010 RBX: ffff81006a31b550
RCX: 0000000000000000
Apr 22 08:04:10 sdwbackup kernel: RDX: 00002aaab55c7010 RSI: 0000000000000000
RDI: ffff8100770c0820
Apr 22 08:04:10 sdwbackup kernel: RBP: 0000000000000000 R08: ffff81006b209ee8
R09: 000000316d70c160
Apr 22 08:04:10 sdwbackup kernel: R10: 0000000000000000 R11: 0000000000000246
R12: ffff81006b209ee8
Apr 22 08:04:10 sdwbackup kernel: R13: 00000000000280d2 R14: ffff81006b209ee8
R15: ffff810037a1ae38
Apr 22 08:04:10 sdwbackup kernel: FS:  00002aaaaaabf720(0000)
GS:ffffffff8051e000(0000) knlGS:0000000000000000
Apr 22 08:04:10 sdwbackup kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Apr 22 08:04:10 sdwbackup kernel: CR2: ffffabab2c687896 CR3: 000000006a33c000
CR4: 00000000000006e0
Apr 22 08:04:10 sdwbackup kernel: Process stress (pid: 3147, threadinfo
ffff81006a312000, task ffff8100770c0820)
Apr 22 08:04:10 sdwbackup kernel: Stack: ffff81006b209ee8 ffffffff80178a91
ffff81006a31b550 ffff81006a31b550
Apr 22 08:04:10 sdwbackup kernel:        0000000000000000 00003ffffffff000
00002aaab55c7010 ffffffff8016ca46
Apr 22 08:04:10 sdwbackup kernel:        ffff81006a313f58 0000000100000000
Apr 22 08:04:10 sdwbackup kernel: Call Trace:
<ffffffff80178a91>{alloc_page_vma+33} <ffffffff8016ca46>{__handle_mm_fault+439}
Apr 22 08:04:10 sdwbackup kernel:        <ffffffff803541d5>{do_page_fault+1029}
<ffffffff8012babd>{__wake_up+56}
Apr 22 08:04:10 sdwbackup kernel:       
<ffffffff8012ed18>{default_wake_function+0} <ffffffff80352315>{_spin_unlock_irq+9}
Apr 22 08:04:10 sdwbackup kernel:        <ffffffff80350fb2>{thread_return+0}
<ffffffff80351010>{thread_return+94}
Apr 22 08:04:10 sdwbackup kernel:        <ffffffff8010b93d>{error_exit+0}
Apr 22 08:04:10 sdwbackup kernel:
Apr 22 08:04:10 sdwbackup kernel: Code: c0 74 07 66 83 78 04 00 75 0e 48 c7 c0
60 d1 3d 80 48 85 f6
Apr 22 08:04:10 sdwbackup kernel: RIP <ffffffff80177fae>{get_vma_policy+62} RSP
<ffff81006a313d78>
Apr 22 08:04:11 sdwbackup kernel: CR2: ffffabab2c687896
Apr 22 08:44:49 sdwbackup kernel:  <0>BUG: spinlock bad magic on CPU#1,
gnome-terminal/3577 (Not tainted)
Apr 22 08:44:49 sdwbackup kernel: Unable to handle kernel paging request at
000000010000011b RIP:
Apr 22 08:44:49 sdwbackup kernel: <ffffffff80213339>{spin_bug+138}
Apr 22 08:44:49 sdwbackup kernel: PGD 3009b067 PUD 0
Apr 22 08:44:49 sdwbackup kernel: Oops: 0000 [4] SMP
Apr 22 08:44:49 sdwbackup kernel: last sysfs file: /class/vc/vcs7/dev
Apr 22 08:44:49 sdwbackup kernel: CPU 1
Apr 22 08:44:49 sdwbackup kernel: Modules linked in: parport_pc lp parport
autofs4 sunrpc acpi_cpufreq ipv6 dm_mod video button battery ac uhci_hcd
ehci_hcd e752x_edac edac_mc i2c_i801 i2c_core e1000 floppy ext3 jbd ata_piix
libata sd_mod scsi_mod
Apr 22 08:44:49 sdwbackup kernel: Pid: 3577, comm: gnome-terminal Not tainted
2.6.16-1.2096_FC4smp #1
Apr 22 08:44:49 sdwbackup kernel: RIP: 0010:[<ffffffff80213339>]
<ffffffff80213339>{spin_bug+138}
Apr 22 08:44:49 sdwbackup kernel: RSP: 0018:ffff810045ebfec8  EFLAGS: 00010206
Apr 22 08:44:49 sdwbackup kernel: RAX: 0000000000000001 RBX: 00000000ffffffff
RCX: ffffffff803d62f8
Apr 22 08:44:49 sdwbackup kernel: RDX: 0000000000000001 RSI: 0000000000000292
RDI: ffffffff803d62e0
Apr 22 08:44:49 sdwbackup kernel: RBP: ffff8100704e1a70 R08: 0000000000000002
R09: 0000000000000000
Apr 22 08:44:49 sdwbackup kernel: R10: 000000003b9aca00 R11: ffff81007ee81b80
R12: ffffffff803865cc
Apr 22 08:44:49 sdwbackup kernel: R13: ffff81003f366be0 R14: 0000000000000c18
R15: 0000000000000000
Apr 22 08:44:49 sdwbackup kernel: FS:  00002aaaaaad53a0(0000)
GS:ffff81007fe26ec0(0000) knlGS:0000000000000000
Apr 22 08:44:49 sdwbackup kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 22 08:44:49 sdwbackup kernel: CR2: 000000010000011b CR3: 00000000551df000
CR4: 00000000000006e0
Apr 22 08:44:49 sdwbackup kernel: Process gnome-terminal (pid: 3577, threadinfo
ffff810045ebe000, task ffff8100555077e0)
Apr 22 08:44:49 sdwbackup kernel: Stack: ffff8100704e1a70 ffff8100704e1a78
0000000000000002 ffffffff80213655
Apr 22 08:44:49 sdwbackup kernel:        ffff8100704e1a70 ffffffff801bb733
ffff8100704e1a70 0000000000000c18
Apr 22 08:44:49 sdwbackup kernel:        0000000000000002 ffffffff80183593
Apr 22 08:44:49 sdwbackup kernel: Call Trace:
<ffffffff80213655>{_raw_spin_lock+25} <ffffffff801bb733>{dnotify_parent+32}
Apr 22 08:44:49 sdwbackup kernel:        <ffffffff80183593>{vfs_write+287}
<ffffffff80183e79>{sys_write+69}
Apr 22 08:44:49 sdwbackup kernel:        <ffffffff8010ab71>{tracesys+209}
Apr 22 08:44:49 sdwbackup kernel:
Apr 22 08:44:49 sdwbackup kernel: Code: 44 8b 83 1c 01 00 00 48 8d 8b 00 03 00
00 8b 55 04 41 89 c1
Apr 22 08:44:49 sdwbackup kernel: RIP <ffffffff80213339>{spin_bug+138} RSP
<ffff810045ebfec8>
Apr 22 08:44:49 sdwbackup kernel: CR2: 000000010000011b
Apr 22 08:47:04 sdwbackup kernel:  <0>BUG: spinlock lockup on CPU#1, X/3369,
ffff81007b3aa7b8 (Not tainted)
Apr 22 08:47:04 sdwbackup kernel:
Apr 22 08:47:04 sdwbackup kernel: Call Trace:
<ffffffff80213738>{_raw_spin_lock+252} <ffffffff801bb733>{dnotify_parent+32}
Apr 22 08:47:04 sdwbackup kernel:        <ffffffff801833ae>{vfs_read+287}
<ffffffff80183e08>{sys_read+69}
Apr 22 08:47:04 sdwbackup kernel:        <ffffffff8010ab71>{tracesys+209}





Crash #2 using stress 0.18.8 with --hdd 2:


Apr 22 17:06:06 sdwbackup kernel: gam_server[3368]: segfault at 0000000000000010
rip 00000031707229b1 rsp 00007fffffa09c50 error 4
Apr 22 17:07:09 sdwbackup kernel: gam_server[3457]: segfault at 0000000000000010
rip 00000031707229b1 rsp 00007fffffe360b0 error 4
Apr 22 17:08:16 sdwbackup kernel: Unable to handle kernel paging request at
00007fffffc33678 RIP:
Apr 22 17:08:16 sdwbackup kernel: [<00007fffffc33678>]
Apr 22 17:08:16 sdwbackup kernel: PGD 6b00a067 PUD 6aae4067 PMD 6aae6067 PTE
800000006aae7067
Apr 22 17:08:16 sdwbackup kernel: Oops: 0011 [1] SMP
Apr 22 17:08:16 sdwbackup kernel: last sysfs file: /class/vc/vcs7/dev
Apr 22 17:08:16 sdwbackup kernel: CPU 0
Apr 22 17:08:16 sdwbackup kernel: Modules linked in: parport_pc lp parport
autofs4 sunrpc acpi_cpufreq ipv6 dm_mod video button battery ac uhci_hcd
ehci_hcd e752x_edac edac_mc i2c_i801 i2c_core e1000 floppy ext3 jbd ata_piix
libata sd_mod scsi_mod
Apr 22 17:08:16 sdwbackup kernel: Pid: 3468, comm: gam_server Not tainted
2.6.16-1.2096_FC4smp #1
Apr 22 17:08:16 sdwbackup kernel: RIP: 0010:[<00007fffffc33678>]
[<00007fffffc33678>]
Apr 22 17:08:16 sdwbackup kernel: RSP: 0018:ffff81006ae61e50  EFLAGS: 00010246
Apr 22 17:08:16 sdwbackup kernel: RAX: 0000000000000000 RBX: 0000000000000000
RCX: 0000000000000000
Apr 22 17:08:16 sdwbackup kernel: RDX: 0000000000020003 RSI: ffff81006ae61e78
RDI: 00007fffffc337b0
Apr 22 17:08:16 sdwbackup kernel: RBP: ffff81006ae61f58 R08: 0000000000000000
R09: 0000000000000000
Apr 22 17:08:16 sdwbackup kernel: R10: 0000000000020003 R11: ffffffff80349ae1
R12: ffff810075e47678
Apr 22 17:08:16 sdwbackup kernel: R13: 0000000000000000 R14: 0000000000000000
R15: ffff81006ae61e78
Apr 22 17:08:16 sdwbackup kernel: FS:  00002aaaaaabf720(0000)
GS:ffffffff8051e000(0000) knlGS:0000000000000000
Apr 22 17:08:16 sdwbackup kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Apr 22 17:08:16 sdwbackup kernel: CR2: 00007fffffc33678 CR3: 000000006de52000
CR4: 00000000000006e0
Apr 22 17:08:16 sdwbackup kernel: Process gam_server (pid: 3468, threadinfo
ffff81006ae60000, task ffff810075e47080)
Apr 22 17:08:16 sdwbackup kernel: Stack: ffffffff8010a341 0000000000000282
0000000000000022 0000002200000296
Apr 22 17:08:16 sdwbackup kernel:        ffff810075e47080 0000000000000022
ffffffff00020003 0000000000000441
Apr 22 17:08:16 sdwbackup kernel:        ffff81000000000b 00000000000fffff
Apr 22 17:08:16 sdwbackup kernel: Call Trace: <ffffffff8010a341>{do_signal+619}
<ffffffff80182fd6>{do_sync_write+199}
Apr 22 17:08:16 sdwbackup kernel:        <ffffffff8010ac7a>{int_signal+18}
Apr 22 17:08:16 sdwbackup kernel:
Apr 22 17:08:16 sdwbackup kernel: Code:  Bad RIP value.
Apr 22 17:08:16 sdwbackup kernel: RIP [<00007fffffc33678>] RSP <ffff81006ae61e50>
Apr 22 17:08:16 sdwbackup kernel: CR2: 00007fffffc33678




Crash #3 using memtester-4.0.5 with "2":


Apr 22 20:39:03 sdwbackup kernel: Unable to handle kernel paging request at
ffff91007fac3e30 RIP:
Apr 22 20:39:03 sdwbackup kernel: <ffffffff801bb737>{dnotify_parent+36}
Apr 22 20:39:03 sdwbackup kernel: PGD 0
Apr 22 20:39:03 sdwbackup kernel: Oops: 0000 [1] SMP
Apr 22 20:39:03 sdwbackup kernel: last sysfs file: /class/vc/vcsa5/dev
Apr 22 20:39:03 sdwbackup kernel: CPU 0
Apr 22 20:39:03 sdwbackup kernel: Modules linked in: ipv6 parport_pc lp parport
autofs4 sunrpc acpi_cpufreq dm_mod video button battery ac uhci_hcd ehci_hcd
e752x_edac edac_mc i2c_i801 i2c_core e1000 floppy ext3 jbd ata_piix libata
sd_mod scsi_mod
Apr 22 20:39:03 sdwbackup kernel: Pid: 3202, comm: memtester Not tainted
2.6.16-1.2096_FC4smp #1
Apr 22 20:39:03 sdwbackup kernel: RIP: 0010:[<ffffffff801bb737>]
<ffffffff801bb737>{dnotify_parent+36}
Apr 22 20:39:03 sdwbackup kernel: RSP: 0018:ffff810022e39ef8  EFLAGS: 00010202
Apr 22 20:39:03 sdwbackup kernel: RAX: ffff81007dc7e7e0 RBX: ffff91007fac3e10
RCX: 0000000000000000
Apr 22 20:39:03 sdwbackup kernel: RDX: 00000000ffffffff RSI: 0000000000000002
RDI: ffff81007fd31a78
Apr 22 20:39:03 sdwbackup kernel: RBP: ffff81007fd31a78 R08: 0000000000000000
R09: ffff8100000bb566
Apr 22 20:39:03 sdwbackup kernel: R10: 000000000000001f R11: ffffffff802225ad
R12: 0000000000000002
Apr 22 20:39:03 sdwbackup kernel: R13: ffff81007ef43130 R14: 0000000000000016
R15: 0000000000000003
Apr 22 20:39:03 sdwbackup kernel: FS:  00002aaaaaabf1a0(0000)
GS:ffffffff8051e000(0000) knlGS:0000000000000000
Apr 22 20:39:03 sdwbackup kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Apr 22 20:39:03 sdwbackup kernel: CR2: ffff91007fac3e30 CR3: 0000000068370000
CR4: 00000000000006e0
Apr 22 20:39:03 sdwbackup kernel: Process memtester (pid: 3202, threadinfo
ffff810022e38000, task ffff81007dc7e7e0)
Apr 22 20:39:03 sdwbackup kernel: Stack: ffff81007fd31a70 0000000000000016
0000000000000002 ffffffff80183593
Apr 22 20:39:03 sdwbackup kernel:        ffff8100765d20c0 0000000000000016
fffffffffffffff7 00002aaaaaaac000
Apr 22 20:39:03 sdwbackup kernel:        00002aaaaaac0000 ffffffff80183e79
Apr 22 20:39:03 sdwbackup kernel: Call Trace: <ffffffff80183593>{vfs_write+287}
<ffffffff80183e79>{sys_write+69}
Apr 22 20:39:03 sdwbackup kernel:        <ffffffff8010ab71>{tracesys+209}
Apr 22 20:39:03 sdwbackup kernel:
Apr 22 20:39:03 sdwbackup kernel: Code: 48 8b 43 20 4c 85 a0 70 02 00 00 74 33
8b 03 85 c0 75 0a 0f
Apr 22 20:39:03 sdwbackup kernel: RIP <ffffffff801bb737>{dnotify_parent+36} RSP
<ffff810022e39ef8>
Apr 22 20:39:03 sdwbackup kernel: CR2: ffff91007fac3e30
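
The "Oops: NNNN" value in each dump above is the x86 page-fault error code printed in hex, and decoding it shows what kind of access failed. The sketch below uses the architectural bit meanings (bit 0 protection violation, bit 1 write, bit 2 user mode, bit 3 reserved bit, bit 4 instruction fetch). The function names and the `sample` text are illustrative; the sample lines are copied from the logs in this report.

```python
import re

OOPS_RE = re.compile(r"Oops: ([0-9a-fA-F]{4}) \[(\d+)\]")


def decode_pf_error(code):
    """Decode an x86 page-fault error code into readable parts."""
    present = "protection violation" if code & 0x01 else "page not present"
    if code & 0x10:
        access = "instruction fetch"
    elif code & 0x02:
        access = "write"
    else:
        access = "read"
    mode = "user mode" if code & 0x04 else "kernel mode"
    parts = [present, access, mode]
    if code & 0x08:
        parts.append("reserved bit set")
    return parts


def summarize(log_text):
    """Return (code string, oops counter, decoded parts) for each Oops line."""
    results = []
    for match in OOPS_RE.finditer(log_text):
        code = int(match.group(1), 16)
        results.append((match.group(1), int(match.group(2)), decode_pf_error(code)))
    return results


# Three Oops lines taken from the crash logs in this report.
sample = """\
Apr 22 08:04:10 sdwbackup kernel: Oops: 0002 [1] SMP
Apr 22 08:44:49 sdwbackup kernel: Oops: 0000 [4] SMP
Apr 22 17:08:16 sdwbackup kernel: Oops: 0011 [1] SMP
"""

for code, counter, parts in summarize(sample):
    print(code, counter, ", ".join(parts))
```

Read this way, the get_vma_policy oopses are kernel-mode writes to an unmapped address, the spin_bug and dnotify_parent oopses are kernel-mode reads of an unmapped address, and the crash #2 oops is a kernel-mode instruction fetch from a mapped but non-executable page, which matches the "Bad RIP value" pointing at a user-space address.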



Expected results:

No server crashes

Additional info:

I've been able to crash the server in single-user, multi-user, and X11 run modes.

Comment 1 Phil 2006-04-27 03:08:19 UTC
This problem is likely an issue with the L2 cache on the CPU. Dell is sending a
replacement server for this one, so hopefully that will resolve the issue.
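
One detail in the crash #3 dump fits a hardware explanation. RBX there is ffff91007fac3e10 (the faulting address CR2 is RBX + 0x20), while every other kernel data pointer in that dump starts with ffff8100. The sketch below compares the observed value with the value it would have had in that range. The `expected` value is an assumption made for illustration, not something the log states, and `flipped_bits` is a made-up helper name.

```python
def flipped_bits(observed, expected):
    """Return the bit positions (0 = least significant) where two values differ."""
    diff = observed ^ expected
    return [bit for bit in range(64) if (diff >> bit) & 1]


# RBX from the crash #3 register dump.
observed = 0xFFFF91007FAC3E10

# ASSUMPTION: the pointer was meant to lie in the ffff8100... range used by
# the other kernel pointers in the same dump (RAX, RDI, RBP, R13).
expected = 0xFFFF81007FAC3E10

bits = flipped_bits(observed, expected)
print(bits)
```

Under that assumption the two values differ in exactly one bit (bit 44). A single flipped bit in a pointer is consistent with corruption in a cache or memory path, though it does not prove it.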

