Bug 189508 - Running SPECsfs (NFS) against a 4-socket dual-core w/HT Xeon server running largeSMP (16 logical CPUs) panics during data laydown to 16 ext3 filesystems
Status: CLOSED WORKSFORME
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.0
Hardware: x86_64 Linux
Priority: high   Severity: high
Assigned To: Eric Sandeen
QA Contact: Brian Brock
Keywords: Regression
Depends On:
Blocks:
 
Reported: 2006-04-20 13:05 EDT by Barry Marson
Modified: 2007-11-30 17:07 EST
CC List: 6 users

Doc Type: Bug Fix
Last Closed: 2006-11-17 13:51:54 EST

Attachments
crash log file for SPECsfs run RHEL4-U3-largeSMP 16CPU 256 nfsd threads ext3 (282.43 KB, text/plain)
2006-05-06 00:16 EDT, Barry Marson

Description Barry Marson 2006-04-20 13:05:17 EDT
Description of problem:

The kernel is panicking when running the SPECsfs (NFS) benchmark.  The
configuration is 4 clients, each communicating with the server over GigE with
jumbo frames enabled.  Each client communicates with an exclusive NIC/subnet on
the server.

The server is a 4-socket, dual-core, hyper-threaded Xeon box with 16GB of
memory.  The kernel is RHEL4-U3 x86_64 largeSMP, since we are at 16 logical
CPUs.  There are 16 ext3 filesystems, created from storage presented by 2
dual-ported FC adapters and 4 HP MSAs.  Each MSA presents a single large LUN
that is partitioned by RHEL into 4 partitions.  There are 256 nfsd threads
running.
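
(For reference, a minimal sketch of how this kind of server-side setup is
typically done on RHEL 4; the interface names, export paths, and client subnets
below are illustrative assumptions, not the actual benchmark configuration:)

   # enable jumbo frames on each per-client NIC (interface name assumed)
   ifconfig eth1 mtu 9000
   # export each ext3 filesystem to its client's subnet (path/subnet assumed)
   echo '/export/sfs01 192.168.1.0/24(rw,sync,no_root_squash)' >> /etc/exports
   exportfs -ra
   # 256 nfsd threads; the RHEL 4 nfs init script reads RPCNFSDCOUNT
   # from /etc/sysconfig/nfs at service start
   echo 'RPCNFSDCOUNT=256' >> /etc/sysconfig/nfs
   service nfs restart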

I have set up netdump, but only seem to be able to get the console log of the
failure.  I'm investigating why we don't get a core file.  Running the benchmark
with the largeSMP kernel but with only 8 logical processors booted succeeds.  The
console log is in the "Actual results" below.
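
(For completeness, the usual RHEL 4 netdump client setup is roughly the
following; the netdump server address is a placeholder assumption:)

   # /etc/sysconfig/netdump on the server under test
   NETDUMPADDR=10.0.0.1     # IP of the host running netdump-server (placeholder)
   # then enable and start the client service
   chkconfig netdump on
   service netdump start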


Version-Release number of selected component (if applicable):

RHEL4-U3 largeSMP

How reproducible:

So far, every time.  It can take anywhere from one to five hours, depending on
which workload the benchmark is laying down.

Steps to Reproduce:
1. See Barry Marson.  The procedure is very simple.
  
Actual results:

Unable to handle kernel NULL pointer dereference at 0000000000000020 RIP:
<ffffffffa00aefc6>{:jbd:journal_dirty_metadata+71}
PML4 3f0302067 PGD 3f03e4067 PMD 3eeac4067 PTE 0
Oops: 0000 [1] SMP
CPU 13
Modules linked in: nfsd exportfs lockd nfs_acl md5 ipv6 parport_pc lp parport
netconsole netdump autofs4 i2c_dev i2c_core sunrpc ds yenta_socket pcmcia_core
dm_multipath button battery ac uhci_hcd ehci_hcd hw_random e1000 tg3 dm_snapshot
dm_zero dm_mirror ext3 jbd dm_mod qla2300 qla2xxx scsi_transport_fc cciss sd_mod
scsi_mod
Pid: 9262, comm: nfsd Not tainted 2.6.9-34.ELlargesmp
RIP: 0010:[<ffffffffa00aefc6>] <ffffffffa00aefc6>{:jbd:journal_dirty_metadata+71}
RSP: 0018:000001025d071b58  EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000010070baa6c0 RCX: 00000000ffffffff
RDX: 000000000000000f RSI: 000001021c6a39d0 RDI: 00000101dbe4d580
RBP: 000001021c6a39d0 R08: 0000000000000000 R09: 000001000f10af80
R10: 000001021c6a39d0 R11: 000001021c6a39d0 R12: 0000000000000000
R13: 00000103ff737e00 R14: 00000101dbe4d580 R15: 000001037b38ba08
FS:  0000002a958a0b00(0000) GS:ffffffff804eb100(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000020 CR3: 00000000dfc7a000 CR4: 00000000000006e0
Process nfsd (pid: 9262, threadinfo 000001025d070000, task 00000103fe0cb7f0)
Stack: 000001039f334000 0000000000000000 000001037b38bb18 000001037b38bb18
       0000000000008000 ffffffffa00c68f5 000001021c6a39d0 000001025d071bd8
       00000101dbe4d580 000001037b38bb18
Call Trace:<ffffffffa00c68f5>{:ext3:ext3_mark_iloc_dirty+740}
       <ffffffffa00c6a47>{:ext3:ext3_mark_inode_dirty+65}
       <ffffffffa00c4f3a>{:ext3:ext3_new_inode+2867}
<ffffffffa00ae3c4>{:jbd:start_this_handle+964}
       <ffffffff8013347f>{__wake_up+54} <ffffffffa00cafe0>{:ext3:ext3_create+102}
       <ffffffff80185a77>{vfs_create+214}
<ffffffffa0282f89>{:nfsd:nfsd_create_v3+811}
       <ffffffffa0289bdc>{:nfsd:nfsd3_proc_create+307}
<ffffffffa027d7bd>{:nfsd:nfsd_dispatch+219}
       <ffffffffa019a39e>{:sunrpc:svc_process+1197}
<ffffffff801333d8>{default_wake_function+0}
       <ffffffffa027d2fc>{:nfsd:nfsd+0} <ffffffffa027d534>{:nfsd:nfsd+568}
       <ffffffff8013212e>{schedule_tail+55} <ffffffff80110e17>{child_rip+8}
       <ffffffffa027d2fc>{:nfsd:nfsd+0} <ffffffffa027d2fc>{:nfsd:nfsd+0}
       <ffffffff80110e0f>{child_rip+0}

Code: 49 39 5c 24 20 75 4b 41 83 7c 24 0c 02 75 43 49 3b 5d 50 0f
RIP <ffffffffa00aefc6>{:jbd:journal_dirty_metadata+71} RSP <000001025d071b58>
CR2: 0000000000000020

Expected results:


Additional info:

Whatever is failing is not occurring right before the panic.  One side effect of
running this benchmark with 16 logical processors is that the rate at which data
laydown occurs is significantly diminished: in the 8-CPU config, the data rate is
about 200MB/sec, while with 16 CPUs it drops to 130MB/sec (roughly a 35% drop).
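
(A note on decoding the oops above: with the matching kernel-debuginfo package
installed, the faulting address journal_dirty_metadata+71 can be resolved to a
source line with gdb; the debuginfo module path below is an assumption based on
the usual RHEL layout:)

   # 71 decimal == 0x47
   gdb /usr/lib/debug/lib/modules/2.6.9-34.ELlargesmp/kernel/fs/jbd/jbd.ko
   (gdb) list *(journal_dirty_metadata + 0x47)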
Comment 2 Jay Turner 2006-05-02 21:44:44 EDT
Any clue if this is a regression?
Comment 3 Jay Turner 2006-05-02 21:47:57 EDT
QE ack for fixing this in U4.  It is significant enough to warrant resolving
after the code freeze, in my opinion.
Comment 4 Steve Dickson 2006-05-05 06:21:47 EDT
If I were going to speculate, I would say no, this is not a regression, since
this is the first time we've tried to run the SPECsfs benchmark.

But with that said, a server crash has recently been reported on the nahant
mailing list that happened in a different place but in the same journal code...

https://www.redhat.com/archives/nahant-list/2006-May/msg00041.html
Comment 5 Barry Marson 2006-05-06 00:16:54 EDT
Created attachment 128685 [details]
crash log file for SPECsfs run RHEL4-U3-largeSMP 16CPU 256 nfsd threads ext3
Comment 6 Barry Marson 2006-05-06 00:24:11 EDT
Running with the NFSD thread count set to 64 ran to completion, but the results
are not comparable yet because we don't have enough threads to handle the
incoming requests at the high end of the benchmark.

The attachment below is a crash log from running with 128 NFSD threads.  We
got significantly more crash stack data.  vmcore-incomplete was created, but no
data was written to it.

I will try to reduce the thread count to a level where I can get a core dump.

I successfully ran with ext2 and 256 NFSD threads, but the performance was
abysmal: negative scaling with 16 CPUs compared to the 8-CPU run.
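
(For the thread-count experiments above, the running nfsd thread count can be
changed without a full service restart; a minimal sketch, assuming the nfsd
filesystem is mounted at /proc/fs/nfsd:)

   # either form sets the number of kernel nfsd threads
   rpc.nfsd 64
   echo 64 > /proc/fs/nfsd/threads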
Comment 7 Barry Marson 2006-05-06 00:28:59 EDT
Man, it's getting late.  An early run today that I thought had died due to the
benchmark (that sometimes happens) actually died from the panic.  This time I
have a log AND a vmcore.  It's 16GB; what do I do with it?  This was with 128
NFSD threads :)

barry
Comment 9 Barry Marson 2006-05-13 11:14:33 EDT
The core file has been pushed to

http://ubrew.boston.redhat.com/benchmarks/SPEC/SPECsfs/bugzilla-189508/

It's compressed to under 1GB but expands to 16GB on the test system.  I have
verified that it is accessible.

Barry
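
(For anyone picking up the vmcore, a minimal crash(8) session against it might
look like the following; the debuginfo vmlinux path is an assumption based on
the usual RHEL layout:)

   crash /usr/lib/debug/lib/modules/2.6.9-34.ELlargesmp/vmlinux vmcore
   crash> bt                               # backtrace of the panicking nfsd task
   crash> log                              # kernel log ring buffer from the dump
   crash> mod -S                           # load debuginfo for jbd/ext3/nfsd modules
   crash> dis -l journal_dirty_metadata    # disassembly annotated with source lines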
Comment 16 Eric Sandeen 2006-11-15 13:39:33 EST
Barry, can you reproduce this one?  I have a hunch that it is a dup of
Bugzilla Bug 199667 ("ext3 file system crashed in my IA64 box"), which is fixed
in the latest release.  Since you said you could reproduce it at will, it should
be easy enough to verify that it is fixed.

Thanks,
-Eric
Comment 17 Barry Marson 2006-11-16 21:33:56 EST
I have not been able to reproduce this.  The system was re-installed this past
summer, and even though the bits were technically the same, the problem, which
used to be easily recreated, no longer occurs.  Nor does it occur with RHEL4-U4.

I'm closing this since I cannot recreate it.

Barry
Comment 18 Eric Sandeen 2006-11-17 13:51:54 EST
worksforbarry :)
