Bug 149543

Summary: The writepage() race
Product: Red Hat Enterprise Linux 3
Reporter: Wendy Cheng <nobody+wcheng>
Component: kernel
Assignee: Larry Woodman <lwoodman>
Status: CLOSED WONTFIX
Severity: high
Priority: high
Version: 3.0
CC: anderson, cel, k.georgiou, petrides, riel, sct, tao
Hardware: i686
OS: Linux
Doc Type: Bug Fix
Last Closed: 2007-10-19 19:07:07 UTC
Attachments:
Test program to reproduce the hang. (flags: none)

Description Wendy Cheng 2005-02-23 21:14:54 UTC

Description of problem:
There seems to be a race condition around the writepage() call between kswapd and kupdated (or any flushing thread) when NFS files are mmap()ed. The system either hangs or panics under heavy workload.
                                                                                                                      
(1) the hang:
                                                                                                                      
The hang can be reproduced (a test program will be uploaded) and made to disappear by hacking launder_page() as follows:
                                                                                                                      
--- vmscan.c.orig       2005-02-07 16:05:26.000000000 -0500
+++ vmscan.c    2005-02-07 16:22:19.000000000 -0500
@@ -384,7 +384,7 @@
       }
       pte_chain_unlock(page);
                                                                                                                      
-       if (PageDirty(page) && page->mapping) {
+       if (PageDirty(page) && is_page_cache_freeable(page) && page->mapping) {
               /*
                * The page can be dirtied after we start writing, but
                * in that case the dirty bit will simply be set again
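
To make the effect of the extra test easier to see, here is a minimal user-space model of the two conditions. It is only a sketch: struct model_page and the model_* / writepage_* names are made up for illustration, and the body of model_is_page_cache_freeable() assumes the helper works as in mainline 2.4 (the page cache holds the only reference, plus one for attached buffers). The RHEL 3 version may differ.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the few struct page fields the check looks at. */
struct model_page {
    int   count;     /* reference count, as page_count() would return it */
    void *buffers;   /* non-NULL when buffer heads are attached */
    void *mapping;   /* non-NULL when the page belongs to a file mapping */
    bool  dirty;     /* models PageDirty() */
};

/* ASSUMPTION: mirrors the mainline 2.4 helper. The page cache holds the only
 * reference, apart from one extra reference taken by attached buffers. */
static bool model_is_page_cache_freeable(const struct model_page *p)
{
    return p->count - (p->buffers != NULL) == 1;
}

/* Condition guarding the write-out before the experimental change. */
static bool writepage_before(const struct model_page *p)
{
    return p->dirty && p->mapping != NULL;
}

/* Condition guarding the write-out after the experimental change. */
static bool writepage_after(const struct model_page *p)
{
    return p->dirty && model_is_page_cache_freeable(p) && p->mapping != NULL;
}
```

Under this model, a dirty page that some other path still holds a reference to (count of 2, no buffers) passes the old condition but not the new one, so launder_page() would leave it alone instead of calling writepage() on it.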
                                                                                                                      
(2) the panic

We have two vmcores so far. The first panic:
                                                                                                                      
Unable to handle kernel paging request at virtual address aef90690
printing eip:
c0147ee0
*pde = 00000000
Oops: 0000
netconsole nfs lockd sunrpc autofs tg3 sg keybdev mousedev hid input usb-ohci usbcore ext3 jbd cciss sd_mod scsi_mod  
CPU:    2
EIP:    0060:[<c0147ee0>]    Not tainted
EFLAGS: 00010297

EIP is at filemap_fdatawait [kernel] 0x30 (2.4.21-27.ELsmp/i686)
eax: c03edb94   ebx: aef90690   ecx: 00000000   edx: 00000004
esi: e58b2a44   edi: e58b2a54   ebp: 00000000   esp: cb91df90
ds: 0068   es: 0068   ss: 0068
Process kupdated (pid: 14, stackpage=cb91d000)
Stack: f8a4978f 00000003 e58b2980 cd934c64 cd934c00 c017f7ae e58b2a44 00000000
      cb91c000 c02be0e3 cb91c555 00000000 c0169a68 00000000 00000000 cb91c000
      cb91c000 c0169e66 c03b3000 00000001 00000068 c0169d80 00000000 00000000
Call Trace:   [<f8a4978f>] nfs_write_inode [nfs] 0x2f (0xcb91df90)
[<c017f7ae>] sync_unlocked_inodes [kernel] 0xfe (0xcb91dfa4)
[<c0169a68>] sync_old_buffers [kernel] 0x28 (0xcb91dfc0)
[<c0169e66>] kupdate [kernel] 0xe6 (0xcb91dfd4)
[<c0169d80>] kupdate [kernel] 0x0 (0xcb91dfe4)
[<c01095ad>] kernel_thread_helper [kernel] 0x5 (0xcb91dff0

The second one:

Unable to handle kernel paging request at virtual address b318967b
printing eip:
c01545b1
*pde = 81a4000e
Oops: 0002
nfs lockd sunrpc netconsole autofs tg3 sg keybdev mousedev hid input usb-ohci usbcore ext3 jbd cciss sd_mod scsi_mod
CPU:    1
EIP:    0060:[<c01545b1>]    Not tainted
EFLAGS: 00010246
                                                                               
EIP is at launder_page [kernel] 0x81 (2.4.21-27.ELsmp/i686)
eax: c03a8240   ebx: c2efbfe4   ecx: c03a7080   edx: b318967b
esi: c2efc000   edi: 00000001   ebp: c03a7080   esp: cbb63f64
ds: 0068   es: 0068   ss: 0068
Process kswapd (pid: 11, stackpage=cbb63000)
Stack: c0169897 00000000 c2efbff8 00000000 0013882a c03a7080 00000001 00000040
      c015657b c03a7080 000001d0 c2efbfe4 c03a8240 00000026 c03a7080 00056aeb
      00000001 00000040 c0156b7b c03a7080 00000100 000001d0 00057a5b 00000000
Call Trace:   [<c0169897>] try_to_free_buffers [kernel] 0x147 (0xcbb63f64)
[<c015657b>] rebalance_dirty_zone [kernel] 0xab (0xcbb63f84)
[<c0156b7b>] do_try_to_free_pages_kswapd [kernel] 0x1eb (0xcbb63fac)
[<c0156ca8>] kswapd [kernel] 0x68 (0xcbb63fd0)
[<c0156c40>] kswapd [kernel] 0x0 (0xcbb63fe4)
[<c01095ad>] kernel_thread_helper [kernel] 0x5 (0xcbb63ff0)
                                                                               
Another flushing thread:
kupdated      S 00000003  3700    14      1            15    12 (L-TLB)
Call Trace:   [<c0123f14>] schedule [kernel] 0x2f4 (0xcb91df58)
[<c0134f65>] schedule_timeout [kernel] 0x65 (0xcb91df9c)
[<c016b261>] sync_supers [kernel] 0x131 (0xcb91dfa4)
[<c0134ef0>] process_timeout [kernel] 0x0 (0xcb91dfbc)
[<c0169e0f>] kupdate [kernel] 0x8f (0xcb91dfd4)
[<c0169d80>] kupdate [kernel] 0x0 (0xcb91dfe4)
[<c01095ad>] kernel_thread_helper [kernel] 0x5 (0xcb91dff0)

Version-Release number of selected component (if applicable):
kernel-2.4.21-27.EL

How reproducible:
Always

Steps to Reproduce:
(Thanks to Anthony Golia for this reproducer)

1. Mount a relatively large NFS file system; cd to it.
2. Compile mmapwrite.c (to be uploaded).
3. Run 10 of these processes for every gigabyte of RAM you have. For example, on an 8 GB box:

typeset -i i    # integer variable, so i=$i+1 below is evaluated arithmetically
i=0

while [[ $i -lt 80 ]]
do
  mmapwrite $i 100 > $i.out &
  i=$i+1
done

4. Wait a while (an hour or so). When free memory starts to drop, the hang occurs.
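
For readers without access to the attachment, the kind of load involved can be sketched as below. This is NOT the attached mmapwrite.c: dirty_mapped_file() and its parameters are hypothetical, and the meaning of the two command-line arguments in the loop above is not stated in this report. The sketch only shows the general technique of dirtying pages of a shared file mapping so that write-back has to go through the filesystem's writepage() path.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for the attached test program, not the real one.
 * Maps `npages` pages of `path` shared and writable, then dirties every
 * page `passes` times. Returns 0 on success, -1 on failure.
 */
int dirty_mapped_file(const char *path, size_t npages, int passes)
{
    long pagesize = sysconf(_SC_PAGESIZE);
    if (pagesize <= 0 || npages == 0)
        return -1;
    size_t len = npages * (size_t)pagesize;

    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return -1;

    /* Size the file first; touching pages past EOF would raise SIGBUS. */
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return -1;
    }

    unsigned char *map = mmap(NULL, len, PROT_READ | PROT_WRITE,
                              MAP_SHARED, fd, 0);
    close(fd);   /* the mapping keeps its own reference to the file */
    if (map == MAP_FAILED)
        return -1;

    for (int pass = 0; pass < passes; pass++)
        for (size_t off = 0; off < len; off += (size_t)pagesize)
            map[off] = (unsigned char)(pass + 1);   /* one store dirties the page */

    /* No msync(): write-back is left to kswapd and the flushing threads. */
    return munmap(map, len);
}
```

A small main() that passes a file on the NFS mount and a page count large enough to exceed free memory would turn this into a comparable load generator.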
  

Actual Results:  Hang or panic

Expected Results:  No hang/panic.

Additional info:

Comment 1 Wendy Cheng 2005-02-23 21:20:34 UTC
Created attachment 111352
Test program to reproduce the hang.

Comment 2 Wendy Cheng 2005-02-23 21:25:20 UTC
Created attachment 111353
An experimental kernel patch

Larry Woodman drafted a test patch for this issue, but it *doesn't* stop the
hang.

A test kernel with this patch has been installed on the customer's test machine;
waiting for their results.

Comment 4 Wendy Cheng 2005-02-24 15:55:56 UTC
Looks like GFS can trigger this too (a GFS crash via IT#65377):

Unable to handle kernel paging request at virtual address 00460032
printing eip:
c0154641
*pde = 00000000
Oops: 0002
netconsole gfs lock_gulm crc32 lock_harness pool e1000 bonding ipt_REJECT
ipt_state ip_conntrack iptable_filter ip_tables floppy microcode loop lvm-mod keybde
CPU:    0
EIP:    0060:[<c0154641>]    Not tainted
EFLAGS: 00010246

EIP is at launder_page [kernel] 0x81 (2.4.21-27.0.2.ELsmp/i686)
eax: c03a8240   ebx: c48e5fe0   ecx: c03a7080   edx: 00460032
esi: c48e5ffc   edi: 000011c4   ebp: c03a7080   esp: c6671f64
ds: 0068   es: 0068   ss: 0068
Process kswapd (pid: 11, stackpage=c6671000)
Stack: c0134ef0 00000000 c48e5ff4 00000000 0008bd88 c03a7080 000011c4 00000001
      c015660b c03a7080 000001d0 c48e5fe0 c03a8240 00000067 c03a7080 00026bdd
      00000001 00000040 c0156c0b c03a7080 00000100 000001d0 0002fb14 00000000
Call Trace:   [<c0134ef0>] process_timeout [kernel] 0x0 (0xc6671f64)
[<c015660b>] rebalance_dirty_zone [kernel] 0xab (0xc6671f84)
[<c0156c0b>] do_try_to_free_pages_kswapd [kernel] 0x1eb (0xc6671fac)
[<c0156d38>] kswapd [kernel] 0x68 (0xc6671fd0)
[<c0156cd0>] kswapd [kernel] 0x0 (0xc6671fe4)
[<c01095ad>] kernel_thread_helper [kernel] 0x5 

Comment 7 Wendy Cheng 2005-02-25 22:17:23 UTC
1) Conf call with Anthony Golia (the customer that has the hang) - we discussed
the possibility that the deadlock might be caused by the network stack being
too short of memory to ship the dirty pages back to the NFS server (which is
the only way to free these pages).

Two hours later, he reported that the hang went away via the following vm tuning:

vm.kswapd='16384 1024 256'
vm.pagecache='2 15 30'

2) The customer with the panic hasn't reported any new panics since running
Larry's patch (they used to crash on a daily basis).

In short, I think we're in good shape on the problem so far. Will double-check
the status sometime next week. 

Comment 9 Wendy Cheng 2005-03-08 16:26:14 UTC
The test kernel with Larry's patch passed the customer's QA testing and was
deployed to their production system this morning. Waiting for further results.
So far so good.

Comment 10 Larry Woodman 2005-03-11 03:33:12 UTC
Any further results on this testing?

Larry

Comment 14 Dave Anderson 2005-04-14 20:11:01 UTC
Here's the latest info that I posted in the associated IT, re: today's vmcore,
which was still running stock 2.4.21-27.  

In this dumpfile, kswapd was simply trying to free an available slab cache page:

crash> bt
PID: 11     TASK: cbb62000  CPU: 3   COMMAND: "kswapd"
#0 [cbb63cc0] netconsole_netdump at f8a1c77a
#1 [cbb63e58] try_crashdump at c0128c83
#2 [cbb63e68] die at c010c672
#3 [cbb63e7c] do_page_fault at c011fff9
#4 [cbb63f40] error_code (via page_fault) at c03f21c0
   EAX: cbb3f648  EBX: cbb3f638  ECX: e7756000  EDX: 00000000  EBP: 00000040
   DS:  0068      ESI: 0000071e  ES:  0068      EDI: cbb3f648
   CS:  0060      EIP: c0151a9f  ERR: ffffffff  EFLAGS: 00010016
#5 [cbb63f7c] __kmem_cache_shrink_locked at c0151a9f
#6 [cbb63f94] kmem_cache_shrink at c0151b74
#7 [cbb63fa0] shrink_dcache_memory at c017e008
#8 [cbb63fac] do_try_to_free_pages_kswapd at c0156adb
#9 [cbb63fd0] kswapd at c0156ca3
#10 [cbb63ff0] kernel_thread_helper at c01095ab
crash>

A full "kmem -s" showed that the dentry cache, which is the one
being shrunk, has a corrupted slab; just looking at it alone shows:

crash> kmem -s dentry_cache
CACHE    NAME                 OBJSIZE  ALLOCATED     TOTAL  SLABS  SSIZE
kmem: dentry_cache: free list: slab: e7756000  bad prev pointer: 0
kmem: dentry_cache: free list: slab: e7756000  bad s_mem pointer: 0
cbb3f638 dentry_cache             128      82228    118710   3957     4k
crash>

__kmem_cache_shrink_locked() was in the process of unlinking the last slab
contained on the dentry_cache's slabs_free list:

               slabp = list_entry(cachep->slabs_free.prev, slab_t, list);
#if DEBUG
               if (slabp->inuse)
                       BUG();
#endif
               list_del(&slabp->list);    <== oops occurred here

The dentry_cache's "slabs_free" chain head shows that its "prev" pointer
has just been updated with the contents of the "prev" pointer contained in
the slab at e7756000 that it is unlinking -- which is a zero (not good):

crash> kmem_cache_s cbb3f638
struct kmem_cache_s {
 slabs_full = {
   next = 0xe2dbf000,
   prev = 0xe678e000
 },
 slabs_partial = {
   next = 0xf24e8000,
   prev = 0xdc9d2000
 },
 slabs_free = {
   next = 0xd0a25000,
   prev = 0x0          <== copied from the prev pointer of the
 },                        slab being unlinked.
 objsize = 0x80,
 flags = 0x22000,
 num = 0x1e,
 spinlock = {
   lock = 0x0
 },
 batchcount = 0xfc,
 gfporder = 0x0,
 gfpflags = 0x0,
 colour = 0x0,
 colour_off = 0x80,
 colour_next = 0x0,
 slabp_cache = 0x0,
 growing = 0x0,
 dflags = 0x1,
 ctor = 0,
 dtor = 0,
 failures = 0x0,
 name = "dentry_cache\000\000\000\000\000\000\000",
...
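
The unlink that was in progress can be modelled in user space. The sketch below assumes list_del() is the usual 2.4 two-store sequence (next->prev = prev, then prev->next = next); model_list_del() is a made-up name, and unlike the kernel it checks for the NULL prev and reports it instead of faulting.

```c
#include <stdbool.h>
#include <stddef.h>

/* Same shape as the kernel's struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

/*
 * Model of list_del(), assumed to be:
 *     entry->next->prev = entry->prev;
 *     entry->prev->next = entry->next;
 * The second store is skipped when entry->prev is NULL, because in the
 * kernel that store is the write that faults.
 * Returns true if the unlink completed, false if it would have oopsed.
 */
static bool model_list_del(struct list_head *entry)
{
    struct list_head *prev = entry->prev;
    struct list_head *next = entry->next;

    next->prev = prev;      /* the list head's prev picks up the bad value */
    if (prev == NULL)
        return false;       /* kernel: write through a NULL pointer */
    prev->next = next;
    return true;
}
```

With an entry whose next points at the list head and whose prev is zero, the first store leaves the head with prev == NULL, which is the state seen in slabs_free above, and the second store is the one that would fault.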

And examining the actual slab_t being unlinked, the corruption is
evident.  The slab cache page should contain a slab_t data structure
at the beginning, followed by 30 dentry structures.  However, it
looks like this:

crash> rd e7756000 1024
e7756000:  cbb3f648 00000000 00000000 00000000   H...............
e7756010:  00000000 00000000 00000000 00000000   ................
e7756020:  00000000 00000000 00000000 00000000   ................
e7756030:  00000000 00000000 00000000 00000000   ................
e7756040:  00000000 00000000 00000000 00000000   ................
e7756050:  00000000 00000000 00000000 00000000   ................
e7756060:  00000000 00000000 00000000 00000000   ................
e7756070:  00000000 00000000 00000000 00000000   ................
e7756080:  00000000 00000000 00000000 00000000   ................
e7756090:  00000000 00000000 00000000 00000000   ................
e77560a0:  00000000 00000000 00000000 00000000   ................
e77560b0:  00000000 00000000 00000000 00000000   ................
e77560c0:  00000000 00000000 00000000 00000000   ................
e77560d0:  00000000 00000000 00000000 00000000   ................
e77560e0:  00000000 00000000 00000000 00000000   ................
e77560f0:  00000000 00000000 00000000 00000000   ................
e7756100:  00000000 00000000 00000000 00000000   ................
e7756110:  00000000 00000000 00000000 00000000   ................
e7756120:  00000000 00000000 00000000 00000000   ................
e7756130:  00000000 00000000 00000000 00000000   ................
e7756140:  00000000 00000000 00000000 00000000   ................
e7756150:  00000000 00000000 00000000 00000000   ................
e7756160:  00000000 00000000 00000000 00000000   ................
e7756170:  00000000 00000000 00000000 00000000   ................
e7756180:  00000000 00000000 00000000 00000000   ................
e7756190:  00000000 00000000 00000000 00000000   ................
e77561a0:  00000000 00000000 00000000 00000000   ................
e77561b0:  00000000 00000000 00000000 00000000   ................
e77561c0:  00000000 00000000 00000000 00000000   ................
e77561d0:  00000000 00000000 00000000 00000000   ................
e77561e0:  00000000 00000000 00000000 00000000   ................
e77561f0:  00000000 00000000 00000000 00000000   ................
e7756200:  00000000 00000000 00000000 00000000   ................
e7756210:  00000000 00000000 00000000 00000000   ................
e7756220:  00000000 00000000 00000000 00000000   ................
e7756230:  00000000 00000000 00000000 00000000   ................
e7756240:  00000000 00000000 00000000 00000000   ................
e7756250:  00000000 00000000 00000000 00000000   ................
e7756260:  00000000 00000000 00000000 00000000   ................
e7756270:  00000000 00000000 00000000 00000000   ................
e7756280:  00000000 00000000 00000000 00000000   ................
e7756290:  00000000 00000000 00000000 00000000   ................
e77562a0:  00000000 00000000 00000000 00000000   ................
e77562b0:  00000000 00000000 00000000 00000000   ................
e77562c0:  00000000 00000000 00000000 00000000   ................
e77562d0:  00000000 00000000 00000000 00000000   ................
e77562e0:  00000000 00000000 00000000 00000000   ................
e77562f0:  00000000 00000000 00000000 00000000   ................
e7756300:  00000000 00000000 00000000 00000000   ................
e7756310:  00000000 00000000 00000000 00000000   ................
e7756320:  00000000 00000000 00000000 00000000   ................
e7756330:  00000000 00000000 00000000 00000000   ................
e7756340:  00000000 00000000 046597b9 f8a6ad24   ..........e.$...
e7756350:  e4f9e400 00000000 00000000 00000000   ................
e7756360:  7119b738 7119b778 7119b7b8 7119b7f8   8..qx..q...q...q
e7756370:  7119b838 7119b878 00000000 7119b8f8   8..qx..q.......q
e7756380:  00000000 00000000 00000000 e0eff380   ................
e7756390:  e7756390 e7756390 e7756398 e7756398   .cu..cu..cu..cu.
e77563a0:  00000000 00000000 e77563a8 e77563a8   .........cu..cu.
e77563b0:  e77563b0 e77563b0 00000000 d54ab700   .cu..cu.......J.
e77563c0:  00000025 e8adef84 046597b9 f8a6ad24   %.........e.$...
e77563d0:  e4f9e400 00000000 00000000 00000000   ................
e77563e0:  711a1748 711a1cd0 711a1d10 711a1d50   H..q...q...qP..q
e77563f0:  711a1d90 711a1e10 00000000 711a1f30   ...q...q....0..q
e7756400:  00000000 00000000 00000000 e0eff380   ................
e7756410:  e7756410 e7756410 e7756418 e7756418   .du..du..du..du.
e7756420:  00000000 00000000 e7756428 e7756428   ........(du.(du.
e7756430:  e7756430 e7756430 00000000 d54ab780   0du.0du.......J.
e7756440:  00000025 e8869dd4 046597b9 f8a6ad24   %.........e.$...
e7756450:  e4f9e400 00000000 00000000 00000000   ................
e7756460:  711cec20 711cec60 711ceca0 711cece0    ..q`..q...q...q
e7756470:  711ced20 711ced60 00000000 711d8278    ..q`..q....x..q
e7756480:  00000000 00000000 00000000 e0eff380   ................
e7756490:  e7756490 e7756490 e7756498 e7756498   .du..du..du..du.
e77564a0:  00000000 00000000 e77564a8 e77564a8   .........du..du.
e77564b0:  e77564b0 e77564b0 00000000 d54ab800   .du..du.......J.
e77564c0:  00000025 e85f4c24 046597b9 f8a6ad24   %...$L_...e.$...
e77564d0:  e4f9e400 00000000 00000000 00000000   ................
e77564e0:  711d8938 711d89b8 711d89f8 711d8a38   8..q...q...q8..q
e77564f0:  711d8a78 711d8ab8 00000000 711d8b38   x..q...q....8..q
e7756500:  00000000 00000000 00000000 e0eff380   ................
e7756510:  e7756510 e7756510 e7756518 e7756518   .eu..eu..eu..eu.
e7756520:  00000000 00000000 e7756528 e7756528   ........(eu.(eu.
e7756530:  e7756530 e7756530 00000000 d54ab880   0eu.0eu.......J.
e7756540:  00000025 e837fa74 046597b9 f8a6ad24   %...t.7...e.$...
e7756550:  e4f9e400 00000000 00000000 00000000   ................
e7756560:  711d91f8 711d9238 71205e20 71205e60   ...q8..q ^ q`^ q
e7756570:  71205ea0 71205ee0 00000000 71205f60   .^ q.^ q....`_ q
e7756580:  00000000 00000000 00000000 e0eff380   ................
e7756590:  e7756590 e7756590 e7756598 e7756598   .eu..eu..eu..eu.
e77565a0:  00000000 00000000 e77565a8 e77565a8   .........eu..eu.
e77565b0:  e77565b0 e77565b0 00000000 d54ab900   .eu..eu.......J.
e77565c0:  00000025 e810a8c4 046597b9 f8a6ad24   %.........e.$...
e77565d0:  e4f9e400 00000000 00000000 00000000   ................
e77565e0:  71206b38 71206b78 71206bb8 71206ec0   8k qxk q.k q.n q
e77565f0:  71206f00 71206f40 00000000 71206fc0   .o q@o q.....o q
e7756600:  00000000 00000000 00000000 e0eff380   ................
e7756610:  e7756610 e7756610 e7756618 e7756618   .fu..fu..fu..fu.
e7756620:  00000000 00000000 e7756628 e7756628   ........(fu.(fu.
e7756630:  e7756630 e7756630 00000000 d54ab980   0fu.0fu.......J.
e7756640:  00000025 e7e95714 046597b9 f8a6ad24   %....W....e.$...
e7756650:  e4f9e400 00000000 00000000 00000000   ................
e7756660:  71207740 71207780 712077c0 71207800   @w q.w q.w q.x q
e7756670:  71207840 71207880 00000000 71208568   @x q.x q....h. q
e7756680:  00000000 00000000 00000000 e0eff380   ................
e7756690:  e7756690 e7756690 e7756698 e7756698   .fu..fu..fu..fu.
e77566a0:  00000000 00000000 e77566a8 e77566a8   .........fu..fu.
e77566b0:  e77566b0 e77566b0 00000000 d54aba00   .fu..fu.......J.
e77566c0:  00000025 e7c20564 046597b9 f8a6ad24   %...d.....e.$...
e77566d0:  e4f9e400 00000000 00000000 00000000   ................
e77566e0:  71209588 712095c8 71209608 71209648   .. q.. q.. qH. q
e77566f0:  71209688 712096c8 00000000 71209f10   .. q.. q...... q
e7756700:  00000000 00000000 00000000 e0eff380   ................
e7756710:  e7756710 e7756710 e7756718 e7756718   .gu..gu..gu..gu.
e7756720:  00000000 00000000 e7756728 e7756728   ........(gu.(gu.
e7756730:  e7756730 e7756730 00000000 d54aba80   0gu.0gu.......J.
e7756740:  00000025 e79ab3b4 046597b9 f8a6ad24   %.........e.$...
e7756750:  e4f9e400 00000000 00000000 00000000   ................
e7756760:  0000006c ffffffff 00000001 00000067   l...........g...
e7756770:  ffffffff 00000002 00000000 ffffffff   ................
e7756780:  00000000 00000000 00000000 e0eff380   ................
e7756790:  e7756790 e7756790 e7756798 e7756798   .gu..gu..gu..gu.
e77567a0:  00000000 00000000 e77567a8 e77567a8   .........gu..gu.
e77567b0:  e77567b0 e77567b0 00000000 d54abb00   .gu..gu.......J.
e77567c0:  00000025 e74c1054 046597b9 f8a6ad24   %...T.L...e.$...
e77567d0:  e4f9e400 00000000 00000000 00000000   ................
e77567e0:  00000051 00000023 ffffffff 00000066   Q...#.......f...
e77567f0:  0000003c ffffffff 00000000 00000078   <...........x...
e7756800:  00000000 00000000 00000000 e0eff380   ................
e7756810:  e7756810 e7756810 e7756818 e7756818   .hu..hu..hu..hu.
e7756820:  00000000 00000000 e7756828 e7756828   ........(hu.(hu.
e7756830:  e7756830 e7756830 00000000 d54abb80   0hu.0hu.......J.
e7756840:  00000025 e724bea4 046597b9 f8a6ad24   %.....$...e.$...
e7756850:  e4f9e400 00000000 00000000 00000000   ................
e7756860:  00000000 00000000 00000000 00000000   ................
e7756870:  00000000 00000000 00000000 00000000   ................
e7756880:  00000000 00000000 00000000 e0eff380   ................
e7756890:  e7756890 e7756890 e7756898 e7756898   .hu..hu..hu..hu.
e77568a0:  00000000 00000000 e77568a8 e77568a8   .........hu..hu.
e77568b0:  e77568b0 e77568b0 00000000 d54abc00   .hu..hu.......J.
e77568c0:  00000025 e6fd6cf4 046597b9 f8a6ad24   %....l....e.$...
e77568d0:  e4f9e400 00000000 00000000 00000000   ................
e77568e0:  00000000 00000000 00000000 00000000   ................
e77568f0:  00000000 00000000 00000000 00000000   ................
e7756900:  00000000 00000000 00000000 e0eff380   ................
e7756910:  e7756910 e7756910 e7756918 e7756918   .iu..iu..iu..iu.
e7756920:  00000000 00000000 e7756928 e7756928   ........(iu.(iu.
e7756930:  e7756930 e7756930 00000000 d54abc80   0iu.0iu.......J.
e7756940:  00000025 e6d61b44 046597b9 f8a6ad24   %...D.....e.$...
e7756950:  e4f9e400 00000000 00000000 00000000   ................
e7756960:  00000000 00000000 00000000 00000000   ................
e7756970:  00000000 00000000 00000000 00000000   ................
e7756980:  00000000 00000000 00000000 e0eff380   ................
e7756990:  e7756990 e7756990 e7756998 e7756998   .iu..iu..iu..iu.
e77569a0:  00000000 00000000 e77569a8 e77569a8   .........iu..iu.
e77569b0:  e77569b0 e77569b0 00000000 d54abd00   .iu..iu.......J.
e77569c0:  00000025 e6aec994 046597b9 f8a6ad24   %.........e.$...
e77569d0:  e4f9e400 00000000 00000000 00000000   ................
e77569e0:  00000000 00000000 00000000 00000000   ................
e77569f0:  00000000 00000000 00000000 00000000   ................
e7756a00:  00000000 00000000 00000000 e0eff380   ................
e7756a10:  e7756a10 e7756a10 e7756a18 e7756a18   .ju..ju..ju..ju.
e7756a20:  00000000 00000000 e7756a28 e7756a28   ........(ju.(ju.
e7756a30:  e7756a30 e7756a30 00000000 d54abd80   0ju.0ju.......J.
e7756a40:  00000025 e68777e4 046597b9 f8a6ad24   %....w....e.$...
e7756a50:  e4f9e400 00000000 00000000 00000000   ................
e7756a60:  00000000 00000000 00000000 00000000   ................
e7756a70:  00000000 00000000 00000000 00000000   ................
e7756a80:  00000000 00000000 00000000 e0eff380   ................
e7756a90:  e7756a90 e7756a90 e7756a98 e7756a98   .ju..ju..ju..ju.
e7756aa0:  00000000 00000000 e7756aa8 e7756aa8   .........ju..ju.
e7756ab0:  e7756ab0 e7756ab0 00000000 d54abe00   .ju..ju.......J.
e7756ac0:  00000025 e6602634 046597b9 f8a6ad24   %...4&`...e.$...
e7756ad0:  e4f9e400 00000000 00000000 00000000   ................
e7756ae0:  00000000 00000000 00000000 00000000   ................
e7756af0:  00000000 00000000 00000000 00000000   ................
e7756b00:  00000000 00000000 00000000 e0eff380   ................
e7756b10:  e7756b10 e7756b10 e7756b18 e7756b18   .ku..ku..ku..ku.
e7756b20:  00000000 00000000 e7756b28 e7756b28   ........(ku.(ku.
e7756b30:  e7756b30 e7756b30 00000000 d54abe80   0ku.0ku.......J.
e7756b40:  00000025 e638d484 046597b9 f8a6ad24   %.....8...e.$...
e7756b50:  e4f9e400 00000000 00000000 00000000   ................
e7756b60:  00000001 aeff6218 00000100 aeff16d8   .....b..........
e7756b70:  af0d3018 af1fac18 00000000 af1fac68   .0..........h...
e7756b80:  00000000 00000000 00000000 e0eff380   ................
e7756b90:  e7756b90 e7756b90 e7756b98 e7756b98   .ku..ku..ku..ku.
e7756ba0:  00000000 00000000 e7756ba8 e7756ba8   .........ku..ku.
e7756bb0:  e7756bb0 e7756bb0 00000000 d54abf00   .ku..ku.......J.
e7756bc0:  00000025 e61182d4 046597b9 f8a6ad24   %.........e.$...
e7756bd0:  e4f9e400 00000000 00000000 00000000   ................
e7756be0:  af235728 af235760 711f01e8 71205938   (W#.`W#....q8Y q
e7756bf0:  71206020 71205dd8 00000000 712061f8    ` q.] q.....a q
e7756c00:  00000000 00000000 00000000 e0eff380   ................
e7756c10:  e7756c10 e7756c10 e7756c18 e7756c18   .lu..lu..lu..lu.
e7756c20:  00000000 00000000 e7756c28 e7756c28   ........(lu.(lu.
e7756c30:  e7756c30 e7756c30 00000000 d54abf80   0lu.0lu.......J.
e7756c40:  00000025 e5ea3124 046597b9 f8a6ad24   %...$1....e.$...
e7756c50:  e4f9e400 00000000 00000000 00000000   ................
e7756c60:  00000000 00000000 00000000 00000000   ................
e7756c70:  00000000 00000000 00000000 00000000   ................
e7756c80:  00000000 00000000 00000000 e0eff380   ................
e7756c90:  e7756c90 e7756c90 e7756c98 e7756c98   .lu..lu..lu..lu.
e7756ca0:  00000000 00000000 e7756ca8 e7756ca8   .........lu..lu.
e7756cb0:  e7756cb0 e7756cb0 00000000 e20bf100   .lu..lu.........
e7756cc0:  00000025 e59b8dc4 046597b9 f8a6ad24   %.........e.$...
e7756cd0:  e4f9e400 00000000 00000000 00000000   ................
e7756ce0:  00000000 00000000 00000000 00000000   ................
e7756cf0:  00000000 00000000 00000000 00000000   ................
e7756d00:  00000000 00000000 00000000 e0eff380   ................
e7756d10:  e7756d10 e7756d10 e7756d18 e7756d18   .mu..mu..mu..mu.
e7756d20:  00000000 00000000 e7756d28 e7756d28   ........(mu.(mu.
e7756d30:  e7756d30 e7756d30 00000000 e20bf180   0mu.0mu.........
e7756d40:  00000025 e5743c14 046597b9 f8a6ad24   %....<t...e.$...
e7756d50:  e4f9e400 00000000 00000000 00000000   ................
e7756d60:  00000000 00000000 00000000 00000000   ................
e7756d70:  00000000 00000000 00000000 00000000   ................
e7756d80:  00000000 00000000 00000000 e0eff380   ................
e7756d90:  e7756d90 e7756d90 e7756d98 e7756d98   .mu..mu..mu..mu.
e7756da0:  00000000 00000000 e7756da8 e7756da8   .........mu..mu.
e7756db0:  e7756db0 e7756db0 00000000 e20bf200   .mu..mu.........
e7756dc0:  00000025 e54cea64 046597b9 f8a6ad24   %...d.L...e.$...
e7756dd0:  e4f9e400 00000000 00000000 00000000   ................
e7756de0:  00000000 00000000 00000000 00000000   ................
e7756df0:  00000000 00000000 00000000 00000000   ................
e7756e00:  00000000 00000000 00000000 e0eff380   ................
e7756e10:  e7756e10 e7756e10 e7756e18 e7756e18   .nu..nu..nu..nu.
e7756e20:  00000000 00000000 e7756e28 e7756e28   ........(nu.(nu.
e7756e30:  e7756e30 e7756e30 00000000 e20bf280   0nu.0nu.........
e7756e40:  00000025 e52598b4 046597b9 f8a6ad24   %.....%...e.$...
e7756e50:  e4f9e400 00000000 00000000 00000000   ................
e7756e60:  00000000 00000000 00000000 00000000   ................
e7756e70:  00000000 00000000 00000000 00000000   ................
e7756e80:  00000000 00000000 00000000 e0eff380   ................
e7756e90:  e7756e90 e7756e90 e7756e98 e7756e98   .nu..nu..nu..nu.
e7756ea0:  00000000 00000000 e7756ea8 e7756ea8   .........nu..nu.
e7756eb0:  e7756eb0 e7756eb0 00000000 e20bf300   .nu..nu.........
e7756ec0:  00000025 e4fe4704 046597b9 f8a6ad24   %....G....e.$...
e7756ed0:  e4f9e400 00000000 00000000 00000000   ................
e7756ee0:  00000000 00000000 00000000 00000000   ................
e7756ef0:  00000000 00000000 00000000 00000000   ................
e7756f00:  00000000 00000000 00000000 e0eff380   ................
e7756f10:  e7756f10 e7756f10 e7756f18 e7756f18   .ou..ou..ou..ou.
e7756f20:  00000000 00000000 e7756f28 e7756f28   ........(ou.(ou.
e7756f30:  e7756f30 e7756f30 00000000 e20bf380   0ou.0ou.........
e7756f40:  00000025 e4d6f554 046597b9 f8a6ad24   %...T.....e.$...
e7756f50:  e4f9e400 00000000 00000000 00000000   ................
e7756f60:  00000000 00000000 00000000 00000000   ................
e7756f70:  00000001 af1ee720 00000000 00000000   .... ...........
e7756f80:  00000000 00000000 00000000 e0eff380   ................
e7756f90:  e7756f90 e7756f90 e7756f98 e7756f98   .ou..ou..ou..ou.
e7756fa0:  00000000 00000000 e7756fa8 e7756fa8   .........ou..ou.
e7756fb0:  e7756fb0 e7756fb0 00000000 e20bf400   .ou..ou.........
e7756fc0:  00000025 e4afa3a4 046597b9 f8a6ad24   %.........e.$...
e7756fd0:  e4f9e400 00000000 00000000 00000000   ................
e7756fe0:  00000000 00000000 00000000 00000000   ................
e7756ff0:  00000000 00000000 00000000 00000000   ................
crash>

Obviously the first 0x348 (840) bytes have been corrupted (i.e.,
up to e7756348), *except* for the first word in the buffer (cbb3f648),
which correctly points back to the "slabs_free" list_head in the
dentry_cache's kmem_cache_s.  But the second (and subsequent) words
containing zeroes are the problem; the second word is supposed to contain
the prev pointer back to the previous slab_t in the chain.  That value was
transferred to dentry_cache's slabs_free.prev as seen above, and when
list_del() then tried to use that NULL pointer, the oops occurred.
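
As a quick arithmetic check using only the numbers in the dumps above (4k slab, objsize 0x80, num 0x1e, 0x348 zeroed bytes), the zeroed region is larger than the space left over for slab management, so it must also reach into the dentry objects. The helper names below are made up for illustration; the exact offset of the first object is not assumed.

```c
#include <stddef.h>

/* Values taken from the kmem_cache_s and "kmem -s" output above. */
enum {
    SLAB_BYTES    = 4096,   /* SSIZE 4k */
    OBJ_SIZE      = 0x80,   /* objsize */
    OBJS_PER_SLAB = 0x1e,   /* num */
    ZEROED_BYTES  = 0x348   /* corrupted prefix of the slab page */
};

/* Bytes of the slab page not occupied by dentry objects; the slab_t, its
 * free-object index array and any colouring have to fit in here. */
static size_t slab_overhead_bytes(void)
{
    return SLAB_BYTES - (size_t)OBJS_PER_SLAB * OBJ_SIZE;
}

/* Lower bound on how many bytes of dentry storage the zeroed prefix covers,
 * whatever the exact offset of the first object is. */
static size_t min_object_bytes_zeroed(void)
{
    size_t overhead = slab_overhead_bytes();
    return ZEROED_BYTES > overhead ? ZEROED_BYTES - overhead : 0;
}
```

That is 30 * 128 = 3840 bytes of objects, leaving 256 bytes of overhead, so at least 840 - 256 = 584 bytes of dentry storage were zeroed along with the slab_t.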

What is kind of interesting, but not illuminating in any way, is that the
first word in the corrupted slab contains a valid pointer.  But the remaining
set of zeroed-out memory is completely unexplainable.  It is somewhat
reminiscent of the /proc/kcore issue, but in that case the last 496 bytes of
a task_struct would be copied over the first 496 bytes of a slab page (or any
other unlucky page for that matter).  But there would be a definite signature
in the corruption data that could be recognized, i.e., not a bunch of zeroes,
but rather recognizable task_struct data.

In any case, I haven't a clue as to how the slab page got corrupted
in such a manner.  There are other "known" corrupters out there,
but none with this signature.  That being said, I still wish that they would
please upgrade to 2.4.21-31 from 2.4.21-27, so that we can debug from
a current kernel.

Also, the messages just prior to the oops are a bit troubling, and they
harken back to the NFS discussions earlier in this case:

...
NFS: Buggy server - nlink == 0!
__nfs_fhget: iget failed
NFS: Buggy server - nlink == 0!
__nfs_fhget: iget failed
Unable to handle kernel NULL pointer dereference at virtual address 00000000
printing eip:
c0151a9f
*pde = 1300f001
*pte = 00000000
Oops: 0002
...

Did those messages appear prior to any of the other crashes?

Comment 16 Wendy Cheng 2005-04-19 15:23:23 UTC
The original problem traces did show race conditions between kswapd and kupdated
across three different customer reports. However, with Larry's patch, we got
different results:

1. IT65377 (GFS panic): the customer pounded heavily on the test kernel built
with Larry's patch, and the system held up. They seem happy and have asked
for the patch to be included in our formal releases.
2. IT65627: the system still hits different panics and crashes (as
described in the previous update by Dave Anderson).
3. IT65377 (hang in kswapd/kupdated): the problem proved to be a deadlock:
kswapd tried to sync dirty pages back to the NFS server, but the network
stack ran out of memory (kmalloc). I've removed this IT ticket from this
bugzilla.

I would suggest we keep this bugzilla for IT65377 (so Larry can prepare to get the
patch into a formal release) and open other bugzillas for (2) and (3)?

Comment 23 Ernie Petrides 2005-10-17 21:58:51 UTC
Apparently IT 65377 was not related to this bug, so unlinking it.

Comment 24 RHEL Program Management 2007-10-19 19:07:07 UTC
This bug is filed against RHEL 3, which is in maintenance phase.
During the maintenance phase, only security errata and select mission
critical bug fixes will be released for enterprise products. Since
this bug does not meet that criteria, it is now being closed.
 
For more information on the RHEL errata support policy, please visit:
http://www.redhat.com/security/updates/errata/
 
If you feel this bug is indeed mission critical, please contact your
support representative. You may be asked to provide detailed
information on how this bug is affecting you.