From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20050920

Description of problem:
I haven't found this particular stack trace in any of the bugs reported regarding page_referenced. I may not have looked at all reported bugs, but I am choosing to report our kernel crash here:

  SYSTEM MAP: ../rh_3.1/System.map-2.4.21-9.ELsmp
DEBUG KERNEL: ../rh_3.1/vmlinux_smp_dbg (2.4.21-9.ELsmp)
    DUMPFILE: ./vmcore
        CPUS: 4
        DATE: Tue Nov 15 09:08:40 2005
      UPTIME: 129 days, 12:38:58
LOAD AVERAGE: 1.41, 1.24, 1.10
       TASKS: 335
    NODENAME: XXXXX
     RELEASE: 2.4.21-9.ELsmp
     VERSION: #1 SMP Thu Jan 8 17:08:56 EST 2004
     MACHINE: i686 (3065 Mhz)
      MEMORY: 3 GB
       PANIC: ""
         PID: 11
     COMMAND: "kswapd"
        TASK: c3d70000
         CPU: 1
       STATE: TASK_RUNNING (PANIC)

The kernel crashes in page_referenced:

Unable to handle kernel NULL pointer dereference at virtual address 00000084
 printing eip:
c015c67d
*pde = 256f4001
*pte = 00000000
Oops: 0000
Tam ocl cpqci netconsole autofs bcm5700 bonding 8021q sg microcode keybdev mousedev hid input usb-ohci usbcore ext3 jbd qla2300_conf cciss sd_mod scsi_mod
CPU:    1
EIP:    0060:[<c015c67d>]    Tainted: P
EFLAGS: 00010216

Stack trace:

PID: 11  TASK: c3d70000  CPU: 1  COMMAND: "kswapd"
 #0 [c3d71c70] netconsole_netdump at fe78263e
 #1 [c3d71e04] die at c010c4b1
 #2 [c3d71e18] do_page_fault at c011f785
 #3 [c3d71edc] error_code (via page_fault) at c03f21b0
    EAX: c29127cc  EBX: fff94fd8  ECX: 00000000  EDX: c100002c  EBP: 00000fd8
    DS:  0068      ESI: c29127cc  ES:  0068      EDI: c100002c
    CS:  0060      EIP: c015c67d  ERR: ffffffff  EFLAGS: 00010216
 #4 [c3d71f18] page_referenced at c015c67d
 #5 [c3d71f4c] refill_inactive_zone at c015269d
 #6 [c3d71f98] rebalance_inactive_zone at c0153518
 #7 [c3d71fac] do_try_to_free_pages_kswapd at c01537f0
 #8 [c3d71fd0] kswapd at c0153a33
 #9 [c3d71ff0] kernel_thread_helper at c010958b

I think it crashes at the instruction at eip=c015c67d:

0xc015c677 <page_referenced+199>: lea (%eax,%edx,4),%eax ; eax is of type struct page
0xc015c67a <page_referenced+202>: mov 0x8(%eax),%ecx     ; ecx=mm=page->mapping
0xc015c67d <page_referenced+205>: mov 0x84(%ecx),%eax    ; eax=mm->rlimit_rss  <===

and thus in the source:

int page_referenced(struct page * page, int * rsslimit)
{
        int referenced = 0, under_rsslimit = 0;
        struct mm_struct * mm;
        struct pte_chain * pc;

        if (PageTestandClearReferenced(page))
                referenced++;

        if (PageDirect(page)) {
                pte_t *pte = rmap_ptep_map(page->pte.direct);
                if (pte_young(*pte) && ptep_test_and_clear_young(pte))
                        referenced++;

                mm = ptep_to_mm(pte);                  <===
                if (mm->rss < mm->rlimit_rss)          <===
                        under_rsslimit++;
                rmap_ptep_unmap(pte);
        } else {
                ...
        }

The problem is that ptep_to_mm() returns 0 (ECX is 00000000 in the register dump above), so mm is NULL when mm->rlimit_rss is read.

crash> p/x &((struct mm_struct*)0)->rlimit_rss
$14 = 0x84

Thus, it crashes at address 0x84, as the oops message reports.

Version-Release number of selected component (if applicable):
kernel 2.4.21-9.ELsmp

How reproducible:
Didn't try

Additional info:
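The offset arithmetic above can be checked without the debug vmlinux. The following is only a user-space sketch: struct mm_sketch is a hypothetical stand-in, not the real struct mm_struct, and its padding is chosen so that rlimit_rss lands at the 0x84 offset that crash printed. It shows why reading a member through a NULL base pointer faults at an address equal to the member's offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for struct mm_struct (NOT the kernel definition).
 * The fields that precede rlimit_rss are collapsed into opaque padding so
 * that rlimit_rss sits at offset 0x84, matching the crash output.
 * uint32_t mirrors the 32-bit unsigned long of the i686 kernel.
 */
struct mm_sketch {
        unsigned char earlier_fields[0x84];
        uint32_t rlimit_rss;
};

/*
 * Address the CPU touches when it evaluates mm->rlimit_rss for a structure
 * located at mm_base. With mm_base == 0 (a NULL mm) the result is just the
 * member offset.
 */
static uintptr_t rlimit_rss_address(uintptr_t mm_base)
{
        return mm_base + offsetof(struct mm_sketch, rlimit_rss);
}
```

Calling rlimit_rss_address(0) gives 0x84, the virtual address named in the oops, which is consistent with mm being NULL rather than pointing at a corrupted mm_struct.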
Created attachment 121486
some data structures I extracted from the dump
This problem was fixed in RHEL3 U3 with a change committed on 25-Jun-2004. Please upgrade to a recent kernel (latest is U6, kernel version 2.4.21-37.EL).