From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4.1) Gecko/20031114

Description of problem:
our P4-based file server, which has never kernel panicked, oopsed, or crashed in a year and a half of use, had a kernel oops a few days ago. here is the relevant output from dmesg:

swap_free: Bad swap file entry 18f56044
Unable to handle kernel NULL pointer dereference at virtual address 00000070
 printing eip:
c013ade2
*pde = 00000000
Oops: 0000
smbfs radeon agpgart binfmt_misc nfs nfsd lockd sunrpc autofs e1000 ide-scsi ide-cd cdrom ehci-hcd usb-ohci usb-uhci usbcore ext3 jbd raid5 xor ncr53c8xx sd_m
CPU:    0
EIP:    0010:[<c013ade2>]    Not tainted
EFLAGS: 00010202

EIP is at page_referenced [kernel] 0x242 (2.4.20-24.7)
eax: c1d84ec8   ebx: c1000030   ecx: 00000000   edx: 00000000
esi: f6535980   edi: 0000001d   ebp: c1575b00   esp: c34b1f7c
ds: 0018   es: 0018   ss: 0018
Process kscand (pid: 6, stackpage=c34b1000)
Stack: 00000001 00000000 00000000 c34b1fac c1575b1c c1575b00 00000003 000001f4
       c0133a9d c34b1fac c15edd0c c02df334 00000001 c34b0000 c02df1e8 00000003
       000001f4 c0135889 c02df1e8 00000003 00000001 c34b0000 00000001 00000000
Call Trace: [<c0133a9d>] scan_active_list [kernel] 0x5d (0xc34b1f9c))
[<c0135889>] kscand [kernel] 0xc9 (0xc34b1fc0))
[<c0105000>] stext [kernel] 0x0 (0xc34b1fe8))
[<c0107146>] arch_kernel_thread [kernel] 0x26 (0xc34b1ff0))
[<c01357c0>] kscand [kernel] 0x0 (0xc34b1ff8))

Code: 8b 41 70 42 39 41 5c 0f 43 54 24 04 ff 04 24 4f 89 54 24 04

the swap_free error occurred at 7:17:59 AM and the oops at 7:18:02 AM last saturday morning. the only significant thing i think it would have been doing was rsync'ing a wad of CCD data (few hundred MB at most over 100 Mbit wire) from the machines on the mountain. the machine did not crash and seems to have been working fine since the oops. xinetd croaked and needed to be restarted since then, but i never noticed anything else until i happened to run dmesg tonight.
i noticed a few other bugzilla references to kscand problems with some earlier errata kernel, but none seemed completely relevant to what i see here.

Version-Release number of selected component (if applicable):
kernel-2.4.20-24.7.i686

How reproducible:
Didn't try

Steps to Reproduce:
haven't tried to reproduce since it's a production server, though this is a problem never before seen on this hardware.

Actual Results: the machine continued to mostly work as if nothing happened, though xinetd did crash subsequent to the oops.

Expected Results: kernel oopses usually mess things up badly, so i'm probably lucky.

Additional info:

output from free:
             total       used       free     shared    buffers     cached
Mem:       1031184     942960      88224          0      98516     615464
-/+ buffers/cache:     228980     802204
Swap:      2096480      83864    2012616

/proc/cpuinfo:
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 15
model           : 2
model name      : Intel(R) Pentium(R) 4 CPU 2.00GHz
stepping        : 4
cpu MHz         : 2018.022
cache size      : 512 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm
bogomips        : 4023.91

df:
Filesystem           1k-blocks      Used Available Use% Mounted on
/dev/md0             459327632 150264584 285730492  35% /
/dev/hda1               505605     13304    466197   3% /boot
none                    515592         0    515592   0% /dev/shm

/proc/interrupts:
           CPU0
  0:   46022934          XT-PIC  timer
  1:          6          XT-PIC  keyboard
  2:          0          XT-PIC  cascade
  4:    2675596          XT-PIC  serial
  5:          0          XT-PIC  usb-uhci
  6:         30          XT-PIC  ncr53c8xx
  8:          1          XT-PIC  rtc
  9:   10457017          XT-PIC  ide2, ide3, ide4, ide5, usb-uhci, usb-ohci, usb-ohci, ehci-hcd
 10:   10998648          XT-PIC  eth0
 12:         32          XT-PIC  PS/2 Mouse
 14:    2863036          XT-PIC  ide0
 15:       6319          XT-PIC  ide1
NMI:          0
ERR:     169977

uptime:
3:07am up 5 days, 7:51, 5 users, load average: 0.17, 0.25, 0.16
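As a cross-check on the oops in the description, the first instruction on the Code: line can be decoded by hand and compared with the reported fault address. The sketch below is an annotation rather than part of the original report. It assumes the Code: bytes begin at the faulting EIP, and it handles only the one encoding seen here (opcode 8b, MOV r32, r/m32, with a register base plus 8-bit displacement). The names REG_NAMES, OOPS_REGS, CODE and decode_mov_disp8 are made up for illustration.

```python
# Hypothetical helper: decode the first instruction of the oops "Code:" line.
# Only opcode 0x8b (MOV r32, r/m32) with ModRM mod=01 and no SIB byte is handled.

# 32-bit x86 register numbering used by the ModRM byte.
REG_NAMES = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

# Register values copied from the oops report.
OOPS_REGS = {
    "eax": 0xC1D84EC8, "ebx": 0xC1000030, "ecx": 0x00000000, "edx": 0x00000000,
    "esi": 0xF6535980, "edi": 0x0000001D, "ebp": 0xC1575B00, "esp": 0xC34B1F7C,
}

# First three bytes of the "Code:" line (assumed to start at the faulting EIP).
CODE = bytes.fromhex("8b 41 70")


def decode_mov_disp8(code, regs):
    """Return (dest, base, disp, address) for 'mov dest, [base + disp8]'."""
    opcode, modrm, disp = code[0], code[1], code[2]
    if opcode != 0x8B:
        raise ValueError("only opcode 0x8b (MOV r32, r/m32) is handled")

    mod = modrm >> 6          # addressing mode
    reg = (modrm >> 3) & 0x7  # destination register
    rm = modrm & 0x7          # base register
    if mod != 1 or rm == 4:
        raise ValueError("only [reg + disp8] without a SIB byte is handled")

    # The 8-bit displacement is sign-extended before being added to the base.
    if disp >= 0x80:
        disp -= 0x100

    dest, base = REG_NAMES[reg], REG_NAMES[rm]
    address = (regs[base] + disp) & 0xFFFFFFFF
    return dest, base, disp, address


dest, base, disp, address = decode_mov_disp8(CODE, OOPS_REGS)
print(f"mov {dest}, [{base}{disp:+#x}] -> address {address:#010x}")
```

With ecx = 0 the computed address is 0x70, which matches the virtual address in the oops message, so the fault looks like a read at offset 0x70 through a NULL pointer. Which structure that pointer was supposed to reference is not something the report shows.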
Various VM stability fixes went into the -27.7 errata kernel. Try that. RHL 7 & 8 are now EOL, so they won't receive further updates.