From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows 98; Win 9x 4.90)

Description of problem:
There seems to be a problem with the swap component of the kernel (2.4.2-2), running on a dual PIII (1 GHz) with 512 MB of RAM and 1 GB of swap space.

How reproducible:
Sometimes

Steps to Reproduce:
1. Just get the server under a good load for a long period of time

Actual Results:
From log files:

Jun 15 12:35:59 node11 kernel: VM: Bad swap entry 04000000
Jun 15 12:35:59 node11 kernel: VM: Bad swap entry 04000000
Jun 15 12:35:59 node11 kernel: swap_free: offset exceeds max
Jun 15 12:35:59 node11 kernel: swap_free: offset exceeds max
Jun 15 12:35:59 node11 kernel: VM: Bad swap entry 04000000
Jun 15 12:35:59 node11 kernel: VM: Bad swap entry 04000000
Jun 15 12:36:19 node11 kernel: swap_free: offset exceeds max
Jun 15 12:36:19 node11 kernel: swap_free: offset exceeds max
Jun 15 12:36:41 node11 kernel: swap_free: offset exceeds max
Jun 15 12:36:41 node11 last message repeated 3 times
Jun 15 12:36:41 node11 kernel: Bad swap offset entry 08000000
Jun 15 12:36:41 node11 kernel: VM: killing process l202.exe
Jun 15 12:36:41 node11 kernel: swap_free: offset exceeds max
Jun 15 12:36:41 node11 kernel: swap_free: offset exceeds max
Jun 15 12:44:41 node11 automount[1888]: expired /mnt/home/home
Jun 15 12:56:28 node11 automount[586]: attempting to mount entry /mnt/home/home
Jun 15 12:56:29 node11 kernel: Bad swap offset entry 08000000
Jun 15 12:56:29 node11 kernel: VM: killing process l1.exe
Jun 15 12:56:29 node11 kernel: Bad swap offset entry 08000000
Jun 15 12:56:29 node11 kernel: VM: killing process l1.exe
Jun 15 12:56:29 node11 kernel: Bad swap offset entry 08000000
Jun 15 12:56:29 node11 kernel: VM: killing process l1.exe
Jun 15 12:56:29 node11 kernel: swap_free: offset exceeds max
Jun 15 12:56:29 node11 kernel: VM: Bad swap entry 04000000
Jun 15 12:59:55 node11 kernel: kernel BUG at page_alloc.c:84!
Jun 15 12:59:55 node11 kernel: invalid operand: 0000
Jun 15 12:59:55 node11 kernel: CPU: 1
Jun 15 12:59:55 node11 kernel: EIP: 0010:[__free_pages_ok+75/896]
Jun 15 12:59:55 node11 kernel: EIP: 0010:[<c013322b>]
Jun 15 12:59:55 node11 kernel: EFLAGS: 00010282
Jun 15 12:59:55 node11 kernel: eax: 0000001f ebx: dbf2d2c8 ecx: 00000082 edx: 01000000
Jun 15 12:59:55 node11 kernel: esi: 00000000 edi: c10a60d8 ebp: 00000000 esp: d794be08
Jun 15 12:59:55 node11 kernel: ds: 0018 es: 0018 ss: 0018
Jun 15 12:59:55 node11 kernel: Process l302.exel (pid: 1985, stackpage=d794b000)
Jun 15 12:59:55 node11 kernel: Stack: c022c57b c022c789 00000054 dbf2d2d0 c10a60d8 c01298a9 dbf2d220 00000004
Jun 15 12:59:55 node11 kernel: c10a60d8 c02de0e4 000001fc 00000004 c012676a 00000008 00400000 00000000
Jun 15 12:59:55 node11 kernel: 0000018d 000001e1 c6525780 001e0000 00000000 00400000 d794809c 00000000
Jun 15 12:59:55 node11 kernel: Call Trace: [error_table+42275/64248] [error_table+42801/64248] [__set_page_dirty+121/128] [zap_page_range+874/1296] [dput+59/416] [fput+116/224] [exit_mmap+201/304]
Jun 15 12:59:55 node11 kernel: Call Trace: [<c022c57b>] [<c022c789>] [<c01298a9>] [<c012676a>] [<c014e8cb>] [<c013b0f4>] [<c0129289>]
Jun 15 12:59:55 node11 rsh(pam_unix)[1987]: session closed for user mjpushie
Jun 15 12:59:55 node11 kernel: [mmput+55/96] [do_exit+236/688] [dequeue_signal+109/176] [do_signal+569/688] [update_wall_time+22/80] [do_timer+11/96] [timer_interrupt+257/400] [handle_IRQ_event+94/144]
Jun 15 12:59:55 node11 kernel: [<c0117da7>] [<c011c24c>] [<c0121a5d>] [<c0109069>] [<c0120f16>] [<c012134b>] [<c010def1>] [<c010aa1e>]
Jun 15 12:59:55 node11 kernel: [do_IRQ+161/240] [do_IRQ+195/240] [do_page_fault+0/1120] [signal_return+20/24]
Jun 15 12:59:55 node11 kernel: [<c010ac21>] [<c010ac43>] [<c0115b00>] [<c010921c>]
Jun 15 12:59:55 node11 kernel:
Jun 15 12:59:55 node11 kernel: Code: 0f 0b 83 c4 0c 8b 0d 0c 60 32 c0 89 f8 29 c8 69 c0 f1 f0 f0
Jun 15 12:59:55 node11 kernel: swap_free: offset exceeds max
Jun 15 13:00:03 node11 kernel: kernel BUG at page_alloc.c:84!
Jun 15 13:00:03 node11 kernel: invalid operand: 0000
Jun 15 13:00:03 node11 kernel: CPU: 0
Jun 15 13:00:03 node11 kernel: EIP: 0010:[__free_pages_ok+75/896]
Jun 15 13:00:03 node11 kernel: EIP: 0010:[<c013322b>]
Jun 15 13:00:03 node11 kernel: EFLAGS: 00010282
Jun 15 13:00:03 node11 kernel: eax: 0000001f ebx: dbf2d2c8 ecx: 00000012 edx: 02000000
Jun 15 13:00:03 node11 kernel: esi: 00000000 edi: c10a60d8 ebp: 00000000 esp: db813ef4
Jun 15 13:00:03 node11 kernel: ds: 0018 es: 0018 ss: 0018
Jun 15 13:00:03 node11 kernel: Process rpciod (pid: 1895, stackpage=db813000)
Jun 15 13:00:03 node11 kernel: Stack: c022c57b c022c789 00000054 00000000 00001770 00000000 e08a6b01 d5eb48d4
Jun 15 13:00:03 node11 kernel: dba734c4 dba734a0 df067de0 cb5e4aa0 e08c6bb9 c10a60d8 dbf2d220 dbf2d244
Jun 15 13:00:03 node11 kernel: c027e140 00000000 dbf2d220 dba734c4 dba734a0 dbf2d3d4 dbf2d220 e08c8184
Jun 15 13:00:03 node11 kernel: Call Trace: [error_table+42275/64248] [error_table+42801/64248] [<e08a6b01>] [<e08c6bb9>] [<e08c8184>] [<e08a3e23>] [<e08a73ba>]
Jun 15 13:00:03 node11 kernel: Call Trace: [<c022c57b>] [<c022c789>] [<e08a6b01>] [<e08c6bb9>] [<e08c8184>] [<e08a3e23>] [<e08a73ba>]
Jun 15 13:00:03 node11 kernel: [<e08a75ca>] [<e08a7ef9>] [<e08b1f24>] [<e08b1f1c>] [<e08b1f1c>] [kernel_thread+38/48] [<e08b1f24>] [<e08a7de0>]
Jun 15 13:00:03 node11 kernel: [<e08a75ca>] [<e08a7ef9>] [<e08b1f24>] [<e08b1f1c>] [<e08b1f1c>] [<c0107626>] [<e08b1f24>] [<e08a7de0>]
Jun 15 13:00:03 node11 kernel:
Jun 15 13:00:03 node11 kernel: Code: 0f 0b 83 c4 0c 8b 0d 0c 60 32 c0 89 f8 29 c8 69 c0 f1 f0 f0

Expected Results:
None of the above messages!

Additional info:
I haven't checked with a smaller swap space (could it help?). Any ideas what's going on?

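
When reading a log like the one above, it can help to see which bits are set in the reported swap entry values. The following is only a minimal sketch and not part of the original report: the function name `set_bits` and the two-line sample are illustrative choices, and the output describes the values without saying anything about the cause.

```python
import re

# Matches both "VM: Bad swap entry XXXXXXXX" and "Bad swap offset entry XXXXXXXX".
BAD_SWAP = re.compile(r"Bad swap (?:offset )?entry ([0-9a-fA-F]{8})")


def set_bits(log_text):
    """Map each distinct bad swap entry value to the bit positions set in it."""
    result = {}
    for match in BAD_SWAP.finditer(log_text):
        value = int(match.group(1), 16)
        # Collect every bit position (0 = least significant) that is 1.
        result[match.group(1)] = [bit for bit in range(32) if (value >> bit) & 1]
    return result


# Two lines copied from the log in this report (hypothetical selection).
sample = """\
Jun 15 12:35:59 node11 kernel: VM: Bad swap entry 04000000
Jun 15 12:36:41 node11 kernel: Bad swap offset entry 08000000
"""

for entry, bits in set_bits(sample).items():
    print(entry, bits)
```

Feeding the whole log file to `set_bits` instead of the sample gives the same kind of summary for every distinct value that appears.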
Could you check kernel 2.4.3-12? This kernel has a lot of small bugfixes and might fix this problem. (And yes, we REALLY REALLY put kernels under load before we even consider shipping them.)
I downloaded kernel 2.4.5, compiled it, and installed it. It works fine and, so far, none of the above problems have shown up. I'm not sure what fixed the problem, but it hasn't happened again! (And yes, I have no doubt that you guys do extensive testing on the kernels before shipping them!)