From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.6) Gecko/20050323 Firefox/1.0.2 Fedora/1.0.2-1.3.1

Description of problem:
squid stops responding two hours after the kernel error message about ext3.

Version-Release number of selected component (if applicable):
kernel-2.4.20-42.9.legacy

How reproducible:
Couldn't Reproduce

Steps to Reproduce:
1. squid running and using ext3 filesystems

Actual Results:
Apr 5 11:27:27 PRX02002 kernel: <1>Unable to handle kernel paging request at virtual address 80f7f7db
Apr 5 11:27:27 PRX02002 kernel: printing eip:
Apr 5 11:27:27 PRX02002 kernel: c015e4b8
Apr 5 11:27:27 PRX02002 kernel: *pde = 00000000
Apr 5 11:27:27 PRX02002 kernel: Oops: 0000
Apr 5 11:27:27 PRX02002 kernel: cpqasm cpqevt e100 ext3 jbd cpqarray sd_mod scsi_mod
Apr 5 11:27:27 PRX02002 kernel: CPU: 0
Apr 5 11:27:27 PRX02002 kernel: EIP: 0060:[<c015e4b8>] Tainted: P
Apr 5 11:27:27 PRX02002 kernel: EFLAGS: 00010293
Apr 5 11:27:27 PRX02002 kernel:
Apr 5 11:27:27 PRX02002 kernel: EIP is at find_inode [kernel] 0x28 (2.4.20-42.9.legacy)
Apr 5 11:27:27 PRX02002 kernel: eax: 00000000 ebx: 80f7f7b3 ecx: 0000ffff edx: f7f00000
Apr 5 11:27:27 PRX02002 kernel: esi: 00000000 edi: f7f7b3a8 ebp: 00098687 esp: d0249e70
Apr 5 11:27:27 PRX02002 kernel: ds: 0068 es: 0068 ss: 0068
Apr 5 11:27:27 PRX02002 kernel: Process squid (pid: 10531, stackpage=d0249000)
Apr 5 11:27:27 PRX02002 kernel: Stack: 00000000 e8758ac4 f6b6fc00 00098687 f7f7b3a8 00098687 f6b6fc00 c015e7f4
Apr 5 11:27:27 PRX02002 kernel: f6b6fc00 00098687 f7f7b3a8 00000000 00000000 00098687 c0e01280 e1a8a100
Apr 5 11:27:27 PRX02002 kernel: c0e01280 f884cefc f6b6fc00 00098687 00000000 00000000 c51835a8 fffffff4
Apr 5 11:27:27 PRX02002 kernel: Call Trace: [<c015e7f4>] iget4 [kernel] 0x54 (0xd0249e8c))
Apr 5 11:27:27 PRX02002 kernel: [<f884cefc>] ext3_lookup [ext3] 0x7c (0xd0249eb4))
Apr 5 11:27:27 PRX02002 kernel: [<c0152d97>] real_lookup [kernel] 0xc7 (0xd0249ed4))
Apr 5 11:27:27 PRX02002 kernel: [<c01532ff>] link_path_walk [kernel] 0x40f (0xd0249ef0))
Apr 5 11:27:27 PRX02002 kernel: [<c012c49e>] futex_wait [kernel] 0x10e (0xd0249f1c))
Apr 5 11:27:27 PRX02002 kernel: [<c01537b9>] path_lookup [kernel] 0x39 (0xd0249f30))
Apr 5 11:27:27 PRX02002 kernel: [<c0153c2e>] open_namei [kernel] 0x7e (0xd0249f40))
Apr 5 11:27:27 PRX02002 kernel: [<c012c330>] futex_vcache_callback [kernel] 0x0 (0xd0249f5c))
Apr 5 11:27:27 PRX02002 kernel: [<c01469e9>] filp_open [kernel] 0x49 (0xd0249f70))
Apr 5 11:27:27 PRX02002 kernel: [<c0146da3>] sys_open [kernel] 0x53 (0xd0249fa8))
Apr 5 11:27:27 PRX02002 kernel: [<c010954f>] system_call [kernel] 0x33 (0xd0249fc0))
Apr 5 11:27:27 PRX02002 kernel:
Apr 5 11:27:27 PRX02002 kernel:
Apr 5 11:27:27 PRX02002 kernel: Code: 39 6b 28 89 de 75 f1 8b 44 24 20 39 83 94 00 00 00 75 e5 8b

Additional info:
Squid is using a dedicated ext3 filesystem for its cache. After the reboot, /lost+found directory has been recreated by fsck.
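A quick cross-check of the register dump against the Code: bytes, offered as a rough sketch rather than a diagnosis. It assumes the Code: line starts at the faulting instruction, and it decodes only the first three bytes (39 6b 28) by hand using the standard x86 ModR/M layout. The variable names are illustrative only. If the reading is right, the instruction is a compare of ebp against the word at ebx+0x28, and the faulting address is simply that bad pointer plus the displacement.

```python
# Values copied from the oops in the report above.
fault_addr = 0x80F7F7DB          # "virtual address 80f7f7db"
ebx = 0x80F7F7B3                 # "ebx: 80f7f7b3"
code = bytes.fromhex("396b28")   # first three bytes of the "Code:" line

# 32-bit general-purpose registers in x86 encoding order.
REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

opcode, modrm, disp8 = code

# Split the ModR/M byte into its three fields.
mod = modrm >> 6          # addressing mode
reg = (modrm >> 3) & 7    # register operand
rm = modrm & 7            # base register (rm == 4 would mean a SIB byte follows)

# Opcode 0x39 is CMP r/m32, r32; mod == 1 means an 8-bit signed displacement.
assert opcode == 0x39
assert mod == 1 and rm != 4

# Sign-extend the 8-bit displacement.
disp = disp8 - 256 if disp8 >= 128 else disp8

# Effective address the instruction tried to read, wrapped to 32 bits.
effective = (ebx + disp) & 0xFFFFFFFF

print(f"cmp %{REGS[reg]}, {disp:#x}(%{REGS[rm]})")
print(f"effective address: {effective:08x}")
print(f"matches fault address: {effective == fault_addr}")
```

The value in ebp (00098687) also appears several times on the stack, so it is presumably the inode number being looked up. The check only shows that find_inode followed a bad pointer; it does not say whether that pointer was corrupted by hardware or by software.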
In my experience by far the most likely cause for this is bad memory. Can you memtest the machine so that we can rule that out first? Cheers.
Also, can you duplicate this on older kernel releases? Is this something specific to the legacy kernel, or is this more of a system problem in general?
memcheck86 found no problems in a one-hour run. The squid server had been running kernel-2.4.20-37.9.legacy without problems since October or November 2004. It has been running 2.4.20-42.9.legacy since February 26. I can't reproduce the problem on this kernel.
This doesn't seem to be important enough to fix just on its own, so mark it DEFER.