From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20050922 Fedora/1.0.7-1.1.fc3 Firefox/1.0.7

Description of problem:
After a fresh system boot, our -- Java-based -- application is started to perform some web-based configuration. The output of this JVM (Sun HotSpot JDK 1.4.2) is redirected to syslog, via a pipe to logger(1).

Here is the crash backtrace:

EIP is at __out_of_line_bug [kernel] 0x17 (2.4.21-32.ELsmp/i686)
eax: 00000026 ebx: f7465f90 ecx: c0383eb4 edx: 01fa7ed7
esi: f7fee000 edi: f7465f90 ebp: 00000009 esp: f7465f34
ds: 0068 es: 0068 ss: 0068
Process syslogd (pid: 766, stackpage=f7465000)
Stack: c02bd964 000000fe c0173c1c 000000fe c0172a90 f7465f90 f7fee000 f7465f90
       c0173a87 f7fee000 f7fee000 f7465f90 c0173df9 00252d88 006afb8a f7464000
       00000000 00000000 00000000 c0162c3b 00000000 bfff8e10 fffffeff 00000000
Call Trace:
[<c0173c1c>] path_init [kernel] 0x16c (0xf7465f3c)
[<c0172a90>] getname [kernel] 0xa0 (0xf7465f44)
[<c0173a87>] path_lookup [kernel] 0x17 (0xf7465f54)
[<c0173df9>] __user_walk [kernel] 0x49 (0xf7465f64)
[<c0162c3b>] sys_access [kernel] 0x7b (0xf7465f80)
[<c0166377>] sys_fsync [kernel] 0x47 (0xf7465f9c)
Code: 0f 0b 37 01 5f d0 2b c0 90 eb fe 8d b4 26 00 00 00 00 8d bc
Kernel panic: Fatal exception

Version-Release number of selected component (if applicable):
kernel, `uname -r`=2.4.21-32.ELsmp

How reproducible:
Sometimes

Steps to Reproduce:
No simple scenario: the crash does not seem to be related to any specific user operation. We are currently working to isolate this issue.

Additional info:
Please try to reproduce this on the latest officially released kernel, which is 2.4.21-37.EL (RHEL3 U6, released this past September). There was a post-U5 memory corruption fix that might have accounted for this. Thanks in advance.
Also, if it is reproducible with the latest kernel, please set up netdump and/or diskdump and forward us the vmcore.
We were unable to test with more recent kernels, due to a 3rd-party dependency (a kernel module). However, we have made a lot of progress. The problem arises only on machines that have very low commit-to-disk performance. For example, a machine that exhibited the bug (with the syslog backtrace) was only able to commit 19 syslog entries to disk per second. Now that the commit-to-disk performance issue has been fixed -- it was a H/W RAID setup problem -- the problem no longer arises at all. Given the above, it is very likely that the long delays spent waiting for write completion were concurrency windows (the box has 8 processors) that exposed the memory corruption you pointed out. I will update this record when we have a chance to test with the rhel3u6 kernel.
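For anyone wanting to compare machines on the commit-to-disk figure mentioned above, the following is a minimal sketch that times small appends, each followed by fsync. It only approximates what syslogd does (the sys_fsync frame in the first trace suggests a sync per entry), it is not the measurement method used to obtain the 19 entries per second figure, and the function name measure_fsync_rate, the 81-byte payload and the entry count are assumptions made for illustration.

```python
import os
import tempfile
import time


def measure_fsync_rate(directory, entries=200, payload=b"x" * 80 + b"\n"):
    """Append `entries` short lines to a scratch file in `directory`,
    calling fsync after each one, and return lines committed per second.

    Hypothetical helper: name, payload size and entry count are illustrative.
    """
    fd, path = tempfile.mkstemp(dir=directory, prefix="fsync_probe_")
    try:
        start = time.perf_counter()
        for _ in range(entries):
            os.write(fd, payload)
            os.fsync(fd)  # ask the kernel to flush this file to the device
        elapsed = time.perf_counter() - start
    finally:
        # Always clean up the scratch file, even if a write fails.
        os.close(fd)
        os.unlink(path)
    return entries / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    # Point this at the filesystem that holds the syslog files to get a
    # figure comparable across machines; the temp directory is a placeholder.
    rate = measure_fsync_rate(tempfile.gettempdir())
    print(f"{rate:.1f} synced writes per second")
```

A result in the low tens per second on the log filesystem would be in the same range as the figure reported above, although write caches on the disk or RAID controller can make the number look better than the real commit rate.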
A similar-looking problem was reproduced with the rhel3u6 kernel. The problem occurs when rebooting the machine. Here is the backtrace (it is a manual copy from a screenshot of the console taken with a digital camera [1]; the JPG is attached to this bug record):

EIP is at ext3_get_inode_loc [ext3] 0xda (2.4.21.37.ELsmp/i686)
eax: 00000000 ebx: c4de1c00 ecx: 0000000c edx: f791ed3c
esi: 00000060 edi: 00000d00 ebp: 00000003 esp: f6843e30
ds: 0060 es: 0060 ss: 0060
Process reboot (pid: 1090, stackpage=f6843000)
Stack: c017fbd0 f70cb1f8 00000000 c4de1c00 f78cb100 00000003 f78cb100 f78ed080
       f78cb100 f78f5400 f8850d7b f78cb100 f6843e84 00000000 c32e8140 00009a9b
       c4de1c00 c010152a c4de1c00 00009a9b c32e8140 00000000 00000000 f78cb100
Call Trace:
[<c017fbd8>] alloc_inode [kernel] 0xc0 (0xf6843e38)
[<f8858d7b>] ext3_read_inode [ext3] 0x1b (0xf6843e60)
[<c018152a>] iget4_locked [kernel] 0x10a (0xf6843e7c)
[<f885a78b>] ext3_lookup [ext3] 0xbb (0xf6843ea4)
[<c017338c>] real_lookup [kernel] 0xec (0xf0xf6843ec8)
[<c01739e7>] link_path_walk [kernel] 0x487 (0xf6843ee8)
[<c0173f69>] path_lookup [kernel] 0x39 (0xf6843f28)
[<c017452e>] open_namei [kernel] 0x7e (0xf6843f38)
[<c0163813>] filp_open [kernel] 0x43 (0xf6843f68)
[<c0163c53>] sys_open [kernel] 0x63 (0xf6843fa0)

[1] How could we get a text console other than this VGA stuff, BTW? We have no serial link available on this site...

About the diskdump/netdump setup: I have requested that it be set up. I do not know at this time whether it will be possible.
Created attachment 137288 [details] console screen shot
Created attachment 137289 [details] console screen shot