Red Hat Bugzilla – Bug 177427
rhel3u5: kernel panic in fsync(2) while ~idle
Last modified: 2009-06-02 17:51:30 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20050922 Fedora/1.0.7-1.1.fc3 Firefox/1.0.7
Description of problem:
After a fresh system boot, our Java-based application is started to perform some web-based configuration. The output of this JVM (Sun HotSpot JDK 1.4.2) is redirected to syslog via a pipe to logger(1).
Here is the crash backtrace:
EIP is at __out_of_line_bug [kernel] 0x17 (2.4.21-32.ELsmp/i686)
eax: 00000026 ebx: f7465f90 ecx: c0383eb4 edx: 01fa7ed7
esi: f7fee000 edi: f7465f90 ebp: 00000009 esp: f7465f34
ds: 0068 es: 0068 ss: 0068
Process syslogd (pid: 766, stackpage=f7465000)
Stack: c02bd964 000000fe c0173c1c 000000fe c0172a90 f7465f90 f7fee000
c0173a87 f7fee000 f7fee000 f7465f90 c0173df9 00252d88 006afb8a
00000000 00000000 00000000 c0162c3b 00000000 bfff8e10 fffffeff
Call Trace: [<c0173c1c>] path_init [kernel] 0x16c (0xf7465f3c)
[<c0172a90>] getname [kernel] 0xa0 (0xf7465f44)
[<c0173a87>] path_lookup [kernel] 0x17 (0xf7465f54)
[<c0173df9>] __user_walk [kernel] 0x49 (0xf7465f64)
[<c0162c3b>] sys_access [kernel] 0x7b (0xf7465f80)
[<c0166377>] sys_fsync [kernel] 0x47 (0xf7465f9c)
Code: 0f 0b 37 01 5f d0 2b c0 90 eb fe 8d b4 26 00 00 00 00 8d bc
Kernel panic: Fatal exception
Version-Release number of selected component (if applicable):
Steps to Reproduce:
No simple scenario: The crash does not seem to be related to any specific user operation. We are currently working to isolate this issue.
Please try to reproduce this on the latest officially released kernel,
which is 2.4.21-37.EL (RHEL3 U6, released this past September). There
was a post-U5 memory corruption fix that might have accounted for this.
Thanks in advance.
Also, if it is reproducible with the latest kernel, please set up
netdump and/or diskdump and forward us the vmcore.
We were unable to test with more recent kernels, due to a 3rd-party dependency
(a kernel module). However, we have moved forward a lot.
The problem arises only on machines that have very low commit-to-disk
performance. For example, the machine that exhibited the bug (with the syslog
backtrace) was only able to commit 19 syslog entries to disk per second.
Once the commit-to-disk performance issue (a H/W RAID setup problem) was fixed,
the problem no longer arose at all.
Given the above, it is very likely that the long delays spent waiting for
write completion opened concurrency windows (the box has 8 processors) that
exposed the memory corruption you pointed out.
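For anyone who wants to compare machines, the commit-to-disk rate mentioned above can be estimated with a short script. This is only a rough sketch: it assumes that each log entry is appended and then flushed with fsync(2), which is what the sys_fsync frame in the first backtrace suggests. The function name measure_fsync_rate and the sample log line are made up for illustration and are not part of syslogd.

```python
import os
import tempfile
import time


def measure_fsync_rate(directory, entries=50, line=b"test: sample syslog entry\n"):
    """Estimate how many appended lines per second can be flushed with fsync.

    Hypothetical helper: it mimics an append-then-fsync logging pattern,
    it does not reproduce what syslogd itself does internally.
    """
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.monotonic()
        for _ in range(entries):
            os.write(fd, line)
            os.fsync(fd)  # ask the kernel to flush this file to the storage device
        elapsed = time.monotonic() - start
    finally:
        os.close(fd)
        os.unlink(path)
    # Guard against a zero interval on very fast or memory-backed filesystems.
    return entries / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    rate = measure_fsync_rate(tempfile.gettempdir())
    print(f"{rate:.1f} entries/second")
```

The directory passed in should live on the same filesystem as the log files. A temporary directory backed by memory will report a much higher rate than a real disk.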
I will update this record when we have a chance to test with the rhel3u6 kernel.
A similar-looking problem was reproduced with the rhel3u6 kernel. The problem
occurs when rebooting the machine. Here is the backtrace (it is a manual copy
from a screenshot of the console taken with a digital camera; the JPGs are
attached to this bug record):
EIP is at ext3_get_inode_loc [ext3] 0xda (2.4.21-37.ELsmp/i686)
eax: 00000000 ebx: c4de1c00 ecx: 0000000c edx: f791ed3c
esi: 00000060 edi: 00000d00 ebp: 00000003 esp: f6843e30
ds: 0060 es: 0060 ss: 0060
Process reboot (pid: 1090, stackpage=f6843000)
Stack: c017fbd0 f70cb1f8 00000000 c4de1c00 f78cb100 00000003 f78cb100 f78ed080
f78cb100 f78f5400 f8850d7b f78cb100 f6843e84 00000000 c32e8140 00009a9b
c4de1c00 c010152a c4de1c00 00009a9b c32e8140 00000000 00000000 f78cb100
Call Trace: [<c017fbd8>] alloc_inode [kernel] 0xc0 (0xf6843e38)
[<f8858d7b>] ext3_read_inode [ext3] 0x1b (0xf6843e60)
[<c018152a>] iget4_locked [kernel] 0x10a (0xf6843e7c)
[<f885a78b>] ext3_lookup [ext3] 0xbb (0xf6843ea4)
[<c017338c>] real_lookup [kernel] 0xec (0xf6843ec8)
[<c01739e7>] link_path_walk [kernel] 0x487 (0xf6843ee8)
[<c0173f69>] path_lookup [kernel] 0x39 (0xf6843f28)
[<c017452e>] open_namei [kernel] 0x7e (0xf6843f38)
[<c0163813>] filp_open [kernel] 0x43 (0xf6843f68)
[<c0163c53>] sys_open [kernel] 0x63 (0xf6843fa0)
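Since the trace above was copied by hand, a quick consistency check may help spot transcription slips. The sketch below parses the "Call Trace" entries and lists the return addresses that do not appear among the words of the "Stack:" dump. The helper names are hypothetical, and the regular expressions only assume the line format shown in this report. A missing address is not proof of an error, because the frame may simply lie beyond the dumped words.

```python
import re

# Matches entries such as: [<c0173c1c>] path_init [kernel] 0x16c (0xf7465f3c)
TRACE_RE = re.compile(r"\[<([0-9a-f]{8})>\]\s+(\w+)\s+\[(\w+)\]\s+0x([0-9a-f]+)")
# Matches the 8-digit hexadecimal words of a "Stack:" dump.
WORD_RE = re.compile(r"\b[0-9a-f]{8}\b")


def parse_call_trace(trace_text):
    """Return (address, symbol, module, offset) for each trace entry."""
    return [
        (addr, symbol, module, int(offset, 16))
        for addr, symbol, module, offset in TRACE_RE.findall(trace_text)
    ]


def addresses_missing_from_stack(stack_text, trace_text):
    """Return (address, symbol) pairs whose address is not in the stack dump."""
    stack_words = set(WORD_RE.findall(stack_text))
    return [
        (addr, symbol)
        for addr, symbol, _module, _offset in parse_call_trace(trace_text)
        if addr not in stack_words
    ]


if __name__ == "__main__":
    # First stack line and first trace line of the backtrace above.
    stack = "Stack: c017fbd0 f70cb1f8 00000000 c4de1c00 f78cb100 00000003 f78cb100 f78ed080"
    trace = "Call Trace: [<c017fbd8>] alloc_inode [kernel] 0xc0 (0xf6843e38)"
    print(addresses_missing_from_stack(stack, trace))
```

On these two lines the check reports c017fbd8 (alloc_inode), while the stack dump contains the nearby value c017fbd0. Whether that is a copying slip cannot be decided from the text alone, but it marks the lines worth re-reading against the screenshots.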
How could we get a text console other than this VGA stuff, BTW? We have no
serial link available on this site...
Regarding the diskdump/netdump setup, I have requested that it be set up. I do
not know at this time whether it will be possible or not.
Created attachment 137288
console screen shot
Created attachment 137289
console screen shot