From Bugzilla Helper:
User-Agent: Mozilla/5.0 (compatible; Konqueror/3.2; Linux) (KHTML, like Gecko)

Description of problem:
We had 3 oopses in the last week (since we installed the system) and they were all at the same address and caused by the same user process (sh). The system gets a kernel panic, freezes and has to be rebooted. We managed to capture the last oops dump by redirecting the console to the serial port, but all 3 oopses were the same. This is the oops dump processed by ksymoops...

ksymoops 2.4.9 on i686 2.4.21-15.ELsmp. Options used
     -V (default)
     -K (specified)
     -l /proc/modules (default)
     -o /lib/modules/2.4.21-15.ELsmp/ (default)
     -m /boot/System.map-2.4.21-15.ELsmp (specified)

No modules in ksyms, skipping objects
No ksyms, skipping lsmod
Unable to handle kernel NULL pointer dereference at virtual address 00000000
c01ad325
*pde = 1b017001
Oops: 0000
CPU:    2
EIP:    0060:[<c01ad325>]    Tainted: PF
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010246
eax: 00000000   ebx: 00000500   ecx: 00000000   edx: 00000000
esi: 00000000   edi: e175fc00   ebp: c04cfe00   esp: daf4de80
ds: 0068   es: 0068   ss: 0068
Process sh (pid: 2569, stackpage=daf4d000)
Stack: 00000000 00000000 c0140a62 0ac42025 00000000 00000000 cd085025 c4e85fa4
       db102f00 00000000 b756a540 e1847e00 00030002 daf4c000 00000000 e175fc00
       f57fc980 c01adf86 00000500 daf4dee4 c6abbd80 f169e008 00000000 c0179a00
Call Trace:
[<c0140a62>] do_anonymous_page [kernel] 0x252 (0xdaf4de88)
[<c01adf86>] tty_open [kernel] 0x66 (0xdaf4dec4)
[<c0179a00>] dput [kernel] 0x30 (0xdaf4dedc)
[<c016f3e6>] link_path_walk [kernel] 0x656 (0xdaf4def0)
[<c0161288>] get_chrfops [kernel] 0x98 (0xdaf4df00)
[<c0139773>] in_group_p [kernel] 0x23 (0xdaf4df08)
[<f88e07ea>] ext3_permission [ext3] 0xaa (0xdaf4df10)
[<c0161541>] chrdev_open [kernel] 0x71 (0xdaf4df38)
[<c015f790>] dentry_open [kernel] 0x110 (0xdaf4df54)
[<c015f678>] filp_open [kernel] 0x68 (0xdaf4df70)
[<c015fa83>] sys_open [kernel] 0x53 (0xdaf4dfa8)
Code: 8b 04 88 89 44 24 30 85 c0 0f 84 9c 00 00 00 8b 54 24 30 8b

>>EIP; c01ad325 <init_dev+55/500>   <=====

>>ebp; c04cfe00 <dev_tty_driver+0/c0>

Trace; c0140a62 <do_anonymous_page+252/510>
Trace; c01adf86 <tty_open+66/410>
Trace; c0179a00 <dput+30/1b0>
Trace; c016f3e6 <link_path_walk+656/7a0>
Trace; c0161288 <get_chrfops+98/170>
Trace; c0139773 <in_group_p+23/30>
Trace; f88e07ea <END_OF_CODE+383b2cd2/????>
Trace; c0161541 <chrdev_open+71/b0>
Trace; c015f790 <dentry_open+110/210>
Trace; c015f678 <filp_open+68/70>
Trace; c015fa83 <sys_open+53/c0>

Code;  c01ad325 <init_dev+55/500>
00000000 <_EIP>:
Code;  c01ad325 <init_dev+55/500>   <=====
   0:   8b 04 88                  mov    (%eax,%ecx,4),%eax   <=====
Code;  c01ad328 <init_dev+58/500>
   3:   89 44 24 30               mov    %eax,0x30(%esp,1)
Code;  c01ad32c <init_dev+5c/500>
   7:   85 c0                     test   %eax,%eax
Code;  c01ad32e <init_dev+5e/500>
   9:   0f 84 9c 00 00 00         je     ab <_EIP+0xab>
Code;  c01ad334 <init_dev+64/500>
   f:   8b 54 24 30               mov    0x30(%esp,1),%edx
Code;  c01ad338 <init_dev+68/500>
  13:   8b 00                     mov    (%eax),%eax

Kernel panic: Fatal exception

Version-Release number of selected component (if applicable):
2.4.21-15.ELsmp

How reproducible:
Sometimes

Steps to Reproduce:
Don't have the steps to reproduce, but we don't have to wait longer than a couple of days to get the kernel panic.

Additional info:
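For readers following the dump, here is a minimal sketch of how the faulting address can be cross-checked by hand. It only uses values that appear in the oops above (the eax and ecx registers, the EIP, and the init_dev+55 offset resolved by ksymoops); the helper name effective_address is made up for illustration and is not part of ksymoops or the kernel.

```python
# Values copied from the oops dump; nothing here is measured independently.
regs = {"eax": 0x00000000, "ecx": 0x00000000}
eip = 0xC01AD325          # EIP reported by the oops
offset_in_init_dev = 0x55  # from "init_dev+55/500"


def effective_address(base, index, scale, disp=0):
    """Address of an x86 operand disp(base,index,scale), wrapped to 32 bits.

    Hypothetical helper, written only for this illustration.
    """
    return (base + index * scale + disp) & 0xFFFFFFFF


# Faulting instruction: mov (%eax,%ecx,4),%eax
fault_addr = effective_address(regs["eax"], regs["ecx"], 4)

# Start of init_dev implied by the symbol offset ksymoops printed.
init_dev_start = eip - offset_in_init_dev

print(f"faulting address:   {fault_addr:08x}")
print(f"init_dev starts at: {init_dev_start:08x}")
```

With eax and ecx both zero, the load reads from address 0, which agrees with the "NULL pointer dereference at virtual address 00000000" line. The dump alone does not say which pointer in init_dev was NULL; that would need the kernel source for this exact build.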
What module(s) have tainted your kernel?
We have VERITAS Volume Manager and VERITAS File System (certified for RHEL 3).
Ok, to be precise. We have: VERITAS Volume Manager, File System and Cluster Server - all from VERITAS Foundation Suite HA 2.2 MP1, which is certified for RHEL 3 (but honestly, not for RHEL 3 update 2). In addition to the above, we have also installed (compiled from sources) Intel e1000 module version 5.3.19 (which replaces the original e1000 module supplied with the kernel) and Intel iANS module version 3.4.1 to enable Virtual LANs on ethernet interfaces. We have also installed Oracle ASMlib (for Oracle RAC), which uses a small "oracleasm" kernel module that is certified by Oracle to be compatible with 2.4.21-EL-smp kernel(s). That's it. All other modules are the standard ones included and supported by Red Hat.
Is there any news about this? I just have to add that we've experienced the same oops on several machines so it's less likely a hardware problem...
Hi Peter, Please pursue this problem through the support team, available at this URL: https://www.redhat.com/apps/support/ By working with support, rather than diving directly into bugzilla, they can help work with you to better isolate the problem. As part of that process, they will try to work with you to see if there is a scenario whereby the problem can be reproduced without the Veritas components. That way we are better able to distinguish where the root problem resides. When brought through support, as the problem is better defined, they will reopen a new bugzilla to track and manage the issue. Consequently, I am closing this bugzilla out.
*** This bug has been marked as a duplicate of 144059 ***
A fix for this problem has just been committed to the RHEL3 U5 patch pool this evening (in kernel version 2.4.21-30.EL).
A fix for this problem has also been committed to the RHEL3 E5 patch pool this evening (in kernel version 2.4.21-27.0.3.EL).
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2005-293.html
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2005-294.html