=Comment: #0=================================================
Dany R. Madden <danymadden.com> -
+++ This bug was initially created as a clone of Bug #70576 +++

---Problem Description---
spawning nslookup causing kernel panic.

---Additional Hardware Info---
reproduced on two HS20 systems

---uname output---
Linux alrtbc3.austin.ibm.com 2.6.32-71.el6.x86_64 #1 SMP Wed Sep 1 01:33:01 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux

Machine Type = IBM BladeCenter HS20 -[884345U]-

---System Hang---
kernel panic. System had to be rebooted.

---Debugger---
A debugger is not configured

---Steps to Reproduce---
Run the script below in a loop. Kernel will panic within 10 to 20 iterations. It appears that the spawning frequency is the cause of the issue.

#! /usr/bin/expect
set timeout 15

spawn nslookup -sil
expect {
    -re ".*>" {}
    timeout { send "exit \r"; exit 255 }
}
send "server 9.3.192.21 \r"
expect {
    -re ".*Default server:.*Address:.*>" {}
    timeout { send "exit \r"; exit 1 }
}
send "ltc.austin.ibm.com. \r"
expect {
    -re ".*Name:.*Address:.*>" {}
    timeout { send "exit \r"; exit 2 }
}
send "exit \r"

spawn nslookup -sil
expect {
    -re ".*>" {}
    timeout { send "exit \r"; exit 255 }
}
send "server 9.3.192.21 \r"
expect {
    -re ".*Default server:.*Address:.*>" {}
    timeout { send "exit \r"; exit 1 }
}
send "galaxy.ltc.austin.ibm.com. \r"
expect {
    -re ".*Name:.*Address:.*>" {}
    timeout { send "exit \r"; exit 2 }
}
send "exit \r"

spawn nslookup -sil
expect {
    -re ".*>" {}
    timeout { send "exit \r"; exit 255 }
}
send "server 9.3.192.21 \r"
expect {
    -re ".*Default server:.*Address:.*>" {}
    timeout { send "exit \r"; exit 1 }
}
send "galaxy.ltc.austin.ibm.com. \r"
expect {
    -re ".*Name:.*Address:.*>" {}
    timeout { send "exit \r"; exit 2 }
}
send "exit \r"

exit 0

---Network Component Data---
Userspace tool common name: nslookup
The userspace tool has the following bit modes: both
Userspace rpm: bind-utils
Userspace tool obtained from project website: na
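The report does not say how the loop was driven, so the following is only a minimal sketch of one way to do it. The helper name `run_in_loop` and the default of 20 iterations (taken from the "10 to 20 iterations" estimate above) are assumptions, not part of the original report. On an affected kernel, looping the reproduction script is expected to panic the machine, so this should only be run on a test system.

```python
import subprocess
import sys


def run_in_loop(cmd, iterations=20):
    """Run cmd repeatedly and return its exit codes, one per iteration.

    cmd is an argument list as accepted by subprocess.run, for example
    ["expect", "<path to the saved reproduction script>"].
    """
    exit_codes = []
    for i in range(iterations):
        # Each call starts a fresh process, like re-running the script by hand.
        result = subprocess.run(cmd)
        exit_codes.append(result.returncode)
        # Progress goes to stderr so it does not mix with the command's stdout.
        print(f"iteration {i + 1}: exit code {result.returncode}", file=sys.stderr)
    return exit_codes
```

With the script saved under a hypothetical name such as `nslookup_spawn.exp`, a call would look like `run_in_loop(["expect", "nslookup_spawn.exp"])`. The script's own exit codes (255, 1, 2, 0) then show which step timed out on iterations that did not panic.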
A kernel panic indicates a bug in the kernel, not in the nslookup utility. Can you please verify whether this issue is still reproducible with the latest kernel (currently 2.6.32-71.24.1.el6)? It includes a number of fixes over 2.6.32-71.el6. Reassigning to kernel for further inspection.
------- Comment From clnperez.com 2011-04-26 10:57 EDT-------
Red Hat,

To answer your question: Kernel still panics on 2.6.32-130.el6.i686

[root@xracer5 ~]# uname -a
Linux xracer5.x.x.x 2.6.32-130.el6.i686 #1 SMP Tue Apr 5 19:56:32 EDT 2011 i686 i686 i386 GNU/Linux
------- Comment From clnperez.com 2011-05-17 15:09 EDT------- Red Hat, I have a vmcore from a re-create on the original kernel this was reported on. I've uploaded it to ftp://testcase.software.ibm.com/fromibm/linux/vmcore.gz Are there any updates on this at this point? It has also been seen on a Power system.
------- Comment From arunabal.com 2011-05-23 13:07 EDT-------
Hi,

This looks like a problem with the tty driver:

crash> bt
PID: 24597  TASK: ffff880105dc4af0  CPU: 2  COMMAND: "nslookup"
 #0 [ffff880106475a50] machine_kexec at ffffffff8103695b
 #1 [ffff880106475ab0] crash_kexec at ffffffff810b8f08
 #2 [ffff880106475b80] oops_end at ffffffff814cbbd0
 #3 [ffff880106475bb0] no_context at ffffffff8104651b
 #4 [ffff880106475c00] __bad_area_nosemaphore at ffffffff810467a5
 #5 [ffff880106475c50] bad_area at ffffffff810468ce
 #6 [ffff880106475c80] do_page_fault at ffffffff814cd740
 #7 [ffff880106475cd0] page_fault at ffffffff814caf45
    [exception RIP: n_tty_read+713] ------------------------------------->
    RIP: ffffffff812f88c9  RSP: ffff880106475d88  RFLAGS: 00010246
    RAX: 0000000000000007  RBX: ffff880106474000  RCX: 00007f25d197700a
    RDX: 0000000000000000  RSI: 00000000fffffff9  RDI: ffff88011fee6d1c
    RBP: ffff880106475e98   R8: 0000000000000001   R9: 0000000000000000
    R10: 0000000000000000  R11: 0000000000000000  R12: ffff88011fee6800
    R13: 0000000000000000  R14: ffff88011fee6a68  R15: ffff88011fee6d1c
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #8 [ffff880106475ea0] tty_read at ffffffff812f34c6
 #9 [ffff880106475ef0] vfs_read at ffffffff8116d085
#10 [ffff880106475f30] sys_read at ffffffff8116d1c1
#11 [ffff880106475f80] system_call_fastpath at ffffffff81013172
    RIP: 0000003f2dcd41ed  RSP: 00007f25d1922cb8  RFLAGS: 00010206
    RAX: 0000000000000000  RBX: ffffffff81013172  RCX: 0000000000000000
    RDX: 0000000000000400  RSI: 00007f25d1977000  RDI: 0000000000000000
    RBP: 0000003f2df79780   R8: 0000003f2df7ae10   R9: 00007f25d1923710
    R10: 0000000000000008  R11: 0000000000000293  R12: 0000000000000000
    R13: 000000000000000a  R14: 0000003f2df796a0  R15: 000000000000000a
    ORIG_RAX: 0000000000000000  CS: 0033  SS: 002b
crash>

[exception RIP: n_tty_read+713] maps to

0xffffffff812f88c9 <n_tty_read+713>: movsbl (%rdx,%rax,1),%ebx
spin_lock_irqsave(&tty->read_lock, flags); (linux-2.6.32-71.el6.x86_64/drivers/char/n_tty.c:1821)

This looks like a problem with the read_lock. RDX maps to tty->read_lock, but it contains 0000000000000000. I couldn't examine the tty structure since I didn't find a way to move between frames in the crash tool. RDX gets its value from mov 0x250(%r12),%rdx, and r12 contains a valid address.

Thanks,
Aruna.
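The register arithmetic in the comment above can be checked with a short sketch. It only evaluates the two addressing expressions quoted in the comment, using the register values from the dump; the helper name `effective_address` is illustrative, and the sketch makes no claim about which tty field is stored at offset 0x250.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def effective_address(base, index=0, scale=1, disp=0):
    """Evaluate an AT&T-syntax operand disp(base,index,scale) on x86_64."""
    return (base + index * scale + disp) & MASK64


# Register values copied from the exception frame in the backtrace.
RDX = 0x0000000000000000
RAX = 0x0000000000000007
R12 = 0xFFFF88011FEE6800

# movsbl (%rdx,%rax,1),%ebx : the load that faulted
fault_addr = effective_address(RDX, RAX, 1)

# mov 0x250(%r12),%rdx : where RDX was loaded from
rdx_source = effective_address(R12, disp=0x250)

print(hex(fault_addr))   # 0x7
print(hex(rdx_source))   # 0xffff88011fee6a50
```

With RDX equal to zero, the faulting load targets address 0x7, which is consistent with a page fault on a NULL base pointer plus a small index, while the location RDX was loaded from lies inside the kernel address range pointed to by R12.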
------- Comment From jwboyer.ibm.com 2011-05-24 09:42 EDT------- *** Bug 72013 has been marked as a duplicate of this bug. ***
------- Comment From arunabal.com 2011-06-21 11:29 EDT------- Hi Red Hat, We see the problem fixed in RHEL 6.1, so can you please point us to the specific patch that fixed the issue, if there is one? Thanks, Aruna
------- Comment From clnperez.com 2011-07-20 12:22 EDT------- Red Hat, The submitting team has customers running RHEL 6 and would like to create a patched kernel for them as opposed to requiring that they upgrade to RHEL 6.1. We are unable to pinpoint the exact root cause, and therefore are also unable to create a 6.0 patch for these customers without considerable trial-and-error testing. As a result, we're requesting assistance in determining the cause of the kernel panic if you don't already know which patches we should be looking at backporting. Thanks!
Is this triggered by a local, unprivileged user?
------- Comment From aruna.ibm.com 2011-09-08 06:10 EDT------- It was triggered by root. Thanks.
------- Comment From aruna.ibm.com 2011-09-30 06:43 EDT------- Hi Red Hat, We see the problem fixed in RHEL 6.1, so can you please point us to the specific patch that fixed the issue, if there is one? Thanks, Aruna
Since RHEL 6.2 External Beta has begun, and this bug remains unresolved, it has been rejected as it is not proposed as exception or blocker. Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux.
------- Comment From clnperez.com 2011-10-10 16:52 EDT------- RH, The submitting team has asked that this be updated to urgent, stating the following: I counted 13 of our customers (some of whom have GA'd) have built their products based on RHEL 6. Upgrading to RHEL 6.1 would be costly compared to just having a simple patch that we can provide to them. This bug was discovered in our testing and is easily reproducible. We want to have the fix before our customers report the same bug.