Bug 696032 - bind9-utils--spawn nslookup causes kernel panic
Summary: bind9-utils--spawn nslookup causes kernel panic
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Version: 6.0
Hardware: x86_64
OS: All
Priority: unspecified
Severity: urgent
Target Milestone: rc
Target Release: ---
Assignee: Steve Best
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2011-04-13 05:50 UTC by IBM Bug Proxy
Modified: 2012-05-10 13:05 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-10-12 19:53:03 UTC
Target Upstream Version:


Links
System: IBM Linux Technology Center
ID: 71034
Private: 0
Priority: None
Status: None
Summary: None
Last Updated: Never

Description IBM Bug Proxy 2011-04-13 05:50:50 UTC
=Comment: #0=================================================
Dany R. Madden <danymadden.com> - 
+++ This bug was initially created as a clone of Bug #70576 +++

 
---Problem Description---
Spawning nslookup causes a kernel panic.
 
---Additional Hardware Info---
reproduced on two HS20 systems 

 
---uname output---
Linux alrtbc3.austin.ibm.com 2.6.32-71.el6.x86_64 #1 SMP Wed Sep 1 01:33:01 EDT 2010 x86_64 x86_64
x86_64 GNU/Linux
 
Machine Type = IBM BladeCenter HS20 -[884345U]- 
 
---System Hang---
 Kernel panic. The system had to be rebooted.
 
---Debugger---
A debugger is not configured
 
---Steps to Reproduce---
 Run the script below in a loop. The kernel panics within 10 to 20 iterations. The spawning
frequency appears to be the cause of the issue.

#! /usr/bin/expect
      set timeout 15
      spawn nslookup -sil
      expect {
        -re ".*>" {}
        timeout { send "exit \r"; exit 255 }
      }

      send "server 9.3.192.21 \r"
      expect {
        -re ".*Default server:.*Address:.*>" {}
        timeout { send "exit \r"; exit 1 }
      }

      send "ltc.austin.ibm.com. \r"
      expect {
        -re ".*Name:.*Address:.*>" {}
        timeout { send "exit \r"; exit 2 }
      }
                        
                        
      send "exit \r"

      spawn nslookup -sil
      expect {
        -re ".*>" {}
        timeout { send "exit \r"; exit 255 }
      }

      send "server 9.3.192.21 \r"
      expect {
        -re ".*Default server:.*Address:.*>" {}
        timeout { send "exit \r"; exit 1 }
      }

      send "galaxy.ltc.austin.ibm.com. \r"
      expect {
        -re ".*Name:.*Address:.*>" {}
        timeout { send "exit \r"; exit 2 }
      }


      send "exit \r"

      spawn nslookup -sil
      expect {
        -re ".*>" {}
        timeout { send "exit \r"; exit 255 }
      }

      send "server 9.3.192.21 \r"
      expect {
        -re ".*Default server:.*Address:.*>" {}
        timeout { send "exit \r"; exit 1 }
      }

      send "galaxy.ltc.austin.ibm.com. \r"
      expect {
        -re ".*Name:.*Address:.*>" {}
        timeout { send "exit \r"; exit 2 }
      }


      send "exit \r"

      exit 0
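
A minimal sketch of the "run the script in a loop" step, written in Python rather than Expect. It assumes the script above has been saved to a file; the name nslookup-repro.exp is hypothetical and does not come from this report. The helper only drives the loop and reports the first non-zero exit status (the script exits 1, 2, or 255 on a timeout). If the kernel panics, the driver dies with the machine, so the last printed iteration number has to be read from the console or a log.

```python
import subprocess
import sys


def run_in_loop(cmd, iterations=20):
    """Run cmd repeatedly.

    Returns (iteration, returncode) for the first run that exits non-zero,
    or None if every run exits with status 0.
    """
    for i in range(1, iterations + 1):
        # Flush immediately so the iteration number survives a sudden crash
        # as far as the console or log allows.
        print(f"iteration {i}", flush=True)
        result = subprocess.run(cmd)
        if result.returncode != 0:
            return (i, result.returncode)
    return None


# Example call (assumption: the expect script was saved under this name):
#   run_in_loop(["expect", "./nslookup-repro.exp"], iterations=20)
```

The report states that the panic shows up within 10 to 20 iterations, hence the default of 20.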

 
---Network Component Data--- 
Userspace tool common name: nslookup 
 
The userspace tool has the following bit modes: both 

Userspace rpm: bind-utils 

Userspace tool obtained from project website:  na

Comment 2 Adam Tkac 2011-04-13 09:06:21 UTC
I'm sure a kernel panic means a bug in the kernel, not in the nslookup utility.

Can you please verify that this issue is still reproducible with the latest kernel (currently 2.6.32-71.24.1.el6)? It includes a number of fixes over 2.6.32-71.el6.

Reassigning to kernel for further inspection.

Comment 4 IBM Bug Proxy 2011-04-26 15:01:27 UTC
------- Comment From clnperez.com 2011-04-26 10:57 EDT-------
Red Hat,

To answer your question:

Kernel still panics on 2.6.32-130.el6.i686

[root@xracer5 ~]# uname -a
Linux xracer5.x.x.x 2.6.32-130.el6.i686 #1 SMP Tue Apr 5 19:56:32
EDT 2011 i686 i686 i386 GNU/Linux

Comment 5 IBM Bug Proxy 2011-05-17 19:10:25 UTC
------- Comment From clnperez.com 2011-05-17 15:09 EDT-------
Red Hat,

I have a vmcore from a re-create on the original kernel this was reported on. I've uploaded it to ftp://testcase.software.ibm.com/fromibm/linux/vmcore.gz

Are there any updates on this at this point? It has also been seen on a Power system.

Comment 6 IBM Bug Proxy 2011-05-23 17:10:26 UTC
------- Comment From arunabal.com 2011-05-23 13:07 EDT-------
Hi,

This looks like a problem with the tty driver:

crash> bt
PID: 24597  TASK: ffff880105dc4af0  CPU: 2   COMMAND: "nslookup"
#0 [ffff880106475a50] machine_kexec at ffffffff8103695b
#1 [ffff880106475ab0] crash_kexec at ffffffff810b8f08
#2 [ffff880106475b80] oops_end at ffffffff814cbbd0
#3 [ffff880106475bb0] no_context at ffffffff8104651b
#4 [ffff880106475c00] __bad_area_nosemaphore at ffffffff810467a5
#5 [ffff880106475c50] bad_area at ffffffff810468ce
#6 [ffff880106475c80] do_page_fault at ffffffff814cd740
#7 [ffff880106475cd0] page_fault at ffffffff814caf45
[exception RIP: n_tty_read+713] ------------------------------------->
RIP: ffffffff812f88c9  RSP: ffff880106475d88  RFLAGS: 00010246
RAX: 0000000000000007  RBX: ffff880106474000  RCX: 00007f25d197700a
RDX: 0000000000000000  RSI: 00000000fffffff9  RDI: ffff88011fee6d1c
RBP: ffff880106475e98   R8: 0000000000000001   R9: 0000000000000000
R10: 0000000000000000  R11: 0000000000000000  R12: ffff88011fee6800
R13: 0000000000000000  R14: ffff88011fee6a68  R15: ffff88011fee6d1c
ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
#8 [ffff880106475ea0] tty_read at ffffffff812f34c6
#9 [ffff880106475ef0] vfs_read at ffffffff8116d085
#10 [ffff880106475f30] sys_read at ffffffff8116d1c1
#11 [ffff880106475f80] system_call_fastpath at ffffffff81013172
RIP: 0000003f2dcd41ed  RSP: 00007f25d1922cb8  RFLAGS: 00010206
RAX: 0000000000000000  RBX: ffffffff81013172  RCX: 0000000000000000
RDX: 0000000000000400  RSI: 00007f25d1977000  RDI: 0000000000000000
RBP: 0000003f2df79780   R8: 0000003f2df7ae10   R9: 00007f25d1923710
R10: 0000000000000008  R11: 0000000000000293  R12: 0000000000000000
R13: 000000000000000a  R14: 0000003f2df796a0  R15: 000000000000000a
ORIG_RAX: 0000000000000000  CS: 0033  SS: 002b
crash>

[exception RIP: n_tty_read+713] maps to
0xffffffff812f88c9 <n_tty_read+713>:    movsbl (%rdx,%rax,1),%ebx

spin_lock_irqsave(&tty->read_lock, flags); (linux-2.6.32-71.el6.x86_64/drivers/char/n_tty.c:1821)

This looks like a problem with the read_lock.

RDX maps to tty->read_lock, but it contains 0000000000000000. I couldn't examine the tty structure because I didn't find a way to move between frames in the crash tool.
RDX gets its value from mov    0x250(%r12),%rdx, and r12 contains a valid address.

Thanks,
Aruna.
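
A small sketch of the address arithmetic behind the analysis above, in Python. In AT&T syntax, the operand (%rdx,%rax,1) refers to base + index * scale, so with the RDX and RAX values from the exception frame the faulting instruction reads from 0 + 7 * 1. The register values are copied from the backtrace; the helper names are illustrative only. The sketch does not try to confirm which tty field sits at offset 0x250.

```python
import re

# Exception-frame registers copied from the backtrace above.
DUMP = """
RIP: ffffffff812f88c9  RSP: ffff880106475d88  RFLAGS: 00010246
RAX: 0000000000000007  RBX: ffff880106474000  RCX: 00007f25d197700a
RDX: 0000000000000000  RSI: 00000000fffffff9  RDI: ffff88011fee6d1c
RBP: ffff880106475e98   R8: 0000000000000001   R9: 0000000000000000
R10: 0000000000000000  R11: 0000000000000000  R12: ffff88011fee6800
R13: 0000000000000000  R14: ffff88011fee6a68  R15: ffff88011fee6d1c
"""

# Matches 64-bit registers printed as "NAME: 16 hex digits"; RFLAGS is skipped
# because its name and value do not fit this pattern.
REG_RE = re.compile(r"\b(R[A-Z0-9]{1,2}):\s*([0-9a-f]{16})\b")


def parse_registers(dump):
    """Return a dict mapping register name to integer value."""
    return {name: int(value, 16) for name, value in REG_RE.findall(dump)}


def effective_address(base, index=0, scale=1, disp=0):
    """Compute disp + base + index * scale, truncated to 64 bits."""
    return (base + index * scale + disp) & 0xFFFFFFFFFFFFFFFF


regs = parse_registers(DUMP)

# movsbl (%rdx,%rax,1),%ebx reads one byte from RDX + RAX * 1.
fault_addr = effective_address(regs["RDX"], regs["RAX"], scale=1)

# mov 0x250(%r12),%rdx loads RDX from R12 + 0x250.
load_addr = effective_address(regs["R12"], disp=0x250)

print(hex(fault_addr))  # 0x7
print(hex(load_addr))   # 0xffff88011fee6a50
```

An address of 0x7 is what a NULL base pointer plus a small index would produce, which is consistent with the page fault in the backtrace. In crash, the memory at the second address could be inspected to see what value the mov actually loaded.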

Comment 7 IBM Bug Proxy 2011-05-24 13:50:36 UTC
------- Comment From jwboyer.ibm.com 2011-05-24 09:42 EDT-------
*** Bug 72013 has been marked as a duplicate of this bug. ***

Comment 8 IBM Bug Proxy 2011-06-21 15:30:24 UTC
------- Comment From arunabal.com 2011-06-21 11:29 EDT-------
Hi Red Hat,

We see the problem fixed in RHEL 6.1. Can you please point us to the specific patch that fixed the issue, if there is one?

Thanks,
Aruna

Comment 9 IBM Bug Proxy 2011-07-20 16:30:26 UTC
------- Comment From clnperez.com 2011-07-20 12:22 EDT-------
Red Hat,

The submitting team has customers running RHEL 6 and would like to create a patched kernel for them as opposed to requiring that they upgrade to RHEL 6.1. We are unable to pinpoint the exact root cause, and therefore are also unable to create a 6.0 patch for these customers without considerable trial-and-error testing.

As a result, we're requesting assistance in determining the cause of the kernel panic if you don't already know which patches we should be looking at backporting.

Thanks!

Comment 10 Eugene Teo (Security Response) 2011-08-24 08:13:58 UTC
Is this triggered by a local, unprivileged user?

Comment 11 IBM Bug Proxy 2011-08-24 08:24:55 UTC
------- Comment From aruna.ibm.com 2011-05-23 13:07 EDT-------

------- Comment From aruna.ibm.com 2011-06-21 11:29 EDT-------

Comment 12 IBM Bug Proxy 2011-09-08 10:20:23 UTC
------- Comment From aruna.ibm.com 2011-09-08 06:10 EDT-------
It was triggered by root.

Thanks.

Comment 13 IBM Bug Proxy 2011-09-30 10:50:20 UTC
------- Comment From aruna.ibm.com 2011-09-30 06:43 EDT-------
Hi Red Hat,

We see the problem fixed in RHEL 6.1. Can you please point us to the specific
patch that fixed the issue, if there is one?

Thanks,
Aruna

Comment 14 RHEL Program Management 2011-10-07 15:30:17 UTC
Since RHEL 6.2 External Beta has begun, and this bug remains
unresolved, it has been rejected as it is not proposed as
exception or blocker.

Red Hat invites you to ask your support representative to
propose this request, if appropriate and relevant, in the
next release of Red Hat Enterprise Linux.

Comment 15 IBM Bug Proxy 2011-10-10 21:00:22 UTC
------- Comment From clnperez.com 2011-10-10 16:52 EDT-------
RH,

The submitting team has asked that this be updated to urgent, stating the following:

I counted 13 of our customers (some of whom have GA'd) who have built their products on RHEL 6. Upgrading to RHEL 6.1 would be costly compared to having a simple patch that we can provide to them. This bug was discovered in our testing and is easily reproducible. We want to have the fix before our customers report the same bug.

