Bug 569106 - netconsole fails with tg3
Summary: netconsole fails with tg3
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.5
Hardware: All
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: ---
Assignee: John Feeney
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks: 600498
Reported: 2010-02-28 06:37 UTC by Vivian Bian
Modified: 2011-01-13 20:36 UTC (History)
CC List: 20 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 654613
Environment:
Last Closed: 2011-01-13 20:36:12 UTC
Target Upstream Version:
Embargoed:


Attachments
serial output on RHEVH (767.31 KB, text/plain)
2010-02-28 06:37 UTC, Vivian Bian
vdc-log.txt (11.24 KB, text/plain)
2010-02-28 06:50 UTC, Vivian Bian
serial log (98.49 KB, application/x-bzip)
2010-04-16 13:20 UTC, Vivian Bian


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2011:0017 0 normal SHIPPED_LIVE Important: Red Hat Enterprise Linux 5.6 kernel security and bug fix update 2011-01-13 10:37:42 UTC

Description Vivian Bian 2010-02-28 06:37:36 UTC
Created attachment 396832 [details]
serial output on RHEVH

Description of problem:
After approving a RHEVH host that is in pending-approval status, and once the installation has been running for a while, a kdump is generated on the RHEVH installed on an a/p array.

Version-Release number of selected component (if applicable):
kernel  2.6.18-190.el5
vdsm-4.5-32.el5rhev
RHEVH 5.5-2.2 (0.2)

RHEVM-43835.exe

How reproducible:
98%

Steps to Reproduce:
1. Install a new RHEVH on an a/p array.
2. Register the RHEVH to RHEVM.
3. Approve the newly registered RHEVH in RHEVM.
  
Actual results:
kdump on RHEVH

Expected results:


Additional info:

Comment 1 Vivian Bian 2010-02-28 06:40:37 UTC
The record in vdsm.log at the point where the kdump happens is:
============================================================
MainThread::INFO::2010-02-28 13:04:48,559::dispatcher::69::irs::Run and protect: prepareForShutdown, args: ()
MainThread::DEBUG::2010-02-28 13:04:48,560::task::573::irs::Task 9bc750fe-f0a8-46de-be55-13744a4114ac: Prepare: <bound method HSM.public_prepareForShutdown of <storage.hsm.HSM instance at 0x2ad23c5dfbd8>> () {}
MainThread::DEBUG::2010-02-28 13:04:48,561::task::573::irs::Task 9bc750fe-f0a8-46de-be55-13744a4114ac: moving from state init -> state preparing
MainThread::DEBUG::2010-02-28 13:04:48,561::misc::91::irs::['/usr/bin/killall', '-g', '-USR1', 'spmprotect.sh'] (cwd None)
MainThread::WARNING::2010-02-28 13:04:48,604::misc::112::irs::FAILED: <err> = 'spmprotect.sh: no process killed\n'; <rc> = 1
MainThread::DEBUG::2010-02-28 13:04:48,606::spm::192::irs::(SPM.__cleanup) cleaning links; [] []
MainThread::DEBUG::2010-02-28 13:04:48,606::taskManager::65::irs::(TaskManager.prepareForShutdown) Request to stop all tasks
MainThread::DEBUG::2010-02-28 13:04:48,606::task::573::irs::Task 9bc750fe-f0a8-46de-be55-13744a4114ac: finished: None
MainThread::DEBUG::2010-02-28 13:04:48,606::task::573::irs::Task 9bc750fe-f0a8-46de-be55-13744a4114ac: moving from state preparing -> state finished
MainThread::DEBUG::2010-02-28 13:04:48,607::resource::656::irs::Owner.releaseAll requests [] resources []
MainThread::DEBUG::2010-02-28 13:04:48,607::task::573::irs::Task 9bc750fe-f0a8-46de-be55-13744a4114ac: ref 0 aborting False
MainThread::INFO::2010-02-28 13:04:48,608::dispatcher::74::irs::Run and protect: prepareForShutdown, Return response: {'status': {'message': 'OK', 'code': 0}}
MainThread::INFO::2010-02-28 13:04:48,608::vdsm::92::vds::VDSM main thread ended. Waiting for 12 other threads...
MainThread::INFO::2010-02-28 13:04:48,608::vdsm::95::vds::<WorkerThread(Thread-1, started)>
MainThread::INFO::2010-02-28 13:04:48,608::vdsm::95::vds::<WorkerThread(Thread-3, started)>
MainThread::INFO::2010-02-28 13:04:48,609::vdsm::95::vds::<HostStatsThread(Thread-11, started)>
MainThread::INFO::2010-02-28 13:04:48,609::vdsm::95::vds::<WorkerThread(Thread-4, started)>
MainThread::INFO::2010-02-28 13:04:48,609::vdsm::95::vds::<KsmMonitorThread(KsmMonitor, started daemon)>
MainThread::INFO::2010-02-28 13:04:48,609::vdsm::95::vds::<WorkerThread(Thread-5, started)>
MainThread::INFO::2010-02-28 13:04:48,610::vdsm::95::vds::<WorkerThread(Thread-6, started)>
MainThread::INFO::2010-02-28 13:04:48,610::vdsm::95::vds::<_MainThread(MainThread, started)>
MainThread::INFO::2010-02-28 13:04:48,610::vdsm::95::vds::<WorkerThread(Thread-10, started)>
MainThread::INFO::2010-02-28 13:04:48,610::vdsm::95::vds::<WorkerThread(Thread-7, started)>
MainThread::INFO::2010-02-28 13:04:48,611::vdsm::95::vds::<WorkerThread(Thread-8, started)>
MainThread::INFO::2010-02-28 13:04:48,611::vdsm::95::vds::<WorkerThread(Thread-9, started)>
MainThread::INFO::2010-02-28 13:04:48,611::vdsm::95::vds::<WorkerThread(Thread-2, started)>

Comment 2 Vivian Bian 2010-02-28 06:43:40 UTC
The kdump core file analysis result is:
=====================================

crash> bt
PID: 22442  TASK: ffff81020e766100  CPU: 1   COMMAND: "modprobe"
#0 [ffff8102072ffb50] crash_kexec at ffffffff800ae9d8
#1 [ffff8102072ffc10] __die at ffffffff80066157
#2 [ffff8102072ffc50] do_page_fault at ffffffff80067dd7
#3 [ffff8102072ffd40] error_exit at ffffffff8005ede9
   [exception RIP: tg3_interrupt+15]
   RIP: ffffffff882c4e3a  RSP: ffff8102072ffdf8  RFLAGS: 00010046
   RAX: 0000000000000000  RBX: 0000000000000000  RCX: 0000000000000001
   RDX: 0000000000000000  RSI: ffff8102298ba000  RDI: 000000000000009a
   RBP: ffff8102298ba000   R8: 00000000e22cbc42   R9: 0000000000000000
   R10: 0000000000000000  R11: 0000000000000000  R12: 0000000000000000
   R13: ffff810229ea61c0  R14: 0000000000000000  R15: 0000000018700580
   ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
#4 [ffff8102072ffe10] tg3_poll_controller at ffffffff882c82e4
#5 [ffff8102072ffe30] netpoll_poll at ffffffff8023a842
#6 [ffff8102072ffe90] netpoll_send_skb at ffffffff8023ac56
#7 [ffff8102072ffec0] write_msg at ffffffff885b30d0
#8 [ffff8102072ffef0] __call_console_drivers at ffffffff80092c53
#9 [ffff8102072fff10] release_console_sem at ffffffff80017203
#10 [ffff8102072fff40] init_module at ffffffff885b307c
#11 [ffff8102072fff50] sys_init_module at ffffffff800a7e4d
#12 [ffff8102072fff80] tracesys at ffffffff8005e28d (via system_call)
   RIP: 00002b2fc3a7317a  RSP: 00007fffc4b9fad8  RFLAGS: 00000206
   RAX: ffffffffffffffda  RBX: ffffffff8005e28d  RCX: ffffffffffffffff
   RDX: 00000000186f64c0  RSI: 0000000000009800  RDI: 0000000018700580
   RBP: 00000000186f63f8   R8: 00002b2fc3cf66e0   R9: 0000000018709d80
   R10: 00000000186f64c0  R11: 0000000000000206  R12: 0000000000000000
   R13: 00000000186f63e0  R14: 0000000000000000  R15: 0000000018700580
   ORIG_RAX: 00000000000000af  CS: 0033  SS: 002b
crash> crash> exit

Comment 3 Vivian Bian 2010-02-28 06:50:25 UTC
Created attachment 396833 [details]
vdc-log.txt

Comment 5 Vivian Bian 2010-03-07 12:52:23 UTC
With kernel 2.6.18-191.el5, RHEVH 5.5-2.2.0.4, and vdsm-4.5-34.el5rhev, this bug is now 100% reproducible.

The kdump analysis result is the same as in comment #2, and the output of tail -F /var/log/vdsm/vdsm.log at the dump point is as follows:

MainThread::DEBUG::2010-03-07 19:38:20,342::resource::656::irs::Owner.releaseAll requests [] resources []
MainThread::INFO::2010-03-07 19:38:20,343::dispatcher::69::irs::Run and protect: prepareForShutdown, args: ()
MainThread::DEBUG::2010-03-07 19:38:20,343::task::573::irs::Task 296278ed-4cd5-4cf6-9759-7cdcd743ba49: Prepare: <bound method HSM.public_prepareForShutdown of <storage.hsm.HSM instance at 0x2b30b113aab8>> () {}
MainThread::DEBUG::2010-03-07 19:38:20,343::task::573::irs::Task 296278ed-4cd5-4cf6-9759-7cdcd743ba49: moving from state init -> state preparing
MainThread::DEBUG::2010-03-07 19:38:20,343::misc::91::irs::['/usr/bin/killall', '-g', '-USR1', 'spmprotect.sh'] (cwd None)
MainThread::WARNING::2010-03-07 19:38:20,385::misc::112::irs::FAILED: <err> = 'spmprotect.sh: no process killed\n'; <rc> = 1
MainThread::DEBUG::2010-03-07 19:38:20,386::spm::192::irs::(SPM.__cleanup) cleaning links; [] []
MainThread::DEBUG::2010-03-07 19:38:20,386::taskManager::65::irs::(TaskManager.prepareForShutdown) Request to stop all tasks
MainThread::DEBUG::2010-03-07 19:38:20,386::task::573::irs::Task 296278ed-4cd5-4cf6-9759-7cdcd743ba49: finished: None
MainThread::DEBUG::2010-03-07 19:38:20,387::task::573::irs::Task 296278ed-4cd5-4cf6-9759-7cdcd743ba49: moving from state preparing -> state finished
MainThread::DEBUG::2010-03-07 19:38:20,387::resource::656::irs::Owner.releaseAll requests [] resources []
MainThread::DEBUG::2010-03-07 19:38:20,387::task::573::irs::Task 296278ed-4cd5-4cf6-9759-7cdcd743ba49: ref 0 aborting False
MainThread::INFO::2010-03-07 19:38:20,387::dispatcher::74::irs::Run and protect: prepareForShutdown, Return response: {'status': {'message': 'OK', 'code': 0}}
MainThread::INFO::2010-03-07 19:38:20,388::vdsm::92::vds::VDSM main thread ended. Waiting for 12 other threads...
MainThread::INFO::2010-03-07 19:38:20,388::vdsm::95::vds::<WorkerThread(Thread-1, started)>
MainThread::INFO::2010-03-07 19:38:20,388::vdsm::95::vds::<WorkerThread(Thread-3, started)>
MainThread::INFO::2010-03-07 19:38:20,388::vdsm::95::vds::<WorkerThread(Thread-5, started)>
MainThread::INFO::2010-03-07 19:38:20,389::vdsm::95::vds::<KsmMonitorThread(KsmMonitor, started daemon)>
MainThread::INFO::2010-03-07 19:38:20,389::vdsm::95::vds::<WorkerThread(Thread-6, started)>
MainThread::INFO::2010-03-07 19:38:20,389::vdsm::95::vds::<WorkerThread(Thread-7, started)>
MainThread::INFO::2010-03-07 19:38:20,389::vdsm::95::vds::<WorkerThread(Thread-8, started)>
MainThread::INFO::2010-03-07 19:38:20,390::vdsm::95::vds::<_MainThread(MainThread, started)>
MainThread::INFO::2010-03-07 19:38:20,390::vdsm::95::vds::<WorkerThread(Thread-9, started)>
MainThread::INFO::2010-03-07 19:38:20,390::vdsm::95::vds::<WorkerThread(Thread-2, started)>
MainThread::INFO::2010-03-07 19:38:20,390::vdsm::95::vds::<HostStatsThread(Thread-11, started)>
MainThread::INFO::2010-03-07 19:38:20,391::vdsm::95::vds::<WorkerThread(Thread-10, started)>
MainThread::INFO::2010-03-07 19:38:20,391::vdsm::95::vds::<WorkerThread(Thread-4, started)>

Comment 6 Vivian Bian 2010-03-09 14:36:23 UTC
After further investigation, we now have the following results:

1. After RHEVM approval, the netconsole setup in /etc/sysconfig/netconsole is enabled.
2. When vdsmd starts, the netconsole daemon is initialized as well.
3. Starting netconsole on a tg3 NIC always triggers a kdump.

Comment 10 John Feeney 2010-03-19 17:31:01 UTC
I have a fix that might solve this problem. Please refer to the kernels on my people page, http://people.redhat.com/jfeeney/.rhel5-tg3

Any testing feedback would be appreciated.

Comment 11 XinSun 2010-04-09 09:49:53 UTC
I can reproduce this bug on rhev-h-5.5-2.2.0.10 with:
kernel-2.6.18-194.el5
vdsm-4.5-45.el5rhev

However, the problem does not reproduce 100% of the time, maybe 20%.

#cat /var/log/vdsm/vdsm.log

MainThread::INFO::2010-04-09 09:16:10,316::vdsm::49::vds::I am the actual vdsmd
MainThread::INFO::2010-04-09 09:16:10,317::vdsm::37::vds::I am Watchdog - vdsm pid is 12306
MainThread::ERROR::2010-04-09 09:16:10,490::vdsm::89::vds::Traceback (most recent call last):
  File "/usr/share/vdsm/vdsm", line 87, in run
    serve_clients(log)
  File "/usr/share/vdsm/vdsm", line 61, in serve_clients
    cif = clientIF.clientIF(log)
  File "/usr/share/vdsm/clientIF.py", line 52, in __init__
  File "/usr/share/vdsm/clientIF.py", line 118, in _createXMLRPCServer
  File "/usr/share/vdsm/SecureXMLRPCServer.py", line 132, in __init__
  File "/usr/lib64/python2.4/site-packages/M2Crypto/SSL/Context.py", line 96, in load_cert_chain
SSLError: No such file or directory

MainThread::INFO::2010-04-09 09:16:10,490::vdsm::91::vds::VDSM main thread ended. Waiting for 0 other threads...
MainThread::INFO::2010-04-09 09:16:10,491::vdsm::94::vds::<_MainThread(MainThread, started)>
MainThread::ERROR::2010-04-09 09:16:10,519::vdsm::45::vds::VDSMD crashed immediately - ( < 2 sec) exiting without recovering
MainThread::ERROR::2010-04-09 09:16:10,519::vdsm::89::vds::Traceback (most recent call last):
  File "/usr/share/vdsm/vdsm", line 85, in run
    startWatchdog(log)
  File "/usr/share/vdsm/vdsm", line 46, in startWatchdog
    sys.exit(3)
SystemExit: 3

MainThread::INFO::2010-04-09 09:16:10,520::vdsm::91::vds::VDSM main thread ended. Waiting for 0 other threads...
MainThread::INFO::2010-04-09 09:16:10,520::vdsm::94::vds::<_MainThread(MainThread, started)>
MainThread::INFO::2010-04-09 09:22:10,920::vdsm::49::vds::I am the actual vdsmd
MainThread::INFO::2010-04-09 09:22:10,922::vdsm::37::vds::I am Watchdog - vdsm pid is 11738
MainThread::ERROR::2010-04-09 09:22:11,100::vdsm::89::vds::Traceback (most recent call last):
  File "/usr/share/vdsm/vdsm", line 87, in run
    serve_clients(log)
  File "/usr/share/vdsm/vdsm", line 61, in serve_clients
    cif = clientIF.clientIF(log)
  File "/usr/share/vdsm/clientIF.py", line 52, in __init__
  File "/usr/share/vdsm/clientIF.py", line 118, in _createXMLRPCServer
  File "/usr/share/vdsm/SecureXMLRPCServer.py", line 132, in __init__
  File "/usr/lib64/python2.4/site-packages/M2Crypto/SSL/Context.py", line 96, in load_cert_chain
SSLError: No such file or directory

MainThread::INFO::2010-04-09 09:22:11,101::vdsm::91::vds::VDSM main thread ended. Waiting for 0 other threads...
MainThread::INFO::2010-04-09 09:22:11,101::vdsm::94::vds::<_MainThread(MainThread, started)>
MainThread::ERROR::2010-04-09 09:22:11,128::vdsm::45::vds::VDSMD crashed immediately - ( < 2 sec) exiting without recovering
MainThread::ERROR::2010-04-09 09:22:11,129::vdsm::89::vds::Traceback (most recent call last):
  File "/usr/share/vdsm/vdsm", line 85, in run
    startWatchdog(log)
  File "/usr/share/vdsm/vdsm", line 46, in startWatchdog
    sys.exit(3)
SystemExit: 3

MainThread::INFO::2010-04-09 09:22:11,129::vdsm::91::vds::VDSM main thread ended. Waiting for 0 other threads...
MainThread::INFO::2010-04-09 09:22:11,129::vdsm::94::vds::<_MainThread(MainThread, started)>

Comment 12 Alan Pevec 2010-04-09 10:35:40 UTC
XinSun, is that a machine with tg3?
Note that the patch from comment 10 is not yet included in the 5.5 kernel.

Mike, could you please provide QE with a test image containing the above test kernel?

Comment 13 XinSun 2010-04-09 10:55:35 UTC
Hi Alan, yes, it is a tg3 NIC.

Comment 17 Vivian Bian 2010-04-16 13:20:44 UTC
Created attachment 407098 [details]
serial log

Comment 20 RHEL Program Management 2010-05-20 12:48:42 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 23 Jarod Wilson 2010-06-14 18:22:47 UTC
in kernel-2.6.18-203.el5
You can download this test kernel from http://people.redhat.com/jwilson/el5

Detailed testing feedback is always welcomed.

Comment 25 Douglas Silas 2010-06-28 20:49:54 UTC
Technical note added. If any revisions are required, please edit the "Technical Notes" field
accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
Red Hat Enterprise Linux 5.4 SMP guests running on the Red Hat Enterprise Virtualization Hypervisor may have experienced inconsistent time, such as the clock drifting backwards. This could have caused some applications to become unresponsive.

Comment 27 Douglas Silas 2010-06-28 20:52:05 UTC
Deleted Technical Notes Contents.

Old Contents:
Red Hat Enterprise Linux 5.4 SMP guests running on the Red Hat Enterprise Virtualization Hypervisor may have experienced inconsistent time, such as the clock drifting backwards. This could have caused some applications to become unresponsive.

Comment 29 Matt Carlson 2010-09-15 19:26:51 UTC
John, I might know what this is.  If I developed a test patch, is someone still available to test it?

Comment 30 John Feeney 2010-09-15 19:50:25 UTC
Hey Matt,
I'm pretty sure I know what this is: it's a result of tg3_poll_controller() not passing the correct second argument to tg3_interrupt(). The following patch was found to fix the problem.

diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
index 939bdd7..5dc97df 100644
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -4896,7 +4896,7 @@ static void tg3_poll_controller(struct net_device *dev)
 	struct tg3 *tp = netdev_priv(dev);
 
 	for (i = 0; i < tp->irq_cnt; i++)
-		tg3_interrupt(tp->napi[i].irq_vec, dev, NULL);
+		tg3_interrupt(tp->napi[i].irq_vec, &tp->napi[i], NULL);
 }
 #endif

I hope that's the same one you have; otherwise we may have another problem. It looks like this just needs to be verified. A z-stream bugzilla was created and is now in the CLOSED ERRATA state, so someone must have tested it and found it fixed.

But thanks for offering a patch.
  John
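
To illustrate the mismatch the patch above corrects, here is a minimal user-space sketch in C. It is a model under stated assumptions, not the tg3 driver: `napi_ctx`, `dev_priv`, `model_interrupt`, `is_registered_cookie`, and both `poll_controller_*` functions are hypothetical names, and the explicit cookie check exists only so the model can report the mismatch instead of faulting. What it tries to show is that the polled path has to hand the handler the per-vector pointer the handler expects, not the device pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed to mirror the driver's "status block updated" bit; value is illustrative. */
#define SD_STATUS_UPDATED 0x1u
#define MAX_VECS 4

struct status_block { unsigned int status; };

/* Hypothetical per-vector context: the type the handler expects as its cookie. */
struct napi_ctx {
    struct status_block *hw_status;
    unsigned int irq_vec;
};

/* Hypothetical device-wide state that owns the per-vector contexts. */
struct dev_priv {
    unsigned int irq_cnt;
    struct napi_ctx napi[MAX_VECS];
};

/* Model-only guard: is dev_id one of this device's per-vector contexts? */
static int is_registered_cookie(const struct dev_priv *tp, const void *dev_id)
{
    for (unsigned int i = 0; i < tp->irq_cnt; i++)
        if (dev_id == (const void *)&tp->napi[i])
            return 1;
    return 0;
}

/*
 * Model handler. Returns 1 if the status block reports an update, 0 if it
 * does not, and -1 where a handler without the guard would misread the cookie.
 */
static int model_interrupt(const struct dev_priv *tp, void *dev_id)
{
    if (!is_registered_cookie(tp, dev_id))
        return -1;

    const struct napi_ctx *tnapi = dev_id;
    assert(tnapi->hw_status != NULL);
    return (tnapi->hw_status->status & SD_STATUS_UPDATED) ? 1 : 0;
}

/* Pre-patch shape: every vector is polled with the device itself as cookie. */
static int poll_controller_buggy(struct dev_priv *tp)
{
    int mismatches = 0;
    for (unsigned int i = 0; i < tp->irq_cnt; i++)
        if (model_interrupt(tp, tp) < 0)
            mismatches++;
    return mismatches;
}

/* Post-patch shape: each vector is polled with its own context as cookie. */
static int poll_controller_fixed(struct dev_priv *tp)
{
    int mismatches = 0;
    for (unsigned int i = 0; i < tp->irq_cnt; i++)
        if (model_interrupt(tp, &tp->napi[i]) < 0)
            mismatches++;
    return mismatches;
}
```

In this model, with two vectors configured and each `hw_status` pointing at a valid status block, `poll_controller_buggy()` reports one mismatch per vector while `poll_controller_fixed()` reports none.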

Comment 31 Matt Carlson 2010-09-15 20:06:34 UTC
Ah.  O.K.  Thanks for the update.

Comment 34 Mohua Li 2010-11-18 11:59:14 UTC
Oops, I didn't notice the flags, I just saw the version.

Tested on:
Red Hat Enterprise Virtualization Hypervisor release 5.6 (5.el5)

Steps:
1. Register to RHEVM with the tg3 NIC setting.
2. After the rhev-hypervisor is up, restart the netconsole service about 20 times.

Actual result:
The rhev-hypervisor is up, but a vdsmd crash error still occurred once during this process:

[root@amd-1352-8-1 vdsm]# cat vdsm.log  | grep -i error
MainThread::ERROR::2010-11-18 11:46:01,696::vdsm::93::vds::Traceback (most recent call last):
SSLError: No such file or directory
MainThread::ERROR::2010-11-18 11:46:01,724::vdsm::49::vds::VDSMD crashed immediately - ( < 2 sec) exiting without recovering
MainThread::ERROR::2010-11-18 11:46:01,725::vdsm::93::vds::Traceback (most recent call last):
[root@amd-1352-8-1 vdsm]# date
Thu Nov 18 11:53:35 UTC 2010



So I am setting it to ASSIGNED and, per comments 32 and 33, cloning it to 5.5.2.2.Z.

Comment 35 Alan Pevec 2010-11-18 12:27:50 UTC
The test case is not valid for this issue; the bug was a kernel crash when starting netconsole.

Comment 36 Mohua Li 2010-11-18 13:04:52 UTC
OK. There is no kernel crash after approval on the RHEVM side and the rhev-hypervisor is up, so I am setting this to VERIFIED. The error message will be tracked in BZ 654613.

[root@amd-1352-8-1 vdsm-reg]# uname -a
Linux amd-1352-8-1.englab.nay.redhat.com 2.6.18-230.el5 #1 SMP Thu Oct 28 17:09:10 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux
[root@amd-1352-8-1 vdsm-reg]# rpm -qa | grep vdsm
vdsm22-4.5-63.6.el5

Comment 37 John Brier 2010-12-17 21:09:39 UTC
me too!

I just personally ran into this on one of my own testing systems.

      KERNEL: vmlinux                           
    DUMPFILE: vmcore
        CPUS: 4
        DATE: Fri Dec 17 18:24:58 2010
      UPTIME: 00:03:02
LOAD AVERAGE: 0.62, 0.30, 0.12
       TASKS: 151
    NODENAME: dhcp53-94.gsslab.rdu.redhat.com
     RELEASE: 2.6.18-194.3.1.el5
     VERSION: #1 SMP Sun May 2 04:17:42 EDT 2010
     MACHINE: x86_64  (1861 Mhz)
      MEMORY: 7.9 GB
       PANIC: "Oops: 0000 [1] SMP " (check log for details)
         PID: 7414
     COMMAND: "modprobe"
        TASK: ffff81021b798080  [THREAD_INFO: ffff81021a82c000]
         CPU: 2
       STATE: TASK_RUNNING (PANIC)

netconsole: local port 6666
netconsole: local IP 10.12.53.94
netconsole: interface eth0
netconsole: remote port 25285
netconsole: remote IP 10.12.53.204
netconsole: remote ethernet address 52:54:00:a6:b1:bb
Unable to handle kernel NULL pointer dereference at 0000000000000000 RIP: 
 [<ffffffff881f4df9>] :tg3:tg3_interrupt+0xf/0x10c
PGD 21af68067 PUD 21a8c3067 PMD 0 
Oops: 0000 [1] SMP 
last sysfs file: /devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/timeout
CPU 2 
Modules linked in: netconsole bonding tun lockd sunrpc xt_physdev bridge ipt_REJECT xt_tcpudp xt_state ip_conntrack nfnetlink xt_multiport iptable_filter ip_tables x_tables be2iscsi ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp bnx2i cnic ipv6 xfrm_nalgo crypto_api uio cxgb3i cxgb3 8021q libiscsi_tcp libiscsi2 scsi_transport_iscsi2 scsi_transport_iscsi ksm(U) kvm_intel(U) kvm(U) sg shpchp i2c_core tg3 floppy i5000_edac edac_mc squashfs dm_snapshot ext3 jbd dm_round_robin dm_multipath dm_mod sd_mod ata_piix uhci_hcd ehci_hcd ahci libata ide_cd scsi_dh_rdac scsi_dh loop sr_mod scsi_mod cdrom
Pid: 7414, comm: modprobe Tainted: G      2.6.18-194.3.1.el5 #1
RIP: 0010:[<ffffffff881f4df9>]  [<ffffffff881f4df9>] :tg3:tg3_interrupt+0xf/0x10c
RSP: 0018:ffff81021a82ddf8  EFLAGS: 00010046
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000001
RDX: 0000000000000000 RSI: ffff810238c20000 RDI: 0000000000000072
RBP: ffff810238c20000 R08: 00000000aace85c3 R09: 0000000000000000
R10: 0000000000000097 R11: 0000000000000007 R12: 0000000000000000
R13: ffff81021ac02d80 R14: 0000000000000000 R15: 000000000d31f720
FS:  00002b2c216706e0(0000) GS:ffff810107f22ec0(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 000000021a97d000 CR4: 00000000000026e0
Process modprobe (pid: 7414, threadinfo ffff81021a82c000, task ffff81021b798080)
Stack:  0000000000000001 ffff810238c20500 ffff810238c20000 ffffffff881f8292
 ffff810238c20000 ffffffff88594ae0 ffffffff88594ae0 ffffffff80239b2b
 000000000000002c ffffffff80239f9a 0000000000000019 0000000000000038
Call Trace:
 [<ffffffff881f8292>] :tg3:tg3_poll_controller+0x2c/0x3e
 [<ffffffff80239b2b>] netpoll_poll+0x3a/0x365
 [<ffffffff80239f9a>] find_skb+0x41/0xec
 [<ffffffff80239f3f>] netpoll_send_skb+0xe9/0x103
 [<ffffffff885940d0>] :netconsole:write_msg+0x40/0x58
 [<ffffffff80091cf9>] __call_console_drivers+0x5b/0x69
 [<ffffffff800171dd>] release_console_sem+0x149/0x20e
 [<ffffffff8859407c>] :netconsole:init_netconsole+0x5f/0x73
 [<ffffffff800a6efa>] sys_init_module+0xaf/0x1f2
 [<ffffffff8005d28d>] tracesys+0xd5/0xe0


Code: 41 f6 04 24 01 75 20 f6 83 ab 0a 00 00 40 0f 85 db 00 00 00 
RIP  [<ffffffff881f4df9>] :tg3:tg3_interrupt+0xf/0x10c
 RSP <ffff81021a82ddf8>


PID: 7414   TASK: ffff81021b798080  CPU: 2   COMMAND: "modprobe"
 #0 [ffff81021a82db50] crash_kexec at ffffffff800ada85
 #1 [ffff81021a82dc10] __die at ffffffff80065157
 #2 [ffff81021a82dc50] do_page_fault at ffffffff80066dd7
 #3 [ffff81021a82dd40] error_exit at ffffffff8005dde9
    [exception RIP: tg3_interrupt+15]
    RIP: ffffffff881f4df9  RSP: ffff81021a82ddf8  RFLAGS: 00010046
    RAX: 0000000000000000  RBX: 0000000000000000  RCX: 0000000000000001
    RDX: 0000000000000000  RSI: ffff810238c20000  RDI: 0000000000000072
    RBP: ffff810238c20000   R8: 00000000aace85c3   R9: 0000000000000000
    R10: 0000000000000097  R11: 0000000000000007  R12: 0000000000000000
    R13: ffff81021ac02d80  R14: 0000000000000000  R15: 000000000d31f720
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #4 [ffff81021a82de10] tg3_poll_controller at ffffffff881f8292
 #5 [ffff81021a82de30] netpoll_poll at ffffffff80239b2b
 #6 [ffff81021a82de90] netpoll_send_skb at ffffffff80239f3f
 #7 [ffff81021a82dec0] write_msg at ffffffff885940d0
 #8 [ffff81021a82def0] __call_console_drivers at ffffffff80091cf9
 #9 [ffff81021a82df10] release_console_sem at ffffffff800171dd
#10 [ffff81021a82df40] init_module at ffffffff8859407c
#11 [ffff81021a82df50] sys_init_module at ffffffff800a6efa
#12 [ffff81021a82df80] tracesys at ffffffff8005d28d (via system_call)

Comment 38 Alan Pevec 2010-12-17 22:19:04 UTC
John, could you retest with the latest 5.6 kernel? Thanks.

Comment 40 errata-xmlrpc 2011-01-13 20:36:12 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0017.html

