Bug 461014

Summary: netdump fails when bnx2 has remote copper PHY - Badness in local_bh_enable at kernel/softirq.c:141
Product: Red Hat Enterprise Linux 4
Component: kernel
Version: 4.7
Hardware: All
OS: Linux
Status: CLOSED ERRATA
Severity: urgent
Priority: urgent
Reporter: Flavio Leitner <fleitner>
Assignee: Neil Horman <nhorman>
QA Contact: Martin Jenner <mjenner>
CC: agospoda, akarlsso, dhoward, fluo, lmcilroy, ltroan, qcai, tao, vmayatsk
Target Milestone: rc
Keywords: ZStream
Doc Type: Bug Fix
Last Closed: 2009-05-18 19:10:14 UTC
Bug Blocks: 466113
Attachments:
  - Patch fixing spin lock
  - patch to disable softirqs in netpoll
  - log from patch id=319285
  - traffic dump of the netdump session
  - patch to disable softirqs entirely during netdump operation
  - patch to disable softirqs during netdump, disable hard irqs sooner and suppress superfluous warning messages

Description Flavio Leitner 2008-09-03 14:27:44 UTC
Created attachment 315645 [details]
Patch fixing spin lock

Description of problem:

During the crash dump local interrupts are disabled, and the bnx2 driver
tries to read a register using the following helper:
bnx2_reg_rd_ind(struct bnx2 *bp, u32 offset)
{
        u32 val;

        spin_lock_bh(&bp->indirect_lock);
        REG_WR(bp, BNX2_PCICFG_REG_WINDOW_ADDRESS, offset);
        val = REG_RD(bp, BNX2_PCICFG_REG_WINDOW);
===>    spin_unlock_bh(&bp->indirect_lock);
        return val;
}

but spin_unlock_bh() re-enables bottom halves and can trigger preemption,
so local_bh_enable() prints a warning when it is called with interrupts
disabled.
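
For reference, local_bh_enable() in this kernel looks roughly like the
following (paraphrased, not the verbatim RHEL4 source); the WARN_ON() is
the "Badness in local_bh_enable" line, and the do_softirq() call is what
matters later in this bug:

void local_bh_enable(void)
{
        WARN_ON(irqs_disabled());       /* <-- the "Badness" warning */

        /* keep preemption disabled until softirq processing is done */
        sub_preempt_count(SOFTIRQ_OFFSET - 1);

        /* if we are no longer nested and softirqs are pending, run them now */
        if (unlikely(!in_interrupt() && local_softirq_pending()))
                do_softirq();

        dec_preempt_count();
        preempt_check_resched();        /* may end up in preempt_schedule() */
}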

This should happen only on bnx2 boards with a remote copper PHY that have
triggered the STATUS_ATTN_BITS_TIMER_ABORT event.  The code path is:

bnx2_phy_int(struct bnx2 *bp)
{
...
        if (bnx2_phy_event_is_set(bp, STATUS_ATTN_BITS_TIMER_ABORT))
=====>       bnx2_set_remote_link(bp);
...

and 
bnx2_set_remote_link(struct bnx2 *bp)
{
...
=====>  evt_code = REG_RD_IND(bp, bp->shmem_base + BNX2_FW_EVT_CODE_MB); 
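
The attached patch is not reproduced inline here.  Purely as an illustration
of the kind of change involved (an assumption about the approach, not the
actual content of the attachment), the indirect register helper could use the
irq-saving lock variants so the unlock never touches the softirq machinery:

static u32 bnx2_reg_rd_ind(struct bnx2 *bp, u32 offset)
{
        unsigned long flags;
        u32 val;

        /* irq-saving variant: safe whether or not the caller already has
         * local interrupts disabled, as the netdump path does */
        spin_lock_irqsave(&bp->indirect_lock, flags);
        REG_WR(bp, BNX2_PCICFG_REG_WINDOW_ADDRESS, offset);
        val = REG_RD(bp, BNX2_PCICFG_REG_WINDOW);
        spin_unlock_irqrestore(&bp->indirect_lock, flags);
        return val;
}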



Here is the console output reproducing the problem:

[root@ ~]# echo c > /proc/sysrq-trigger
< .... client hangs after triggering the dump here .... >
Here is the serial console output for the client (x3755):
SysRq : Crashing the kernel by request
Unable to handle kernel NULL pointer dereference at 0000000000000000 RIP:
<ffffffff8023e104>{sysrq_handle_crash+0}
PML4 25405c067 PGD 25405a067 PMD 0
Oops: 0002 [1] SMP
CPU 1
Modules linked in: netconsole netdump nfsd exportfs lockd nfs_acl parport_pc lp
parport autofs4 i2c_dev i2c_core sunrpc ipmi_devintf ipmi_si ipmi_msghandler ds
yenta_socket pcmcia_core cpufreq_powersave ib_srp ib_sdp ib_ipoib md5 ipv6
rdma_ucm rdma_cm iw_cm ib_addr ib_umad ib_ucm ib_uverbs ib_cm ib_sa ib_mad
ib_core zlib_deflate dm_mirror dm_multipath dm_mod joydev button battery ac
ohci_hcd ehci_hcd k8_edac edac_mc e1000 bnx2 ext3 jbd qla2400 aacraid qla2xxx
scsi_transport_fc sd_mod scsi_mod
Pid: 26687, comm: bash Not tainted 2.6.9-70.ELsmp
RIP: 0010:[<ffffffff8023e104>] <ffffffff8023e104>{sysrq_handle_crash+0}
RSP: 0018:00000100bc5ffeb0  EFLAGS: 00010012
RAX: 000000000000001f RBX: ffffffff80413380 RCX: ffffffff803f59a8
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000063
RBP: 0000000000000063 R08: ffffffff803f59a8 R09: ffffffff80413380
R10: 0000000100000000 R11: ffffffff8011f5fc R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000006 R15: 0000000000000246
FS:  0000002a9557a3e0(0000) GS:ffffffff8050c500(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 00000000dffc0000 CR4: 00000000000006e0
Process bash (pid: 26687, threadinfo 00000100bc5fe000, task 000001015d43c030)
Stack: ffffffff8023e2c7 0000000000000000 00000100bc5fe000 0000000000000002
00000100bc5fff50 0000000000000002 0000002a983fb000 0000000000000000
ffffffff801b391d 0000000000000246
Call Trace:<ffffffff8023e2c7>{__handle_sysrq+115}
<ffffffff801b391d>{write_sysrq_trigger+43}
<ffffffff8017bc46>{vfs_write+207} <ffffffff8017bd2e>{sys_write+69}
<ffffffff801102b6>{system_call+126}

Code: c6 04 25 00 00 00 00 00 c3 e9 78 ef f3 ff e9 01 3e f4 ff 48
RIP <ffffffff8023e104>{sysrq_handle_crash+0} RSP <00000100bc5ffeb0>
CR2: 0000000000000000
CPU#0 is frozen.
CPU#1 is executing netdump.
CPU#2 is frozen.
CPU#3 is frozen.
CPU#4 is frozen.
CPU#5 is frozen.
CPU#6 is frozen.
CPU#7 is frozen.
< netdump activated - performing handshake with the server. >
Badness in local_bh_enable at kernel/softirq.c:141

Call Trace:<ffffffff8013d54d>{local_bh_enable+70}
<ffffffffa00e7032>{:bnx2:bnx2_reg_rd_ind+50}
<ffffffffa00e9739>{:bnx2:bnx2_poll+173} <ffffffff801f0007>{vsnprintf+1406}
<ffffffff802c89ac>{netpoll_poll_dev+223}
<ffffffff802c889a>{netpoll_send_skb+340}
<ffffffffa02f84f2>{:netdump:netpoll_netdump+308}
<ffffffff8011f5fc>{flat_send_IPI_mask+0}
<ffffffff8023e104>{sysrq_handle_crash+0}
<ffffffffa02f839a>{:netdump:netpoll_start_netdump+221}


How reproducible:
Always

Additional info:
Attaching a patch that fixes the lock; it has been tested with good feedback.

Comment 4 Larry Troan 2008-09-03 20:31:30 UTC
Per comment #1, a test package was built on an internal server on 9/03 -- it is not available yet for external consumption. 

So it appears this will be in 4.8 and is being considered for the 4.7.z stream. 

Can we get the bug updated to ASSIGNED status?  It's still in NEW state.

Comment 5 Neil Horman 2008-09-15 16:46:28 UTC
Everybody slow down.  

Larry, I'm not sure why you think this will be in 4.8; I've not seen anything go up to rhkl about it.  The only build of this on our side is Flavio's brew build.  That doesn't mean this is going into 4.8, nor does it imply a backport to 4.7.z.

As for the technical merit of the patch, I guess it doesn't hurt anything, but I don't really see a need for it either.  Yes, we get a badness dump, but that's just a WARN_ON, not a huge deal.  It's warning us that we're calling an unlock function that can reschedule us from a context that isn't able to reschedule.  That rescheduling happens in preempt_schedule, and in that function we immediately check irqs_disabled and return if that's true.  Since we got the WARN_ON in local_bh_enable, we should never actually schedule, so practically speaking it's safe.

Is there actually a problem here?  Are we not capturing core dumps?  Or are we just seeing the warning on the console?  If it's the latter, then we can safely ignore this bug.  If it's the former, then let's figure out why we're hanging before we do anything else.
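
For reference, the check I'm referring to looks roughly like this
(paraphrased from the 2.6.x scheduler, not a verbatim quote):

asmlinkage void __sched preempt_schedule(void)
{
        struct thread_info *ti = current_thread_info();

        /* if preemption is disabled or hard irqs are off, do not
         * reschedule -- just return to the caller */
        if (unlikely(ti->preempt_count || irqs_disabled()))
                return;

        /* ... otherwise actually reschedule ... */
}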

Comment 6 Flavio Leitner 2008-09-15 17:58:07 UTC
I agree that the WARN_ON message can sometimes be ignored, but it usually
indicates a real problem, and in this case we can't capture core dumps:
after the WARN_ON messages the system hangs.  With the proposed patch
applied (which avoids invoking softirqs), we were able to get a complete
core dump as expected.

Flavio

Comment 7 Neil Horman 2008-09-15 18:39:31 UTC
"The problem goes away" isn't a good enough reason to take a patch.  It says nothing about our understanding of the problem.  We may have a timing issue elsewhere in the driver.  Do you have a serial console on this box?  If we were panicking because we were scheduling in an interrupt, I would expect to see a "bad: scheduling while atomic" message when we called schedule, along with the requisite stack dump.  If we do, then I'm worried about how we got into a state in which we tried to preempt the kernel while in interrupt context (which is what we should fix).  If we don't see that message, then we have likely deadlocked somewhere else, and we should use sysrq-t (if the system is still responsive to it) or the NMI watchdog to determine where we are deadlocking and solve that problem.

I'm not trying to assert that your patch is particularly bad.  On the contrary, it's probably a fine change to make.  But I can't take a patch that just makes the problem go away without knowing how it fixes the problem in the first place.

Can you please try to attach a serial console to the box and see if we get that secondary panic that I mentioned above?  Thanks!

Comment 8 RHEL Program Management 2008-09-18 13:30:08 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 13 Issue Tracker 2008-10-01 21:16:24 UTC
I'm attaching the serial console log from another system reproducing it.

If you don't do anything, the console stays stuck for more than 5 minutes.
If you then send sysrq+t, nothing happens (you can see the telnet "send brk"
command line) until the NMI watchdog kicks in and prints the second and
third backtraces at once.

The system is ibm-hs21-8853-1.gsslab.rdu.redhat.com and it is currently
reserved for fleite/olive, so check with him before doing anything there.

Flavio


This event sent from IssueTracker by fleitner 
 issue 214359
it_file 160919

Comment 14 Andy Gospodarek 2008-10-02 11:05:49 UTC
My initial thoughts were still that this is a recursive call to netpoll (first netdump, then netconsole) and that's why you are stuck.  My first look at the new capture from comment #13 seems to indicate that is the case as well, but I'll look at it more closely to be sure.

Comment 15 Neil Horman 2008-10-02 12:52:13 UTC
It's interesting.  I agree with Andy, this rather smells like we're taking a recursive spinlock of some sort, but given the traces from the NMI watchdog, it appears that we're getting stuck on a recursive acquisition of sysrq_key_table_lock in __handle_sysrq, which I'm guessing arose from the fact that you hit sysrq-t five minutes after you started the dump process.  That would in turn suggest that prior to the issuance of the sysrq-t, the system was in fact _not_ deadlocked (I assume the NMI never triggers if the system is just left to sit on its own?).  If that's the case, it would seem that our CPUs are either:

a) spinning, doing no useful work somewhere, or
b) trying to do useful work, but not accomplishing anything.

Have we tried to capture a tcpdump of the netdump happening from the netdump server?  Given the trace above, I'm starting to wonder if we're sending out some netdump frames but never managing to receive anything on the dumping client.  Flavio, can you grab a tcpdump from the netdump server while this system is dumping?  It would be nice to see if we get any frames from the netdump client, and if so, when we stop getting them.  Thanks!

Comment 16 Issue Tracker 2008-10-02 14:08:42 UTC
This is the messages.log in the other end (10.10.56.62)

Oct  2 14:13:44 bl40p-1 kernel: device eth0 entered promiscuous mode
Oct  2 14:13:47 bl40p-1 sshd(pam_unix)[13566]: session opened for user
netdump by (uid=0)
Oct  2 14:13:48 bl40p-1 sshd(pam_unix)[13566]: session closed for user
netdump
Oct  2 14:14:46 bl40p-1 netdump[6266]: Got too many timeouts in
handshaking, ignoring client 10.10.56.163
Oct  2 14:14:49 bl40p-1 netdump[6266]: Got too many timeouts waiting for
SHOW_STATUS for client 10.10.56.163, rebooting it
Oct  2 14:16:53 bl40p-1 kernel: device eth0 left promiscuous mode
Oct  2 14:17:04 bl40p-1 sshd(pam_unix)[13570]: session opened for user
root by (uid=0)
Oct  2 14:17:16 bl40p-1 sshd(pam_unix)[13570]: session closed for user
root

the traffic is attached.

netdump server: 10.10.56.62
crash system: 10.10.56.163

Flavio


This event sent from IssueTracker by fleitner 
 issue 214359
it_file 161029

Comment 17 Neil Horman 2008-10-02 18:09:02 UTC
Thank you. 

So, I'm looking at your tcpdump and a few things stand out to me:

1) Frames 111, 197, 198, 199 & 200 show the startup and usage of netconsole correctly.  The source port is 6664 (which I assume you specified with the LOCALPORT option), and everything seems to work well.

2) Frame 228: we seem to have the start of a netdump, except that the contained data looks all wrong.  We send from local port 6666 instead of 6664 (which is correct for netdump), but the payload looks like log data rather than the netdump reply header from the client with the REPLY_START_NETDUMP command in it.

3) Frames 229, 230, etc.: the netdump server begins acting as though it _has_ received a REPLY_START_NETDUMP command, since it sends back a COMM_START_NETDUMP_ACK message, which times out, and it then begins to send COMM_HELLO messages every timeout period thereafter.

So from this I think we can conclude that we at least got a netdump start message from the bnx2-based system, but the tcpdump on the netdump server never saw it (even though the netdump server itself did).  It would be nice if we could figure out why that happened.  Can you check the message log on the netdump server to see if any odd messages about clients appeared?  Is it possible that a firewall on the client or server is dropping frames to/from port 6666 or some such?  It seems really odd to me that we have a properly working netconsole, but the netdump startup message gets oddly dropped in the tcpdump.  Also, what version of the netdump-server are you using?  I'm looking at 0.7.16 sources and I couldn't find any server version info in the BZ or the IT.

Comment 18 Flavio Leitner 2008-10-02 18:56:36 UTC
netdump setup on client side:
# grep -v  '^#' /etc/sysconfig/netdump
NETDUMPADDR=10.10.56.62

The netdump server messages log is available in my previous comment#16

# rpm -q netdump-server
netdump-server-0.7.16-14

Flavio

Comment 19 Flavio Leitner 2008-10-02 18:58:17 UTC
netpoll_netdump()
...
   netpoll_reset_locks(&np); <--- reset poll_lock
   netdump_startup_handshake()
     send_netdump_msg()
        netpoll_send_udp()
           netpoll_send_skb()
              netpoll_poll_dev()
                 poll_napi()
                    spin_trylock(&npinfo->poll_lock)) <--- hold poll_lock
                    dev->poll() => bnx2:bnx2_poll()
                    ...
                    :bnx2:bnx2_reg_rd_ind() <--- enable softirqs
                       do_softirq()
                         net_rx_action()
                            local_irq_enable()  <-- watchdog still works
                            have = netpoll_poll_lock(dev);
                              spin_lock(&ndw->npinfo->poll_lock); <-- deadlock

The watchdog still works because local interrupts are enabled at this
point, but the CPU is stuck there.

Triggering sysrq+t  it does:

__handle_sysrq()
  spin_lock_irqsave(&sysrq_key_table_lock, flags); <--- deadlock here

because we had already used sysrq to start the crash, so sysrq_key_table_lock
is still held; the spinning spin_lock_irqsave() keeps interrupts disabled, so
it is the NMI watchdog that comes in and shows the backtrace we are seeing.

My 0.02.
Flavio

Comment 20 Neil Horman 2008-10-02 19:30:44 UTC
Created attachment 319285 [details]
patch to disable softirqs in netpoll

Yup, that looks like it.  Nicely done.  There are quite a few drivers that use spin_lock_bh in their ->poll paths; it's interesting that we've not seen this happen before.  Regardless, it seems like the appropriate fix is to disable softirqs in poll_napi, so that we are guaranteed not to have net_rx_action run on that CPU while the poll is taking place.  Please test out the attached patch and confirm that it solves the problem.  Thanks!
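
For reference, the idea is roughly the following sketch (paraphrased from the
upstream 2.6.x poll_napi(); the exact RHEL4 function body and the attached
patch may differ).  Holding an extra softirq-disable nesting level around the
poll means a spin_unlock_bh() inside the driver's ->poll() can no longer drop
the softirq count to zero and run net_rx_action() while poll_lock is held on
this CPU:

static void poll_napi(struct net_device *dev)
{
        struct netpoll_info *npinfo = dev->npinfo;
        int budget = 16;

        local_bh_disable();     /* assumed addition: mask softirqs while polling */
        if (test_bit(__LINK_STATE_RX_SCHED, &dev->state) &&
            npinfo->poll_owner != smp_processor_id() &&
            spin_trylock(&npinfo->poll_lock)) {
                /* any spin_unlock_bh() inside ->poll() now only drops one
                 * nesting level and cannot invoke do_softirq() here */
                dev->poll(dev, &budget);
                spin_unlock(&npinfo->poll_lock);
        }
        local_bh_enable();      /* poll_lock released; softirqs may run again */
}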

Comment 21 Flavio Leitner 2008-10-02 20:22:32 UTC
Created attachment 319290 [details]
log from patch id=319285

The badness still happens because interrupts have been off since
netpoll_start_netdump(); it then gets into an endless loop.

attaching the serial console log.

Flavio

Comment 22 Neil Horman 2008-10-02 20:33:57 UTC
I'm not worried about the badness warning yet.  You say it loops endlessly, but it appears that it's sending netdump packets (which is why you keep getting the warnings).  Are you getting a core on the server?

Comment 23 Flavio Leitner 2008-10-02 21:17:34 UTC
Created attachment 319294 [details]
traffic dump of the netdump session


I let it run for a longer time, and then the netdump server resets the client.
See the messages log on the server side:

Oct  2 21:17:24 bl40p-1 netdump[6266]: Got too many timeouts waiting for memory page for client 10.10.56.163, ignoring it
Oct  2 21:17:27 bl40p-1 netdump[6266]: Got too many timeouts waiting for SHOW_STATUS for client 10.10.56.163, rebooting it
Oct  2 21:17:27 bl40p-1 netdump[6266]: Got unexpected packet type 3 from ip 10.10.56.163
Oct  2 21:17:35 bl40p-1 last message repeated 123 times
Oct  2 21:17:37 bl40p-1 netdump[6266]: Got unexpected packet type 12 from ip 10.10.56.163

# pwd
/var/crash
# ls -la 10.10.56.163-2008-10-02-21\:16/
 total 32
 drwx------  2 netdump netdump 4096 Oct  2 21:16 .
 drwxr-xr-x  6 netdump netdump 4096 Oct  2 21:16 ..
 -rw-------  1 netdump netdump  114 Oct  2 21:16 log
 -rw-------  1 netdump netdump 4096 Oct  2 21:16 vmcore-incomplete
# file 10.10.56.163-2008-10-02-21\:16/vmcore-incomplete
vmcore-incomplete: ELF 64-bit LSB core file AMD x86-64, version 1 (SYSV), SVR4-style, from 'nux'


and on client side:
<snip>
Badness in local_bh_enable at kernel/softirq.c:141

Call Trace:<ffffffff8013d659>{local_bh_enable+70} <ffffffff802c9783>{netpoll_poll_dev+242}
       <ffffffff802c965e>{netpoll_send_skb+340} <ffffffffa02185ac>{:netdump:netpoll_netdump+494}
       <ffffffff8023eaac>{sysrq_handle_crash+0} <ffffffff8023eaac>{sysrq_handle_crash+0}
       <ffffffffa021839a>{:netdump:netpoll_start_netdump+221}

netdump: rebooting in 3 seconds.                                                               
< snip, here the client reboots>

attaching the traffic dump.

Flavio

Comment 25 Neil Horman 2008-10-03 18:16:10 UTC
Yeah, the upstream version added several spin_lock_bh calls; that's why it worked before.

So I'm looking at this code, and while I understand in the general case why we don't disable bottom halves for netpoll, I really don't see why we don't just disable them in their entirety for netdump specifically.  We don't have any need for bottom halves while executing a netdump, any more than we need hard interrupts.

I'm attaching a second patch for you to try that should simplify all of this greatly.

Comment 26 Neil Horman 2008-10-03 18:17:52 UTC
Created attachment 319400 [details]
patch to disable softirqs entirely during netdump operation

Comment 27 Andy Gospodarek 2008-10-03 18:36:01 UTC
I'm not sure that patch is perfect -- I see the 'badness' errors on rhel4 using netconsole too.  I don't see them with basically the same driver on rhel5, so rhel5 must have an _irqsave (or something similar) where rhel4 has a _bh in the netpoll and netdump paths.

Comment 28 Flavio Leitner 2008-10-03 18:48:26 UTC
Neil,

I think the problem isn't with softirqs anymore.  The traffic dump shows some
traffic going on between client and server, but the server still reports timeout errors.

The fact that my initial patch worked also suggests that your patch should
have done the same, yet the badness messages keep coming.

It seems to me that bnx2_poll() is generating so many badness warning
messages that printing them steals time from the real work, causing the
timeout errors on the server side.  Following this idea I did one more run,
this time without the serial console, and it looks better:

[root@bl40p-1 10.10.56.163-2008-10-02-23:53]# ls -la
total 185408
drwx------  2 netdump netdump       4096 Oct  3 06:04 .
drwxr-xr-x  7 netdump netdump       4096 Oct  2 23:53 ..
-rw-------  1 netdump netdump         41 Oct  3 06:04 log
-rw-------  1 netdump netdump 1073467392 Oct  3 06:04 vmcore

Flavio

Comment 29 Neil Horman 2008-10-03 19:04:45 UTC
In response to comment 27:
The _irqsave is in write_msg, which exists in both RHEL4 and RHEL5.  Everything works fine in netconsole; it's netdump that's the problem.  We can't move that irqsave to poll_napi, because we don't want to create extra latency in the fast, nominal receive path.  That's why I wanted to disable softirqs for all of netdump's operation.  That way we wouldn't ever wind up in net_rx_action while holding the poll lock on the same CPU.

In response to comment #28, you say that you're still getting badness errors with my patch in comment 26?  You absolutely shouldn't be, unless there is an unbalanced local_bh_enable somewhere.  Do you have the log from the most recent kernel, which tested my patch from comment 26?  It seems, however, that despite the timeouts and badness messages you got a complete vmcore -- is that correct?

Comment 30 Andy Gospodarek 2008-10-03 19:17:44 UTC
(In reply to comment #29)
> In response to comment 27:
> The _irqsave is in write_msg, which exists in both RHEL4 and RHEL5.  Everything
> works fine in netconsole, its netdump thats the problem.  We can't move that
> irqsave to poll_napi, because we don't want to create extra latency in the
> fast, nominal receive path.    Thats why I wanted to disable softirqs for all
> of netdumps operation.  That way we wouldn't ever wind up in net_rx_action
> while holding the poll lock on the same cpu.
> 

Netconsole over a bonding interface (when that bonding interface contains a bnx2-based card) spews this message on rhel4.

Comment 31 Neil Horman 2008-10-03 20:18:46 UTC
Actually, that's a good point.  local_bh_enable issues a WARN_ON if local irqs are disabled.  write_msg issues a local_save_flags, but never actually disables irqs.  So we shouldn't ever get this badness message when netconsole is running.  That's in keeping with what we're seeing in this bug (netconsole works fine, but netdump doesn't).  If you see this message spewed with every netconsole packet sent over a bnx2 card in rhel4 from the bonding interface, then something is disabling interrupts somewhere and not re-enabling them properly.  I don't see anything that disables them from write_msg down through the bonding netpoll xmit routine or in the bnx2 xmit routine.

Looking more closely, I think I see the problem.  The patch I gave you is doing its job properly and is keeping softirqs from running, but the WARN_ON is checked unconditionally when we re-enable softirqs, so we still get the message spew.  What I don't understand is why they keep coming.  Once we enter crashdump_mode, netconsole should suppress all messages from a WARN_ON, but we continue to get them (or are you seeing these only on the serial console)?

Regardless, the other thing that's bothering me here is the frequency at which we get these messages.  The only path I can see in bnx2 that calls spin_unlock_bh is through bnx2_phy_int.  Why are we getting so many phy events when we take a netdump?  Is there something actually happening on the phy when we go into netdump that makes us query it on every poll (i.e., do we check the phy on every napi poll when operating normally as well)?  Or is something out of sync in the driver that inadvertently drops us into this phy-checking clause when we trigger a netdump?

Comment 32 Neil Horman 2008-10-03 20:28:49 UTC
I wonder if the best thing to do here isn't to add a condition to the WARN_ON, like this:

WARN_ON(!softirq_count()  && irqs_disabled())

That way we would only get the warning printed in the event that irqs were disabled when we _actually_ re-enabled softirqs, rather than unilaterally.

Flavio, can you add that to the patch and retest?

Comment 33 Flavio Leitner 2008-10-03 20:32:36 UTC
Re #29: my previous comments are about the results in comment #23.  I was able
to get a vmcore by removing the serial console.  I'll try the latest patch ASAP.

Re #30:
- the messages are indeed on the serial console and on the tty consoles.
- bnx2_phy_int() is frequently called by poll_napi().
- it's probably a heartbeat timer expiring.

Flavio

Comment 34 Flavio Leitner 2008-10-03 20:44:38 UTC
Neil, 

--- linux-2.6.9/drivers/net/netdump.c.orig	2008-10-03 14:14:33.000000000 -0400
+++ linux-2.6.9/drivers/net/netdump.c	2008-10-03 14:15:09.000000000 -0400
@@ -401,6 +401,7 @@ static asmlinkage void netpoll_netdump(s
 
 	while (netdump_mode) {
 		local_irq_disable();
+		local_bh_disable();
 		Dprintk("main netdump loop: polling controller ...\n");
 		netpoll_poll(&np);
 
This patch won't work, because we call netdump_startup_handshake() before this
point and it will hang there just as before.

I would suggest trying the patch in
https://bugzilla.redhat.com/attachment.cgi?id=319285 plus the change from
comment #32, but I'm not sure that will fix Andy's issue.

Flavio

Comment 35 Andy Gospodarek 2008-10-03 21:45:30 UTC
Mine is a printk issue:

asmlinkage int printk(const char *fmt, ...)
{
        va_list args;
        int r;

        va_start(args, fmt);
        r = vprintk(fmt, args);
        va_end(args);

        return r;
}

asmlinkage int vprintk(const char *fmt, va_list args)
{
        unsigned long flags;
        int printed_len;
        char *p;
        static char printk_buf[1024];
        static int log_level_unknown = 1;

        if (unlikely(oops_in_progress))
                zap_locks();

        /* This stops the holder of console_sem just where we want him */
---->   spin_lock_irqsave(&logbuf_lock, flags);


Since printk is always in the stack in my traces, that's the problem.  Looks like we need two different solutions.

Comment 40 Neil Horman 2008-10-07 19:22:39 UTC
Created attachment 319686 [details]
patch to disable softirqs during netdump, disable hard irqs sooner and suppress superfluous warning messages

This updated patch works quite well for me on the IBM blade in question.  I've captured several dumps successfully with it.

Comment 42 Vivek Goyal 2008-10-15 21:22:16 UTC
Committed in 78.14.EL . RPMS are available at http://people.redhat.com/vgoyal/rhel4/

Comment 46 Luo Fei 2009-04-21 02:17:46 UTC
I tested 2.6.9-78.ELsmp (has the problem) and 2.6.9-88.ELsmp (does not have the problem).

[root@dell-per805-01 ~]# uname -a
Linux dell-per805-01.rhts.bos.redhat.com 2.6.9-78.ELsmp #1 SMP Wed Jul 9 15:46:26 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux
[root@dell-per805-01 ~]# echo c > /proc/sysrq-trigger 
SysRq : Crashing the kernel by request
...
< netdump activated - performing handshake with the server. >
Badness in local_bh_enable at kernel/softirq.c:141

Call Trace:<ffffffff8013d595>{local_bh_enable+70} <ffffffffa00f2032>{:bnx2:bnx2_reg_rd_ind+50} 
       <ffffffffa00f473a>{:bnx2:bnx2_poll+173} <ffffffff801f016b>{vsnprintf+1406} 
       <ffffffff802c902c>{netpoll_poll_dev+223} <ffffffff802c8f1a>{netpoll_send_skb+340} 
       <ffffffffa01984f2>{:netdump:netpoll_netdump+308} <ffffffff8011f63c>{flat_send_IPI_mask+0} 
       <ffffffff8023e49c>{sysrq_handle_crash+0} <ffffffffa019839a>{:netdump:netpoll_start_netdump+221} 
      
[root@dell-per805-01 ~]# uname -a
Linux dell-per805-01.rhts.bos.redhat.com 2.6.9-88.ELsmp #1 SMP Mon Apr 13 19:23:31 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux     
[root@dell-per805-01 ~]# echo c > /proc/sysrq-trigger 
SysRq : Crashing the kernel by request
...
< netdump activated - performing handshake with the server. >
NETDUMP START!
< handshake completed - listening for dump requests. >
...

The test results are the same on two different machines: dell-per805-01.rhts.bos.redhat.com and hp-dl585g2-01.rhts.bos.redhat.com.

Comment 48 errata-xmlrpc 2009-05-18 19:10:14 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1024.html

Comment 50 Lachlan McIlroy 2009-06-15 03:37:01 UTC
Adding issue 304632.