Red Hat Bugzilla – Bug 483445
Packets Loss with Netdump
Last modified: 2009-02-01 14:09:23 EST
Description of problem:
This bug tracks an additional issue seen with the fix for
Bug 477945 - Kernel Panic with Bnx2 - Badness in local_bh_enable at kernel/softirq.c:141
I consistently see packet loss while running "echo t >/proc/sysrq-trigger" in a loop.
From the affected machine's serial console,
# while :; do echo t >/proc/sysrq-trigger; done
From another host,
$ ping hp-dl785g5-01.rhts.bos.redhat.com
I see a lot of packet loss here.
It likely happens on machines using the bnx2 driver.
Version-Release number of selected component (if applicable):
kernel-2.6.9-78.23.EL + patch from,
Steps to Reproduce:
1. Reserve one of the affected machines.
2. while :; do echo t >/proc/sysrq-trigger; done
3. From another host,
$ ping <the affected machine>
Expected results: no packet loss.
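To put a number on the loss instead of eyeballing the ping output, the ping in step 3 could be bounded with -c and its summary line parsed. This is only a sketch: loss_percent is a hypothetical helper name, and it assumes the iputils-style summary line ("... N% packet loss ...").

```shell
# Hypothetical helper: read ping output on stdin and print the
# packet-loss percentage from the summary line.
# Assumes the iputils summary format, e.g.
#   "100 packets transmitted, 97 received, 3% packet loss, time 99123ms"
loss_percent() {
    sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}

# Usage, with the sysrq-t loop running on the affected machine:
#   ping -c 100 <the affected machine> | loss_percent
```

Running it once with the loop active and once without gives two figures that can be compared directly.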
This isn't a bug; you're exercising the pessimal case of netpoll. In the prior bug that you mention, we found a problem wherein shared data was accessed from multiple contexts, causing a panic. The fix for that was to enforce the needed mutual exclusion between those contexts. Since one of the contexts was the nominal receive fast path (net_rx_action), netpoll now (correctly) blocks receive operations while calling the poll_controller/poll methods of a driver. Doing this puts us at risk of frame loss. By sending multiple sysrq-t's, you effectively create multiple windows of time in which we can't receive frames, leading to overflow and frame drops. This is working as it should.
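The overflow argument can be illustrated with a toy model. The sketch below is not netpoll or bnx2 code. It assumes a hypothetical fixed-size receive ring, one frame arriving per tick, and a receive path that empties the ring on every tick except during a repeating blocked window. The function name and all parameters are made up for illustration.

```shell
# Toy model of a receive ring with periodic blocked windows.
# All names and numbers are illustrative assumptions, not driver values.
simulate_drops() {
    ring_size=$1   # capacity of the hypothetical RX ring, in frames
    ticks=$2       # total number of simulated ticks
    block_len=$3   # ticks at the start of each period with RX blocked
    period=$4      # length of one period, in ticks
    queued=0
    dropped=0
    t=0
    while [ "$t" -lt "$ticks" ]; do
        # One frame arrives per tick; it is dropped if the ring is full.
        if [ "$queued" -lt "$ring_size" ]; then
            queued=$((queued + 1))
        else
            dropped=$((dropped + 1))
        fi
        # Outside the blocked window the receive path empties the ring.
        if [ $((t % period)) -ge "$block_len" ]; then
            queued=0
        fi
        t=$((t + 1))
    done
    echo "$dropped"
}

# Example: simulate_drops 4 100 8 10
```

In this model no frames are dropped as long as the ring can hold everything that arrives during one blocked window. Once the window outlasts the ring, every further arrival in that window is lost, and repeating the window (as the sysrq-t loop does) repeats the loss.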