Bug 143053

Summary: tg3 driver not throttling interrupts
Product: Red Hat Enterprise Linux 3
Reporter: Dag Wieers <dag>
Component: kernel
Assignee: John W. Linville <linville>
Status: CLOSED NOTABUG
QA Contact: Brian Brock <bbrock>
Severity: medium
Priority: medium
Version: 3.0
CC: petrides, riel
Target Milestone: ---
Keywords: FutureFeature
Target Release: ---
Hardware: All
OS: Linux
Doc Type: Enhancement
Last Closed: 2005-01-19 15:40:08 UTC

Description Dag Wieers 2004-12-16 00:11:30 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3)
Gecko/20041020 Galeon/1.3.18

Description of problem:
We've got a situation where 2 GigE adapters are causing 44k interrupts
per second with the tg3 driver. I would have expected Interrupt
Coalescence to reduce this number drastically, as I fear the number of
interrupts is causing poorer performance and limiting the available
network bandwidth.

Looking at the code I can't find anything that would allow for
throttling interrupts.

Here's the dstat output (I hope it fits):

[root@rtlbrunews01 root]# dstat -cydni -D hires -N bond0,eth2,eth3 -I 32,36,53,68 10
----total-cpu-usage---- ---system-- -disk/hires -net/bond0----net/eth2----net/eth3- -------interrupts------
usr sys idl wai hiq siq|_int_ _csw_|_read write|_recv _send _recv _send _recv _send|__32_ __36_ __53_ __68_
  2  13  54  14   2  14|   0     0 |   0     0 |   0     0 :   0     0 :   0     0 |   0     0     0     0
  5  26   8  21   4  37|44.6k 44.8k|20.8M 61.7M|60.9M 4397k:32.3M 4384k:28.6M 12.5k|  91    90  23.5k 20.9k
  5  30   6  20   5  35|44.2k 48.5k|20.6M 61.7M|63.2M 4448k:32.4M 4436k:30.9M 12.1k|  88    86  23.1k 20.8k
  5  29   7  20   5  34|44.3k 44.0k|20.5M 61.0M|   -  4757k:36.0M 4746k:28.4M 11.2k|  90    89  23.2k 20.8k
  5  29   8  19   5  34|44.3k 45.7k|20.6M 61.7M|61.0M 4425k:32.6M 4414k:28.4M 11.7k|  86    87  23.2k 20.8k
  5  28   7  19   5  36|44.2k 47.7k|20.6M 61.2M|60.3M 4294k:32.1M 4281k:28.1M 12.4k|  87    86  23.1k 20.8k
  4  28   6  20   4  38|44.1k 50.1k|20.7M 61.2M|70.7M 4617k:33.3M 4602k:37.4M 15.3k|  87    94  23.6k 20.2k

Looking at these numbers, we can see that the 2 QLogic HBAs are each
causing about 90 interrupts/second (for 512kB block sizes) whereas the
GigE adapters each have about 22k interrupts/sec (at 1500B).
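
As a rough cross-check of these figures, the sketch below estimates how many received frames each tg3 interrupt covers. It rests on assumptions that are not stated in the report: that dstat prints bytes/sec, that "M" means 2**20, and that every received frame is a full 1500 bytes. The report does not say which IRQ belongs to which NIC, so both are summed.

```python
# Rough estimate: how many received frames does each interrupt cover?
# Assumptions (not from the report): dstat prints bytes/sec, "M" is 2**20,
# and every received frame is a full 1500 bytes.

MIB = 2 ** 20


def frames_per_interrupt(recv_bytes_per_sec, interrupts_per_sec, frame_bytes=1500):
    """Return the average number of full-size frames handled per interrupt."""
    frames_per_sec = recv_bytes_per_sec / frame_bytes
    return frames_per_sec / interrupts_per_sec


# Values taken from one sample line of the dstat output above:
# eth2 received 32.3M, eth3 received 28.6M; IRQ 53 at 23.5k/s, IRQ 68 at 20.9k/s.
total_recv = (32.3 + 28.6) * MIB
total_irqs = 23.5e3 + 20.9e3

ratio = frames_per_interrupt(total_recv, total_irqs)
print(f"about {ratio:.2f} full-size frames per interrupt")
```

Under those assumptions the ratio comes out close to one frame per interrupt. Transmit completions and smaller frames also raise interrupts, so this is only an order-of-magnitude check.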

We can't use jumbo frames to improve this situation.

Version-Release number of selected component (if applicable):
kernel-2.4.21-20.EL

How reproducible:
Always

Comment 1 Dag Wieers 2004-12-16 00:16:35 UTC
Let me give less detailed dstat output so it doesn't get mangled.
Hope this fits (less than 80 columns).

[root@rtlbrunews01 root]# dstat -cni -N eth2,eth3 -I 32,36,53,68 10
----total-cpu-usage---- --net/eth2----net/eth3- -------interrupts------
usr sys idl wai hiq siq|_recv _send _recv _send|__32_ __36_ __53_ __68_
  2  13  54  14   2  15|   0     0 :   0     0 |   0     0     0     0 
  5  29   5  20   5  36|32.7M 4378k:28.4M 11.6k|  89    89  23.2k 20.6k
  6  31   5  18   5  35|32.5M 4389k:31.2M 11.9k|  88    87  23.3k 20.6k
  6  30   5  18   5  35|32.4M 4389k:28.4M 10.8k|  90    88  23.1k 20.6k
  6  29   5  19   6  36|35.5M 4853k:28.2M 11.7k|  88    88  23.1k 20.6k
  4  30   5  21   5  36|32.6M 4403k:28.3M 11.5k|  90    87  23.1k 20.6k


Comment 2 John W. Linville 2004-12-17 01:30:13 UTC
Forgive me for not being familiar with dstat.  I'll have to do some
research before I can fully understand the tables above... :-)

For the record, what sort of interrupt frequency are you expecting? 
At wire-speed, a 1Gbps ethernet link can do between ~80k (1500 byte
frames) and almost 1.5M (64 byte frames) frames/second.  So, ~20k
interrupts/second is indeed a reduction.
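
The frame rates quoted here can be reproduced with a short calculation. This is a sketch using the standard Ethernet per-frame overhead on the wire (preamble, start delimiter and inter-frame gap); the function name is made up for illustration.

```python
# Per-frame overhead on the wire in addition to the frame itself:
# 7-byte preamble + 1-byte start-of-frame delimiter + 12-byte inter-frame gap.
WIRE_OVERHEAD_BYTES = 20


def max_frames_per_sec(frame_bytes, link_bps=1_000_000_000):
    """Upper bound on frames/second for back-to-back frames of one size."""
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return link_bps / bits_on_wire


small = max_frames_per_sec(64)    # minimum-size Ethernet frame
large = max_frames_per_sec(1518)  # 1500-byte payload + 14-byte header + 4-byte FCS

print(f"64-byte frames:   {small:,.0f} frames/s")
print(f"1518-byte frames: {large:,.0f} frames/s")
```

This gives roughly 1.49M frames/s for minimum-size frames and roughly 81k frames/s for full-size frames, in line with the numbers above.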

Also (perhaps counter-intuitively) a lower link utilization (i.e. less
than wire-speed) actually offers fewer opportunities for coalescing
interrupts, since frames are more likely to already have been handled
before the next frame arrives...
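
The effect described here can be illustrated with a toy model. This is not how tg3 or NAPI is implemented; it only assumes that an interrupt handler keeps processing frames as long as the next one has already arrived. The handler costs (20 us per interrupt, 5 us per frame) are hypothetical values chosen for illustration.

```python
def interrupts_for(arrival_gap_us, n_frames=10_000,
                   irq_overhead_us=20.0, per_frame_us=5.0):
    """Count interrupts needed to receive n_frames evenly spaced frames.

    Toy model: an interrupt fires for the first unhandled frame; the handler
    pays a fixed entry cost, then keeps processing frames as long as the next
    one has already arrived by the time the handler is ready for it.
    The cost parameters are hypothetical, not measured values.
    """
    interrupts = 0
    next_frame = 0  # index of the next frame to process
    while next_frame < n_frames:
        interrupts += 1
        # The handler was idle, so it starts when this frame arrives.
        now = next_frame * arrival_gap_us + irq_overhead_us
        # Drain every frame that has arrived by the time we get to it.
        while next_frame < n_frames and next_frame * arrival_gap_us <= now:
            now += per_frame_us
            next_frame += 1
    return interrupts


# ~12.3 us between frames is wire speed for full-size frames on 1 Gbps;
# ~46 us corresponds to roughly 22k frames/s.
for gap in (12.3, 46.0):
    n = interrupts_for(gap, n_frames=1000)
    print(f"gap {gap:5.1f} us: {n} interrupts for 1000 frames")
```

In this model frames arriving close together are batched into one interrupt, while frames spaced further apart than the handler's run time each get their own interrupt.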

Finally, by "Interrupt Coalescence" are you referring to software
techniques like NAPI?  Or a specific feature of the hardware?  If the
latter, I'm not sure how Linux can do anything about it (beyond
perhaps enabling/disabling and possibly configuring it).

Just a thought...are these 64-bit PCI NICs?  If not, that may be the
source of less than optimal performance.

Comment 3 Dag Wieers 2004-12-17 03:11:09 UTC
Well, I didn't expect an interrupt for each frame on gigabit
interfaces, even though 22k interrupts per interface is not unusual.
Looking at the dstat output (and with experience of how the system
acts with different usages), the 36% spent in softirqs is directly
related to the high number of interrupts, and the high number of
context switches is also directly related to the number of softirqs
(interrupt handlers) handled.

Well, I've read that the tg3 driver uses (or can use?) NAPI, but the
module only has a debug parameter. So I'm not sure what NAPI improves
or whether it is enabled.

The bcm5700 driver seems to have some module options that allow what
they call adaptive coalescence. (This is actually disabling it :))

    adaptive_coalesce=0
    rx_coalesce_ticks=1 
    rx_max_coalesce_frames=1 
    tx_coalesce_ticks=1
    tx_max_coalesce_frames=1

So I guess (some?) hardware supports it. I wonder if it would work
better than NAPI (in our case).

PS: dstat output is pretty straightforward: CPU percentage, interface
usage (in bytes/sec, with k/M suffixes) and interrupts/sec. 32/36 are
the QLogic HBAs, 53/68 are the two tg3 NICs. Each line is a 10sec average.
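
For anyone without dstat, per-IRQ rates like these can be derived from two snapshots of /proc/interrupts. The sketch below assumes the usual layout of that file (a header row of CPU names, then one row per IRQ with one counter per CPU); the function names are made up for illustration.

```python
def parse_interrupt_counts(text):
    """Return {irq_label: count summed over all CPUs} from /proc/interrupts text."""
    lines = text.strip().splitlines()
    n_cpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    counts = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        # The first n_cpus fields are counters; the rest describe the device.
        per_cpu = [int(f) for f in fields[:n_cpus] if f.isdigit()]
        counts[label.strip()] = sum(per_cpu)
    return counts


def interrupt_rates(before, after, seconds):
    """Return {irq_label: interrupts/second} between two snapshots."""
    return {irq: (after[irq] - before[irq]) / seconds
            for irq in after if irq in before}


def read_counts(path="/proc/interrupts"):
    """Take one snapshot from the running system (Linux only)."""
    with open(path) as handle:
        return parse_interrupt_counts(handle.read())
```

Calling read_counts() twice, 10 seconds apart, and passing both results to interrupt_rates() with seconds=10 gives a 10-second average per IRQ.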

Comment 4 John W. Linville 2004-12-17 16:12:03 UTC
The tg3 driver only uses NAPI nowadays...the output of "ethtool -i"
will tell me for sure (although I'll still have to look at the source
to correlate the reported version number).

RH doesn't support the bcm5700 driver after RHEL2.1, as you may know.
There does appear to be some code in the tg3 driver to support the
coalescing hardware, but it also appears to be turned off in favor of
NAPI (with the possible exception of CHIPREV_5700_AX and CHIPREV_5700_BX).

NAPI and other interrupt mitigation schemes are really designed to
help out under heavy network load.  I'm not sure that you're pushing
enough traffic for any of them to make much of a difference.

Again, you may want to check your PCI bus speed/width.  A 64-bit PCI
tg3 card should have no problem hitting wire-speed (or thereabouts) on
any modern CPU.  I've been testing just this scenario recently with
another issue.

Comment 5 John W. Linville 2004-12-21 14:27:17 UTC
Can you conduct a test using smaller (e.g. 64 byte) frames?  They are
more likely to benefit from NAPI, because there is a greater
likelihood of receiving multiple frames per interrupt.

Comment 6 John W. Linville 2005-01-19 15:40:08 UTC
Closing this one for now...feel free to re-open if/when you have data
from running a test w/ 64-bit PCI cards using 64-byte frames...