Bug 143053
| Summary: | tg3 driver not throttling interrupts | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 3 | Reporter: | Dag Wieers <dag> |
| Component: | kernel | Assignee: | John W. Linville <linville> |
| Status: | CLOSED NOTABUG | QA Contact: | Brian Brock <bbrock> |
| Severity: | medium | Priority: | medium |
| Version: | 3.0 | CC: | petrides, riel |
| Target Milestone: | --- | Keywords: | FutureFeature |
| Hardware: | All | OS: | Linux |
| Doc Type: | Enhancement | Last Closed: | 2005-01-19 15:40:08 UTC |
Description (Dag Wieers, 2004-12-16 00:11:30 UTC)
Let me give less detailed dstat output so it doesn't get mangled. Hope this fits (less than 80 columns).

```
[root@rtlbrunews01 root]# dstat -cni -N eth2,eth3 -I 32,36,53,68 10
----total-cpu-usage---- --net/eth2----net/eth3- -------interrupts------
usr sys idl wai hiq siq|_recv _send _recv _send|__32_ __36_ __53_ __68_
  2  13  54  14   2  15|    0     0 :    0     0|    0     0     0     0
  5  29   5  20   5  36|32.7M 4378k:28.4M 11.6k|   89    89 23.2k 20.6k
  6  31   5  18   5  35|32.5M 4389k:31.2M 11.9k|   88    87 23.3k 20.6k
  6  30   5  18   5  35|32.4M 4389k:28.4M 10.8k|   90    88 23.1k 20.6k
  6  29   5  19   6  36|35.5M 4853k:28.2M 11.7k|   88    88 23.1k 20.6k
  4  30   5  21   5  36|32.6M 4403k:28.3M 11.5k|   90    87 23.1k 20.6k
```

John W. Linville:

Forgive me for not being familiar with dstat. I'll have to do some research before I can fully understand the tables above... :-)

For the record, what sort of interrupt frequency are you expecting? At wire speed, a 1 Gbps Ethernet link can carry between ~80k frames/second (1500-byte frames) and almost 1.5M frames/second (64-byte frames). So ~20k interrupts/second is indeed a reduction. Also (perhaps counter-intuitively), lower link utilization (i.e., less than wire speed) actually offers fewer opportunities for coalescing interrupts, since frames are more likely to have already been handled before the next frame arrives...

Finally, by "interrupt coalescence" are you referring to software techniques like NAPI, or to a specific feature of the hardware? If the latter, I'm not sure how Linux can do anything about it (beyond perhaps enabling/disabling and possibly configuring it).

Just a thought: are these 64-bit PCI NICs? If not, that may be the source of less-than-optimal performance.

Dag Wieers:

Well, I didn't expect an interrupt for each frame on gigabit interfaces, even though 22k interrupts per interface is not unusual.
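Linville's wire-speed figures can be sanity-checked with a little arithmetic. A sketch, under my own assumptions: each frame carries 20 bytes of on-wire overhead (7-byte preamble + 1-byte SFD + 12-byte inter-frame gap), and dstat's "M" here means roughly 10^6 bytes/second of mostly full-size frames.

```python
# Back-of-the-envelope Ethernet frame-rate math for a 1 Gbps link.
LINK_BPS = 1_000_000_000  # 1 Gbps

# On the wire, each frame costs an extra 7B preamble + 1B SFD + 12B gap.
OVERHEAD_BYTES = 8 + 12

def max_frames_per_sec(frame_bytes: int) -> float:
    """Wire-speed frame rate for a given frame size (incl. headers and FCS)."""
    return LINK_BPS / ((frame_bytes + OVERHEAD_BYTES) * 8)

print(f"64-byte frames:   {max_frames_per_sec(64):,.0f} frames/s")   # ~1.49M
print(f"1518-byte frames: {max_frames_per_sec(1518):,.0f} frames/s") # ~81k

# Rough check against the dstat sample above: ~32.5 MB/s received on eth2,
# assumed to be mostly ~1500-byte frames.
est_frames = 32.5e6 / 1500
print(f"Estimated rx frames/s at 32.5 MB/s: {est_frames:,.0f}")
# ~21.7k frames/s, close to the ~23k interrupts/s observed -- i.e. roughly
# one interrupt per frame at this load.
```

This is consistent with Linville's point: at ~25% of wire speed there is simply not much opportunity to batch frames into fewer interrupts.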
Looking at the dstat output (and with experience of how the system behaves under different workloads), the 36% softirq time is directly related to the high number of interrupts, and the high number of context switches is in turn directly related to the number of softirqs (interrupt handlers) serviced.

Well, I've read that the tg3 driver uses (or can use?) NAPI, but the module only has a debug parameter, so I'm not sure what NAPI improves or whether it is enabled. The bcm5700 driver seems to have some module options that allow what they call adaptive coalescence. (These settings actually disable it. :))

```
adaptive_coalesce=0
rx_coalesce_ticks=1
rx_max_coalesce_frames=1
tx_coalesce_ticks=1
tx_max_coalesce_frames=1
```

So I guess (some?) hardware supports it. I wonder whether it would work better than NAPI (in our case).

PS: the dstat output is pretty straightforward: CPU percentages, interface usage (in kB/sec), and interrupts/sec. Interrupts 32/36 are QLogic, 53/68 are both tg3 NICs. Each line is a 10-second average.

John W. Linville:

The tg3 driver only uses NAPI nowadays... the output of "ethtool -i" will tell me for sure (although I'll still have to look at the source to correlate the reported version number). Red Hat doesn't support the bcm5700 driver after RHEL 2.1, as you may know.

There does appear to be some code in the tg3 driver to support the coalescing hardware, but it also appears to be turned off in favor of NAPI (with the possible exception of CHIPREV_5700_AX and CHIPREV_5700_BX). NAPI and other interrupt mitigation schemes are really designed to help out under heavy network load. I'm not sure that you're pushing enough traffic for any of them to make much of a difference.

Again, you may want to check your PCI bus speed/width. A 64-bit PCI tg3 card should have no problem hitting wire speed (or thereabouts) on any modern CPU. I've been testing just this scenario recently with another issue. Can you conduct a test using smaller (e.g., 64-byte) frames?
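For reference, the bcm5700 options Dag quotes would normally be applied as a module-options line; a sketch of such a fragment (the file location and exact option set are my assumptions from the era's conventions, and this driver is unsupported after RHEL 2.1 as noted above; `ethtool -i eth2` and `ethtool -c eth2` report the bound driver and any coalescing parameters it exposes):

```
# /etc/modules.conf fragment (bcm5700 only; tg3 ignores these and uses NAPI).
# These values disable adaptive coalescing: one interrupt per frame.
options bcm5700 adaptive_coalesce=0 rx_coalesce_ticks=1 rx_max_coalesce_frames=1 tx_coalesce_ticks=1 tx_max_coalesce_frames=1
```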
Smaller frames are more likely to benefit from NAPI, because there is a greater likelihood of receiving multiple frames per interrupt. Closing this one for now... feel free to re-open if/when you have data from running a test with 64-bit PCI cards using 64-byte frames.
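The "multiple frames per interrupt" effect can be illustrated with a toy simulation. This is a sketch, not how the kernel schedules anything: I assume Poisson frame arrivals, a fixed per-frame handling cost of 0.5 µs (an invented number), and an idealized NAPI-style rule that an interrupt fires only when a frame arrives while the CPU is idle, with frames arriving mid-poll drained interrupt-free.

```python
import random

def frames_per_interrupt(mean_interarrival_us: float, handle_us: float,
                         n_frames: int = 50_000, seed: int = 1) -> float:
    """Toy NAPI model: interrupt only on a frame that finds the CPU idle;
    frames arriving during an ongoing poll are batched without interrupts."""
    rng = random.Random(seed)
    t_arrival = 0.0
    t_busy_until = 0.0
    interrupts = 0
    for _ in range(n_frames):
        t_arrival += rng.expovariate(1.0 / mean_interarrival_us)
        if t_arrival >= t_busy_until:
            interrupts += 1                    # CPU was idle: take an interrupt
            t_busy_until = t_arrival + handle_us
        else:
            t_busy_until += handle_us          # batched into the current poll
    return n_frames / interrupts

# At wire speed, 64-byte frames arrive every ~0.67 us; full-size ~1500-byte
# frames every ~12 us (see the frame-rate math earlier in this thread).
print(f"64B frames:   {frames_per_interrupt(0.67, 0.5):.1f} frames/interrupt")
print(f"1500B frames: {frames_per_interrupt(12.0, 0.5):.2f} frames/interrupt")
```

With the small inter-arrival time, many frames pile up behind each poll, so the batch size is well above one; with full-size frames the CPU is almost always idle when a frame lands, and the ratio stays near one interrupt per frame, matching what the dstat numbers show.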