Bug 660506
| Summary: | [Broadcom 5.6 bug] tg3: Increase tx jumbo bd flag threshold | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 5 | Reporter: | Matt Carlson <mcarlson> | ||||
| Component: | kernel | Assignee: | John Feeney <jfeeney> | ||||
| Status: | CLOSED ERRATA | QA Contact: | Network QE <network-qe> | ||||
| Severity: | urgent | Docs Contact: | |||||
| Priority: | high | ||||||
| Version: | 5.6 | CC: | andriusb, benlu, gideonn, hjia, jfeeney, kzhang, mcarlson | ||||
| Target Milestone: | rc | Keywords: | OtherQA | ||||
| Target Release: | 5.6 | ||||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2011-01-13 22:03:10 UTC | Type: | --- | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Attachments: |
|
||||||
|
Description
Matt Carlson
2010-12-06 22:55:07 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release. > Do you consider this bug as a blocker for RHEL 5.6? Note that we are *very*
> late in the release cycle and plan to build RC kernel in a few days.
FYI, maybe the problem is my web browser (Opera), but I didn't see the quoted part of this comment (above) in the comment history.
Unfortunately, yes. Common transmit patterns can easily bring about the problem. This modification would be the correct thing to do, even if a problem were not present. IMO, it is low risk.
When applying the "Raise the jumbo frame BD frame threshold" patch I noticed
that the upstream patch changes tg3_start_xmit() and tg3_start_xmit_dma_bug()
but RHEL5.6 code would be changing tg3_start_xmit_dma_bug() only.
--- linux-2.6.18.noarch/drivers/net/tg3.c.657097.6.repost
+++ linux-2.6.18.noarch/drivers/net/tg3.c
@@ -5390,7 +5390,7 @@ static int tg3_start_xmit_dma_bug(struct
#endif
if ((tp->tg3_flags3 & TG3_FLG3_USE_JUMBO_BDFLAG) &&
- !mss && skb->len > ETH_DATA_LEN)
+ !mss && skb->len > VLAN_ETH_FRAME_LEN)
base_flags |= TXD_FLAG_JMB_PKT;
/* Queue skb data, a.k.a. the main skb fragment. */
@@ -5416,7 +5416,7 @@ static int tg3_start_xmit_dma_bug(struct
would_hit_hwbug = 1;
if ((tp->tg3_flags3 & TG3_FLG3_USE_JUMBO_BDFLAG) &&
- !mss && skb->len > ETH_DATA_LEN)
+ !mss && skb->len > VLAN_ETH_FRAME_LEN)
base_flags |= TXD_FLAG_JMB_PKT;
tg3_set_txd(tnapi, entry, mapping, len, base_flags,
One of these if statements appears to be superfluous. I tracked this mis-step
down to the trouble I had a year ago when posting the 3.104 patch set. If you
remember, there was a pointer problem that caused panics. The original patch
set posted had the correct code, as far as TXD_FLAG_JMB_PKT goes, but then the
patch set had to be reposted in a piecemeal fashion and it looks like both if
statements ended up in tg3_start_xmit_dma_bug(), as opposed to one in each
function. (Yuck)
So given that RHEL5.5 onward does not set TXD_FLAG_JMB_PKT in tg3_start_xmit(),
does this bring up another problem and does tg3_start_xmit_dma_bug() still need
to change to VLAN_ETH_FRAME_LEN? I assume it does, but does fixing it still
remain critical at this point in the RHEL5.6 cycle?
(In reply to comment #4) > One of these if statements appears to be superfluous. I tracked this mis-step > down to the trouble I had a year ago when posting the 3.104 patch set. If you > remember, there was a pointer problem that caused panics. The original patch > set posted had the correct code, as far as TXD_FLAG_JMB_PKT goes, but then the > patch set had to be reposted in a piecemeal fashion and it looks like both if > statements ended up in tg3_start_xmit_dma_bug(), as opposed to one in each > function. (Yuck) > > So given that RHEL5.5 onward does not set TXD_FLAG_JMB_PKT in tg3_start_xmit(), > does this bring up another problem and does tg3_start_xmit_dma_bug() still need > to change to VLAN_ETH_FRAME_LEN? I assume it does, but does fixing it still > remain critical at this point in the RHEL5.6 cycle? Yuck is right. I didn't catch that. The TXD_FLAG_JMB_PKT flag is needed in tg3_start_xmit() too for the 57765 asic rev. We need to add that part or jumbo frames won't work correctly. So, the patch is needed in both places, though it only fixes a bug for the 5719 in the tg3_start_xmit_dma_bug() case. I'd consider the full patch as necessary. Okay, how does this look? Add if to tg3_start_xmit(), modify tg3_start_xmit_dma_bug(), and remove superflous.
@@ -5197,6 +5197,10 @@ static int tg3_start_xmit(struct sk_buff
tnapi->tx_buffers[entry].skb = skb;
pci_unmap_addr_set(&tnapi->tx_buffers[entry], mapping, mapping);
+ if ((tp->tg3_flags3 & TG3_FLG3_USE_JUMBO_BDFLAG) &&
+ !mss && skb->len > VLAN_ETH_FRAME_LEN)
+ base_flags |= TXD_FLAG_JMB_PKT;
+
tg3_set_txd(tnapi, entry, mapping, len, base_flags,
(skb_shinfo(skb)->nr_frags == 0) | (mss << 1));
@@ -5390,7 +5394,7 @@ static int tg3_start_xmit_dma_bug(struct
#endif
if ((tp->tg3_flags3 & TG3_FLG3_USE_JUMBO_BDFLAG) &&
- !mss && skb->len > ETH_DATA_LEN)
+ !mss && skb->len > VLAN_ETH_FRAME_LEN)
base_flags |= TXD_FLAG_JMB_PKT;
/* Queue skb data, a.k.a. the main skb fragment. */
@@ -5415,10 +5419,6 @@ static int tg3_start_xmit_dma_bug(struct
if (tp->tg3_flags3 & TG3_FLG3_5701_DMA_BUG)
would_hit_hwbug = 1;
- if ((tp->tg3_flags3 & TG3_FLG3_USE_JUMBO_BDFLAG) &&
- !mss && skb->len > ETH_DATA_LEN)
- base_flags |= TXD_FLAG_JMB_PKT;
-
tg3_set_txd(tnapi, entry, mapping, len, base_flags,
(skb_shinfo(skb)->nr_frags == 0) | (mss << 1));
Looks good to me. See http://people.redhat.com/jfeeney/.bz660506 It has the patch provided above. Very quick testing feedback would be appreciated. Ben, can you test this? OK, update the test soon. in kernel-2.6.18-237.el5 You can download this test kernel (or newer) from http://people.redhat.com/jwilson/el5 Detailed testing feedback is always welcomed. Hi Ben, When you finish the testing, could you post a feedback to indicate the testing status of this bug. The test kernels can be found in Comment #12. Thanks. Created attachment 469849 [details]
Additional patches to fix the problem
Hi John. It looks like some more code is needed to fix the problem. The attached tarball contains two patches that will take care of the bug. Any chance these can make it in?
Matt, That's too bad. This bz would not be appropriate because it is all locked and loaded for RHEL5.6. I would need another. Still, I would think it is doubtful that any bz of this nature would make it into RHEL5.6 at this point. Comment on attachment 469849 [details]
Additional patches to fix the problem
needs to be in a new bugzilla since ON_QA
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on therefore solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2011-0017.html |