Bug 460349
| Summary: | igb network driver takes too much time to establish link status | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 5 | Reporter: | Flavio Leitner <fleitner> | ||||
| Component: | kernel | Assignee: | Andy Gospodarek <agospoda> | ||||
| Status: | CLOSED DUPLICATE | QA Contact: | Martin Jenner <mjenner> | ||||
| Severity: | high | Docs Contact: | |||||
| Priority: | high | ||||||
| Version: | 5.2 | CC: | agibson2, agospoda, alexander.h.duyck, andriusb, dmair, james.brown, jesse.brandeburg, peterm, ranshalit, rdoty, rpacheco, tao | ||||
| Target Milestone: | rc | Keywords: | OtherQA, Regression | ||||
| Target Release: | --- | ||||||
| Hardware: | i386 | ||||||
| OS: | Linux | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2008-11-10 18:00:59 UTC | Type: | --- | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Bug Depends On: | |||||||
| Bug Blocks: | 391501, 441885 | ||||||
| Attachments: |
Description
Flavio Leitner
2008-08-27 17:49:00 UTC
Created attachment 316109 [details]
igb-52-fix.patch
Good news! I have a small fix to the 5.2 igb driver that fixes this. I took a look at what was being done on systems that use MSI-X and found that we were never properly starting the receive queues.
I've been testing this fix for a while and it seems to be exactly what we need. It doesn't require a large driver update that magically fixes this; it's an actual fix.
My test kernels will only contain the updated patch for 5.3, so if testing of this patch on 5.2 (-92 kernel) is needed, someone else will probably need to build it.
My test kernels have been updated to include a patch for this bugzilla. http://people.redhat.com/agospoda/#rhel5 Please test them and report back your results. Without immediate feedback there is a good chance this or any other fix for this driver will not be included in the upcoming update.
I examined this issue and it is one that is created by Red Hat having to backport kernel.org drivers to RHEL5. I'm not sure if we should start tracking this kind of issue resolution separately. I concur that Andy's patch appears correct. Booting with pci=nomsi caused the driver to only use 1 tx and rx queue due to only having one interrupt, which caused things to (mostly) work.
File uploaded: ebet-32.png This event sent from IssueTracker by jwest issue 189601 it_file 163951
File uploaded: rh52-64.png This event sent from IssueTracker by jwest issue 189601 it_file 163952
The fix in comment #8 was already included in the big igb update for 5.3 for bug 436040. *** This bug has been marked as a duplicate of bug 436040 ***
Any chance of backporting this fix to 5.2? I have a similar issue with the igb driver on Dell R200 hardware where no traffic is seen and 'Link Detected: no' is reported, but the rest of the link status is correct (duplex, etc.). An ethtool -r eth# causes 'Link Detected' to change to 'yes' and the interface to work, but sometimes when I unplug the interface and plug it back in, the link state gets stuck at 'no' again even though it is plugged in. I cannot access bug 436040, so I cannot put this comment on that bug report and am placing it here.
I forgot to mention that pci=nomsi does fix the link detection issue for me, but I worry that this workaround might not be a 100% fix after reading a comment above which mentions that it (mostly) works with pci=nomsi. "I concur that Andy's patch appears correct. booting with pci=nomsi caused the driver to only use 1 tx and rx queue due to only having one interrupt, which caused things to (mostly) work."
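For anyone who wants to watch for the stuck state described above without reading ethtool output by hand, the following is a minimal sketch in Python. It only parses the 'Link detected' field that ethtool prints. The sample text, the interface name and the helper names link_detected and query_interface are assumptions made for illustration and are not taken from the reporter's system or from any patch in this bug.

```python
import re
import subprocess
from typing import Optional

# Illustrative sample only; not captured from the reporter's machine.
SAMPLE_OUTPUT = """Settings for eth0:
	Speed: 1000Mb/s
	Duplex: Full
	Link detected: no
"""


def link_detected(ethtool_output: str) -> Optional[bool]:
    """Return True/False for the 'Link detected' field, or None if absent."""
    match = re.search(
        r"^\s*Link detected:\s*(yes|no)\s*$",
        ethtool_output,
        re.IGNORECASE | re.MULTILINE,
    )
    if match is None:
        return None
    return match.group(1).lower() == "yes"


def query_interface(ifname: str) -> Optional[bool]:
    """Run `ethtool <ifname>` and parse the result (requires ethtool)."""
    result = subprocess.run(
        ["ethtool", ifname], capture_output=True, text=True, check=True
    )
    return link_detected(result.stdout)


if __name__ == "__main__":
    # Parses the canned sample; call query_interface("eth0") on a real host.
    print(link_detected(SAMPLE_OUTPUT))
```

If the parsed value stays False while the cable is plugged in, that matches the symptom reported here, and ethtool -r on the interface is the manual recovery step mentioned in the comment.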
Adam, with the release of 5.3 pending (in a matter of weeks), I'm not sure this will be able to slip into the 5.2 stream in time to be worth anything. Will you be able to update to at least the newer kernel when it comes out? And don't worry, you aren't missing much by not being able to access bug 436040 (seriously).
Waiting is not such a big deal if 5.3 is expected in weeks. The workaround will work for now until I upgrade them, I guess. I was just worried that the workaround might not be good enough based on the comments above. I am deploying a few firewalls, so networking is pretty important. Testing so far has revealed that pci=nomsi is working fine.
Hopefully this question isn't too far off topic, but what are the downsides to using pci=nomsi? I did a Google search but didn't really turn up anything. I assume that since all ports share an IRQ there will be more overhead, but I have a feeling it won't account for much with processor speeds nowadays. Any ideas?
Hello, I have a similar issue, but I don't have "eth0: link is not ready" messages. I only see that it takes a lot of time (6 seconds in total: 3 from loading the driver until the link is up, and an additional 3 seconds from link up until ping works). Is it the same issue (I see it in buildroot)? Should I try the patch? Regards, ranran
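To put a number on the link-up delay asked about in the last comment, one option is to poll the interface's carrier flag in sysfs and time how long it takes to read 1. The sketch below is an illustration in Python under stated assumptions: it assumes /sys/class/net/<ifname>/carrier is available on the kernel in use, and the helper names sysfs_carrier_reader and wait_for_carrier are hypothetical. It does not measure the extra time until ping succeeds.

```python
import time
from typing import Callable, Optional


def sysfs_carrier_reader(ifname: str) -> Callable[[], str]:
    """Build a reader for /sys/class/net/<ifname>/carrier ("1" means link up)."""
    path = f"/sys/class/net/{ifname}/carrier"

    def read() -> str:
        try:
            with open(path) as f:
                return f.read().strip()
        except OSError:
            # Missing interface, or interface administratively down.
            return "0"

    return read


def wait_for_carrier(
    read_carrier: Callable[[], str],
    timeout: float = 10.0,
    interval: float = 0.1,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[float]:
    """Poll until carrier reads "1"; return elapsed seconds, or None on timeout."""
    start = clock()
    while True:
        if read_carrier() == "1":
            return clock() - start
        if clock() - start >= timeout:
            return None
        sleep(interval)
```

A typical call would be wait_for_carrier(sysfs_carrier_reader("eth0")) started right after the driver is loaded and the interface is brought up; the reader is passed in as a function so the polling loop can be exercised without real hardware.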