Bug 500925
Summary: | IP Fragments Dropped when ARP is needed | |
---|---|---|---
Product: | Red Hat Enterprise Linux 5 | Reporter: | Tuan Hoang <tqhoang>
Component: | kernel | Assignee: | Neil Horman <nhorman>
Status: | CLOSED NOTABUG | QA Contact: | Red Hat Kernel QE team <kernel-qe>
Severity: | medium | Priority: | low
Version: | 5.3 | CC: | nhorman, tgraf
Hardware: | i686 | OS: | Linux
Doc Type: | Bug Fix | Last Closed: | 2009-06-15 11:06:14 UTC
Description
Tuan Hoang
2009-05-14 21:27:38 UTC
This sounds an awful lot like the arp_queue overflowing (see __neigh_event_send). It used to be a silent discard, but sometime in the 5.4 devel cycle I added an unres_discard stat to the /proc/net/stat/[arp|ndisc]_cache files, so the discards could be observed. Nominally that queue length is only 3 frames, but it can be adjusted upward via /proc/sys/net/ipv[4|6]/neigh/<iface>/unres_qlen. That would be the appropriate adjustment to make for the test described.

(In reply to comment #1)
> This sounds an awful lot like the arp_queue overflowing (see
> __neigh_event_send). It used to be a silent discard, but sometime in the 5.4
> devel cycle I added an unres_discard stat to the
> /proc/net/stat/[arp|ndisc]_cache files, so they could be observed. nominally
> that queue length is only 3 frames, but it can be adjusted upward via
> /proc/sys/net/ipv[4|6]/neigh/<iface>/unres_qlen. That would be the appropriate
> adjustment to make for the test described.

Hello, Neil. Thanks for your helpful hints! Yes, /proc/sys/net/ipv[4|6]/neigh/<iface>/unres_qlen is the cause here: the value in it is exactly the number of ICMP packets that we can get on HOST2 with the above test. In the source code, the drop should happen in the last 'if' in __neigh_event_send(). So we don't need to fix this? Or should we just change the default value of unres_qlen?

I think, given that the setting is tunable, no code change is needed. If the tests this customer is conducting require no UDP frame loss, the answer is for them to tune that value appropriately. I'd close this as NOTABUG, and provide documentation on how the user can scale that tunable appropriately.

Thank you for the valuable information. I will set up the same test and report back. Out of curiosity, is there any adverse side effect of setting "unres_qlen" to a value of, say, 50 or even 100?

Only that you potentially create a large backlog of frames in the system. IIRC that queue is per-peer, so if you have a lot of hosts that need revalidation frequently, you can get lots of frames backing up. But if lost frames on tx are unacceptable, that's your only recourse.

(In reply to comment #3)
> I think, given that the setting is tunable, no code change is needed. If the
> tests this customer is conducting require no UDP frame loss, the answer is for
> them to tune that value appropriately. I'd close this as NOTABUG, and provide
> documentation on how the user can scale that tunable appropriately.

That is fine for me, please close this as NOTABUG.

Closing as discussed. I think the discussion in this bug serves as sufficient documentation. Tuan, please feel free to reopen this (or open a new bug) if any subsequent problems come up.
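
For anyone who wants to check the tunable and the discard counter discussed above, here is a minimal Python sketch. It makes several assumptions that are not confirmed in this bug: the interface name `eth0` is hypothetical, the helper names are made up for illustration, the stat file is assumed to hold one header line followed by one row of hexadecimal counters per CPU, and the discard column is located by looking for any header containing `unres`, since the exact column name may differ between kernels.

```python
from pathlib import Path


def unres_qlen_path(iface, family="ipv4"):
    """Build the path of the per-interface unres_qlen tunable (as named in this bug)."""
    return f"/proc/sys/net/{family}/neigh/{iface}/unres_qlen"


def sum_stat_column(text, column_hint="unres"):
    """Sum one counter column across all rows of a neighbour stat file.

    Assumption: first non-empty line is a header of column names, and every
    following line holds hexadecimal counters (one row per CPU).
    Returns None if no column name contains column_hint.
    """
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    if len(lines) < 2:
        return None
    header, rows = lines[0], lines[1:]
    matches = [i for i, name in enumerate(header) if column_hint in name]
    if not matches:
        return None
    idx = matches[0]
    return sum(int(row[idx], 16) for row in rows if len(row) > idx)


if __name__ == "__main__":
    iface = "eth0"  # hypothetical interface name, adjust for your system

    qlen_file = Path(unres_qlen_path(iface))
    if qlen_file.exists():
        print("unres_qlen:", qlen_file.read_text().strip())

    stat_file = Path("/proc/net/stat/arp_cache")
    if stat_file.exists():
        print("unresolved discards:", sum_stat_column(stat_file.read_text()))
```

Running it before and after the fragment test should show whether the discard counter grows. Raising the queue length means writing a larger value to the `unres_qlen` file as root, with the per-peer backlog trade-off described above.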