Bug 618325
Summary: | tcpdump & wireshark are very slow to start due to lengthy setsockopt PACKET_RX_RING calls |
---|---|
Product: | [Fedora] Fedora |
Component: | libpcap |
Reporter: | Phil Mayers <p.mayers> |
Assignee: | Miroslav Lichvar <mlichvar> |
QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Status: | CLOSED NOTABUG |
Severity: | medium |
Priority: | low |
Version: | 12 |
CC: | anton, dougsland, gansalmon, gharris, itamar, jonathan, kernel-maint, madhu.chinakonda, mlichvar, nhorman |
Hardware: | All |
OS: | Linux |
Doc Type: | Bug Fix |
Last Closed: | 2010-07-28 11:05:22 UTC |
Created attachment 434465 [details]
vmstat whilst tcpdump is starting
Just a note - the delay might not seem "that bad" in this case, but I'd run tcpdump several times in quick succession. I suspect various stuff had been pushed out to swap. This is as *fast* as it ever gets - when the machine has been used for "general stuff" for a few hours, it can take very noticeable time to start tcpdump.
I'm not sure if there is anything we can do in libpcap to fix this, beside disabling the mmaped capture. Perhaps kernel developers will have a suggestion?

(In reply to comment #2)
> I'm not sure if there is anything we can do in libpcap to fix this, beside
> disabling the mmaped capture.
>
> Perhaps kernel developers will have a suggestion?

You're trying to allocate 128k of contiguous memory and that can always cause this problem. Should probably close as NOTABUG?

"you" in this case is libpcap (the component I opened the bug against ;o) so if it's a problem to do that, it should perhaps be fixed in libpcap?

Just out of curiosity: why does it take 3 seconds and *then* fail?

Is there a way to tell libpcap to *not* use MMAPed capture? Environment variable?

Ah, this is interesting. If I run:

    tcpdump -s 65484

...I get a single, fast setsockopt() call:

    0.000059 setsockopt(3, SOL_PACKET, PACKET_RX_RING, {block_size=65536, block_nr=32, frame_size=65536, frame_nr=32}, 16) = 0

If I run:

    tcpdump -s 65485

...I get a series of setsockopt() calls, which are slow and fail until the size is ramped down:

    0.000101 setsockopt(3, SOL_PACKET, PACKET_RX_RING, {block_size=131072, block_nr=31, frame_size=65552, frame_nr=31}, 16) = -1 ENOMEM
    1.089152 setsockopt(3, SOL_PACKET, PACKET_RX_RING, {block_size=131072, block_nr=30, frame_size=65552, frame_nr=30}, 16) = -1 ENOMEM
    2.848500 setsockopt(3, SOL_PACKET, PACKET_RX_RING, {block_size=131072, block_nr=29, frame_size=65552, frame_nr=29}, 16) = -1 ENOMEM
    1.025462 setsockopt(3, SOL_PACKET, PACKET_RX_RING, {block_size=131072, block_nr=28, frame_size=65552, frame_nr=28}, 16) = -1 ENOMEM
    0.000827 setsockopt(3, SOL_PACKET, PACKET_RX_RING, {block_size=131072, block_nr=27, frame_size=65552, frame_nr=27}, 16) = -1 ENOMEM
    0.697725 setsockopt(3, SOL_PACKET, PACKET_RX_RING, {block_size=131072, block_nr=26, frame_size=65552, frame_nr=26}, 16) = -1 ENOMEM
    0.000778 setsockopt(3, SOL_PACKET, PACKET_RX_RING, {block_size=131072, block_nr=25, frame_size=65552, frame_nr=25}, 16) = -1 ENOMEM
    1.381146 setsockopt(3, SOL_PACKET, PACKET_RX_RING, {block_size=131072, block_nr=24, frame_size=65552, frame_nr=24}, 16) = 0

...so the issue seems to be that asking for a capture over a certain size will trigger a memory allocation that the kernel is unwilling to perform (and takes a long-ish time to decide that)

I've habitually used "-s 0" to ensure I don't miss any data regardless of the underlying link MTU. It would be nice if this could still be relied on and be fast, but I think we're well into a libpcap/tcpdump bug here.

I'm still not sure how can we fix this in libpcap. If you have suggestions, please write to the tcpdump-workers list and I will cherry pick the commit.

The current top-of-the-1.2-branch and trunk versions of libpcap might do a better job of this, at least on ARPHRD_ETHER interfaces (e.g., actual Ethernet interfaces), as, for those interfaces, it tries to allocate a ring buffer based on the minimum of what it thinks is the maximum packet size and the snapshot length, rather than just on the snapshot length. It requests the MTU in an attempt to properly handle jumbo frames, and it also checks for several forms of offloading and punts if it appears that TCP segmentation/reassembly offloading may be done (as you can then get packets larger than the interface MTU+link-layer header size).

It does *not* attempt it on other network types - in particular, it can't do it on 802.11 interfaces when you're in monitor mode and getting radiotap etc. radio metadata headers, as there's no way to ask for the maximum size of those headers.

It looks as if the new TPACKET_V3 memory-mapped interface in newer kernels might not be using fixed-length slots per packets and might not require that a maximum packet size be specified when the ring buffer is created. If so, having libpcap use that if available should, I think, work even better.
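The jump between "-s 65484" and "-s 65485" in the traces above can be reproduced with a little arithmetic. The following is only a sketch: the 52-byte per-frame overhead, 16-byte alignment, 4096-byte page size, 2 MiB ring size, and the roughly 5% shrink on ENOMEM are assumptions chosen because they fit the strace values in this report, not values copied from libpcap source. The function names are hypothetical.

```python
# All constants are assumptions inferred from the strace output in this
# report; they are not taken from libpcap source.
PAGE_SIZE = 4096                 # assumed page size
HEADER_OVERHEAD = 52             # assumed per-frame header bytes
ALIGNMENT = 16                   # assumed frame alignment
BUFFER_SIZE = 2 * 1024 * 1024    # assumed default ring size (2 MiB)


def align(n, a=ALIGNMENT):
    """Round n up to the next multiple of a."""
    return (n + a - 1) // a * a


def ring_request(snaplen):
    """Return (block_size, block_nr, frame_size, frame_nr) for a snaplen."""
    frame_size = align(snaplen + HEADER_OVERHEAD)
    block_size = PAGE_SIZE
    while block_size < frame_size:   # double the page size until a frame fits
        block_size *= 2
    frames_per_block = block_size // frame_size
    frame_nr = BUFFER_SIZE // frame_size
    block_nr = frame_nr // frames_per_block
    frame_nr = block_nr * frames_per_block
    return block_size, block_nr, frame_size, frame_nr


def shrink_sequence(frame_nr, succeeds_at):
    """Frame counts tried if each ENOMEM trims the request by about 5%."""
    tried = [frame_nr]
    while frame_nr > succeeds_at:
        frame_nr -= max(1, frame_nr // 20)
        tried.append(frame_nr)
    return tried


if __name__ == "__main__":
    for snaplen in (65484, 65485, 65535):
        print(snaplen, ring_request(snaplen))
    print(shrink_sequence(31, 24))
```

Under these assumptions, a snapshot length of 65484 is the largest whose frame still fits a 64 KiB block; one byte more pushes the frame to 65552 bytes and the block to 128 KiB, which is where the slow, failing calls begin.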
FWIW, This commit:

http://git.kernel.org/?p=linux/kernel/git/davem/net-next.git;a=commit;h=0e3125c755445664f00ad036e4fc2cd32fd52877

Changed the AF_PACKET allocation strategy in the kernel so as to make the PACKET_RX_RING calls much quicker. If you update to a later fedora it should be improved.

The TPACKET_V3 changes are in

http://git.kernel.org/?p=linux/kernel/git/davem/net-next.git;a=commit;h=0d4691ce112be025019999df5f2a5e00c03f03c2
http://git.kernel.org/?p=linux/kernel/git/davem/net-next.git;a=commit;h=f6fb8f100b807378fda19e83e5ac6828b638603a
http://git.kernel.org/?p=linux/kernel/git/davem/net-next.git;a=commit;h=bc59ba399113fcbcac56ba22edde4b816199d48c

and probably subsequent changes such as

http://git.kernel.org/?p=linux/kernel/git/davem/net-next.git;a=commit;h=eea49cc9009767dfbafd673ee577854454b52e0d

Those, however, won't help without libpcap changes to use TPACKET_V3 - and without running a kernel with TPACKET_V3 support. It would be Very Nice if somebody with more Copious Free Time(R) than me were to add TPACKET_V3 support to libpcap; I can't guarantee when I'd be able to work on it.
Created attachment 434461 [details]
strace of tcpdump; note the slow setsockopt calls on lines 76-81

Description of problem:
I'm a network engineer and use tcpdump and wireshark a lot. I've recently noticed that, on my Fedora 12 desktop machine, tcpdump and wireshark seem to take a good 10-15 seconds to start up on average; this is not the usual startup delay I'd expect on a machine this quick (or slow!)

strace seems to indicate several calls to:

    setsockopt(3, SOL_PACKET, PACKET_RX_RING, {block_size=131072, block_nr=31, frame_size=65600, frame_nr=31}, 16) = -1 ENOMEM (Cannot allocate memory)

...which take between 0.5 and 5 seconds to return before finally succeeding. Whilst this is happening, the system is very unresponsive - vmstat seems to think the time is spent in "swap out" and iowait.

This is new behaviour; tcpdump didn't use to do this.

Version-Release number of selected component (if applicable):
libpcap-1.0.0-4.20090922gite154e2.fc12
tcpdump-4.0.0-3.20090921gitdf3cb4.fc12

How reproducible:
Every time

Steps to Reproduce:
1. Start tcpdump
2. Observe high CPU / disk activity and slow wait
3. ?

Actual results:
tcpdump seems to block allocating memory

Expected results:
well... it should start a bit quicker; ideally the same sort of snappy speeds it used to

Additional info:
I will attach an strace and vmstat; at the time in question the machine had:

    $ free
                 total       used       free     shared    buffers     cached
    Mem:       3992340    3419804     572536          0      33848     503500
    -/+ buffers/cache:    2882456    1109884
    Swap:      2031608    1876168     155440

...which doesn't seem unreasonable. How much ram can libpcap need ;o)
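As a rough answer to the closing question, the numbers in the failing request can be worked through directly. This is a sketch that assumes 4096-byte pages and that each ring block is one physically contiguous allocation, as the kernel developer's comment in this thread indicates; the helper names are hypothetical.

```python
PAGE_SIZE = 4096  # assumed page size


def allocation_order(block_size, page_size=PAGE_SIZE):
    """Smallest order such that page_size * 2**order >= block_size."""
    pages = -(-block_size // page_size)   # ceiling division
    return (pages - 1).bit_length()


def ring_bytes(block_size, block_nr):
    """Total memory covered by a ring of block_nr blocks."""
    return block_size * block_nr


if __name__ == "__main__":
    total = ring_bytes(131072, 31)
    print(total, "bytes =", total / 2**20, "MiB")
    print("pages per 128 KiB block:", 131072 // PAGE_SIZE)
    print("order for 128 KiB block:", allocation_order(131072))
    print("order for 64 KiB block:", allocation_order(65536))
```

The total is under 4 MiB, which is small next to the free memory shown above. The difficulty is that each of the 31 blocks must be 32 contiguous pages, and on a machine with fragmented memory and heavy swap use such runs can be hard for the kernel to find.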