Red Hat Bugzilla – Bug 461706
PACKET socket 'read()' consumes lots of cpu during heavy write activity
Last modified: 2014-06-18 04:29:56 EDT
Description of problem:
While running a network stress test at 540k pps with a PACKET socket
application, I observed that a thread waiting in either 'read()' or
'poll()' [mmap'ed mode] and receiving absolutely zero packets
consumes 30% again as much as the CPU consumed by the thread
writing data to the same socket, and the writer thread consumes
about 10-15% more CPU than it does when no thread is waiting on
This looks like a lock contention issue where the pended reader
thread is run after a lock release. That is inefficient, of course,
and was the cause of data loss in the socket's write-side queue.
Eliminating the reader removed the data loss and lowered overall
CPU consumption.
I suspect this may be a fundamental issue with all sockets, but
did not delve into the possibility.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
Write 280,000 pps on two separate packet sockets which
have a reader thread blocked on a 'read()' or 'poll()'.
Each socket is bound to a different NIC.
Observe CPU consumption in 'top' with threads view active.
Note: The writer is bound to one quad-core CPU, as performance
is much worse if it floats across both CPUs. IRQs are also
bound to CPUs on the same node.
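
The setup described in these steps can be sketched roughly as follows. This is an illustration of the topology only (one writer, plus an optional reader blocked on the same packet socket), not the test program used for this report, and Python will not reach the packet rates quoted above. The helper names, the placeholder MAC addresses, and the choice of the IEEE local experimental EtherType 0x88B5 (picked so that the reader should receive nothing) are all assumptions.

```python
import errno
import os
import socket
import threading
import time

# Assumption: nothing else on the test LAN sends this EtherType,
# so the blocked reader never receives a frame.
EXPERIMENTAL_ETHERTYPE = 0x88B5  # IEEE 802 local experimental EtherType
MIN_FRAME_LEN = 60               # minimum Ethernet frame length, excluding the FCS


def build_frame(dst=b"\xff" * 6, src=b"\x02\x00\x00\x00\x00\x01"):
    """Return a zero-padded minimum-size Ethernet frame (placeholder MACs)."""
    header = dst + src + EXPERIMENTAL_ETHERTYPE.to_bytes(2, "big")
    return header.ljust(MIN_FRAME_LEN, b"\x00")


def open_packet_socket(ifname):
    """Open a raw packet socket bound to one NIC; needs root or CAP_NET_RAW."""
    sock = socket.socket(socket.AF_PACKET, socket.SOCK_RAW,
                         socket.htons(EXPERIMENTAL_ETHERTYPE))
    sock.bind((ifname, 0))  # protocol 0 keeps the protocol given at creation
    return sock


def idle_reader(sock):
    """Block in recv(); with no matching inbound traffic it never returns."""
    while True:
        sock.recv(2048)


def run_writer(ifname, count, with_reader, cpus=None):
    """Send `count` frames; return (writer CPU seconds, sends refused with ENOBUFS)."""
    if cpus is not None:
        # Restrict the calling thread to the given cores; threads started
        # afterwards (the reader) inherit the same mask.
        os.sched_setaffinity(0, cpus)
    sock = open_packet_socket(ifname)
    if with_reader:
        threading.Thread(target=idle_reader, args=(sock,), daemon=True).start()
    frame = build_frame()
    refused = 0
    start = time.thread_time()  # CPU time of the writer thread only
    for _ in range(count):
        try:
            sock.send(frame)
        except OSError as exc:
            # A dropped transmit may surface as ENOBUFS; anything else is fatal.
            if exc.errno != errno.ENOBUFS:
                raise
            refused += 1
    return time.thread_time() - start, refused
```

Calling run_writer() once with with_reader=False and once with with_reader=True on the same interface, and comparing the returned CPU seconds, gives a rough version of the comparison in this report. The daemon reader thread keeps its socket open until the process exits.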
Actual results:
Idle reader thread waiting on socket consumes
up to 25% of a core and writer thread consumes
10-25% more CPU than it does when no reader is
Expected results:
Idle reader thread waiting on socket should consume
zero CPU and writer thread should consume less CPU
and be less prone to dropping data.
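
For anyone re-testing, the per-thread CPU figures that 'top' shows in threads view can also be sampled directly from /proc. The sketch below assumes the Linux /proc/<pid>/task/<tid>/stat layout documented in proc(5), where fields 14 and 15 are utime and stime in clock ticks; the function names are illustrative only.

```python
import os
import time


def parse_stat_ticks(stat_line):
    """Return (utime, stime) in clock ticks from one /proc stat line."""
    # The command name (field 2) may contain spaces and parentheses, so
    # split after the last ')' and count fields from field 3 (state).
    rest = stat_line.rsplit(")", 1)[1].split()
    return int(rest[11]), int(rest[12])  # fields 14 and 15


def thread_ticks(pid, tid):
    """Read the current (utime, stime) tick counters of one thread."""
    with open(f"/proc/{pid}/task/{tid}/stat") as f:
        return parse_stat_ticks(f.read())


def cpu_percent(delta_ticks, interval_s, hz):
    """Convert a tick delta over an interval into percent of one core."""
    return 100.0 * delta_ticks / (hz * interval_s)


def sample_thread_cpu(pid, tid, interval_s=1.0):
    """Return (user %, system %) of one core used by the thread over the interval."""
    hz = os.sysconf("SC_CLK_TCK")
    u0, s0 = thread_ticks(pid, tid)
    time.sleep(interval_s)
    u1, s1 = thread_ticks(pid, tid)
    return (cpu_percent(u1 - u0, interval_s, hz),
            cpu_percent(s1 - s0, interval_s, hz))
```

The thread IDs are the ones 'top' lists in threads view. An idle reader that behaves as expected should report values at or near zero for both user and system time.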
Can you still reproduce this issue on RHEL5.8?
Probably. Before I spend/waste two or three
hours firing up the test configuration, are
you actually likely to fix this? It's only
been four short years since this issue was
opened, and RHEL 6 has arrived in the interim.
Correcting this issue would probably require
big changes to the socket logic, and I've
seen RH mark much simpler issues WONTFIX
due to an aversion to modifying mature
kernels that aren't seeing significant
On the other hand we're finding the newer
kernel performance is exceedingly bad,
so it would be nice to have this one
last a few more years.
If you're just looking for an excuse to
close this issue tell me now and don't
waste my time.
(In reply to comment #4)
> If your're just looking for an excuse to
> close this issue tell me now and don't
> waste my time.
I don't need an excuse to close this. If I wanted to close it, I would have done so.
I'm considering fixing this for 5.9 if it is doable without major changes.
Note that this bug report is not backed by a support case, which means it will receive much lower priority than others. You also never provided a reproducer for this problem.
Ok, I'll re-test sometime in the next ten days.
Unless significant restructuring of the socket logic has happened since the original report, it's probable that the issue still exists.
This is not a big issue for us. I reported it because it seems like the sort of thing that ought to be fixed for the general performance benefit, even under less intense loading scenarios.