According to https://www.redhat.com/archives/linux-cluster/2011-September/msg00067.html, this issue exists in RHEL 6 too.
Created attachment 535954 [details] [PATCH 1/6] fix bz742431: clarify recv/read_restart+send/write_restart
Created attachment 535955 [details] [PATCH 2/6] fix bz742431: introduce per-peer outgoing queue pruning
Created attachment 535956 [details] [PATCH 3/6] fix bz742431: limit peer's send() to one message only
Created attachment 535957 [details] [PATCH 4/6] fix bz742431: read all available with peer's receive()
Created attachment 535959 [details] [PATCH 5/6] fix bz742431: split+restructure poll handling in communicator
Created attachment 535961 [details] [PATCH 6/6] fix bz742431: turn off Nagle's alg. in peers' communication
Created attachment 535963 [details] bz742431: additional performance improvement patch [1/2]
Created attachment 535964 [details] bz742431: additional performance improvement patch [2/2]
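The titles of patches 4/6 and 6/6 name two general socket techniques: reading everything that is available when a peer becomes readable, and turning off Nagle's algorithm on peer connections. The patch contents are not reproduced in this report, so the following is only a minimal Python sketch of those two ideas, not the code from the attachments. The function names `disable_nagle` and `drain_available` and the 4096-byte chunk size are assumptions made for illustration.

```python
import socket


def disable_nagle(sock: socket.socket) -> None:
    """Turn off Nagle's algorithm so small messages are not held back for coalescing."""
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)


def drain_available(sock: socket.socket, chunk_size: int = 4096) -> bytes:
    """Read everything that is currently readable from the socket without blocking."""
    sock.setblocking(False)
    chunks = []
    while True:
        try:
            data = sock.recv(chunk_size)
        except BlockingIOError:
            break  # nothing more to read right now
        if not data:
            break  # peer closed the connection
        chunks.append(data)
    return b"".join(chunks)
```

In a poll loop, `drain_available` would be called once per readable event, so the incoming side is emptied on every turn instead of one fixed-size read at a time.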
As per Comment https://bugzilla.redhat.com/show_bug.cgi?id=618321#c75 acking this for QA using an artificial test as described.
Created attachment 541598 [details] bz742431: additional fix for a minor memory leak

This revisits the original patch, attachment 529083 [details], which was accidentally posted to bug 618321 when it should have been posted here.

Recap: the leak was triggered by connections to /var/run/clumond.sock (2 B per connection, IIRC; negligible compared with the large memory issue).
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
Cause
* trigger unknown, presumably uncommon event/attribute of the environment
Consequence
* outgoing queues in inter-nodes communication are growing over time
Fix
* better balanced inter-nodes communication + restriction of the queues
Result
* resources utilization kept at reasonable level
* possible queues interventions logged in /var/log/clumond.log
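The fix described in the technical note (restricting the outgoing queues, sending in a balanced way, and logging queue interventions) can be illustrated with a bounded per-peer queue. This is a hedged Python sketch, not the patched implementation: the class name `PeerOutQueue`, the logger name, the default limit of 8 messages, and the choice to drop the oldest messages first are all assumptions.

```python
import logging
from collections import deque

log = logging.getLogger("peer_queue")  # hypothetical logger name


class PeerOutQueue:
    """Bounded outgoing queue for one peer (illustrative only)."""

    def __init__(self, max_messages: int = 8):  # the limit is an assumed value
        self.max_messages = max_messages
        self._queue = deque()

    def enqueue(self, message: bytes) -> int:
        """Append a message, prune the oldest ones past the limit, return how many were pruned."""
        self._queue.append(message)
        pruned = 0
        while len(self._queue) > self.max_messages:
            self._queue.popleft()
            pruned += 1
        if pruned:
            # Record the intervention so that queue pruning is visible in the log.
            log.warning("pruned %d message(s) from outgoing queue", pruned)
        return pruned

    def next_to_send(self):
        """Return a single message for this poll turn, or None if the queue is empty."""
        return self._queue.popleft() if self._queue else None

    def __len__(self) -> int:
        return len(self._queue)
```

Handing out one message per call keeps a single busy peer from monopolizing a poll turn, while the size limit keeps memory bounded when a peer stops reading.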
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2012-0750.html