Bug 247733 - bug in flow control accounting can freeze cluster communication IO
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: openais
Hardware: All  OS: Linux
Priority: high  Severity: high
Assigned To: Steven Dake
Reported: 2007-07-11 01:16 EDT by Steven Dake
Modified: 2016-04-26 12:32 EDT (History)
1 user

Fixed In Version: RHBA-2007-0599
Doc Type: Bug Fix
Last Closed: 2007-11-07 12:00:07 EST

Attachments: None
Description Steven Dake 2007-07-11 01:16:03 EDT
Description of problem:
Totem has a queue with a fixed number of entries available for queueing
incoming message segments.  When a message is accepted, a determination is made
whether it should be flow controlled or sent to the executive handler for
processing.  That determination uses an estimate based upon the library request
size.  If the executive message size is larger than the library request size,
totem can be unable to queue the message: once the "real" message is generated
for transmission there is not enough room left in the queue, even though the
message has already been accepted for transmission.

In the particular case, if a message of size 4035 bytes (which requires 4 queue
entries) is sent repeatedly with cpgbench while the queue has 3 free entries,
aisexec will send the request to the executive handler instead of rejecting the
message back to the user with a "try again" error code.  When the real message
is then queued with totem, it takes up 4 spots in the queue when only 3 are
available.  Totem protects itself from memory corruption by refusing the queue
operation and returning an error code.  All services in aisexec assert when
this error code is returned, since it should never be returned, EXCEPT cpg,
which increments the outstanding reference count for flow control.  The message
is then never delivered, and delivery is what would decrement the reference
count.  As a result, the flow control value used to determine when to shut off
incoming requests reaches the shutoff point but is never decremented back to
the turn-on point, so no new messages can be queued into totem.  The net result
is a complete IO lockup for the CPG service for various message sizes (and
possibly assertions for other services).  Note that totem implements message
packing, so any number of normal requests could potentially generate this
scenario.

Version-Release number of selected component (if applicable):

How reproducible:
Hard to reproduce with RHCS; however, a modified "cpgbench" can recreate the
problem in my test network with specific byte sizes 100% of the time.

Steps to Reproduce:
1. modify cpgbench to use a 4035-byte message size
2. run on a 2-node GigE cluster with a netmtu of 8800 using jumbo frames
Actual results:
Flow control is enabled for IPC but never disabled

Expected results:
Flow control should be enabled but then disabled later once the server's output
queue has emptied sufficiently.

Additional info:
small patch.
Comment 1 RHEL Product and Program Management 2007-07-11 01:23:46 EDT
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.
Comment 5 errata-xmlrpc 2007-11-07 12:00:07 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

