Bug 1303035 - Backport error_logger improvements related to output throttling
Summary: Backport error_logger improvements related to output throttling
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: erlang
Version: 8.0 (Liberty)
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
: 8.0 (Liberty)
Assignee: Peter Lemenkov
QA Contact: yeylon@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2016-01-29 11:41 UTC by Peter Lemenkov
Modified: 2019-02-17 12:13 UTC (History)
6 users (show)

Fixed In Version: erlang-R16B-03.10min.6.el7ost
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-02-04 13:11:20 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)

Description Peter Lemenkov 2016-01-29 11:41:45 UTC
Erlang uses a generic error_logger library for logging. Currently only files and tty output is available (no syslog or Journald support so far).

Usually it looks something like that:

error_logger:error_msg("ERROR. Current State is: ~p~n", [ComplexErrorStructureToDump]).

ComplexErrorStructureToDump is an arbitrary object to print. "ERROR. Current State is: ~p~n" - is a format string. This string is almost identical to one using in a generic ANSI C printf implementations, except one interesting modifier - "~p" (sometimes "~w" which is almost the same in the context of this ticket). It tells Erlang VM that we don't know anything about the exact ComplexErrorStructureToDump structure, and we ask Erlang VM to decode and print ComplexErrorStructureToDump according to its actual structure.

There is one major issue in the example above. If ComplexErrorStructureToDump contains only bytes, is a string or an array of integers, then it's easy to print it. On the contrary if it has a complex structure, then it takes a lot of time to recognize it, create a corresponding format structure (and substitute "~p" with it in runtime), and format ComplexErrorStructureToDump according to this format structure.

Unfortunately this latter situation is more common than we want (mostly due to dynamically typed nature if the language, but that's another heated and political debate more suited for Slashdot / Reddit). For example if a critical error occurs in one of the Erlang threads, then the entire call trace, containing the full message flow led to this state is sent into error_logger for further printing. if it happens not so frequent it's not an issue, but if a situation continues to deteriorate, then error_logger got filled with messages waiting for printing.

In our case we see cases where RabbitMQ ate ~32 GByte of memory from which more than 31 goes to error_logger.

A proposed experimental solution is to limit a depth of a recursive parsing while recognizing an object's structure with a configuration switch "~p" or "~w":

http://erlang.org/doc/man/kernel_app.html#error_logger_format_depth

This experimental feature is available since Erlang 18.1 via the following commits:

* https://github.com/erlang/otp/commit/3c3a5b08589a6e54b0d2d81e470fcffaaa74c15c
* https://github.com/erlang/otp/commit/7ef23e3122467434300f340ef149f69113b617b4

In order to enable support for this feature one has to add the following kernel parameter to the application config-file (/etc/rabbitmq/rabbitmq.config in case of RabbitMQ):

...
{kernel,
  [{error_logger_format_depth, 11}, ...]
}
...

Value can't be lower than "10", and default value is "unlimited" (w/o quotes).

Comment 2 Peter Lemenkov 2016-01-29 11:57:02 UTC
Another one patch which was backported:

* https://github.com/erlang/otp/commit/7c0333ff7a559c2d0f138b3156c4a5b73e15051b


Note You need to log in before you can comment on or make changes to this bug.