Description of problem:

Errors in the fluentd logs:

2019-03-15 10:57:03 +1300 [debug]: buffer queue is full. Wait 1 second to re-emit events
2019-03-15 10:57:03 +1300 [info]: Connection opened to Elasticsearch cluster => {:host=>"logging-es", :port=>9200, :scheme=>"https", :user=>"fluentd", :password=>"obfuscated"}
2019-03-15 10:57:03 +1300 [debug]: Indexed (op = create), 1 mapper_parsing_exception
2019-03-15 10:57:03 +1300 [debug]: Elasticsearch errors returned, retrying: {"took"=>1, "errors"=>true, "items"=>[{"create"=>{"_index"=>"project.bwb.efdc9fa1-a71c-11e8-acab-0050568d7609.2019.03.04", "_type"=>"com.redhat.viaq.common", "_id"=>"NDdhYzg3OTktNmMyNi00MGUwLTk1OTgtMmNmYTViMDQzYjIx", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse [ExceptionDetail.InnerException.ErrorCode]", "caused_by"=>{"type"=>"number_format_exception", "reason"=>"For input string: \"ErrorImpersonateUserDenied\""}}}}]}
2019-03-15 10:57:03 +1300 [warn]: temporarily failed to flush the buffer. next_retry=2019-03-15 10:57:05 +1300 error_class="Fluent::ElasticsearchErrorHandler::ElasticsearchError" error="Elasticsearch returned errors, retrying. Add '@log_level debug' to your config to see the full response" plugin_id="object:3fa7f8cb110c"

Version-Release number of selected component (if applicable):

ocp v3.9.27

Additional info:

2019-03-15 11:51:27 +1300 [debug]: buffer queue is full. Wait 1 second to re-emit events
2019-03-15 11:51:28 +1300 [debug]: buffer queue is full. Wait 1 second to re-emit events
2019-03-15 11:51:29 +1300 [debug]: buffer queue is full. Wait 1 second to re-emit events
2019-03-15 11:51:30 +1300 [debug]: buffer queue is full. Wait 1 second to re-emit events
2019-03-15 11:51:31 +1300 [debug]: buffer queue is full. Wait 1 second to re-emit events
2019-03-15 11:51:32 +1300 [debug]: buffer queue is full. Wait 1 second to re-emit events
2019-03-15 11:51:33 +1300 [debug]: buffer queue is full. Wait 1 second to re-emit events
2019-03-15 11:51:34 +1300 [debug]: buffer queue is full. Wait 1 second to re-emit events

We asked the team to tune the buffer chunk settings. However, the buffer queue piling up appears to be related to the indexing errors above, or vice versa. The logging dump is attached.
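The mapper_parsing_exception above is a field type conflict: Elasticsearch dynamic mapping fixes a field's type from the first document that contains it, so once ExceptionDetail.InnerException.ErrorCode has been indexed as a number, a later record carrying the string "ErrorImpersonateUserDenied" in the same field is rejected with a 400. A minimal sketch of that behavior (the simulation and the numeric first value are illustrative, not taken from the dump):

```python
def dynamic_mapping_index(docs):
    """Crude simulation of Elasticsearch dynamic mapping: the first value
    seen for a field locks in its type; later string values in a field
    mapped as numeric produce a mapper_parsing_exception."""
    mapping = {}   # field name -> Python type of first value seen
    errors = []
    for doc in docs:
        for field, value in doc.items():
            expected = mapping.setdefault(field, type(value))
            if expected is int and not isinstance(value, int):
                errors.append({
                    "type": "mapper_parsing_exception",
                    "reason": f"failed to parse [{field}]",
                    "caused_by": {
                        "type": "number_format_exception",
                        "reason": f'For input string: "{value}"',
                    },
                })
    return errors

# One app logs ErrorCode as a number, another as a string -> 400 on the second
errs = dynamic_mapping_index([
    {"ExceptionDetail.InnerException.ErrorCode": 42},
    {"ExceptionDetail.InnerException.ErrorCode": "ErrorImpersonateUserDenied"},
])
```

Because fluentd keeps retrying the rejected chunk, the buffer queue behind it fills up, which is why the "buffer queue is full" messages accompany the parsing errors.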
This is a duplicate of bug 1668338, which is caused by MERGE_JSON_LOG and is the reason we proactively disabled that feature. The rejected logs are being silently dropped, and fluentd will be unable to push new logs until the MERGE feature is disabled and the buffers clear out. *** This bug has been marked as a duplicate of bug 1668338 ***
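For reference, a sketch of disabling the feature on the fluentd daemonset (the "logging" namespace and the component label are assumptions based on a standard OCP 3.9 aggregated-logging deployment; adjust for the actual cluster):

```shell
# Turn off JSON merging on the fluentd daemonset
# (namespace assumed to be "logging" on OCP 3.9)
oc -n logging set env daemonset/logging-fluentd MERGE_JSON_LOG=false

# Restart the fluentd pods so the new env var takes effect
# and the queued buffers can drain
oc -n logging delete pods -l component=fluentd
```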