+++ This bug was initially created as a clone of Bug #1593297 +++

The queue is still blocked in openshift3/logging-fluentd/images/v3.9.31-2.

Steps:
1. Put the attached buffer file under /var/lib/fluentd
2. Set the environment variables:
   oc set env ds/logging-fluentd ENABLE_UTF8_FILTER=true
   oc set env ds/logging-fluentd DEBUG=true VERBOSE=true DEBUG_TRANSPORTER=true TRANSPORTER_LOG_LEVEL=debug
   oc set env ds/logging-fluentd BUFFER_QUEUE_LIMIT=40 FILE_BUFFER_LIMIT=320Mi
3. Watch the buffer queue size:
   # ls /var/lib/fluentd/buffer-output-es-config.output_tag.* | wc -l
   36
4. Watch the log:

2018-06-19 15:30:32 +0800 [warn]: fluent/root_agent.rb:187:emit_error_event: dump an error event: error_class=ArgumentError error="invalid byte sequence in UTF-8" location="/usr/share/gems/gems/fluent-plugin-elasticsearch-1.16.1/lib/fluent/plugin/out_elasticsearch.rb:383:in `downcase'" tag="output_tag" time=1522079957 record={"systemd"=>{"t"=>{"BOOT_ID"=>"f4becb7fb2ae42b8abb3076654c926cc", "CAP_EFFECTIVE"=>"1fffffffff", "CMDLINE"=>"/usr/bin/dockerd-current --add-runtime docker-runc=/usr/libexec/docker/docker-runc-current --default-runtime=docker-runc --authorization-plugin=rhel-push-plugin --exec-opt native.cgroupdriver=systemd --userland-proxy-path=/usr/libexec/docker/docker-proxy-current --seccomp-profile=/etc/docker/seccomp.json --selinux-enabled --log-driver json-file --log-opt max-size=50m --signature-verification=False --storage-driver devicemapper --storage-opt dm.fs=xfs --storage-opt dm.thinpooldev=/dev/mapper/docker_vg-docker--pool --storage-opt dm.use_deferred_removal=true --storage-opt dm.use_deferred_deletion=true --storage-opt dm.basesize=3G --mtu=8951 --add-registry registry.access.redhat.com --add-registry registry.access.redhat.com", "COMM"=>"dockerd-current", "EXE"=>"/usr/bin/dockerd-current", "GID"=>"0", "MACHINE_ID"=>"d52c597d0f1a42aeb01b5a7d71e63f24", "PID"=>"2961", "SELINUX_CONTEXT"=>"system_u:system_r:init_t:s0", "SYSTEMD_CGROUP"=>"/system.slice/docker.service", "SYSTEMD_SLICE"=>"system.slice", "SYSTEMD_UNIT"=>"docker.service", "TRANSPORT"=>"stdout", "UID"=>"0"}, "u"=>{"SYSLOG_FACILITY"=>"3", "SYSLOG_IDENTIFIER"=>"dockerd-current"}}, "level"=>"info", "message"=>"time=\"2018-03-26T15:59:17.716724325Z\" level=error msg=\"Failed to log msg \\\" 2018-03-26 15:58:49 +0000 [warn]: suppressed same stacktrace\\\" for logger json-file: write /var/lib/docker/containers/a3fca2c0a2f4f44321a7d8165b0cd70c041360ae31960e0959ac8cc7f25dae46/a3fca2c0a2f4f44321a7d8165b0cd70c041360ae31960e0959ac8cc7f25dae46-json.log: no space left on device\"", "hostname"=>"ip-172-31-31-141.us-west-1.compute.internal", "pipeline_metadata"=>{"collector"=>{"ipaddr4"=>"10.131.60.182", "ipaddr6"=>"fe80::c802:5fff:fe7a:629a", "inputname"=>"fluent-plugin-systemd", "name"=>"fluentd", "received_at"=>"2018-03-26T16:00:23.009320+00:00", "version"=>"0.12.42 1.6.0"}}, "@timestamp"=>"2018-03-26T15:59:17.000000+00:00", 90=>"\u0018\u05C8\xA7systemd\x82\xA1t\x8E\xA7BOOT_ID\xC4 "}
2018-06-19 15:30:34 +0800 [warn]: fluent/output.rb:381:rescue in try_flush: temporarily failed to flush the buffer. next_retry=2018-06-19 15:35:31 +0800 error_class="Fluent::ElasticsearchErrorHandler::ElasticsearchError" error="The number of records submitted 6837 do not match the number returned 6835. Unable to process bulk response." plugin_id="object:3fdebe407fa4"
2018-06-19 15:30:34 +0800 [warn]: suppressed same stacktrace
2018-06-19 15:30:43 +0800 [info]: plugin/filter_kubernetes_metadata.rb:109:dump_stats: stats - namespace_cache_size: 2, pod_cache_size: 3, namespace_cache_api_updates: 3, pod_cache_api_updates: 3, id_cache_miss: 3
2018-06-19 15:31:13 +0800 [info]: plugin/filter_kubernetes_metadata.rb:109:dump_stats: stats - namespace_cache_size: 2, pod_cache_size: 3, namespace_cache_api_updates: 3, pod_cache_api_updates: 3, id_cache_miss: 3
2018-06-19 15:31:43 +0800 [info]: plugin/filter_kubernetes_metadata.rb:109:dump_stats: stats - namespace_cache_size: 2, pod_cache_size: 3, namespace_cache_api_updates: 3, pod_cache_api_updates: 3, id_cache_miss: 3
2018-06-19 15:32:13 +0800 [info]: plugin/filter_kubernetes_metadata.rb:109:dump_stats: stats - namespace_cache_size: 2, pod_cache_size: 3, namespace_cache_api_updates: 3, pod_cache_api_updates: 3, id_cache_miss: 3
2018-06-19 15:32:43 +0800 [info]: plugin/filter_kubernetes_metadata.rb:109:dump_stats: stats - namespace_cache_size: 2, pod_cache_size: 3, namespace_cache_api_updates: 3, pod_cache_api_updates: 3, id_cache_miss: 3
2018-06-19 15:33:13 +0800 [info]: plugin/filter_kubernetes_metadata.rb:109:dump_stats: stats - namespace_cache_size: 2, pod_cache_size: 3, namespace_cache_api_updates: 3, pod_cache_api_updates: 3, id_cache_miss: 3
2018-06-19 15:33:43 +0800 [info]: plugin/filter_kubernetes_metadata.rb:109:dump_stats: stats - namespace_cache_size: 2, pod_cache_size: 3, namespace_cache_api_updates: 3, pod_cache_api_updates: 3, id_cache_miss: 3
2018-06-19 15:34:13 +0800 [info]: plugin/filter_kubernetes_metadata.rb:109:dump_stats: stats - namespace_cache_size: 2, pod_cache_size: 3, namespace_cache_api_updates: 3, pod_cache_api_updates: 3, id_cache_miss: 3
2018-06-19 15:34:43 +0800 [info]: plugin/filter_kubernetes_metadata.rb:109:dump_stats: stats - namespace_cache_size: 2, pod_cache_size: 3, namespace_cache_api_updates: 3, pod_cache_api_updates: 3, id_cache_miss: 3
2018-06-19 15:35:13 +0800 [info]: plugin/filter_kubernetes_metadata.rb:109:dump_stats: stats - namespace_cache_size: 2, pod_cache_size: 3, namespace_cache_api_updates: 3, pod_cache_api_updates: 3, id_cache_miss: 3
2018-06-19 15:35:32 +0800 [warn]: fluent/root_agent.rb:187:emit_error_event: dump an error event: error_class=ArgumentError error="invalid byte sequence in UTF-8" location="/usr/share/gems/gems/fluent-plugin-elasticsearch-1.16.1/lib/fluent/plugin/out_elasticsearch.rb:383:in `downcase'" tag="output_tag" time=1522079957 record={"systemd"=>{"t"=>{"BOOT_ID"=>"f4becb7fb2ae42b8abb3076654c926cc", "CAP_EFFECTIVE"=>"1fffffffff", "CMDLINE"=>"/usr/bin/dockerd-current --add-runtime docker-runc=/usr/libexec/docker/docker-runc-current --default-runtime=docker-runc --authorization-plugin=rhel-push-plugin --exec-opt native.cgroupdriver=systemd --userland-proxy-path=/usr/libexec/docker/docker-proxy-current --seccomp-profile=/etc/docker/seccomp.json --selinux-enabled --log-driver json-file --log-opt max-size=50m --signature-verification=False --storage-driver devicemapper --storage-opt dm.fs=xfs --storage-opt dm.thinpooldev=/dev/mapper/docker_vg-docker--pool --storage-opt dm.use_deferred_removal=true --storage-opt dm.use_deferred_deletion=true --storage-opt dm.basesize=3G --mtu=8951 --add-registry registry.access.redhat.com --add-registry registry.access.redhat.com", "COMM"=>"dockerd-current", "EXE"=>"/usr/bin/dockerd-current", "GID"=>"0", "MACHINE_ID"=>"d52c597d0f1a42aeb01b5a7d71e63f24", "PID"=>"2961", "SELINUX_CONTEXT"=>"system_u:system_r:init_t:s0", "SYSTEMD_CGROUP"=>"/system.slice/docker.service", "SYSTEMD_SLICE"=>"system.slice", "SYSTEMD_UNIT"=>"docker.service", "TRANSPORT"=>"stdout", "UID"=>"0"}, "u"=>{"SYSLOG_FACILITY"=>"3", "SYSLOG_IDENTIFIER"=>"dockerd-current"}}, "level"=>"info", "message"=>"time=\"2018-03-26T15:59:17.716724325Z\" level=error msg=\"Failed to log msg \\\" 2018-03-26 15:58:49 +0000 [warn]: suppressed same stacktrace\\\" for logger json-file: write /var/lib/docker/containers/a3fca2c0a2f4f44321a7d8165b0cd70c041360ae31960e0959ac8cc7f25dae46/a3fca2c0a2f4f44321a7d8165b0cd70c041360ae31960e0959ac8cc7f25dae46-json.log: no space left on device\"", "hostname"=>"ip-172-31-31-141.us-west-1.compute.internal", "pipeline_metadata"=>{"collector"=>{"ipaddr4"=>"10.131.60.182", "ipaddr6"=>"fe80::c802:5fff:fe7a:629a", "inputname"=>"fluent-plugin-systemd", "name"=>"fluentd", "received_at"=>"2018-03-26T16:00:23.009320+00:00", "version"=>"0.12.42 1.6.0"}}, "@timestamp"=>"2018-03-26T15:59:17.000000+00:00", 90=>"\u0018\u05C8\xA7systemd\x82\xA1t\x8E\xA7BOOT_ID\xC4 "}
2018-06-19 15:35:34 +0800 [warn]: fluent/output.rb:381:rescue in try_flush: temporarily failed to flush the buffer. next_retry=2018-06-19 15:40:31 +0800 error_class="Fluent::ElasticsearchErrorHandler::ElasticsearchError" error="The number of records submitted 6837 do not match the number returned 6835. Unable to process bulk response." plugin_id="object:3fdebe407fa4"
2018-06-19 15:35:34 +0800 [warn]: suppressed same stacktrace
2018-06-19 15:35:43 +0800 [info]: plugin/filter_kubernetes_metadata.rb:109:dump_stats: stats - namespace_cache_size: 2, pod_cache_size: 3, namespace_cache_api_updates: 3, pod_cache_api_updates: 3, id_cache_miss: 3
2018-06-19 15:36:13 +0800 [info]: plugin/filter_kubernetes_metadata.rb:109:dump_stats: stats - namespace_cache_size: 2, pod_cache_size: 3, namespace_cache_api_updates: 3, pod_cache_api_updates: 3, id_cache_miss: 3
2018-06-19 15:36:43 +0800 [info]: plugin/filter_kubernetes_metadata.rb:109:dump_stats: stats - namespace_cache_size: 2, pod_cache_size: 3, namespace_cache_api_updates: 3, pod_cache_api_updates: 3, id_cache_miss: 3
2018-06-19 15:37:13 +0800 [info]: plugin/filter_kubernetes_metadata.rb:109:dump_stats: stats - namespace_cache_size: 2, pod_cache_size: 3, namespace_cache_api_updates: 3, pod_cache_api_updates: 3, id_cache_miss: 3
2018-06-19 15:37:43 +0800 [info]: plugin/filter_kubernetes_metadata.rb:109:dump_stats: stats - namespace_cache_size: 2, pod_cache_size: 3, namespace_cache_api_updates: 3, pod_cache_api_updates: 3, id_cache_miss: 3
2018-06-19 15:38:13 +0800 [info]: plugin/filter_kubernetes_metadata.rb:109:dump_stats: stats - namespace_cache_size: 2, pod_cache_size: 3, namespace_cache_api_updates: 3, pod_cache_api_updates: 3, id_cache_miss: 3

--- Additional comment from Jeff Cantrill on 2018-06-20 10:12:15 EDT ---

requires an official image build with rubygem-fluent-plugin-elasticsearch-1.17.0-1.el7
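The queue check in step 3 can be scripted for watching over time; a minimal sketch (count_chunks and the BUFFER_DIR override are hypothetical helpers for illustration, not part of the image):

```shell
# Count queued file-buffer chunks for the elasticsearch output (step 3 above).
# A queue blocked by an unflushable record shows a constant or growing count
# across successive checks. BUFFER_DIR defaults to the path used in the steps.
BUFFER_DIR=${BUFFER_DIR:-/var/lib/fluentd}

count_chunks() {
  # 2>/dev/null hides the ls error when no chunk files exist yet
  ls "$BUFFER_DIR"/buffer-output-es-config.output_tag.* 2>/dev/null | wc -l
}

count_chunks
```

Run it under `watch -n 30` (or inside the fluentd pod via `oc exec`) to see whether the count ever drains.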
Waiting for an image that includes the fix.
Please change to ON_QA; the issue is fixed. Used the buffer file from https://bugzilla.redhat.com/attachment.cgi?id=1417859 and followed the steps in Comment 0; all buffer records are read without the "number of records submitted do not match the number returned" error in the fluentd pod logs.

fluentd image version: logging-fluentd-v3.7.58-1
# docker run -it brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/logging-fluentd:v3.7.58-1 rpm -qa | grep rubygem-fluent-plugin-elasticsearch
rubygem-fluent-plugin-elasticsearch-1.17.0-1.el7.noarch
Change back to MODIFIED: the latest logging-fluentd v3.7.59.2 is not built with rubygem-fluent-plugin-elasticsearch-1.17.0-1.el7.noarch.rpm, but with rubygem-fluent-plugin-elasticsearch-1.15.2-1.el7.noarch. A corrected fluentd image is needed.

# docker run -it brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/logging-fluentd:v3.7.59.2 rpm -qa | grep rubygem-fluent-plugin-elasticsearch
rubygem-fluent-plugin-elasticsearch-1.15.2-1.el7.noarch

REPOSITORY                                                                        TAG        IMAGE ID      CREATED       SIZE
brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/logging-fluentd   v3.7       268fc822a76a  10 hours ago  286.1 MB
brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/logging-fluentd   v3.7.59-2  268fc822a76a  10 hours ago  286.1 MB
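The plugin-version check above can be automated so image rebuilds are verified mechanically; a minimal sketch, assuming docker access to the brew registry (ver_ge is a hypothetical helper; the docker invocation is shown as a comment, and the comparison needs GNU `sort -V`):

```shell
# Check whether an image carries fluent-plugin-elasticsearch >= 1.17.0,
# the version that contains the fix for this bug.
ver_ge() {
  # returns 0 (true) if $1 >= $2 in version order; relies on GNU sort -V
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# In practice, query the version from the image, e.g.:
# got=$(docker run --rm "$IMG" rpm -q --qf '%{VERSION}' rubygem-fluent-plugin-elasticsearch)
got=1.15.2   # version seen in logging-fluentd v3.7.59-2 above

if ver_ge "$got" 1.17.0; then
  echo "image has the fix ($got)"
else
  echo "image is missing the fix ($got)"
fi
```

With `got=1.15.2` this reports the image as missing the fix, matching the comment above.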
(In reply to Junqi Zhao from comment #11)
> Change back to MODIFIED, the latest logging-fluentd v3.7.59.2 is not built

Correction: the latest is logging-fluentd v3.7.59-2.
The fix is in v3.7.58, but the errata will release v3.7.57, so the bug is moved back to MODIFIED.
Verified on v3.7.61.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:2337