Bug 1478821 - Fluentd log is filled up with warnings about "log unreadable. It is excluded and would be examined next time." when using json-file log driver
Fluentd log is filled up with warnings about "log unreadable. It is excluded ...
Status: NEW
Product: OpenShift Container Platform
Classification: Red Hat
Component: Logging (Show other bugs)
3.6.0
Unspecified Unspecified
low Severity low
: ---
: ---
Assigned To: Jeff Cantrill
Xia Zhao
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2017-08-07 04:01 EDT by Xia Zhao
Modified: 2017-08-22 08:23 EDT (History)
5 users (show)

See Also:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)
fluentd_log_#2 (38.75 KB, text/plain)
2017-08-07 04:01 EDT, Xia Zhao
no flags Details
fluentd_log_#1 (348.40 KB, text/plain)
2017-08-07 04:01 EDT, Xia Zhao
no flags Details
es log (13.00 KB, text/plain)
2017-08-07 04:01 EDT, Xia Zhao
no flags Details

  None (edit)
Description Xia Zhao 2017-08-07 04:01:02 EDT
Created attachment 1309952 [details]
fluentd_log_#2

Description of problem:
Deploy logging 3.6 stacks on OCP env with json-file docker log driver. Wait for a while after EFK pods are started up, check the fluentd log, it is filled up with warnings like this:

2017-08-07 03:20:50 -0400 [warn]: /var/log/containers/java-mainclass-1-7hz6s_java_java-mainclass-bd852c77e9ddba7bf090aa2070fd325939ec6f622834c9553f301120c768ea52.log unreadable. It is excluded and would be examined next time.

Checked on node that the files/linked files should be accessible (file right is set to 777):
# ls -l /var/log/containers/java-mainclass-1-7hz6s_java_java-mainclass-bd852c77e9ddba7bf090aa2070fd325939ec6f622834c9553f301120c768ea52.log
lrwxrwxrwx. 1 root root 71 8月   7 03:07 /var/log/containers/java-mainclass-1-7hz6s_java_java-mainclass-bd852c77e9ddba7bf090aa2070fd325939ec6f622834c9553f301120c768ea52.log -> /var/log/pods/02ef8522-7b3f-11e7-b1f4-fa163e8691cf/java-mainclass_0.log

# ls -l /var/log/pods/02ef8522-7b3f-11e7-b1f4-fa163e8691cf/java-mainclass_0.log
lrwxrwxrwx. 1 root root 165 8月   7 03:07 /var/log/pods/02ef8522-7b3f-11e7-b1f4-fa163e8691cf/java-mainclass_0.log -> /var/lib/docker/containers/bd852c77e9ddba7bf090aa2070fd325939ec6f622834c9553f301120c768ea52/bd852c77e9ddba7bf090aa2070fd325939ec6f622834c9553f301120c768ea52-json.log


Version-Release number of selected component (if applicable):
logging-fluentd                           v3.6.173.0.5-3      81d989e6ab3f        3 days ago          232 MB

How reproducible:
Always

Steps to Reproduce:
1.Deploy logging 3.6 stacks on OCP env with json-file docker log driver
2.Wait for a while after EFK pods are started up, check the fluentd log

Actual results:
Fluentd log is filled up with warnings about "log unreadable. It is excluded and would be examined next time."

Expected results:
Fluentd log should not be filled up with warnings about "log unreadable. It is excluded and would be examined next time."

Additional info:
1. fluent and es logs are attached.
2. Issue did not repro to journald log driver
Comment 1 Xia Zhao 2017-08-07 04:01 EDT
Created attachment 1309953 [details]
fluentd_log_#1
Comment 2 Xia Zhao 2017-08-07 04:01 EDT
Created attachment 1309955 [details]
es log
Comment 3 Eduardo Minguez 2017-08-07 08:17:09 EDT
It seems to affect OCP 3.4 as well:

[cloud-user@bastion ~]$ oc version
oc v3.4.1.44
kubernetes v1.4.0+776c994
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://ocp.edu.example.com:8443
openshift v3.4.1.44
kubernetes v1.4.0+776c994

[root@app-node-2 ~]# rpm -qa | grep -i openshift
openshift-ansible-3.4.119-1.git.0.2b36c8a.el7.noarch
openshift-ansible-roles-3.4.119-1.git.0.2b36c8a.el7.noarch
atomic-openshift-3.4.1.44-1.git.0.e98176f.el7.x86_64
atomic-openshift-excluder-3.4.1.44-1.git.0.e98176f.el7.noarch
atomic-openshift-docker-excluder-3.4.1.44-1.git.0.e98176f.el7.noarch
openshift-ansible-docs-3.4.119-1.git.0.2b36c8a.el7.noarch
openshift-ansible-filter-plugins-3.4.119-1.git.0.2b36c8a.el7.noarch
openshift-ansible-callback-plugins-3.4.119-1.git.0.2b36c8a.el7.noarch
openshift-ansible-playbooks-3.4.119-1.git.0.2b36c8a.el7.noarch
atomic-openshift-clients-3.4.1.44-1.git.0.e98176f.el7.x86_64
atomic-openshift-node-3.4.1.44-1.git.0.e98176f.el7.x86_64
openshift-ansible-lookup-plugins-3.4.119-1.git.0.2b36c8a.el7.noarch
atomic-openshift-utils-3.4.119-1.git.0.2b36c8a.el7.noarch
tuned-profiles-atomic-openshift-node-3.4.1.44-1.git.0.e98176f.el7.x86_64

[cloud-user@bastion ~]$ oc get pods
NAME                          READY     STATUS      RESTARTS   AGE
logging-curator-1-vcsfq       1/1       Running     0          1h
logging-deployer-kxxgh        0/1       Completed   0          1h
logging-es-8bnwfrsq-1-zi1cl   1/1       Running     0          1h
logging-es-ps2gwwso-1-qathy   1/1       Running     0          1h
logging-es-swdwnrcl-1-ec2oo   1/1       Running     0          1h
logging-fluentd-1togp         1/1       Running     0          53m
logging-fluentd-esl4p         1/1       Running     0          54m
logging-fluentd-hygkb         1/1       Running     0          54m
logging-fluentd-s0gmf         1/1       Running     0          54m
logging-fluentd-tmiel         1/1       Running     0          54m
logging-fluentd-to3wi         1/1       Running     0          54m
logging-kibana-1-ce07t        2/2       Running     1          1h


[cloud-user@bastion ~]$ oc logs logging-fluentd-s0gmf
umounts of dead containers will fail. Ignoring...
umount: /var/lib/docker/containers/*/shm: mountpoint not found
2017-08-07 07:21:51 -0400 [info]: reading config file path="/etc/fluent/fluent.conf"
2017-08-07 07:22:00 -0400 [warn]: /var/log/containers/logging-deployer-kxxgh_logging_POD-c2d96d5bd64f4432fea176e3f7eb382bb3b72d927f0d3406e1292e439747ef21.log unreadable. It is excluded and would be examined next time.
2017-08-07 07:22:00 -0400 [warn]: /var/log/containers/logging-es-swdwnrcl-1-ec2oo_logging_elasticsearch-796ade94d1096d7acaf31cac89a6907f7e7effe874e8b846bb59e364d062887e.log unreadable. It is excluded and would be examined next time.
2017-08-07 07:22:00 -0400 [warn]: /var/log/containers/logging-fluentd-s0gmf_logging_fluentd-elasticsearch-4df4382ea7d99c333abf745577dda33c6f5e16b54e55054363e8696c57c7080c.log unreadable. It is excluded and would be examined next time.
2017-08-07 07:22:00 -0400 [warn]: /var/log/containers/logging-fluentd-7wj4e_logging_POD-4753aac9c86485017963d0e795a3c25187f030792a8aa257df411ddc66bd682e.log unreadable. It is excluded and would be examined next time.
...
2017-08-07 08:16:00 -0400 [warn]: /var/log/containers/logging-es-swdwnrcl-1-ec2oo_logging_elasticsearch-796ade94d1096d7acaf31cac89a6907f7e7effe874e8b846bb59e364d062887e.log unreadable. It is excluded and would be examined next time.
2017-08-07 08:16:00 -0400 [warn]: /var/log/containers/logging-fluentd-s0gmf_logging_fluentd-elasticsearch-4df4382ea7d99c333abf745577dda33c6f5e16b54e55054363e8696c57c7080c.log unreadable. It is excluded and would be examined next time.
2017-08-07 08:16:00 -0400 [warn]: /var/log/containers/skydive-agent-bivvj_skydive_skydive-agent-1debce5fc61a007a66f5eef6dc4d0865492a81c1348f050b7d11ae127dc1d728.log unreadable. It is excluded and would be examined next time.
2017-08-07 08:16:00 -0400 [warn]: /var/log/containers/logging-fluentd-s0gmf_logging_POD-a4ac77f49176215e498182a8a7d82440ba53db37cfa92e393227e7f4d99bde13.log unreadable. It is excluded and would be examined next time.
2017-08-07 08:16:00 -0400 [warn]: /var/log/containers/logging-es-swdwnrcl-1-ec2oo_logging_POD-549de4119b74944c4434f508d7165726836ead3267167b3a474a4e16cd427235.log unreadable. It is excluded and would be examined next time.
2017-08-07 08:16:00 -0400 [warn]: /var/log/containers/logging-deployer-kxxgh_logging_deployer-946969b34f067d5aee9188f354f3070298174b1e32c31c7a382412677bf71a04.log unreadable. It is excluded and would be examined next time.
2017-08-07 08:16:00 -0400 [warn]: /var/log/containers/router-1-j8hfz_default_router-8a00745e16624724d4285e4d86d3a90831cc6cec1bc1e438584352c0f6224905.log unreadable. It is excluded and would be examined next time.
2017-08-07 08:16:00 -0400 [warn]: /var/log/containers/logging-curator-1-vcsfq_logging_curator-36aa5a17a0da5c7b24c4b5718a3e9d2e85e936474727ce12664adb6e7c36884a.log unreadable. It is excluded and would be examined next time.
2017-08-07 08:16:00 -0400 [warn]: /var/log/containers/router-1-j8hfz_default_POD-2f57d772b48fe263eb8a5f7229131730b6d6b8bcaad07f3abe48e9a2dcea9b22.log unreadable. It is excluded and would be examined next time.
2017-08-07 08:16:00 -0400 [warn]: /var/log/containers/skydive-agent-bivvj_skydive_POD-53e631b6d8222618e500552a71fb981792af725f627eea14fd9e10b37eff6c5b.log unreadable. It is excluded and would be examined next time.
2017-08-07 08:16:00 -0400 [warn]: /var/log/containers/logging-curator-1-vcsfq_logging_POD-f60a3ebd1d24813347929dd69fd50a05c28774b327d7fa0d07898ef6da634666.log unreadable. It is excluded and would be examined next time.
Comment 4 Eduardo Minguez 2017-08-07 08:34:56 EDT
It looks like the /var/lib/docker/containers mountpoint is empty:

[cloud-user@bastion ~]$ oc describe pod logging-fluentd-dpo7t | grep -i lib
      /var/lib/docker/containers from varlibdockercontainers (ro)
  varlibdockercontainers:
    Path:	/var/lib/docker/containers
[cloud-user@bastion ~]$ oc get pods -o wide | grep dpo7t
logging-fluentd-dpo7t         1/1       Running     1          9m        172.30.91.4    infra-node-0.control.edu.example.com

[cloud-user@bastion ~]$ oc rsh logging-fluentd-dpo7t
sh-4.2# cd /var/lib/docker/containers/
sh-4.2# ls
sh-4.2# ls -la
total 0
drwxr-xr-x. 2 root root  6 Aug  7 08:29 .
drwxr-xr-x. 3 root root 24 Aug  7 08:29 ..

And in the infra-node-0:

15058ae63bbfe1333a43eaaca31bf11bcf80b6896170ea2ba561c353865f463b
1b0e42d3528d0d788395a42c3ee4e40e3d9f0989eea79f4042698ea5eccedf30
1eae6a3459557a21852dd2f41440f23303ec5ae26847da25f7f12fc4d6c1a70b
5619f6146280a948e581da1addf8f60422e3d01edb7f072d1d3bb562b2388115
7a87481a121cc26697c2e06ca3a111578f8218cae925e75a58ec73a534bf167c
8d73a14adc26d31f02102d5b0609ab73a811ba14a6876414b995ee2bf040a171
9871b1c7972ab136b02724b5fadc4234cba28e8c82260e4e1801ab285b322a3a
989f5db8514c446ea201d3c84ba2ea5a0f24f060ee4fc11bcd98579658c7fb22
9ad5005e624a7737c142f1907fb33350280634e7198a1936fae84fbbe542b658
9ff97b92b8ca87318d90ce487bd3f53e05a3b6319f028a11a6c513217d14e72b
c5cc7607583c553dd52bd11badd8ad31fc2e32f70f5f2269fdf808910bb20a35
d7e51149e2255445eff8288117bee868867a005fad520c4f7def54a7bab71a71
e3d5d457ee9e841ecc47539cf582875a3a0c88d49c81842dda248c6d11abe70a
[root@infra-node-0 containers]# find . -type f -iname *.log
./7a87481a121cc26697c2e06ca3a111578f8218cae925e75a58ec73a534bf167c/7a87481a121cc26697c2e06ca3a111578f8218cae925e75a58ec73a534bf167c-json.log
./c5cc7607583c553dd52bd11badd8ad31fc2e32f70f5f2269fdf808910bb20a35/c5cc7607583c553dd52bd11badd8ad31fc2e32f70f5f2269fdf808910bb20a35-json.log
./e3d5d457ee9e841ecc47539cf582875a3a0c88d49c81842dda248c6d11abe70a/e3d5d457ee9e841ecc47539cf582875a3a0c88d49c81842dda248c6d11abe70a-json.log
./989f5db8514c446ea201d3c84ba2ea5a0f24f060ee4fc11bcd98579658c7fb22/989f5db8514c446ea201d3c84ba2ea5a0f24f060ee4fc11bcd98579658c7fb22-json.log
./15058ae63bbfe1333a43eaaca31bf11bcf80b6896170ea2ba561c353865f463b/15058ae63bbfe1333a43eaaca31bf11bcf80b6896170ea2ba561c353865f463b-json.log
./d7e51149e2255445eff8288117bee868867a005fad520c4f7def54a7bab71a71/d7e51149e2255445eff8288117bee868867a005fad520c4f7def54a7bab71a71-json.log
./9ad5005e624a7737c142f1907fb33350280634e7198a1936fae84fbbe542b658/9ad5005e624a7737c142f1907fb33350280634e7198a1936fae84fbbe542b658-json.log
./1eae6a3459557a21852dd2f41440f23303ec5ae26847da25f7f12fc4d6c1a70b/1eae6a3459557a21852dd2f41440f23303ec5ae26847da25f7f12fc4d6c1a70b-json.log
./8d73a14adc26d31f02102d5b0609ab73a811ba14a6876414b995ee2bf040a171/8d73a14adc26d31f02102d5b0609ab73a811ba14a6876414b995ee2bf040a171-json.log
./9ff97b92b8ca87318d90ce487bd3f53e05a3b6319f028a11a6c513217d14e72b/9ff97b92b8ca87318d90ce487bd3f53e05a3b6319f028a11a6c513217d14e72b-json.log
./9871b1c7972ab136b02724b5fadc4234cba28e8c82260e4e1801ab285b322a3a/9871b1c7972ab136b02724b5fadc4234cba28e8c82260e4e1801ab285b322a3a-json.log
./5619f6146280a948e581da1addf8f60422e3d01edb7f072d1d3bb562b2388115/5619f6146280a948e581da1addf8f60422e3d01edb7f072d1d3bb562b2388115-json.log
./1b0e42d3528d0d788395a42c3ee4e40e3d9f0989eea79f4042698ea5eccedf30/1b0e42d3528d0d788395a42c3ee4e40e3d9f0989eea79f4042698ea5eccedf30-json.log


Restorecon shows log files are not properly flagged:

[root@app-node-2 ~]# ls -lZ /var/lib/docker/containers/
drwx------. root root system_u:object_r:container_var_lib_t:s0 0324acfa995d8d170fef350ac7c7fc083bbe3d102c8dbf614ac912676635e744
drwx------. root root system_u:object_r:container_var_lib_t:s0 a6216825dfcc9e20d55b5da9031a8e7fb776e64e4765b213c543d581b01a7c41
drwx------. root root system_u:object_r:container_var_lib_t:s0 e250508c319060139cbb26991b02157c44f3b04c6d48fc3d4238c65312c30124
drwx------. root root system_u:object_r:container_var_lib_t:s0 f2d8cbe979f4af4b4a1e95c983fc481c41dc439ae475b13645f1ba0bd4372c36
drwx------. root root system_u:object_r:container_var_lib_t:s0 f66543b868b8fc267cfbabfe2ac5a864771607708f4935370ca3431f3a762935
drwx------. root root system_u:object_r:container_var_lib_t:s0 fc4cac63b4234841bff4c37f8a078572015aee19d0b9ab405d5b2de3664ec958


[root@app-node-2 ~]# restorecon -Rv /var/lib/docker/containers/
restorecon reset /var/lib/docker/containers/a6216825dfcc9e20d55b5da9031a8e7fb776e64e4765b213c543d581b01a7c41/a6216825dfcc9e20d55b5da9031a8e7fb776e64e4765b213c543d581b01a7c41-json.log context system_u:object_r:container_var_lib_t:s0->system_u:object_r:container_log_t:s0
restorecon reset /var/lib/docker/containers/fc4cac63b4234841bff4c37f8a078572015aee19d0b9ab405d5b2de3664ec958/fc4cac63b4234841bff4c37f8a078572015aee19d0b9ab405d5b2de3664ec958-json.log context system_u:object_r:container_var_lib_t:s0->system_u:object_r:container_log_t:s0
restorecon reset /var/lib/docker/containers/f66543b868b8fc267cfbabfe2ac5a864771607708f4935370ca3431f3a762935/f66543b868b8fc267cfbabfe2ac5a864771607708f4935370ca3431f3a762935-json.log context system_u:object_r:container_var_lib_t:s0->system_u:object_r:container_log_t:s0
restorecon reset /var/lib/docker/containers/e250508c319060139cbb26991b02157c44f3b04c6d48fc3d4238c65312c30124/e250508c319060139cbb26991b02157c44f3b04c6d48fc3d4238c65312c30124-json.log context system_u:object_r:container_var_lib_t:s0->system_u:object_r:container_log_t:s0
restorecon reset /var/lib/docker/containers/f2d8cbe979f4af4b4a1e95c983fc481c41dc439ae475b13645f1ba0bd4372c36/f2d8cbe979f4af4b4a1e95c983fc481c41dc439ae475b13645f1ba0bd4372c36-json.log context system_u:object_r:container_var_lib_t:s0->system_u:object_r:container_log_t:s0
restorecon reset /var/lib/docker/containers/0324acfa995d8d170fef350ac7c7fc083bbe3d102c8dbf614ac912676635e744/0324acfa995d8d170fef350ac7c7fc083bbe3d102c8dbf614ac912676635e744-json.log context system_u:object_r:container_var_lib_t:s0->system_u:object_r:container_log_t:s0

Note You need to log in before you can comment on or make changes to this bug.