Description of problem:
/var/log/containers should contain a symlink for every pod and container log found in /var/log/pods. When a symlink is missing, fluentd does not tail the log file and does not send it to Kibana, although the logs are still visible in the web console and with the crictl logs command.

There is a Kubernetes issue for a similar problem. Its status is closed, but many comments say it can be reproduced, and we are seeing this behavior:
https://github.com/kubernetes/kubernetes/issues/52172

Recreating the symlinks causes the logs currently in the /var/log/pods/ path to be sent by fluentd, but any logs lost after the log rotation completes remain missing.

Version-Release number of selected component (if applicable):
atomic-openshift-node-3.11.161-1.git.0.4ccbe25.el7.x86_64
atomic-openshift-clients-3.11.161-1.git.0.4ccbe25.el7.x86_64
cri-o-1.11.16-0.5.dev.rhaos3.11.git3f89eba.el7.x86_64

Actual results:
The same happens with either Docker or CRI-O.

Expected results:
The symlinks should be created automatically and should not be lost.

Additional info:
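A minimal sketch for spotting affected logs on a node, not part of the original report. The function name list_unlinked_logs is hypothetical, and the sketch assumes that each entry in /var/log/containers is a symlink that resolves (possibly through a second symlink, as with Docker) to the same file as a *.log entry under /var/log/pods. It only reports the gaps; it does not recreate anything.

```shell
#!/bin/bash
# list_unlinked_logs PODS_DIR CONTAINERS_DIR   (hypothetical helper name)
# Prints every *.log entry under PODS_DIR that no symlink in CONTAINERS_DIR
# resolves to. Both sides are compared by their fully resolved target.
list_unlinked_logs() {
    local pods_dir=$1 containers_dir=$2
    local targets link log

    # Resolve every existing symlink in CONTAINERS_DIR to its final target.
    targets=$(find "$containers_dir" -maxdepth 1 -type l | while read -r link; do
        readlink -f "$link"
    done)

    # Pod logs may be regular files or symlinks themselves, so accept both.
    find "$pods_dir" \( -type f -o -type l \) -name '*.log' | sort | while read -r log; do
        # Report the log if its resolved path is not among the link targets.
        if ! grep -qxF -- "$(readlink -f "$log")" <<<"$targets"; then
            echo "$log"
        fi
    done
}
```

On a node this would be called as list_unlinked_logs /var/log/pods /var/log/containers; empty output means every pod log has a symlink.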
Issue #52172 was reopened. The race condition existed in 3.11. PR #89160 has been merged upstream. My backport PR: https://github.com/openshift/origin/pull/24926
*** Bug 1814859 has been marked as a duplicate of this bug. ***
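Verified with version 4.5.0-0.nightly-2020-05-27-010813. Ran the test with the following script and found no cases:

    #!/bin/bash
    # For every running container, check whether a symlink whose name
    # contains the container ID exists under /var/log/containers.
    crictl ps -q | while read -r line; do
        if [ "$(find /var/log/containers -name "*$line*" | wc -l)" = "0" ]
        then
            # No symlink found: take the first characters of the CRI-O name.
            name=$(crictl inspect "$line" | grep "cri-o.Name" | cut -d ":" -f 2 | head -c 5)
            if [ "$name" != '"k8s_' ]
            then
                # Print the name of the container with the missing symlink.
                crictl inspect "$line" | grep "cri-o.Name"
            fi
        fi
    done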
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2409
Hello,

I can see that this Bugzilla was opened for OCP 3.11 and that it is fixed for OCP 4.5. Is this not going to be backported to OCP 3.11, where the bug was reported?

Regards,
Oscar
The 4.4.z backport is here: https://bugzilla.redhat.com/show_bug.cgi?id=1878795
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days