Log files in /var/log/kube-apiserver are sometimes corrupted at the tail. We suspect that kubelet does not run instances of the same pod strictly in sequence, but lets them overlap.
$ oc version -o yaml
clientVersion:
  buildDate: "2021-05-14T22:17:07Z"
  compiler: gc
  gitCommit: 629bdbe335bbf2f68e5a5f6e3fc25de8c249fd3c
  gitTreeState: clean
  gitVersion: 4.8.0-202105142152.p0-629bdbe
  goVersion: go1.16.1
  major: ""
  minor: ""
  platform: linux/amd64
openshiftVersion: 4.8.0-0.nightly-2021-05-18-033553
releaseClientVersion: 4.8.0-0.nightly-2021-05-17-231618
serverVersion:
  buildDate: "2021-05-17T20:26:19Z"
  compiler: gc
  gitCommit: 9d99e1c27544615392364de66fc7fa926bd9e752
  gitTreeState: clean
  gitVersion: v1.21.0-rc.0+9d99e1c
  goVersion: go1.16.1
  major: "1"
  minor: 21+
  platform: linux/amd64

$ oc describe pod -n openshift-kube-apiserver kube-apiserver-ip-10-0-142-223.us-east-2.compute.internal
...
Containers:
  kube-apiserver:
    Container ID:  cri-o://b588396f834dfbb0298fd22ada48bcc68ae822e50f2267b5208bd961c1905d28
    Image:         quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:bb66d45f64e61b8896d933d5efccbde43d0d1bfa59ef0e970bc21dd19f662a05
    Image ID:      quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:bb66d45f64e61b8896d933d5efccbde43d0d1bfa59ef0e970bc21dd19f662a05
    Port:          6443/TCP
    Host Port:     6443/TCP
    Command:
      /bin/bash
      -ec
    Args:
      LOCK=/var/log/kube-apiserver/.lock
      echo -n "Acquiring exclusive lock ${LOCK}"
      exec {LOCK_FD}>${LOCK} && flock -n "${LOCK_FD}" || {
        echo "$(date -Iseconds -u) kubelet did not terminate old kube-apiserver before new one" >> /var/log/kube-apiserver/lock.log
        echo -n ": WARNING: kubelet did not terminate old kube-apiserver before new one."
        # we didn't get an exclusive lock. We keep going with the risk to corrupt audit logs.
      }
      echo
      ...

The flock call has been added to the kube-apiserver startup script. According to the testgrid at https://testgrid.k8s.io/redhat-openshift-ocp-release-4.8-informing#periodic-ci-openshift-release-master-ci-4.8-e2e-aws&sort-by-flakiness, the e2e test case [1] has been passing since May 10th.
[1] openshift-tests.[sig-cli] oc adm must-gather when looking at the audit logs [sig-node] kubelet runs apiserver processes strictly sequentially in order to not risk audit log corruption [Suite:openshift/conformance/parallel]

Based on the above, the PR works as expected, so I am moving the bug to VERIFIED.
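
For readers who want to see why the non-blocking flock in the startup script detects overlapping instances, here is a minimal sketch. It is not taken from the pod spec: try_lock is a hypothetical helper, the lock file is a temporary file rather than /var/log/kube-apiserver/.lock, and fixed descriptors 8 and 9 are used instead of bash's {LOCK_FD} allocation so that the sketch also runs under plain sh. It assumes the flock utility from util-linux is installed.

```shell
#!/bin/sh
# try_lock is a hypothetical helper for illustration only.
# It opens the lock file on descriptor 9 and requests an exclusive,
# non-blocking lock. The lock is held for as long as descriptor 9 stays open.
try_lock() {
  exec 9>"$1"
  if flock -n 9; then
    echo "acquired"
  else
    echo "busy"
  fi
}

# Temporary stand-in for the real lock file (assumption for this sketch).
lock_file=$(mktemp)

# "Old instance": take the lock on descriptor 8 and keep the descriptor open.
exec 8>"$lock_file"
flock -n 8

# "New instance" starting while the old one still holds the lock.
# The command substitution runs in a subshell, so its descriptor 9
# is closed again as soon as the subshell returns.
overlap=$(try_lock "$lock_file")

# Old instance terminates: closing descriptor 8 releases its lock.
exec 8>&-
sequential=$(try_lock "$lock_file")

echo "overlapping start: $overlap"
echo "sequential start:  $sequential"
rm -f "$lock_file"
```

The first attempt should report busy because the lock is still held through descriptor 8, which corresponds to the case where the startup script writes its warning to lock.log. The second attempt should report acquired, which is the strictly sequential case the e2e test checks for.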
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2438