Bug 1958094
| Summary: | Audit log files are corrupted sometimes | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Stefan Schimanski <sttts> |
| Component: | kube-apiserver | Assignee: | Stefan Schimanski <sttts> |
| Status: | CLOSED ERRATA | QA Contact: | Ke Wang <kewang> |
| Severity: | high | Priority: | high |
| Version: | 4.8 | Target Release: | 4.8.0 |
| Hardware: | Unspecified | OS: | Unspecified |
| CC: | aos-bugs, mfojtik, xxia | Type: | Bug |
| Last Closed: | 2021-07-27 23:07:23 UTC | | |
Description
Stefan Schimanski
2021-05-07 07:35:38 UTC
$ oc version -o yaml
clientVersion:
  buildDate: "2021-05-14T22:17:07Z"
  compiler: gc
  gitCommit: 629bdbe335bbf2f68e5a5f6e3fc25de8c249fd3c
  gitTreeState: clean
  gitVersion: 4.8.0-202105142152.p0-629bdbe
  goVersion: go1.16.1
  major: ""
  minor: ""
  platform: linux/amd64
openshiftVersion: 4.8.0-0.nightly-2021-05-18-033553
releaseClientVersion: 4.8.0-0.nightly-2021-05-17-231618
serverVersion:
  buildDate: "2021-05-17T20:26:19Z"
  compiler: gc
  gitCommit: 9d99e1c27544615392364de66fc7fa926bd9e752
  gitTreeState: clean
  gitVersion: v1.21.0-rc.0+9d99e1c
  goVersion: go1.16.1
  major: "1"
  minor: 21+
  platform: linux/amd64
$ oc describe pod -n openshift-kube-apiserver kube-apiserver-ip-10-0-142-223.us-east-2.compute.internal
...
Containers:
  kube-apiserver:
    Container ID: cri-o://b588396f834dfbb0298fd22ada48bcc68ae822e50f2267b5208bd961c1905d28
    Image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:bb66d45f64e61b8896d933d5efccbde43d0d1bfa59ef0e970bc21dd19f662a05
    Image ID: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:bb66d45f64e61b8896d933d5efccbde43d0d1bfa59ef0e970bc21dd19f662a05
    Port: 6443/TCP
    Host Port: 6443/TCP
    Command:
      /bin/bash
      -ec
    Args:
      LOCK=/var/log/kube-apiserver/.lock
      echo -n "Acquiring exclusive lock ${LOCK}"
      exec {LOCK_FD}>${LOCK} && flock -n "${LOCK_FD}" || {
        echo "$(date -Iseconds -u) kubelet did not terminate old kube-apiserver before new one" >> /var/log/kube-apiserver/lock.log
        echo -n ": WARNING: kubelet did not terminate old kube-apiserver before new one."
        # we didn't get an exclusive lock. We keep going with the risk to corrupt audit logs.
      }
      echo
...
The flock call has been added to the kube-apiserver startup script.
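The locking pattern in the startup script can be tried in isolation. The following is a minimal sketch, assuming a Linux host with bash 4.1 or later and `flock` from util-linux. The temporary lock file and the names `OLD_FD`, `NEW_FD`, `first`, `second`, and `third` are illustrative and do not come from the bug; the real script locks /var/log/kube-apiserver/.lock and only logs a warning when the lock is busy.

```shell
#!/usr/bin/env bash
# Illustrative demo only: paths and variable names are hypothetical.
set -u

LOCK="$(mktemp)"   # stand-in for /var/log/kube-apiserver/.lock

# "Old" process: open the lock file on a fresh descriptor and take a
# non-blocking exclusive lock, as the startup script does.
exec {OLD_FD}>"${LOCK}"
flock -n "${OLD_FD}" && first="acquired" || first="busy"

# "New" process: a subshell that opens its own descriptor on the same file.
# Its lock attempt should fail while the old descriptor still holds the lock.
second="$(
  exec {NEW_FD}>"${LOCK}"
  flock -n "${NEW_FD}" && echo "acquired" || echo "busy"
)"

# Closing the old descriptor releases the lock, so a retry should succeed.
exec {OLD_FD}>&-
third="$(
  exec {NEW_FD}>"${LOCK}"
  flock -n "${NEW_FD}" && echo "acquired" || echo "busy"
)"

echo "first=${first} second=${second} third=${third}"
rm -f "${LOCK}"
```

Under these assumptions the script should print `first=acquired second=busy third=acquired`. The "busy" case corresponds to the situation the startup script records in lock.log: the kubelet started a new kube-apiserver before the old one released the lock.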
On the testgrid at https://testgrid.k8s.io/redhat-openshift-ocp-release-4.8-informing#periodic-ci-openshift-release-master-ci-4.8-e2e-aws&sort-by-flakiness, the following e2e test [1] has been passing since May 10th.
[1] openshift-tests.[sig-cli] oc adm must-gather when looking at the audit logs [sig-node] kubelet runs apiserver processes strictly sequentially in order to not risk audit log corruption [Suite:openshift/conformance/parallel]
Based on the above, the PR works as expected, so moving the bug to VERIFIED.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2021:2438