Bug 1956898

Summary:	fix log files being overwritten on container state loss
Product:	OpenShift Container Platform	Reporter:	Ryan Phillips <rphillips>
Component:	Node	Assignee:	Ryan Phillips <rphillips>
Node sub component:	Kubelet	QA Contact:	Sunil Choudhary <schoudha>
Status:	CLOSED ERRATA	Docs Contact:
Severity:	high
Priority:	high	CC:	aos-bugs, ccoleman
Version:	4.8
Target Milestone:	---
Target Release:	4.8.0
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2021-07-27 23:06:09 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description Ryan Phillips 2021-05-04 15:43:52 UTC

Description of problem:
Backport https://github.com/kubernetes/kubernetes/pull/99748, which fixes log files being overwritten on container state loss.

Comment 2 Sunil Choudhary 2021-05-07 10:37:24 UTC

Checked on 4.8.0-0.nightly-2021-05-06-210840. Rebooted a node multiple times and checked the restart count of its static pods.

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.0-0.nightly-2021-05-06-210840   True        False         6h51m   Cluster version is 4.8.0-0.nightly-2021-05-06-210840

$ oc get nodes -o wide
NAME                                         STATUS   ROLES    AGE     VERSION                INTERNAL-IP    EXTERNAL-IP   OS-IMAGE                                                       KERNEL-VERSION          CONTAINER-RUNTIME
ip-10-0-134-112.us-east-2.compute.internal   Ready    worker   7h7m    v1.21.0-rc.0+291e731   10.0.134.112   <none>        Red Hat Enterprise Linux CoreOS 48.84.202105061618-0 (Ootpa)   4.18.0-293.el8.x86_64   cri-o://1.21.0-89.rhaos4.8.git3f6209a.el8
ip-10-0-159-122.us-east-2.compute.internal   Ready    master   7h16m   v1.21.0-rc.0+291e731   10.0.159.122   <none>        Red Hat Enterprise Linux CoreOS 48.84.202105061618-0 (Ootpa)   4.18.0-293.el8.x86_64   cri-o://1.21.0-89.rhaos4.8.git3f6209a.el8
ip-10-0-178-252.us-east-2.compute.internal   Ready    master   7h15m   v1.21.0-rc.0+291e731   10.0.178.252   <none>        Red Hat Enterprise Linux CoreOS 48.84.202105061618-0 (Ootpa)   4.18.0-293.el8.x86_64   cri-o://1.21.0-89.rhaos4.8.git3f6209a.el8
ip-10-0-190-29.us-east-2.compute.internal    Ready    worker   7h7m    v1.21.0-rc.0+291e731   10.0.190.29    <none>        Red Hat Enterprise Linux CoreOS 48.84.202105061618-0 (Ootpa)   4.18.0-293.el8.x86_64   cri-o://1.21.0-89.rhaos4.8.git3f6209a.el8
ip-10-0-213-46.us-east-2.compute.internal    Ready    master   7h15m   v1.21.0-rc.0+291e731   10.0.213.46    <none>        Red Hat Enterprise Linux CoreOS 48.84.202105061618-0 (Ootpa)   4.18.0-293.el8.x86_64   cri-o://1.21.0-89.rhaos4.8.git3f6209a.el8
ip-10-0-214-118.us-east-2.compute.internal   Ready    worker   7h7m    v1.21.0-rc.0+291e731   10.0.214.118   <none>        Red Hat Enterprise Linux CoreOS 48.84.202105061618-0 (Ootpa)   4.18.0-293.el8.x86_64   cri-o://1.21.0-89.rhaos4.8.git3f6209a.el8


$ oc get pods -A | grep -e "ip-10-0-159-122.us-east-2.compute.internal" -e NAME
NAMESPACE                                          NAME                                                                  READY   STATUS              RESTARTS   AGE
openshift-etcd                                     etcd-ip-10-0-159-122.us-east-2.compute.internal                       3/3     Running             3          4h51m
openshift-etcd                                     revision-pruner-3-ip-10-0-159-122.us-east-2.compute.internal          0/1     Completed           0          4h33m
openshift-kube-apiserver                           kube-apiserver-ip-10-0-159-122.us-east-2.compute.internal             5/5     Running             5          4h48m
openshift-kube-apiserver                           revision-pruner-7-ip-10-0-159-122.us-east-2.compute.internal          0/1     Completed           0          4h33m
openshift-kube-controller-manager                  kube-controller-manager-ip-10-0-159-122.us-east-2.compute.internal    4/4     Running             4          4h46m
openshift-kube-controller-manager                  revision-pruner-7-ip-10-0-159-122.us-east-2.compute.internal          0/1     Completed           0          4h30m
openshift-kube-scheduler                           openshift-kube-scheduler-ip-10-0-159-122.us-east-2.compute.internal   3/3     Running             3          4h45m
openshift-kube-scheduler                           revision-pruner-6-ip-10-0-159-122.us-east-2.compute.internal          0/1     Completed           0          4h30m

$ oc get pods -A | grep -e "ip-10-0-159-122.us-east-2.compute.internal" -e NAME
NAMESPACE                                          NAME                                                                  READY   STATUS      RESTARTS   AGE
openshift-etcd                                     etcd-ip-10-0-159-122.us-east-2.compute.internal                       3/3     Running     6          7h7m
openshift-etcd                                     revision-pruner-3-ip-10-0-159-122.us-east-2.compute.internal          0/1     Completed   0          6h49m
openshift-kube-apiserver                           installer-8-ip-10-0-159-122.us-east-2.compute.internal                0/1     Completed   0          73m
openshift-kube-apiserver                           kube-apiserver-ip-10-0-159-122.us-east-2.compute.internal             5/5     Running     5          70m
openshift-kube-apiserver                           revision-pruner-7-ip-10-0-159-122.us-east-2.compute.internal          0/1     Completed   0          6h49m
openshift-kube-apiserver                           revision-pruner-8-ip-10-0-159-122.us-east-2.compute.internal          0/1     Completed   0          70m
openshift-kube-controller-manager                  kube-controller-manager-ip-10-0-159-122.us-east-2.compute.internal    4/4     Running     8          7h3m
openshift-kube-controller-manager                  revision-pruner-7-ip-10-0-159-122.us-east-2.compute.internal          0/1     Completed   0          6h47m
openshift-kube-scheduler                           openshift-kube-scheduler-ip-10-0-159-122.us-east-2.compute.internal   3/3     Running     6          7h2m
openshift-kube-scheduler                           revision-pruner-6-ip-10-0-159-122.us-east-2.compute.internal          0/1     Completed   0          6h47m
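The two listings above can also be compared mechanically instead of by eye. The following is a minimal sketch, assuming the `oc get pods -A` output has been captured as plain text. The helper names `parse_restarts` and `restart_deltas` are hypothetical and not part of any `oc` tooling, and the sample rows are a subset of the output shown above.

```python
def parse_restarts(listing):
    """Map (namespace, pod name) to restart count from `oc get pods -A` text."""
    restarts = {}
    for line in listing.splitlines():
        fields = line.split()
        # Skip blank lines, short lines, and the header row.
        if len(fields) < 6 or fields[0] == "NAMESPACE":
            continue
        # Columns: NAMESPACE NAME READY STATUS RESTARTS AGE
        namespace, name, restart_count = fields[0], fields[1], fields[4]
        restarts[(namespace, name)] = int(restart_count)
    return restarts


def restart_deltas(before, after):
    """Return pods present in both listings whose restart count changed."""
    old = parse_restarts(before)
    new = parse_restarts(after)
    return {
        pod: new[pod] - old[pod]
        for pod in new
        if pod in old and new[pod] != old[pod]
    }


NODE = "ip-10-0-159-122.us-east-2.compute.internal"

# Sample rows copied from the first listing (column spacing collapsed).
BEFORE = f"""\
NAMESPACE   NAME   READY   STATUS   RESTARTS   AGE
openshift-etcd   etcd-{NODE}   3/3   Running   3   4h51m
openshift-kube-apiserver   kube-apiserver-{NODE}   5/5   Running   5   4h48m
openshift-kube-controller-manager   kube-controller-manager-{NODE}   4/4   Running   4   4h46m
"""

# The same pods as they appear in the second listing.
AFTER = f"""\
NAMESPACE   NAME   READY   STATUS   RESTARTS   AGE
openshift-etcd   etcd-{NODE}   3/3   Running   6   7h7m
openshift-kube-apiserver   kube-apiserver-{NODE}   5/5   Running   5   70m
openshift-kube-controller-manager   kube-controller-manager-{NODE}   4/4   Running   8   7h3m
"""

for (namespace, name), delta in sorted(restart_deltas(BEFORE, AFTER).items()):
    print(f"{namespace}/{name}: +{delta}")
```

The sketch only reports differences in the RESTARTS column. It does not account for a pod being redeployed between snapshots, so a shorter AGE in the second listing should still be checked by hand.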

Comment 3 Ryan Phillips 2021-06-01 16:55:43 UTC

*** Bug 1934867 has been marked as a duplicate of this bug. ***

Comment 6 errata-xmlrpc 2021-07-27 23:06:09 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform
4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438