If a pod is restarted, its prior logs are lost. After a crash, for example, we lose all the context leading up to it, which is critical for debugging. In the following example, osd.1 crashed, but the only log for it is from after the crash, when it was restarted: http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/jnk-vuf1cs33-p/jnk-vuf1cs33-p_20200815T005845/logs/failed_testcase_ocs_logs_1597456807/test_fio_workload_simple[CephBlockPool-sequential]_ocs_logs/ocs_must_gather/quay-io-rhceph-dev-ocs-must-gather-sha256-191e1a9fadc5b379104a64cc6516b8712acaf72f7c1ec31ad80263f5a3ba8128/ceph/namespaces/openshift-storage/pods/
This doesn't look like a 4.5 candidate to me, so I'm moving it to 4.6. Please retarget if required.
K8s keeps the logs of only one previous container after a failure, so logs are available only for the current container and the one immediately before it. Both of these logs are already collected by must-gather. I haven't found a way to configure K8s to store more logs. This is unfortunate when a pod is crashlooping, as you have found, because the logs from the original crash are quickly lost.
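To illustrate the two logs mentioned above, here is a minimal Python sketch that builds and runs the `kubectl logs` commands for the current container and the one before it (`--previous`). The helper names and the pod and container names in the usage note are hypothetical and not taken from this bug; this is not how must-gather itself is implemented.

```python
import subprocess
from typing import Dict, List, Optional


def log_commands(pod: str, namespace: str,
                 container: Optional[str] = None) -> List[List[str]]:
    """Build argv lists for the current and the previous container logs."""
    base = ["kubectl", "logs", pod, "-n", namespace]
    if container:
        base += ["-c", container]
    # --previous asks for the last terminated instance of the container;
    # nothing older than that is retained.
    return [base, base + ["--previous"]]


def fetch_logs(pod: str, namespace: str,
               container: Optional[str] = None) -> Dict[str, Optional[str]]:
    """Run both commands; a value is None when that log is unavailable."""
    results: Dict[str, Optional[str]] = {}
    commands = log_commands(pod, namespace, container)
    for label, argv in zip(("current", "previous"), commands):
        proc = subprocess.run(argv, capture_output=True, text=True)
        results[label] = proc.stdout if proc.returncode == 0 else None
    return results
```

For example, `fetch_logs("rook-ceph-osd-1-example", "openshift-storage", "osd")` (hypothetical pod name) would return at most two logs. After a second restart, the log from the original crash is no longer reachable this way.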
This doesn't look like something we can fix in OCS. Can we close it as WONT_FIX? @karthick: In any case, it shouldn't be a blocker for 4.6.
(In reply to Mudit Agarwal from comment #8)
> Doesn't look like something we can fix in OCS, can we close it as WONT_FIX?
>
> @karthick: In any case, shouldn't be a blocker for 4.6
Ack, removing the blocker flag for 4.6.
Thanks Travis & Pulkit for the explanation.
Closing it as there is no good way to fix this.
Re-opening, as this renders the product unsupportable. It may not be easy, but there must be a way to get logs from Kubernetes. Will discuss more with Travis and Sebastien.
Moving it to 4.7 as it needs more discussion. Please bring it back if we can fix it in the 4.6 timeframe.
It should be under the dataDirHostPath, which for OCS should be /var/lib/rook/openshift-storage. Note that the sidecar is not enabled by default; see Seb's PR for the setting that enables it: https://github.com/rook/rook/pull/6679
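As a rough way to check what ends up under that path on a node, the following Python sketch recursively lists files with a given suffix. The directory layout below dataDirHostPath and the `.log` suffix are assumptions made for illustration, and `find_log_files` is a hypothetical helper. It would have to run on the node itself, since the path is a host path.

```python
from pathlib import Path
from typing import List


def find_log_files(data_dir: str, suffix: str = ".log") -> List[Path]:
    """Return all regular files under data_dir whose names end with suffix."""
    root = Path(data_dir)
    if not root.is_dir():
        # The directory is missing, for example on a node with no Ceph daemons.
        return []
    return sorted(p for p in root.rglob("*" + suffix) if p.is_file())
```

For example, `find_log_files("/var/lib/rook/openshift-storage")` returns an empty list if the directory does not exist or contains no matching files.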
Pulkit, https://bugzilla.redhat.com/show_bug.cgi?id=1901134 is now fixed. Do we have plans to fix this one in 4.7?
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: Red Hat OpenShift Container Storage 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2041