Back to bug 2094320

Who When What Removed Added
yati padia 2022-06-07 11:42:27 UTC Assignee hchiramm ypadia
CC ypadia
Mudit Agarwal 2022-06-21 13:27:05 UTC CC muagarwa
Pratik Surve 2022-07-04 10:25:18 UTC Severity unspecified high
Shyamsundar 2022-07-06 12:37:40 UTC CC idryomov, jdurgin, srangana, uchapaga
Flags needinfo?(jdurgin) needinfo?(idryomov) needinfo?(uchapaga)
Sunil Kumar Acharya 2022-11-02 03:27:35 UTC Flags needinfo?(ypadia)
Mudit Agarwal 2022-11-02 03:38:15 UTC Flags needinfo?(ypadia)
Ilya Dryomov 2022-11-02 11:29:27 UTC Flags needinfo?(idryomov)
Ilya Dryomov 2022-11-02 11:29:45 UTC Flags needinfo?(jdurgin)
Shyamsundar 2022-11-02 21:10:58 UTC Flags needinfo?(muagarwa)
Mudit Agarwal 2022-11-16 02:27:01 UTC Flags needinfo?(uchapaga) needinfo?(muagarwa)
Doc Type If docs needed, set a value Known Issue
Mudit Agarwal 2022-11-16 07:10:29 UTC Blocks 2107226
Red Hat Bugzilla 2022-12-31 19:46:40 UTC CC uchapaga
Red Hat Bugzilla 2022-12-31 23:39:25 UTC CC idryomov
Red Hat Bugzilla 2023-01-01 05:47:46 UTC CC srangana
Red Hat Bugzilla 2023-01-01 08:31:55 UTC QA Contact kramdoss
Alasdair Kergon 2023-01-04 04:40:03 UTC QA Contact kramdoss
Alasdair Kergon 2023-01-04 05:46:39 UTC CC srangana
Alasdair Kergon 2023-01-04 05:53:02 UTC CC uchapaga
Alasdair Kergon 2023-01-04 06:46:24 UTC CC idryomov
Erin Donnelly 2023-01-06 18:30:13 UTC CC edonnell
Flags needinfo?(ypadia)
yati padia 2023-01-10 06:42:09 UTC Doc Text Cause:
blocklisting due to either network issues or a heavily overloaded/imbalanced cluster with huge tail latency spikes.
Consequence:
Pods are stuck in CreateContainerError with msg Error: relabel failed /var/lib/kubelet/pods/cb27938e-f66f-401d-85f0-9eb5cf565ace/volumes/kubernetes.io~csi/pvc-86e7da91-29f9-4418-80a7-4ae7610bb613/mount: lsetxattr /var/lib/kubelet/pods/cb27938e-f66f-401d-85f0-9eb5cf565ace/volumes/kubernetes.io~csi/pvc-86e7da91-29f9-4418-80a7-4ae7610bb613/mount/#ib_16384_0.dblwr: read-only file system


Workaround (if any):


Result:
Flags needinfo?(ypadia) needinfo?(srangana)
Erin Donnelly 2023-01-18 19:07:20 UTC Doc Text Cause:
blocklisting due to either network issues or a heavily overloaded/imbalanced cluster with huge tail latency spikes.
Consequence:
Pods are stuck in CreateContainerError with msg Error: relabel failed /var/lib/kubelet/pods/cb27938e-f66f-401d-85f0-9eb5cf565ace/volumes/kubernetes.io~csi/pvc-86e7da91-29f9-4418-80a7-4ae7610bb613/mount: lsetxattr /var/lib/kubelet/pods/cb27938e-f66f-401d-85f0-9eb5cf565ace/volumes/kubernetes.io~csi/pvc-86e7da91-29f9-4418-80a7-4ae7610bb613/mount/#ib_16384_0.dblwr: read-only file system


Workaround (if any):


Result:
.Blocklisting can lead to Pods stuck in an error state

Blocklisting can occur due to either network issues or a heavily overloaded/imbalanced cluster with huge tail latency spikes. Because of this, Pods get stuck in `CreateContainerError` with the message `Error: relabel failed /var/lib/kubelet/pods/cb27938e-f66f-401d-85f0-9eb5cf565ace/volumes/kubernetes.io~csi/pvc-86e7da91-29f9-4418-80a7-4ae7610bb613/mount: lsetxattr /var/lib/kubelet/pods/cb27938e-f66f-401d-85f0-9eb5cf565ace/volumes/kubernetes.io~csi/pvc-86e7da91-29f9-4418-80a7-4ae7610bb613/mount/#ib_16384_0.dblwr: read-only file system`.
Shyamsundar 2023-01-18 19:30:12 UTC Flags needinfo?(srangana)
Doc Text .Blocklisting can lead to Pods stuck in an error state

Blocklisting can occur due to either network issues or a heavily overloaded/imbalanced cluster with huge tail latency spikes. Because of this, Pods get stuck in `CreateContainerError` with the message `Error: relabel failed /var/lib/kubelet/pods/cb27938e-f66f-401d-85f0-9eb5cf565ace/volumes/kubernetes.io~csi/pvc-86e7da91-29f9-4418-80a7-4ae7610bb613/mount: lsetxattr /var/lib/kubelet/pods/cb27938e-f66f-401d-85f0-9eb5cf565ace/volumes/kubernetes.io~csi/pvc-86e7da91-29f9-4418-80a7-4ae7610bb613/mount/#ib_16384_0.dblwr: read-only file system`.
Cause: A worker node is blocklisted from accessing Ceph cluster

Consequence: Pods with ceph RBD backed volumes scheduled to this node will fail to start, reporting errors such as "Error: relabel failed" and "read-only file system"

Workaround (if any): The node to which these pods are scheduled and fail to start has to be rebooted following the procedure as follows:
- Cordon and then drain the node having the issue
- Reboot the node having the issue
- Uncordon the node having the issue

Result: Pods start successfully on a different worker node in the cluster
Aman Agrawal 2023-01-19 17:37:29 UTC CC amagrawa
Shyamsundar 2023-01-20 14:00:51 UTC Blocks 2138210
Shyamsundar 2023-01-23 13:39:39 UTC Doc Text Cause: A worker node is blocklisted from accessing Ceph cluster

Consequence: Pods with ceph RBD backed volumes scheduled to this node will fail to start, reporting errors such as "Error: relabel failed" and "read-only file system"

Workaround (if any): The node to which these pods are scheduled and fail to start has to be rebooted following the procedure as follows:
- Cordon and then drain the node having the issue
- Reboot the node having the issue
- Uncordon the node having the issue

Result: Pods start successfully on a different worker node in the cluster
Cause: A worker node is blocklisted from accessing Ceph cluster

Consequence: Pods with ceph RBD backed volumes scheduled to this node will fail to start or workload IO reports, "read-only file system"

Workaround (if any): The node to which these pods are scheduled and fail to start, or run has to be rebooted following the procedure as follows:
- Cordon and then drain the node having the issue
- Reboot the node having the issue
- Uncordon the node having the issue

Result: Pods start successfully on a different worker node in the cluster
Erin Donnelly 2023-01-27 20:20:23 UTC Doc Text Cause: A worker node is blocklisted from accessing Ceph cluster

Consequence: Pods with ceph RBD backed volumes scheduled to this node will fail to start or workload IO reports, "read-only file system"

Workaround (if any): The node to which these pods are scheduled and fail to start, or run has to be rebooted following the procedure as follows:
- Cordon and then drain the node having the issue
- Reboot the node having the issue
- Uncordon the node having the issue

Result: Pods start successfully on a different worker node in the cluster
A specific worker node is blocklisted from accessing the Ceph cluster. Pods with Ceph RBD backed volumes scheduled to this worker node fail to start, or the workload IO reports: `read-only file system`.

To work around this issue, reboot the node to which these pods are scheduled and failing to start or run by following these steps:
. Cordon and then drain the node having the issue
. Reboot the node having the issue
. Uncordon the node having the issue

The pods will start successfully on a different worker node in the cluster.
Erin Donnelly 2023-01-27 20:23:00 UTC Doc Text A specific worker node is blocklisted from accessing the Ceph cluster. Pods with Ceph RBD backed volumes scheduled to this worker node fail to start, or the workload IO reports: `read-only file system`.

To work around this issue, reboot the node to which these pods are scheduled and failing to start or run by following these steps:
. Cordon and then drain the node having the issue
. Reboot the node having the issue
. Uncordon the node having the issue

The pods will start successfully on a different worker node in the cluster.
.Blocklisting can lead to Pods stuck in an error state

Blocklisting can occur due to either network issues or a heavily overloaded/imbalanced cluster with huge tail latency spikes. Because of this, Pods get stuck in `CreateContainerError` with the message `Error: relabel failed /var/lib/kubelet/pods/cb27938e-f66f-401d-85f0-9eb5cf565ace/volumes/kubernetes.io~csi/pvc-86e7da91-29f9-4418-80a7-4ae7610bb613/mount: lsetxattr /var/lib/kubelet/pods/cb27938e-f66f-401d-85f0-9eb5cf565ace/volumes/kubernetes.io~csi/pvc-86e7da91-29f9-4418-80a7-4ae7610bb613/mount/#ib_16384_0.dblwr: read-only file system`.

To work around this issue, reboot the node to which these pods are scheduled and failing by following these steps:
. Cordon and then drain the node having the issue
. Reboot the node having the issue
. Uncordon the node having the issue

The pods will start successfully on a different worker node in the cluster.
Olive Lakra 2023-01-30 16:15:45 UTC CC olakra
Doc Text .Blocklisting can lead to Pods stuck in an error state

Blocklisting can occur due to either network issues or a heavily overloaded/imbalanced cluster with huge tail latency spikes. Because of this, Pods get stuck in `CreateContainerError` with the message `Error: relabel failed /var/lib/kubelet/pods/cb27938e-f66f-401d-85f0-9eb5cf565ace/volumes/kubernetes.io~csi/pvc-86e7da91-29f9-4418-80a7-4ae7610bb613/mount: lsetxattr /var/lib/kubelet/pods/cb27938e-f66f-401d-85f0-9eb5cf565ace/volumes/kubernetes.io~csi/pvc-86e7da91-29f9-4418-80a7-4ae7610bb613/mount/#ib_16384_0.dblwr: read-only file system`.

To work around this issue, reboot the node to which these pods are scheduled and failing by following these steps:
. Cordon and then drain the node having the issue
. Reboot the node having the issue
. Uncordon the node having the issue

The pods will start successfully on a different worker node in the cluster.
.Blocklisting can lead to Pods stuck in an error state

Blocklisting can occur due to either network issues or a heavily overloaded/imbalanced cluster with huge tail latency spikes. Because of this, Pods get stuck in `CreateContainerError` with the message `Error: relabel failed /var/lib/kubelet/pods/cb27938e-f66f-401d-85f0-9eb5cf565ace/volumes/kubernetes.io~csi/pvc-86e7da91-29f9-4418-80a7-4ae7610bb613/mount: lsetxattr /var/lib/kubelet/pods/cb27938e-f66f-401d-85f0-9eb5cf565ace/volumes/kubernetes.io~csi/pvc-86e7da91-29f9-4418-80a7-4ae7610bb613/mount/#ib_16384_0.dblwr: read-only file system`.

Workaround: Reboot the node to which these pods are scheduled and failing by following these steps:
. Cordon and then drain the node having the issue
. Reboot the node having the issue
. Uncordon the node having the issue

The pods will start successfully on a different worker node in the cluster.
Red Hat Bugzilla 2023-01-31 23:37:06 UTC CC madam
Matthias Muench 2023-02-15 08:46:25 UTC CC mmuench
Elvir Kuric 2023-02-15 15:47:11 UTC CC ekuric
Mudit Agarwal 2023-03-16 16:42:29 UTC Status NEW ASSIGNED
Summary Pods are stuck in CreateContainerError with msg lsetxattr /var/lib/kubelet/pods/cb27938e-f66f-401d-85f0-9eb5cf565ace/volumes/kubernetes.io~csi/pvc-86e7da91-29f9-4418-80a7-4ae7610bb613/mount/#ib_16384_0.dblwr: read-only file system Pods are stuck in CreateContainerError because of blocklisting
Sunil Kumar Acharya 2023-03-19 17:58:34 UTC Flags needinfo?(ypadia)
Shyamsundar 2023-03-20 12:55:50 UTC Status ASSIGNED ON_QA
Flags needinfo?(ypadia)
krishnaram Karthick 2023-04-03 05:12:00 UTC QA Contact kramdoss prsurve
Mudit Agarwal 2023-04-03 10:46:48 UTC Doc Type Known Issue Bug Fix
Doc Text .Blocklisting can lead to Pods stuck in an error state

Blocklisting can occur due to either network issues or a heavily overloaded/imbalanced cluster with huge tail latency spikes. Because of this, Pods get stuck in `CreateContainerError` with the message `Error: relabel failed /var/lib/kubelet/pods/cb27938e-f66f-401d-85f0-9eb5cf565ace/volumes/kubernetes.io~csi/pvc-86e7da91-29f9-4418-80a7-4ae7610bb613/mount: lsetxattr /var/lib/kubelet/pods/cb27938e-f66f-401d-85f0-9eb5cf565ace/volumes/kubernetes.io~csi/pvc-86e7da91-29f9-4418-80a7-4ae7610bb613/mount/#ib_16384_0.dblwr: read-only file system`.

Workaround: Reboot the node to which these pods are scheduled and failing by following these steps:
. Cordon and then drain the node having the issue
. Reboot the node having the issue
. Uncordon the node having the issue

The pods will start successfully on a different worker node in the cluster.
Neha Berry 2023-04-07 05:26:40 UTC CC nberry
RHEL Program Management 2023-04-07 05:26:49 UTC Target Release --- ODF 4.13.0
Aman Agrawal 2023-05-24 07:23:38 UTC Flags needinfo?(srangana)
Sunil Kumar Acharya 2023-06-01 15:36:13 UTC Flags needinfo?(ypadia)
yati padia 2023-06-05 08:20:24 UTC Flags needinfo?(ypadia)
Shyamsundar 2023-06-14 12:33:35 UTC Flags needinfo?(srangana) needinfo?(idryomov) needinfo?(dkamboj)
CC dkamboj
Ilya Dryomov 2023-06-14 13:03:27 UTC Flags needinfo?(idryomov)
Elad 2023-06-19 06:02:59 UTC CC ebenahar
RHEL Program Management 2023-06-19 06:03:11 UTC Target Release ODF 4.13.0 ---
Divyansh Kamboj 2023-06-20 06:44:46 UTC Flags needinfo?(dkamboj)
Red Hat Bugzilla 2023-08-03 08:29:35 UTC CC ocs-bugs
Elad 2023-08-09 16:37:41 UTC CC odf-bz-bot
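
The workaround recorded in the Doc Text entries above (cordon and drain the affected node, reboot it, then uncordon it) could be sketched roughly as follows. This is only an illustration, not a procedure taken from the bug: the function name `workaround_commands`, the example node name, the drain flags, and the use of `oc debug` to trigger the reboot are all assumptions. The Doc Text only says to reboot the node and does not say how. The sketch builds the command lines without running them, so they can be reviewed first.

```python
import shlex
from typing import List


def workaround_commands(node: str) -> List[List[str]]:
    """Build the oc invocations for cordon, drain, reboot, uncordon, in that order.

    Nothing is executed here; the caller decides whether and how to run them.
    """
    if not node or node.startswith("-"):
        raise ValueError("node must be a non-empty node name")
    return [
        # Step 1a: mark the node unschedulable.
        ["oc", "adm", "cordon", node],
        # Step 1b: evict workloads (flags are an assumption; adjust to the cluster).
        ["oc", "adm", "drain", node, "--ignore-daemonsets", "--delete-emptydir-data"],
        # Step 2: reboot (one possible way; the Doc Text does not specify a method).
        ["oc", "debug", f"node/{node}", "--", "chroot", "/host", "systemctl", "reboot"],
        # Step 3: allow scheduling again once the node is back and Ready.
        ["oc", "adm", "uncordon", node],
    ]


if __name__ == "__main__":
    # "worker-0.example.com" is a hypothetical node name.
    for argv in workaround_commands("worker-0.example.com"):
        print(shlex.join(argv))
```

In practice the uncordon step should wait until the rebooted node reports Ready again; the sketch leaves that check to the operator.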
