Bug 2021068
| Field | Value |
|---|---|
| Summary | [Tracker for Ceph BZ #2022190] [arbiter]: alertmanager-main-0 is in ContainerCreating state |
| Product | [Red Hat Storage] Red Hat OpenShift Data Foundation |
| Reporter | Vijay Avuthu <vavuthu> |
| Component | ceph |
| Assignee | Greg Farnum <gfarnum> |
| Status | CLOSED CURRENTRELEASE |
| QA Contact | Petr Balogh <pbalogh> |
| Severity | urgent |
| Docs Contact | |
| Priority | unspecified |
| Version | 4.9 |
| CC | bniver, ebenahar, hnallurv, idryomov, jarrpa, madam, mbukatov, mmuench, mrajanna, muagarwa, ocs-bugs, odf-bz-bot, owasserm, pbalogh, rcyriac, rtalur, sostapov, sunkumar |
| Target Milestone | --- |
| Keywords | Automation, Regression, Tracking |
| Target Release | ODF 4.9.0 |
| Hardware | Unspecified |
| OS | Unspecified |
| Whiteboard | |
| Fixed In Version | v4.9.0-247.ci |
| Doc Type | No Doc Update |
| Doc Text | |
| Story Points | --- |
| Clone Of | |
| | 2022190 (view as bug list) |
| Environment | |
| Last Closed | 2022-01-07 17:46:31 UTC |
| Type | Bug |
| Regression | --- |
| Mount Type | --- |
| Documentation | --- |
| CRM | |
| Verified Versions | |
| Category | --- |
| oVirt Team | --- |
| RHEL 7.3 requirements from Atomic Host | |
| Cloudforms Team | --- |
| Target Upstream Version | |
| Embargoed | |
| Bug Depends On | 2022190, 2025079 |
| Bug Blocks | 1974344, 1992247, 2029744 |
Description (Vijay Avuthu, 2021-11-08 09:25:41 UTC)
Vijay, can you please gather dmesg logs on the node where the mount is failing? Also, when did this test last pass? I don't remember any recent changes in this area.

I just connected to the cluster to check the status of the pods in the monitoring namespace:

    $ oc -n openshift-monitoring get pod
    NAME                                           READY   STATUS              RESTARTS   AGE
    alertmanager-main-0                            0/5     ContainerCreating   0          11h
    alertmanager-main-1                            0/5     ContainerCreating   0          11h
    alertmanager-main-2                            0/5     ContainerCreating   0          11h
    cluster-monitoring-operator-75f48597b5-tdw5m   2/2     Running             0          11h
    grafana-756487f787-hwlqt                       2/2     Running             0          11h
    kube-state-metrics-6bcc85759f-n2gm2            3/3     Running             0          11h
    node-exporter-7pw44                            2/2     Running             2          12h
    node-exporter-hkdsp                            2/2     Running             2          11h
    node-exporter-jkbsq                            2/2     Running             2          11h
    node-exporter-kxvd9                            2/2     Running             2          11h
    node-exporter-l5n96                            2/2     Running             2          12h
    node-exporter-pfn5v                            2/2     Running             2          12h
    node-exporter-pzb5w                            2/2     Running             2          11h
    node-exporter-t97z7                            2/2     Running             2          11h
    node-exporter-tgw7c                            2/2     Running             2          11h
    openshift-state-metrics-769bdd45bc-4j6wx       3/3     Running             0          11h
    prometheus-adapter-578ff485dd-hkgrd            1/1     Running             0          11h
    prometheus-adapter-578ff485dd-zv2rg            1/1     Running             0          11h
    prometheus-k8s-0                               0/7     Init:0/1            0          11h
    prometheus-k8s-1                               0/7     Init:0/1            0          11h
    prometheus-operator-5c5f4d6d94-mb7bf           2/2     Running             0          11h
    telemeter-client-66fc8dd69f-xtx62              3/3     Running             0          11h
    thanos-querier-5b767cc45c-8x55h                5/5     Running             0          11h
    thanos-querier-5b767cc45c-rxscl                5/5     Running             0          11h

    pbalogh@pbalogh-mac arbiter-bug $ oc -n openshift-monitoring get pvc
    NAME                                        STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS                  AGE
    my-alertmanager-claim-alertmanager-main-0   Bound    pvc-5f571d2e-44e2-4055-a352-bd01349ec438   40Gi       RWO            ocs-storagecluster-ceph-rbd   11h
    my-alertmanager-claim-alertmanager-main-1   Bound    pvc-95b4edec-47e9-469c-8645-01ccf769cfd7   40Gi       RWO            ocs-storagecluster-ceph-rbd   11h
    my-alertmanager-claim-alertmanager-main-2   Bound    pvc-3803c53e-c3cc-429b-b3b8-6f16fd5f5e60   40Gi       RWO            ocs-storagecluster-ceph-rbd   11h
    my-prometheus-claim-prometheus-k8s-0        Bound    pvc-0cb2311a-c412-46c6-afb8-a4355ea9574f   40Gi       RWO            ocs-storagecluster-ceph-rbd   11h
    my-prometheus-claim-prometheus-k8s-1        Bound    pvc-c9f5aa2d-cd83-4c55-8c6a-d57dcc7e2cd8   40Gi       RWO            ocs-storagecluster-ceph-rbd   11h

    pbalogh@pbalogh-mac arbiter-bug $ oc -n openshift-monitoring get pv
    NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                                             STORAGECLASS                  REASON   AGE
    local-pv-1b0a6be3                          512Gi      RWO            Delete           Bound    openshift-storage/ocs-deviceset-localblock-1-data-1kxzq6          localblock                             11h
    local-pv-28399b17                          512Gi      RWO            Delete           Bound    openshift-storage/ocs-deviceset-localblock-2-data-05bwtr          localblock                             11h
    local-pv-2e878e2b                          512Gi      RWO            Delete           Bound    openshift-storage/ocs-deviceset-localblock-2-data-26fps9          localblock                             11h
    local-pv-6a965969                          512Gi      RWO            Delete           Bound    openshift-storage/ocs-deviceset-localblock-0-data-2b2fng          localblock                             11h
    local-pv-7d8fb2a3                          512Gi      RWO            Delete           Bound    openshift-storage/ocs-deviceset-localblock-1-data-26rw9g          localblock                             11h
    local-pv-91941fae                          512Gi      RWO            Delete           Bound    openshift-storage/ocs-deviceset-localblock-2-data-1542rl          localblock                             11h
    local-pv-949a2847                          512Gi      RWO            Delete           Bound    openshift-storage/ocs-deviceset-localblock-3-data-2g6rnn          localblock                             11h
    local-pv-b5299f85                          512Gi      RWO            Delete           Bound    openshift-storage/ocs-deviceset-localblock-0-data-1ftw7d          localblock                             11h
    local-pv-b75bfb86                          512Gi      RWO            Delete           Bound    openshift-storage/ocs-deviceset-localblock-0-data-0zlttw          localblock                             11h
    local-pv-be797c94                          512Gi      RWO            Delete           Bound    openshift-storage/ocs-deviceset-localblock-3-data-1kdf9d          localblock                             11h
    local-pv-d49b092a                          512Gi      RWO            Delete           Bound    openshift-storage/ocs-deviceset-localblock-3-data-09nncq          localblock                             11h
    local-pv-e112c5d2                          512Gi      RWO            Delete           Bound    openshift-storage/ocs-deviceset-localblock-1-data-0t7qrm          localblock                             11h
    pvc-0cb2311a-c412-46c6-afb8-a4355ea9574f   40Gi       RWO            Delete           Bound    openshift-monitoring/my-prometheus-claim-prometheus-k8s-0         ocs-storagecluster-ceph-rbd            11h
    pvc-3803c53e-c3cc-429b-b3b8-6f16fd5f5e60   40Gi       RWO            Delete           Bound    openshift-monitoring/my-alertmanager-claim-alertmanager-main-2    ocs-storagecluster-ceph-rbd            11h
    pvc-5f571d2e-44e2-4055-a352-bd01349ec438   40Gi       RWO            Delete           Bound    openshift-monitoring/my-alertmanager-claim-alertmanager-main-0    ocs-storagecluster-ceph-rbd            11h
    pvc-893c6731-3f7b-4819-9876-9fe78bf94f79   50Gi       RWO            Delete           Bound    openshift-storage/db-noobaa-db-pg-0                               ocs-storagecluster-ceph-rbd            11h
    pvc-95b4edec-47e9-469c-8645-01ccf769cfd7   40Gi       RWO            Delete           Bound    openshift-monitoring/my-alertmanager-claim-alertmanager-main-1    ocs-storagecluster-ceph-rbd            11h
    pvc-c9f5aa2d-cd83-4c55-8c6a-d57dcc7e2cd8   40Gi       RWO            Delete           Bound    openshift-monitoring/my-prometheus-claim-prometheus-k8s-1         ocs-storagecluster-ceph-rbd            11h
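All of the RBD-backed PVCs are Bound while the consuming pods sit in ContainerCreating/Init, so provisioning succeeded and the failure is in the node-side MountDevice (rbd map) step. A minimal triage sketch for that situation, assuming cluster-admin access; the label selector, container name, and placeholder pod name follow the usual ODF/ceph-csi naming and should be treated as assumptions:

```shell
# Find the node the stuck pod is scheduled on
oc -n openshift-monitoring get pod prometheus-k8s-0 -o wide

# Locate the RBD node-plugin pod running on that node
# (label is the one ceph-csi/ODF normally uses; adjust if it differs)
oc -n openshift-storage get pods -l app=csi-rbdplugin -o wide

# Inspect that plugin's logs for the failing NodeStageVolume / rbd map attempts
oc -n openshift-storage logs <csi-rbdplugin-pod-on-that-node> -c csi-rbdplugin | grep -i 'map failed'
```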
Events from oc -n openshift-monitoring describe pod prometheus-k8s-0:

    Events:
      Type     Reason       Age                   From     Message
      ----     ------       ----                  ----     -------
      Warning  FailedMount  79m (x16 over 9h)     kubelet  MountVolume.MountDevice failed for volume "pvc-0cb2311a-c412-46c6-afb8-a4355ea9574f" : rpc error: code = Internal desc = rbd: map failed with error an error (exit status 110) occurred while running rbd args: [--id csi-rbd-node -m 172.30.246.98:6789,172.30.118.99:6789,172.30.6.101:6789,172.30.92.74:6789,172.30.57.143:6789 --keyfile=***stripped*** map ocs-storagecluster-cephblockpool/csi-vol-a2807ae4-406d-11ec-93cb-0a580a810213 --device-type krbd --options noudev], rbd error output: rbd: sysfs write failed rbd: map failed: (110) Connection timed out
      Warning  FailedMount  55m (x24 over 11h)    kubelet  MountVolume.MountDevice failed for volume "pvc-0cb2311a-c412-46c6-afb8-a4355ea9574f" : rpc error: code = Internal desc = rbd: map failed with error an error (exit status 110) occurred while running rbd args: [--id csi-rbd-node -m 172.30.92.74:6789,172.30.57.143:6789,172.30.246.98:6789,172.30.118.99:6789,172.30.6.101:6789 --keyfile=***stripped*** map ocs-storagecluster-cephblockpool/csi-vol-a2807ae4-406d-11ec-93cb-0a580a810213 --device-type krbd --options noudev], rbd error output: rbd: sysfs write failed rbd: map failed: (110) Connection timed out
      Warning  FailedMount  46m (x23 over 11h)    kubelet  MountVolume.MountDevice failed for volume "pvc-0cb2311a-c412-46c6-afb8-a4355ea9574f" : rpc error: code = Internal desc = rbd: map failed with error an error (exit status 110) occurred while running rbd args: [--id csi-rbd-node -m 172.30.6.101:6789,172.30.92.74:6789,172.30.57.143:6789,172.30.246.98:6789,172.30.118.99:6789 --keyfile=***stripped*** map ocs-storagecluster-cephblockpool/csi-vol-a2807ae4-406d-11ec-93cb-0a580a810213 --device-type krbd --options noudev], rbd error output: rbd: sysfs write failed rbd: map failed: (110) Connection timed out
      Warning  FailedMount  15m (x27 over 11h)    kubelet  MountVolume.MountDevice failed for volume "pvc-0cb2311a-c412-46c6-afb8-a4355ea9574f" : rpc error: code = Internal desc = rbd: map failed with error an error (exit status 110) occurred while running rbd args: [--id csi-rbd-node -m 172.30.57.143:6789,172.30.246.98:6789,172.30.118.99:6789,172.30.6.101:6789,172.30.92.74:6789 --keyfile=***stripped*** map ocs-storagecluster-cephblockpool/csi-vol-a2807ae4-406d-11ec-93cb-0a580a810213 --device-type krbd --options noudev], rbd error output: rbd: sysfs write failed rbd: map failed: (110) Connection timed out
      Warning  FailedMount  6m15s (x66 over 10h)  kubelet  MountVolume.MountDevice failed for volume "pvc-0cb2311a-c412-46c6-afb8-a4355ea9574f" : rpc error: code = Internal desc = rbd: map failed with error an error (exit status 110) occurred while running rbd args: [--id csi-rbd-node -m 172.30.118.99:6789,172.30.6.101:6789,172.30.92.74:6789,172.30.57.143:6789,172.30.246.98:6789 --keyfile=***stripped*** map ocs-storagecluster-cephblockpool/csi-vol-a2807ae4-406d-11ec-93cb-0a580a810213 --device-type krbd --options noudev], rbd error output: rbd: sysfs write failed rbd: map failed: (110) Connection timed out
      Warning  FailedMount  43s (x355 over 11h)   kubelet  (combined from similar events): Unable to attach or mount volumes: unmounted volumes=[my-prometheus-claim], unattached volumes=[prometheus-trusted-ca-bundle tls-assets config secret-metrics-client-certs secret-kube-rbac-proxy prometheus-k8s-rulefiles-0 secret-prometheus-k8s-htpasswd secret-prometheus-k8s-proxy configmap-kubelet-serving-ca-bundle configmap-serving-certs-ca-bundle secret-prometheus-k8s-thanos-sidecar-tls config-out my-prometheus-claim web-config secret-kube-etcd-client-certs metrics-client-ca secret-grpc-tls kube-api-access-2nl5n secret-prometheus-k8s-tls]: timed out waiting for the condition
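Every map attempt fails with exit status 110 (ETIMEDOUT) while the krbd client tries to reach the five mon Service addresses, which is exactly where the dmesg requested above would help: the kernel normally logs libceph connect/timeout messages for this. A hedged sketch of collecting that, assuming cluster-admin access; the node name compute-0 is a placeholder for the node hosting prometheus-k8s-0:

```shell
# Pull kernel ring-buffer messages related to the Ceph kernel client from the node
oc debug node/compute-0 -- chroot /host dmesg | grep -iE 'libceph|rbd'

# Optional raw reachability check against one mon Service IP from the same node
# (assumes curl is present on the RHCOS host)
oc debug node/compute-0 -- chroot /host curl -v --connect-timeout 5 telnet://172.30.246.98:6789
```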
BTW, on 4.9 we didn't have any successful deployment of this combination with arbiter because of the other blockers we hit earlier:
https://ocs4-jenkins-csb-ocsqe.apps.ocp4.prod.psi.redhat.com/job/qe-trigger-vsphere-upi-2az-rhcos-lso-vmdk-arbiter-3m-6w-tier4b/3/

The last run where it worked was on 4.8, started 3 months and 15 days ago. We don't run this combination on a regular basis (mainly for the first y-stream versions such as 4.8.0 and 4.9.0), so we don't have many results for it.

Ilya/Sunny, please take a look.

Hello, any update here? We are still holding the resources and the cluster because of this. If no one replies here, we are going to destroy the cluster by tomorrow.

@muagarwa FYI

(In reply to Petr Balogh from comment #12)
> Hello,
>
> any update here? We are still holding the resources and the cluster because
> of this. If no one replies here, we are going to destroy the cluster by
> tomorrow.
>
> @muagarwa FYI

You can clean it up. The logging makes clear what's happened.

(In reply to Greg Farnum from comment #14)
> (In reply to Petr Balogh from comment #12)
> > Hello,
> >
> > any update here? We are still holding the resources and the cluster
> > because of this. If no one replies here, we are going to destroy the
> > cluster by tomorrow.
> >
> > @muagarwa FYI
>
> You can clean it up. The logging makes clear what's happened.

Thanks. Destroyed the cluster.

Running the verification job here:
https://ocs4-jenkins-csb-ocsqe.apps.ocp4.prod.psi.redhat.com/job/qe-trigger-vsphere-upi-2az-rhcos-lso-vmdk-arbiter-3m-6w-tier4b/13
https://ocs4-jenkins-csb-ocsqe.apps.ocp4.prod.psi.redhat.com/job/qe-deploy-ocs-cluster-prod/2335/console

I see the deployment passed in this job and it is now running the tier4b suite. As this issue blocked the deployment, which has now passed, I am marking this bug as verified.
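A minimal sketch of the checks behind that verification, assuming cluster-admin access on the redeployed arbiter cluster; the rook-ceph-tools deployment name is the usual ODF toolbox and is an assumption here:

```shell
# All monitoring pods should reach Running and the RBD-backed PVCs should stay Bound
oc -n openshift-monitoring get pod
oc -n openshift-monitoring get pvc

# All mons (including the arbiter) should be listed and the cluster healthy;
# requires the ceph toolbox to be enabled in openshift-storage
oc -n openshift-storage rsh deploy/rook-ceph-tools ceph mon dump
oc -n openshift-storage rsh deploy/rook-ceph-tools ceph status
```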