From /var/log/glusterfs/bricks/var-lib-heketi-mounts-vg_...-brick.log in the glusterfs-storage-* pods:

[2018-11-22 09:01:32.193888] I [MSGID: 115029] [server-handshake.c:564:server_setvolume] 0-heketidbstorage-server: accepted client from cnv-executor-qwang-node2.example.com-4328-2018/11/22-09:01:32:217547-vol_c41a1f522a82424fcbd7df73b20c8369-client-2-0-0 (version: 3.12.2) with subvol /var/lib/heketi/mounts/vg_12699ad6ca15895d52324387c77b83ad/brick_cede07f142076e244057a871bd17be8d/brick
pending frames:
frame : type(0) op(37)
frame : type(0) op(29)
frame : type(0) op(16)
patchset: git://git.gluster.org/glusterfs.git
signal received: 6
time of crash: 2018-11-22 09:01:32
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.12.2
/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0x9d)[0x7f6503715dfd]
/lib64/libglusterfs.so.0(gf_print_trace+0x334)[0x7f650371fec4]
/lib64/libc.so.6(+0x36280)[0x7f6501d75280]
/lib64/libc.so.6(gsignal+0x37)[0x7f6501d75207]
/lib64/libc.so.6(abort+0x148)[0x7f6501d768f8]
/lib64/libcrypto.so.10(+0x6da8f)[0x7f6502179a8f]
/lib64/libcrypto.so.10(MD5_Init+0x49)[0x7f6502180309]
/lib64/libcrypto.so.10(MD5+0x39)[0x7f6502180349]
/usr/lib64/glusterfs/3.12.2/xlator/storage/posix.so(+0x1d35b)[0x7f64fc56c35b]
/lib64/libglusterfs.so.0(default_rchecksum+0xd5)[0x7f6503796405]
/lib64/libglusterfs.so.0(default_rchecksum+0xd5)[0x7f6503796405]
/lib64/libglusterfs.so.0(default_rchecksum+0xd5)[0x7f6503796405]
/lib64/libglusterfs.so.0(default_rchecksum+0xd5)[0x7f6503796405]
/lib64/libglusterfs.so.0(default_rchecksum+0xd5)[0x7f6503796405]
/usr/lib64/glusterfs/3.12.2/xlator/features/locks.so(+0xd249)[0x7f64f6a6c249]
/lib64/libglusterfs.so.0(default_rchecksum+0xd5)[0x7f6503796405]
/lib64/libglusterfs.so.0(default_rchecksum+0xd5)[0x7f6503796405]
/lib64/libglusterfs.so.0(default_rchecksum+0xd5)[0x7f6503796405]
/lib64/libglusterfs.so.0(default_rchecksum+0xd5)[0x7f6503796405]
/lib64/libglusterfs.so.0(default_rchecksum_resume+0x1e3)[0x7f65037b37e3]
/lib64/libglusterfs.so.0(call_resume+0x75)[0x7f650373a865]
/usr/lib64/glusterfs/3.12.2/xlator/performance/io-threads.so(+0x4f98)[0x7f64f6010f98]
/lib64/libpthread.so.0(+0x7dd5)[0x7f6502574dd5]
/lib64/libc.so.6(clone+0x6d)[0x7f6501e3cead]
---------

This is a crash (signal 6, SIGABRT) of the brick process inside libcrypto.so while calling MD5-related functions. OCS is not FIPS tolerant (yet); only recently did RHGS-3.4 replace non-FIPS-approved hashing algorithms to prevent this kind of crash.

And indeed, fips=1 is set on the kernel command line:

sh-4.2# cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-3.10.0-957.el7.x86_64 root=UUID=1243690d-6b5d-4068-8296-f50864f430d7 ro console=tty0 console=ttyS0,115200n8 no_timer_check net.ifnames=0 crashkernel=auto LANG=en_US.UTF-8 fips=1
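For reference, a quick way to check whether a node booted with FIPS mode enabled (a generic RHEL 7 check, not specific to this environment; run it on each node). A value of 1 from the first command, or fips=1 on the kernel command line, means FIPS mode is on:

sh-4.2# cat /proc/sys/crypto/fips_enabled
sh-4.2# grep -o 'fips=[01]' /proc/cmdline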
Also note that the Heketi pod crashes because the Gluster volume for the Heketi database cannot be mounted. Mounting fails because the bricks on the server side are unavailable (due to the crash described above).
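A quick way to confirm that the bricks of the heketidbstorage volume are down is to query the volume status from one of the glusterfs-storage-* pods (a sketch only; substitute an actual pod name and, if needed, the namespace the storage pods run in):

oc rsh <glusterfs-storage-pod> gluster volume status heketidbstorage

Bricks whose process has died show up with Online 'N' and no PID in that output.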
Hi there,

Any update on this? We hit this issue again when testing CNV 1.4.
(In reply to Qixuan Wang from comment #15)
> Any update on this? We hit this issue again when testing CNV 1.4.

The last time, this problem was fixed with an update to the container runtime (runc, bug 1650512). Could you verify that you can 'oc exec' into some pods that are unrelated to Heketi/Gluster? Also, please provide details of the environment (versions of OCP components, OCS components, and logs). Thanks!
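For example, something along these lines would cover the basics (a sketch; <namespace>, <heketi-pod> and <glusterfs-pod> are placeholders for the storage project and its pods):

oc version
oc get pods -n <namespace> -o wide
oc describe pod -n <namespace> <heketi-pod>
oc exec -n <namespace> <glusterfs-pod> -- rpm -qa 'glusterfs*'
oc exec -n <namespace> <glusterfs-pod> -- cat /var/log/glusterfs/glusterd.log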
Yes, I can 'oc exec' into some pods that are unrelated to Heketi/Gluster.

[root@cnv-executor-cdn-stage-master-b83726-1 ~]# oc exec -it cdi-apiserver-7596c74489-pfbsw bash
bash-4.2$

OCS images
registry.access.stage.redhat.com/rhgs3/rhgs-server-rhel7                           v3.11.1   d075bf120994   4 months ago   304 MB
registry.access.stage.redhat.com/rhgs3/rhgs-volmanager-rhel7                       v3.11.1   f4a8b6113476   4 months ago   287 MB
registry.access.stage.redhat.com/rhgs3/rhgs-gluster-block-prov-rhel7               v3.11.1   e74761279746   4 months ago   971 MB

OCP images
docker.io/openshift/origin-node                                                    v3.11     304c69ee04c3   3 days ago     1.19 GB
registry.access.redhat.com/openshift3/node                                         v3.11     be8a09b5514c   3 weeks ago    1.98 GB
registry.access.stage.redhat.com/openshift3/ose-node                               v3.11     be8a09b5514c   3 weeks ago    1.98 GB
registry.access.stage.redhat.com/openshift3/ose-deployer                           v3.11     1500740029de   3 weeks ago    1.17 GB
registry.access.stage.redhat.com/openshift3/ose-pod                                v3.11     6759d8752074   3 weeks ago    1.04 GB
registry.access.stage.redhat.com/openshift3/ose-kube-rbac-proxy                    v3.11     cdfa9d0da060   3 weeks ago    1.07 GB
registry.access.stage.redhat.com/openshift3/local-storage-provisioner              v3.11     d6b3fc9da546   3 weeks ago    1.04 GB
registry.access.stage.redhat.com/openshift3/metrics-hawkular-metrics               v3.11     e10d429e8b10   3 weeks ago    1.68 GB
registry.access.stage.redhat.com/openshift3/metrics-schema-installer               v3.11     3c29bd72cc69   3 weeks ago    503 MB
registry.access.stage.redhat.com/openshift3/prometheus-node-exporter               v3.11     0f508556d522   3 weeks ago    1.03 GB

CNV images
registry.access.stage.redhat.com/cnv-tech-preview/kubevirt-web-ui-operator         v1.4.0    ec1d7c948d17   12 days ago    1.29 GB
registry.access.stage.redhat.com/cnv-tech-preview/kubevirt-web-ui                  v1.4.0    7fec13c8b83a   2 weeks ago    1.06 GB
registry.access.stage.redhat.com/cnv-tech-preview/virt-handler                     v1.4.0    9a86bcaca215   3 weeks ago    272 MB
registry.access.stage.redhat.com/cnv-tech-preview/virt-controller                  v1.4.0    2bd18da270ea   3 weeks ago    255 MB
registry.access.stage.redhat.com/cnv-tech-preview/virt-api                         v1.4.0    fdd076773175   3 weeks ago    255 MB
registry.access.stage.redhat.com/cnv-tech-preview/virt-launcher                    v1.4.0    6bb754142817   3 weeks ago    493 MB
registry.access.stage.redhat.com/cnv-tech-preview/virt-operator                    v1.4.0    7e1e32ab9f2f   3 weeks ago    253 MB
registry.access.stage.redhat.com/cnv-tech-preview/cnv-libvirt                      latest    6ce7e5abc16a   3 weeks ago    426 MB
registry.access.stage.redhat.com/cnv-tech-preview/virt-cdi-importer                v1.4.0    84afc9a1ed07   4 weeks ago    313 MB
registry.access.stage.redhat.com/cnv-tech-preview/virt-cdi-controller              v1.4.0    eaa565b518f8   6 weeks ago    258 MB
registry.access.stage.redhat.com/cnv-tech-preview/multus-cni                       v1.4.0    dc0ec22bb21a   6 weeks ago    250 MB
registry.access.stage.redhat.com/cnv-tech-preview/virt-cdi-apiserver               v1.4.0    eba41663b815   6 weeks ago    258 MB
registry.access.stage.redhat.com/cnv-tech-preview/virt-cdi-uploadproxy             v1.4.0    bf55d1556bcc   6 weeks ago    258 MB
registry.access.stage.redhat.com/cnv-tech-preview/kubevirt-cpu-model-nfd-plugin    v1.4.0    824888646787   6 weeks ago    216 MB
registry.access.stage.redhat.com/cnv-tech-preview/kubevirt-cpu-node-labeller       v1.4.0    4b6f3cae56bc   6 weeks ago    247 MB
registry.access.stage.redhat.com/cnv-tech-preview/virt-cdi-operator                v1.4.0    4428190b92b5   6 weeks ago    259 MB
registry.access.stage.redhat.com/cnv-tech-preview/ovs-cni-plugin                   v1.4.0    b9ab2f8bbe05   6 weeks ago    218 MB

[root@cnv-executor-cdn-stage-master-b83726-1 ~]# oc get pod
NAME                                          READY     STATUS              RESTARTS   AGE
glusterblock-storage-provisioner-dc-1-zk755   1/1       Running             1          3d
glusterfs-storage-7c744                       1/1       Running             1          3d
glusterfs-storage-jznzc                       1/1       Running             1          3d
glusterfs-storage-z4w5n                       1/1       Running             1          3d
heketi-storage-1-deploy                       1/1       Running             0          2m
heketi-storage-1-xxk24                        0/1       ContainerCreating   0          2m

[root@cnv-executor-cdn-stage-master-b83726-1 ~]# oc describe pod heketi-storage-1-xxk24
Name:               heketi-storage-1-xxk24
Namespace:          glusterfs
Priority:           0
PriorityClassName:  <none>
Node:               cnv-executor-cdn-stage-master-b83726-1.example.com/172.16.0.26
Start Time:         Mon, 17 Jun 2019 23:46:23 -0400
Labels:             deployment=heketi-storage-1
                    deploymentconfig=heketi-storage
                    glusterfs=heketi-storage-pod
                    heketi=storage-pod
Annotations:        openshift.io/deployment-config.latest-version=1
                    openshift.io/deployment-config.name=heketi-storage
                    openshift.io/deployment.name=heketi-storage-1
                    openshift.io/scc=privileged
Status:             Pending
IP:
Controlled By:      ReplicationController/heketi-storage-1
Containers:
  heketi:
    Container ID:
    Image:          registry.access.stage.redhat.com/rhgs3/rhgs-volmanager-rhel7:v3.11.1
    Image ID:
    Port:           8080/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Liveness:       http-get http://:8080/hello delay=30s timeout=3s period=10s #success=1 #failure=3
    Readiness:      http-get http://:8080/hello delay=3s timeout=3s period=10s #success=1 #failure=3
    Environment:
      HEKETI_USER_KEY:                 5zziUMvTNCdV42jfNL6u7LEfnKSiejXo1MzUCNxfBhc=
      HEKETI_ADMIN_KEY:                ld/mwMgEiBaonPiEFscaUPb9pTSE62WLYbRdaCdIvDI=
      HEKETI_CLI_USER:                 admin
      HEKETI_CLI_KEY:                  ld/mwMgEiBaonPiEFscaUPb9pTSE62WLYbRdaCdIvDI=
      HEKETI_EXECUTOR:                 kubernetes
      HEKETI_FSTAB:                    /var/lib/heketi/fstab
      HEKETI_SNAPSHOT_LIMIT:           14
      HEKETI_KUBE_GLUSTER_DAEMONSET:   1
      HEKETI_IGNORE_STALE_OPERATIONS:  true
      HEKETI_DEBUG_UMOUNT_FAILURES:    true
    Mounts:
      /etc/heketi from config (rw)
      /var/lib/heketi from db (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from heketi-storage-service-account-token-6mqj9 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  db:
    Type:           Glusterfs (a Glusterfs mount on the host that shares a pod's lifetime)
    EndpointsName:  heketi-db-storage-endpoints
    Path:           heketidbstorage
    ReadOnly:       false
  config:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  heketi-storage-config-secret
    Optional:    false
  heketi-storage-service-account-token-6mqj9:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  heketi-storage-service-account-token-6mqj9
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     <none>
Events:
  Type     Reason       Age  From                                                          Message
  ----     ------       ---- ----                                                          -------
  Normal   Scheduled    4m   default-scheduler                                             Successfully assigned glusterfs/heketi-storage-1-xxk24 to cnv-executor-cdn-stage-master-b83726-1.example.com
  Warning  FailedMount  4m   kubelet, cnv-executor-cdn-stage-master-b83726-1.example.com   MountVolume.SetUp failed for volume "db" : mount failed: mount failed: exit status 1
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/origin/openshift.local.volumes/pods/a400fc0f-917b-11e9-a065-fa163e3f7da1/volumes/kubernetes.io~glusterfs/db --scope -- mount -t glusterfs -o log-level=ERROR,log-file=/var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/glusterfs/db/heketi-storage-1-xxk24-glusterfs.log,backup-volfile-servers=172.16.0.15:172.16.0.17:172.16.0.26,auto_unmount 172.16.0.15:heketidbstorage /var/lib/origin/openshift.local.volumes/pods/a400fc0f-917b-11e9-a065-fa163e3f7da1/volumes/kubernetes.io~glusterfs/db
Output: Running scope as unit run-109421.scope.
Mount failed. Please check the log file for more details.
the following error information was pulled from the glusterfs log to help diagnose this issue:
[2019-06-18 03:46:23.738413] E [fuse-bridge.c:900:fuse_getattr_resume] 0-glusterfs-fuse: 3: GETATTR 1 (00000000-0000-0000-0000-000000000001) resolution failed
The message "E [MSGID: 108006] [afr-common.c:4944:__afr_handle_child_down_event] 0-heketidbstorage-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up." repeated 2 times between [2019-06-18 03:46:23.725457] and [2019-06-18 03:46:23.728806]

  Warning  FailedMount  4m   kubelet, cnv-executor-cdn-stage-master-b83726-1.example.com   MountVolume.SetUp failed for volume "db" : mount failed: mount failed: exit status 1
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/origin/openshift.local.volumes/pods/a400fc0f-917b-11e9-a065-fa163e3f7da1/volumes/kubernetes.io~glusterfs/db --scope -- mount -t glusterfs -o log-level=ERROR,log-file=/var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/glusterfs/db/heketi-storage-1-xxk24-glusterfs.log,backup-volfile-servers=172.16.0.15:172.16.0.17:172.16.0.26,auto_unmount 172.16.0.15:heketidbstorage /var/lib/origin/openshift.local.volumes/pods/a400fc0f-917b-11e9-a065-fa163e3f7da1/volumes/kubernetes.io~glusterfs/db
Output: Running scope as unit run-109596.scope.
Mount failed. Please check the log file for more details.
the following error information was pulled from the glusterfs log to help diagnose this issue:
[2019-06-18 03:46:24.671280] E [fuse-bridge.c:900:fuse_getattr_resume] 0-glusterfs-fuse: 3: GETATTR 1 (00000000-0000-0000-0000-000000000001) resolution failed
The message "E [MSGID: 108006] [afr-common.c:4944:__afr_handle_child_down_event] 0-heketidbstorage-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up." repeated 2 times between [2019-06-18 03:46:24.657693] and [2019-06-18 03:46:24.664253]

  Warning  FailedMount  4m   kubelet, cnv-executor-cdn-stage-master-b83726-1.example.com   MountVolume.SetUp failed for volume "db" : mount failed: mount failed: exit status 1
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/origin/openshift.local.volumes/pods/a400fc0f-917b-11e9-a065-fa163e3f7da1/volumes/kubernetes.io~glusterfs/db --scope -- mount -t glusterfs -o log-level=ERROR,log-file=/var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/glusterfs/db/heketi-storage-1-xxk24-glusterfs.log,backup-volfile-servers=172.16.0.15:172.16.0.17:172.16.0.26,auto_unmount 172.16.0.15:heketidbstorage /var/lib/origin/openshift.local.volumes/pods/a400fc0f-917b-11e9-a065-fa163e3f7da1/volumes/kubernetes.io~glusterfs/db
Output: Running scope as unit run-109777.scope.
Mount failed. Please check the log file for more details.
the following error information was pulled from the glusterfs log to help diagnose this issue:
[2019-06-18 03:46:26.374291] E [fuse-bridge.c:900:fuse_getattr_resume] 0-glusterfs-fuse: 3: GETATTR 1 (00000000-0000-0000-0000-000000000001) resolution failed
The message "E [MSGID: 108006] [afr-common.c:4944:__afr_handle_child_down_event] 0-heketidbstorage-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up." repeated 2 times between [2019-06-18 03:46:26.364894] and [2019-06-18 03:46:26.368786]

  Warning  FailedMount  4m   kubelet, cnv-executor-cdn-stage-master-b83726-1.example.com   MountVolume.SetUp failed for volume "db" : mount failed: mount failed: exit status 1
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/origin/openshift.local.volumes/pods/a400fc0f-917b-11e9-a065-fa163e3f7da1/volumes/kubernetes.io~glusterfs/db --scope -- mount -t glusterfs -o log-level=ERROR,log-file=/var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/glusterfs/db/heketi-storage-1-xxk24-glusterfs.log,backup-volfile-servers=172.16.0.15:172.16.0.17:172.16.0.26,auto_unmount 172.16.0.15:heketidbstorage /var/lib/origin/openshift.local.volumes/pods/a400fc0f-917b-11e9-a065-fa163e3f7da1/volumes/kubernetes.io~glusterfs/db
Output: Running scope as unit run-109904.scope.
Mount failed. Please check the log file for more details.
the following error information was pulled from the glusterfs log to help diagnose this issue:
[2019-06-18 03:46:28.787553] E [fuse-bridge.c:900:fuse_getattr_resume] 0-glusterfs-fuse: 3: GETATTR 1 (00000000-0000-0000-0000-000000000001) resolution failed
The message "E [MSGID: 108006] [afr-common.c:4944:__afr_handle_child_down_event] 0-heketidbstorage-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up." repeated 2 times between [2019-06-18 03:46:28.777526] and [2019-06-18 03:46:28.781571]

  Warning  FailedMount  3m   kubelet, cnv-executor-cdn-stage-master-b83726-1.example.com   MountVolume.SetUp failed for volume "db" : mount failed: mount failed: exit status 1
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/origin/openshift.local.volumes/pods/a400fc0f-917b-11e9-a065-fa163e3f7da1/volumes/kubernetes.io~glusterfs/db --scope -- mount -t glusterfs -o log-level=ERROR,log-file=/var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/glusterfs/db/heketi-storage-1-xxk24-glusterfs.log,backup-volfile-servers=172.16.0.15:172.16.0.17:172.16.0.26,auto_unmount 172.16.0.15:heketidbstorage /var/lib/origin/openshift.local.volumes/pods/a400fc0f-917b-11e9-a065-fa163e3f7da1/volumes/kubernetes.io~glusterfs/db
Output: Running scope as unit run-110112.scope.
Mount failed. Please check the log file for more details.
the following error information was pulled from the glusterfs log to help diagnose this issue:
[2019-06-18 03:46:32.963394] E [fuse-bridge.c:900:fuse_getattr_resume] 0-glusterfs-fuse: 3: GETATTR 1 (00000000-0000-0000-0000-000000000001) resolution failed
The message "E [MSGID: 108006] [afr-common.c:4944:__afr_handle_child_down_event] 0-heketidbstorage-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up." repeated 2 times between [2019-06-18 03:46:32.954223] and [2019-06-18 03:46:32.958289]

  Warning  FailedMount  3m   kubelet, cnv-executor-cdn-stage-master-b83726-1.example.com   MountVolume.SetUp failed for volume "db" : mount failed: mount failed: exit status 1
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/origin/openshift.local.volumes/pods/a400fc0f-917b-11e9-a065-fa163e3f7da1/volumes/kubernetes.io~glusterfs/db --scope -- mount -t glusterfs -o log-level=ERROR,log-file=/var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/glusterfs/db/heketi-storage-1-xxk24-glusterfs.log,backup-volfile-servers=172.16.0.15:172.16.0.17:172.16.0.26,auto_unmount 172.16.0.15:heketidbstorage /var/lib/origin/openshift.local.volumes/pods/a400fc0f-917b-11e9-a065-fa163e3f7da1/volumes/kubernetes.io~glusterfs/db
Output: Running scope as unit run-110449.scope.
Mount failed. Please check the log file for more details.
the following error information was pulled from the glusterfs log to help diagnose this issue:
[2019-06-18 03:46:41.173776] E [fuse-bridge.c:900:fuse_getattr_resume] 0-glusterfs-fuse: 3: GETATTR 1 (00000000-0000-0000-0000-000000000001) resolution failed
The message "E [MSGID: 108006] [afr-common.c:4944:__afr_handle_child_down_event] 0-heketidbstorage-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up." repeated 2 times between [2019-06-18 03:46:41.163604] and [2019-06-18 03:46:41.168343]

  Warning  FailedMount  3m   kubelet, cnv-executor-cdn-stage-master-b83726-1.example.com   MountVolume.SetUp failed for volume "db" : mount failed: mount failed: exit status 1
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/origin/openshift.local.volumes/pods/a400fc0f-917b-11e9-a065-fa163e3f7da1/volumes/kubernetes.io~glusterfs/db --scope -- mount -t glusterfs -o log-level=ERROR,log-file=/var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/glusterfs/db/heketi-storage-1-xxk24-glusterfs.log,backup-volfile-servers=172.16.0.15:172.16.0.17:172.16.0.26,auto_unmount 172.16.0.15:heketidbstorage /var/lib/origin/openshift.local.volumes/pods/a400fc0f-917b-11e9-a065-fa163e3f7da1/volumes/kubernetes.io~glusterfs/db
Output: Running scope as unit run-110878.scope.
Mount failed. Please check the log file for more details.
the following error information was pulled from the glusterfs log to help diagnose this issue:
[2019-06-18 03:46:57.571960] E [fuse-bridge.c:900:fuse_getattr_resume] 0-glusterfs-fuse: 3: GETATTR 1 (00000000-0000-0000-0000-000000000001) resolution failed
The message "E [MSGID: 108006] [afr-common.c:4944:__afr_handle_child_down_event] 0-heketidbstorage-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up." repeated 2 times between [2019-06-18 03:46:57.562753] and [2019-06-18 03:46:57.566529]

  Warning  FailedMount  3m   kubelet, cnv-executor-cdn-stage-master-b83726-1.example.com   MountVolume.SetUp failed for volume "db" : mount failed: mount failed: exit status 1
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/origin/openshift.local.volumes/pods/a400fc0f-917b-11e9-a065-fa163e3f7da1/volumes/kubernetes.io~glusterfs/db --scope -- mount -t glusterfs -o auto_unmount,log-level=ERROR,log-file=/var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/glusterfs/db/heketi-storage-1-xxk24-glusterfs.log,backup-volfile-servers=172.16.0.15:172.16.0.17:172.16.0.26 172.16.0.15:heketidbstorage /var/lib/origin/openshift.local.volumes/pods/a400fc0f-917b-11e9-a065-fa163e3f7da1/volumes/kubernetes.io~glusterfs/db
Output: Running scope as unit run-111895.scope.
Mount failed. Please check the log file for more details.
the following error information was pulled from the glusterfs log to help diagnose this issue:
[2019-06-18 03:47:29.787311] E [fuse-bridge.c:900:fuse_getattr_resume] 0-glusterfs-fuse: 3: GETATTR 1 (00000000-0000-0000-0000-000000000001) resolution failed
The message "E [MSGID: 108006] [afr-common.c:4944:__afr_handle_child_down_event] 0-heketidbstorage-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up." repeated 2 times between [2019-06-18 03:47:29.775438] and [2019-06-18 03:47:29.781864]

  Warning  FailedMount  2m   kubelet, cnv-executor-cdn-stage-master-b83726-1.example.com   Unable to mount volumes for pod "heketi-storage-1-xxk24_glusterfs(a400fc0f-917b-11e9-a065-fa163e3f7da1)": timeout expired waiting for volumes to attach or mount for pod "glusterfs"/"heketi-storage-1-xxk24". list of unmounted volumes=[db]. list of unattached volumes=[db config heketi-storage-service-account-token-6mqj9]

  Warning  FailedMount  1m   kubelet, cnv-executor-cdn-stage-master-b83726-1.example.com   (combined from similar events): MountVolume.SetUp failed for volume "db" : mount failed: mount failed: exit status 1
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/origin/openshift.local.volumes/pods/a400fc0f-917b-11e9-a065-fa163e3f7da1/volumes/kubernetes.io~glusterfs/db --scope -- mount -t glusterfs -o backup-volfile-servers=172.16.0.15:172.16.0.17:172.16.0.26,auto_unmount,log-level=ERROR,log-file=/var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/glusterfs/db/heketi-storage-1-xxk24-glusterfs.log 172.16.0.15:heketidbstorage /var/lib/origin/openshift.local.volumes/pods/a400fc0f-917b-11e9-a065-fa163e3f7da1/volumes/kubernetes.io~glusterfs/db
Output: Running scope as unit run-113468.scope.
Mount failed. Please check the log file for more details.
the following error information was pulled from the glusterfs log to help diagnose this issue:
[2019-06-18 03:48:33.990581] E [fuse-bridge.c:900:fuse_getattr_resume] 0-glusterfs-fuse: 3: GETATTR 1 (00000000-0000-0000-0000-000000000001) resolution failed
The message "E [MSGID: 108006] [afr-common.c:4944:__afr_handle_child_down_event] 0-heketidbstorage-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up." repeated 2 times between [2019-06-18 03:48:33.972037] and [2019-06-18 03:48:33.979536]
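Since the FUSE client reports all subvolumes of heketidbstorage as down, the next thing to check is whether the brick processes crashed again, as in the original report. A sketch of how to check (the client log path is taken from the mount arguments above, the brick log path from the earlier brick-log excerpt; the pod name is a placeholder):

# on the node where the mount failed:
tail -n 50 /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/glusterfs/db/heketi-storage-1-xxk24-glusterfs.log

# inside one of the glusterfs-storage-* pods:
oc exec -n glusterfs <glusterfs-storage-pod> -- sh -c 'grep -A3 "signal received" /var/log/glusterfs/bricks/*.log'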