Bug 1652546
| Summary: | Heketi pod crashed with error "Transport endpoint is not connected" and "All subvolumes are down" on FIPS enabled systems | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Qixuan Wang <qixuan.wang> |
| Component: | heketi | Assignee: | Niels de Vos <ndevos> |
| Status: | CLOSED WONTFIX | QA Contact: | Qixuan Wang <qixuan.wang> |
| Severity: | high | Docs Contact: | |
| Priority: | medium | | |
| Version: | ocs-3.11 | CC: | cnv-qe-bugs, deparker, fdeutsch, hchiramm, jmulligan, khiremat, kramdoss, lbednar, madam, ncredi, ndevos, pkundra, puebele, ravishankar, rhs-bugs, storage-qa-internal |
| Target Milestone: | --- | Keywords: | ZStream |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| URL: | https://www.redhat.com/en/about/press-releases/red-hat-continues-drive-more-secure-enterprise-it-re-certifies-red-hat-enterprise-linux-fips-140-2 | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2019-09-23 17:57:53 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1459709, 1650512 | | |
| Bug Blocks: | | | |
Comment 2
Niels de Vos
2018-11-22 15:12:44 UTC
Also note that the "Heketi pod" crashes because the Gluster volume for the heketi database cannot be mounted. Mounting fails because the bricks on the server side are unavailable (due to the segfault).

Hi there, any update on this? We hit this issue again when testing CNV 1.4.

(In reply to Qixuan Wang from comment #15)
> Any update on this? We hit this issue again when testing CNV 1.4.

The last time, this problem was fixed with an update to the container runtime (runc, bug 1650512). Could you verify that you can 'oc exec' into some pods that are unrelated to Heketi/Gluster? Also, please provide details of the environment (versions of the OCP components, OCS components, and logs). Thanks!

Yes, I can 'oc exec' into some pods that are unrelated to Heketi/Gluster.
[root@cnv-executor-cdn-stage-master-b83726-1 ~]# oc exec -it cdi-apiserver-7596c74489-pfbsw bash
bash-4.2$
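Since the mount fails because the server-side bricks are reported down, the brick state of the heketidbstorage volume can also be checked from inside one of the glusterfs-storage pods. A rough sketch only (pod name taken from the 'oc get pod' output further down; the brick log path is assumed to be the standard Gluster location inside the rhgs-server image):

# Are the bricks for the heketi DB volume online?
oc -n glusterfs exec glusterfs-storage-7c744 -- gluster volume status heketidbstorage
oc -n glusterfs exec glusterfs-storage-7c744 -- gluster volume info heketidbstorage

# Any recent brick crashes visible in the brick logs? (assumed default log location)
oc -n glusterfs exec glusterfs-storage-7c744 -- sh -c 'tail -n 50 /var/log/glusterfs/bricks/*.log'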
OCS images
registry.access.stage.redhat.com/rhgs3/rhgs-server-rhel7 v3.11.1 d075bf120994 4 months ago 304 MB
registry.access.stage.redhat.com/rhgs3/rhgs-volmanager-rhel7 v3.11.1 f4a8b6113476 4 months ago 287 MB
registry.access.stage.redhat.com/rhgs3/rhgs-gluster-block-prov-rhel7 v3.11.1 e74761279746 4 months ago 971 MB
OCP images
docker.io/openshift/origin-node v3.11 304c69ee04c3 3 days ago 1.19 GB
registry.access.redhat.com/openshift3/node v3.11 be8a09b5514c 3 weeks ago 1.98 GB
registry.access.stage.redhat.com/openshift3/ose-node v3.11 be8a09b5514c 3 weeks ago 1.98 GB
registry.access.stage.redhat.com/openshift3/ose-deployer v3.11 1500740029de 3 weeks ago 1.17 GB
registry.access.stage.redhat.com/openshift3/ose-pod v3.11 6759d8752074 3 weeks ago 1.04 GB
registry.access.stage.redhat.com/openshift3/ose-kube-rbac-proxy v3.11 cdfa9d0da060 3 weeks ago 1.07 GB
registry.access.stage.redhat.com/openshift3/local-storage-provisioner v3.11 d6b3fc9da546 3 weeks ago 1.04 GB
registry.access.stage.redhat.com/openshift3/metrics-hawkular-metrics v3.11 e10d429e8b10 3 weeks ago 1.68 GB
registry.access.stage.redhat.com/openshift3/metrics-schema-installer v3.11 3c29bd72cc69 3 weeks ago 503 MB
registry.access.stage.redhat.com/openshift3/prometheus-node-exporter v3.11 0f508556d522 3 weeks ago 1.03 GB
CNV images
registry.access.stage.redhat.com/cnv-tech-preview/kubevirt-web-ui-operator v1.4.0 ec1d7c948d17 12 days ago 1.29 GB
registry.access.stage.redhat.com/cnv-tech-preview/kubevirt-web-ui v1.4.0 7fec13c8b83a 2 weeks ago 1.06 GB
registry.access.stage.redhat.com/cnv-tech-preview/virt-handler v1.4.0 9a86bcaca215 3 weeks ago 272 MB
registry.access.stage.redhat.com/cnv-tech-preview/virt-controller v1.4.0 2bd18da270ea 3 weeks ago 255 MB
registry.access.stage.redhat.com/cnv-tech-preview/virt-api v1.4.0 fdd076773175 3 weeks ago 255 MB
registry.access.stage.redhat.com/cnv-tech-preview/virt-launcher v1.4.0 6bb754142817 3 weeks ago 493 MB
registry.access.stage.redhat.com/cnv-tech-preview/virt-operator v1.4.0 7e1e32ab9f2f 3 weeks ago 253 MB
registry.access.stage.redhat.com/cnv-tech-preview/cnv-libvirt latest 6ce7e5abc16a 3 weeks ago 426 MB
registry.access.stage.redhat.com/cnv-tech-preview/virt-cdi-importer v1.4.0 84afc9a1ed07 4 weeks ago 313 MB
registry.access.stage.redhat.com/cnv-tech-preview/virt-cdi-controller v1.4.0 eaa565b518f8 6 weeks ago 258 MB
registry.access.stage.redhat.com/cnv-tech-preview/multus-cni v1.4.0 dc0ec22bb21a 6 weeks ago 250 MB
registry.access.stage.redhat.com/cnv-tech-preview/virt-cdi-apiserver v1.4.0 eba41663b815 6 weeks ago 258 MB
registry.access.stage.redhat.com/cnv-tech-preview/virt-cdi-uploadproxy v1.4.0 bf55d1556bcc 6 weeks ago 258 MB
registry.access.stage.redhat.com/cnv-tech-preview/kubevirt-cpu-model-nfd-plugin v1.4.0 824888646787 6 weeks ago 216 MB
registry.access.stage.redhat.com/cnv-tech-preview/kubevirt-cpu-node-labeller v1.4.0 4b6f3cae56bc 6 weeks ago 247 MB
registry.access.stage.redhat.com/cnv-tech-preview/virt-cdi-operator v1.4.0 4428190b92b5 6 weeks ago 259 MB
registry.access.stage.redhat.com/cnv-tech-preview/ovs-cni-plugin v1.4.0 b9ab2f8bbe05 6 weeks ago 218 MB
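Because the previous occurrence was resolved by a container runtime update (runc, bug 1650512), the runtime versions on the affected node are probably worth capturing as well. A minimal sketch, assuming package names of a standard OCP 3.11 node on RHEL 7:

# OpenShift client/server versions
oc version

# Container runtime packages installed on the node
rpm -qa 'runc*' 'docker*' 'cri-o*'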
[root@cnv-executor-cdn-stage-master-b83726-1 ~]# oc get pod
NAME READY STATUS RESTARTS AGE
glusterblock-storage-provisioner-dc-1-zk755 1/1 Running 1 3d
glusterfs-storage-7c744 1/1 Running 1 3d
glusterfs-storage-jznzc 1/1 Running 1 3d
glusterfs-storage-z4w5n 1/1 Running 1 3d
heketi-storage-1-deploy 1/1 Running 0 2m
heketi-storage-1-xxk24 0/1 ContainerCreating 0 2m
[root@cnv-executor-cdn-stage-master-b83726-1 ~]# oc describe pod heketi-storage-1-xxk24
Name: heketi-storage-1-xxk24
Namespace: glusterfs
Priority: 0
PriorityClassName: <none>
Node: cnv-executor-cdn-stage-master-b83726-1.example.com/172.16.0.26
Start Time: Mon, 17 Jun 2019 23:46:23 -0400
Labels: deployment=heketi-storage-1
deploymentconfig=heketi-storage
glusterfs=heketi-storage-pod
heketi=storage-pod
Annotations: openshift.io/deployment-config.latest-version=1
openshift.io/deployment-config.name=heketi-storage
openshift.io/deployment.name=heketi-storage-1
openshift.io/scc=privileged
Status: Pending
IP:
Controlled By: ReplicationController/heketi-storage-1
Containers:
heketi:
Container ID:
Image: registry.access.stage.redhat.com/rhgs3/rhgs-volmanager-rhel7:v3.11.1
Image ID:
Port: 8080/TCP
Host Port: 0/TCP
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Liveness: http-get http://:8080/hello delay=30s timeout=3s period=10s #success=1 #failure=3
Readiness: http-get http://:8080/hello delay=3s timeout=3s period=10s #success=1 #failure=3
Environment:
HEKETI_USER_KEY: 5zziUMvTNCdV42jfNL6u7LEfnKSiejXo1MzUCNxfBhc=
HEKETI_ADMIN_KEY: ld/mwMgEiBaonPiEFscaUPb9pTSE62WLYbRdaCdIvDI=
HEKETI_CLI_USER: admin
HEKETI_CLI_KEY: ld/mwMgEiBaonPiEFscaUPb9pTSE62WLYbRdaCdIvDI=
HEKETI_EXECUTOR: kubernetes
HEKETI_FSTAB: /var/lib/heketi/fstab
HEKETI_SNAPSHOT_LIMIT: 14
HEKETI_KUBE_GLUSTER_DAEMONSET: 1
HEKETI_IGNORE_STALE_OPERATIONS: true
HEKETI_DEBUG_UMOUNT_FAILURES: true
Mounts:
/etc/heketi from config (rw)
/var/lib/heketi from db (rw)
/var/run/secrets/kubernetes.io/serviceaccount from heketi-storage-service-account-token-6mqj9 (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
db:
Type: Glusterfs (a Glusterfs mount on the host that shares a pod's lifetime)
EndpointsName: heketi-db-storage-endpoints
Path: heketidbstorage
ReadOnly: false
config:
Type: Secret (a volume populated by a Secret)
SecretName: heketi-storage-config-secret
Optional: false
heketi-storage-service-account-token-6mqj9:
Type: Secret (a volume populated by a Secret)
SecretName: heketi-storage-service-account-token-6mqj9
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: <none>
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 4m default-scheduler Successfully assigned glusterfs/heketi-storage-1-xxk24 to cnv-executor-cdn-stage-master-b83726-1.example.com
Warning FailedMount 4m kubelet, cnv-executor-cdn-stage-master-b83726-1.example.com MountVolume.SetUp failed for volume "db" : mount failed: mount failed: exit status 1
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/origin/openshift.local.volumes/pods/a400fc0f-917b-11e9-a065-fa163e3f7da1/volumes/kubernetes.io~glusterfs/db --scope -- mount -t glusterfs -o log-level=ERROR,log-file=/var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/glusterfs/db/heketi-storage-1-xxk24-glusterfs.log,backup-volfile-servers=172.16.0.15:172.16.0.17:172.16.0.26,auto_unmount 172.16.0.15:heketidbstorage /var/lib/origin/openshift.local.volumes/pods/a400fc0f-917b-11e9-a065-fa163e3f7da1/volumes/kubernetes.io~glusterfs/db
Output: Running scope as unit run-109421.scope.
Mount failed. Please check the log file for more details.
the following error information was pulled from the glusterfs log to help diagnose this issue:
[2019-06-18 03:46:23.738413] E [fuse-bridge.c:900:fuse_getattr_resume] 0-glusterfs-fuse: 3: GETATTR 1 (00000000-0000-0000-0000-000000000001) resolution failed
The message "E [MSGID: 108006] [afr-common.c:4944:__afr_handle_child_down_event] 0-heketidbstorage-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up." repeated 2 times between [2019-06-18 03:46:23.725457] and [2019-06-18 03:46:23.728806]
Warning FailedMount 4m kubelet, cnv-executor-cdn-stage-master-b83726-1.example.com MountVolume.SetUp failed for volume "db" : mount failed: mount failed: exit status 1
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/origin/openshift.local.volumes/pods/a400fc0f-917b-11e9-a065-fa163e3f7da1/volumes/kubernetes.io~glusterfs/db --scope -- mount -t glusterfs -o log-level=ERROR,log-file=/var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/glusterfs/db/heketi-storage-1-xxk24-glusterfs.log,backup-volfile-servers=172.16.0.15:172.16.0.17:172.16.0.26,auto_unmount 172.16.0.15:heketidbstorage /var/lib/origin/openshift.local.volumes/pods/a400fc0f-917b-11e9-a065-fa163e3f7da1/volumes/kubernetes.io~glusterfs/db
Output: Running scope as unit run-109596.scope.
Mount failed. Please check the log file for more details.
the following error information was pulled from the glusterfs log to help diagnose this issue:
[2019-06-18 03:46:24.671280] E [fuse-bridge.c:900:fuse_getattr_resume] 0-glusterfs-fuse: 3: GETATTR 1 (00000000-0000-0000-0000-000000000001) resolution failed
The message "E [MSGID: 108006] [afr-common.c:4944:__afr_handle_child_down_event] 0-heketidbstorage-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up." repeated 2 times between [2019-06-18 03:46:24.657693] and [2019-06-18 03:46:24.664253]
Warning FailedMount 4m kubelet, cnv-executor-cdn-stage-master-b83726-1.example.com MountVolume.SetUp failed for volume "db" : mount failed: mount failed: exit status 1
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/origin/openshift.local.volumes/pods/a400fc0f-917b-11e9-a065-fa163e3f7da1/volumes/kubernetes.io~glusterfs/db --scope -- mount -t glusterfs -o log-level=ERROR,log-file=/var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/glusterfs/db/heketi-storage-1-xxk24-glusterfs.log,backup-volfile-servers=172.16.0.15:172.16.0.17:172.16.0.26,auto_unmount 172.16.0.15:heketidbstorage /var/lib/origin/openshift.local.volumes/pods/a400fc0f-917b-11e9-a065-fa163e3f7da1/volumes/kubernetes.io~glusterfs/db
Output: Running scope as unit run-109777.scope.
Mount failed. Please check the log file for more details.
the following error information was pulled from the glusterfs log to help diagnose this issue:
[2019-06-18 03:46:26.374291] E [fuse-bridge.c:900:fuse_getattr_resume] 0-glusterfs-fuse: 3: GETATTR 1 (00000000-0000-0000-0000-000000000001) resolution failed
The message "E [MSGID: 108006] [afr-common.c:4944:__afr_handle_child_down_event] 0-heketidbstorage-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up." repeated 2 times between [2019-06-18 03:46:26.364894] and [2019-06-18 03:46:26.368786]
Warning FailedMount 4m kubelet, cnv-executor-cdn-stage-master-b83726-1.example.com MountVolume.SetUp failed for volume "db" : mount failed: mount failed: exit status 1
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/origin/openshift.local.volumes/pods/a400fc0f-917b-11e9-a065-fa163e3f7da1/volumes/kubernetes.io~glusterfs/db --scope -- mount -t glusterfs -o log-level=ERROR,log-file=/var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/glusterfs/db/heketi-storage-1-xxk24-glusterfs.log,backup-volfile-servers=172.16.0.15:172.16.0.17:172.16.0.26,auto_unmount 172.16.0.15:heketidbstorage /var/lib/origin/openshift.local.volumes/pods/a400fc0f-917b-11e9-a065-fa163e3f7da1/volumes/kubernetes.io~glusterfs/db
Output: Running scope as unit run-109904.scope.
Mount failed. Please check the log file for more details.
the following error information was pulled from the glusterfs log to help diagnose this issue:
[2019-06-18 03:46:28.787553] E [fuse-bridge.c:900:fuse_getattr_resume] 0-glusterfs-fuse: 3: GETATTR 1 (00000000-0000-0000-0000-000000000001) resolution failed
The message "E [MSGID: 108006] [afr-common.c:4944:__afr_handle_child_down_event] 0-heketidbstorage-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up." repeated 2 times between [2019-06-18 03:46:28.777526] and [2019-06-18 03:46:28.781571]
Warning FailedMount 3m kubelet, cnv-executor-cdn-stage-master-b83726-1.example.com MountVolume.SetUp failed for volume "db" : mount failed: mount failed: exit status 1
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/origin/openshift.local.volumes/pods/a400fc0f-917b-11e9-a065-fa163e3f7da1/volumes/kubernetes.io~glusterfs/db --scope -- mount -t glusterfs -o log-level=ERROR,log-file=/var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/glusterfs/db/heketi-storage-1-xxk24-glusterfs.log,backup-volfile-servers=172.16.0.15:172.16.0.17:172.16.0.26,auto_unmount 172.16.0.15:heketidbstorage /var/lib/origin/openshift.local.volumes/pods/a400fc0f-917b-11e9-a065-fa163e3f7da1/volumes/kubernetes.io~glusterfs/db
Output: Running scope as unit run-110112.scope.
Mount failed. Please check the log file for more details.
the following error information was pulled from the glusterfs log to help diagnose this issue:
[2019-06-18 03:46:32.963394] E [fuse-bridge.c:900:fuse_getattr_resume] 0-glusterfs-fuse: 3: GETATTR 1 (00000000-0000-0000-0000-000000000001) resolution failed
The message "E [MSGID: 108006] [afr-common.c:4944:__afr_handle_child_down_event] 0-heketidbstorage-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up." repeated 2 times between [2019-06-18 03:46:32.954223] and [2019-06-18 03:46:32.958289]
Warning FailedMount 3m kubelet, cnv-executor-cdn-stage-master-b83726-1.example.com MountVolume.SetUp failed for volume "db" : mount failed: mount failed: exit status 1
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/origin/openshift.local.volumes/pods/a400fc0f-917b-11e9-a065-fa163e3f7da1/volumes/kubernetes.io~glusterfs/db --scope -- mount -t glusterfs -o log-level=ERROR,log-file=/var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/glusterfs/db/heketi-storage-1-xxk24-glusterfs.log,backup-volfile-servers=172.16.0.15:172.16.0.17:172.16.0.26,auto_unmount 172.16.0.15:heketidbstorage /var/lib/origin/openshift.local.volumes/pods/a400fc0f-917b-11e9-a065-fa163e3f7da1/volumes/kubernetes.io~glusterfs/db
Output: Running scope as unit run-110449.scope.
Mount failed. Please check the log file for more details.
the following error information was pulled from the glusterfs log to help diagnose this issue:
[2019-06-18 03:46:41.173776] E [fuse-bridge.c:900:fuse_getattr_resume] 0-glusterfs-fuse: 3: GETATTR 1 (00000000-0000-0000-0000-000000000001) resolution failed
The message "E [MSGID: 108006] [afr-common.c:4944:__afr_handle_child_down_event] 0-heketidbstorage-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up." repeated 2 times between [2019-06-18 03:46:41.163604] and [2019-06-18 03:46:41.168343]
Warning FailedMount 3m kubelet, cnv-executor-cdn-stage-master-b83726-1.example.com MountVolume.SetUp failed for volume "db" : mount failed: mount failed: exit status 1
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/origin/openshift.local.volumes/pods/a400fc0f-917b-11e9-a065-fa163e3f7da1/volumes/kubernetes.io~glusterfs/db --scope -- mount -t glusterfs -o log-level=ERROR,log-file=/var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/glusterfs/db/heketi-storage-1-xxk24-glusterfs.log,backup-volfile-servers=172.16.0.15:172.16.0.17:172.16.0.26,auto_unmount 172.16.0.15:heketidbstorage /var/lib/origin/openshift.local.volumes/pods/a400fc0f-917b-11e9-a065-fa163e3f7da1/volumes/kubernetes.io~glusterfs/db
Output: Running scope as unit run-110878.scope.
Mount failed. Please check the log file for more details.
the following error information was pulled from the glusterfs log to help diagnose this issue:
[2019-06-18 03:46:57.571960] E [fuse-bridge.c:900:fuse_getattr_resume] 0-glusterfs-fuse: 3: GETATTR 1 (00000000-0000-0000-0000-000000000001) resolution failed
The message "E [MSGID: 108006] [afr-common.c:4944:__afr_handle_child_down_event] 0-heketidbstorage-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up." repeated 2 times between [2019-06-18 03:46:57.562753] and [2019-06-18 03:46:57.566529]
Warning FailedMount 3m kubelet, cnv-executor-cdn-stage-master-b83726-1.example.com MountVolume.SetUp failed for volume "db" : mount failed: mount failed: exit status 1
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/origin/openshift.local.volumes/pods/a400fc0f-917b-11e9-a065-fa163e3f7da1/volumes/kubernetes.io~glusterfs/db --scope -- mount -t glusterfs -o auto_unmount,log-level=ERROR,log-file=/var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/glusterfs/db/heketi-storage-1-xxk24-glusterfs.log,backup-volfile-servers=172.16.0.15:172.16.0.17:172.16.0.26 172.16.0.15:heketidbstorage /var/lib/origin/openshift.local.volumes/pods/a400fc0f-917b-11e9-a065-fa163e3f7da1/volumes/kubernetes.io~glusterfs/db
Output: Running scope as unit run-111895.scope.
Mount failed. Please check the log file for more details.
the following error information was pulled from the glusterfs log to help diagnose this issue:
[2019-06-18 03:47:29.787311] E [fuse-bridge.c:900:fuse_getattr_resume] 0-glusterfs-fuse: 3: GETATTR 1 (00000000-0000-0000-0000-000000000001) resolution failed
The message "E [MSGID: 108006] [afr-common.c:4944:__afr_handle_child_down_event] 0-heketidbstorage-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up." repeated 2 times between [2019-06-18 03:47:29.775438] and [2019-06-18 03:47:29.781864]
Warning FailedMount 2m kubelet, cnv-executor-cdn-stage-master-b83726-1.example.com Unable to mount volumes for pod "heketi-storage-1-xxk24_glusterfs(a400fc0f-917b-11e9-a065-fa163e3f7da1)": timeout expired waiting for volumes to attach or mount for pod "glusterfs"/"heketi-storage-1-xxk24". list of unmounted volumes=[db]. list of unattached volumes=[db config heketi-storage-service-account-token-6mqj9]
Warning FailedMount 1m kubelet, cnv-executor-cdn-stage-master-b83726-1.example.com (combined from similar events): MountVolume.SetUp failed for volume "db" : mount failed: mount failed: exit status 1
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/origin/openshift.local.volumes/pods/a400fc0f-917b-11e9-a065-fa163e3f7da1/volumes/kubernetes.io~glusterfs/db --scope -- mount -t glusterfs -o backup-volfile-servers=172.16.0.15:172.16.0.17:172.16.0.26,auto_unmount,log-level=ERROR,log-file=/var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/glusterfs/db/heketi-storage-1-xxk24-glusterfs.log 172.16.0.15:heketidbstorage /var/lib/origin/openshift.local.volumes/pods/a400fc0f-917b-11e9-a065-fa163e3f7da1/volumes/kubernetes.io~glusterfs/db
Output: Running scope as unit run-113468.scope.
Mount failed. Please check the log file for more details.
the following error information was pulled from the glusterfs log to help diagnose this issue:
[2019-06-18 03:48:33.990581] E [fuse-bridge.c:900:fuse_getattr_resume] 0-glusterfs-fuse: 3: GETATTR 1 (00000000-0000-0000-0000-000000000001) resolution failed
The message "E [MSGID: 108006] [afr-common.c:4944:__afr_handle_child_down_event] 0-heketidbstorage-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up." repeated 2 times between [2019-06-18 03:48:33.972037] and [2019-06-18 03:48:33.979536]