Block device creation fails "Create Block Volume Failed:failed to configure on xxx" in OCS 3.11.1 OCP 3.11.51

Description of problem:
=====================
Created a fresh OCP 3.11.51 + OCS 3.11.1 (gluster-block-0.2.1-30.el7rhgs.x86_64) 3-node setup with docker as the container runtime. With no existing block hosting volume or block devices, a block PVC request was created.

"
# date && ./pvc-create.sh jerry 2
Mon Dec 17 17:26:14 IST 2018
persistentvolumeclaim/jerry created

[root@dhcp47-135 scripts]# oc get pvc
NAME    STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
jerry   Pending                                      block-sc       30m
"

The PVC creation failed with the following error:
"
gluster.org/glusterblock 117b14fa-01e6-11e9-be30-0a580a820204 Failed to provision volume with StorageClass "block-sc": failed to create volume: heketi block volume creation failed: [heketi] failed to create volume: { "RESULT": "FAIL", "errCode": 255, "errMsg": "failed to configure on 10.70.42.35 configure failed\nfailed to configure on 10.70.46.149 configure failed\nfailed to configure on 10.70.46.146 configure failed" }
"

On checking the heketi logs, the following error message was seen:
--------------------------------------------------------
[kubeexec] ERROR 2018/12/17 11:56:24 /src/github.com/heketi/heketi/executors/kubeexec/kubeexec.go:242: Failed to run command [gluster-block create vol_79d21bc5ec8ffed0aec6272a835bdf72/blk_glusterfs_jerry_c1099fa5-01f2-11e9-be31-0a580a820204 ha 3 auth enable prealloc full 10.70.46.149,10.70.42.35,10.70.46.146 2GiB --json] on glusterfs-storage-76xdh: Err[command terminated with exit code 255]: Stdout []: Stderr [{ "RESULT": "FAIL", "errCode": 255, "errMsg": "failed to configure on 10.70.46.149 configure failed\nfailed to configure on 10.70.42.35 configure failed\nfailed to configure on 10.70.46.146 configure failed" }

On checking the gluster-blockd logs, the following error message was seen:
-------------------------------------------------------
[2018-12-17 11:56:24.104669] ERROR: backend creation failed for: vol_79d21bc5ec8ffed0aec6272a835bdf72/blk_glusterfs_jerry_c1099fa5-01f2-11e9-be31-0a580a820204 [at block_svc_routines.c+4033 :<blockValidateCommandOutput>]
[2018-12-17 11:56:24.104736] DEBUG: raw output, targetcli shell version 2.1.fb46

On checking the tcmu-runner logs, the following error message was seen:
------------------------------------------------------------
2018-12-17 11:56:24.062 436 [ERROR] add_device:516: could not open /dev/uio0
2018-12-17 11:56:44.146 436 [ERROR] add_device:516: could not open /dev/uio0
2018-12-17 11:57:03.629 436 [ERROR] add_device:516: could not open /dev/uio0
2018-12-17 11:56:24.094 430 [ERROR] add_device:516: could not open /dev/uio0
2018-12-17 11:56:44.165 430 [ERROR] add_device:516: could not open /dev/uio0

Some more details:
-----------------------
1. We tried creating multiple PVCs over a span of time; each creation failed with a similar error message.
2. File volume creations (via PVC) and manual block hosting volume creations (via heketi) are succeeding.
3. Since block device creations fail, the underlying BHV is also deleted (expected behavior).
4. Similar behavior is seen on 2 deployments: greenfield (OCP 3.11.51 + OCS 3.11.1 together) and brownfield (first OCP 3.11.51 and then OCS 3.11.1).
5. An almost identical issue is seen in CRI-O setups as well: BZ#1653571

How reproducible:
===================
2x2 on two different fresh setups of OCP 3.11.51 and OCS 3.11.1

Steps to Reproduce:
1. Create a 3-node OCP 3.11.51 and OCS 3.11.1 setup.
2. Send a PVC request for a block device.
3. Check for success/failure. In case of failure, check the heketi logs, gluster-blockd logs and pvc describe messages.

Actual results:
=================
Unable to create a single block device on an OCP 3.11.51 and OCS 3.11.1 setup with docker as the container runtime.

Expected results:
=================
Block device creations should succeed.
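For triage, the list of nodes on which configuration failed can be pulled out of the JSON that gluster-block returns with --json. The following is only a minimal sketch: the function name failed_hosts is hypothetical, and it assumes the errMsg field keeps the "failed to configure on <host>" wording shown in the logs above.

```python
import json
import re


def failed_hosts(raw_json):
    """Return the hosts named in a gluster-block FAIL result, in order.

    Assumes the errMsg format seen in this report:
    one "failed to configure on <host> configure failed" entry per line.
    """
    result = json.loads(raw_json)
    if result.get("RESULT") != "FAIL":
        return []
    return re.findall(r"failed to configure on (\S+)", result.get("errMsg", ""))


if __name__ == "__main__":
    # Sample payload rebuilt from the heketi error quoted in this report.
    sample = json.dumps({
        "RESULT": "FAIL",
        "errCode": 255,
        "errMsg": "failed to configure on 10.70.42.35 configure failed\n"
                  "failed to configure on 10.70.46.149 configure failed\n"
                  "failed to configure on 10.70.46.146 configure failed",
    })
    print(failed_hosts(sample))
```

If all three HA nodes show up, as they do here, the failure is likely common to every gluster pod rather than specific to one node.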
We are doing a PoC with a customer right now and are hitting exactly this issue. Please advise.
The rhgs-server container image needs to get the updated version of update-params.sh that configures the /dev rbind-mount. The change has been posted upstream as PR#115. The current version of the script is at https://github.com/gluster/gluster-containers/blob/45497f475a9ff008e35dc7da8bbd43e77ecdbcc2/CentOS/update-params.sh
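To check from inside a gluster pod whether it is affected, one option is to compare the UIO devices the kernel has registered in sysfs with the device nodes actually present under /dev. This is a rough sketch under assumptions: the helper name missing_uio_nodes is hypothetical, and it assumes that /sys/class/uio is visible in the container and that the failure mode is a device node missing from the container's /dev, as the tcmu-runner "could not open /dev/uio0" messages suggest.

```python
import os


def missing_uio_nodes(sys_class_uio="/sys/class/uio", dev_dir="/dev"):
    """List UIO devices registered in sysfs that have no node in dev_dir."""
    try:
        registered = sorted(os.listdir(sys_class_uio))
    except FileNotFoundError:
        # No UIO class directory at all: nothing registered to compare.
        return []
    return [
        name for name in registered
        if not os.path.exists(os.path.join(dev_dir, name))
    ]


if __name__ == "__main__":
    missing = missing_uio_nodes()
    if missing:
        print("UIO devices without a /dev node:", ", ".join(missing))
    else:
        print("No missing UIO device nodes found")
```

An empty result does not prove the pod is healthy; it only rules out this one symptom.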
The changes to the daemonset explained in comment #15 and comment #16 will be included in cns-deploy through bug 1653571 and pushed into openshift-ansible (bug 1662312).
Still getting this error on the recently released v3.11.59.

[kubeexec] ERROR 2019/01/10 20:15:48 /src/github.com/heketi/heketi/executors/kubeexec/kubeexec.go:242: Failed to run command [bash -c "set -o pipefail && gluster-block delete vol_f022f1e90cb33b0e76fb4faa4295ed69/blockvol_fd50002df6d075d5c290b2bff0bd5e4e --json |tee /dev/stderr"] on glusterfs-storage-wxhlv: Err[command terminated with exit code 2]: Stdout [{ "RESULT": "FAIL", "errCode": 2, "errMsg": "block vol_f022f1e90cb33b0e76fb4faa4295ed69\/blockvol_fd50002df6d075d5c290b2bff0bd5e4e doesn't exist" }
]: Stderr [{ "RESULT": "FAIL", "errCode": 2, "errMsg": "block vol_f022f1e90cb33b0e76fb4faa4295ed69\/blockvol_fd50002df6d075d5c290b2bff0bd5e4e doesn't exist" }
(In reply to Nicholas Nachefski from comment #35)
> still getting this error on the recently released v3.11.59.
>
>
> [kubeexec] ERROR 2019/01/10 20:15:48
> /src/github.com/heketi/heketi/executors/kubeexec/kubeexec.go:242: Failed to
> run command [bash -c "set -o pipefail && gluster-block delete
> vol_f022f1e90cb33b0e76fb4faa4295ed69/
> blockvol_fd50002df6d075d5c290b2bff0bd5e4e --json |tee /dev/stderr"] on
> glusterfs-storage-wxhlv: Err[command terminated with exit code 2]: Stdout [{
> "RESULT": "FAIL", "errCode": 2, "errMsg": "block
> vol_f022f1e90cb33b0e76fb4faa4295ed69\/
> blockvol_fd50002df6d075d5c290b2bff0bd5e4e doesn't exist" }
> ]: Stderr [{ "RESULT": "FAIL", "errCode": 2, "errMsg": "block
> vol_f022f1e90cb33b0e76fb4faa4295ed69\/
> blockvol_fd50002df6d075d5c290b2bff0bd5e4e doesn't exist" }

That may be a non-fatal error from when it tries to clean up after a create volume error. Was there an earlier error in the logs for a create command?
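One way to answer that question is to scan the heketi log for failed gluster-block commands and separate create failures from delete failures. The snippet below is a sketch only: the function names are hypothetical, and the regular expression assumes the kubeexec "Failed to run command [...] on <pod>: Err[command terminated with exit code N]" line format quoted in this bug, with each log record on a single line.

```python
import re

# Matches the kubeexec failure records as they appear in this report.
FAILED_CMD = re.compile(
    r"Failed to run command \[(?P<cmd>.*?)\] on (?P<pod>[^\s:]+): "
    r"Err\[command terminated with exit code (?P<code>\d+)\]"
)
BLOCK_OP = re.compile(r"gluster-block (create|delete)\b")


def failed_block_commands(log_lines):
    """Return (operation, pod, exit_code) for each failed gluster-block call."""
    failures = []
    for line in log_lines:
        match = FAILED_CMD.search(line)
        if not match:
            continue
        op = BLOCK_OP.search(match.group("cmd"))
        if op:
            failures.append(
                (op.group(1), match.group("pod"), int(match.group("code")))
            )
    return failures


def has_create_failure(log_lines):
    """True if any failed command was a gluster-block create."""
    return any(op == "create" for op, _pod, _code in failed_block_commands(log_lines))
```

A log that contains only delete failures with "doesn't exist" would fit the cleanup explanation, while a preceding create failure points back at the original problem.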
Hi,

Block volume creation is successful in OCP 3.11.67-1 and OCS 3.11.1 (latest available builds).
This looks like it might be present as an issue in 3.11.43 as well so it may have been introduced earlier than originally thought.
(In reply to Mark Szczewski from comment #39)
> This looks like it might be present as an issue in 3.11.43 as well so it may
> have been introduced earlier than originally thought.

The customer made a mistake when reporting the issue with 3.11.43. They did not run a complete teardown and used the 3.11.59 playbooks to do the install, so the issue would have appeared. 3.11.43 does not show this issue!
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2019:0287