Description of problem:

ProvisioningFailed: Failed to provision volume with StorageClass "gold": glusterfs: create volume err: failed to get hostip Id not found.

################
# oc describe pvc claim1
Name:           claim1
Namespace:      storage-project
StorageClass:   gold
Status:         Pending
Volume:
Labels:         <none>
Capacity:
Access Modes:
Events:
  FirstSeen  LastSeen  Count  From                             SubobjectPath  Type     Reason              Message
  ---------  --------  -----  ----                             -------------  -------- ------              -------
  43m        42m       5      {persistentvolume-controller }                  Warning  ProvisioningFailed  Failed to provision volume with StorageClass "gold": glusterfs: create volume err: error creating volume .
  42m        42m       1      {persistentvolume-controller }                  Warning  ProvisioningFailed  Failed to provision volume with StorageClass "gold": glusterfs: create volume err: error creating volume Unable to execute command on glusterfs-dc-dhcp47-53.lab.eng.blr.redhat.com-1-jmjuf: volume create: vol_004a90e4dae8970b28b1cac2f9de41e1: failed: Host 10.70.47.54 not connected .
  41m        41m       1      {persistentvolume-controller }                  Warning  ProvisioningFailed  Failed to provision volume with StorageClass "gold": glusterfs: create volume err: error creating volume Unable to execute command on glusterfs-dc-dhcp47-121.lab.eng.blr.redhat.com-1-zv8jq: volume create: vol_c41667c7c2974724718c79d5ab995d22: failed: Host 10.70.47.54 not connected .
  41m        1s        84     {persistentvolume-controller }                  Warning  ProvisioningFailed  Failed to provision volume with StorageClass "gold": glusterfs: create volume err: failed to get hostip Id not found
#######################

Version-Release number of selected component (if applicable):
openshift v3.4.0.24+52fd77b
kubernetes v1.4.0+776c994
heketi-cli 3.0.0

How reproducible:
Seen once; will try to reproduce it again.

Steps to Reproduce:
1. Create a claim, for example of 100G.
2. Reboot one of the gluster nodes among the OCP nodes.
3. Check # oc get pvc
4. Check # oc describe pvc <claim>

Actual results:
The claim stays in the "Pending" state, but heketi keeps creating gluster volumes in the back-end.

Expected results:
The claim should be provisioned successfully and move to the "Bound" state once the node comes back and the gluster pod is in "Running" status.

Additional info:
I'll attach more details soon.
The error "glusterfs: create volume err: failed to get hostip Id not found." points to the same issue (the node info request failed and Heketi returned an error, which caused the provisioner to retry volume creation) that we are discussing in the bugzillas below:

https://bugzilla.redhat.com/show_bug.cgi?id=1392377
https://bugzilla.redhat.com/show_bug.cgi?id=1346621
It has ended up creating around 125 gluster volumes while the claim is still in "Pending" state.

---------
# heketi-cli volume list |wc -l
125

# oc get pvc
NAME      STATUS    VOLUME    CAPACITY   ACCESSMODES   AGE
claim1    Pending                                      2h
---------

This should not be the case at any point. It should not create more volumes than requested at any point of time. So we need to prevent this from happening.
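The way a retried provisioning attempt can pile up volumes may be easier to see in a small model. The sketch below is only an illustration and makes several assumptions: `FakeHeketi`, `provision`, `provision_with_retries` and `demo` are hypothetical names, the in-memory class is not the real Heketi API, the host IP is a documentation placeholder, and the figure of 124 failed lookups is chosen only so that the total matches the 125 volumes seen above. It is not the change made in Heketi or in the provisioner; it just contrasts a retry loop that leaves each half-provisioned volume behind with one that deletes the volume when the later lookup fails.

```python
class FakeHeketi:
    """Hypothetical in-memory stand-in for a Heketi server (not the real API)."""

    def __init__(self, failing_node_lookups):
        self.volumes = set()
        self._next_id = 0
        # Number of node lookups that fail with "Id not found" before
        # lookups start succeeding again (assumed for the illustration).
        self._failing_node_lookups = failing_node_lookups

    def create_volume(self, size_gb):
        self._next_id += 1
        vol_id = f"vol_{self._next_id:04d}"
        self.volumes.add(vol_id)
        return vol_id

    def delete_volume(self, vol_id):
        self.volumes.discard(vol_id)

    def node_host_ip(self, vol_id):
        if self._failing_node_lookups > 0:
            self._failing_node_lookups -= 1
            raise LookupError("Id not found")
        return "192.0.2.10"  # placeholder address, not from the report


def provision(heketi, size_gb, cleanup_on_failure):
    """One provisioning attempt: create the volume, then look up a host IP."""
    vol_id = heketi.create_volume(size_gb)
    try:
        host_ip = heketi.node_host_ip(vol_id)
    except LookupError:
        if cleanup_on_failure:
            # Undo the volume creation so that the retry starts clean.
            heketi.delete_volume(vol_id)
        raise
    return vol_id, host_ip


def provision_with_retries(heketi, size_gb, cleanup_on_failure, max_attempts=200):
    """Retry the whole attempt on failure, as a controller loop would."""
    for _ in range(max_attempts):
        try:
            return provision(heketi, size_gb, cleanup_on_failure)
        except LookupError:
            continue
    raise RuntimeError("claim still Pending after all attempts")


def demo(cleanup_on_failure, failures=124):
    """Return how many volumes exist once the single claim is satisfied."""
    heketi = FakeHeketi(failing_node_lookups=failures)
    provision_with_retries(heketi, 100, cleanup_on_failure)
    return len(heketi.volumes)
```

In this model, `demo(False)` leaves one volume behind per failed attempt plus the one that finally satisfies the claim, while `demo(True)` ends with exactly one volume for the one claim.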
(In reply to Prasanth from comment #5)
> It has ended up creating around 125 gluster volumes while the claim is still
> in "Pending" State.
>
> ---------
> # heketi-cli volume list |wc -l
> 125
>
> # oc get pvc
> NAME      STATUS    VOLUME    CAPACITY   ACCESSMODES   AGE
> claim1    Pending                                      2h
> ---------
>
> This should not be the case at any point. It should not create more volumes
> than requested at any point of time. So we need to prevent this from
> happening.

The workflow looks the same as the one described in https://bugzilla.redhat.com/show_bug.cgi?id=1392377#c27, so the fix for the Heketi bug (ID not found) should solve this issue.
Yet another duplicate of bz #1346621, with a different test. Should be fixed with the patch from https://github.com/heketi/heketi/pull/579 in the next build.
Verified
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHEA-2017-0148.html