Created attachment 1473794 [details]
glusterfs server log from OCP app server

Description of problem:
After OCP nodes hosting gluster pods are powered off and then powered back on, gluster-block OCP PVs will not delete correctly. Further investigation shows that attempting to delete via 'heketi-cli blockvolume delete <blockvolume_ID>' also fails.

Version-Release number of selected component (if applicable):
Running from clone of openshift-ansible branch=release-3.10, commit 3e53753df1e1a592b3f594c9393e805d7a1ee735

With OCS image files:
rhgs-server-rhel7:3.3.1-28
rhgs-gluster-block-prov-rhel7:3.3.1-20
rhgs-volmanager-rhel7:3.3.1-21

How reproducible:
Intermittent

Steps to Reproduce:
1. Create an OCS cluster on at least 3 nodes.
2. Install metrics, logging, and/or prometheus with gluster-block volumes.
3. Turn off the OCP nodes hosting gluster pods and the OCP infra node (in this case, all pods on the infra node). Leave them off for at least 10 minutes.
4. Turn on the OCP nodes hosting gluster pods; wait until all gluster pods and heketi are online.
5. Turn on the OCP infra node (it has the pods for metrics, logging, and prometheus if they are all deployed).
6. Delete the deployments of logging, metrics, and prometheus, including the associated PVCs in each project: openshift-infra, openshift-logging, openshift-metrics.
7. Run 'oc get pv' and find that all glusterblock PVs are in Released status and not deleted.

Actual results:
glusterblock PVs are in Released status and not deleted.

Expected results:
glusterblock PVs are deleted when the associated OCP PVC is deleted. Results of 'heketi-cli blockvolume list' and 'gluster-block list <block_vol_hosting_volume>' all show the blockvolume as deleted.

Additional info:
Example errors in the gluster log for the attempted heketi-cli delete of 'blockvol_43ca76f2672377530be7f1ec245eb191':
https://gist.github.com/netzzer/6c4499ff2920630ac36fab69af0285db

Results of 'targetcli ls' from all 3 gluster pods are listed below.
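The stuck-PV state from step 7 can be checked with a small filter over 'oc get pv' output; a minimal sketch, assuming a logged-in 'oc' session and the default 'oc get pv' column layout, where STATUS is the fifth column:

```shell
# List PVs stuck in Released status after their PVCs were deleted.
# Assumes STATUS is column 5 in the default 'oc get pv' output layout.
oc get pv --no-headers 2>/dev/null | awk '$5 == "Released" {print $1}'
```

On a healthy cluster this prints nothing once PVC deletion completes; in the failure described here, every glusterblock PV name remains in the output.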
$ oc get pods
NAME                                           READY     STATUS    RESTARTS   AGE
glusterblock-registry-provisioner-dc-1-jwbmt   1/1       Running   0          5h
glusterfs-registry-8tk9m                       1/1       Running   1          5d
glusterfs-registry-8xkc8                       1/1       Running   1          5d
glusterfs-registry-l7555                       1/1       Running   1          5d
heketi-registry-1-cgkgc                        1/1       Running   1          5d

glusterfs-registry-8tk9m 'targetcli' ls - https://gist.github.com/netzzer/1b73f4aab9311b5780a881b7fd588aa5
glusterfs-registry-8xkc8 'targetcli' ls - https://gist.github.com/netzzer/641ce4d15b9f6d9b185e50babda2ddce
glusterfs-registry-l7555 'targetcli' ls - https://gist.github.com/netzzer/0f849605d175e97a7f5e046d13531a7c

All logs for the 3 pods (/var/log/glusterfs/*) are attached as tar files.
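The per-pod 'targetcli' listings above can be collected in one pass; a sketch, assuming a logged-in 'oc' session in the storage project and that the gluster pods carry the label 'glusterfs=registry-pod' (the label selector is an assumption, verify with 'oc get pods --show-labels'):

```shell
# Dump 'targetcli ls' from every gluster registry pod in the current project.
# The selector glusterfs=registry-pod is an assumption; adjust to the actual
# pod labels in your deployment.
for pod in $(oc get pods -l glusterfs=registry-pod -o name 2>/dev/null); do
  name=${pod#pod/}              # strip the 'pod/' resource-type prefix
  echo "== ${name} =="
  oc exec "${name}" -- targetcli ls
done
```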
Created attachment 1473795 [details] glusterfs server log from OCP app server
Created attachment 1473796 [details] glusterfs server log from OCP app server
Questions answered:

1. > With OCS image files:
   > rhgs-server-rhel7:3.3.1-289

   Should this be rhgs-server-rhel7:3.3.1-28? Yes, typo!

2. Are you waiting for anything else? No; from reading BZ#1598322, that looks to be the root cause of this issue.