Bug 1668181 - heketi fails to cleanup stale block entries
Summary: heketi fails to cleanup stale block entries
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: heketi
Version: ocs-3.11
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: OCS 3.11.z Batch Update 2
Assignee: John Mulligan
QA Contact: Nitin Goyal
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-01-22 06:52 UTC by Rachael
Modified: 2019-03-27 04:55 UTC (History)

Fixed In Version: heketi-8.0.0-9.el7rhgs
Doc Type: Bug Fix
Doc Text:
Previously, when the block hosting volume did not exist in gluster, Heketi could not automatically clean up certain failed or stale block volumes. With this fix, Heketi can clean up failed or stale block volumes automatically.
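
The behaviour described in the Doc Text can be pictured with a small sketch. This is only an illustration in Python and not heketi's actual Go code: `BackendError`, `is_missing_hosting_volume`, `clean_block_volume` and `delete_fn` are hypothetical names, and matching on the "does not exist" message is an assumption based on the gluster-block error text quoted in this report. The idea is that a delete which fails only because the block hosting volume is already gone can be treated as "nothing left to delete", so the stale entry can still be dropped.

```python
# Illustrative sketch only; every name here is hypothetical, not heketi code.


class BackendError(Exception):
    """Error reported by the storage backend while deleting a block volume."""


def is_missing_hosting_volume(err: Exception) -> bool:
    # Assumption: the backend message ends with "does not exist", as in the
    # gluster-block errMsg quoted in this report.
    return str(err).rstrip().endswith("does not exist")


def clean_block_volume(delete_fn, hosting_volume: str, block_volume: str) -> bool:
    """Return True when the stale database entry can be dropped.

    delete_fn stands in for the remote `gluster-block delete` call.
    """
    try:
        delete_fn(hosting_volume, block_volume)
    except BackendError as err:
        if is_missing_hosting_volume(err):
            # The hosting volume is already gone, so there is nothing left
            # to delete on the backend and cleanup may proceed.
            return True
        # Any other failure is still a real error and is passed on.
        raise
    return True
```

Errors other than a missing hosting volume are re-raised on purpose, so that a transient failure does not cause an entry to be dropped while the block volume still exists.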
Clone Of:
Environment:
Last Closed: 2019-03-27 04:55:35 UTC
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2019:0663 | 0 | None | None | None | 2019-03-27 04:55:44 UTC

Comment 2 RamaKasturi 2019-01-22 12:32:54 UTC
I am hitting an issue similar to the one reported above. These are the steps I performed:
===================================================================================

1) I started creating 100 block PVCs using dynamic provisioning, but could not create any more after the 89th one because of the issue https://bugzilla.redhat.com/show_bug.cgi?id=1607520.

2) I restarted the master API and controller services so that the PVCs would start getting bound again.

3) After that I deleted all the block PVCs, and the deletions were successful.

4) I then tried deleting a BHV (block hosting volume) and hit the issue https://bugzilla.redhat.com/show_bug.cgi?id=1654703

5) I tried deleting it again and got an error saying that there are block volumes present in the volume.

[root@dhcp46-220 pgsql]# heketi-cli volume delete 037a62a428cc34f540102d91333b8f34
Error: Cannot delete a block hosting volume containing block volumes

There are no block volumes present in this volume, and the volume does not exist in the gluster back end either. When I read the db dump, I could see that block volume entries are still present, so I tried cleaning up the stale operations, and that fails.

[root@dhcp46-220 pgsql]# heketi-cli server operations list
Id:2a69343309ecba78dae095ab60c53a6f  Type:create-block-volume  Status:stale 
Id:65360dc2cf709a634e27169036d6a429  Type:create-block-volume  Status:stale 
Id:732bb721aa205003161c40f09c959675  Type:create-block-volume  Status:stale 
Id:8b0e4f1464b36aa9b4ec287585efa8db  Type:create-block-volume  Status:stale 
Id:9969529711c4597c5f127824348a6dd2  Type:create-block-volume  Status:stale 
Id:afde4abf9e0500cf56fece16dc5fdcd5  Type:create-block-volume  Status:stale 
Id:d4d882618adcab81cf1846fb3bc32943  Type:create-block-volume  Status:stale 
Id:d8f5a0aa2370bb33a0d2f835c6d3a89d  Type:create-block-volume  Status:stale 
Id:dc25f3928eaae18428ecde2df04cdf66  Type:create-block-volume  Status:stale 
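
When there are many pending operations, it can help to summarise this output by status. The following is a minimal sketch in Python, assuming the line layout shown above (`Id:... Type:... Status:...`). The names `OP_LINE`, `parse_operations` and `count_by_status` are made up for this example and are not part of heketi-cli.

```python
import re
from collections import Counter
from typing import Dict, List

# Matches lines such as:
#   Id:2a69343309ecba78dae095ab60c53a6f  Type:create-block-volume  Status:stale
# The status may contain spaces (for example "New in-flight").
OP_LINE = re.compile(
    r"^Id:(?P<id>[0-9a-f]+)\s+Type:(?P<type>\S+)\s+Status:(?P<status>.*\S)\s*$"
)


def parse_operations(text: str) -> List[Dict[str, str]]:
    """Turn operations-list output into a list of {id, type, status} dicts."""
    ops = []
    for line in text.splitlines():
        match = OP_LINE.match(line.strip())
        if match:  # lines that do not look like an operation are skipped
            ops.append(match.groupdict())
    return ops


def count_by_status(ops: List[Dict[str, str]]) -> Counter:
    """Count operations per status string."""
    return Counter(op["status"] for op in ops)
```

The saved output of `heketi-cli server operations list` can be passed to `parse_operations` as a string, and `count_by_status` then gives the number of operations per status.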


heketi_logs:
=================

[negroni] Started POST /operations/pending/cleanup
[negroni] Completed 202 Accepted in 187.36µs
[asynchttp] INFO 2019/01/22 11:56:24 asynchttp.go:288: Started job a7c6014e0ed5d46f21fb701e4a3f658b
[heketi] INFO 2019/01/22 11:56:24 Found operation 2a69343309ecba78dae095ab60c53a6f in need of clean up
[heketi] INFO 2019/01/22 11:56:24 Starting Clean for Create Block Volume op:2a69343309ecba78dae095ab60c53a6f
[heketi] INFO 2019/01/22 11:56:24 preparing to remove block volume 06860d029d4884644ad7e2b42287dc72 in op:2a69343309ecba78dae095ab60c53a6f
[negroni] Started GET /queue/a7c6014e0ed5d46f21fb701e4a3f658b
[negroni] Completed 200 OK in 146.56µs
[heketi] INFO 2019/01/22 11:56:24 executing removal of block volume 06860d029d4884644ad7e2b42287dc72 in op:2a69343309ecba78dae095ab60c53a6f
[negroni] Started GET /queue/a7c6014e0ed5d46f21fb701e4a3f658b
[negroni] Completed 200 OK in 215.865µs
[kubeexec] ERROR 2019/01/22 11:56:25 heketi/pkg/remoteexec/kube/exec.go:85:kube.ExecCommands: Failed to run command [gluster-block delete vol_037a62a428cc34f540102d91333b8f34/blockvol_06860d029d4884644ad7e2b42287dc72 --json] on [pod:glusterfs-storage-mk9dp c:glusterfs ns:glusterfs (from host:dhcp46-55.lab.eng.blr.redhat.com selector:glusterfs-node)]: Err[command terminated with exit code 2]: Stdout []: Stderr [{ "RESULT": "FAIL", "errCode": 2, "errMsg": "Volume vol_037a62a428cc34f540102d91333b8f34 does not exist" }
]
[cmdexec] ERROR 2019/01/22 11:56:25 heketi/executors/cmdexec/block_volume.go:134:cmdexec.(*CmdExecutor).BlockVolumeDestroy: Volume vol_037a62a428cc34f540102d91333b8f34 does not exist
[heketi] ERROR 2019/01/22 11:56:25 heketi/apps/glusterfs/block_volume_entry.go:315:glusterfs.(*BlockVolumeEntry).destroyFromHost: Unable to delete volume: Volume vol_037a62a428cc34f540102d91333b8f34 does not exist
[heketi] WARNING 2019/01/22 11:56:25 Clean phase of operation 2a69343309ecba78dae095ab60c53a6f encountered error: Volume vol_037a62a428cc34f540102d91333b8f34 does not exist
[heketi] WARNING 2019/01/22 11:56:25 Unable to clean operation 2a69343309ecba78dae095ab60c53a6f: Volume vol_037a62a428cc34f540102d91333b8f34 does not exist
[heketi] INFO 2019/01/22 11:56:25 Found operation 65360dc2cf709a634e27169036d6a429 in need of clean up
[heketi] INFO 2019/01/22 11:56:25 Starting Clean for Create Block Volume op:65360dc2cf709a634e27169036d6a429
[heketi] INFO 2019/01/22 11:56:25 preparing to remove block volume b303ddd56faab3fd35e8a3195546d506 in op:65360dc2cf709a634e27169036d6a429
[heketi] INFO 2019/01/22 11:56:25 executing removal of block volume b303ddd56faab3fd35e8a3195546d506 in op:65360dc2cf709a634e27169036d6a429
[negroni] Started GET /queue/a7c6014e0ed5d46f21fb701e4a3f658b
[negroni] Completed 200 OK in 129.664µs
[heketi] WARNING 2019/01/22 11:56:26 Clean phase of operation 65360dc2cf709a634e27169036d6a429 encountered error: Volume vol_037a62a428cc34f540102d91333b8f34 does not exist
[heketi] WARNING 2019/01/22 11:56:26 Unable to clean operation 65360dc2cf709a634e27169036d6a429: Volume vol_037a62a428cc34f540102d91333b8f34 does not exist
[heketi] INFO 2019/01/22 11:56:26 Found operation 732bb721aa205003161c40f09c959675 in need of clean up
[kubeexec] ERROR 2019/01/22 11:56:26 heketi/pkg/remoteexec/kube/exec.go:85:kube.ExecCommands: Failed to run command [gluster-block delete vol_037a62a428cc34f540102d91333b8f34/blockvol_b303ddd56faab3fd35e8a3195546d506 --json] on [pod:glusterfs-storage-mk9dp c:glusterfs ns:glusterfs (from host:dhcp46-55.lab.eng.blr.redhat.com selector:glusterfs-node)]: Err[command terminated with exit code 2]: Stdout []: Stderr [{ "RESULT": "FAIL", "errCode": 2, "errMsg": "Volume vol_037a62a428cc34f540102d91333b8f34 does not exist" }
]
[cmdexec] ERROR 2019/01/22 11:56:26 heketi/executors/cmdexec/block_volume.go:134:cmdexec.(*CmdExecutor).BlockVolumeDestroy: Volume vol_037a62a428cc34f540102d91333b8f34 does not exist
[heketi] ERROR 2019/01/22 11:56:26 heketi/apps/glusterfs/block_volume_entry.go:315:glusterfs.(*BlockVolumeEntry).destroyFromHost: Unable to delete volume: Volume vol_037a62a428cc34f540102d91333b8f34 does not exist
[heketi] INFO 2019/01/22 11:56:26 Starting Clean for Create Block Volume op:732bb721aa205003161c40f09c959675
[heketi] INFO 2019/01/22 11:56:26 preparing to remove block volume f3873b5d17cf2606e91910a34f34c71c in op:732bb721aa205003161c40f09c959675
[heketi] INFO 2019/01/22 11:56:26 executing removal of block volume f3873b5d17cf2606e91910a34f34c71c in op:732bb721aa205003161c40f09c959675
[negroni] Started GET /queue/a7c6014e0ed5d46f21fb701e4a3f658b
[negroni] Completed 200 OK in 193.75µs
[kubeexec] ERROR 2019/01/22 11:56:27 heketi/pkg/remoteexec/kube/exec.go:85:kube.ExecCommands: Failed to run command [gluster-block delete vol_037a62a428cc34f540102d91333b8f34/blockvol_f3873b5d17cf2606e91910a34f34c71c --json] on [pod:glusterfs-storage-mk9dp c:glusterfs ns:glusterfs (from host:dhcp46-55.lab.eng.blr.redhat.com selector:glusterfs-node)]: Err[command terminated with exit code 2]: Stdout []: Stderr [{ "RESULT": "FAIL", "errCode": 2, "errMsg": "Volume vol_037a62a428cc34f540102d91333b8f34 does not exist" }
]
[cmdexec] ERROR 2019/01/22 11:56:27 heketi/executors/cmdexec/block_volume.go:134:cmdexec.(*CmdExecutor).BlockVolumeDestroy: Volume vol_037a62a428cc34f540102d91333b8f34 does not exist
[heketi] ERROR 2019/01/22 11:56:27 heketi/apps/glusterfs/block_volume_entry.go:315:glusterfs.(*BlockVolumeEntry).destroyFromHost: Unable to delete volume: Volume vol_037a62a428cc34f540102d91333b8f34 does not exist
[heketi] WARNING 2019/01/22 11:56:27 Clean phase of operation 732bb721aa205003161c40f09c959675 encountered error: Volume vol_037a62a428cc34f540102d91333b8f34 does not exist
[heketi] WARNING 2019/01/22 11:56:27 Unable to clean operation 732bb721aa205003161c40f09c959675: Volume vol_037a62a428cc34f540102d91333b8f34 does not exist
[heketi] INFO 2019/01/22 11:56:27 Found operation 8b0e4f1464b36aa9b4ec287585efa8db in need of clean up
[heketi] INFO 2019/01/22 11:56:27 Starting Clean for Create Block Volume op:8b0e4f1464b36aa9b4ec287585efa8db
[heketi] INFO 2019/01/22 11:56:27 preparing to remove block volume 5ba10853f5437930dfa1d71351e26117 in op:8b0e4f1464b36aa9b4ec287585efa8db
[heketi] INFO 2019/01/22 11:56:27 executing removal of block volume 5ba10853f5437930dfa1d71351e26117 in op:8b0e4f1464b36aa9b4ec287585efa8db
[negroni] Started GET /queue/a7c6014e0ed5d46f21fb701e4a3f658b
[negroni] Completed 200 OK in 175.885µs
[kubeexec] ERROR 2019/01/22 11:56:28 heketi/pkg/remoteexec/kube/exec.go:85:kube.ExecCommands: Failed to run command [gluster-block delete vol_037a62a428cc34f540102d91333b8f34/blockvol_5ba10853f5437930dfa1d71351e26117 --json] on [pod:glusterfs-storage-mk9dp c:glusterfs ns:glusterfs (from host:dhcp46-55.lab.eng.blr.redhat.com selector:glusterfs-node)]: Err[command terminated with exit code 2]: Stdout []: Stderr [{ "RESULT": "FAIL", "errCode": 2, "errMsg": "Volume vol_037a62a428cc34f540102d91333b8f34 does not exist" }
]
[cmdexec] ERROR 2019/01/22 11:56:28 heketi/executors/cmdexec/block_volume.go:134:cmdexec.(*CmdExecutor).BlockVolumeDestroy: Volume vol_037a62a428cc34f540102d91333b8f34 does not exist
[heketi] ERROR 2019/01/22 11:56:28 heketi/apps/glusterfs/block_volume_entry.go:315:glusterfs.(*BlockVolumeEntry).destroyFromHost: Unable to delete volume: Volume vol_037a62a428cc34f540102d91333b8f34 does not exist
[heketi] WARNING 2019/01/22 11:56:28 Clean phase of operation 8b0e4f1464b36aa9b4ec287585efa8db encountered error: Volume vol_037a62a428cc34f540102d91333b8f34 does not exist
[heketi] WARNING 2019/01/22 11:56:28 Unable to clean operation 8b0e4f1464b36aa9b4ec287585efa8db: Volume vol_037a62a428cc34f540102d91333b8f34 does not exist
[heketi] INFO 2019/01/22 11:56:28 Found operation 9969529711c4597c5f127824348a6dd2 in need of clean up
[heketi] INFO 2019/01/22 11:56:28 Starting Clean for Create Block Volume op:9969529711c4597c5f127824348a6dd2
[heketi] INFO 2019/01/22 11:56:28 preparing to remove block volume 3afd0a1b4bcf284b066f16fecffd35de in op:9969529711c4597c5f127824348a6dd2
[heketi] INFO 2019/01/22 11:56:28 executing removal of block volume 3afd0a1b4bcf284b066f16fecffd35de in op:9969529711c4597c5f127824348a6dd2
[negroni] Started GET /queue/a7c6014e0ed5d46f21fb701e4a3f658b
[negroni] Completed 200 OK in 148.517µs
[heketi] WARNING 2019/01/22 11:56:29 Clean phase of operation 9969529711c4597c5f127824348a6dd2 encountered error: Volume vol_037a62a428cc34f540102d91333b8f34 does not exist
[heketi] WARNING 2019/01/22 11:56:29 Unable to clean operation 9969529711c4597c5f127824348a6dd2: Volume vol_037a62a428cc34f540102d91333b8f34 does not exist
[heketi] INFO 2019/01/22 11:56:29 Found operation afde4abf9e0500cf56fece16dc5fdcd5 in need of clean up
[kubeexec] ERROR 2019/01/22 11:56:29 heketi/pkg/remoteexec/kube/exec.go:85:kube.ExecCommands: Failed to run command [gluster-block delete vol_037a62a428cc34f540102d91333b8f34/blockvol_3afd0a1b4bcf284b066f16fecffd35de --json] on [pod:glusterfs-storage-p46p6 c:glusterfs ns:glusterfs (from host:dhcp47-122.lab.eng.blr.redhat.com selector:glusterfs-node)]: Err[command terminated with exit code 2]: Stdout []: Stderr [{ "RESULT": "FAIL", "errCode": 2, "errMsg": "Volume vol_037a62a428cc34f540102d91333b8f34 does not exist" }
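
The same failure repeats for every stale operation in the log above, which is easier to see when the "Unable to clean operation" warnings are grouped by reason. The following is a minimal sketch in Python, assuming the log line format shown above. The names `UNCLEAN` and `failed_cleanups` are hypothetical and exist only for this example.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Matches the tail of warnings such as:
#   [heketi] WARNING ... Unable to clean operation <32 hex chars>: <reason>
UNCLEAN = re.compile(
    r"Unable to clean operation (?P<op>[0-9a-f]{32}): (?P<reason>.+)$"
)


def failed_cleanups(log_text: str) -> Dict[str, List[str]]:
    """Group the ids of operations that could not be cleaned by failure reason."""
    by_reason = defaultdict(list)
    for line in log_text.splitlines():
        match = UNCLEAN.search(line)
        if match:
            by_reason[match.group("reason").strip()].append(match.group("op"))
    return dict(by_reason)
```

Applied to the log in this comment, every operation ends up under a single reason, the missing volume vol_037a62a428cc34f540102d91333b8f34.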

Comment 3 John Mulligan 2019-01-22 13:49:48 UTC
https://github.com/heketi/heketi/pull/1508

Comment 5 Nitin Goyal 2019-01-24 07:20:58 UTC
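The command below can be used to create failed entries in the list of operations. It starts 20 block volume create requests in the background, all using the same name: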

# for i in {001..020}; do heketi-cli blockvolume create --size 1 --name block001 & done


# heketi-cli server operations list
Id:05c298e62332908e09b786a8a4d00893  Type:create-block-volume  Status:failed 
Id:3761acf712cb7a16b63ada685b6787ea  Type:create-block-volume  Status:failed 
Id:4220d265afd760d4ab6b0fc2be71aba2  Type:create-block-volume  Status:New in-flight
Id:46bcd62b51bd6040b3e80abb1fbe6e5b  Type:create-block-volume  Status:failed 
Id:53ab252bfeaa4c4c1be56becaffa6428  Type:create-block-volume  Status:failed 
Id:b305d55c7507f8a2143d0dec2dfb193a  Type:create-block-volume  Status:failed

Comment 7 Michael Adam 2019-02-05 14:15:47 UTC
Fixed upstream.
This is an important fix to the auto-cleanup feature.
It is not too difficult to verify.
Proposing for 3.11.2.

Comment 13 Anjana KD 2019-03-15 03:07:53 UTC
Hi John,

I have updated the doc text. Kindly review it.

Comment 14 John Mulligan 2019-03-15 14:59:12 UTC
(In reply to Anjana from comment #13)
> Hi John,
> 
> I have updated the doc text. Kindly review it.

Looks good to me.

Comment 16 errata-xmlrpc 2019-03-27 04:55:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0663

