Description of problem:

The following two operations were performed on heketi in parallel:
1) a series of 'heketi-cli volume delete <id>' operations (see Steps to Reproduce)
2) 'heketi-cli device disable' on the device hosting the volumes being deleted

Both commands returned errors and the heketi service crashed:

[kubeexec] DEBUG 2017/08/11 08:58:46 /src/github.com/heketi/heketi/executors/kubeexec/kubeexec.go:250: Host: dhcp47-49.lab.eng.blr.redhat.com Pod: glusterfs-t4gvj Command: lvremove -f vg_c0be1577809232e2c2a5e557ece2b050/tp_1806abf11f460f1329e545134355fcea
Result:   Logical volume "brick_1806abf11f460f1329e545134355fcea" successfully removed
  Logical volume "tp_1806abf11f460f1329e545134355fcea" successfully removed

2017/08/11 08:58:58 http: multiple response.WriteHeader calls
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x14dfdf7]

goroutine 2454 [running]:
github.com/heketi/heketi/apps/glusterfs.(*NodeEntry).SetState(0x0, 0xc42021a5a0, 0x2304520, 0xc4204060b0, 0x22fe560, 0xc42033ae80, 0xc420200268, 0x7, 0x48feb2, 0x598d71d2)
        /builddir/build/BUILD/heketi-5.0.0/src/github.com/heketi/heketi/apps/glusterfs/node_entry.go:301 +0x57
github.com/heketi/heketi/apps/glusterfs.(*App).NodeSetState.func2(0xed11f68d2, 0x588fcb2, 0x235cee0, 0xc420210f90)
        /builddir/build/BUILD/heketi-5.0.0/src/github.com/heketi/heketi/apps/glusterfs/app_node.go:360 +0x80
github.com/heketi/rest.(*AsyncHttpManager).AsyncHttpRedirectFunc.func1(0xc42066cb80, 0xc4204c24b0)
        /builddir/build/BUILD/heketi-5.0.0/src/github.com/heketi/rest/asynchttp.go:128 +0xf4
created by github.com/heketi/rest.(*AsyncHttpManager).AsyncHttpRedirectFunc
        /builddir/build/BUILD/heketi-5.0.0/src/github.com/heketi/rest/asynchttp.go:138 +0x60

(The trace shows (*NodeEntry).SetState being called with a nil receiver, first argument 0x0, from the asynchronous NodeSetState handler.)

heketi pod log after the restart:

[root@dhcp47-10 ~]# oc logs heketi-1-g1dcn
Heketi 5.0.0
[kubeexec] WARNING 2017/08/11 08:59:23 Rebalance on volume expansion has been enabled. This is an EXPERIMENTAL feature
[heketi] INFO 2017/08/11 08:59:23 Loaded kubernetes executor
[heketi] INFO 2017/08/11 08:59:23 Block: Auto Create Block Hosting Volume set to true
[heketi] INFO 2017/08/11 08:59:23 Block: New Block Hosting Volume size 500 GB
[heketi] INFO 2017/08/11 08:59:23 Loaded simple allocator
[heketi] INFO 2017/08/11 08:59:23 GlusterFS Application Loaded
Listening on port 8080

Version-Release number of selected component (if applicable):
heketi-client-5.0.0-7.el7rhgs.x86_64

How reproducible:
1/1

Steps to Reproduce:
1. Create 200 volumes on the same device.
2. Once all volumes are created, delete all of them:
   # heketi-cli volume list | awk {'print $1'} | cut -c 4- >> list
   # while read id; do heketi-cli volume delete $id; done<list
3. From a different window, disable the device on which the volumes were created.

Actual results:
Both operations errored out.

window-1:
=========
Volume 255d8fa3b1fa90e543a420b9a0a0626a deleted
Volume 2bee7175cc381e2d95a4834b66ba10b6 deleted
Volume 2c9faefa80170f05d4f4850578236180 deleted
Volume 2d35619c6bef760a336b957ef182bdad deleted
Volume 2d58b9eb77425d86e9c220a6a5ef389b deleted
Error:
Error:
Error:
Error:

Window-2:
==========
Error:

Expected results:
Both operations should complete seamlessly.

Additional info:
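For convenience, the two-window reproduction above can be approximated in a single shell session. This is a minimal sketch, assuming DEVICE_ID is a placeholder for the device hosting the volumes (it is not taken from the report):

# Hedged reproduction sketch; DEVICE_ID is a placeholder, not from the report.
heketi-cli volume list | awk '{print $1}' | cut -c 4- > list

# window-1 equivalent: delete all volumes in the background
( while read id; do heketi-cli volume delete "$id"; done < list ) &

# window-2 equivalent: disable the device while the deletions are in flight
sleep 5
heketi-cli device disable "$DEVICE_ID"

wait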
Created attachment 1312046: heketi_logs
Upstream PR: https://github.com/heketi/heketi/pull/839
Verified in build cns-deploy-5.0.0-34.el7rhgs.x86_64. Heketi volume create, volume delete, and device remove operations were run concurrently and no crashes were seen. Moving the bug to VERIFIED.
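A sketch of how such a concurrent verification run could be scripted; the volume count, size, and DEVICE_ID are illustrative placeholders, not details from the actual verification run:

# Illustrative only: exercise create, delete, and device remove concurrently.
( for i in $(seq 1 20); do heketi-cli volume create --size=1; done ) &
( heketi-cli volume list | awk '{print $1}' | cut -c 4- | \
    while read id; do heketi-cli volume delete "$id"; done ) &
( heketi-cli device disable "$DEVICE_ID" && heketi-cli device remove "$DEVICE_ID" ) &
wait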
doc text looks good to me
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2017:2879