Previously, concurrent operations referring to the same Gluster node could crash Heketi. With this fix, no crash is observed when multiple operations are performed against the same Gluster node.
Description  krishnaram Karthick
2017-08-11 09:27:17 UTC
Description of problem:
The following two parallel operations were performed on heketi:
1) a series of 'heketi device delete <>'
2) 'heketi device disable' on the device whose volumes were being deleted
Both commands errored out and the heketi service crashed.
[kubeexec] DEBUG 2017/08/11 08:58:46 /src/github.com/heketi/heketi/executors/kubeexec/kubeexec.go:250: Host: dhcp47-49.lab.eng.blr.redhat.com Pod: glusterfs-t4gvj Command: lvremove -f vg_c0be1577809232e2c2a5e557ece2b050/tp_1806abf11f460f1329e545134355fcea
Result: Logical volume "brick_1806abf11f460f1329e545134355fcea" successfully removed
Logical volume "tp_1806abf11f460f1329e545134355fcea" successfully removed
2017/08/11 08:58:58 http: multiple response.WriteHeader calls
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x14dfdf7]
goroutine 2454 [running]:
github.com/heketi/heketi/apps/glusterfs.(*NodeEntry).SetState(0x0, 0xc42021a5a0, 0x2304520, 0xc4204060b0, 0x22fe560, 0xc42033ae80, 0xc420200268, 0x7, 0x48feb2, 0x598d71d2)
/builddir/build/BUILD/heketi-5.0.0/src/github.com/heketi/heketi/apps/glusterfs/node_entry.go:301 +0x57
github.com/heketi/heketi/apps/glusterfs.(*App).NodeSetState.func2(0xed11f68d2, 0x588fcb2, 0x235cee0, 0xc420210f90)
/builddir/build/BUILD/heketi-5.0.0/src/github.com/heketi/heketi/apps/glusterfs/app_node.go:360 +0x80
github.com/heketi/rest.(*AsyncHttpManager).AsyncHttpRedirectFunc.func1(0xc42066cb80, 0xc4204c24b0)
/builddir/build/BUILD/heketi-5.0.0/src/github.com/heketi/rest/asynchttp.go:128 +0xf4
created by github.com/heketi/rest.(*AsyncHttpManager).AsyncHttpRedirectFunc
/builddir/build/BUILD/heketi-5.0.0/src/github.com/heketi/rest/asynchttp.go:138 +0x60
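The first argument in the SetState frame above is 0x0, i.e. the method was invoked on a nil *NodeEntry receiver: the node lookup raced with a concurrent operation and returned nothing, and the result was dereferenced without a check. The following is a minimal Go sketch of that failure mode and the kind of nil guard the fix implies; NodeEntry, SetState, and setNodeState here are simplified hypothetical stand-ins, not heketi's actual types.

```go
package main

import (
	"errors"
	"fmt"
)

// NodeEntry is a simplified stand-in for heketi's glusterfs.NodeEntry
// (hypothetical fields; the real struct lives in apps/glusterfs/node_entry.go).
type NodeEntry struct {
	State string
}

// SetState dereferences the receiver, so calling it on a nil *NodeEntry
// panics with a nil pointer dereference -- the receiver shown as 0x0 in
// the stack trace above.
func (n *NodeEntry) SetState(state string) error {
	n.State = state
	return nil
}

// setNodeState sketches the guarded pattern: validate the lookup result
// before dereferencing it, so a node deleted by a concurrent operation
// produces an error response instead of a SIGSEGV.
func setNodeState(n *NodeEntry, state string) error {
	if n == nil {
		return errors.New("node not found; it may have been removed by a concurrent operation")
	}
	return n.SetState(state)
}

func main() {
	var deleted *NodeEntry // a lookup that raced with a delete returns nil
	if err := setNodeState(deleted, "offline"); err != nil {
		fmt.Println("error:", err)
	}
}
```

With the guard, the async handler can report a normal error to the client rather than taking down the whole heketi process.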
[root@dhcp47-10 ~]# oc logs heketi-1-g1dcn
Heketi 5.0.0
[kubeexec] WARNING 2017/08/11 08:59:23 Rebalance on volume expansion has been enabled. This is an EXPERIMENTAL feature
[heketi] INFO 2017/08/11 08:59:23 Loaded kubernetes executor
[heketi] INFO 2017/08/11 08:59:23 Block: Auto Create Block Hosting Volume set to true
[heketi] INFO 2017/08/11 08:59:23 Block: New Block Hosting Volume size 500 GB
[heketi] INFO 2017/08/11 08:59:23 Loaded simple allocator
[heketi] INFO 2017/08/11 08:59:23 GlusterFS Application Loaded
Listening on port 8080
Version-Release number of selected component (if applicable):
heketi-client-5.0.0-7.el7rhgs.x86_64
How reproducible:
1/1
Steps to Reproduce:
1. Create 200 volumes on the same device.
2. Once all volumes are created, delete all of them:
# heketi-cli volume list | awk '{print $1}' | cut -c 4- >> list
# while read id; do heketi-cli volume delete "$id"; done < list
3. From a different window, disable the device on which the volumes were created.
Actual results:
Both operations errored out.
window-1:
=========
Volume 255d8fa3b1fa90e543a420b9a0a0626a deleted
Volume 2bee7175cc381e2d95a4834b66ba10b6 deleted
Volume 2c9faefa80170f05d4f4850578236180 deleted
Volume 2d35619c6bef760a336b957ef182bdad deleted
Volume 2d58b9eb77425d86e9c220a6a5ef389b deleted
Error:
Error:
Error:
Error:
Window-2:
==========
Error:
Expected results:
Both operations should complete seamlessly
Additional info:
Comment 2  krishnaram Karthick
2017-08-11 09:30:07 UTC
Comment 7  krishnaram Karthick
2017-09-14 06:21:08 UTC
Verified in build cns-deploy-5.0.0-34.el7rhgs.x86_64.
Heketi volume create, volume delete, and device remove operations were run concurrently and no crashes were seen.
Moving the bug to VERIFIED.
Comment 9  Raghavendra Talur
2017-10-04 15:46:56 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHEA-2017:2879