Bug 1474256

Summary: [Gluster-block]: Block continues to exist in a non-healthy state after a failed delete
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Sweta Anandpara <sanandpa>
Component: gluster-block Assignee: Prasanna Kumar Kalever <prasanna.kalever>
Status: CLOSED ERRATA QA Contact: Sweta Anandpara <sanandpa>
Severity: high Docs Contact:
Priority: unspecified    
Version: rhgs-3.3 CC: amukherj, rcyriac, rhs-bugs, storage-qa-internal
Target Milestone: ---   
Target Release: RHGS 3.3.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: gluster-block-0.2.1-8.el7rhgs Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-09-21 04:20:54 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1417151    

Description Sweta Anandpara 2017-07-24 08:59:22 UTC
Description of problem:
========================
Hit this while verifying https://bugzilla.redhat.com/show_bug.cgi?id=1449245.

While a block is being deleted, if the gluster-block daemon on one of the HA nodes goes down, the CLI reports the delete as failed. However, the block has already been deleted on the other HA nodes (which were up), so although the block still exists, it is no longer fully functional.

This leads to unpredictable behaviour of the block, for example:
* Initiator discovery succeeds and login succeeds, but lsblk does not show the block device (see the sketch after this list).
* Meta information shows the CLEANUP to have taken place on one node, but not on the other node.
* A new create of a block with the same name fails with the error 'block already exists'.
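
For illustration, an initiator-side check for the first symptom might look like the sketch below (the portal IP and IQN are placeholders, not values from this setup):

# Hypothetical initiator-side check: discovery and login succeed
# against a surviving portal...
iscsiadm -m discovery -t st -p <portal-ip>
iscsiadm -m node -T iqn.2016-12.org.gluster-block:<gbid> -p <portal-ip>:3260 --login
# ...yet lsblk does not list the expected new block device.
lsblk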

Version-Release number of selected component (if applicable):
==============================================================
glusterfs-3.8.4-33 and gluster-block-0.2.1-6


How reproducible:
=================
2:2


Steps to Reproduce:
===================
1. Create a block with ha=3 on node1, node2 and node3
2. Execute block delete on node1, and kill gluster-block daemon on node2
3. Bring back gluster-block daemon on node2
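
A rough shell sketch of these steps, using placeholder names and assuming gluster-blockd runs as a systemd service of the same name:

# On node1: create an HA=3 block across the three nodes.
gluster-block create testvol/testblk ha 3 <node1-ip>,<node2-ip>,<node3-ip> 20M

# On node2, while the delete below is still running on node1:
systemctl stop gluster-blockd

# On node1: the delete is then reported as FAIL.
gluster-block delete testvol/testblk

# On node2: bring the daemon back.
systemctl start gluster-blockd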

Actual results:
===============
Step 2 reports the delete as failed. Meta information of the block shows that the delete succeeded on node1 and node3, and failed on node2.
'gluster-block info <volname>/<blockname>' shows the block as still present, with the 'BLOCK CONFIG NODE(S)' field listing only node2.

Expected results:
===============
A block should not be left in a non-healthy/partial state after a failed create or delete. The changes that have taken place in the system should either be rolled back, or explicit guidance should be provided on how to take corrective measures for such sub-optimal blocks.
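
As one hedged illustration of what such guidance could look like, a small check could compare each block's HA count against the number of configured nodes reported by 'gluster-block info'. This is only a sketch, not an official tool, and it assumes the info output format shown later in this report:

# Hypothetical sanity check: flag blocks whose configured-node count no
# longer matches their HA value, parsing the 'gluster-block info' output
# format shown elsewhere in this report.
VOL=testvol
for blk in $(gluster-block list "$VOL"); do
    info=$(gluster-block info "$VOL/$blk")
    ha=$(awk -F': *' '/^HA:/ {print $2}' <<< "$info")
    nodes=$(awk -F': *' '/^BLOCK CONFIG NODE\(S\):/ {print $2}' <<< "$info")
    count=$(echo "$nodes" | wc -w)
    if [ "$count" -ne "$ha" ]; then
        echo "WARNING: $VOL/$blk looks partial (HA=$ha, configured nodes=$count)"
    fi
done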


Additional info:
=================

[root@dhcp47-121 ~]# gluster v info testvol
 
Volume Name: testvol
Type: Replicate
Volume ID: 35a0b1a7-0dc3-4536-96aa-bd181b91c381
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.70.47.121:/bricks/brick2/testvol0
Brick2: 10.70.47.113:/bricks/brick2/testvol1
Brick3: 10.70.47.114:/bricks/brick2/testvol2
Options Reconfigured:
nfs.disable: on
transport.address-family: inet
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
performance.open-behind: off
performance.readdir-ahead: off
network.remote-dio: enable
cluster.eager-lock: enable
cluster.quorum-type: auto
cluster.data-self-heal-algorithm: full
cluster.locking-scheme: granular
cluster.shd-max-threads: 8
cluster.shd-wait-qlength: 10000
features.shard: on
user.cifs: off
server.allow-insecure: on
cluster.brick-multiplex: disable
cluster.enable-shared-storage: enable
[root@dhcp47-121 ~]# gluster-block info testvol/*
block name(*) should contain only aplhanumeric,'-' and '_' characters
[root@dhcp47-121 ~]# gluster-block info testvol/testblock
NAME: testblock
VOLUME: testvol
GBID: 80701a23-8483-46fc-a7aa-ef4588e036ba
SIZE: 1048576
HA: 2
PASSWORD: 
BLOCK CONFIG NODE(S): 10.70.47.113 10.70.47.121
[root@dhcp47-121 ~]# gluster-block info testvol/bk1
NAME: bk1
VOLUME: testvol
GBID: 4949789d-dc5c-47d6-a18e-7fc09c988d62
SIZE: 1048576
HA: 3
PASSWORD: 
BLOCK CONFIG NODE(S): 10.70.47.114 10.70.47.113 10.70.47.121
[root@dhcp47-121 ~]# gluster-block info testvol/bk2
NAME: bk2
VOLUME: testvol
GBID: 0288af11-603a-42f7-896f-d2fc498b900f
SIZE: 1048576
HA: 1
PASSWORD: 8ed48837-f22b-43f1-911c-6e635ca7a711
BLOCK CONFIG NODE(S): 10.70.47.114
[root@dhcp47-121 ~]# gluster-block info testvol/bk4
NAME: bk4
VOLUME: testvol
GBID: 626a71ca-2372-420b-b0b4-b449fa2d6f88
SIZE: 1048576
HA: 2
PASSWORD: c18af2c1-fc49-4ec5-bdd4-a1b3b358e905
BLOCK CONFIG NODE(S): 10.70.47.114 10.70.47.113
[root@dhcp47-121 ~]# 
[root@dhcp47-121 ~]# 
[root@dhcp47-121 ~]# 
[root@dhcp47-121 ~]# 
[root@dhcp47-121 ~]# gluster-block delete
Inadequate arguments for delete:
gluster-block delete <volname/blockname> [--json*]
[root@dhcp47-121 ~]# gluster-block delete testvol/bk1
FAILED ON:   10.70.47.113
SUCCESSFUL ON:   10.70.47.114 10.70.47.121
RESULT: FAIL
[root@dhcp47-121 ~]# gluster-block info testvol/bk1
NAME: bk1
VOLUME: testvol
GBID: 4949789d-dc5c-47d6-a18e-7fc09c988d62
SIZE: 1048576
HA: 3
PASSWORD: 
BLOCK CONFIG NODE(S): 10.70.47.113
[root@dhcp47-121 ~]# mount | grep testvol
[root@dhcp47-121 ~]# gluster-block list testvol
testblock
bk1
bk2
bk4
[root@dhcp47-121 ~]# mkdir /mnt/testvol
[root@dhcp47-121 ~]# mount -t glusterfs 10.70.47.121:testvol /mnt/testvol
[root@dhcp47-121 ~]# cd /mnt/testvol
[root@dhcp47-121 testvol]# cd block-meta/
[root@dhcp47-121 block-meta]# cat bk1
VOLUME: testvol
GBID: 4949789d-dc5c-47d6-a18e-7fc09c988d62
SIZE: 1048576
HA: 3
ENTRYCREATE: INPROGRESS
ENTRYCREATE: SUCCESS
10.70.47.114: CONFIGINPROGRESS
10.70.47.113: CONFIGINPROGRESS
10.70.47.121: CONFIGINPROGRESS
10.70.47.113: CONFIGSUCCESS
10.70.47.114: CONFIGSUCCESS
10.70.47.121: CONFIGSUCCESS
10.70.47.114: CLEANUPINPROGRESS
10.70.47.121: CLEANUPINPROGRESS
10.70.47.113: CLEANUPINPROGRESS
10.70.47.114: CLEANUPSUCCESS
10.70.47.121: CLEANUPSUCCESS
[root@dhcp47-121 block-meta]#

Comment 2 Sweta Anandpara 2017-07-24 09:05:41 UTC
Sosreports at the location http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/<bugnumber>/

[qe@rhsqe-repo 1474256]$ 
[qe@rhsqe-repo 1474256]$ pwd
/home/repo/sosreports/1474256
[qe@rhsqe-repo 1474256]$ 
[qe@rhsqe-repo 1474256]$ 
[qe@rhsqe-repo 1474256]$ hostname
rhsqe-repo.lab.eng.blr.redhat.com
[qe@rhsqe-repo 1474256]$ 
[qe@rhsqe-repo 1474256]$ ll
total 12
drwxr-xr-x. 2 qe qe 4096 Jul 24 14:30 gluster-block_dhcp47-113
drwxr-xr-x. 2 qe qe 4096 Jul 24 14:31 gluster-block_dhcp47-114
drwxr-xr-x. 2 qe qe 4096 Jul 24 14:29 gluster-block_dhcp47-121
[qe@rhsqe-repo 1474256]$

Comment 4 Sweta Anandpara 2017-07-25 04:18:59 UTC
> So, the design idea was that the user might run a cron job, ideally its duty is to look at the partially failed blocks and issue a delete every now and then.

Firstly, "issuing a delete every now and then" does not sound like good design decision, nor is it safe with probability of going wrong _every_now_and_then_.
Secondly, where is the book-keeping done of partially failed blocks? How do we get to know that a block is good, or bad?
Lastly, why rely on a cron job, or on the user side for doing the required clean-up? The moment we expose partially-baked-data to the user, there is more chance for a user to introduce harm to the system/environment.

My knowledge is a little limited in this area, but could you please give an example/reference of any other storage product that we ship where failed creates/deletes are left in the system as-is, without rolling back (or taking corrective measures on) the internal changes that have partially proceeded?

Comment 10 Sweta Anandpara 2017-08-27 13:39:03 UTC
Tested and verified this on the build gluster-block-0.2.1-8 and glusterfs-3.8.4-42.

The 'gluster-block delete' command is working as expected. If the gluster-blockd service is brought down on one of the HA nodes in the middle of a deletion, the delete command still goes ahead and deletes the block. No stale entries of the block remain, and the block's meta information no longer exists.

However, when the 'gluster-block create' command is executed and the gluster-blockd service goes down in the middle, the rollback of internal changes does not happen cleanly. Pasted below is the supporting output. The block 'bki' continues to exist in a partial state after a failed create.

[root@dhcp47-121 block-meta]# 
[root@dhcp47-121 block-meta]# for i in {a,b,c,d,e}; do gluster-block create ozone/bk$i ha 2 10.70.47.121,10.70.47.113 20M
> done
IQN: iqn.2016-12.org.gluster-block:26c9bf4d-b70d-4b9d-a8eb-bbf63ae1dc5d
PORTAL(S):  10.70.47.121:3260 10.70.47.113:3260
RESULT: SUCCESS
IQN: iqn.2016-12.org.gluster-block:2b6cbb1c-b7c9-4fce-985d-c17052d6e068
PORTAL(S):  10.70.47.121:3260 10.70.47.113:3260
RESULT: SUCCESS
IQN: iqn.2016-12.org.gluster-block:aa84393d-c82c-40b1-b5f2-d4c725ba0a1a
PORTAL(S):  10.70.47.121:3260 10.70.47.113:3260
RESULT: SUCCESS
IQN: iqn.2016-12.org.gluster-block:f34d5aa8-c4aa-4bf7-ad49-eb6094d89688
PORTAL(S):  10.70.47.121:3260 10.70.47.113:3260
RESULT: SUCCESS
failed to configure on 10.70.47.113 : Connection refused
RESULT:FAIL
[root@dhcp47-121 block-meta]# gluster-block list ozone
bk1
bka
bkb
bkc
bkd
[root@dhcp47-121 block-meta]# ls
bk1  bka  bkb  bkc  bkd  meta.lock
[root@dhcp47-121 block-meta]# 
[root@dhcp47-121 block-meta]# 
[root@dhcp47-121 block-meta]# for i in {f,g,h,i,j}; do gluster-block create ozone/bk$i ha 3 10.70.47.121,10.70.47.113,10.70.47.114 50M; done
IQN: iqn.2016-12.org.gluster-block:56afaae1-05fd-4e16-ba9f-7938eb6387bc
PORTAL(S):  10.70.47.121:3260 10.70.47.113:3260 10.70.47.114:3260
RESULT: SUCCESS
IQN: iqn.2016-12.org.gluster-block:5d5e6eb1-5e73-4fb3-a14b-013e634478ef
PORTAL(S):  10.70.47.121:3260 10.70.47.113:3260 10.70.47.114:3260
RESULT: SUCCESS
IQN: iqn.2016-12.org.gluster-block:496cf83d-6f89-4808-975e-bd5a0033542b
PORTAL(S):  10.70.47.121:3260 10.70.47.113:3260 10.70.47.114:3260
RESULT: SUCCESS
IQN: iqn.2016-12.org.gluster-block:cdd2a854-9ea7-4236-aeba-6321fa7b4247
PORTAL(S):  10.70.47.121:3260 10.70.47.113:3260
ROLLBACK ON: 10.70.47.114  10.70.47.113 10.70.47.121 
RESULT: FAIL
failed to configure on 10.70.47.114 : Connection refused
RESULT:FAIL
[root@dhcp47-121 block-meta]# gluster-block list ozone
bk1
bka
bkb
bkc
bkd
bkg
bki
bkf
bkh
[root@dhcp47-121 block-meta]# ls
bk1  bka  bkb  bkc  bkd  bkf  bkg  bkh  bki  meta.lock
[root@dhcp47-121 block-meta]# cat bki
VOLUME: ozone
GBID: cdd2a854-9ea7-4236-aeba-6321fa7b4247
SIZE: 52428800
HA: 3
ENTRYCREATE: INPROGRESS
ENTRYCREATE: SUCCESS
10.70.47.113: CONFIGINPROGRESS
10.70.47.114: CONFIGINPROGRESS
10.70.47.121: CONFIGINPROGRESS
10.70.47.114: CONFIGFAIL
10.70.47.113: CONFIGSUCCESS
10.70.47.121: CONFIGSUCCESS
10.70.47.114: CLEANUPINPROGRESS
10.70.47.113: CLEANUPINPROGRESS
10.70.47.121: CLEANUPINPROGRESS
10.70.47.113: CLEANUPSUCCESS
10.70.47.121: CLEANUPSUCCESS
[root@dhcp47-121 block-meta]# 
[root@dhcp47-121 block-meta]# gluster-block info ozone/bki
NAME: bki
VOLUME: ozone
GBID: cdd2a854-9ea7-4236-aeba-6321fa7b4247
SIZE: 52428800
HA: 3
PASSWORD: 
BLOCK CONFIG NODE(S): 10.70.47.114
[root@dhcp47-121 block-meta]#
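
A possible manual corrective measure for such a partially created block, sketched under the assumption that a plain delete succeeds once gluster-blockd is back up on all HA nodes:

# Sketch only: after gluster-blockd is running again on all HA nodes,
# retry the delete of the partially created block and confirm it is gone.
gluster-block delete ozone/bki
gluster-block list ozone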

Comment 11 Sweta Anandpara 2017-08-27 14:31:32 UTC
Gluster-block logs and sosreports copied to http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/<bugnumber>/new/

Comment 13 Prasanna Kumar Kalever 2017-08-29 11:15:18 UTC
Related change:
https://review.gluster.org/#/c/18131/

Comment 15 Sweta Anandpara 2017-09-12 10:15:00 UTC
Have raised https://bugzilla.redhat.com/show_bug.cgi?id=1490818 for the concern mentioned in comment 10, as instructed in comment 14.

Moving this bug to verified in RHGS 3.3.0. Supporting logs are present in comment 10.

Comment 17 errata-xmlrpc 2017-09-21 04:20:54 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:2773