Bug 1598300 - Occasional 'Version check failed' errors seen while creating/deleting blocks
Summary: Occasional 'Version check failed' errors seen while creating/deleting blocks
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: gluster-block
Version: cns-3.10
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: CNS 3.10
Assignee: Prasanna Kumar Kalever
QA Contact: Sweta Anandpara
URL:
Whiteboard:
Depends On:
Blocks: 1568862
Reported: 2018-07-05 03:56 UTC by Sweta Anandpara
Modified: 2018-09-24 04:01 UTC (History)

Fixed In Version: gluster-block-0.2.1-22.el7rhgs
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-09-12 09:27:16 UTC
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2018:2691 0 None None None 2018-09-12 09:28:22 UTC

Description Sweta Anandpara 2018-07-05 03:56:34 UTC
Description of problem:
=======================
On a 3-node, brick-mux-enabled cluster with a replica 3 volume whose group profile is set to 'gluster-block', running gluster-block create/delete in a loop occasionally fails for one seemingly random block in the middle of the loop. The create (or delete) fails with this error: "Version check failed between block servers. (host 10.70.46.176 returned -1)". All block creates/deletes before and after the affected block succeed. When the same command is rerun for the failed block after the loop finishes, it succeeds.

The gluster-block logs did not give much insight, and I do not see a pattern in the failures. It has happened about 3 times in the past week. Raising this as a relatively low-priority bug for now, since the same command succeeds when attempted again.

Sosreports were collected immediately after the latest occurrence and will be copied to http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/swetas/<bugnumber>.

Version-Release number of selected component (if applicable):
============================================================
glusterfs-3.8.4-54.13
gluster-block-0.2.1-20
tcmu-runner-1.2.0-20


How reproducible:
================
Intermittent


Additional info:
===============
[root@dhcp46-50 ozone0]# gluster-block list ozone
ob2
ob3
ob4
ob5
ob6
ob7
[root@dhcp46-50 ozone0]# for i in {2..7}; do gluster-block delete ozone/ob$i; done
SUCCESSFUL ON:   10.70.46.50 10.70.46.176 10.70.46.102
RESULT: SUCCESS
SUCCESSFUL ON:   10.70.46.50 10.70.46.176 10.70.46.102
RESULT: SUCCESS
Version check failed between block servers. (host 10.70.46.176 returned -1)
RESULT:FAIL
SUCCESSFUL ON:   10.70.46.50 10.70.46.102 10.70.46.176
RESULT: SUCCESS
SUCCESSFUL ON:   10.70.46.102 10.70.46.50 10.70.46.176
RESULT: SUCCESS
SUCCESSFUL ON:   10.70.46.176 10.70.46.102 10.70.46.50
RESULT: SUCCESS
[root@dhcp46-50 ozone0]# gluster-block list ozone
ob4
[root@dhcp46-50 ozone0]# gluster-block info ozone/ob4
NAME: ob4
VOLUME: ozone
GBID: 2e58d99f-c212-4d85-9390-b4d017d1a544
SIZE: 448.0 MiB
HA: 3
PASSWORD: b109ffff-3898-4fbe-97fb-9bab52f919db
EXPORTED ON: 10.70.46.50 10.70.46.102 10.70.46.176
[root@dhcp46-50 ozone0]# 
[root@dhcp46-50 ozone0]# 
[root@dhcp46-50 ozone0]# gluster-block delete ozone/ob4
SUCCESSFUL ON:   10.70.46.50 10.70.46.102 10.70.46.176
RESULT: SUCCESS
[root@dhcp46-50 ozone0]#
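
Since the transcript above shows the failed delete succeeding when rerun, a test loop could retry the failing block instead of leaving it behind. The following is only a minimal sketch of such a workaround, not the fix delivered in the erratum. The function name delete_with_retry and the BLOCK_CMD override are hypothetical, and success is detected by matching the "RESULT: SUCCESS" line seen in the output above rather than by exit status, which this report does not show.

```shell
# Hypothetical override so the helper can be exercised without a cluster;
# defaults to the real CLI used in the transcript.
BLOCK_CMD=${BLOCK_CMD:-gluster-block}

# delete_with_retry VOLUME/BLOCK [ATTEMPTS] [DELAY_SECONDS]
# Reruns the delete until its output contains "RESULT: SUCCESS"
# or the attempts are exhausted. Returns 0 on success, 1 otherwise.
delete_with_retry() {
    dwr_target=$1
    dwr_attempts=${2:-3}
    dwr_delay=${3:-2}
    dwr_n=1
    while [ "$dwr_n" -le "$dwr_attempts" ]; do
        dwr_out=$($BLOCK_CMD delete "$dwr_target" 2>&1)
        case $dwr_out in
            *"RESULT: SUCCESS"*) return 0 ;;
        esac
        # Keep the failure text so intermittent errors are still recorded.
        printf 'attempt %s for %s failed:\n%s\n' \
            "$dwr_n" "$dwr_target" "$dwr_out" >&2
        if [ "$dwr_n" -lt "$dwr_attempts" ]; then
            sleep "$dwr_delay"
        fi
        dwr_n=$((dwr_n + 1))
    done
    return 1
}
```

With this helper, the loop from the transcript would become: for i in {2..7}; do delete_with_retry ozone/ob$i; done. Each failed attempt is still written to stderr, so occurrences of the 'Version check failed' error are not hidden by the retry.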

Comment 3 Pranith Kumar K 2018-07-09 09:06:58 UTC
This must be fixed in 3.10. Giving devel-ack. Please provide QE-ack.

Comment 20 errata-xmlrpc 2018-09-12 09:27:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:2691

