Description of problem:
VM creation happens while one of the data bricks is down. Once the brick is brought back up, some entries do not get healed, and when the VM is migrated to another node it goes to a paused state, with the following errors logged in the mount logs:

[2016-12-23 09:14:16.481519] W [MSGID: 108008] [afr-self-heal-name.c:369:afr_selfheal_name_gfid_mismatch_check] 0-engine-replicate-0: GFID mismatch for <gfid:be318638-e8a0-4c6d-977d-7a937aa84806>/f735902d-12fa-4e4d-88c9-1b8ba06e3063.1673 6e17b733-b8a4-4563-bc3d-f659c9a46c2a on engine-client-1 and 55648f43-7e09-4e62-b7d2-16fe1ff7b23e on engine-client-0
[2016-12-23 09:14:16.482442] E [MSGID: 133010] [shard.c:1582:shard_common_lookup_shards_cbk] 0-engine-shard: Lookup on shard 1673 failed. Base file gfid = f735902d-12fa-4e4d-88c9-1b8ba06e3063 [Input/output error]
[2016-12-23 09:14:16.482474] W [fuse-bridge.c:2228:fuse_readv_cbk] 0-glusterfs-fuse: 11280842: READ => -1 gfid=f735902d-12fa-4e4d-88c9-1b8ba06e3063 fd=0x7faeda380210 (Input/output error)

[2016-12-23 10:08:41.956330] W [MSGID: 108008] [afr-self-heal-name.c:369:afr_selfheal_name_gfid_mismatch_check] 0-engine-replicate-0: GFID mismatch for <gfid:be318638-e8a0-4c6d-977d-7a937aa84806>/f735902d-12fa-4e4d-88c9-1b8ba06e3063.1673 6e17b733-b8a4-4563-bc3d-f659c9a46c2a on engine-client-1 and 55648f43-7e09-4e62-b7d2-16fe1ff7b23e on engine-client-0
[2016-12-23 10:08:41.957422] E [MSGID: 133010] [shard.c:1582:shard_common_lookup_shards_cbk] 0-engine-shard: Lookup on shard 1673 failed. Base file gfid = f735902d-12fa-4e4d-88c9-1b8ba06e3063 [Input/output error]
[2016-12-23 10:08:41.957444] W [fuse-bridge.c:2228:fuse_readv_cbk] 0-glusterfs-fuse: 11427307: READ => -1 gfid=f735902d-12fa-4e4d-88c9-1b8ba06e3063 fd=0x7faeda380328 (Input/output error)

[2016-12-23 10:45:10.609600] W [MSGID: 108008] [afr-self-heal-name.c:369:afr_selfheal_name_gfid_mismatch_check] 0-engine-replicate-0: GFID mismatch for <gfid:be318638-e8a0-4c6d-977d-7a937aa84806>/f735902d-12fa-4e4d-88c9-1b8ba06e3063.1673 6e17b733-b8a4-4563-bc3d-f659c9a46c2a on engine-client-1 and 55648f43-7e09-4e62-b7d2-16fe1ff7b23e on engine-client-0
[2016-12-23 10:45:10.610550] E [MSGID: 133010] [shard.c:1582:shard_common_lookup_shards_cbk] 0-engine-shard: Lookup on shard 1673 failed. Base file gfid = f735902d-12fa-4e4d-88c9-1b8ba06e3063 [Input/output error]
[2016-12-23 10:45:10.610574] W [fuse-bridge.c:2228:fuse_readv_cbk] 0-glusterfs-fuse: 11526955: READ => -1 gfid=f735902d-12fa-4e4d-88c9-1b8ba06e3063 fd=0x7faeda380184 (Input/output error)

Version-Release number of selected component (if applicable):
glusterfs-3.8.4-9.el7rhgs.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Install HC with three nodes.
2. Create an arbiter volume and enable all the options using gdeploy.
3. Bring down the first brick in the arbiter volume and create a VM.
4. Once the VM creation is completed, bring the brick back up and wait for self-heal to happen.
5. Migrate the VM to another host.

Actual results:
Two issues are seen:
1) Some entries on the node are still not healed even after a long time.
2) Once the VM is migrated, it goes to a paused state.

Expected results:
The VM should not go to a paused state after migration, and there should not be any entries left in volume heal info.

Additional info:
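For reference, a minimal command-level sketch of steps 3-5 above, assuming the 'engine' volume from this setup and that the brick is taken offline by stopping its glusterfsd process (the PID is a placeholder):

# gluster volume status engine          <-- note the PID of Brick1 (10.70.36.79:/rhgs/brick1/engine)
# kill -15 <brick-pid>                  <-- first data brick goes down; create the VM now
# gluster volume start engine force     <-- restarts the downed brick process once VM creation completes
# gluster volume heal engine info       <-- heal entries are expected to drain to zero before migrating the VM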
As suggested by Pranith, I disabled granular entry self-heal on the volume and no longer see the issue.
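A sketch of that workaround, using the volume option shown in the volume info below (exact syntax may vary by release; on some versions the option is toggled via 'gluster volume heal <VOLNAME> granular-entry-heal disable' instead):

# gluster volume set engine cluster.granular-entry-heal off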
gluster volume info:
==============================
[root@rhsqa-grafton1 ~]# gluster volume info engine

Volume Name: engine
Type: Replicate
Volume ID: f0ae3c3a-44ca-4a5e-aafa-b32be8330c11
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: 10.70.36.79:/rhgs/brick1/engine
Brick2: 10.70.36.80:/rhgs/brick1/engine
Brick3: 10.70.36.81:/rhgs/brick1/engine (arbiter)
Options Reconfigured:
auth.ssl-allow: 10.70.36.80,10.70.36.79,10.70.36.81
server.ssl: on
client.ssl: on
cluster.use-compound-fops: on
cluster.granular-entry-heal: on
performance.strict-o-direct: on
user.cifs: off
network.ping-timeout: 30
cluster.shd-max-threads: 8
cluster.shd-wait-qlength: 10000
cluster.locking-scheme: granular
performance.low-prio-threads: 32
features.shard-block-size: 4MB
storage.owner-gid: 36
storage.owner-uid: 36
cluster.data-self-heal-algorithm: full
features.shard: on
cluster.server-quorum-type: server
cluster.quorum-type: auto
network.remote-dio: off
cluster.eager-lock: enable
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on
sosreports can be found in the link below:
==============================================
http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/HC/1408426/
Note: The issue is not specific to arbiter per se. Assigning the bug to Krutika, who is working with Sas on the same issue in granular entry self-heal. Not changing the component to replicate though, since Kasturi tested it on an arbiter configuration.
Marking blocker? A VM pause means data unavailability.
Per the latest update from Pranith & Krutika, the cause of this issue is as explained in https://bugzilla.redhat.com/show_bug.cgi?id=1400057#c11. Though both issues (BZ 1400057 and this bug) will be solved by the same patch, both scenarios need to be re-tested with the patch in place. This bug needs to be acked as per process for RHGS 3.2.0.
Resuming from https://bugzilla.redhat.com/show_bug.cgi?id=1400057#c11 to explain why there would be a GFID mismatch, so please go through https://bugzilla.redhat.com/show_bug.cgi?id=1400057#c11 first. ... the pending xattrs on .shard are erased at this point. Now, when the brick that was down comes back online, another MKNOD on this shard's name (triggered by a shard readv fop), whenever it happens, would get EEXIST from the bricks that were already online, while on the brick that was previously offline the creation of this shard would succeed, although with a new GFID. This leads to the GFID mismatch.
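One way to see the mismatch described above is to compare the gfid xattr of the affected shard directly on the two data bricks. A sketch, assuming the brick paths from the volume info in this bug and the shard name from the mount log:

On 10.70.36.79 and 10.70.36.80 respectively:
# getfattr -n trusted.gfid -e hex /rhgs/brick1/engine/.shard/f735902d-12fa-4e4d-88c9-1b8ba06e3063.1673

Differing trusted.gfid values on the two bricks correspond to the GFID mismatch (6e17b733-... on engine-client-1 vs 55648f43-... on engine-client-0) that AFR logs on the mount.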
http://review.gluster.org/#/c/16286/
https://code.engineering.redhat.com/gerrit/#/c/93754/
Verified and works fine with build glusterfs-3.8.4-11.el7rhgs.x86_64.

Followed the steps below to verify the bug:
========================================
1. Install HC with three nodes.
2. Create an arbiter volume and enable all the options using gdeploy.
3. Bring down the first brick in the arbiter volume and create a VM.
4. Once the VM creation is completed, bring the brick back up and wait for self-heal to happen.
5. Migrate the VM to another host.

The VM migrated successfully and did not go to a paused state once the migration completed. No GFID mismatch was observed in the client logs.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2017-0486.html