Bug 1408426
Summary: with granular-entry-self-heal enabled i see that there is a gfid mismatch and vm goes to paused state after migrating to another host

| Field | Value | Field | Value |
|---|---|---|---|
| Product | [Red Hat Storage] Red Hat Gluster Storage | Reporter | RamaKasturi <knarra> |
| Component | arbiter | Assignee | Krutika Dhananjay <kdhananj> |
| Status | CLOSED ERRATA | QA Contact | RamaKasturi <knarra> |
| Severity | high | Docs Contact | |
| Priority | unspecified | | |
| Version | rhgs-3.2 | CC | amukherj, nchilaka, pkarampu, rcyriac, rhinduja, rhs-bugs, sasundar, storage-qa-internal |
| Target Milestone | --- | | |
| Target Release | RHGS 3.2.0 | | |
| Hardware | x86_64 | | |
| OS | Linux | | |
| Whiteboard | | | |
| Fixed In Version | glusterfs-3.8.4-11 | Doc Type | If docs needed, set a value |
| Doc Text | | Story Points | --- |
| Clone Of | | | |
| | 1408712 (view as bug list) | Environment | |
| Last Closed | 2017-03-23 05:59:24 UTC | Type | Bug |
| Regression | --- | Mount Type | --- |
| Documentation | --- | CRM | |
| Verified Versions | | Category | --- |
| oVirt Team | --- | RHEL 7.3 requirements from Atomic Host | |
| Cloudforms Team | --- | Target Upstream Version | |
| Embargoed | | | |
| Bug Depends On | | | |
| Bug Blocks | 1277939, 1351528, 1400057, 1408712, 1408785, 1408786 | | |
Description (RamaKasturi, 2016-12-23 10:56:06 UTC)
As suggested by Pranith, I disabled granular entry self-heal on the volume and I do not see the issue.

gluster volume info:
==============================
[root@rhsqa-grafton1 ~]# gluster volume info engine
Volume Name: engine
Type: Replicate
Volume ID: f0ae3c3a-44ca-4a5e-aafa-b32be8330c11
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: 10.70.36.79:/rhgs/brick1/engine
Brick2: 10.70.36.80:/rhgs/brick1/engine
Brick3: 10.70.36.81:/rhgs/brick1/engine (arbiter)
Options Reconfigured:
auth.ssl-allow: 10.70.36.80,10.70.36.79,10.70.36.81
server.ssl: on
client.ssl: on
cluster.use-compound-fops: on
cluster.granular-entry-heal: on
performance.strict-o-direct: on
user.cifs: off
network.ping-timeout: 30
cluster.shd-max-threads: 8
cluster.shd-wait-qlength: 10000
cluster.locking-scheme: granular
performance.low-prio-threads: 32
features.shard-block-size: 4MB
storage.owner-gid: 36
storage.owner-uid: 36
cluster.data-self-heal-algorithm: full
features.shard: on
cluster.server-quorum-type: server
cluster.quorum-type: auto
network.remote-dio: off
cluster.eager-lock: enable
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on

sosreports can be found in the link below:
==============================================
http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/HC/1408426/

Note: The issue is not specific to arbiter per se. Assigning the bug to Krutika, who is working with Sas on the same issue in granular entry self-heal. Not changing the component to replicate, though, since Kasturi tested it on an arbiter configuration.

Proposing as blocker, since a VM pause means data unavailability.

With the latest update from Pranith and Krutika, the issue is caused by the behaviour explained in https://bugzilla.redhat.com/show_bug.cgi?id=1400057#c11. Although both issues (BZ 1400057 and this bug) will be solved by the same patch, both scenarios need to be re-tested with the patch in place. This bug needs to be acked as per process for RHGS 3.2.0.

Resuming from https://bugzilla.redhat.com/show_bug.cgi?id=1400057#c11 to explain why there would be a gfid mismatch, so please go through https://bugzilla.redhat.com/show_bug.cgi?id=1400057#c11 first.

... the pending xattrs on .shard are at this point erased. Now, when the brick that was down comes back online, another MKNOD on this shard's name, triggered by a shard readv fop, whenever it happens, would fail with EEXIST on the bricks that were already online; on the brick that was previously offline, the creation of this shard would succeed, although with a new gfid. This leads to the gfid mismatch.

Verified and works fine with build glusterfs-3.8.4-11.el7rhgs.x86_64. Followed the steps below to verify the bug:
========================================
1. Install HC with three nodes.
2. Create an arbiter volume and enable all the options using gdeploy.
3. Now bring down the first brick in the arbiter volume and create a vm.
4. Once the vm creation is completed, bring back the brick and wait for self-heal to happen.
5. Now migrate the vm to another host.

I see that the vm has been migrated successfully and do not see a vm pause once migration is completed. Did not observe any gfid mismatch in the client logs.
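For reference, the workaround mentioned in the first comment (disabling granular entry self-heal) and its reversal come down to a couple of gluster CLI calls. This is a minimal sketch, assuming the "engine" volume from the volume info above; option names are as listed under "Options Reconfigured".

```sh
# Show the current setting of the option on the volume.
gluster volume get engine cluster.granular-entry-heal

# Workaround described in the report: turn granular entry heal off.
gluster volume set engine cluster.granular-entry-heal off

# Once a fixed build (glusterfs-3.8.4-11 or later) is installed, turn it back on.
gluster volume set engine cluster.granular-entry-heal on
# The heal CLI exposes the same toggle on these releases:
# gluster volume heal engine granular-entry-heal enable
```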
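To see the gfid mismatch described above on disk, the trusted.gfid xattr of the affected shard can be compared across the three bricks. A minimal sketch, assuming the brick paths from the volume info above and a hypothetical shard name (real shard names are `<gfid-of-base-file>.<shard-number>` under `.shard`):

```sh
# Run on each of the three servers hosting a brick; the path uses the brick
# layout from the volume info above and a placeholder shard name.
SHARD=".shard/<gfid-of-base-file>.<shard-number>"   # hypothetical example name

# The hex value printed for trusted.gfid must be identical on all three bricks.
# A different value on the previously offline brick is the mismatch described above.
getfattr -n trusted.gfid -e hex /rhgs/brick1/engine/"$SHARD"
```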
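The brick-down / brick-up part of the verification steps maps to standard gluster CLI operations. A minimal sketch, assuming the "engine" volume and the first brick from the volume info above; the VM creation and migration themselves are performed from the RHV/oVirt side, and the client log path in the last command is an assumption based on a typical RHV gluster storage-domain mount.

```sh
# Step 3: locate the first brick's PID and take the brick down.
gluster volume status engine
kill <brick-pid>                       # placeholder: PID of 10.70.36.79:/rhgs/brick1/engine
                                       # taken from the status output above

# ... create the VM while the brick is offline ...

# Step 4: bring the brick back and wait for self-heal to finish.
gluster volume start engine force
gluster volume heal engine info        # repeat until all bricks report "Number of entries: 0"

# Step 5: after migrating the VM, check the fuse client log for gfid mismatch messages.
grep -i "gfid mismatch" /var/log/glusterfs/rhev-data-center-mnt-glusterSD-*.log
```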
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2017-0486.html