Description of problem:
VM creation happens while one of the data bricks is down. Once the brick is brought back up, some entries do not get healed, and when the VM is migrated to another node it goes to a paused state, with the following errors logged in the mount logs:

[2016-12-23 09:14:16.481519] W [MSGID: 108008] [afr-self-heal-name.c:369:afr_selfheal_name_gfid_mismatch_check] 0-engine-replicate-0: GFID mismatch for <gfid:be318638-e8a0-4c6d-977d-7a937aa84806>/f735902d-12fa-4e4d-88c9-1b8ba06e3063.1673 6e17b733-b8a4-4563-bc3d-f659c9a46c2a on engine-client-1 and 55648f43-7e09-4e62-b7d2-16fe1ff7b23e on engine-client-0
[2016-12-23 09:14:16.482442] E [MSGID: 133010] [shard.c:1582:shard_common_lookup_shards_cbk] 0-engine-shard: Lookup on shard 1673 failed. Base file gfid = f735902d-12fa-4e4d-88c9-1b8ba06e3063 [Input/output error]
[2016-12-23 09:14:16.482474] W [fuse-bridge.c:2228:fuse_readv_cbk] 0-glusterfs-fuse: 11280842: READ => -1 gfid=f735902d-12fa-4e4d-88c9-1b8ba06e3063 fd=0x7faeda380210 (Input/output error)

[2016-12-23 10:08:41.956330] W [MSGID: 108008] [afr-self-heal-name.c:369:afr_selfheal_name_gfid_mismatch_check] 0-engine-replicate-0: GFID mismatch for <gfid:be318638-e8a0-4c6d-977d-7a937aa84806>/f735902d-12fa-4e4d-88c9-1b8ba06e3063.1673 6e17b733-b8a4-4563-bc3d-f659c9a46c2a on engine-client-1 and 55648f43-7e09-4e62-b7d2-16fe1ff7b23e on engine-client-0
[2016-12-23 10:08:41.957422] E [MSGID: 133010] [shard.c:1582:shard_common_lookup_shards_cbk] 0-engine-shard: Lookup on shard 1673 failed. Base file gfid = f735902d-12fa-4e4d-88c9-1b8ba06e3063 [Input/output error]
[2016-12-23 10:08:41.957444] W [fuse-bridge.c:2228:fuse_readv_cbk] 0-glusterfs-fuse: 11427307: READ => -1 gfid=f735902d-12fa-4e4d-88c9-1b8ba06e3063 fd=0x7faeda380328 (Input/output error)

[2016-12-23 10:45:10.609600] W [MSGID: 108008] [afr-self-heal-name.c:369:afr_selfheal_name_gfid_mismatch_check] 0-engine-replicate-0: GFID mismatch for <gfid:be318638-e8a0-4c6d-977d-7a937aa84806>/f735902d-12fa-4e4d-88c9-1b8ba06e3063.1673 6e17b733-b8a4-4563-bc3d-f659c9a46c2a on engine-client-1 and 55648f43-7e09-4e62-b7d2-16fe1ff7b23e on engine-client-0
[2016-12-23 10:45:10.610550] E [MSGID: 133010] [shard.c:1582:shard_common_lookup_shards_cbk] 0-engine-shard: Lookup on shard 1673 failed. Base file gfid = f735902d-12fa-4e4d-88c9-1b8ba06e3063 [Input/output error]
[2016-12-23 10:45:10.610574] W [fuse-bridge.c:2228:fuse_readv_cbk] 0-glusterfs-fuse: 11526955: READ => -1 gfid=f735902d-12fa-4e4d-88c9-1b8ba06e3063 fd=0x7faeda380184 (Input/output error)

Version-Release number of selected component (if applicable):
glusterfs-3.8.4-9.el7rhgs.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Install HC with three nodes.
2. Create an arbiter volume and enable all the options using gdeploy.
3. Bring down the first brick in the arbiter volume and create a VM.
4. Once the VM creation is completed, bring the brick back up and wait for self-heal to happen.
5. Migrate the VM to another host.

Actual results:
Two issues are seen:
1) Some entries on the node are still not healed even after a long time.
2) Once the VM is migrated, it goes to a paused state.

Expected results:
The VM should not go to a paused state after migration, and there should not be any entries left in volume heal info.

Additional info:
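For reference, a minimal command-level sketch of steps 3-5 above, assuming the 'engine' volume from this setup and that the brick is taken offline by stopping its glusterfsd process (the PID is a placeholder):

# gluster volume status engine          <-- note the PID of Brick1 (10.70.36.79:/rhgs/brick1/engine)
# kill -15 <brick-pid>                  <-- first data brick goes down; create the VM now
# gluster volume start engine force     <-- restarts the downed brick process once VM creation completes
# gluster volume heal engine info       <-- heal entries are expected to drain to zero before migrating the VM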
As suggested by Pranith, I disabled granular entry self-heal on the volume and no longer see the issue.
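A sketch of that workaround, using the volume option shown in the volume info below (exact syntax may vary by release; on some versions the option is toggled via 'gluster volume heal <VOLNAME> granular-entry-heal disable' instead):

# gluster volume set engine cluster.granular-entry-heal off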
gluster volume info:
==============================
[root@rhsqa-grafton1 ~]# gluster volume info engine

Volume Name: engine
Type: Replicate
Volume ID: f0ae3c3a-44ca-4a5e-aafa-b32be8330c11
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: 10.70.36.79:/rhgs/brick1/engine
Brick2: 10.70.36.80:/rhgs/brick1/engine
Brick3: 10.70.36.81:/rhgs/brick1/engine (arbiter)
Options Reconfigured:
auth.ssl-allow: 10.70.36.80,10.70.36.79,10.70.36.81
server.ssl: on
client.ssl: on
cluster.use-compound-fops: on
cluster.granular-entry-heal: on
performance.strict-o-direct: on
user.cifs: off
network.ping-timeout: 30
cluster.shd-max-threads: 8
cluster.shd-wait-qlength: 10000
cluster.locking-scheme: granular
performance.low-prio-threads: 32
features.shard-block-size: 4MB
storage.owner-gid: 36
storage.owner-uid: 36
cluster.data-self-heal-algorithm: full
features.shard: on
cluster.server-quorum-type: server
cluster.quorum-type: auto
network.remote-dio: off
cluster.eager-lock: enable
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on
sosreports can be found in the link below:
==============================================
http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/HC/1408426/
Note: The issue is not specific to arbiter per se. Assigning the bug to Krutika, who is working with Sas on the same issue in granular entry self-heal. Not changing the component to replicate though, since Kasturi tested it on an arbiter configuration.
Marking blocker? A VM pause means data unavailability.
Per the latest update from Pranith & Krutika, the cause of this issue is as explained in https://bugzilla.redhat.com/show_bug.cgi?id=1400057#c11. Though both issues (BZ 1400057 and this bug) will be solved by the same patch, both scenarios need to be re-tested with the patch in place. This bug needs to be acked as per process for RHGS 3.2.0.
Resuming from https://bugzilla.redhat.com/show_bug.cgi?id=1400057#c11 to explain why there would be a GFID mismatch, so please go through https://bugzilla.redhat.com/show_bug.cgi?id=1400057#c11 first. ... the pending xattrs on .shard are erased at this point. Now, when the brick that was down comes back online, another MKNOD on this shard's name (triggered by a shard readv fop), whenever it happens, would get EEXIST from the bricks that were already online, while on the brick that was previously offline the creation of this shard would succeed, although with a new GFID. This leads to the GFID mismatch.
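One way to see the mismatch described above is to compare the gfid xattr of the affected shard directly on the two data bricks. A sketch, assuming the brick paths from the volume info in this bug and the shard name from the mount log:

On 10.70.36.79 and 10.70.36.80 respectively:
# getfattr -n trusted.gfid -e hex /rhgs/brick1/engine/.shard/f735902d-12fa-4e4d-88c9-1b8ba06e3063.1673

Differing trusted.gfid values on the two bricks correspond to the GFID mismatch (6e17b733-... on engine-client-1 vs 55648f43-... on engine-client-0) that AFR logs on the mount.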
http://review.gluster.org/#/c/16286/
https://code.engineering.redhat.com/gerrit/#/c/93754/
Verified and works fine with build glusterfs-3.8.4-11.el7rhgs.x86_64.

Followed the steps below to verify the bug:
========================================
1. Install HC with three nodes.
2. Create an arbiter volume and enable all the options using gdeploy.
3. Bring down the first brick in the arbiter volume and create a VM.
4. Once the VM creation is completed, bring the brick back up and wait for self-heal to happen.
5. Migrate the VM to another host.

The VM migrated successfully and did not go to a paused state once the migration completed. No GFID mismatch was observed in the client logs.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2017-0486.html