Bug 1335378
| Summary: | self-heal is not happening and even heal info command is hung | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | RajeshReddy <rmekala> |
| Component: | replicate | Assignee: | Ashish Pandey <aspandey> |
| Status: | CLOSED NOTABUG | QA Contact: | SATHEESARAN <sasundar> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | medium | | |
| Version: | rhgs-3.1 | CC: | amukherj, aspandey, mzywusko, pkarampu, rcyriac, rhs-bugs, sabose, sasundar |
| Target Milestone: | --- | Keywords: | ZStream |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2016-11-17 09:35:35 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1277939 | | |
Description (RajeshReddy, 2016-05-12 06:48:33 UTC)
sosreports are available on rhsqe-repo.lab.eng.blr.redhat.com @/home/repo/sosreports/bug.1335378

There seems to be an issue with the unlock not being sent to the brick, even when the connection from client to server is available. I also took some statedumps, which are important. I am not yet sure which condition can lead to this bug; it is in either the client or server translators.

As per the brick logs:

```
[root@cambridge bricks]# grep yarrow.lab.eng.blr.redhat.com rhgs-vmaddldiskbrick-vmaddl-brick1.log | grep gf_client_unref | grep -v client-0-0-0
[2016-05-11 09:42:00.587879] I [MSGID: 101055] [client_t.c:420:gf_client_unref] 0-volume2-server: Shutting down connection yarrow.lab.eng.blr.redhat.com-4889-2016/05/11-09:32:06:839412-volume2-client-0-0-1
[2016-05-11 11:56:54.945712] I [MSGID: 101055] [client_t.c:420:gf_client_unref] 0-volume2-server: Shutting down connection yarrow.lab.eng.blr.redhat.com-7343-2016/05/10-08:37:23:721784-volume2-client-0-0-8

[root@cambridge bricks]# grep yarrow.lab.eng.blr.redhat.com rhgs-vmaddldiskbrick-vmaddl-brick1.log | grep setvolume | grep -v client-0-0-0
[2016-05-10 09:03:09.154583] I [MSGID: 115029] [server-handshake.c:690:server_setvolume] 0-volume2-server: accepted client from yarrow.lab.eng.blr.redhat.com-7343-2016/05/10-08:37:23:721784-volume2-client-0-0-1 (version: 3.7.9)
[2016-05-10 09:03:09.236874] I [MSGID: 115029] [server-handshake.c:690:server_setvolume] 0-volume2-server: accepted client from yarrow.lab.eng.blr.redhat.com-6805-2016/05/10-08:37:19:10587-volume2-client-0-0-1 (version: 3.7.9)
[2016-05-11 07:56:53.538451] I [MSGID: 115029] [server-handshake.c:690:server_setvolume] 0-volume2-server: accepted client from yarrow.lab.eng.blr.redhat.com-7343-2016/05/10-08:37:23:721784-volume2-client-0-0-2 (version: 3.7.9)
[2016-05-11 09:32:07.295064] I [MSGID: 115029] [server-handshake.c:690:server_setvolume] 0-volume2-server: accepted client from yarrow.lab.eng.blr.redhat.com-7343-2016/05/10-08:37:23:721784-volume2-client-0-0-3 (version: 3.7.9)
[2016-05-11 09:42:00.132986] I [MSGID: 115029] [server-handshake.c:690:server_setvolume] 0-volume2-server: accepted client from yarrow.lab.eng.blr.redhat.com-7343-2016/05/10-08:37:23:721784-volume2-client-0-0-4 (version: 3.7.9)
[2016-05-11 09:42:00.491152] I [MSGID: 115029] [server-handshake.c:690:server_setvolume] 0-volume2-server: accepted client from yarrow.lab.eng.blr.redhat.com-4889-2016/05/11-09:32:06:839412-volume2-client-0-0-1 (version: 3.7.9)
[2016-05-11 09:50:26.619117] I [MSGID: 115029] [server-handshake.c:690:server_setvolume] 0-volume2-server: accepted client from yarrow.lab.eng.blr.redhat.com-7343-2016/05/10-08:37:23:721784-volume2-client-0-0-5 (version: 3.7.9)
[2016-05-11 11:27:36.296603] I [MSGID: 115029] [server-handshake.c:690:server_setvolume] 0-volume2-server: accepted client from yarrow.lab.eng.blr.redhat.com-7343-2016/05/10-08:37:23:721784-volume2-client-0-0-6 (version: 3.7.9)
[2016-05-11 11:35:44.525578] I [MSGID: 115029] [server-handshake.c:690:server_setvolume] 0-volume2-server: accepted client from yarrow.lab.eng.blr.redhat.com-7343-2016/05/10-08:37:23:721784-volume2-client-0-0-7 (version: 3.7.9)
[2016-05-11 11:53:19.309196] I [MSGID: 115029] [server-handshake.c:690:server_setvolume] 0-volume2-server: accepted client from yarrow.lab.eng.blr.redhat.com-7343-2016/05/10-08:37:23:721784-volume2-client-0-0-8 (version: 3.7.9)
[2016-05-11 11:56:50.256846] I [MSGID: 115029] [server-handshake.c:690:server_setvolume] 0-volume2-server: accepted client from yarrow.lab.eng.blr.redhat.com-7343-2016/05/10-08:37:23:721784-volume2-client-0-0-9 (version: 3.7.9)
```

The new connection to the brick happened at [2016-05-11 11:56:50.256846] and is still active.
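The correlation above (clients accepted via server_setvolume vs. connections torn down via gf_client_unref) can also be done mechanically. This is a hypothetical triage helper, not part of gluster; the regexes assume only the log line shapes quoted above, and the sample lines are taken verbatim from the brick log in this bug.

```python
import re

# "accepted client from <connection-id> (version: ...)" lines
ACCEPT_RE = re.compile(r"accepted client from (\S+) \(version")
# "Shutting down connection <connection-id>" lines
UNREF_RE = re.compile(r"Shutting down connection (\S+)$")

def live_connections(log_lines):
    """Return connection IDs that were accepted but never shut down."""
    accepted, closed = set(), set()
    for line in log_lines:
        m = ACCEPT_RE.search(line)
        if m:
            accepted.add(m.group(1))
            continue
        m = UNREF_RE.search(line)
        if m:
            closed.add(m.group(1))
    return accepted - closed

log = [
    "[2016-05-11 11:53:19.309196] I [MSGID: 115029] [server-handshake.c:690:server_setvolume] 0-volume2-server: accepted client from yarrow.lab.eng.blr.redhat.com-7343-2016/05/10-08:37:23:721784-volume2-client-0-0-8 (version: 3.7.9)",
    "[2016-05-11 11:56:50.256846] I [MSGID: 115029] [server-handshake.c:690:server_setvolume] 0-volume2-server: accepted client from yarrow.lab.eng.blr.redhat.com-7343-2016/05/10-08:37:23:721784-volume2-client-0-0-9 (version: 3.7.9)",
    "[2016-05-11 11:56:54.945712] I [MSGID: 101055] [client_t.c:420:gf_client_unref] 0-volume2-server: Shutting down connection yarrow.lab.eng.blr.redhat.com-7343-2016/05/10-08:37:23:721784-volume2-client-0-0-8",
]

# the -0-0-9 connection is accepted but never unref'd
print(live_connections(log))
```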
But both fxattrop and finodelk failed with "Transport endpoint is not connected":

```
[2016-05-11 11:56:54.944907] W [MSGID: 114031] [client-rpc-fops.c:1917:client3_3_fxattrop_cbk] 0-volume2-client-0: remote operation failed
[2016-05-11 11:56:54.945728] W [MSGID: 108001] [afr-transaction.c:729:afr_handle_quorum] 0-volume2-replicate-0: 9805e9cb-8c28-4c1d-aaf1-326e331d23f8: Failing FXATTROP as quorum is not met
[2016-05-11 11:56:54.945763] E [MSGID: 114031] [client-rpc-fops.c:1676:client3_3_finodelk_cbk] 0-volume2-client-0: remote operation failed [Transport endpoint is not connected]
[2016-05-11 11:56:54.946131] E [MSGID: 133016] [shard.c:631:shard_update_file_size_cbk] 0-volume2-shard: Update to file size xattr failed on 9805e9cb-8c28-4c1d-aaf1-326e331d23f8 [Read-only file system]
```

This is the stale lock:

```
[xlator.features.locks.volume2-locks.inode]
path=/8c8b8d20-54d5-41cb-8401-699ee877537b/images/67da0afc-e687-435e-a16d-88c56d876dcc/ed6c954a-e5f5-4e09-959d-0759c566bb65.lease
mandatory=0
inodelk-count=4
lock-dump.domain.domain=volume2-replicate-0:self-heal
lock-dump.domain.domain=volume2-replicate-0
lock-dump.domain.domain=volume2-replicate-0:metadata
inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=9223372036854775806, len=0, pid = 16355, owner=d8b8d88ed37f0000, client=0x7ffbb81136a0, connection-id=yarrow.lab.eng.blr.redhat.com-7343-2016/05/10-08:37:23:721784-volume2-client-0-0-9, granted at 2016-05-11 11:56:54 <<------ Unlock failed because of transport endpoint not connected. Which is the bug.
inodelk.inodelk[1](BLOCKED)=type=WRITE, whence=0, start=9223372036854775806, len=0, pid = 18446744073709551610, owner=70a43167907f0000, client=0x7ffbb80f00d0, connection-id=cambridge.lab.eng.blr.redhat.com-21500-2016/05/11-11:53:19:288932-volume2-client-0-0-0, blocked at 2016-05-12 03:28:55
inodelk.inodelk[2](BLOCKED)=type=WRITE, whence=0, start=9223372036854775806, len=0, pid = 18446744073709551610, owner=80c447afb27f0000, client=0x7ffbb80f2980, connection-id=moonshine.lab.eng.blr.redhat.com-14476-2016/05/11-11:53:21:329730-volume2-client-0-0-0, blocked at 2016-05-12 03:28:55
inodelk.inodelk[3](BLOCKED)=type=WRITE, whence=0, start=9223372036854775806, len=0, pid = 15709, owner=5d3d0000, client=0x7ffbb0021200, connection-id=yarrow.lab.eng.blr.redhat.com-15709-2016/05/12-03:29:21:720440-volume2-client-0-0-0, blocked at 2016-05-12 03:29:21
```

This bug happened once, and I don't think we have clear steps to re-create the issue. So far we do not have a root cause for why it happened. Can we try to get a consistent reproducer here?

Do we have a reproducer or shall we close this?

(In reply to Sahina Bose from comment #8)
> Do we have a reproducer or shall we close this?

I am hitting this issue again, where I see that self-heal info is hung. I have tested with an RHGS 3.2.0 interim build (glusterfs-3.8.4-5.el7rhgs). These are the steps I did:

1. Created a replica 3 volume.
2. Optimized the volume for the VM store use case.
3. Created a VM (which uses gfapi to access its disk).
4. Started OS installation.
5. While the OS installation was happening, killed the brick (kill <pid>) on server1.
6. Observed heal-info reporting unhealed entries.
7. Created another VM and installed an OS in it too.
8. Brought the brick back (by force-starting the volume).
9. Once the brick was up (confirmed with gluster volume status), heal-info just did not respond.

(In reply to SATHEESARAN from comment #9)
> I am hitting this issue again, where I see that self-heal info is hung.
This issue is seen with compound-fops only, which was not the reason this bug was raised initially. I will raise a separate bug for that issue. Closing this bug.
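For anyone triaging a similar hang, the inodelk entries in a brick statedump can be parsed mechanically to surface ACTIVE locks whose connection-id no longer corresponds to a live client. A minimal sketch, assuming only the statedump line format quoted in this bug; `parse_inodelks` is a hypothetical helper, not a gluster tool.

```python
import re

# Matches lines like:
#   inodelk.inodelk[0](ACTIVE)=type=WRITE, ..., pid = 16355,
#   owner=d8b8d88ed37f0000, ..., connection-id=<id>, granted at ...
LOCK_RE = re.compile(
    r"inodelk\.inodelk\[(?P<idx>\d+)\]\((?P<state>ACTIVE|BLOCKED)\)="
    r".*?pid = (?P<pid>\d+), owner=(?P<owner>[0-9a-f]+), "
    r".*?connection-id=(?P<conn>[^,]+),"
)

def parse_inodelks(dump_lines):
    """Extract index, state, pid, owner, and connection-id of each inodelk."""
    locks = []
    for line in dump_lines:
        m = LOCK_RE.search(line)
        if m:
            locks.append(m.groupdict())
    return locks

# Sample lines taken verbatim from the statedump quoted in this bug.
dump = [
    "inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=9223372036854775806, len=0, pid = 16355, owner=d8b8d88ed37f0000, client=0x7ffbb81136a0, connection-id=yarrow.lab.eng.blr.redhat.com-7343-2016/05/10-08:37:23:721784-volume2-client-0-0-9, granted at 2016-05-11 11:56:54",
    "inodelk.inodelk[3](BLOCKED)=type=WRITE, whence=0, start=9223372036854775806, len=0, pid = 15709, owner=5d3d0000, client=0x7ffbb0021200, connection-id=yarrow.lab.eng.blr.redhat.com-15709-2016/05/12-03:29:21:720440-volume2-client-0-0-0, blocked at 2016-05-12 03:29:21",
]

# ACTIVE locks are the stale-lock suspects: cross-check each connection-id
# against the brick's live connections.
stale_suspects = [l for l in parse_inodelks(dump) if l["state"] == "ACTIVE"]
print(stale_suspects)
```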