On my systemic setup, I am seeing a lot of error messages on my clients, as below:

[2016-11-14 02:43:50.274000] E [snapview-client.c:283:gf_svc_lookup_cbk] 0-sysvol-snapview-client: Lookup failed on normal graph with error Transport endpoint is not connected
[2016-11-14 02:43:50.275390] E [dht-helper.c:1666:dht_inode_ctx_time_update] (-->/usr/lib64/glusterfs/3.8.4/xlator/cluster/replicate.so(+0x5d75c) [0x7f2a4ee4175c] -->/usr/lib64/glusterfs/3.8.4/xlator/cluster/distribute.so(+0x4623c) [0x7f2a4eba023c] -->/usr/lib64/glusterfs/3.8.4/xlator/cluster/distribute.so(+0x99b0) [0x7f2a4eb639b0] ) 0-sysvol-dht: invalid argument: inode [Invalid argument]

These messages repeat in bulk about every 20 minutes.

Vol info is as below:

Volume Name: sysvol
Type: Distributed-Replicate
Volume ID: b1ef4d84-0614-4d5d-9e2e-b19183996e43
Status: Started
Snapshot Count: 0
Number of Bricks: 4 x 2 = 8
Transport-type: tcp
Bricks:
Brick1: 10.70.35.191:/rhs/brick1/sysvol
Brick2: 10.70.37.108:/rhs/brick1/sysvol
Brick3: 10.70.35.3:/rhs/brick1/sysvol
Brick4: 10.70.37.66:/rhs/brick1/sysvol
Brick5: 10.70.35.191:/rhs/brick2/sysvol
Brick6: 10.70.37.108:/rhs/brick2/sysvol
Brick7: 10.70.35.3:/rhs/brick2/sysvol
Brick8: 10.70.37.66:/rhs/brick2/sysvol
Options Reconfigured:
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
performance.stat-prefetch: on
performance.cache-invalidation: on
cluster.shd-max-threads: 10
features.cache-invalidation-timeout: 400
features.cache-invalidation: on
performance.md-cache-timeout: 300
features.uss: on
features.quota-deem-statfs: on
features.inode-quota: on
features.quota: on
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on

[root@dhcp35-191 ~]# gluster v status
Status of volume: sysvol
Gluster process                            TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.191:/rhs/brick1/sysvol      N/A       N/A        N       N/A
Brick 10.70.37.108:/rhs/brick1/sysvol      49152     0          Y       27848
Brick 10.70.35.3:/rhs/brick1/sysvol        N/A       N/A        N       N/A
Brick 10.70.37.66:/rhs/brick1/sysvol       49152     0          Y       28853
Brick 10.70.35.191:/rhs/brick2/sysvol      49153     0          Y       18344
Brick 10.70.37.108:/rhs/brick2/sysvol      N/A       N/A        N       N/A
Brick 10.70.35.3:/rhs/brick2/sysvol        49153     0          Y       11727
Brick 10.70.37.66:/rhs/brick2/sysvol       N/A       N/A        N       N/A
Snapshot Daemon on localhost               49154     0          Y       18461
Self-heal Daemon on localhost              N/A       N/A        Y       18364
Quota Daemon on localhost                  N/A       N/A        Y       18410
Snapshot Daemon on 10.70.35.3              49154     0          Y       11826
Self-heal Daemon on 10.70.35.3             N/A       N/A        Y       11747
Quota Daemon on 10.70.35.3                 N/A       N/A        Y       11779
Snapshot Daemon on 10.70.37.66             49154     0          Y       28970
Self-heal Daemon on 10.70.37.66            N/A       N/A        Y       28892
Quota Daemon on 10.70.37.66                N/A       N/A        Y       28923
Snapshot Daemon on 10.70.37.108            49154     0          Y       27965
Self-heal Daemon on 10.70.37.108           N/A       N/A        Y       27887
Quota Daemon on 10.70.37.108               N/A       N/A        Y       27918

Task Status of Volume sysvol
------------------------------------------------------------------------------
There are no active volume tasks

Client IO patterns can be found at:
https://docs.google.com/spreadsheets/d/1iP5Mi1TewBFVh8HTmlcBm9072Bgsbgkr3CLcGmawDys/edit#gid=760435885
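For anyone triaging a similar setup, the two checks below are a quick way to correlate the offline bricks with the error bursts. This is a hedged sketch: /var/log/glusterfs/mnt-sysvol.log is a hypothetical client log path (substitute the log file for your own mount point), and the awk field positions assume the status layout shown above.

# List the bricks that "gluster volume status" reports as offline
# (Online column == N); field positions assume the 6-column layout above.
gluster volume status sysvol | awk '/^Brick/ && $(NF-1) == "N" {print $2}'

# Count error occurrences per minute in the client log to confirm the
# roughly-every-20-minutes bursts. Log path is an assumption.
grep 'dht_inode_ctx_time_update' /var/log/glusterfs/mnt-sysvol.log | awk '{print $1, substr($2, 1, 5)}' | uniq -c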
Upstream patch (master): http://review.gluster.org/#/c/15847/
Patches:

Upstream:
  master:      http://review.gluster.org/15847
  release-3.8: http://review.gluster.org/15850
  release-3.9: http://review.gluster.org/15851

Downstream:
  https://code.engineering.redhat.com/gerrit/#/c/90283/
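A quick way to tell whether an installed client already carries the downstream fix is to compare the installed package NVR against the build on which this was verified (3.8.4-8.el7rhgs, per the verification comment below):

# Print the installed glusterfs client package versions.
rpm -q glusterfs glusterfs-fuse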
Verified this BZ on glusterfs version 3.8.4-8.el7rhgs.x86_64. Followed the same steps mentioned in Comment 2 and did not see the errors reported here. Moving this BZ to Verified.
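For a re-check after updating, the same hypothetical client log path as above can be grepped for both error strings from the original report; with the fixed build under the same brick-down workload, both counts should stay at 0:

# Both paths below are assumptions; substitute your mount's log file.
grep -c 'invalid argument: inode' /var/log/glusterfs/mnt-sysvol.log
grep -c 'Lookup failed on normal graph' /var/log/glusterfs/mnt-sysvol.log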
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2017-0486.html