Description of problem:
=======================
In my systemic setup, I have a 4x2 volume with IOs being done from multiple clients. From two of the clients I issued creates of the same directory structure in a loop (shown below), and I am seeing "Invalid argument" messages in the client log:

[2016-09-30 06:34:58.938667] W [dict.c:418:dict_set] (-->/usr/lib64/glusterfs/3.8.4/xlator/cluster/replicate.so(+0x58722) [0x7f7a1c50f722] -->/usr/lib64/libglusterfs.so.0(dict_set_str+0x3c) [0x7f7a2a3d178c] -->/usr/lib64/libglusterfs.so.0(dict_set+0x113) [0x7f7a2a3d0bc3] ) 0-dict: !this || !value for key=link-count [Invalid argument]

The loop run from each client:

for i in {1..100}; do
  for j in {1..100}; do
    for k in {1..100}; do
      for l in {1..100}; do
        for m in {1..100}; do
          echo "THIS IS LOOP $i $j $k $l $m" |& tee -a dir.$HOSTNAME.log
          date |& tee -a dir.$HOSTNAME.log
          echo "###############################" |& tee -a dir.$HOSTNAME.log
          mkdir -p level1.$i |& tee -a dir.$HOSTNAME.log
          echo "THIS IS LOOP $i $j $k $l $m" |& tee -a dir.$HOSTNAME.log
          date |& tee -a dir.$HOSTNAME.log
          echo "###############################" |& tee -a dir.$HOSTNAME.log
          mkdir -p level1.$i/level2.$j |& tee -a dir.$HOSTNAME.log
          echo "THIS IS LOOP $i $j $k $l $m" |& tee -a dir.$HOSTNAME.log
          date |& tee -a dir.$HOSTNAME.log
          echo "###############################" |& tee -a dir.$HOSTNAME.log
          mkdir -p level1.$i/level2.$j/level3.$k |& tee -a dir.$HOSTNAME.log
          echo "THIS IS LOOP $i $j $k $l $m" |& tee -a dir.$HOSTNAME.log
          date |& tee -a dir.$HOSTNAME.log
          echo "###############################" |& tee -a dir.$HOSTNAME.log
          mkdir -p level1.$i/level2.$j/level3.$k/level4.$l |& tee -a dir.$HOSTNAME.log
          echo "THIS IS LOOP $i $j $k $l $m" |& tee -a dir.$HOSTNAME.log
          date |& tee -a dir.$HOSTNAME.log
          echo "###############################" |& tee -a dir.$HOSTNAME.log
          mkdir -p level1.$i/level2.$j/level3.$k/level4.$l |& tee -a dir.$HOSTNAME.log
          mkdir -p level1.$i/level2.$j/level3.$k/level4.$l/level5.$m |& tee -a dir.$HOSTNAME.log
          echo "THIS IS LOOP $i $j $k $l $m" |& tee -a dir.$HOSTNAME.log
          date |& tee -a dir.$HOSTNAME.log
          echo "###############################" |& tee -a dir.$HOSTNAME.log
done; done; done; done; done

While the directory creations seem to be going smoothly, I see the same brick error logs repeated, for which BZ#1380699 has been raised. However, on the client too I see the messages below.

client logs:

[2016-09-30 06:34:58.949023] E [MSGID: 114031] [client-rpc-fops.c:1550:client3_3_inodelk_cbk] 0-distrepvol-client-7: remote operation failed [Invalid argument]
[2016-09-30 06:34:59.178135] I [MSGID: 109063] [dht-layout.c:713:dht_layout_normalize] 0-distrepvol-dht: Found anomalies in /rootdir1/renames/dir_samenames/level1.1/level2.1/level3.21/level4.17/level5.13 (gfid = 6bd93a82-7c5e-47d4-9f7d-5e703a1225d6). Holes=1 overlaps=0
[2016-09-30 06:35:01.301329] W [fuse-bridge.c:471:fuse_entry_cbk] 0-glusterfs-fuse: 27400471: MKDIR() /rootdir1/renames/dir_samenames/level1.1/level2.1/level3.21/level4.17/level5.24 => -1 (File exists)
[2016-09-30 06:35:01.371991] I [MSGID: 109063] [dht-layout.c:713:dht_layout_normalize] 0-distrepvol-dht: Found anomalies in /rootdir1/renames/dir_samenames/level1.1/level2.1/level3.21/level4.17/level5.24 (gfid = 310d4874-bcc5-442f-a378-265004540333). Holes=1 overlaps=0

Systemic testing details:
https://docs.google.com/spreadsheets/d/1iP5Mi1TewBFVh8HTmlcBm9072Bgsbgkr3CLcGmawDys/edit#gid=760435885

Steps to Reproduce:
1. Create the same directory structure from two different clients.

Version-Release number of selected component (if applicable):
====================
[root@dhcp37-187 dir_samenames]# rpm -qa | grep gluster
glusterfs-api-3.8.4-1.el7rhgs.x86_64
glusterfs-rdma-3.8.4-1.el7rhgs.x86_64
glusterfs-libs-3.8.4-1.el7rhgs.x86_64
glusterfs-client-xlators-3.8.4-1.el7rhgs.x86_64
glusterfs-fuse-3.8.4-1.el7rhgs.x86_64
glusterfs-server-3.8.4-1.el7rhgs.x86_64
python-gluster-3.8.4-1.el7rhgs.noarch
glusterfs-devel-3.8.4-1.el7rhgs.x86_64
glusterfs-debuginfo-3.8.4-1.el7rhgs.x86_64
glusterfs-3.8.4-1.el7rhgs.x86_64
glusterfs-cli-3.8.4-1.el7rhgs.x86_64
glusterfs-events-3.8.4-1.el7rhgs.x86_64
[root@dhcp37-187 dir_samenames]#
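The race above can be simulated locally as a minimal sketch (no Gluster mount involved; the temp directory, loop counts, and tree depth here are illustrative, not the original reproducer): two concurrent processes `mkdir -p`-ing the same nested tree.

```shell
#!/bin/sh
# Minimal local simulation of two clients racing to create the same
# directory tree. On Gluster the two writers would be separate FUSE
# mounts on different clients; here they are two background shells
# writing to one local tmpdir.
WORKDIR=$(mktemp -d)

make_tree() {
  # Same nested-level structure as the reproducer, with tiny counts.
  for i in 1 2; do
    for j in 1 2; do
      for k in 1 2; do
        mkdir -p "$WORKDIR/level1.$i/level2.$j/level3.$k"
      done
    done
  done
}

make_tree &   # "client 1"
make_tree &   # "client 2"
wait

# Both writers target identical paths, so the tree converges to
# exactly 2*2*2 = 8 leaf directories regardless of interleaving.
find "$WORKDIR" -mindepth 3 -type d | wc -l
```

On a Gluster FUSE mount, the same interleaving is what drove DHT to heal layouts concurrently and log the link-count warning.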
Steps to reproduce this:
1. Create a 2x2 volume.
2. Fuse mount the volume and create dir1.
3. Unmount the volume.
4. Delete dir1 manually on both bricks of any one replica set.
5. Mount the volume and do a lookup. DHT should see that the directory is missing and trigger a heal, causing this message to be logged.
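The steps above can be sketched as gluster CLI commands. The hostnames, volume name, brick paths, and mount point (server1..server4, testvol, /bricks/..., /mnt/testvol) are placeholders I am assuming, not values from this report; the function is only defined here, to be run by hand on a disposable test cluster.

```shell
# Sketch of the reproduction, assuming a 4-node test cluster with
# placeholder names. Defined as a function so nothing runs by accident.
reproduce_missing_dir_heal() {
  # Step 1: 2x2 distributed-replicate volume.
  gluster volume create testvol replica 2 \
      server1:/bricks/b1 server2:/bricks/b2 \
      server3:/bricks/b3 server4:/bricks/b4
  gluster volume start testvol

  # Step 2: fuse mount and create dir1.
  mount -t glusterfs server1:/testvol /mnt/testvol
  mkdir /mnt/testvol/dir1

  # Step 3: unmount.
  umount /mnt/testvol

  # Step 4: remove dir1 directly on both bricks of one replica set.
  ssh server1 rm -rf /bricks/b1/dir1
  ssh server2 rm -rf /bricks/b2/dir1

  # Step 5: remount and look up; DHT sees the layout hole and triggers
  # a directory heal, which used to emit the link-count warning.
  mount -t glusterfs server1:/testvol /mnt/testvol
  ls /mnt/testvol/dir1
}
```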
Glusterfs version: 3.8.4-2.el7rhgs.x86_64

Seeing similar warning messages in the rebalance logs as well during rebalance:

[2016-10-06 10:09:11.181450] W [dict.c:418:dict_set] (-->/usr/lib64/glusterfs/3.8.4/xlator/cluster/replicate.so(+0x4b320) [0x7efdb3b7d320] -->/lib64/libglusterfs.so.0(dict_set_str+0x2c) [0x7efdc5bce32c] -->/lib64/libglusterfs.so.0(dict_set+0xe6) [0x7efdc5bcc1e6] ) 0-dict: !this || !value for key=link-count [Invalid argument]
[2016-10-06 10:09:11.184983] I [dht-rebalance.c:2902:gf_defrag_process_dir] 0-distrep-dht: Migration operation on dir /manual/sticky/d3263 took 0.08 secs
[2016-10-06 10:09:11.191802] W [dict.c:418:dict_set] (-->/usr/lib64/glusterfs/3.8.4/xlator/cluster/replicate.so(+0x4b320) [0x7efdb3b7d320] -->/lib64/libglusterfs.so.0(dict_set_str+0x2c) [0x7efdc5bce32c] -->/lib64/libglusterfs.so.0(dict_set+0xe6) [0x7efdc5bcc1e6] ) 0-dict: !this || !value for key=link-count [Invalid argument]

Updated this BZ as the warning messages observed in both the fuse client and rebalance logs look similar. If not, please let me know and I will open a new BZ for the warning messages seen in the rebalance logs.

Steps that were performed:
==========================
1) Create a distributed replica volume and start it.
2) FUSE mount the volume and create files and directories.
3) Add a few bricks to the volume.
4) Trigger rebalance.
5) Monitor the rebalance log for the above warning messages: /var/log/glusterfs/<volname>-rebalance.log
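For step 5, counting the link-count warnings can be done with a simple grep. As a self-contained sketch, the log content is inlined here from the excerpt above (backtraces trimmed); on an affected system, point LOG at the real /var/log/glusterfs/<volname>-rebalance.log instead.

```shell
#!/bin/sh
# Count occurrences of the link-count dict_set warning in a log file.
# The sample log is inlined for illustration; replace LOG with the
# actual rebalance log path when checking a live system.
LOG=$(mktemp)
cat > "$LOG" <<'EOF'
[2016-10-06 10:09:11.181450] W [dict.c:418:dict_set] 0-dict: !this || !value for key=link-count [Invalid argument]
[2016-10-06 10:09:11.184983] I [dht-rebalance.c:2902:gf_defrag_process_dir] 0-distrep-dht: Migration operation on dir /manual/sticky/d3263 took 0.08 secs
[2016-10-06 10:09:11.191802] W [dict.c:418:dict_set] 0-dict: !this || !value for key=link-count [Invalid argument]
EOF

# Two warning lines in the sample, so this prints 2; on a fixed build
# against a real rebalance log it should print 0.
grep -c 'key=link-count' "$LOG"
```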
These are two separate test cases that trigger the same condition - healing of directories that are missing on some bricks. QE needs to decide whether the same BZ can be used to verify both scenarios.
http://review.gluster.org/15646
QATP:
=====
Have rerun the cases with the fixed-in build and did not see the warnings in any of the cases below. Hence moving to verified.

TC#1:
====
1. Create the same directory structure from two different clients.
Result: not seeing the warning.

TC#2:
====
1) Create a distributed replica volume and start it.
2) FUSE mount the volume and create files and directories.
3) Add a few bricks to the volume.
4) Trigger rebalance.
5) Monitor the rebalance log for the above warning messages: /var/log/glusterfs/<volname>-rebalance.log
Result: not seeing the warnings anymore.

TC#3:
====
1. Create a 2x2 volume.
2. Fuse mount the volume and create dir1.
3. Unmount the volume.
4. Delete dir1 manually on both bricks of any one replica set.
5. Mount the volume and do a lookup. DHT should see that the directory is missing and trigger a heal, causing this message to be logged.
Result: not seeing the warnings anymore.

Hence moving to verified.

[root@dhcp35-86 glusterfs]# rpm -qa | grep gluster
glusterfs-3.8.4-3.el7rhgs.x86_64
glusterfs-server-3.8.4-3.el7rhgs.x86_64
glusterfs-fuse-3.8.4-3.el7rhgs.x86_64
glusterfs-libs-3.8.4-3.el7rhgs.x86_64
glusterfs-api-3.8.4-3.el7rhgs.x86_64
glusterfs-cli-3.8.4-3.el7rhgs.x86_64
glusterfs-client-xlators-3.8.4-3.el7rhgs.x86_64
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2017-0486.html