Bug 1272408 - Data Tiering:[2015-10-15 02:54:52.259879] E [MSGID: 109039] [dht-common.c:2833:dht_vgetxattr_cbk] 0-tiervolume-cold-dht: vgetxattr: Subvolume tiervolume-disperse-1 returned -1 [No such file or directory]
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: tier
Version: rhgs-3.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: RHGS 3.1.2
Assignee: Ashish Pandey
QA Contact: Rahul Hinduja
URL:
Whiteboard:
Depends On: 1272401
Blocks: 1260783 1260923
Reported: 2015-10-16 10:50 UTC by Nag Pavan Chilakam
Modified: 2019-04-03 09:15 UTC

Fixed In Version: glusterfs-3.7.5-11
Doc Type: Bug Fix
Doc Text:
Clone Of: 1272401
Environment:
Last Closed: 2016-03-01 05:41:50 UTC
Target Upstream Version:


Links
System: Red Hat Product Errata
ID: RHBA-2016:0193
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Red Hat Gluster Storage 3.1 update 2
Last Updated: 2016-03-01 10:20:36 UTC

Description Nag Pavan Chilakam 2015-10-16 10:50:33 UTC
+++ This bug was initially created as a clone of Bug #1272401 +++

Description of problem:
=========================
On the longevity/stress setup we are getting the following error message:
[2015-10-15 02:54:52.259879] E [MSGID: 109039] [dht-common.c:2833:dht_vgetxattr_cbk] 0-tiervolume-cold-dht: vgetxattr: Subvolume tiervolume-disperse-1 returned -1 [No such file or directory]


Version-Release number of selected component (if applicable):
=============================================================

glusterfs-3.7.5-0.19.git0f5c3e8.el7.centos.x86_64



Steps Carried Out:
==================

1. Create a 12-node cluster
2. Create a tiered volume with the hot tier as (6 x 2) and the cold tier as (2 x (6 + 2) = 16)
3. FUSE-mount the volume on 3 clients: RHEL7.2, RHEL7.1 and RHEL6.7
4. Start creating data from each client:

Client 1:
=========
[root@dj ~]# crefi --multi -n 10 -b 10 -d 10 --max=1024k --min=5k --random -T 5 -t text -I 5 --fop=create /mnt/fuse/

Client 2:
=========
[root@mia ~]# cd /mnt/fuse/
[root@mia fuse]# for i in {1..10}; do cp -rf /etc etc.$i ; sleep 100 ; done

Client 3:
=========
[root@wingo fuse]# for i in {1..999}; do dd if=/dev/zero of=dd.$i bs=1M count=1 ; sleep 10 ; done

5. After a while, data creation on client 1 and client 2 should have completed, while data creation from client 3 will still be in progress

6. At this point the only data creation is 1 file from client 3 every 10 seconds.

7. Monitor the CPU usage using top

Comment 5 Rahul Hinduja 2015-11-02 13:17:36 UTC
For the record: able to hit it again while changing permissions of files on the system.


[root@dhcp37-160 glusterfs]# grep "dht_vgetxattr_cbk" tiervolume-tier.log
[2015-11-02 09:04:32.847824] E [MSGID: 109039] [dht-common.c:2833:dht_vgetxattr_cbk] 0-tiervolume-cold-dht: vgetxattr: Subvolume tiervolume-disperse-0 returned -1 [Input/output error]
[2015-11-02 09:04:32.847852] E [MSGID: 109039] [dht-common.c:2833:dht_vgetxattr_cbk] 0-tiervolume-tier-dht: vgetxattr: Subvolume tiervolume-cold-dht returned -1 [Input/output error]
[2015-11-02 10:30:05.823787] E [MSGID: 109039] [dht-common.c:2833:dht_vgetxattr_cbk] 0-tiervolume-cold-dht: vgetxattr: Subvolume tiervolume-disperse-0 returned -1 [No such file or directory]
[2015-11-02 10:30:05.823812] E [MSGID: 109039] [dht-common.c:2833:dht_vgetxattr_cbk] 0-tiervolume-tier-dht: vgetxattr: Subvolume tiervolume-cold-dht returned -1 [No such file or directory]
[root@dhcp37-160 glusterfs]#
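
A grep like the one above lists the raw lines, but it can help to see which translator, subvolume and errno combinations occur. The following is a minimal sketch, assuming the log lines keep exactly the layout quoted in this comment; the function name `summarize` and the `SAMPLE` list are illustrative and not part of any GlusterFS tooling.

```python
import re
from collections import Counter

# Matches the dht_vgetxattr_cbk error lines in the layout quoted in this report.
LOG_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] E \[MSGID: (?P<msgid>\d+)\] "
    r"\[dht-common\.c:\d+:dht_vgetxattr_cbk\] "
    r"(?P<xlator>\S+): vgetxattr: Subvolume (?P<subvol>\S+) "
    r"returned -1 \[(?P<error>[^\]]+)\]"
)


def summarize(lines):
    """Count matching lines per (translator, subvolume, error text)."""
    counts = Counter()
    for line in lines:
        m = LOG_RE.match(line.strip())
        if m:
            counts[(m["xlator"], m["subvol"], m["error"])] += 1
    return counts


# The four lines from the grep output above, used as sample input.
SAMPLE = [
    "[2015-11-02 09:04:32.847824] E [MSGID: 109039] [dht-common.c:2833:dht_vgetxattr_cbk] 0-tiervolume-cold-dht: vgetxattr: Subvolume tiervolume-disperse-0 returned -1 [Input/output error]",
    "[2015-11-02 09:04:32.847852] E [MSGID: 109039] [dht-common.c:2833:dht_vgetxattr_cbk] 0-tiervolume-tier-dht: vgetxattr: Subvolume tiervolume-cold-dht returned -1 [Input/output error]",
    "[2015-11-02 10:30:05.823787] E [MSGID: 109039] [dht-common.c:2833:dht_vgetxattr_cbk] 0-tiervolume-cold-dht: vgetxattr: Subvolume tiervolume-disperse-0 returned -1 [No such file or directory]",
    "[2015-11-02 10:30:05.823812] E [MSGID: 109039] [dht-common.c:2833:dht_vgetxattr_cbk] 0-tiervolume-tier-dht: vgetxattr: Subvolume tiervolume-cold-dht returned -1 [No such file or directory]",
]

if __name__ == "__main__":
    for (xlator, subvol, error), n in sorted(summarize(SAMPLE).items()):
        print(f"{n}  {xlator} <- {subvol}: {error}")
```

On the sample, each error appears once at the cold-dht level and once at the tier-dht level, which suggests the failure from the disperse subvolume is simply being propagated upward.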

Comment 7 Rahul Hinduja 2015-11-30 06:58:13 UTC
This bug was reproducible while performing metadata changes. With the latest build, glusterfs-3.7.5-7.el7rhgs.x86_64, metadata changes do not cause a migration. Performed fops such as create, chmod, chown, chgrp, symlink, truncate and rename with the latest build and did not observe these errors.

Since we do not know the actual RCA, this bug should be kept open until the regression run is done, in which fop stress will be performed in both test and cache mode.

Comment 9 Ashish Pandey 2015-12-14 04:40:19 UTC
If a getxattr call is made on a file that has been migrated from the cold tier (EC) to the hot tier, the call might fail with an error saying the file does not exist.
This patch handles the issue by making the cold volume the hashed volume and making sure that T files are created on the hashed volume (EC).
This also ensures that a getxattr on that file will see the file and its xattr.
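
The idea described above can be illustrated with a toy model. This is only a sketch under stated assumptions and is not GlusterFS code: the `Subvolume` class, the `LINKTO` key, and the `migrate` and `getxattr_via_hashed` helpers are all hypothetical names, and a plain dict stands in for a brick. It shows why a lookup that starts at the hashed (cold) subvolume fails with ENOENT after migration unless a pointer ("T") file is left behind.

```python
import errno

LINKTO = "linkto"  # hypothetical key marking a pointer ("T") entry


class Subvolume:
    """Toy stand-in for a tier: maps file name -> dict of xattrs."""

    def __init__(self, name):
        self.name = name
        self.files = {}

    def getxattr(self, path, key):
        if path not in self.files:
            raise FileNotFoundError(
                errno.ENOENT, "No such file or directory", path)
        return self.files[path][key]


def migrate(path, src, dst, leave_linkto):
    """Move a file between tiers, optionally leaving a pointer on src."""
    dst.files[path] = src.files.pop(path)
    if leave_linkto:
        src.files[path] = {LINKTO: dst}


def getxattr_via_hashed(path, key, hashed):
    """Look up an xattr starting at the hashed subvolume, following a pointer."""
    entry = hashed.files.get(path)
    if entry is not None and LINKTO in entry:
        return entry[LINKTO].getxattr(path, key)
    return hashed.getxattr(path, key)


if __name__ == "__main__":
    for leave_linkto in (False, True):
        cold, hot = Subvolume("cold"), Subvolume("hot")
        cold.files["f"] = {"user.tag": "v"}
        migrate("f", cold, hot, leave_linkto)
        try:
            print(leave_linkto, getxattr_via_hashed("f", "user.tag", cold))
        except FileNotFoundError as exc:
            print(leave_linkto, "ENOENT:", exc.strerror)
```

In the model, the pointer entry is what lets the lookup on the cold side reach the migrated file; how the actual patch achieves this inside DHT and tier is not shown in this report.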

Comment 10 Rahul Hinduja 2015-12-17 13:30:24 UTC
Verified with build: 

Ran the regression suite on a tiered volume (Cold Tier: 2x(4+2) and Hot Tier: 2x2), which covers fops such as create, chmod, chown, chgrp, symlink, rename and truncate.

No errors reported. Moving the bug to verified state. 

[root@dhcp37-165 glusterfs]# grep -i "vgetxattr" tiervolume-*
[root@dhcp37-165 glusterfs]# 
[root@dhcp37-165 glusterfs]# grep -i " E " tiervolume-*
[root@dhcp37-165 glusterfs]#

Comment 13 errata-xmlrpc 2016-03-01 05:41:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0193.html

