Bug 1272408 - Data Tiering:[2015-10-15 02:54:52.259879] E [MSGID: 109039] [dht-common.c:2833:dht_vgetxattr_cbk] 0-tiervolume-cold-dht: vgetxattr: Subvolume tiervolume-disperse-1 returned -1 [No such file or directory]
Status: CLOSED ERRATA
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: tier
Version: unspecified
Hardware: Unspecified OS: Unspecified
Priority: unspecified Severity: unspecified
Target Milestone: ---
Target Release: RHGS 3.1.2
Assigned To: Ashish Pandey
QA Contact: Rahul Hinduja
Keywords: ZStream
Depends On: 1272401
Blocks: 1260783 1260923
Reported: 2015-10-16 06:50 EDT by nchilaka
Modified: 2016-09-17 11:41 EDT
CC List: 6 users

See Also:
Fixed In Version: glusterfs-3.7.5-11
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1272401
Environment:
Last Closed: 2016-03-01 00:41:50 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None

Description nchilaka 2015-10-16 06:50:33 EDT
+++ This bug was initially created as a clone of Bug #1272401 +++

Description of problem:
=========================
On the longevity/stress setup we are getting the following error message:
[2015-10-15 02:54:52.259879] E [MSGID: 109039] [dht-common.c:2833:dht_vgetxattr_cbk] 0-tiervolume-cold-dht: vgetxattr: Subvolume tiervolume-disperse-1 returned -1 [No such file or directory]


Version-Release number of selected component (if applicable):
=============================================================

glusterfs-3.7.5-0.19.git0f5c3e8.el7.centos.x86_64



Steps Carried:
==============

1. Create a 12-node cluster
2. Create a tiered volume with the hot tier as (6 x 2) and the cold tier as (2 x (6 + 2) = 16)
3. FUSE-mount the volume on 3 clients: RHEL 7.2, RHEL 7.1 and RHEL 6.7
4. Start creating data from each client:

Client 1:
=========
[root@dj ~]# crefi --multi -n 10 -b 10 -d 10 --max=1024k --min=5k --random -T 5 -t text -I 5 --fop=create /mnt/fuse/

Client 2:
=========
[root@mia ~]# cd /mnt/fuse/
[root@mia fuse]# for i in {1..10}; do cp -rf /etc etc.$i ; sleep 100 ; done

Client 3:
=========
[root@wingo fuse]# for i in {1..999}; do dd if=/dev/zero of=dd.$i bs=1M count=1 ; sleep 10 ; done

5. After a while, data creation from client 1 and client 2 should be complete, while data creation from client 3 will still be in progress

6. At this point the only data being created is 1 file from client 3 every 10 seconds.

7. Monitor the cpu usage using top
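
The volume layout in step 2 can be checked with a little arithmetic. The sketch below is only an illustration and assumes the usual Gluster reading of the notation: (6 x 2) as 6 distribute subvolumes of replica 2, and 2 x (6 + 2) as 2 disperse subvolumes of 6 data plus 2 redundancy bricks. The helper names are hypothetical and are not part of any Gluster tool.

```python
# Brick-count arithmetic for the layout in step 2.
# Assumption: "6 x 2" means 6 distribute subvolumes x replica 2, and
# "2 x (6 + 2)" means 2 disperse subvolumes of 6 data + 2 redundancy bricks.

def distributed_replicate_bricks(distribute, replica):
    """Total bricks in a distributed-replicate tier."""
    return distribute * replica


def distributed_disperse_bricks(distribute, data, redundancy):
    """Total bricks in a distributed-disperse (EC) tier."""
    return distribute * (data + redundancy)


hot_bricks = distributed_replicate_bricks(6, 2)
cold_bricks = distributed_disperse_bricks(2, 6, 2)

# The cold tier has two disperse subvolumes; the log messages in this bug
# name them tiervolume-disperse-0 and tiervolume-disperse-1.
cold_subvolumes = [f"tiervolume-disperse-{i}" for i in range(2)]

if __name__ == "__main__":
    print("hot tier bricks:", hot_bricks)
    print("cold tier bricks:", cold_bricks)
    print("cold subvolumes:", ", ".join(cold_subvolumes))
```

Under that reading the hot tier has 12 bricks and the cold tier has 16, which matches the "= 16" given in step 2.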
Comment 5 Rahul Hinduja 2015-11-02 08:17:36 EST
For the record: able to hit this again while changing permissions of files in the system.


[root@dhcp37-160 glusterfs]# grep "dht_vgetxattr_cbk" tiervolume-tier.log
[2015-11-02 09:04:32.847824] E [MSGID: 109039] [dht-common.c:2833:dht_vgetxattr_cbk] 0-tiervolume-cold-dht: vgetxattr: Subvolume tiervolume-disperse-0 returned -1 [Input/output error]
[2015-11-02 09:04:32.847852] E [MSGID: 109039] [dht-common.c:2833:dht_vgetxattr_cbk] 0-tiervolume-tier-dht: vgetxattr: Subvolume tiervolume-cold-dht returned -1 [Input/output error]
[2015-11-02 10:30:05.823787] E [MSGID: 109039] [dht-common.c:2833:dht_vgetxattr_cbk] 0-tiervolume-cold-dht: vgetxattr: Subvolume tiervolume-disperse-0 returned -1 [No such file or directory]
[2015-11-02 10:30:05.823812] E [MSGID: 109039] [dht-common.c:2833:dht_vgetxattr_cbk] 0-tiervolume-tier-dht: vgetxattr: Subvolume tiervolume-cold-dht returned -1 [No such file or directory]
[root@dhcp37-160 glusterfs]#
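
To compare runs, it can help to count these messages per translator, subvolume and error string rather than reading raw grep output. The following is a minimal sketch, not an official tool: the regular expressions are inferred from the four log lines above only, and `summarize_vgetxattr_errors` is a hypothetical helper name.

```python
import re
from collections import Counter

# Layout inferred from the log lines quoted in this bug:
# [timestamp] LEVEL [MSGID: n] [file:line:function] xlator: message
LOG_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) "
    r"\[MSGID: (?P<msgid>\d+)\] "
    r"\[(?P<src>[^\]]+)\] "
    r"(?P<xlator>\S+): (?P<msg>.*)$"
)

# Message body logged by dht_vgetxattr_cbk, as seen above.
VGETXATTR_RE = re.compile(
    r"vgetxattr: Subvolume (?P<subvol>\S+) returned (?P<ret>-?\d+) "
    r"\[(?P<err>[^\]]*)\]"
)


def summarize_vgetxattr_errors(lines):
    """Count E-level vgetxattr failures by (xlator, subvolume, error)."""
    counts = Counter()
    for line in lines:
        entry = LOG_RE.match(line.strip())
        if entry is None or entry.group("level") != "E":
            continue
        failure = VGETXATTR_RE.search(entry.group("msg"))
        if failure:
            key = (entry.group("xlator"), failure.group("subvol"), failure.group("err"))
            counts[key] += 1
    return counts


# The four lines from the grep output above, used as sample input.
SAMPLE = [
    "[2015-11-02 09:04:32.847824] E [MSGID: 109039] [dht-common.c:2833:dht_vgetxattr_cbk] 0-tiervolume-cold-dht: vgetxattr: Subvolume tiervolume-disperse-0 returned -1 [Input/output error]",
    "[2015-11-02 09:04:32.847852] E [MSGID: 109039] [dht-common.c:2833:dht_vgetxattr_cbk] 0-tiervolume-tier-dht: vgetxattr: Subvolume tiervolume-cold-dht returned -1 [Input/output error]",
    "[2015-11-02 10:30:05.823787] E [MSGID: 109039] [dht-common.c:2833:dht_vgetxattr_cbk] 0-tiervolume-cold-dht: vgetxattr: Subvolume tiervolume-disperse-0 returned -1 [No such file or directory]",
    "[2015-11-02 10:30:05.823812] E [MSGID: 109039] [dht-common.c:2833:dht_vgetxattr_cbk] 0-tiervolume-tier-dht: vgetxattr: Subvolume tiervolume-cold-dht returned -1 [No such file or directory]",
]

if __name__ == "__main__":
    for (xlator, subvol, err), n in sorted(summarize_vgetxattr_errors(SAMPLE).items()):
        print(f"{n} x {xlator} <- {subvol}: {err}")
```

To run it against a real log, pass an open file object (for example `open("tiervolume-tier.log")`) in place of `SAMPLE`.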
Comment 7 Rahul Hinduja 2015-11-30 01:58:13 EST
This bug was reproducible while performing metadata changes. With the latest build, glusterfs-3.7.5-7.el7rhgs.x86_64, metadata changes do not cause a migration. Performed fops such as create, chmod, chown, chgrp, symlink, truncate and rename with the latest build and did not observe these errors.

Since we do not know the actual RCA, this bug should be kept open until the regression run is done, in which fops are stressed in both test and cache mode.
Comment 9 Ashish Pandey 2015-12-13 23:40:19 EST
If a getxattr call is made on a file that has been migrated from the cold tier (EC) to the hot tier, the call might fail with an error saying that the file does not exist.
This patch handles the issue by making the cold volume the hashed volume and making sure that T files are created on the hashed volume (EC).
That also makes sure that a getxattr on that file will see the file and its xattr.
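
The behaviour described in this comment can be pictured with a toy model. This is a deliberately simplified sketch and not the DHT or tier code. Subvolumes are plain dictionaries, the function names are hypothetical, and the hot subvolume name `tiervolume-hot-dht` is an assumption chosen by analogy with `tiervolume-cold-dht` in the logs. The only real identifier is `trusted.glusterfs.dht.linkto`, the xattr DHT sets on link (T) files.

```python
import errno

# xattr that DHT sets on link (T) files; its value names the subvolume
# that holds the real data.
LINKTO_XATTR = "trusted.glusterfs.dht.linkto"


def getxattr(subvol, path, key):
    """Toy getxattr on one subvolume: ENOENT if the path is absent there."""
    if path not in subvol:
        raise FileNotFoundError(errno.ENOENT, "No such file or directory", path)
    return subvol[path].get(key)


def migrate_cold_to_hot(cold, hot, path, hot_name, leave_linkto):
    """Move a file from the cold dict to the hot dict.

    With leave_linkto=True a T-file entry stays behind on the cold
    (hashed) subvolume, which is the behaviour the comment describes
    for the patch. With leave_linkto=False nothing is left on cold.
    """
    hot[path] = cold.pop(path)
    if leave_linkto:
        cold[path] = {LINKTO_XATTR: hot_name}


def demo(leave_linkto):
    """Return what a getxattr sent to the cold subvolume sees after migration."""
    cold = {"/dd.1": {"user.demo": "v1"}}
    hot = {}
    # "tiervolume-hot-dht" is an assumed name, not taken from the logs.
    migrate_cold_to_hot(cold, hot, "/dd.1", "tiervolume-hot-dht", leave_linkto)
    try:
        return getxattr(cold, "/dd.1", LINKTO_XATTR)
    except FileNotFoundError as exc:
        return exc.strerror


if __name__ == "__main__":
    print("without T file:", demo(leave_linkto=False))
    print("with T file:   ", demo(leave_linkto=True))
```

In this model, the case without a T file reproduces the "No such file or directory" result seen in the logs, while the case with a T file lets the call on the cold subvolume find an entry.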
Comment 10 Rahul Hinduja 2015-12-17 08:30:24 EST
Verified with build: 

Ran the regression run on Tiered volume (Cold Tier: 2x(4+2) and Hot Tier: 2x2) which covers fops like create, chmod, chown, chgrp, symlink, rename, truncate. 

No errors reported. Moving the bug to verified state. 

[root@dhcp37-165 glusterfs]# grep -i "vgetxattr" tiervolume-*
[root@dhcp37-165 glusterfs]# 
[root@dhcp37-165 glusterfs]# grep -i " E " tiervolume-*
[root@dhcp37-165 glusterfs]#
Comment 13 errata-xmlrpc 2016-03-01 00:41:50 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0193.html
