Bug 1294657

Summary: ls -laRt throwing invalid argument errors
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Nag Pavan Chilakam <nchilaka>
Component: disperse Assignee: Ashish Pandey <aspandey>
Status: CLOSED INSUFFICIENT_DATA QA Contact: Nag Pavan Chilakam <nchilaka>
Severity: medium Docs Contact:
Priority: medium    
Version: rhgs-3.1 CC: aspandey, mchangir, nchilaka, pkarampu, rcyriac, rhs-bugs, sankarshan
Target Milestone: --- Keywords: ZStream
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-02-09 08:58:01 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Nag Pavan Chilakam 2015-12-29 13:47:21 UTC
Description of problem:
=======================
I created a distributed-disperse volume and mounted it using FUSE.
I copied the Linux kernel and ran the following in a loop, 1000 times: create a directory, copy the kernel into it, and untar it.
After soaking this setup for about 1-2 days, I attached a distributed-replicate hot tier while the I/O was still running.
I then started other I/O from other mount points.
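
For anyone trying to reproduce, the copy-and-untar loop described above can be sketched roughly as follows. This is only an illustration, not the script that was actually run: the function name soak, the dir.N directory naming, and the tarball and mount paths are all hypothetical.

```python
import shutil
import tarfile
from pathlib import Path


def soak(tarball: Path, mount: Path, iterations: int = 1000) -> int:
    """Create a directory, copy the tarball into it and extract it, repeatedly.

    Returns the number of completed iterations.
    """
    done = 0
    for i in range(iterations):
        workdir = mount / f"dir.{i}"  # hypothetical naming, not from the report
        workdir.mkdir(parents=True, exist_ok=True)
        copied = Path(shutil.copy(tarball, workdir))  # copy the kernel tarball
        with tarfile.open(copied) as tar:  # untar inside the new directory
            tar.extractall(workdir)
        done += 1
    return done
```

It would be called with something like soak(Path("linux-4.3.3.tar.xz"), Path("/mnt/stress")), where both paths are placeholders for the real tarball and FUSE mount point.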

After a day, I ran ls -laRt on the mount point to list all files.

I saw the following issue:


[root@rhsauto015 stress]# date >>listing;ls -laRt >>listing;date >>listing;
ls: cannot access ./kern.legacy/dir_rename.97/linux-4.3.3/tools/virtio/vhost_test: Invalid argument
ls: cannot access ./kern.legacy/dir_rename.96/linux-4.3.3/net/netlabel: Invalid argument
ls: cannot access ./kern.legacy/dir_rename.96/linux-4.3.3/fs/jfs: Invalid argument
ls: cannot access ./kern.legacy/dir_rename.96/linux-4.3.3/fs/ocfs2/dlm: Invalid argument
ls: cannot access ./kern.legacy/dir_rename.96/linux-4.3.3/drivers/usb/dwc3: Invalid argument
ls: cannot access ./kern.legacy/dir_rename.96/linux-4.3.3/drivers/usb/wusbcore: Invalid argument
ls: cannot access ./kern.legacy/dir_rename.96/linux-4.3.3/drivers/scsi/mpt3sas: Invalid argument
ls: cannot access ./kern.legacy/dir_rename.96/linux-4.3.3/drivers/pci/pcie: Invalid argument
ls: cannot access ./kern.legacy/dir_rename.96/linux-4.3.3/drivers/net/hamradio: Invalid argument
ls: cannot access ./kern.legacy/dir_rename.96/linux-4.3.3/drivers/net/ppp: Invalid argument
ls: cannot access ./kern.legacy/dir_rename.96/linux-4.3.3/drivers/net/vmxnet3: Invalid argument
ls: cannot access ./kern.legacy/dir_rename.96/linux-4.3.3/drivers/net/wireless/mwifiex: Invalid argument
ls: cannot access ./kern.legacy/dir_rename.96/linux-4.3.3/drivers/net/ethernet/marvell: Invalid argument
ls: cannot access ./kern.legacy/dir_rename.96/linux-4.3.3/drivers/net/ethernet/ezchip: Invalid argument
ls: cannot access ./kern.legacy/dir_rename.96/linux-4.3.3/drivers/media/rc: Invalid argument
ls: cannot access ./kern.legacy/dir_rename.96/linux-4.3.3/drivers/media/usb/cx231xx: Invalid argument
ls: cannot access ./kern.legacy/dir_rename.96/linux-4.3.3/drivers/media/pci/ivtv: Invalid argument
ls: cannot access ./kern.legacy/dir_rename.96/linux-4.3.3/drivers/media/pci/b2c2: Invalid argument



Mount log:


[2015-12-29 01:22:53.903078] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.5/xlator/cluster/replicate.so(afr_replies_interpret+0x1ad) [0x7f8702952eed] -->/usr/lib64/glusterfs/3.7.5/xlator/cluster/replicate.so(afr_accuse_smallfiles+0x66) [0x7f8702952cb6] -->/lib64/libglusterfs.so.0(dict_get+0xac) [0x7f8710cfe0cc] ) 0-dict: !this || key=glusterfs.bad-inode [Invalid argument]
[2015-12-29 01:22:53.922168] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 637309: LOOKUP() /kern.legacy/dir_rename.97/linux-4.3.3/tools/virtio/vhost_test => -1 (Invalid argument)
[2015-12-29 01:22:54.736650] W [MSGID: 122056] [ec-combine.c:866:ec_combine_check] 0-stress-disperse-0: Mismatching xdata in answers of 'LOOKUP'
[2015-12-29 01:22:54.736691] W [MSGID: 122053] [ec-common.c:116:ec_check_status] 0-stress-disperse-0: Operation failed on some subvolumes (up=3F, mask=3F, remaining=0, good=37, bad=8)
[2015-12-29 01:22:54.736705] W [MSGID: 122002] [ec-common.c:71:ec_heal_report] 0-stress-disperse-0: Heal failed [Invalid argument]
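
To make the ec_check_status line above easier to read, here is a small sketch that decodes its fields. It assumes the up/mask/remaining/good/bad values are hexadecimal bitmasks with one bit per brick of the disperse subvolume, and that bit 0 is the first brick; that bit-to-brick mapping is an assumption and should be checked against the volfile. The names bits and decode_ec_status are made up for this example.

```python
import re

# Matches the field list printed by ec_check_status in the mount log.
EC_STATUS = re.compile(
    r"up=(?P<up>[0-9A-Fa-f]+), mask=(?P<mask>[0-9A-Fa-f]+), "
    r"remaining=(?P<remaining>[0-9A-Fa-f]+), good=(?P<good>[0-9A-Fa-f]+), "
    r"bad=(?P<bad>[0-9A-Fa-f]+)"
)


def bits(value: int) -> list[int]:
    """Return the zero-based positions of the bits set in value."""
    return [i for i in range(value.bit_length()) if value >> i & 1]


def decode_ec_status(line: str) -> dict[str, list[int]]:
    """Turn each hex mask in the log line into a list of bit positions."""
    match = EC_STATUS.search(line)
    if match is None:
        raise ValueError("no ec_check_status fields found")
    # Assumption: every field is a hexadecimal bitmask, one bit per brick.
    return {name: bits(int(text, 16)) for name, text in match.groupdict().items()}


line = "Operation failed on some subvolumes (up=3F, mask=3F, remaining=0, good=37, bad=8)"
print(decode_ec_status(line))
# {'up': [0, 1, 2, 3, 4, 5], 'mask': [0, 1, 2, 3, 4, 5], 'remaining': [], 'good': [0, 1, 2, 4, 5], 'bad': [3]}
```

Read this way, all six bricks were up and took part in the LOOKUP, and only the brick at bit position 3 returned a mismatching answer.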






Version-Release number of selected component (if applicable):
====================
3.7.5-13



Other observation/Notes:
Promotions/demotions have not started even after 2 days, probably because the fix-layout is still in progress.




Sos reports and other logs will be attached

Comment 2 Nag Pavan Chilakam 2015-12-29 13:54:22 UTC
sosreports:
rhsqe-repo.lab.eng.blr.redhat.com:/home/repo/sosreports/nchilaka

Comment 4 Ashish Pandey 2015-12-31 06:03:58 UTC
Observation:

For all the files that returned an Invalid argument error during "ls -laRt", the following messages were found in the mount log:

[2015-12-29 01:22:53.898401] I [MSGID: 109063] [dht-layout.c:702:dht_layout_normalize] 0-stress-tier-dht: Found anomalies in /kern.legacy/dir_rename.97/linux-4.3.3/tools/virtio/vhost_test (gfid = 00000000-0000-0000-0000-000000000000). Holes=1 overlaps=0


[2015-12-29 01:22:53.922168] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 637309: LOOKUP() /kern.legacy/dir_rename.97/linux-4.3.3/tools/virtio/vhost_test => -1 (Invalid argument)

Now, a lookup on the same file on this setup no longer returns any error.
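
The matching described above, between paths that failed in ls and the two mount-log messages, can be automated with a sketch along these lines. It assumes ls was run from the root of the mount, so that ./x in the ls output corresponds to /x in the log, and that paths contain no spaces. The function names and result keys are invented for illustration.

```python
import re

LS_ERROR = re.compile(r"^ls: cannot access (?P<path>.+): Invalid argument$")
FUSE_LOOKUP = re.compile(r"LOOKUP\(\) (?P<path>\S+) => -1 \(Invalid argument\)")
DHT_ANOMALY = re.compile(r"Found anomalies in (?P<path>\S+) \(gfid = ")


def failed_paths(ls_stderr: str) -> set[str]:
    """Collect volume-relative paths from 'ls: cannot access' errors."""
    paths = set()
    for line in ls_stderr.splitlines():
        match = LS_ERROR.match(line.strip())
        if match:
            # Assumption: ls ran at the mount root, so "./x" maps to "/x".
            paths.add("/" + match.group("path").removeprefix("./"))
    return paths


def correlate(ls_stderr: str, mount_log: str) -> dict[str, dict[str, bool]]:
    """For each failed path, record which log messages mention it."""
    result = {
        path: {"fuse_lookup_failed": False, "dht_anomaly": False}
        for path in failed_paths(ls_stderr)
    }
    checks = (("fuse_lookup_failed", FUSE_LOOKUP), ("dht_anomaly", DHT_ANOMALY))
    for line in mount_log.splitlines():
        for key, pattern in checks:
            match = pattern.search(line)
            if match and match.group("path") in result:
                result[match.group("path")][key] = True
    return result
```

Any path where both flags come back True shows the same pattern as vhost_test above: a layout anomaly reported by DHT followed by a failed FUSE LOOKUP.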

Comment 5 Ashish Pandey 2016-01-07 09:24:30 UTC
I tried to reproduce this bug on my local setup but could not.

From the logs, it looks like some rename operations were also in progress, which could be one of the triggers for this issue.

Could you please describe the other "IOs" you were running from the other mount points? Was it a rename of dir to dir_rename.XX, and did you start ls -lRt while it was going on?