Description of problem:
=======================
I created a distributed-disperse volume and mounted it using FUSE. I copied the Linux kernel and ran the following in a loop: create a directory, copy the kernel into it and untar it, 1000 times.

After soaking this setup (leaving it running for about 1-2 days), and while the I/O was still in progress, I attached a dist-rep hot tier. I then started other I/O from other mount points.

After a day, I issued ls -laRt on the mount point to list all files and saw the following errors:

[root@rhsauto015 stress]# date >>listing;ls -laRt >>listing;date >>listing;
ls: cannot access ./kern.legacy/dir_rename.97/linux-4.3.3/tools/virtio/vhost_test: Invalid argument
ls: cannot access ./kern.legacy/dir_rename.96/linux-4.3.3/net/netlabel: Invalid argument
ls: cannot access ./kern.legacy/dir_rename.96/linux-4.3.3/fs/jfs: Invalid argument
ls: cannot access ./kern.legacy/dir_rename.96/linux-4.3.3/fs/ocfs2/dlm: Invalid argument
ls: cannot access ./kern.legacy/dir_rename.96/linux-4.3.3/drivers/usb/dwc3: Invalid argument
ls: cannot access ./kern.legacy/dir_rename.96/linux-4.3.3/drivers/usb/wusbcore: Invalid argument
ls: cannot access ./kern.legacy/dir_rename.96/linux-4.3.3/drivers/scsi/mpt3sas: Invalid argument
ls: cannot access ./kern.legacy/dir_rename.96/linux-4.3.3/drivers/pci/pcie: Invalid argument
ls: cannot access ./kern.legacy/dir_rename.96/linux-4.3.3/drivers/net/hamradio: Invalid argument
ls: cannot access ./kern.legacy/dir_rename.96/linux-4.3.3/drivers/net/ppp: Invalid argument
ls: cannot access ./kern.legacy/dir_rename.96/linux-4.3.3/drivers/net/vmxnet3: Invalid argument
ls: cannot access ./kern.legacy/dir_rename.96/linux-4.3.3/drivers/net/wireless/mwifiex: Invalid argument
ls: cannot access ./kern.legacy/dir_rename.96/linux-4.3.3/drivers/net/ethernet/marvell: Invalid argument
ls: cannot access ./kern.legacy/dir_rename.96/linux-4.3.3/drivers/net/ethernet/ezchip: Invalid argument
ls: cannot access ./kern.legacy/dir_rename.96/linux-4.3.3/drivers/media/rc: Invalid argument
ls: cannot access ./kern.legacy/dir_rename.96/linux-4.3.3/drivers/media/usb/cx231xx: Invalid argument
ls: cannot access ./kern.legacy/dir_rename.96/linux-4.3.3/drivers/media/pci/ivtv: Invalid argument
ls: cannot access ./kern.legacy/dir_rename.96/linux-4.3.3/drivers/media/pci/b2c2: Invalid argument

Mount log:
[2015-12-29 01:22:53.903078] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.5/xlator/cluster/replicate.so(afr_replies_interpret+0x1ad) [0x7f8702952eed] -->/usr/lib64/glusterfs/3.7.5/xlator/cluster/replicate.so(afr_accuse_smallfiles+0x66) [0x7f8702952cb6] -->/lib64/libglusterfs.so.0(dict_get+0xac) [0x7f8710cfe0cc] ) 0-dict: !this || key=glusterfs.bad-inode [Invalid argument]
[2015-12-29 01:22:53.922168] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 637309: LOOKUP() /kern.legacy/dir_rename.97/linux-4.3.3/tools/virtio/vhost_test => -1 (Invalid argument)
[2015-12-29 01:22:54.736650] W [MSGID: 122056] [ec-combine.c:866:ec_combine_check] 0-stress-disperse-0: Mismatching xdata in answers of 'LOOKUP'
[2015-12-29 01:22:54.736691] W [MSGID: 122053] [ec-common.c:116:ec_check_status] 0-stress-disperse-0: Operation failed on some subvolumes (up=3F, mask=3F, remaining=0, good=37, bad=8)
[2015-12-29 01:22:54.736705] W [MSGID: 122002] [ec-common.c:71:ec_heal_report] 0-stress-disperse-0: Heal failed [Invalid argument]

Version-Release number of selected component (if applicable):
====================
3.7.5-13

Other observation/Notes:
My promotes/demotes have not started even after 2 days, probably because the fix-layout is still in progress.

Sos reports and other logs will be attached.
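For readers triaging the ec_check_status line in the mount log above, here is a minimal sketch for decoding its fields. It assumes that up, mask, remaining, good and bad are hexadecimal bitmasks with one bit per subvolume of the disperse set, and that bit 0 corresponds to the first subvolume; the helper name decode_mask is hypothetical and not part of GlusterFS.

```python
def decode_mask(hex_mask):
    """Return the 0-based bit positions set in a hexadecimal bitmask string."""
    value = int(hex_mask, 16)
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


# Field values copied from the ec_check_status message in the mount log.
# Assumption: each bit position identifies one subvolume of stress-disperse-0.
fields = {"up": "3F", "mask": "3F", "remaining": "0", "good": "37", "bad": "8"}

decoded = {name: decode_mask(value) for name, value in fields.items()}
for name, bits in decoded.items():
    print(f"{name:>9} = {bits}")
```

Under that assumption, six subvolumes were up and targeted, five returned matching answers, and the single mismatching answer came from bit position 3.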
sosreports: rhsqe-repo.lab.eng.blr.redhat.com:/home/repo/sosreports/nchilaka
Observation: For every file that returned "Invalid argument" during "ls -laRt", the following messages appear in the mount log:

[2015-12-29 01:22:53.898401] I [MSGID: 109063] [dht-layout.c:702:dht_layout_normalize] 0-stress-tier-dht: Found anomalies in /kern.legacy/dir_rename.97/linux-4.3.3/tools/virtio/vhost_test (gfid = 00000000-0000-0000-0000-000000000000). Holes=1 overlaps=0
[2015-12-29 01:22:53.922168] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 637309: LOOKUP() /kern.legacy/dir_rename.97/linux-4.3.3/tools/virtio/vhost_test => -1 (Invalid argument)

When we now do a lookup on the same files on the given setup, no error is returned.
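The correlation described in this observation could be checked across a whole mount log with a sketch like the one below. The regular expressions are written against the two message formats quoted above and are an assumption about how stable those formats are; the function name correlate is hypothetical, and paths containing spaces would not be matched.

```python
import re

# Patterns derived from the two log messages quoted in this comment.
ANOMALY_RE = re.compile(r"dht_layout_normalize\].*?Found anomalies in (\S+) \(gfid")
LOOKUP_RE = re.compile(r"LOOKUP\(\) (\S+) => -1 \(Invalid argument\)")


def correlate(log_lines):
    """Split failed LOOKUP paths into those with and without a layout anomaly."""
    anomalies, failed = set(), set()
    for line in log_lines:
        match = ANOMALY_RE.search(line)
        if match:
            anomalies.add(match.group(1))
        match = LOOKUP_RE.search(line)
        if match:
            failed.add(match.group(1))
    return failed & anomalies, failed - anomalies


sample = [
    "[2015-12-29 01:22:53.898401] I [MSGID: 109063] "
    "[dht-layout.c:702:dht_layout_normalize] 0-stress-tier-dht: Found anomalies in "
    "/kern.legacy/dir_rename.97/linux-4.3.3/tools/virtio/vhost_test "
    "(gfid = 00000000-0000-0000-0000-000000000000). Holes=1 overlaps=0",
    "[2015-12-29 01:22:53.922168] W [fuse-bridge.c:462:fuse_entry_cbk] "
    "0-glusterfs-fuse: 637309: LOOKUP() "
    "/kern.legacy/dir_rename.97/linux-4.3.3/tools/virtio/vhost_test "
    "=> -1 (Invalid argument)",
]

both, only_failed = correlate(sample)
print("failed with layout anomaly:", sorted(both))
print("failed without layout anomaly:", sorted(only_failed))
```

To run it against a real log, pass an open file object instead of the sample list; any path that lands in the second set would be a counterexample to the observation.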
I tried to reproduce this bug on my local setup but could not. From the logs, it looks like some rename operations were also in progress, which could be one of the triggers for this issue. Could you please describe the other "IOs" you were running from the other mount points? Were they renames of dir to dir_rename.XX, and did you start ls -lRt while they were still running?