Description of problem:
========================
After enabling USS and quotas, I started IO population from 3 NFS clients (each client mounted the volume from a different server), with one client running a linux kernel untar and the other two creating files using dd. I then triggered an attach-tier while the IO was still in progress. The linux untar reported the following errors:

linux-4.4.1/arch/powerpc/include/uapi/asm/termios.h
tar: linux-4.4.1/arch/powerpc/include/uapi/asm/termios.h: Cannot open: Stale file handle
linux-4.4.1/arch/powerpc/include/uapi/asm/tm.h
tar: linux-4.4.1/arch/powerpc/include/uapi/asm/tm.h: Cannot open: Stale file handle
linux-4.4.1/arch/powerpc/include/uapi/asm/types.h
tar: linux-4.4.1/arch/powerpc/include/uapi/asm/types.h: Cannot open: Stale file handle
linux-4.4.1/arch/powerpc/include/uapi/asm/ucontext.h
tar: linux-4.4.1/arch/powerpc/include/uapi/asm/ucontext.h: Cannot open: Stale file handle
linux-4.4.1/arch/powerpc/include/uapi/asm/unistd.h
tar: linux-4.4.1/arch/powerpc/include/uapi/asm/unistd.h: Cannot open: Stale file handle
linux-4.4.1/arch/powerpc/kernel/
tar: linux-4.4.1/arch/powerpc/kernel: Cannot mkdir: Stale file handle
linux-4.4.1/arch/powerpc/kernel/.gitignore
tar: linux-4.4.1/arch/powerpc/kernel/.gitignore: Cannot open: Stale file handle
linux-4.4.1/arch/powerpc/kernel/Makefile
tar: linux-4.4.1/arch/powerpc/kernel/Makefile: Cannot open: Stale file handle
linux-4.4.1/arch/powerpc/kernel/align.c
tar: linux-4.4.1/arch/powerpc/kernel/align.c: Cannot open: Stale file handle
linux-4.4.1/arch/powerpc/kernel/asm-offsets.c
tar: linux-4.4.1/arch/powerpc/kernel/asm-offsets.c: Cannot open: Stale file handle
linux-4.4.1/arch/powerpc/kernel/audit.c
tar: linux-4.4.1/arch/powerpc/kernel/audit.c: Cannot open: Stale file handle
linux-4.4.1/arch/powerpc/kernel/btext.c
tar: linux-4.4.1/arch/powerpc/kernel/btext.c: Cannot open: Stale file handle
linux-4.4.1/arch/powerpc/kernel/cacheinfo.c

The untar then failed.
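For context, a minimal sketch of the attach-tier step described above, assuming the glusterfs 3.7.x CLI syntax; the hot-tier bricks are the ones listed in the volume info below, but the exact command line used during the test is not recorded in this report:

# Sketch only: attach a replica-2 hot tier to finalvol while client IO is running
gluster volume attach-tier finalvol replica 2 \
  10.70.35.239:/bricks/brick7/final_hot 10.70.35.133:/bricks/brick7/final_hot \
  10.70.37.202:/bricks/brick7/final_hot 10.70.37.195:/bricks/brick7/final_hot \
  10.70.37.120:/bricks/brick7/final_hot 10.70.37.60:/bricks/brick7/final_hot \
  10.70.37.69:/bricks/brick7/final_hot 10.70.37.101:/bricks/brick7/final_hot \
  10.70.35.163:/bricks/brick7/final_hot 10.70.35.173:/bricks/brick7/final_hot \
  10.70.35.232:/bricks/brick7/final_hot 10.70.35.176:/bricks/brick7/final_hot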
Version-Release number of selected component (if applicable):
=========================
3.7.5-19

How reproducible:
==============
I hit either this bug or bug 1306194 (NFS+attach tier: IOs hang while attach tier is issued). Out of 5 retries, I hit this bug twice; the remaining 3 times I hit the NFS hang.

Steps to reproduce:
=================
1) client1: created a 300MB file and started copying it to new files:
   for i in {2..50};do cp hlfile.1 hlfile.$i;done
2) client2: created a 50MB file and repeatedly copied it to new names:
   for i in {2..1000};do cp rename.1 rename.$i;done
3) client3: linux untar
4) copied a 3GB file to create new files in a loop:
   for i in {1..10};do cp File.mkv cheema$i.mkv;done

Volume Name: finalvol
Type: Tier
Volume ID: 15a9fbaa-7e45-4302-b246-19e48cbdf059
Status: Started
Number of Bricks: 36
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distributed-Replicate
Number of Bricks: 6 x 2 = 12
Brick1: 10.70.35.239:/bricks/brick7/final_hot
Brick2: 10.70.35.133:/bricks/brick7/final_hot
Brick3: 10.70.37.202:/bricks/brick7/final_hot
Brick4: 10.70.37.195:/bricks/brick7/final_hot
Brick5: 10.70.37.120:/bricks/brick7/final_hot
Brick6: 10.70.37.60:/bricks/brick7/final_hot
Brick7: 10.70.37.69:/bricks/brick7/final_hot
Brick8: 10.70.37.101:/bricks/brick7/final_hot
Brick9: 10.70.35.163:/bricks/brick7/final_hot
Brick10: 10.70.35.173:/bricks/brick7/final_hot
Brick11: 10.70.35.232:/bricks/brick7/final_hot
Brick12: 10.70.35.176:/bricks/brick7/final_hot
Cold Tier:
Cold Tier Type : Distributed-Disperse
Number of Bricks: 2 x (8 + 4) = 24
Brick13: 10.70.37.202:/bricks/brick1/finalvol
Brick14: 10.70.37.195:/bricks/brick1/finalvol
Brick15: 10.70.35.133:/bricks/brick1/finalvol
Brick16: 10.70.35.239:/bricks/brick1/finalvol
Brick17: 10.70.35.225:/bricks/brick1/finalvol
Brick18: 10.70.35.11:/bricks/brick1/finalvol
Brick19: 10.70.35.10:/bricks/brick1/finalvol
Brick20: 10.70.35.231:/bricks/brick1/finalvol
Brick21: 10.70.35.176:/bricks/brick1/finalvol
Brick22: 10.70.35.232:/bricks/brick1/finalvol
Brick23: 10.70.35.173:/bricks/brick1/finalvol
Brick24: 10.70.35.163:/bricks/brick1/finalvol
Brick25: 10.70.37.101:/bricks/brick1/finalvol
Brick26: 10.70.37.69:/bricks/brick1/finalvol
Brick27: 10.70.37.60:/bricks/brick1/finalvol
Brick28: 10.70.37.120:/bricks/brick1/finalvol
Brick29: 10.70.37.202:/bricks/brick2/finalvol
Brick30: 10.70.37.195:/bricks/brick2/finalvol
Brick31: 10.70.35.133:/bricks/brick2/finalvol
Brick32: 10.70.35.239:/bricks/brick2/finalvol
Brick33: 10.70.35.225:/bricks/brick2/finalvol
Brick34: 10.70.35.11:/bricks/brick2/finalvol
Brick35: 10.70.35.10:/bricks/brick2/finalvol
Brick36: 10.70.35.231:/bricks/brick2/finalvol
Options Reconfigured:
cluster.tier-mode: cache
features.ctr-enabled: on
features.uss: enable
features.quota-deem-statfs: on
features.inode-quota: on
features.quota: on
performance.readdir-ahead: on
[root@dhcp37-202 ~]#

NOTE: With USS disabled, I hit the NFS hang issue instead.
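As additional context, a minimal sketch of how the USS and quota options shown under "Options Reconfigured" above would typically have been enabled before the IO run; the quota limit path and value are hypothetical, since the actual limit used is not recorded in this report:

# Sketch only: enable USS and quota on the volume (hypothetical quota limit)
gluster volume set finalvol features.uss enable
gluster volume quota finalvol enable
gluster volume set finalvol features.quota-deem-statfs on
gluster volume quota finalvol limit-usage / 2TB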
Mount points (Server : Client --> IO type):

Mount1: 10.70.35.133 : rhs-client4 --> file rename
# for i in {1..2000};do mv -f foolu.$i qwer.$i ;done

Mount2: 10.70.35.225 : rhs-client9.lab.eng.blr.redhat.com --> linux untar as below
# date;date >> untar.log;for i in {1..5};do mkdir dir.$i;echo "created dir.$i" >>untar.log;cp linux-4.4.1.tar.xz dir.$i/;echo "copied kernel tar to dir.$i and will start untarring kernel" >>untar.log ;tar -xvf dir.$i/linux-4.4.1.tar.xz -C dir.$i/;echo "linux untar done in dir.$i" >>untar.log;date >> untar.log;done;date

Mount3: 10.70.37.101 : rhs-client30 --> file creates using dd
# for i in {1..1000};do dd if=/dev/urandom of=file.$i bs=1024 count=10000;done
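For completeness, a minimal sketch of the NFS mounts used by the clients, assuming gluster-NFS (which serves NFSv3 at the export path /<volname>); the local mount directory /mnt/finalvol is a hypothetical example:

# Sketch only: one mount per client, each against a different server
mount -t nfs -o vers=3 10.70.35.133:/finalvol /mnt/finalvol   # on rhs-client4
mount -t nfs -o vers=3 10.70.35.225:/finalvol /mnt/finalvol   # on rhs-client9
mount -t nfs -o vers=3 10.70.37.101:/finalvol /mnt/finalvol   # on rhs-client30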
@Laura: this is a separate issue, for which Avra has mentioned that the doc text field needs to be updated.
I am removing the needinfo tag; kindly re-tag if the above discussion does not resolve the question.
We tried following the steps mentioned in the bug: we created a volume, enabled USS on it, and mounted it via NFS on 3 mount points. We started copying the /etc directory in a loop from 2 of the mount points and untarred a linux kernel tarball from the other mount point. While this IO was going on we attached a tier. The attach-tier was successful, and there was neither an IO hang nor a stale file handle. We repeated the above 7 times with the same outcome. It would be great if we could get some help reproducing the issue so that we can RCA it.
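A rough sketch of that reproduction attempt, assuming three NFS mount points at /mnt/nfs1, /mnt/nfs2 and /mnt/nfs3 with the kernel tarball already copied to the third mount; paths, loop counts and brick arguments are placeholders rather than the exact values used:

# Sketch only: background IO from three NFS mounts, then attach a tier mid-run
for i in {1..50}; do cp -r /etc /mnt/nfs1/etc.$i; done &
for i in {1..50}; do cp -r /etc /mnt/nfs2/etc.$i; done &
tar -xf /mnt/nfs3/linux-4.4.1.tar.xz -C /mnt/nfs3/ &
sleep 30   # let the IO run for a while before attaching the tier
gluster volume attach-tier <volname> replica 2 <hot-brick-1> <hot-brick-2> ...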
Nagpavan, can you please retest and see if it is reproducible? ~Atin
Changed the needinfo assignee to Karthick, as he works on tiering.
I have not seen this issue in recent tiering tests with an NFS mount. I'll update the bug with logs if the issue is seen in the future.
Thank you for your bug report. The documentation has been updated for the problem reported. Further, we are no longer releasing any bug fixes or other updates for Tier. This bug will be set to CLOSED WONTFIX to reflect this.