Bug 1306917 - With USS enabled, and during Attach tier, Seeing IO error "Cannot open: Stale file handle"
Status: NEW
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: tier
3.1
Unspecified Unspecified
unspecified Severity urgent
: ---
: ---
Assigned To: Bug Updates Notification Mailing List
krishnaram Karthick
tier-interops
: ZStream
Depends On:
Blocks: 1268895
 
Reported: 2016-02-12 02:52 EST by nchilaka
Modified: 2017-06-28 05:06 EDT (History)
12 users

See Also:
Fixed In Version:
Doc Type: Known Issue
Doc Text:
When a User Serviceable Snapshot is enabled, attaching a tier succeeds, but any I/O operations in progress during the attach tier operation may fail with stale file handle errors. Workaround: Disable User Serviceable Snapshots before performing attach tier. Once attach tier has succeeded, User Serviceable Snapshots can be enabled.
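The documented workaround can be sketched as a CLI sequence. This is a sketch, not a verified procedure from this report: the volume name `finalvol` is taken from the volume info below, but the brick paths and server names in the attach-tier command are placeholders, and the attach-tier syntax assumed here is the 3.7-era `gluster volume attach-tier` form.

```shell
# Disable User Serviceable Snapshots before attaching the tier
gluster volume set finalvol features.uss disable

# Attach the hot tier (server names and brick paths below are
# illustrative placeholders, not the bricks from this report)
gluster volume attach-tier finalvol replica 2 \
    serverA:/bricks/brick7/hot serverB:/bricks/brick7/hot

# Re-enable USS once attach tier has succeeded
gluster volume set finalvol features.uss enable
```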
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)

  None (edit)
Description nchilaka 2016-02-12 02:52:50 EST
Description of problem:
========================
After enabling USS and quota, I populated data using 3 NFS clients (different servers were used for the different client mounts): one ran a Linux kernel untar, and the other two created files with dd.
I then triggered an attach tier while the I/O was still in progress.
The Linux untar reported the following errors:
linux-4.4.1/arch/powerpc/include/uapi/asm/termios.h
tar: linux-4.4.1/arch/powerpc/include/uapi/asm/termios.h: Cannot open: Stale file handle
linux-4.4.1/arch/powerpc/include/uapi/asm/tm.h
tar: linux-4.4.1/arch/powerpc/include/uapi/asm/tm.h: Cannot open: Stale file handle
linux-4.4.1/arch/powerpc/include/uapi/asm/types.h
tar: linux-4.4.1/arch/powerpc/include/uapi/asm/types.h: Cannot open: Stale file handle
linux-4.4.1/arch/powerpc/include/uapi/asm/ucontext.h
tar: linux-4.4.1/arch/powerpc/include/uapi/asm/ucontext.h: Cannot open: Stale file handle
linux-4.4.1/arch/powerpc/include/uapi/asm/unistd.h
tar: linux-4.4.1/arch/powerpc/include/uapi/asm/unistd.h: Cannot open: Stale file handle
linux-4.4.1/arch/powerpc/kernel/
tar: linux-4.4.1/arch/powerpc/kernel: Cannot mkdir: Stale file handle
linux-4.4.1/arch/powerpc/kernel/.gitignore
tar: linux-4.4.1/arch/powerpc/kernel/.gitignore: Cannot open: Stale file handle
linux-4.4.1/arch/powerpc/kernel/Makefile
tar: linux-4.4.1/arch/powerpc/kernel/Makefile: Cannot open: Stale file handle
linux-4.4.1/arch/powerpc/kernel/align.c
tar: linux-4.4.1/arch/powerpc/kernel/align.c: Cannot open: Stale file handle
linux-4.4.1/arch/powerpc/kernel/asm-offsets.c
tar: linux-4.4.1/arch/powerpc/kernel/asm-offsets.c: Cannot open: Stale file handle
linux-4.4.1/arch/powerpc/kernel/audit.c
tar: linux-4.4.1/arch/powerpc/kernel/audit.c: Cannot open: Stale file handle
linux-4.4.1/arch/powerpc/kernel/btext.c
tar: linux-4.4.1/arch/powerpc/kernel/btext.c: Cannot open: Stale file handle
linux-4.4.1/arch/powerpc/kernel/cacheinfo.c


The untar then failed.



Version-Release number of selected component (if applicable):
=========================
3.7.5-19


How reproducible:
==============
I hit either this bug or bug 1306194 (NFS + attach tier: I/Os hang while attach tier is issued).
Out of 5 retries, I hit this bug twice; the remaining 3 times I hit the NFS hang.

Steps to reproduce:
=================

1) client1: created a 300MB file and started copying it to new files
for i in {2..50};do cp hlfile.1 hlfile.$i;done

2) client2: created a 50MB file and initiated a continuous rename of it
for i in {2..1000};do cp rename.1 rename.$i;done

3) client3: Linux kernel untar
4) copied a 3GB file to create new files in a loop
for i in {1..10};do cp File.mkv cheema$i.mkv;done
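For reference, the per-client workload above starts from an NFS v3 mount of the volume; a minimal sketch of one client's setup follows. The server IP and mount point are illustrative, and the attach-tier command (run from a storage server while client I/O is in flight) assumes the 3.7-era syntax with placeholder bricks.

```shell
# On a client: NFS-mount the tiered volume (server IP is illustrative)
mount -t nfs -o vers=3 10.70.35.133:/finalvol /mnt/finalvol
cd /mnt/finalvol

# Run one of the workloads above, e.g. the file-copy loop
for i in {2..50}; do cp hlfile.1 hlfile.$i; done

# Meanwhile, on a storage server, trigger the attach tier
# (brick list is a placeholder)
gluster volume attach-tier finalvol replica 2 \
    serverA:/bricks/brick7/hot serverB:/bricks/brick7/hot
```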



Volume Name: finalvol
Type: Tier
Volume ID: 15a9fbaa-7e45-4302-b246-19e48cbdf059
Status: Started
Number of Bricks: 36
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distributed-Replicate
Number of Bricks: 6 x 2 = 12
Brick1: 10.70.35.239:/bricks/brick7/final_hot
Brick2: 10.70.35.133:/bricks/brick7/final_hot
Brick3: 10.70.37.202:/bricks/brick7/final_hot
Brick4: 10.70.37.195:/bricks/brick7/final_hot
Brick5: 10.70.37.120:/bricks/brick7/final_hot
Brick6: 10.70.37.60:/bricks/brick7/final_hot
Brick7: 10.70.37.69:/bricks/brick7/final_hot
Brick8: 10.70.37.101:/bricks/brick7/final_hot
Brick9: 10.70.35.163:/bricks/brick7/final_hot
Brick10: 10.70.35.173:/bricks/brick7/final_hot
Brick11: 10.70.35.232:/bricks/brick7/final_hot
Brick12: 10.70.35.176:/bricks/brick7/final_hot
Cold Tier:
Cold Tier Type : Distributed-Disperse
Number of Bricks: 2 x (8 + 4) = 24
Brick13: 10.70.37.202:/bricks/brick1/finalvol
Brick14: 10.70.37.195:/bricks/brick1/finalvol
Brick15: 10.70.35.133:/bricks/brick1/finalvol
Brick16: 10.70.35.239:/bricks/brick1/finalvol
Brick17: 10.70.35.225:/bricks/brick1/finalvol
Brick18: 10.70.35.11:/bricks/brick1/finalvol
Brick19: 10.70.35.10:/bricks/brick1/finalvol
Brick20: 10.70.35.231:/bricks/brick1/finalvol
Brick21: 10.70.35.176:/bricks/brick1/finalvol
Brick22: 10.70.35.232:/bricks/brick1/finalvol
Brick23: 10.70.35.173:/bricks/brick1/finalvol
Brick24: 10.70.35.163:/bricks/brick1/finalvol
Brick25: 10.70.37.101:/bricks/brick1/finalvol
Brick26: 10.70.37.69:/bricks/brick1/finalvol
Brick27: 10.70.37.60:/bricks/brick1/finalvol
Brick28: 10.70.37.120:/bricks/brick1/finalvol
Brick29: 10.70.37.202:/bricks/brick2/finalvol
Brick30: 10.70.37.195:/bricks/brick2/finalvol
Brick31: 10.70.35.133:/bricks/brick2/finalvol
Brick32: 10.70.35.239:/bricks/brick2/finalvol
Brick33: 10.70.35.225:/bricks/brick2/finalvol
Brick34: 10.70.35.11:/bricks/brick2/finalvol
Brick35: 10.70.35.10:/bricks/brick2/finalvol
Brick36: 10.70.35.231:/bricks/brick2/finalvol
Options Reconfigured:
cluster.tier-mode: cache
features.ctr-enabled: on
features.uss: enable
features.quota-deem-statfs: on
features.inode-quota: on
features.quota: on
performance.readdir-ahead: on



NOTE: If USS was disabled instead, I saw the NFS hang issue.
Comment 5 nchilaka 2016-02-15 00:36:11 EST
mount points:
Server:Client-->IOtype
Mount1:
10.70.35.133:rhs-client4----> file rename
#for i in {1..2000};do mv -f foolu.$i qwer.$i ;done

Mount2:
10.70.35.225:rhs-client9.lab.eng.blr.redhat.com---> linux untar as below
#date;date >> untar.log;for i in {1..5};do mkdir dir.$i;echo "created dir.$i" >>untar.log;cp linux-4.4.1.tar.xz dir.$i/;echo "copied kernel tar to dir.$i and will start untarring kernel" >>untar.log ;tar -xvf dir.$i/linux-4.4.1.tar.xz -C dir.$i/;echo "linux untar done in dir.$i" >>untar.log;date >> untar.log;done;date

Mount3:
10.70.37.101:rhs-client30 --->"for i in {1..1000};do dd if=/dev/urandom of=file.$i bs=1024 count=10000;done"
Comment 7 nchilaka 2016-02-17 01:20:31 EST
@Laura:
This is a separate issue, for which Avra has asked that the doc text field be updated.
Comment 11 nchilaka 2016-03-01 08:23:48 EST
I am removing the needinfo tag; kindly re-tag if the above discussion does not resolve the question.
Comment 12 Avra Sengupta 2016-03-03 03:40:05 EST
We tried following the steps mentioned in the bug: we created a volume, enabled USS on it, and mounted it via NFS on 3 mount points. We started copying the /etc dir in a loop from 2 mount points, and untarred a Linux tarball from the other mount point. While this I/O was going on, we tried attaching a tier. The attach tier was successful, and there was neither an I/O hang nor any stale file handle error.

We repeated the above 7 times, with the same outcome. It would be great if we could get some help reproducing the issue so that we can RCA it.
Comment 14 Atin Mukherjee 2016-03-10 07:33:05 EST
Nagpavan,

Can you please retest it and see if it's reproducible?

~Atin
Comment 16 nchilaka 2016-05-09 08:03:40 EDT
changed needinfo assignee to karthick as he works on tiering
Comment 17 krishnaram Karthick 2016-05-25 12:24:41 EDT
Have not seen this issue with the recent tests on tiering with nfs mount. I'll update the bug with the logs if the issue is seen in the future.
