Bug 1065705 - ls & stat give inconsistent results after file deletion by another client
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: GlusterFS
Classification: Community
Component: core
Version: 3.4.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Vinayaga Raman
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2014-02-16 02:33 UTC by Louis Zuckerman
Modified: 2014-03-31 01:29 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-02-20 04:24:12 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Louis Zuckerman 2014-02-16 02:33:36 UTC
Description of problem:

Immediately after a file is deleted in one FUSE client, other clients can no longer see the file in a directory listing but can still stat it for about one second.

I have tried disabling various performance xlators (see volume info below) but this behavior persists.

I have only tried this on a trivial one-brick volume, mounted locally, which I am using to test a libgfapi client.  I don't know if this problem affects multi-brick volumes or non-local clients.

Version-Release number of selected component (if applicable):

glusterfs 3.4.2 on ubuntu saucy

Linux zzbp 3.11.0-15-generic #25-Ubuntu SMP Thu Jan 30 17:22:01 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

How reproducible:

Consistently, on different localhost mounts of my one-brick test volume.  I first noticed this behavior happening between a fuse mount where the file was being deleted and a libgfapi client which was monitoring the directory.

I then tried reproducing the problem using two FUSE clients and plain old bash commands, and it happens every time.

Steps to Reproduce:

1. Make two fuse mounts of the same volume, /mnt/foo1 & /mnt/foo2, and open a shell in each one
2. In /mnt/foo1, create a test file
   echo test > test.file
3. In /mnt/foo2, run this command to monitor the file
   while true; do sleep 0.1; ls -la | grep test.file; stat -t test.file; done
4. Back in /mnt/foo1, delete the test file
   rm test.file
5. Check the output from the monitor in /mnt/foo2
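
To put a number on the lag instead of reading it off the scrolling output, the monitor in step 3 could be wrapped in a small function. This is only a sketch: `watch_vanish` is a hypothetical helper name, not part of GlusterFS, and it counts 0.1-second polling iterations instead of measuring wall-clock time.

```bash
#!/usr/bin/env bash
# Hypothetical helper: poll DIR/FILE and report at which iteration
# the directory listing and stat each stop seeing the file.
# Usage: watch_vanish DIR FILE [MAX_ITERATIONS]
watch_vanish() {
    local dir=$1 file=$2 max=${3:-100}
    local i ls_gone="" stat_gone=""
    for ((i = 1; i <= max; i++)); do
        # Directory listing view: is the name still present in ls output?
        if [[ -z $ls_gone ]] && ! ls -a "$dir" | grep -qxF -- "$file"; then
            ls_gone=$i
        fi
        # Attribute view: does stat on the path still succeed?
        if [[ -z $stat_gone ]] && ! stat -- "$dir/$file" >/dev/null 2>&1; then
            stat_gone=$i
        fi
        # Once both views agree the file is gone, print the difference.
        if [[ -n $ls_gone && -n $stat_gone ]]; then
            echo "ls_gone=$ls_gone stat_gone=$stat_gone lag=$((stat_gone - ls_gone))"
            return 0
        fi
        sleep 0.1
    done
    # The file never disappeared from one or both views within MAX_ITERATIONS.
    echo "ls_gone=${ls_gone:-never} stat_gone=${stat_gone:-never}"
    return 1
}
```

Started in /mnt/foo2 as `watch_vanish . test.file` before running the `rm` in /mnt/foo1, a non-zero `lag` would indicate the inconsistency described here; a `lag` of 0 means both views dropped the file in the same iteration.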

Actual results:

Notice that the output from stat changes 10 iterations after the output from ls changes; that's a full second later.

-rw-r--r-- 1 root  root     5 Feb 15 21:18 test.file
test.file 5 1 81a4 0 0 26 9449262257167296185 1 0 0 1392517108 1392517108 1392517108 0 131072
-rw-r--r-- 1 root  root     5 Feb 15 21:18 test.file
test.file 5 1 81a4 0 0 26 9449262257167296185 1 0 0 1392517108 1392517108 1392517108 0 131072
-rw-r--r-- 1 root  root     5 Feb 15 21:18 test.file
test.file 5 1 81a4 0 0 26 9449262257167296185 1 0 0 1392517108 1392517108 1392517108 0 131072
test.file 5 1 81a4 0 0 26 9449262257167296185 1 0 0 1392517108 1392517108 1392517108 0 131072
test.file 5 1 81a4 0 0 26 9449262257167296185 1 0 0 1392517108 1392517108 1392517108 0 131072
test.file 5 1 81a4 0 0 26 9449262257167296185 1 0 0 1392517108 1392517108 1392517108 0 131072
test.file 5 1 81a4 0 0 26 9449262257167296185 1 0 0 1392517108 1392517108 1392517108 0 131072
test.file 5 1 81a4 0 0 26 9449262257167296185 1 0 0 1392517108 1392517108 1392517108 0 131072
test.file 5 1 81a4 0 0 26 9449262257167296185 1 0 0 1392517108 1392517108 1392517108 0 131072
test.file 5 1 81a4 0 0 26 9449262257167296185 1 0 0 1392517108 1392517108 1392517108 0 131072
test.file 5 1 81a4 0 0 26 9449262257167296185 1 0 0 1392517108 1392517108 1392517108 0 131072
stat: cannot stat ‘test.file’: No such file or directory
stat: cannot stat ‘test.file’: No such file or directory
stat: cannot stat ‘test.file’: No such file or directory
stat: cannot stat ‘test.file’: No such file or directory



Expected results:

The file should disappear from ls & stat at the same time, not one second apart.

Additional info:

My test volume details:

Volume Name: foo
Type: Distribute
Volume ID: 4bb5d118-be31-4bdb-a645-9a045241cc37
Status: Started
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: gluster:/var/tmp/foo
Options Reconfigured:
performance.md-cache-timeout: 0
performance.cache-refresh-timeout: 0
performance.flush-behind: off
diagnostics.brick-log-level: TRACE
diagnostics.client-log-level: DEBUG
server.allow-insecure: on
performance.quick-read: off
performance.write-behind: off

As mentioned above, I tried disabling some performance xlators to get rid of this problem, but nothing made a difference.
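
For anyone repeating the test, the performance options listed under "Options Reconfigured" could be applied with `gluster volume set`. The exact commands the reporter ran are not recorded in this report, so the following is an assumed reconstruction using the volume name `foo` and the option values from the listing above.

```bash
# Assumed commands; option names and values are copied from the
# "Options Reconfigured" listing, not from the reporter's shell history.
gluster volume set foo performance.md-cache-timeout 0
gluster volume set foo performance.cache-refresh-timeout 0
gluster volume set foo performance.flush-behind off
gluster volume set foo performance.quick-read off
gluster volume set foo performance.write-behind off

# Confirm that the options took effect.
gluster volume info foo
```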


On a single client mount, ls & stat are consistent immediately after deletion.
For example...
root@zzbp:/mnt/foo1# echo test > test.file; sleep 1; rm test.file; while true; do ls -la | grep test.file; stat -t test.file; sleep 0.1; done
stat: cannot stat ‘test.file’: No such file or directory
stat: cannot stat ‘test.file’: No such file or directory


Thank you very much,
-louis

Comment 1 Louis Zuckerman 2014-02-20 04:24:12 UTC
Vijay recommended on IRC that I try mounting the FUSE clients with "-o attribute-timeout=0" and that resolved the problem.
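
For reference, a remount with that option might look like the sketch below. The server name `gluster` and volume name `foo` are assumptions taken from the volume info in the description; the mount points are the ones used in the reproduction steps. The option sets how long FUSE attributes are cached on the client, and the GlusterFS default is understood to be one second, which would be consistent with the roughly one-second lag reported.

```bash
# Assumption: server "gluster" and volume "foo", as shown in the volume info.
# Remount both test clients with attribute caching disabled.
umount /mnt/foo1
umount /mnt/foo2
mount -t glusterfs -o attribute-timeout=0 gluster:/foo /mnt/foo1
mount -t glusterfs -o attribute-timeout=0 gluster:/foo /mnt/foo2
```

Disabling the attribute cache means every stat goes to the brick, so this trades some metadata performance for consistency between clients.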

Thanks, Vijay!

