Bug 762160 (GLUSTER-428)

Summary: looks like stat cache not upto date..
Product: [Community] GlusterFS Reporter: Amar Tumballi <amarts>
Component: stat-prefetchAssignee: Raghavendra G <raghavendra>
Status: CLOSED CURRENTRELEASE QA Contact:
Severity: high Docs Contact:
Priority: low    
Version: mainlineCC: vraman
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:

Description Amar Tumballi 2009-12-03 02:44:08 UTC
Notice the below operations..

until I did a explicit stat, it never deleted the file.

[root@client08 ~]# rm /mount/availmedia/stripe/test2/linux-2.6.31.6/scripts/kconfig/.conf.cmd
rm: remove regular file `/mount/availmedia/stripe/test2/linux-2.6.31.6/scripts/kconfig/.conf.cmd'? y
[root@client08 ~]# rm /mount/availmedia/stripe/test2/linux-2.6.31.6/scripts/kconfig/.conf.cmd 
rm: remove regular file `/mount/availmedia/stripe/test2/linux-2.6.31.6/scripts/kconfig/.conf.cmd'? y
[root@client08 ~]# stat /mount/availmedia/stripe/test2/linux-2.6.31.6/scripts/kconfig/.conf.cmd
  File: `/mount/availmedia/stripe/test2/linux-2.6.31.6/scripts/kconfig/.conf.cmd'
  Size: 110             Blocks: 40         IO Block: 4096   regular file
Device: 1bh/27d Inode: 25952578    Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2009-12-02 17:52:55.000000000 -0800
Modify: 2009-12-02 17:55:05.000000000 -0800
Change: 2009-12-02 17:55:05.000000000 -0800
[root@client08 ~]# stat /mount/availmedia/stripe/test2/linux-2.6.31.6/scripts/kconfig/.conf.cmd
  File: `/mount/availmedia/stripe/test2/linux-2.6.31.6/scripts/kconfig/.conf.cmd'
  Size: 110             Blocks: 40         IO Block: 4096   regular file
Device: 1bh/27d Inode: 25952578    Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2009-12-02 17:52:55.000000000 -0800
Modify: 2009-12-02 17:55:05.000000000 -0800
Change: 2009-12-02 17:55:05.000000000 -0800
[root@client08 ~]# rm /mount/availmedia/stripe/test2/linux-2.6.31.6/scripts/kconfig/.conf.cmd 
rm: remove regular file `/mount/availmedia/stripe/test2/linux-2.6.31.6/scripts/kconfig/.conf.cmd'? y
[root@client08 ~]# rm /mount/availmedia/stripe/test2/linux-2.6.31.6/scripts/kconfig/.conf.cmd 
rm: cannot lstat `/mount/availmedia/stripe/test2/linux-2.6.31.6/scripts/kconfig/.conf.cmd': No such file or directory

for earlier 'rm's where unlink was just failing, gdb backtrace looked like below.

-----------
Breakpoint 1, fuse_unlink_cbk (frame=0x2aaaba3f78b0, cookie=0x2aaaba3f6e60, this=0x13344790, op_ret=-1,
    op_errno=0, preparent=0x0, postparent=0x0) at fuse-bridge.c:1216
1216    fuse-bridge.c: No such file or directory.
        in fuse-bridge.c
(gdb) bt
#0  fuse_unlink_cbk (frame=0x2aaaba3f78b0, cookie=0x2aaaba3f6e60, this=0x13344790, op_ret=-1, op_errno=0,
    preparent=0x0, postparent=0x0) at fuse-bridge.c:1216
#1  0x00002b5d64bc41aa in sp_unlink (frame=0x2aaaba3f6e60, this=<value optimized out>,
    loc=<value optimized out>) at stat-prefetch.c:2407
#2  0x00002b5d64dd1a32 in fuse_unlink (this=0x13344790, finh=<value optimized out>, msg=<value optimized out>)
    at fuse-bridge.c:1444
#3  0x00002b5d64dd6466 in fuse_thread_proc (data=<value optimized out>) at fuse-bridge.c:3170
#4  0x0000003366a064a7 in start_thread () from /lib64/libpthread.so.0
#5  0x00000033662d3c2d in clone () from /lib64/libc.so.6
(gdb) up
#1  0x00002b5d64bc41aa in sp_unlink (frame=0x2aaaba3f6e60, this=<value optimized out>,
    loc=<value optimized out>) at stat-prefetch.c:2407
--------------

notice that unlink returned from sp_unlink itself, with op_ret -1, and errno 0 :O

Actually there were some other operations too, where this behavior was happening (for the same file).
--------------

Corresponding logs in log file said.. (there are few more lines of similar log on same file)

[2009-12-02 17:57:37] W [fuse-bridge.c:911:fuse_fd_cbk] glusterfs-fuse: 3542237: OPEN() /test2/linux-2.6.31.6/scripts/kconfig/.conf.cmd => -1 (Success)
[2009-12-02 17:57:37] W [fuse-bridge.c:911:fuse_fd_cbk] glusterfs-fuse: 3542238: OPEN() /test2/linux-2.6.31.6/scripts/kconfig/.conf.cmd => -1 (Success)
[2009-12-02 17:57:40] W [fuse-bridge.c:720:fuse_truncate_cbk] glusterfs-fuse: 3542962: TRUNCATE() /test2/linux-2.6.31.6/scripts/kconfig/.conf.cmd => -1 (Success)
[2009-12-02 17:57:40] W [fuse-bridge.c:965:fuse_setattr_cbk] glusterfs-fuse: 3542962: SETATTR() /test2/linux-2.6.31.6/scripts/kconfig/.conf.cmd => -1 (Success)
[2009-12-02 17:57:56] W [fuse-bridge.c:1238:fuse_unlink_cbk] glusterfs-fuse: 3577415: UNLINK() /test2/linux-2.6.31.6/scripts/kconfig/.conf.cmd => -1 (Success)
[2009-12-02 18:20:38] W [fuse-bridge.c:1238:fuse_unlink_cbk] glusterfs-fuse: 3577459: UNLINK() /test2/linux-2.6.31.6/scripts/kconfig/.conf.cmd => -1 (Success)
[2009-12-02 18:31:00] W [fuse-bridge.c:1238:fuse_unlink_cbk] glusterfs-fuse: 3577668: UNLINK() /test2/linux-2.6.31.6/scripts/kconfig/.conf.cmd => -1 (Success)
[2009-12-02 18:36:48] W [fuse-bridge.c:1238:fuse_unlink_cbk] glusterfs-fuse: 3577682: UNLINK() /test2/linux-2.6.31.6/scripts/kconfig/.conf.cmd => -1 (Success)
[2009-12-02 18:36:51] W [fuse-bridge.c:2608:fuse_xattr_cbk] glusterfs-fuse: 3577690: GETXATTR() /test2/linux-2.6.31.6/scripts/kconfig/.conf.cmd => -1 (Success)
[2009-12-02 18:36:53] W [fuse-bridge.c:2608:fuse_xattr_cbk] glusterfs-fuse: 3577697: GETXATTR() /test2/linux-2.6.31.6/scripts/kconfig/.conf.cmd => -1 (Success)
-----

after this i did the stat, and unlink succeeded..

Comment 1 Anand Avati 2009-12-05 09:21:32 UTC
PATCH: http://patches.gluster.com/patch/2558 in master (performance/stat-prefetch: don't check whether previous lookups have been failed in sp_process_inode_ctx.)

Comment 2 Anand Avati 2009-12-06 14:28:44 UTC
PATCH: http://patches.gluster.com/patch/2559 in release-2.0 (performance/stat-prefetch: don't check whether previous lookups have been failed in sp_process_inode_ctx.)

Comment 3 Amar Tumballi 2009-12-06 20:09:17 UTC
not seen this problem after the fix.. Will reopen if seen. currently resolving it.