Bug 762536 (GLUSTER-804) - [NFS] ESTALE with Fsstress test
Summary: [NFS] ESTALE with Fsstress test
Keywords:
Status: CLOSED WORKSFORME
Alias: GLUSTER-804
Product: GlusterFS
Classification: Community
Component: distribute
Version: mainline
Hardware: All
OS: Linux
Priority: low
Severity: high
Target Milestone: ---
Assignee: Anand Avati
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2010-04-06 07:29 UTC by Anush Shetty
Modified: 2015-09-01 23:04 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:
Regression: RTP
Mount Type: nfs
Documentation: ---
CRM:
Verified Versions:



Description Anush Shetty 2010-04-06 07:29:57 UTC
Setup: Distribute-Replicate with NFS server and no performance translators

Executing /opt/qa/tools/ltp-full-20091031/testcases/kernel/fs//fsstress/fsstress
seed = 1269971658
rm: cannot remove directory `/mnt/nfs/client1/run11938//ltp': Directory not empty
rm: cannot remove `/mnt/nfs/client1/run11938//ltp/.nfs00000000084f06bc00000001': Device or resource busy
pc: Stale NFS file handle
rm: cannot remove `/mnt/nfs/client1/run11938//ltp/.nfs00000000084f06bc00000001': Device or resource busy
rm: cannot remove `/mnt/nfs/client1/run11938//ltp/.nfs00000000084f06bc00000001': Device or resource busy
rm: cannot remove `/mnt/nfs/client1/run11938//ltp/.nfs00000000084f06bc00000001': Device or resource busy
rm: cannot remove `/mnt/nfs/client1/run11938//ltp/.nfs00000000084f06bc00000001': Device or resource busy
p10: Stale NFS file handle
rm: cannot remove `/mnt/nfs/client1/run11938//ltp/.nfs00000000084f06bc00000001': Device or resource busy
p14: Stale NFS file handle
rm: cannot remove `/mnt/nfs/client1/run11938//ltp/.nfs00000000084f06bc00000001': Device or resource busy
rm: cannot remove `/mnt/nfs/client1/run11938//ltp/.nfs00000000084f06bc00000001': Device or resource busy
p8: Stale NFS file handle
rm: cannot remove `/mnt/nfs/client1/run11938//ltp/.nfs00000000084f06bc00000001': Device or resource busy
rm: cannot remove `/mnt/nfs/client1/run11938//ltp/.nfs00000000084f06bc00000001': Device or resource busy
rm: cannot remove `/mnt/nfs/client1/run11938//ltp/.nfs00000000084f06bc00000001': Device or resource busy
p15: Stale NFS file handle
rm: cannot remove `/mnt/nfs/client1/run11938//ltp/.nfs00000000084f06bc00000001': Device or resource busy
rm: cannot remove `/mnt/nfs/client1/run11938//ltp/.nfs00000000084f06bc00000001': Device or resource busy
rm: cannot remove `/mnt/nfs/client1/run11938//ltp/.nfs00000000084f06bc00000001': Device or resource busy
rm: cannot remove `/mnt/nfs/client1/run11938//ltp/.nfs00000000084f06bc00000001': Device or resource busy
p2: Stale NFS file handle
rm: cannot remove `/mnt/nfs/client1/run11938//ltp/.nfs00000000084f06bc00000001': Device or resource busy
p11: Stale NFS file handle
rm: cannot remove `/mnt/nfs/client1/run11938//ltp/.nfs00000000084f06bc00000001': Device or resource busy
rm: cannot remove `/mnt/nfs/client1/run11938//ltp/.nfs00000000084f06bc00000001': Device or resource busy
p10: Stale NFS file handle
rm: cannot remove `/mnt/nfs/client1/run11938//ltp/.nfs00000000084f06bc00000001': Device or resource busy
p9: Stale NFS file handle
rm: cannot remove `/mnt/nfs/client1/run11938//ltp/.nfs00000000084f06bc00000001': Device or resource busy

Comment 1 Shehjar Tikoo 2010-11-10 04:08:34 UTC
Occurs on 3.1 branch also.

First, the cleanup phase in fsstress reports an ENOENT for a file that exists:

 56   5.419016  10.1.10.180 -> 10.1.10.176  NFS V3 REMOVE Call, DH:0xa0115ff7/f24XXXXXXXXXX
 57   5.420329  10.1.10.176 -> 10.1.10.180  NFS V3 REMOVE Reply (Call In 56) Error:NFS3ERR_NOENT


But the file does exist:

[root@FC11-5 shehjart]# ls mount/ -lR
mount/:
total 16
drwxrwxrwx. 3 gopher root 8192 2010-11-10 05:31 p0

mount/p0:
total 16
drwxrwxrwx. 3 69069 root 8192 2010-11-10 05:31 d9XX

mount/p0/d9XX:
total 16
drwxrwxrwx. 2 root root 8192 2010-11-10 05:31 d2aXXXX

mount/p0/d9XX/d2aXXXX:
total 468
-rw-rw-rw-. 1 416 root 2686976 2010-11-10 05:31 f24XXXXXXXXXX
[root@FC11-5 shehjart]# stat mount/p0/d9XX/d2aXXXX/f24XXXXXXXXXX
  File: `mount/p0/d9XX/d2aXXXX/f24XXXXXXXXXX'
  Size: 2686976         Blocks: 936        IO Block: 65536  regular file
Device: 14h/20d Inode: 5660651598438928797  Links: 1
Access: (0666/-rw-rw-rw-)  Uid: (  416/ UNKNOWN)   Gid: (    0/    root)
Access: 2010-11-10 05:31:34.000000000 +0530
Modify: 2010-11-10 05:31:39.000000000 +0530
Change: 2010-11-10 05:31:40.000000000 +0530
[root@FC11-5 shehjart]# rm  mount/p0/d9XX/d2aXXXX/f24XXXXXXXXXX
rm: remove regular file `mount/p0/d9XX/d2aXXXX/f24XXXXXXXXXX'? y
rm: cannot remove `mount/p0/d9XX/d2aXXXX/f24XXXXXXXXXX': No such file or directory

Comment 2 Shehjar Tikoo 2010-11-10 04:13:19 UTC
The volume on which the file is scheduled is returning ENOENT.

[2010-11-10 05:41:31.713879] D [dht-common.c:1447:dht_unlink_linkfile_cbk] dr-dht: subvolume dr-replicate-0 returned -1 (No such file or directory)
[2010-11-10 05:41:31.713896] D [nfs3-helpers.c:2385:nfs3_log_common_res] nfs-nfsv3: XID: 7d4b3d48, REMOVE: NFS: 2(No such file or directory), POSIX: 2(No such file or directory)

Comment 3 Shehjar Tikoo 2010-11-10 06:07:17 UTC
The directory entry shows up in the directory listing, but when I try to remove it I receive an ENOENT from protocol/client. Even more surprisingly, the file is actually missing on the backend. The question then is: why is the entry showing up in readdir results?

Comment 4 Shehjar Tikoo 2010-11-10 07:09:28 UTC
Does not happen on a 2-replica config, only on a 2x2 distributed-replicated one.

Comment 5 Shehjar Tikoo 2010-11-15 03:40:11 UTC
Simplest reproduction on a dist-repl config:

[root@FC11-5 shehjart]# /home/shehjart/fsstress/fsstress -d /home/shehjart/mount/ -l 1 -n 500 -p 5 -r
seed = 1290598758
rm: cannot remove directory `/home/shehjart/mount//p1/d1cXXXX/d3c/d44XXXX': Directory not empty
[root@FC11-5 shehjart]# stat /home/shehjart/mount//p1/d1cXXXX/d3c/d44XXXX
  File: `/home/shehjart/mount//p1/d1cXXXX/d3c/d44XXXX'
  Size: 4096            Blocks: 16         IO Block: 65536  directory
Device: 14h/20d Inode: 9089902889031391622  Links: 2
Access: (0777/drwxrwxrwx)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2010-11-15 05:09:14.000000000 +0530
Modify: 2010-11-15 05:09:14.000000000 +0530
Change: 2010-11-15 05:09:14.000000000 +0530
[root@FC11-5 shehjart]# ls /home/shehjart/mount//p1/d1cXXXX/d3c/d44XXXX
f6bXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
[root@FC11-5 shehjart]# ls /home/shehjart/mount//p1/d1cXXXX/d3c/d44XXXX -l
total 72
-rw-rw-rw-. 1 29847 root 393216 2010-11-15 05:09 f6bXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
[root@FC11-5 shehjart]# rm /home/shehjart/mount//p1/d1cXXXX/d3c/d44XXXX/f6bXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
rm: remove regular file `/home/shehjart/mount//p1/d1cXXXX/d3c/d44XXXX/f6bXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'? y
rm: cannot remove `/home/shehjart/mount//p1/d1cXXXX/d3c/d44XXXX/f6bXXXXXXXXXXXXXXXXXXXXXXXXXXXXX': No such file or directory
[root@FC11-5 shehjart]# ls /home/shehjart/mount//p1/d1cXXXX/d3c/d44XXXX -l
total 72
-rw-rw-rw-. 1 29847 root 393216 2010-11-15 05:09 f6bXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
[root@FC11-5 shehjart]#

Comment 6 Shehjar Tikoo 2010-11-15 04:45:14 UTC
Here's what's happening:
[2010-11-15 05:09:07.191173] D [nfs3-helpers.c:2265:nfs3_log_fh_entry_call] nfs-nfsv3: XID: b13af7dc, LOOKUP: args: FH: hashcount 4, exportid 00000000-0000-0000-0000-000000000000, gfid 78707a9d-3378-43f8-8659-836784d2257e, name: f6bXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
[2010-11-15 05:09:07.191186] T [nfs3.c:1213:nfs3_lookup] nfs-nfsv3: FH to Volume: dist-repl
[2010-11-15 05:09:07.191198] T [nfs3-helpers.c:3021:nfs3_fh_resolve_entry_hard] nfs-nfsv3: FH hard resolution: gfid: 78707a9d-3378-43f8-8659-836784d2257e , entry: f6bXXXXXXXXXXXXXXXXXXXXXXXXXXXXX, hashidx: 0
[2010-11-15 05:09:07.191213] T [nfs3-helpers.c:3029:nfs3_fh_resolve_entry_hard] nfs-nfsv3: Entry needs lookup: /p1/d1cXXXX/d3c/d44XXXX/f6bXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
[2010-11-15 05:09:07.191224] T [nfs-fops.c:353:nfs_fop_lookup] nfs: Lookup: /p1/d1cXXXX/d3c/d44XXXX/f6bXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

.....
.....
.....

[2010-11-15 05:09:07.196236] T [rpc-clnt.c:631:rpc_clnt_reply_init] rpc-clnt: recieved rpc message (RPC XID: 0x7f8b000037ae Program: GlusterFS 3.1, ProgVers: 310, Proc: 27) from rpc-transport (brick3)
[2010-11-15 05:09:07.196272] T [rpc-clnt.c:631:rpc_clnt_reply_init] rpc-clnt: recieved rpc message (RPC XID: 0x7f8b000032b7 Program: GlusterFS 3.1, ProgVers: 310, Proc: 27) from rpc-transport (brick4)
[2010-11-15 05:09:07.196315] T [rpc-clnt.c:631:rpc_clnt_reply_init] rpc-clnt: recieved rpc message (RPC XID: 0x7f8b0000312c Program: GlusterFS 3.1, ProgVers: 310, Proc: 27) from rpc-transport (brick2)
[2010-11-15 05:09:07.196492] T [nfs3-helpers.c:2580:nfs3_fh_resolve_entry_lookup_cbk] nfs-nfsv3: Lookup failed: /p1/d1cXXXX/d3c/d44XXXX/f6bXXXXXXXXXXXXXXXXXXXXXXXXXXXXX: No such file or directory
[2010-11-15 05:09:07.196514] D [nfs3-helpers.c:2385:nfs3_log_common_res] nfs-nfsv3: XID: b13af7dc, LOOKUP: NFS: 2(No such file or directory), POSIX: 14(Bad address)

....
....
....

[2010-11-15 05:09:07.203920] D [nfs3-helpers.c:2280:nfs3_log_rename_call] nfs-nfsv3: XID: b73af7dc, RENAME: args: Src: FH: hashcount 4, exportid 00000000-0000-0000-0000-000000000000, gfid 503b558e-c1b4-42c7-ba8f-f36da4a222e0, name: f54XX, Dst: FH: hashcount 4, exportid 00000000-0000-0000-0000-000000000000, gfid 78707a9d-3378-43f8-8659-836784d2257e, name: f6bXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
[2010-11-15 05:09:07.203933] T [nfs3.c:3579:nfs3_rename] nfs-nfsv3: FH to Volume: dist-repl
[2010-11-15 05:09:07.203945] T [nfs3-helpers.c:3021:nfs3_fh_resolve_entry_hard] nfs-nfsv3: FH hard resolution: gfid: 503b558e-c1b4-42c7-ba8f-f36da4a222e0 , entry: f54XX, hashidx: 0
[2010-11-15 05:09:07.203963] T [nfs3-helpers.c:3021:nfs3_fh_resolve_entry_hard] nfs-nfsv3: FH hard resolution: gfid: 78707a9d-3378-43f8-8659-836784d2257e , entry: f6bXXXXXXXXXXXXXXXXXXXXXXXXXXXXX, hashidx: 0
[2010-11-15 05:09:07.203977] T [nfs3-helpers.c:3029:nfs3_fh_resolve_entry_hard] nfs-nfsv3: Entry needs lookup: /p1/d1cXXXX/d3c/d44XXXX/f6bXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
[2010-11-15 05:09:07.203988] T [nfs-fops.c:353:nfs_fop_lookup] nfs: Lookup: /p1/d1cXXXX/d3c/d44XXXX/f6bXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

....
....
....

[2010-11-15 05:09:07.215120] T [nfs3-helpers.c:2580:nfs3_fh_resolve_entry_lookup_cbk] nfs-nfsv3: Lookup failed: /p1/d1cXXXX/d3c/d44XXXX/f6bXXXXXXXXXXXXXXXXXXXXXXXXXXXXX: No such file or directory
[2010-11-15 05:09:07.215133] T [nfs.c:412:nfs_user_create] nfs: uid: 0, gid 0, gids: 7
[2010-11-15 05:09:07.215142] T [nfs.c:420:nfs_user_create] nfs: gid: 0
[2010-11-15 05:09:07.215151] T [nfs.c:420:nfs_user_create] nfs: gid: 1
[2010-11-15 05:09:07.215160] T [nfs.c:420:nfs_user_create] nfs: gid: 2
[2010-11-15 05:09:07.215168] T [nfs.c:420:nfs_user_create] nfs: gid: 3
[2010-11-15 05:09:07.215177] T [nfs.c:420:nfs_user_create] nfs: gid: 4
[2010-11-15 05:09:07.215185] T [nfs.c:420:nfs_user_create] nfs: gid: 6
[2010-11-15 05:09:07.215194] T [nfs.c:420:nfs_user_create] nfs: gid: 10
[2010-11-15 05:09:07.215203] T [nfs-fops.c:1126:nfs_fop_rename] nfs: Rename: /p1/d1cXXXX/d33XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/d46/f54XX -> /p1/d1cXXXX/d3c/d44XXXX/f6bXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
[2010-11-15 05:09:07.215215] T [nfs-fops.c:135:nfs_create_frame] nfs: uid: 0, gid 0, gids: 7
[2010-11-15 05:09:07.215224] T [nfs-fops.c:137:nfs_create_frame] nfs: gid: 0
[2010-11-15 05:09:07.215232] T [nfs-fops.c:137:nfs_create_frame] nfs: gid: 1
[2010-11-15 05:09:07.215241] T [nfs-fops.c:137:nfs_create_frame] nfs: gid: 2
[2010-11-15 05:09:07.215249] T [nfs-fops.c:137:nfs_create_frame] nfs: gid: 3
[2010-11-15 05:09:07.215258] T [nfs-fops.c:137:nfs_create_frame] nfs: gid: 4
[2010-11-15 05:09:07.215275] T [nfs-fops.c:137:nfs_create_frame] nfs: gid: 6
[2010-11-15 05:09:07.215285] T [nfs-fops.c:137:nfs_create_frame] nfs: gid: 10
[2010-11-15 05:09:07.215300] T [dht-rename.c:697:dht_rename] dist-repl: renaming /p1/d1cXXXX/d33XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/d46/f54XX (hash=repl1/cache=repl1) => /p1/d1cXXXX/d3c/d44XXXX/f6bXXXXXXXXXXXXXXXXXXXXXXXXXXXXX (hash=repl2/cache=<nul>)
[2010-11-15 05:09:07.215313] T [dht-rename.c:592:dht_rename_create_links] dist-repl: linkfile /p1/d1cXXXX/d33XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/d46/f54XX @ repl2 => repl1

....
....
....


[2010-11-15 05:09:07.229012] T [dht-rename.c:509:dht_do_rename] dist-repl: renaming /p1/d1cXXXX/d33XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/d46/f54XX => /p1/d1cXXXX/d3c/d44XXXX/f6bXXXXXXXXXXXXXXXXXXXXXXXXXXXXX (repl2)

....
....
....

[2010-11-15 05:09:07.216203] T [dht-rename.c:601:dht_rename_create_links] dist-repl: link /p1/d1cXXXX/d33XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/d46/f54XX => /p1/d1cXXXX/d3c/d44XXXX/f6bXXXXXXXXXXXXXXXXXXXXXXXXXXXXX (repl1)

....
....
....

[2010-11-15 05:09:07.243393] T [dht-rename.c:434:dht_rename_cbk] dist-repl: deleting old src datafile /p1/d1cXXXX/d33XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/d46/f54XX @ repl1

....
....
....


[2010-11-15 05:09:07.250087] D [nfs3-helpers.c:2385:nfs3_log_common_res] nfs-nfsv3: XID: b73af7dc, RENAME: NFS: 0(Call completed successfully.), POSIX: 14(Bad address)

....
....
....

[2010-11-15 05:09:58.17231] D [nfs3-helpers.c:2265:nfs3_log_fh_entry_call] nfs-nfsv3: XID: cd50f7dc, REMOVE: args: FH: hashcount 4, exportid 00000000-0000-0000-0000-000000000000, gfid 78707a9d-3378-43f8-8659-836784d2257e, name: f6bXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
[2010-11-15 05:09:58.17245] T [nfs3.c:3233:nfs3_remove] nfs-nfsv3: FH to Volume: dist-repl
[2010-11-15 05:09:58.17259] T [nfs3-helpers.c:3021:nfs3_fh_resolve_entry_hard] nfs-nfsv3: FH hard resolution: gfid: 78707a9d-3378-43f8-8659-836784d2257e , entry: f6bXXXXXXXXXXXXXXXXXXXXXXXXXXXXX, hashidx: 0
[2010-11-15 05:09:58.17278] T [nfs.c:412:nfs_user_create] nfs: uid: 0, gid 0, gids: 7
[2010-11-15 05:09:58.17289] T [nfs.c:420:nfs_user_create] nfs: gid: 0
[2010-11-15 05:09:58.17297] T [nfs.c:420:nfs_user_create] nfs: gid: 1
[2010-11-15 05:09:58.17306] T [nfs.c:420:nfs_user_create] nfs: gid: 2
[2010-11-15 05:09:58.17314] T [nfs.c:420:nfs_user_create] nfs: gid: 3
[2010-11-15 05:09:58.17322] T [nfs.c:420:nfs_user_create] nfs: gid: 4
[2010-11-15 05:09:58.17349] T [nfs.c:420:nfs_user_create] nfs: gid: 6
[2010-11-15 05:09:58.17362] T [nfs.c:420:nfs_user_create] nfs: gid: 10
[2010-11-15 05:09:58.17372] T [nfs-fops.c:1018:nfs_fop_unlink] nfs: Unlink: /p1/d1cXXXX/d3c/d44XXXX/f6bXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
[2010-11-15 05:09:58.17384] T [nfs-fops.c:135:nfs_create_frame] nfs: uid: 0, gid 0, gids: 7
[2010-11-15 05:09:58.17394] T [nfs-fops.c:137:nfs_create_frame] nfs: gid: 0
[2010-11-15 05:09:58.17402] T [nfs-fops.c:137:nfs_create_frame] nfs: gid: 1
[2010-11-15 05:09:58.17411] T [nfs-fops.c:137:nfs_create_frame] nfs: gid: 2
[2010-11-15 05:09:58.17419] T [nfs-fops.c:137:nfs_create_frame] nfs: gid: 3
[2010-11-15 05:09:58.17428] T [nfs-fops.c:137:nfs_create_frame] nfs: gid: 4
[2010-11-15 05:09:58.17467] T [nfs-fops.c:137:nfs_create_frame] nfs: gid: 6
[2010-11-15 05:09:58.17479] T [nfs-fops.c:137:nfs_create_frame] nfs: gid: 10

....
....
....

[2010-11-15 05:09:58.20490] D [dht-common.c:1447:dht_unlink_linkfile_cbk] dist-repl: subvolume repl2 returned -1 (No such file or directory)
[2010-11-15 05:09:58.20517] D [nfs3-helpers.c:2385:nfs3_log_common_res] nfs-nfsv3: XID: cd50f7dc, REMOVE: NFS: 2(No such file or directory), POSIX: 2(No such file or directory)
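
To follow which NFS calls failed in a trace like the one above, the nfs3_log_common_res lines can be pulled out with a small script. This is only a triage sketch: the regular expression is inferred from the log lines quoted in this comment, not from any documented log format, and the names RESULT_RE, parse_results, failed_calls and SAMPLE are made up for the example.

```python
import re

# Pattern inferred from the nfs3_log_common_res lines quoted in this bug;
# it is an assumption, not a documented GlusterFS log format.
RESULT_RE = re.compile(
    r"\[(?P<timestamp>[^\]]+)\] (?P<level>[A-Z]) "
    r"\[(?P<source>[^:\]]+):(?P<line>\d+):(?P<function>[^\]]+)\] "
    r"(?P<xlator>[^:]+): XID: (?P<xid>[0-9a-f]+), (?P<op>[A-Z]+): "
    r"NFS: (?P<nfs_code>\d+)\((?P<nfs_text>[^)]*)\), "
    r"POSIX: (?P<posix_code>\d+)\((?P<posix_text>[^)]*)\)"
)

# Two result lines copied from the trace above.
SAMPLE = [
    "[2010-11-15 05:09:07.250087] D [nfs3-helpers.c:2385:nfs3_log_common_res] "
    "nfs-nfsv3: XID: b73af7dc, RENAME: NFS: 0(Call completed successfully.), "
    "POSIX: 14(Bad address)",
    "[2010-11-15 05:09:58.20517] D [nfs3-helpers.c:2385:nfs3_log_common_res] "
    "nfs-nfsv3: XID: cd50f7dc, REMOVE: NFS: 2(No such file or directory), "
    "POSIX: 2(No such file or directory)",
]


def parse_results(lines):
    """Return one dict per line that looks like an NFS call result."""
    results = []
    for text in lines:
        match = RESULT_RE.search(text)
        if match is None:
            continue  # not a result line, skip it
        record = match.groupdict()
        record["nfs_code"] = int(record["nfs_code"])
        record["posix_code"] = int(record["posix_code"])
        results.append(record)
    return results


def failed_calls(results):
    """Keep only the results whose NFS status is non-zero."""
    return [r for r in results if r["nfs_code"] != 0]


if __name__ == "__main__":
    for r in failed_calls(parse_results(SAMPLE)):
        print(r["timestamp"], r["xid"], r["op"], r["nfs_text"])
```

Run against the full nfs log, this lists the XIDs that failed, which can then be searched for in the trace-level lines to see what distribute did for that call.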

Comment 7 Shehjar Tikoo 2010-11-15 04:45:35 UTC
Complete bricks and nfs log at: dev:/share/tickets/804

Comment 8 Shehjar Tikoo 2010-11-15 05:03:55 UTC
[2010-11-15 05:09:08.789140] D [nfs3-helpers.c:2265:nfs3_log_fh_entry_call] nfs-nfsv3: XID: d23df7dc, RMDIR: args: FH: hashcount 3, exportid 00000000-0000-0000-0000-000000000000, gfid 446fc6a6-f65c-42da-93a8-2045e11025ab, name: d44XXXX
[2010-11-15 05:09:08.789152] T [nfs3.c:3376:nfs3_rmdir] nfs-nfsv3: FH to Volume: dist-repl
[2010-11-15 05:09:08.789163] T [nfs3-helpers.c:3021:nfs3_fh_resolve_entry_hard] nfs-nfsv3: FH hard resolution: gfid: 446fc6a6-f65c-42da-93a8-2045e11025ab , entry: d44XXXX, hashidx: 0
[2010-11-15 05:09:08.789177] T [nfs.c:412:nfs_user_create] nfs: uid: 0, gid 0, gids: 7
[2010-11-15 05:09:08.789188] T [nfs.c:420:nfs_user_create] nfs: gid: 0
[2010-11-15 05:09:08.789196] T [nfs.c:420:nfs_user_create] nfs: gid: 1
[2010-11-15 05:09:08.789205] T [nfs.c:420:nfs_user_create] nfs: gid: 2
[2010-11-15 05:09:08.789213] T [nfs.c:420:nfs_user_create] nfs: gid: 3
[2010-11-15 05:09:08.789258] T [nfs.c:420:nfs_user_create] nfs: gid: 4
[2010-11-15 05:09:08.789275] T [nfs.c:420:nfs_user_create] nfs: gid: 6
[2010-11-15 05:09:08.789284] T [nfs.c:420:nfs_user_create] nfs: gid: 10
[2010-11-15 05:09:08.789294] T [nfs-fops.c:970:nfs_fop_rmdir] nfs: Rmdir: /p1/d1cXXXX/d3c/d44XXXX
[2010-11-15 05:09:08.789304] T [nfs-fops.c:135:nfs_create_frame] nfs: uid: 0, gid 0, gids: 7
[2010-11-15 05:09:08.789313] T [nfs-fops.c:137:nfs_create_frame] nfs: gid: 0
[2010-11-15 05:09:08.789322] T [nfs-fops.c:137:nfs_create_frame] nfs: gid: 1
[2010-11-15 05:09:08.789348] T [nfs-fops.c:137:nfs_create_frame] nfs: gid: 2
[2010-11-15 05:09:08.789360] T [nfs-fops.c:137:nfs_create_frame] nfs: gid: 3
[2010-11-15 05:09:08.789369] T [nfs-fops.c:137:nfs_create_frame] nfs: gid: 4
[2010-11-15 05:09:08.789377] T [nfs-fops.c:137:nfs_create_frame] nfs: gid: 6
[2010-11-15 05:09:08.789386] T [nfs-fops.c:137:nfs_create_frame] nfs: gid: 10

....
....

[2010-11-15 05:09:08.791697] T [dht-common.c:4185:dht_rmdir_readdirp_cbk] dist-repl: readdir on repl1 for /p1/d1cXXXX/d3c/d44XXXX returned 4 entries

....
....

[2010-11-15 05:09:08.791816] T [dht-common.c:4143:dht_rmdir_is_subvol_empty] dist-repl: looking up /p1/d1cXXXX/d3c/d44XXXX/f6bXXXXXXXXXXXXXXXXXXXXXXXXXXXXX on repl2

....
....

[2010-11-15 05:09:08.792037] T [dht-common.c:4193:dht_rmdir_readdirp_cbk] dist-repl: readdir on repl2 for /p1/d1cXXXX/d3c/d44XXXX found 2 linkfiles

....
....

[2010-11-15 05:09:08.795376] T [dht-common.c:4013:dht_rmdir_linkfile_unlink_cbk] dist-repl: unlinked linkfile /p1/d1cXXXX/d3c/d44XXXX/f6bXXXXXXXXXXXXXXXXXXXXXXXXXXXXX on repl2
[2010-11-15 05:09:08.795396] D [nfs3-helpers.c:2385:nfs3_log_common_res] nfs-nfsv3: XID: d23df7dc, RMDIR: NFS: 66(Directory not empty), POSIX: 39(Directory not empty)

Comment 9 Shehjar Tikoo 2010-11-15 05:57:57 UTC
Command line reproduction:

#on nfs client
[root@FC11-5 shehjart]# mkdir mount/testdir2
[root@FC11-5 shehjart]# mkdir mount/testdir1

#first brick
[root@FC11-1 gluster-master]# ls /testdirs/disk1/* -l
/testdirs/disk1/testdir1:
total 0

/testdirs/disk1/testdir2:
total 0

#second brick
[root@FC11-2 ~]#  ls /testdirs/disk1/* -l
/testdirs/disk1/testdir1:
total 0

/testdirs/disk1/testdir2:
total 0

#third brick
[root@FC11-3 ~]#  ls /testdirs/disk1/* -l
/testdirs/disk1/testdir1:
total 0

/testdirs/disk1/testdir2:
total 0

#fourth brick
[root@FC11-4 ~]#  ls /testdirs/disk1/* -l
/testdirs/disk1/testdir1:
total 0

/testdirs/disk1/testdir2:
total 0


#nfs client
[root@FC11-5 shehjart]# touch mount/testdir1/f54XX

#first brick, file created as expected
[root@FC11-1 gluster-master]# ls /testdirs/disk1/* -l
/testdirs/disk1/testdir1:
total 4
-rw-r--r--. 1 root root 0 2010-11-15 07:18 f54XX

/testdirs/disk1/testdir2:
total 0

#second brick, file created as expected
[root@FC11-2 ~]#  ls /testdirs/disk1/* -l
/testdirs/disk1/testdir1:
total 4
-rw-r--r--. 1 root root 0 2010-11-15 07:18 f54XX

/testdirs/disk1/testdir2:
total 0

#rename the file from nfs client
[root@FC11-5 shehjart]# mv mount/testdir1/f54XX mount/testdir2/f6bXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

#first brick shows file moved to testdir2 as expected
[root@FC11-1 gluster-master]# ls /testdirs/disk1/* -l
/testdirs/disk1/testdir1:
total 0

/testdirs/disk1/testdir2:
total 4
-rw-r--r--. 1 root root 0 2010-11-15 07:18 f6bXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

#second brick shows file moved to testdir2 as expected
[root@FC11-2 ~]#  ls /testdirs/disk1/* -l
/testdirs/disk1/testdir1:
total 0

/testdirs/disk1/testdir2:
total 4
-rw-r--r--. 1 root root 0 2010-11-15 07:18 f6bXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

#third brick shows link file added in testdir2 to show that the file is hashed here now
[root@FC11-3 ~]#  ls /testdirs/disk1/* -l
/testdirs/disk1/testdir1:
total 0

/testdirs/disk1/testdir2:
total 4
---------T. 1 root root 0 2010-11-15 07:19 f6bXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

#same here because brick3 and 4 are replicas
[root@FC11-4 ~]#  ls /testdirs/disk1/* -l
/testdirs/disk1/testdir1:
total 0

/testdirs/disk1/testdir2:
total 4
---------T. 1 root root 0 2010-11-15 07:19 f6bXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

#rmdir of non-empty dir fails as expected
[root@FC11-5 shehjart]# rmdir mount/testdir2/
rmdir: failed to remove `mount/testdir2/': Directory not empty

#bricks 1 and 2 are in correct state
[root@FC11-1 gluster-master]# ls /testdirs/disk1/* -l
/testdirs/disk1/testdir1:
total 0

/testdirs/disk1/testdir2:
total 4
-rw-r--r--. 1 root root 0 2010-11-15 07:18 f6bXXXXXXXXXXXXXXXXXXXXXXXXXXXXX


[root@FC11-2 ~]#  ls /testdirs/disk1/* -l
/testdirs/disk1/testdir1:
total 0

/testdirs/disk1/testdir2:
total 4
-rw-r--r--. 1 root root 0 2010-11-15 07:18 f6bXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

#but 3 and 4 have the link file missing
[root@FC11-3 ~]#  ls /testdirs/disk1/* -l
/testdirs/disk1/testdir1:
total 0

/testdirs/disk1/testdir2:
total 0

[root@FC11-4 ~]#  ls /testdirs/disk1/* -l
/testdirs/disk1/testdir1:
total 0

/testdirs/disk1/testdir2:
total 0

#back on the nfs client, the file still shows up
[root@FC11-5 shehjart]# ls -l mount/testdir2/
total 4
-rw-r--r--. 1 root root 0 2010-11-15 07:18 f6bXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

#removing it again makes no difference
[root@FC11-5 shehjart]# rm -f  mount/testdir2/f6bXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
[root@FC11-5 shehjart]# ls -l mount/testdir2/
total 4
-rw-r--r--. 1 root root 0 2010-11-15 07:18 f6bXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

#bricks 1 and 2 continue to keep their copy of this file
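
The sequence above can be sketched as a toy model. This is not GlusterFS code: ToyDistribute and all of its methods are hypothetical, the two replica pairs are reduced to plain dicts, both names live in a single directory, and the hash function is hard-coded to match the trace in comment 6 (f54XX hashes to repl1, the new name hashes to repl2). The rmdir and unlink behaviour is modelled only on what the logs in comments 6 and 8 show.

```python
import errno

LINK = "linkfile"  # pointer entry, shown as ---------T on bricks 3 and 4
DATA = "data"      # the real file

OLD_NAME = "f54XX"
NEW_NAME = "f6bXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"


class ToyDistribute:
    """Toy model of one directory spread over two distribute subvolumes."""

    def __init__(self, hashed_subvol_of):
        # hashed_subvol_of is a stand-in for the real name hash (assumption).
        self.hashed_subvol_of = hashed_subvol_of
        self.subvols = {"repl1": {}, "repl2": {}}
        self.cached = {}  # name -> subvolume that holds the data file

    def create(self, name):
        subvol = self.hashed_subvol_of(name)
        self.subvols[subvol][name] = DATA
        self.cached[name] = subvol

    def rename(self, src, dst):
        data_subvol = self.cached.pop(src)
        hashed = self.hashed_subvol_of(dst)
        # The data file stays on its subvolume and only changes name.
        del self.subvols[data_subvol][src]
        self.subvols[data_subvol][dst] = DATA
        # If the new name hashes elsewhere, a linkfile is left there.
        if hashed != data_subvol:
            self.subvols[hashed][dst] = LINK
        self.cached[dst] = data_subvol

    def listdir(self):
        # Assumption: listings show data files only, never linkfiles.
        return sorted(
            name
            for entries in self.subvols.values()
            for name, kind in entries.items()
            if kind == DATA
        )

    def rmdir(self):
        # As in comment 8: subvolumes holding only linkfiles are cleaned...
        for entries in self.subvols.values():
            if all(kind == LINK for kind in entries.values()):
                entries.clear()
        # ...and only then does the call fail because repl1 has a data file.
        if any(self.subvols.values()):
            return errno.ENOTEMPTY
        return 0

    def unlink(self, name):
        hashed = self.hashed_subvol_of(name)
        data_subvol = self.cached[name]
        if hashed != data_subvol:
            # As in comment 6: the linkfile unlink fails and ends the call.
            if name not in self.subvols[hashed]:
                return errno.ENOENT
            del self.subvols[hashed][name]
        del self.subvols[data_subvol][name]
        del self.cached[name]
        return 0


def reproduce():
    vol = ToyDistribute(lambda n: "repl1" if n == OLD_NAME else "repl2")
    vol.create(OLD_NAME)
    vol.rename(OLD_NAME, NEW_NAME)
    rmdir_result = vol.rmdir()          # removes the linkfile, then fails
    unlink_result = vol.unlink(NEW_NAME)  # linkfile is gone, so this fails
    return rmdir_result, unlink_result, vol.listdir()


if __name__ == "__main__":
    print(reproduce())
```

In this model the failed rmdir is the step that leaves the directory inconsistent: the linkfile is already gone when ENOTEMPTY is returned, so every later unlink fails with ENOENT while the listing keeps showing the file.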

Comment 10 Shehjar Tikoo 2010-11-15 06:16:57 UTC
Re-assigning to Avati until he looks into the distribute problem of link file removal during rmdir. Will get back to the fsstress ESTALE problem once the above gets fixed.

Comment 11 Amar Tumballi 2011-04-25 09:32:53 UTC
Please update the status of this bug, as it's been more than 6 months since it was filed (bug id < 2000).

Please resolve it with the proper resolution if it's not valid anymore. If it's still valid and not critical, move it to 'enhancement' severity.

