Description of problem:

I can see on client:

[2012-09-13 14:45:48.369643] W [client3_1-fops.c:2457:client3_1_link_cbk] 0-Lookups-client-0: remote operation failed: File exists (00000000-0000-0000-0000-000000000000 -> /3g48v4ogxph0hpu4fb45f9w4g2lsyc3n.hash)
[2012-09-13 14:45:48.374734] E [afr-self-heal-common.c:2156:afr_self_heal_completion_cbk] 0-Lookups-replicate-0: background entry self-heal failed on /

On server:

[2012-09-13 14:47:23.221610] I [server3_1-fops.c:1183:server_link_cbk] 0-Lookups-server: 43153874: LINK /3g48v4ogxph0hpu4fb45f9w4g2lsyc3n.hash (d97fa895-40a6-4eaf-b9b7-48b2c1f8c361) ==> -1 (File exists)

Version-Release number of selected component (if applicable):
release-3.3

How reproducible:
I don't know how we got into this state; it might be related to a kernel change. We were updating the kernel on the replicated bricks on the fly, updating and rebooting one node after another.
86769937 lrwxrwxrwx. 2 apache apache 53 Jan 1 1970 3g48v4ogxph0hpu4fb45f9w4g2lsyc3n.hash -> 3g48v4ogxph0hpu4fb45f9w4g2lsyc3n.hash.P

Check out the date (Jan 1 1970); that is strange.
31691 12:10:17 <... lstat resumed> 0x7f75f4c29710) = -1 ENOENT (No such file or directory)
27119 12:10:17 readv(88, <unfinished ...>
31691 12:10:17 lstat("/mnt/gluster/Lookups/.glusterfs/db/ab/dbab3fcc-69b9-4185-ab43-2cd7c62928d1", <unfinished ...>
28215 12:10:17 fstat(110, <unfinished ...>
31691 12:10:17 <... lstat resumed> 0x7f75f4c296b0) = -1 ENOENT (No such file or directory)
27119 12:10:17 <... readv resumed> [{"\200\0\0\244", 4}], 1) = 4
31691 12:10:17 lstat("/mnt/gluster/Lookups/.glusterfs/db/ab/dbab3fcc-69b9-4185-ab43-2cd7c62928d1", <unfinished ...>
28215 12:10:17 <... fstat resumed> {st_mode=S_IFREG|0644, st_size=118613394, ...}) = 0
31691 12:10:17 <... lstat resumed> 0x7f75f4c29830) = -1 ENOENT (No such file or directory)
27119 12:10:17 readv(88, <unfinished ...>
31691 12:10:17 mkdir("/mnt/gluster/Lookups/.glusterfs/db", 0700 <unfinished ...>
28215 12:10:17 fgetxattr(110, "trusted.gfid" <unfinished ...>
31691 12:10:17 <... mkdir resumed> ) = -1 EEXIST (File exists)
27119 12:10:17 <... readv resumed> [{"\10\357\16\353\0\0\0\0", 8}], 1) = 8
31691 12:10:17 mkdir("/mnt/gluster/Lookups/.glusterfs/db/ab", 0700 <unfinished ...>
28215 12:10:17 <... fgetxattr resumed> , "H\x88\x02\xc5\xbd\x1aA\xa1\x8ds\xef\xd8\x944\x95\xd1", 16) = 16
27119 12:10:17 readv(88, <unfinished ...>
31691 12:10:17 <... mkdir resumed> ) = 0

I tried unlinking and re-linking the file with gfid dbab3fcc-69b9-4185-ab43-2cd7c62928d1. That file also produces a large number of these LINK errors in the logs.
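For anyone checking other files by hand: in the strace above, the brick looks up the gfid under .glusterfs/<first two hex chars>/<next two hex chars>/<gfid>, and fgetxattr returns trusted.gfid as 16 raw bytes. A minimal sketch of both conversions follows, assuming that layout holds for every gfid on the brick. The function names are hypothetical helpers, not part of any GlusterFS tool.

```python
import posixpath
import uuid


def gfid_from_xattr(raw: bytes) -> str:
    """Format the 16 raw bytes of a trusted.gfid xattr as a UUID string."""
    # uuid.UUID raises ValueError if raw is not exactly 16 bytes long.
    return str(uuid.UUID(bytes=raw))


def gfid_backend_path(brick_root: str, gfid: str) -> str:
    """Build the .glusterfs path for a gfid (layout inferred from the strace)."""
    canonical = str(uuid.UUID(gfid))  # validates and lower-cases the gfid
    return posixpath.join(
        brick_root, ".glusterfs", canonical[0:2], canonical[2:4], canonical
    )


if __name__ == "__main__":
    # Brick root and gfid taken from the lstat() calls in the strace.
    print(gfid_backend_path("/mnt/gluster/Lookups",
                            "dbab3fcc-69b9-4185-ab43-2cd7c62928d1"))
    # Raw bytes taken from the fgetxattr() result in the strace.
    print(gfid_from_xattr(
        b"H\x88\x02\xc5\xbd\x1aA\xa1\x8ds\xef\xd8\x944\x95\xd1"))
```

The first call reproduces the path that lstat() reports as ENOENT, which makes it easy to check whether the backend link for a given gfid exists on each brick.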
Can you test to see if the patch for bug #831151 fixes this for you? I'm pretty sure this is a duplicate. The patch is http://review.gluster.com/3571
Yes, that worked. This is a duplicate.
*** This bug has been marked as a duplicate of bug 831151 ***