Bug 765544 (GLUSTER-3812) - stat after write returns stale file size
Status: CLOSED DUPLICATE of bug 765443
Alias: GLUSTER-3812
Product: GlusterFS
Classification: Community
Component: io-threads
Version: mainline
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Assignee: Raghavendra G
Reported: 2011-11-14 10:58 UTC by Jean-Marc Saffroy
Modified: 2011-11-17 10:47 UTC

Doc Type: Bug Fix

Attachments:
reproducer for this bug (324 bytes, application/x-shellscript)
2011-11-14 07:58 UTC, Jean-Marc Saffroy

Description Jean-Marc Saffroy 2011-11-14 10:58:48 UTC
Linux kernel builds over Gluster can fail randomly with the following error:

ar: drivers/misc/lis3lv02d/built-in.o: File format is ambiguous
ar: Matching formats: elf32-i386 a.out-i386-linux pei-i386 pei-x86-64 elf64-l1om elf64-little elf64-big elf32-little elf32-big plugin
make[3]: *** [drivers/misc/lis3lv02d/built-in.o] Error 1

When ar fails, strace on ar shows the following syscalls:

10:36:19.684908 stat("drivers/misc/lis3lv02d/built-in.o", 0x7ffff67e1e40) = -1 ENOENT (No such file or directory) <0.005708>
10:36:19.690699 open("drivers/misc/lis3lv02d/built-in.o", O_RDWR|O_CREAT|O_TRUNC, 0666) = 3 <0.056728>
10:36:19.747654 fcntl(3, F_GETFD)       = 0 <0.000126>
10:36:19.747855 fcntl(3, F_SETFD, FD_CLOEXEC) = 0 <0.000013>
10:36:19.747961 fstat(3, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 <0.000012>
10:36:19.748106 mmap(NULL, 131072, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2ac3f6a80000 <0.000290>
10:36:19.748454 lseek(3, 0, SEEK_SET)   = 0 <0.000008>
10:36:19.748518 write(3, "!<arch>\n", 8) = 8 <0.000239>
10:36:19.748816 close(3)                = 0 <0.018367>
10:36:19.767256 munmap(0x2ac3f6a80000, 131072) = 0 <0.000039>
10:36:19.767415 stat("drivers/misc/lis3lv02d/built-in.o", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 <0.000057>
10:36:19.767578 open("drivers/misc/lis3lv02d/built-in.o", O_RDONLY) = 3 <0.022340>
10:36:19.789992 fcntl(3, F_GETFD)       = 0 <0.000011>
10:36:19.790056 fcntl(3, F_SETFD, FD_CLOEXEC) = 0 <0.000011>
10:36:19.790112 fstat(3, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 <0.000009>
10:36:19.790331 mmap(NULL, 131072, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2ac3f6a80000 <0.000014>
10:36:19.790401 lseek(3, 0, SEEK_SET)   = 0 <0.000009>
10:36:19.790465 read(3, "", 131072)     = 0 <0.021196>
10:36:19.811737 lseek(3, 0, SEEK_SET)   = 0 <0.000007>
10:36:19.811785 read(3, "!<arch>\n", 131072) = 8 <0.003844>

Right after writing its output file, ar sees it as empty (st_size=0).

It seems to be a race between a stat() from another process (on the same client) and the write()/stat() sequence. The attached script reproduces the problem very quickly:

# ./writestat2.sh 
testing write+stat race on /gluster/foo
unexpected size for file: 0
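
The attached script is not reproduced in this report. As a rough sketch of how such a reproducer could look (not the actual writestat2.sh), the function below runs a background loop that calls stat on the file while the foreground repeatedly creates the file, writes the 8-byte ar header, and checks the size. The function name `check_write_stat`, the default iteration count, and the use of GNU `stat -c %s` are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical reproducer sketch; NOT the attached writestat2.sh.
# Assumes GNU coreutils stat (for "stat -c %s").
check_write_stat() {
    file=$1
    iterations=${2:-1000}    # assumed default, chosen arbitrarily

    echo "testing write+stat race on $file"

    # "Other process" on the same client: stat the file in a tight loop.
    ( while :; do stat -c %s "$file" >/dev/null 2>&1; done ) &
    stat_pid=$!

    status=0
    i=0
    while [ "$i" -lt "$iterations" ]; do
        rm -f "$file"
        printf '!<arch>\n' > "$file"    # create, write 8 bytes, close
        size=$(stat -c %s "$file")      # stat right after the write
        if [ "$size" != 8 ]; then
            echo "unexpected size for file: $size"
            status=1
            break
        fi
        i=$((i + 1))
    done

    # Stop the background stat loop before returning.
    kill "$stat_pid" 2>/dev/null || true
    wait "$stat_pid" 2>/dev/null || true
    return "$status"
}
```

Calling `check_write_stat /gluster/foo` would exercise the mount point used above; on a file system that orders the write and the stat correctly, the function should return 0 without printing an unexpected size.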

The problem disappears if I disable io-threads on the server. The volume uses a single brick, and was created as follows:
# gluster volume create vol1 server:/home/gluster-vol1

The Gluster code is built from a git checkout from October 26th (commit b3d696f78b16f246bd34f87aafb52317033408cc).

Comment 1 Raghavendra G 2011-11-17 02:09:46 UTC
The issue here is that write-behind does not maintain the order of operations across multiple processes. The following patch will fix this issue:
http://review.gluster.com/#change,712

Hence marking this bug as a duplicate of 3711.

*** This bug has been marked as a duplicate of bug 3711 ***

Comment 2 Jean-Marc Saffroy 2011-11-17 07:47:08 UTC
(In reply to comment #1)
> The issue here is that write-behind does not maintain the order of operations
> across multiple processes. The following patch will fix this issue:
> http://review.gluster.com/#change,712

Thanks, that makes sense.

Also, can you please fix Gerrit? None of the proposed methods for retrieving a patch works: they all use http://review.gluster.com/p/glusterfs as a git repo, which doesn't seem to work, so I can't try the proposed fix.

