+++ This bug was initially created as a clone of Bug #842373 +++
+++ This bug was initially created as a clone of Bug #842364 +++
Description of problem:
After creating a stripe-replicate or distributed-stripe-replicate volume, the first write to a newly created file fails with an "Invalid argument" error, although a touch operation succeeds. Any further operation on the already created file succeeds.
Version-Release number of selected component (if applicable):
RHS 2.0.z
How reproducible:
Always
Steps to Reproduce:
1. Create a stripe-replicate or distributed-stripe-replicate volume.
2. Mount the volume and create a file, for example:
dd if=/dev/urandom of=file1 bs=1024 count=10000
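The same create-then-write sequence can be driven from a small C helper, which makes the failing call and its errno explicit. This is only a sketch: the function name create_then_write is hypothetical and not part of GlusterFS, and it writes filler bytes instead of reading /dev/urandom.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create `path` and write `len` filler bytes to it
 * in 1024-byte chunks (the block size used by the dd command above).
 * Returns 0 on success, or the errno of the first failing call.
 */
int create_then_write(const char *path, size_t len)
{
    char buf[1024];
    int fd;
    int err = 0;

    assert(path != NULL);
    memset(buf, 'x', sizeof(buf));

    fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
    if (fd < 0)
        return errno;

    while (len > 0) {
        size_t chunk = len < sizeof(buf) ? len : sizeof(buf);
        ssize_t n = write(fd, buf, chunk);

        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted, retry the same chunk */
            err = errno;           /* EINVAL (22) would match this report */
            break;
        }
        if (n == 0) {
            err = EIO;             /* no progress, give up instead of looping */
            break;
        }
        len -= (size_t)n;
    }

    if (close(fd) < 0 && err == 0)
        err = errno;
    return err;
}
```

Called twice with the same new path on the affected mount, the first call would be expected to return EINVAL and the second to return 0, matching the behaviour described in this report.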
Actual results:
An "Invalid argument" error is reported on the mount point. The file is created, but nothing is written to it. Further write operations on the same file succeed.
Expected results:
Additional info:
[2012-07-23 14:43:06.112862] E [stripe-helpers.c:268:stripe_ctx_handle] 0-str-rep1-stripe-0: Failed to get stripe-size
[2012-07-23 14:43:06.132458] I [dict.c:317:dict_get] (-->/usr/lib64/glusterfs/3.3.0/xlator/cluster/replicate.so(afr_create_unwind+0x13c) [0x7f50a7b62b6c] (-->/usr/lib64/glusterfs/3.3.0/xlator/cluster/stripe.so(stripe_create_cbk+0x60b) [0x7f50a794676b] (-->/usr/lib64/glusterfs/3.3.0/xlator/cluster/stripe.so(stripe_ctx_handle+0x90) [0x7f50a794b040]))) 0-dict: !this || key=trusted.str-rep1-stripe-0.stripe-size
Creating a file with vim produced the following errors:
----------------------------------------------------------------
[2012-07-23 14:43:11.104050] E [stripe-helpers.c:268:stripe_ctx_handle] 0-str-rep1-stripe-0: Failed to get stripe-size
[2012-07-23 14:43:11.146554] W [fuse-bridge.c:968:fuse_err_cbk] 0-glusterfs-fuse: 78: FSYNC() ERR => -1 (Invalid argument)
[2012-07-23 14:43:15.153063] W [fuse-bridge.c:2025:fuse_writev_cbk] 0-glusterfs-fuse: 86: WRITE => -1 (Invalid argument)
==========================================================
(gdb) bt
#0 fuse_writev_cbk (frame=0x7fc6ff9d5050, cookie=0x7fc6ffbdd1a4, this=0x1e4d8f0, op_ret=-1, op_errno=22, stbuf=0x0,
postbuf=0x0, xdata=0x0) at fuse-bridge.c:2006
#1 0x00007fc6f6f3c4c6 in io_stats_writev_cbk (frame=0x7fc6ffbdd1a4, cookie=<value optimized out>,
this=<value optimized out>, op_ret=-1, op_errno=22, prebuf=0x0, postbuf=0x0, xdata=0x0) at io-stats.c:1361
#2 0x00007fc6f714cc19 in mdc_writev_cbk (frame=0x7fc6ffbdcd9c, cookie=<value optimized out>, this=<value optimized out>,
op_ret=-1, op_errno=<value optimized out>, prebuf=0x0, postbuf=0x0, xdata=0x0) at md-cache.c:1381
#3 0x00007fc6f735a4b9 in qr_writev_cbk (frame=0x7fc6ffbdd04c, cookie=<value optimized out>, this=<value optimized out>,
op_ret=-1, op_errno=22, prebuf=0x0, postbuf=0x0, xdata=0x0) at quick-read.c:1392
#4 0x00007fc6f756fcf4 in ioc_writev_cbk (frame=0x7fc6ffbdc02c, cookie=<value optimized out>, this=<value optimized out>,
op_ret=-1, op_errno=22, prebuf=0x0, postbuf=0x0, xdata=0x0) at io-cache.c:1198
#5 0x00007fc6f777f82f in ra_writev_cbk (frame=0x7fc6ffbdccf0, cookie=<value optimized out>, this=<value optimized out>,
op_ret=-1, op_errno=22, prebuf=0x0, postbuf=0x0, xdata=0x0) at read-ahead.c:654
#6 0x00007fc6f7992231 in wb_writev (frame=0x7fc6ffbdd5ac, this=0x1e6caf0, fd=<value optimized out>,
vector=<value optimized out>, count=1, offset=1024, flags=32769, iobref=0x7fc6ec000d90, xdata=0x0)
at write-behind.c:2148
#7 0x00007fc6f777fa6b in ra_writev (frame=<value optimized out>, this=0x1e6dc80, fd=0x1fa6a3c, vector=0x7fc6ec001b00,
count=1, offset=1024, flags=32769, iobref=0x7fc6ec000d90, xdata=0x0) at read-ahead.c:682
#8 0x00007fc6f756fa48 in ioc_writev (frame=<value optimized out>, this=0x1e6ed60, fd=0x1fa6a3c, vector=0x7fc6ec001b00,
count=1, offset=1024, flags=32769, iobref=0x7fc6ec000d90, xdata=0x0) at io-cache.c:1238
#9 0x00007fc6f73641bf in qr_writev (frame=<value optimized out>, this=0x1e6fe40, fd=0x1fa6a3c, vector=0x7fc6ec001b00,
count=1, off=1024, wr_flags=32769, iobref=0x7fc6ec000d90, xdata=0x0) at quick-read.c:1529
#10 0x00007fc6f7149e02 in mdc_writev (frame=<value optimized out>, this=0x1e71030, fd=0x1fa6a3c, vector=0x7fc6ec001b00,
count=1, offset=1024, flags=32769, iobref=0x7fc6ec000d90, xdata=0x0) at md-cache.c:1399
#11 0x00007fc6f6f38b90 in io_stats_writev (frame=<value optimized out>, this=0x1e720f0, fd=0x1fa6a3c,
vector=0x7fc6ec001b00, count=1, offset=1024, flags=32769, iobref=0x7fc6ec000d90, xdata=0x0) at io-stats.c:2082
#12 0x00007fc6ff455a83 in fuse_write_resume (state=<value optimized out>) at fuse-bridge.c:2061
#13 0x00007fc6ff448226 in fuse_resolve_done (state=<value optimized out>) at fuse-resolve.c:466
#14 fuse_resolve_all (state=<value optimized out>) at fuse-resolve.c:495
#15 0x00007fc6ff448132 in fuse_resolve (state=0x7fc6ec001400) at fuse-resolve.c:452
#16 0x00007fc6ff44821e in fuse_resolve_all (state=<value optimized out>) at fuse-resolve.c:491
#17 0x00007fc6ff4482b1 in fuse_resolve_continue (state=0x7fc6ec001400) at fuse-resolve.c:511
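For reading the numeric values in the backtrace: op_errno=22 is EINVAL on Linux, which is the "Invalid argument" seen in the fuse log, and the low bits of flags=32769 (0x8001) select write-only access. The remaining 0x8000 bit is most likely O_LARGEFILE in the Linux kernel ABI, but that is an assumption and is not decoded here. The helper names below are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>

/* Name the access mode held in the low bits of an open(2)-style flags value. */
const char *access_mode_name(int flags)
{
    switch (flags & O_ACCMODE) {
    case O_RDONLY: return "O_RDONLY";
    case O_WRONLY: return "O_WRONLY";
    case O_RDWR:   return "O_RDWR";
    default:       return "unknown";
    }
}

/* Human-readable text for an op_errno value as it appears in the frames. */
const char *op_errno_text(int op_errno)
{
    assert(op_errno >= 0);
    return strerror(op_errno);
}
```

Passing 32769 to access_mode_name and 22 to op_errno_text decodes the two values that repeat through frames #0 to #11.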
--- Additional comment from amarts on 2012-07-23 12:09:12 EDT ---
It looks to me like the issue is due to the recent merge of the 'stripe-coalesce' feature. That's the only feature that started sending back a dictionary in _cbk() and checking it for the extended attributes.
--- Additional comment from amarts on 2012-07-23 12:19:38 EDT ---
Most probably the patch below should fix it.
diff --git a/xlators/cluster/afr/src/afr-dir-write.c b/xlators/cluster/afr/src/afr-dir-write.c
index b7e9bd8..78eaa97 100644
--- a/xlators/cluster/afr/src/afr-dir-write.c
+++ b/xlators/cluster/afr/src/afr-dir-write.c
@@ -98,7 +98,7 @@ afr_create_unwind (call_frame_t *frame, xlator_t *this)
local->cont.create.inode,
unwind_buf, &local->cont.create.preparent,
&local->cont.create.postparent,
- NULL);
+ local->xdata_rsp);
}
return 0;
@@ -160,8 +160,10 @@ afr_create_wind_cbk (call_frame_t *frame, void *cookie, xlator_t *this,
fd_ctx->opened_on[child_index] = AFR_FD_OPENED;
fd_ctx->flags = local->cont.create.flags;
- if (local->success_count == 0)
+ if (local->success_count == 0) {
local->cont.create.buf = *buf;
+ local->xdata_rsp = dict_ref (xdata);
+ }
if (child_index == local->read_child_index) {
local->cont.create.read_child_buf = *buf;
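The effect of the patch can be modelled outside GlusterFS with a toy ref-counted dictionary. Everything in this sketch is hypothetical (mini_dict, upper_create_cbk, middle_create_unwind, and the value 131072); only the key name is taken from the log above. It illustrates why unwinding with NULL makes the upper layer's lookup fail, and is not the actual AFR or stripe code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a ref-counted response dictionary (one entry). */
typedef struct {
    int refcount;
    const char *key;
    const char *value;
} mini_dict;

mini_dict *mini_dict_ref(mini_dict *d)
{
    if (d != NULL)
        d->refcount++;
    return d;
}

void mini_dict_unref(mini_dict *d)
{
    if (d != NULL) {
        assert(d->refcount > 0);
        d->refcount--;
    }
}

const char *mini_dict_get(const mini_dict *d, const char *key)
{
    if (d == NULL || key == NULL || strcmp(d->key, key) != 0)
        return NULL;               /* the "!this || key=..." case in the log */
    return d->value;
}

/* Upper layer (stripe in this report): needs the stripe-size key on create. */
int upper_create_cbk(const mini_dict *xdata)
{
    const char *size = mini_dict_get(xdata, "trusted.str-rep1-stripe-0.stripe-size");
    return size != NULL ? 0 : -1;  /* -1 models "Failed to get stripe-size" */
}

/*
 * Middle layer (replicate in this report).
 * keep_xdata == 0 models the old behaviour: unwind with NULL.
 * keep_xdata == 1 models the patched behaviour: take a reference on the
 * child's xdata and pass it up, then drop the reference afterwards.
 */
int middle_create_unwind(mini_dict *child_xdata, int keep_xdata)
{
    mini_dict *xdata_rsp = keep_xdata ? mini_dict_ref(child_xdata) : NULL;
    int ret = upper_create_cbk(xdata_rsp);

    mini_dict_unref(xdata_rsp);
    return ret;
}
```

With a dictionary that holds the stripe-size key, middle_create_unwind returns -1 when keep_xdata is 0 and 0 when it is 1, and the reference count is back at its starting value after either call.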
--- Additional comment from bfoster on 2012-07-23 13:46:32 EDT ---
Ugh, guess I never tested AFR. Thanks Amar... your change fixes this problem.
Patch posted: http://review.gluster.com/3713
--- Additional comment from vbellur on 2012-07-23 14:48:37 EDT ---
CHANGE: http://review.gluster.com/3713 (afr: pass back xdata in create) merged in master by Anand Avati (avati)
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
http://rhn.redhat.com/errata/RHBA-2012-1253.html