Bug 1471737
| Summary: | file being created by dd doesn't automatically release control when the volume is down | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Nag Pavan Chilakam <nchilaka> |
| Component: | write-behind | Assignee: | Csaba Henk <csaba> |
| Status: | CLOSED NOTABUG | QA Contact: | Rahul Hinduja <rhinduja> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | rhgs-3.3 | CC: | csaba, jahernan, rhs-bugs, storage-qa-internal |
| Target Milestone: | --- | Keywords: | Triaged, ZStream |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-01-20 12:20:47 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Nag Pavan Chilakam, 2017-07-17 11:21:47 UTC
This behaviour seems to be due to write-behind returning success for the writes. I was able to reproduce similar behaviour using a C program that calls write(2) in a while loop. Even after killing all 3 bricks of a 1x3 volume, further write syscalls on the same fd succeed:

```
(gdb) bt
#0  fuse_writev_cbk (frame=0x7fd9d8004780, cookie=0x7fd9d80042a0, this=0x12a2100, op_ret=17, op_errno=0, stbuf=0x7fd9e0a812a0, postbuf=0x7fd9e0a812a0, xdata=0x0) at fuse-bridge.c:2398
#1  0x00007fd9e1fa8cda in io_stats_writev_cbk (frame=0x7fd9d80042a0, cookie=0x7fd9d8004490, this=0x7fd9dc01aac0, op_ret=17, op_errno=0, prebuf=0x7fd9e0a812a0, postbuf=0x7fd9e0a812a0, xdata=0x0) at io-stats.c:2107
#2  0x00007fd9e21d9192 in mdc_writev_cbk (frame=0x7fd9d8004490, cookie=0x7fd9d8003ad0, this=0x7fd9dc019480, op_ret=17, op_errno=0, prebuf=0x7fd9e0a812a0, postbuf=0x7fd9e0a812a0, xdata=0x0) at md-cache.c:2116
#3  0x00007fd9f0fdc121 in default_writev_cbk (frame=0x7fd9d8003ad0, cookie=0x7fd9d8003cc0, this=0x7fd9dc017e40, op_ret=17, op_errno=0, prebuf=0x7fd9e0a812a0, postbuf=0x7fd9e0a812a0, xdata=0x0) at defaults.c:1242
#4  0x00007fd9f0fdc121 in default_writev_cbk (frame=0x7fd9d8003cc0, cookie=0x7fd9d8003dd0, this=0x7fd9dc016790, op_ret=17, op_errno=0, prebuf=0x7fd9e0a812a0, postbuf=0x7fd9e0a812a0, xdata=0x0) at defaults.c:1242
#5  0x00007fd9e280a0bc in ioc_writev_cbk (frame=0x7fd9d8003dd0, cookie=0x7fd9d80020e0, this=0x7fd9dc0151c0, op_ret=17, op_errno=0, prebuf=0x7fd9e0a812a0, postbuf=0x7fd9e0a812a0, xdata=0x0) at io-cache.c:1234
#6  0x00007fd9e2a1f697 in ra_writev_cbk (frame=0x7fd9d80020e0, cookie=0x7fd9d8002300, this=0x7fd9dc013bf0, op_ret=17, op_errno=0, prebuf=0x7fd9e0a812a0, postbuf=0x7fd9e0a812a0, xdata=0x0) at read-ahead.c:656
#7  0x00007fd9e2c30029 in wb_do_unwinds (wb_inode=0x7fd9dc06a900, lies=0x7fd9e0a81380) at write-behind.c:1204
#8  0x00007fd9e2c315b4 in wb_process_queue (wb_inode=0x7fd9dc06a900) at write-behind.c:1707
#9  0x00007fd9e2c31f1c in wb_writev (frame=0x7fd9d8002300, this=0x7fd9dc012480, fd=0x7fd9d800ec50, vector=0x7fd9d80045c0, count=1, offset=211, flags=32769, iobref=0x7fd9d8003a00, xdata=0x0) at write-behind.c:1822
#10 0x00007fd9e2a1fb4d in ra_writev (frame=0x7fd9d80020e0, this=0x7fd9dc013bf0, fd=0x7fd9d800ec50, vector=0x7fd9d80045c0, count=1, offset=211, flags=32769, iobref=0x7fd9d8003a00, xdata=0x0) at read-ahead.c:684
#11 0x00007fd9e280a70e in ioc_writev (frame=0x7fd9d8003dd0, this=0x7fd9dc0151c0, fd=0x7fd9d800ec50, vector=0x7fd9d80045c0, count=1, offset=211, flags=32769, iobref=0x7fd9d8003a00, xdata=0x0) at io-cache.c:1275
#12 0x00007fd9e25fac2a in qr_writev (frame=0x7fd9d8003cc0, this=0x7fd9dc016790, fd=0x7fd9d800ec50, iov=0x7fd9d80045c0, count=1, offset=211, flags=32769, iobref=0x7fd9d8003a00, xdata=0x0) at quick-read.c:639
#13 0x00007fd9f0fe5bd9 in default_writev_resume (frame=0x7fd9d8003ad0, this=0x7fd9dc017e40, fd=0x7fd9d800ec50, vector=0x7fd9d80045c0, count=1, off=211, flags=32769, iobref=0x7fd9d8003a00, xdata=0x0) at defaults.c:1849
#14 0x00007fd9f0f4f601 in call_resume_wind (stub=0x7fd9d80193a0) at call-stub.c:2045
#15 0x00007fd9f0f5f3ec in call_resume (stub=0x7fd9d80193a0) at call-stub.c:2512
#16 0x00007fd9e23ed91b in open_and_resume (this=0x7fd9dc017e40, fd=0x7fd9d800ec50, stub=0x7fd9d80193a0) at open-behind.c:246
#17 0x00007fd9e23ee87c in ob_writev (frame=0x7fd9d8003ad0, this=0x7fd9dc017e40, fd=0x7fd9d800ec50, iov=0x7fd9d8009c90, count=1, offset=211, flags=32769, iobref=0x7fd9d8003a00, xdata=0x0) at open-behind.c:424
#18 0x00007fd9e21d94e7 in mdc_writev (frame=0x7fd9d8004490, this=0x7fd9dc019480, fd=0x7fd9d800ec50, vector=0x7fd9d8009c90, count=1, offset=211, flags=32769, iobref=0x7fd9d8003a00, xdata=0x0) at md-cache.c:2134
#19 0x00007fd9e1fb88c0 in io_stats_writev (frame=0x7fd9d80042a0, this=0x7fd9dc01aac0, fd=0x7fd9d800ec50, vector=0x7fd9d8009c90, count=1, offset=211, flags=32769, iobref=0x7fd9d8003a00, xdata=0x0) at io-stats.c:2955
#20 0x00007fd9f0fec869 in default_writev (frame=0x7fd9d80042a0, this=0x7fd9dc01c5b0, fd=0x7fd9d800ec50, vector=0x7fd9d8009c90, count=1, off=211, flags=32769, iobref=0x7fd9d8003a00, xdata=0x0) at defaults.c:2543
#21 0x00007fd9e1d89957 in meta_writev (frame=0x7fd9d80042a0, this=0x7fd9dc01c5b0, fd=0x7fd9d800ec50, iov=0x7fd9d8009c90, count=1, offset=211, flags=32769, iobref=0x7fd9d8003a00, xdata=0x0) at meta.c:142
#22 0x00007fd9e8303687 in fuse_write_resume (state=0x7fd9d8009580) at fuse-bridge.c:2457
#23 0x00007fd9e82f72c7 in fuse_fop_resume (state=0x7fd9d8009580) at fuse-bridge.c:625
#24 0x00007fd9e82f4a2b in fuse_resolve_done (state=0x7fd9d8009580) at fuse-resolve.c:663
#25 0x00007fd9e82f4b01 in fuse_resolve_all (state=0x7fd9d8009580) at fuse-resolve.c:690
#26 0x00007fd9e82f4a0c in fuse_resolve (state=0x7fd9d8009580) at fuse-resolve.c:654
#27 0x00007fd9e82f4ad8 in fuse_resolve_all (state=0x7fd9d8009580) at fuse-resolve.c:686
#28 0x00007fd9e82f4b5f in fuse_resolve_continue (state=0x7fd9d8009580) at fuse-resolve.c:706
#29 0x00007fd9e82f47c5 in fuse_resolve_fd (state=0x7fd9d8009580) at fuse-resolve.c:566
#30 0x00007fd9e82f49ba in fuse_resolve (state=0x7fd9d8009580) at fuse-resolve.c:643
#31 0x00007fd9e82f4a83 in fuse_resolve_all (state=0x7fd9d8009580) at fuse-resolve.c:679
#32 0x00007fd9e82f4b9d in fuse_resolve_and_resume (state=0x7fd9d8009580, fn=0x7fd9e83030d5 <fuse_write_resume>) at fuse-resolve.c:718
#33 0x00007fd9e830387a in fuse_write (this=0x12a2100, finh=0x7fd9d80046a0, msg=0x7fd9ef22a000) at fuse-bridge.c:2509
#34 0x00007fd9e83105a2 in fuse_thread_proc (data=0x12a2100) at fuse-bridge.c:5083
#35 0x00007fd9efd7fe25 in start_thread () from /lib64/libpthread.so.0
#36 0x00007fd9ef648bad in clone () from /lib64/libc.so.6
```

I think the guarantee write-behind gives is that it will fail the next fsync call, but I'm moving it to WB to
confirm if this is a bug that needs to be fixed.

Latest status?

Proposal is to close it with NOTABUG, as it is the user application that does not fully comply with POSIX semantics, not glusterfs. However, there are some minor questions to clarify before doing this.

(In reply to Csaba Henk from comment #6)
> Proposal is to close it with NOTABUG, as it is the user application that
> does not fully comply with POSIX semantics, not glusterfs. However, there
> are some minor questions to clarify before doing this.

I don't understand. Shouldn't gluster return ENOTCONN as soon as there are no backends to use? What's wrong with 'dd'? I think write-behind should detect that the bricks are offline (probably by handling the CHILD_DOWN event), flush all pending data, mark any open fd as invalid, and fail any future request with ENOTCONN.

POSIX vfs semantics give you a chance to get notified about the success of your writes via fsync, which dd does not call. Write-behind cannot reply with an error if it is not being asked.

(In reply to Csaba Henk from comment #8)
> POSIX vfs semantics give you a chance to get notified about the success of
> your writes via fsync, which dd does not call. Write-behind cannot reply
> with an error if it is not being asked.

I think that if we can detect an error we should report it as soon as possible. But even if we ignore it, how is it possible that the 'dd' keeps running forever? Sooner or later the write-behind cache should fill up and some write requests should be sent to the bricks, which will fail. At that point any future request should fail (either with ENOTCONN or EIO). I don't see any benefit in hiding this problem from the user.

Based on Csaba's comments #6 and #8, I'm closing this bug as NOTABUG. We will still need to decide what guarantees gluster gives, in order to decide how to behave in these cases, even if the current behaviour is POSIX compliant, as it is here.

Removing needinfo on me as the bug has been closed.