Bug 1471737 - file being created by dd doesn't automatically release control when the volume is down
Summary: file being created by dd doesn't automatically release control when the volume is down
Status: NEW
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: write-behind
Version: 3.3
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Csaba Henk
QA Contact: Rahul Hinduja
URL:
Whiteboard:
Keywords: ZStream
Depends On:
Blocks:
 
Reported: 2017-07-17 11:21 UTC by nchilaka
Modified: 2019-03-09 18:10 UTC
CC List: 5 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---



Description nchilaka 2017-07-17 11:21:47 UTC
Description of problem:
======================
I have seen that when I am writing to a file using dd and then stop the volume, the dd command never releases control of the CLI until the user intervenes.
However, when there is a transport endpoint error, why are we still holding the lock?
Control is released only when the user terminates the command (Ctrl+C).

Any file created after that, whether with dd, touch, or any other method, fails with "Transport endpoint is not connected", which is correct.


Version-Release number of selected component (if applicable):
=======
3.8.4-33

How reproducible:
========
always

Steps to Reproduce:
1. Create a 1x3 volume and FUSE-mount it.
2. Create files using dd in a loop, say for 10,000 files:
for i in {1..10000}; do dd if=/dev/urandom of=file.$i bs=1024 count=100000; done
3. Stop the volume with the volume stop command.
4. Notice that the CLI prompt is never returned until the user intervenes; I waited for about 5 minutes. (A full sketch of these steps follows below.)
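
A minimal end-to-end reproduction sketch (hostnames, brick paths, and the mount point are placeholders; the volume name matches the logs below):

gluster volume create vol_3 replica 3 server1:/bricks/b1 server2:/bricks/b2 server3:/bricks/b3
gluster volume start vol_3
mount -t glusterfs server1:/vol_3 /mnt/vol_3
cd /mnt/vol_3
for i in {1..10000}; do dd if=/dev/urandom of=file.$i bs=1024 count=100000; done

# In another terminal, on one of the server nodes, while dd is still writing:
gluster volume stop vol_3
# dd does not return; the shell prompt comes back only after Ctrl+C.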


The file handle shows a transport endpoint error:

[root@dhcp35-103 glusterfs]#  lsof|grep file.5
lsof: WARNING: can't stat() fuse.glusterfs file system /mnt/vol_3
      Output information may be incomplete.
dd        12428          root    1w  unknown                                         /mnt/vol_3/dd2/file.5 (stat: Transport endpoint is not connected)




Actual results:
===========
The dd command hangs and the shell prompt is not returned until the user terminates it (Ctrl+C).

Expected results:
======
Shell control must be returned to the user once there is a transport endpoint error.

Additional info:
==============
This can be a problem especially if I/O is being done through another application, where the user may not be able to identify the hang.




The FUSE mount logs also point to the right errors:

[2017-07-17 11:09:19.826321] W [socket.c:595:__socket_rwv] 0-vol_3-client-0: readv on 10.70.35.45:49154 failed (Connection reset by peer)
[2017-07-17 11:09:19.826399] I [MSGID: 114018] [client.c:2280:client_rpc_notify] 0-vol_3-client-0: disconnected from vol_3-client-0. Client process will keep trying to connect to glusterd until brick's port is available
[2017-07-17 11:09:19.827417] E [rpc-clnt.c:365:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x192)[0x7f0ef8dae1e2] (--> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7f0ef8b738ae] (--> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f0ef8b739be] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x90)[0x7f0ef8b75130] (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x2a0)[0x7f0ef8b75be0] ))))) 0-vol_3-client-0: forced unwinding frame type(GlusterFS 3.3) op(WRITE(13)) called at 2017-07-17 11:09:19.824810 (xid=0x1b0f9)
[2017-07-17 11:09:19.827457] W [MSGID: 114031] [client-rpc-fops.c:855:client3_3_writev_cbk] 0-vol_3-client-0: remote operation failed [Transport endpoint is not connected]
[2017-07-17 11:09:19.827506] W [MSGID: 114061] [client-common.c:460:client_pre_fsync] 0-vol_3-client-0:  (5bd904f5-384b-4cbf-b0ae-1849f7e0ad36) remote_fd is -1. EBADFD [File descriptor in bad state]
[2017-07-17 11:09:19.827561] W [MSGID: 108035] [afr-transaction.c:2231:afr_changelog_fsync_cbk] 0-vol_3-replicate-0: fsync(5bd904f5-384b-4cbf-b0ae-1849f7e0ad36) failed on subvolume vol_3-client-0. Transaction was WRITE [File descriptor in bad state]
[2017-07-17 11:09:20.010047] E [MSGID: 114031] [client-rpc-fops.c:1601:client3_3_finodelk_cbk] 0-vol_3-client-0: remote operation failed [Transport endpoint is not connected]
[2017-07-17 11:09:21.846196] W [socket.c:595:__socket_rwv] 0-vol_3-client-2: readv on 10.70.35.122:49154 failed (Connection reset by peer)
[2017-07-17 11:09:21.846287] I [MSGID: 114018] [client.c:2280:client_rpc_notify] 0-vol_3-client-2: disconnected from vol_3-client-2. Client process will keep trying to connect to glusterd until brick's port is available
[2017-07-17 11:09:21.846292] W [socket.c:595:__socket_rwv] 0-vol_3-client-1: readv on 10.70.35.130:49154 failed (Connection reset by peer)
[2017-07-17 11:09:21.846328] I [MSGID: 114018] [client.c:2280:client_rpc_notify] 0-vol_3-client-1: disconnected from vol_3-client-1. Client process will keep trying to connect to glusterd until brick's port is available
[2017-07-17 11:09:21.846417] W [MSGID: 108001] [afr-common.c:4820:afr_notify] 0-vol_3-replicate-0: Client-quorum is not met
[2017-07-17 11:09:21.846560] E [MSGID: 108006] [afr-common.c:4731:afr_notify] 0-vol_3-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up.
[2017-07-17 11:09:21.847039] E [rpc-clnt.c:365:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x192)[0x7f0ef8dae1e2] (--> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7f0ef8b738ae] (--> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f0ef8b739be] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x90)[0x7f0ef8b75130] (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x2a0)[0x7f0ef8b75be0] ))))) 0-vol_3-client-2: forced unwinding frame type(GlusterFS 3.3) op(WRITE(13)) called at 2017-07-17 11:09:21.844728 (xid=0x1c6c1)
[2017-07-17 11:09:21.847047] E [rpc-clnt.c:365:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x192)[0x7f0ef8dae1e2] (--> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7f0ef8b738ae] (--> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f0ef8b739be] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x90)[0x7f0ef8b75130] (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x2a0)[0x7f0ef8b75be0] ))))) 0-vol_3-client-1: forced unwinding frame type(GlusterFS 3.3) op(WRITE(13)) called at 2017-07-17 11:09:21.844687 (xid=0x1c6c3)
[2017-07-17 11:09:21.847064] W [MSGID: 114031] [client-rpc-fops.c:855:client3_3_writev_cbk] 0-vol_3-client-2: remote operation failed [Transport endpoint is not connected]
[2017-07-17 11:09:21.847072] W [MSGID: 114031] [client-rpc-fops.c:855:client3_3_writev_cbk] 0-vol_3-client-1: remote operation failed [Transport endpoint is not connected]
[2017-07-17 11:09:21.847099] W [MSGID: 114061] [client-common.c:460:client_pre_fsync] 0-vol_3-client-1:  (de308b78-ace3-40b5-bbb1-d658cdf1fb9c) remote_fd is -1. EBADFD [File descriptor in bad state]
[2017-07-17 11:09:21.847120] W [MSGID: 108035] [afr-transaction.c:2231:afr_changelog_fsync_cbk] 0-vol_3-replicate-0: fsync(de308b78-ace3-40b5-bbb1-d658cdf1fb9c) failed on subvolume vol_3-client-1. Transaction was WRITE [File descriptor in bad state]
[2017-07-17 11:09:21.847136] W [MSGID: 114061] [client-common.c:460:client_pre_fsync] 0-vol_3-client-2:  (de308b78-ace3-40b5-bbb1-d658cdf1fb9c) remote_fd is -1. EBADFD [File descriptor in bad state]
[2017-07-17 11:09:21.847151] W [MSGID: 108035] [afr-transaction.c:2231:afr_changelog_fsync_cbk] 0-vol_3-replicate-0: fsync(de308b78-ace3-40b5-bbb1-d658cdf1fb9c) failed on subvolume vol_3-client-2. Transaction was WRITE [File descriptor in bad state]
[2017-07-17 11:09:21.847166] E [MSGID: 114031] [client-rpc-fops.c:1601:client3_3_finodelk_cbk] 0-vol_3-client-1: remote operation failed [Transport endpoint is not connected]
[2017-07-17 11:09:21.847185] E [MSGID: 114031] [client-rpc-fops.c:1601:client3_3_finodelk_cbk] 0-vol_3-client-2: remote operation failed [Transport endpoint is not connected]
[2017-07-17 11:09:21.847246] I [MSGID: 108006] [afr-common.c:4874:afr_local_init] 0-vol_3-replicate-0: no subvolumes up
The message "I [MSGID: 108006] [afr-common.c:4874:afr_local_init] 0-vol_3-replicate-0: no subvolumes up" repeated 38282 times between [2017-07-17 11:09:21.847246] and [2017-07-17 11:09:30.576305]
[2017-07-17 11:09:30.576570] E [MSGID: 114058] [client-handshake.c:1537:client_query_portmap_cbk] 0-vol_3-client-0: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
[2017-07-17 11:09:30.576701] I [MSGID: 114018] [client.c:2280:client_rpc_notify] 0-vol_3-client-0: disconnected from vol_3-client-0. Client process will keep trying to connect to glusterd until brick's port is available
[2017-07-17 11:09:30.577094] I [MSGID: 108006] [afr-common.c:4874:afr_local_init] 0-vol_3-replicate-0: no subvolumes up
The message "I [MSGID: 108006] [afr-common.c:4874:afr_local_init] 0-vol_3-replicate-0: no subvolumes up" repeated 3457 times between [2017-07-17 11:09:30.577094] and [2017-07-17 11:09:32.579105]
[2017-07-17 11:09:32.579465] E [MSGID: 114058] [client-handshake.c:1537:client_query_portmap_cbk] 0-vol_3-client-2: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
[2017-07-17 11:09:32.579573] I [MSGID: 114018] [client.c:2280:client_rpc_notify] 0-vol_3-client-2: disconnected from vol_3-client-2. Client process will keep trying to connect to glusterd until brick's port is available
[2017-07-17 11:09:32.580279] I [MSGID: 108006] [afr-common.c:4874:afr_local_init] 0-vol_3-replicate-0: no subvolumes up
[2017-07-17 11:09:32.581463] I [MSGID: 108006] [afr-common.c:4874:afr_local_init] 0-vol_3-replicate-0: no subvolumes up
[2017-07-17 11:09:32.581700] E [MSGID: 114058] [client-handshake.c:1537:client_query_portmap_cbk] 0-vol_3-client-1: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
[2017-07-17 11:09:32.581782] I [MSGID: 114018] [client.c:2280:client_rpc_notify] 0-vol_3-client-1: disconnected from vol_3-client-1. Client process will keep trying to connect to glusterd until brick's port is available
[2017-07-17 11:09:32.582687] I [MSGID: 108006] [afr-common.c:4874:afr_local_init] 0-vol_3-replicate-0: no subvolumes up
The message "I [MSGID: 108006] [afr-common.c:4874:afr_local_init] 0-vol_3-replicate-0: no subvolumes up" repeated 61047 times between [2017-07-17 11:09:32.582687] and [2017-07-17 11:10:30.728966]
[2017-07-17 11:10:30.729836] I [MSGID: 108006] [afr-common.c:4874:afr_local_init] 0-vol_3-replicate-0: no subvolumes up



version:
3.8.4-33

Comment 4 Ravishankar N 2018-11-14 11:54:03 UTC
This behaviour seems to be due to write-behind returning success for the writes. I was able to reproduce a similar behaviour using a C program that does write(2) in a while loop. Even after killing all 3 bricks of a 1x3 volume, further write syscalls on the same fd succeed:
---------------------------------------------------------------------------------
(gdb) bt
#0  fuse_writev_cbk (frame=0x7fd9d8004780, cookie=0x7fd9d80042a0, this=0x12a2100, op_ret=17, op_errno=0, stbuf=0x7fd9e0a812a0, postbuf=0x7fd9e0a812a0, xdata=0x0) at fuse-bridge.c:2398
#1  0x00007fd9e1fa8cda in io_stats_writev_cbk (frame=0x7fd9d80042a0, cookie=0x7fd9d8004490, this=0x7fd9dc01aac0, op_ret=17, op_errno=0, prebuf=0x7fd9e0a812a0, postbuf=0x7fd9e0a812a0, 
    xdata=0x0) at io-stats.c:2107
#2  0x00007fd9e21d9192 in mdc_writev_cbk (frame=0x7fd9d8004490, cookie=0x7fd9d8003ad0, this=0x7fd9dc019480, op_ret=17, op_errno=0, prebuf=0x7fd9e0a812a0, postbuf=0x7fd9e0a812a0, xdata=0x0)
    at md-cache.c:2116
#3  0x00007fd9f0fdc121 in default_writev_cbk (frame=0x7fd9d8003ad0, cookie=0x7fd9d8003cc0, this=0x7fd9dc017e40, op_ret=17, op_errno=0, prebuf=0x7fd9e0a812a0, postbuf=0x7fd9e0a812a0, 
    xdata=0x0) at defaults.c:1242
#4  0x00007fd9f0fdc121 in default_writev_cbk (frame=0x7fd9d8003cc0, cookie=0x7fd9d8003dd0, this=0x7fd9dc016790, op_ret=17, op_errno=0, prebuf=0x7fd9e0a812a0, postbuf=0x7fd9e0a812a0, 
    xdata=0x0) at defaults.c:1242
#5  0x00007fd9e280a0bc in ioc_writev_cbk (frame=0x7fd9d8003dd0, cookie=0x7fd9d80020e0, this=0x7fd9dc0151c0, op_ret=17, op_errno=0, prebuf=0x7fd9e0a812a0, postbuf=0x7fd9e0a812a0, xdata=0x0)
    at io-cache.c:1234
#6  0x00007fd9e2a1f697 in ra_writev_cbk (frame=0x7fd9d80020e0, cookie=0x7fd9d8002300, this=0x7fd9dc013bf0, op_ret=17, op_errno=0, prebuf=0x7fd9e0a812a0, postbuf=0x7fd9e0a812a0, xdata=0x0)
    at read-ahead.c:656
#7  0x00007fd9e2c30029 in wb_do_unwinds (wb_inode=0x7fd9dc06a900, lies=0x7fd9e0a81380) at write-behind.c:1204
#8  0x00007fd9e2c315b4 in wb_process_queue (wb_inode=0x7fd9dc06a900) at write-behind.c:1707
#9  0x00007fd9e2c31f1c in wb_writev (frame=0x7fd9d8002300, this=0x7fd9dc012480, fd=0x7fd9d800ec50, vector=0x7fd9d80045c0, count=1, offset=211, flags=32769, iobref=0x7fd9d8003a00, 
    xdata=0x0) at write-behind.c:1822
#10 0x00007fd9e2a1fb4d in ra_writev (frame=0x7fd9d80020e0, this=0x7fd9dc013bf0, fd=0x7fd9d800ec50, vector=0x7fd9d80045c0, count=1, offset=211, flags=32769, iobref=0x7fd9d8003a00, 
    xdata=0x0) at read-ahead.c:684
#11 0x00007fd9e280a70e in ioc_writev (frame=0x7fd9d8003dd0, this=0x7fd9dc0151c0, fd=0x7fd9d800ec50, vector=0x7fd9d80045c0, count=1, offset=211, flags=32769, iobref=0x7fd9d8003a00, 
    xdata=0x0) at io-cache.c:1275
#12 0x00007fd9e25fac2a in qr_writev (frame=0x7fd9d8003cc0, this=0x7fd9dc016790, fd=0x7fd9d800ec50, iov=0x7fd9d80045c0, count=1, offset=211, flags=32769, iobref=0x7fd9d8003a00, xdata=0x0)
    at quick-read.c:639
#13 0x00007fd9f0fe5bd9 in default_writev_resume (frame=0x7fd9d8003ad0, this=0x7fd9dc017e40, fd=0x7fd9d800ec50, vector=0x7fd9d80045c0, count=1, off=211, flags=32769, iobref=0x7fd9d8003a00, 
    xdata=0x0) at defaults.c:1849
#14 0x00007fd9f0f4f601 in call_resume_wind (stub=0x7fd9d80193a0) at call-stub.c:2045
#15 0x00007fd9f0f5f3ec in call_resume (stub=0x7fd9d80193a0) at call-stub.c:2512
#16 0x00007fd9e23ed91b in open_and_resume (this=0x7fd9dc017e40, fd=0x7fd9d800ec50, stub=0x7fd9d80193a0) at open-behind.c:246
#17 0x00007fd9e23ee87c in ob_writev (frame=0x7fd9d8003ad0, this=0x7fd9dc017e40, fd=0x7fd9d800ec50, iov=0x7fd9d8009c90, count=1, offset=211, flags=32769, iobref=0x7fd9d8003a00, xdata=0x0)
    at open-behind.c:424
#18 0x00007fd9e21d94e7 in mdc_writev (frame=0x7fd9d8004490, this=0x7fd9dc019480, fd=0x7fd9d800ec50, vector=0x7fd9d8009c90, count=1, offset=211, flags=32769, iobref=0x7fd9d8003a00, 
    xdata=0x0) at md-cache.c:2134
#19 0x00007fd9e1fb88c0 in io_stats_writev (frame=0x7fd9d80042a0, this=0x7fd9dc01aac0, fd=0x7fd9d800ec50, vector=0x7fd9d8009c90, count=1, offset=211, flags=32769, iobref=0x7fd9d8003a00, 
    xdata=0x0) at io-stats.c:2955
#20 0x00007fd9f0fec869 in default_writev (frame=0x7fd9d80042a0, this=0x7fd9dc01c5b0, fd=0x7fd9d800ec50, vector=0x7fd9d8009c90, count=1, off=211, flags=32769, iobref=0x7fd9d8003a00, 
    xdata=0x0) at defaults.c:2543
#21 0x00007fd9e1d89957 in meta_writev (frame=0x7fd9d80042a0, this=0x7fd9dc01c5b0, fd=0x7fd9d800ec50, iov=0x7fd9d8009c90, count=1, offset=211, flags=32769, iobref=0x7fd9d8003a00, xdata=0x0)
    at meta.c:142
#22 0x00007fd9e8303687 in fuse_write_resume (state=0x7fd9d8009580) at fuse-bridge.c:2457
#23 0x00007fd9e82f72c7 in fuse_fop_resume (state=0x7fd9d8009580) at fuse-bridge.c:625
#24 0x00007fd9e82f4a2b in fuse_resolve_done (state=0x7fd9d8009580) at fuse-resolve.c:663
#25 0x00007fd9e82f4b01 in fuse_resolve_all (state=0x7fd9d8009580) at fuse-resolve.c:690
#26 0x00007fd9e82f4a0c in fuse_resolve (state=0x7fd9d8009580) at fuse-resolve.c:654
#27 0x00007fd9e82f4ad8 in fuse_resolve_all (state=0x7fd9d8009580) at fuse-resolve.c:686
#28 0x00007fd9e82f4b5f in fuse_resolve_continue (state=0x7fd9d8009580) at fuse-resolve.c:706
#29 0x00007fd9e82f47c5 in fuse_resolve_fd (state=0x7fd9d8009580) at fuse-resolve.c:566
#30 0x00007fd9e82f49ba in fuse_resolve (state=0x7fd9d8009580) at fuse-resolve.c:643
#31 0x00007fd9e82f4a83 in fuse_resolve_all (state=0x7fd9d8009580) at fuse-resolve.c:679
#32 0x00007fd9e82f4b9d in fuse_resolve_and_resume (state=0x7fd9d8009580, fn=0x7fd9e83030d5 <fuse_write_resume>) at fuse-resolve.c:718
#33 0x00007fd9e830387a in fuse_write (this=0x12a2100, finh=0x7fd9d80046a0, msg=0x7fd9ef22a000) at fuse-bridge.c:2509
#34 0x00007fd9e83105a2 in fuse_thread_proc (data=0x12a2100) at fuse-bridge.c:5083
#35 0x00007fd9efd7fe25 in start_thread () from /lib64/libpthread.so.0
#36 0x00007fd9ef648bad in clone () from /lib64/libc.so.6
--------------------------------------------------------------------------------
I think the guarantee write-behind gives is that it will fail the next fsync call, but I'm moving this to the write-behind component to confirm whether this is a bug that needs to be fixed.
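
For illustration, a minimal sketch of the kind of write(2)-in-a-loop reproducer described above. This is not the exact program used; the file path, payload, iteration count, and error handling are assumptions.
---------------------------------------------------------------------------------
/* Keep writing to a file on the FUSE mount while the bricks are killed
 * from another terminal. Per the observation above, write() may keep
 * returning success even after all bricks are down, because write-behind
 * has already acknowledged the data; the deferred error is expected to
 * surface only on the later fsync(). */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
        const char *path = "/mnt/vol_3/wb-test";   /* hypothetical path on the FUSE mount */
        const char buf[] = "write-behind test\n";
        int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
        int i;

        if (fd < 0) {
                perror("open");
                return 1;
        }

        for (i = 0; i < 600; i++) {
                ssize_t ret = write(fd, buf, sizeof(buf) - 1);
                printf("write %d: %zd (%s)\n", i, ret,
                       ret < 0 ? strerror(errno) : "ok");
                sleep(1);
        }

        /* If write-behind's guarantee is to fail the next fsync, the error
         * should show up here rather than on the write() calls above. */
        if (fsync(fd) < 0)
                perror("fsync");
        if (close(fd) < 0)
                perror("close");
        return 0;
}
---------------------------------------------------------------------------------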

