Description of problem: ====================== File creation fails with Input/output error. [root@dhcpxx fuse]# touch script.sh touch: cannot touch ‘script.sh’: Input/output error [root@dhcpxxfuse]# touch s1.sh touch: cannot touch ‘s1.sh’: Input/output error Version-Release number of selected component (if applicable): 3.8.4-5.el7rhgs.x86_64 How reproducible: ================ only once Steps to Reproduce: =================== 1) Create a Distributed-Disperse volume and start it. 2) FUSE mount the volume on multiple clients. 3) Create files and directories from mount point. Few of the file creation are getting failed with IO error. [root@dhcpxx fuse]# touch script.sh touch: cannot touch ‘script.sh’: Input/output error [root@dhcpxxfuse]# touch s1.sh touch: cannot touch ‘s1.sh’: Input/output error Actual results: =============== File creation fails with Input/output error. Expected results: ================= File creation should be successful without any errors/issues.
debugging info: Buggy reconnection log messages grep 88-ecvol-client-6 mnt-fuse.log | grep 13: [2016-11-21 13:23:32.415643] W [MSGID: 114063] [client-callback.c:110:client_cbk_cache_invalidation] 88-ecvol-client-6: XDR decode of cache_invalidation failed. [Operation not permitted] [2016-11-21 13:23:32.416609] E [rpc-clnt.c:365:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x192)[0x7fabe46c55b2] (--> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7fabe448b75e] (--> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fabe448b86e] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x84)[0x7fabe448cfc4] (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x120)[0x7fabe448d8a0] ))))) 88-ecvol-client-6: forced unwinding frame type(GlusterFS 3.3) op(READDIRP(40)) called at 2016-11-21 13:23:32.405196 (xid=0x95934) The message "W [MSGID: 114063] [client-callback.c:110:client_cbk_cache_invalidation] 88-ecvol-client-6: XDR decode of cache_invalidation failed. [Operation not permitted]" repeated 44 times between [2016-11-21 13:23:32.415643] and [2016-11-21 13:23:32.416219] [2016-11-21 13:23:32.416638] W [MSGID: 114031] [client-rpc-fops.c:2648:client3_3_readdirp_cbk] 88-ecvol-client-6: remote operation failed [Transport endpoint is not connected] [2016-11-21 13:23:32.419660] I [socket.c:3401:socket_submit_request] 88-ecvol-client-6: not connected (priv->connected = 0) [2016-11-21 13:23:32.419687] W [rpc-clnt.c:1640:rpc_clnt_submit] 88-ecvol-client-6: failed to submit rpc-request (XID: 0x95935 Program: GlusterFS 3.3, ProgVers: 330, Proc: 42) to rpc-transport (ecvol-client-6) [2016-11-21 13:23:32.420617] I [rpc-clnt.c:1947:rpc_clnt_reconfig] 88-ecvol-client-6: changing port to 49154 (from 0) [2016-11-21 13:23:32.425115] I [MSGID: 114057] [client-handshake.c:1446:select_server_supported_programs] 88-ecvol-client-6: Using Program GlusterFS 3.3, Num (1298437), Version (330) [2016-11-21 13:23:32.426228] I [MSGID: 114046] [client-handshake.c:1222:client_setvolume_cbk] 88-ecvol-client-6: Connected to ecvol-client-6, attached to remote volume '/bricks/brick2/b2'. [2016-11-21 13:23:32.426256] I [MSGID: 114047] [client-handshake.c:1233:client_setvolume_cbk] 88-ecvol-client-6: Server and Client lk-version numbers are not same, reopening the fds [2016-11-21 13:23:32.426271] I [MSGID: 114042] [client-handshake.c:1053:client_post_handshake] 88-ecvol-client-6: 2 fds open - Delaying child_up until they are re-opened [2016-11-21 13:23:32.426744] I [MSGID: 114060] [client-handshake.c:817:client3_3_reopendir_cbk] 88-ecvol-client-6: reopendir on <gfid:00000000-0000-0000-0000-000000000001> succeeded (fd = 0) [2016-11-21 13:23:32.426787] I [MSGID: 114060] [client-handshake.c:817:client3_3_reopendir_cbk] 88-ecvol-client-6: reopendir on <gfid:00000000-0000-0000-0000-000000000001> succeeded (fd = 1) [2016-11-21 13:23:32.426816] I [MSGID: 114041] [client-handshake.c:675:client_child_up_reopen_done] 88-ecvol-client-6: last fd open'd/lock-self-heal'd - notifying CHILD-UP [2016-11-21 13:23:32.427005] I [MSGID: 114035] [client-handshake.c:201:client_set_lk_version_cbk] 88-ecvol-client-6: Server lk version = 1 [2016-11-21 13:23:32.542870] I [MSGID: 114018] [client.c:2280:client_rpc_notify] 88-ecvol-client-6: disconnected from ecvol-client-6. Client process will keep trying to connect to glusterd until brick's port is available
debugging info : logs from brick corresponding to reconnection time [2016-11-21 12:02:47.652049] I [MSGID: 115029] [server-handshake.c:693:server_setvolume] 0-ecvol-server: accepted client from dhcp37-91.lab.eng.blr.redhat.com-20797-2016/11/11-09:54:36:745300-ecvol-client-6-88-0 (version: 3.8.4) [2016-11-21 12:02:47.859194] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-ecvol-server: disconnecting connection from dhcp37-195.lab.eng.blr.redhat.com-3196-2016/11/21-12:02:31:75507-ecvol-client-6-0-0 [2016-11-21 12:02:47.859377] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-ecvol-server: Shutting down connection dhcp37-195.lab.eng.blr.redhat.com-3196-2016/11/21-12:02:31:75507-ecvol-client-6-0-0 [2016-11-21 12:02:47.939212] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-ecvol-server: disconnecting connection from dhcp37-44.lab.eng.blr.redhat.com-26617-2016/11/21-12:01:59:695317-ecvol-client-6-0-0 [2016-11-21 12:02:47.939474] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-ecvol-server: Shutting down connection dhcp37-44.lab.eng.blr.redhat.com-26617-2016/11/21-12:01:59:695317-ecvol-client-6-0-0 [2016-11-21 12:02:48.222659] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-ecvol-server: disconnecting connection from dhcp37-215.lab.eng.blr.redhat.com-27992-2016/11/21-12:01:54:633639-ecvol-client-6-0-0 [2016-11-21 12:02:48.222889] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-ecvol-server: Shutting down connection dhcp37-215.lab.eng.blr.redhat.com-27992-2016/11/21-12:01:54:633639-ecvol-client-6-0-0 [2016-11-21 12:02:49.073121] I [login.c:76:gf_auth] 0-auth/login: allowed user names: 14b2c0fb-f6bd-42c8-a2bb-0c6172269a28 [2016-11-21 12:02:49.073156] I [MSGID: 115029] [server-handshake.c:693:server_setvolume] 0-ecvol-server: accepted client from dhcp37-44.lab.eng.blr.redhat.com-1548-2016/11/16-06:04:23:72379-ecvol-client-6-40-0 (version: 3.8.4) [2016-11-21 12:02:49.289578] I [socket.c:3401:socket_submit_request] 0-ecvol-quota: not connected (priv->connected = 0) [2016-11-21 12:02:49.289606] W [rpc-clnt.c:1640:rpc_clnt_submit] 0-ecvol-quota: failed to submit rpc-request (XID: 0xdd Program: Quota enforcer, ProgVers: 1, Proc: 1) to rpc-transport (ecvol-quota) [2016-11-21 12:02:49.289990] W [rpc-clnt.c:1640:rpc_clnt_submit] 0-ecvol-quota: failed to submit rpc-request (XID: 0xde Program: Quota enforcer, ProgVers: 1, Proc: 1) to rpc-transport (ecvol-quota) [2016-11-21 12:02:49.300386] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-ecvol-server: disconnecting connection from dhcp37-124.lab.eng.blr.redhat.com-2385-2016/11/18-05:57:52:325256-ecvol-client-6-27-0 [2016-11-21 12:02:49.300556] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-ecvol-server: Shutting down connection dhcp37-124.lab.eng.blr.redhat.com-2385-2016/11/18-05:57:52:325256-ecvol-client-6-27-0 [2016-11-21 12:02:49.305852] W [rpc-clnt.c:1640:rpc_clnt_submit] 0-ecvol-quota: failed to submit rpc-request (XID: 0xdf Program: Quota enforcer, ProgVers: 1, Proc: 1) to rpc-transport (ecvol-quota) [2016-11-21 12:02:49.306381] W [rpc-clnt.c:1640:rpc_clnt_submit] 0-ecvol-quota: failed to submit rpc-request (XID: 0xe0 Program: Quota enforcer, ProgVers: 1, Proc: 1) to rpc-transport (ecvol-quota) [2016-11-21 12:02:49.306528] W [rpc-clnt.c:1640:rpc_clnt_submit] 0-ecvol-quota: failed to submit rpc-request (XID: 0xe1 Program: Quota enforcer, ProgVers: 1, Proc: 1) to rpc-transport (ecvol-quota) [2016-11-21 12:02:49.312894] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-ecvol-server: disconnecting connection from dhcp37-190.lab.eng.blr.redhat.com-1675-2016/11/16-06:02:35:481106-ecvol-client-6-37-0 [2016-11-21 12:02:49.313043] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-ecvol-server: Shutting down connection dhcp37-190.lab.eng.blr.redhat.com-1675-2016/11/16-06:02:35:481106-ecvol-client-6-37-0 [2016-11-21 12:02:49.318829] W [rpc-clnt.c:1640:rpc_clnt_submit] 0-ecvol-quota: failed to submit rpc-request (XID: 0xe2 Program: Quota enforcer, ProgVers: 1, Proc: 1) to rpc-transport (ecvol-quota) [2016-11-21 12:02:49.328065] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-ecvol-server: disconnecting connection from dhcp37-44.lab.eng.blr.redhat.com-1548-2016/11/16-06:04:23:72379-ecvol-client-6-37-0 [2016-11-21 12:02:49.328225] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-ecvol-server: Shutting down connection dhcp37-44.lab.eng.blr.redhat.com-1548-2016/11/16-06:04:23:72379-ecvol-client-6-37-0 [2016-11-21 12:02:49.334847] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-ecvol-server: disconnecting connection from dhcp37-89.lab.eng.blr.redhat.com-2380-2016/11/18-05:57:57:587746-ecvol-client-6-27-0 [2016-11-21 12:02:49.334998] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-ecvol-server: Shutting down connection dhcp37-89.lab.eng.blr.redhat.com-2380-2016/11/18-05:57:57:587746-ecvol-client-6-27-0 [2016-11-21 12:02:49.348553] I [login.c:76:gf_auth] 0-auth/login: allowed user names: 14b2c0fb-f6bd-42c8-a2bb-0c6172269a28 [2016-11-21 12:02:49.348582] I [MSGID: 115029] [server-handshake.c:693:server_setvolume] 0-ecvol-server: accepted client from dhcp37-215.lab.eng.blr.redhat.com-9076-2016/11/17-09:25:44:452854-ecvol-client-6-30-0 (version: 3.8.4) [2016-11-21 12:02:49.362776] W [rpc-clnt.c:1640:rpc_clnt_submit] 0-ecvol-quota: failed to submit rpc-request (XID: 0xe3 Program: Quota enforcer, ProgVers: 1, Proc: 1) to rpc-transport (ecvol-quota) [2016-11-21 12:02:49.530562] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-ecvol-server: disconnecting connection from dhcp37-215.lab.eng.blr.redhat.com-9076-2016/11/17-09:25:44:452854-ecvol-client-6-27-0 [2016-11-21 12:02:49.530757] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-ecvol-server: Shutting down connection dhcp37-215.lab.eng.blr.redhat.com-9076-2016/11/17-09:25:44:452854-ecvol-client-6-27-0 [2016-11-21 12:02:49.759795] I [login.c:76:gf_auth] 0-auth/login: allowed user names: 14b2c0fb-f6bd-42c8-a2bb-0c6172269a28 [2016-11-21 12:02:49.759836] I [MSGID: 115029] [server-handshake.c:693:server_setvolume] 0-ecvol-server: accepted client from dhcp37-190.lab.eng.blr.redhat.com-29323-2016/11/21-12:02:45:686547-ecvol-client-6-0-0 (version: 3.8.4) [2016-11-21 12:02:49.770000] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-ecvol-server: disconnecting connection from dhcp37-91.lab.eng.blr.redhat.com-20797-2016/11/11-09:54:36:745300-ecvol-client-6-85-0 [2016-11-21 12:02:49.770101] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-ecvol-server: Shutting down connection dhcp37-91.lab.eng.blr.redhat.com-20797-2016/11/11-09:54:36:745300-ecvol-client-6-85-0 [2016-11-21 12:02:50.773389] I [login.c:76:gf_auth] 0-auth/login: allowed user names: 14b2c0fb-f6bd-42c8-a2bb-0c6172269a28 [2016-11-21 12:02:50.773425] I [MSGID: 115029] [server-handshake.c:693:server_setvolume] 0-ecvol-server: accepted client from dhcp37-190.lab.eng.blr.redhat.com-29331-2016/11/21-12:02:46:695123-ecvol-client-6-0-0 (version: 3.8.4) [2016-11-21 12:02:50.798089] W [rpc-clnt.c:1640:rpc_clnt_submit] 0-ecvol-quota: failed to submit rpc-request (XID: 0xe4 Program: Quota enforcer, ProgVers: 1, Proc: 1) to rpc-transport (ecvol-quota) [2016-11-21 12:02:51.914779] I [login.c:76:gf_auth] 0-auth/login: allowed user names: 14b2c0fb-f6bd-42c8-a2bb-0c6172269a28 [2016-11-21 12:02:51.914816] I [MSGID: 115029] [server-handshake.c:693:server_setvolume] 0-ecvol-server: accepted client from dhcp37-195.lab.eng.blr.redhat.com-3234-2016/11/21-12:02:47:834114-ecvol-client-6-0-0 (version: 3.8.4) [2016-11-21 12:02:52.015092] I [login.c:76:gf_auth] 0-auth/login: allowed user names: 14b2c0fb-f6bd-42c8-a2bb-0c6172269a28 [2016-11-21 12:02:52.015144] I [MSGID: 115029] [server-handshake.c:693:server_setvolume] 0-ecvol-server: accepted client from dhcp37-44.lab.eng.blr.redhat.com-26679-2016/11/21-12:02:47:880064-ecvol-client-6-0-0 (version: 3.8.4) [2016-11-21 12:02:52.291714] I [login.c:76:gf_auth] 0-auth/login: allowed user names: 14b2c0fb-f6bd-42c8-a2bb-0c6172269a28 [2016-11-21 12:02:52.291747] I [MSGID: 115029] [server-handshake.c:693:server_setvolume] 0-ecvol-server: accepted client from dhcp37-215.lab.eng.blr.redhat.com-28080-2016/11/21-12:02:48:192571-ecvol-client-6-0-0 (version: 3.8.4) [2016-11-21 12:02:52.931139] I [login.c:76:gf_auth] 0-auth/login: allowed user names: 14b2c0fb-f6bd-42c8-a2bb-0c6172269a28 [2016-11-21 12:02:52.931189] I [MSGID: 115029] [server-handshake.c:693:server_setvolume] 0-ecvol-server: accepted client from dhcp37-195.lab.eng.blr.redhat.com-3242-2016/11/21-12:02:48:843832-ecvol-client-6-0-0 (version: 3.8.4) [2016-11-21 12:02:53.016233] I [login.c:76:gf_auth] 0-auth/login: allowed user names: 14b2c0fb-f6bd-42c8-a2bb-0c6172269a28 [2016-11-21 12:02:53.016265] I [MSGID: 115029] [server-handshake.c:693:server_setvolume] 0-ecvol-server: accepted client from dhcp37-44.lab.eng.blr.redhat.com-26687-2016/11/21-12:02:48:905484-ecvol-client-6-0-0 (version: 3.8.4) [2016-11-21 12:02:53.314719] I [login.c:76:gf_auth] 0-auth/login: allowed user names: 14b2c0fb-f6bd-42c8-a2bb-0c6172269a28 [2016-11-21 12:02:53.314761] I [MSGID: 115029] [server-handshake.c:693:server_setvolume] 0-ecvol-server: accepted client from dhcp37-215.lab.eng.blr.redhat.com-28088-2016/11/21-12:02:49:212431-ecvol-client-6-0-0 (version: 3.8.4) [2016-11-21 12:04:08.626761] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-ecvol-server: disconnecting connection from dhcp37-91.lab.eng.blr.redhat.com-20797-2016/11/11-09:54:36:745300-ecvol-client-6-82-0 [2016-11-21 12:04:08.626943] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-ecvol-server: Shutting down connection dhcp37-91.lab.eng.blr.redhat.com-20797-2016/11/11-09:54:36:745300-ecvol-client-6-82-0 [2016-11-21 12:04:59.797253] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-ecvol-server: disconnecting connection from dhcp37-211.lab.eng.blr.redhat.com-2384-2016/11/18-05:57:55:83077-ecvol-client-6-27-0 [2016-11-21 12:04:59.797468] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-ecvol-server: Shutting down connection dhcp37-211.lab.eng.blr.redhat.com-2384-2016/11/18-05:57:55:83077-ecvol-client-6-27-0 [2016-11-21 12:06:12.950980] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-ecvol-server: disconnecting connection from dhcp37-124.lab.eng.blr.redhat.com-2385-2016/11/18-05:57:52:325256-ecvol-client-6-24-0 [2016-11-21 12:06:12.951132] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-ecvol-server: Shutting down connection dhcp37-124.lab.eng.blr.redhat.com-2385-2016/11/18-05:57:52:325256-ecvol-client-6-24-0 [2016-11-21 12:27:37.778472] E [marker-quota.c:728:_mq_get_metadata] 0-ecvol-marker: dict_get failed. [2016-11-21 13:23:18.581409] E [upcall-internal.c:570:upcall_cache_invalidate] (-->/usr/lib64/glusterfs/3.8.4/xlator/features/access-control.so(+0xad79) [0x7f68e6639d79] -->/usr/lib64/glusterfs/3.8.4/xlator/features/locks.so(+0xd4f2) [0x7f68e64184f2] -->/usr/lib64/glusterfs/3.8.4/xlator/features/upcall.so(+0x5b3e) [0x7f68e5bccb3e] ) 0-upcall_cache_invalidate: invalid argument: !(gf_uuid_is_null (up_inode_ctx->gfid)) [Invalid argument] [2016-11-21 13:23:32.442956] I [MSGID: 115029] [server-handshake.c:693:server_setvolume] 0-ecvol-server: accepted client from dhcp37-91.lab.eng.blr.redhat.com-20797-2016/11/11-09:54:36:745300-ecvol-client-6-88-1 (version: 3.8.4) [2016-11-21 13:23:33.684262] W [socket.c:590:__socket_rwv] 0-tcp.ecvol-server: writev on 10.70.37.91:880 failed (Connection reset by peer) [2016-11-21 13:23:33.684370] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-ecvol-server: disconnecting connection from dhcp37-91.lab.eng.blr.redhat.com-20797-2016/11/11-09:54:36:745300-ecvol-client-6-88-0 [2016-11-21 13:23:33.684415] I [MSGID: 115013] [server-helpers.c:296:do_fd_cleanup] 0-ecvol-server: fd cleanup on / [2016-11-21 13:23:33.684751] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-ecvol-server: Shutting down connection dhcp37-91.lab.eng.blr.redhat.com-20797-2016/11/11-09:54:36:745300-ecvol-client-6-88-0
Patch posted at http://review.gluster.org/#/c/15916
This issue is not reproducible on glusterfs version 3.8.4-11.el7.x86_64. Moving this BZ to Verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2017-0486.html