Bug 1397267

Summary: File creation fails with Input/output error + FUSE logs throws "invalid argument: inode [Invalid argument]"
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Prasad Desala <tdesala>
Component: rpcAssignee: Raghavendra Talur <rtalur>
Status: CLOSED ERRATA QA Contact: Prasad Desala <tdesala>
Severity: high Docs Contact:
Priority: unspecified    
Version: rhgs-3.2CC: amukherj, rhinduja, rhs-bugs, rjoseph, rtalur, storage-qa-internal
Target Milestone: ---   
Target Release: RHGS 3.2.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: glusterfs-3.8.4-7 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-03-23 06:20:30 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1385605    
Bug Blocks: 1351528    

Description Prasad Desala 2016-11-22 06:20:50 UTC
Description of problem:
======================
File creation fails with Input/output error.

[root@dhcpxx fuse]# touch script.sh
touch: cannot touch ‘script.sh’: Input/output error
[root@dhcpxxfuse]# touch s1.sh
touch: cannot touch ‘s1.sh’: Input/output error

Version-Release number of selected component (if applicable):
3.8.4-5.el7rhgs.x86_64

How reproducible:
================
only once

Steps to Reproduce:
===================
1) Create a Distributed-Disperse volume and start it.
2) FUSE mount the volume on multiple clients.
3) Create files and directories from mount point.

Few of the file creation are getting failed with IO error.

[root@dhcpxx fuse]# touch script.sh
touch: cannot touch ‘script.sh’: Input/output error
[root@dhcpxxfuse]# touch s1.sh
touch: cannot touch ‘s1.sh’: Input/output error

Actual results:
===============
File creation fails with Input/output error.

Expected results:
=================
File creation should be successful without any errors/issues.

Comment 8 Raghavendra Talur 2016-11-22 15:49:10 UTC
debugging info: Buggy reconnection log messages


grep  88-ecvol-client-6 mnt-fuse.log | grep 13:
[2016-11-21 13:23:32.415643] W [MSGID: 114063] [client-callback.c:110:client_cbk_cache_invalidation] 88-ecvol-client-6: XDR decode of cache_invalidation failed. [Operation not permitted]
[2016-11-21 13:23:32.416609] E [rpc-clnt.c:365:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x192)[0x7fabe46c55b2] (--> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7fabe448b75e] (--> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fabe448b86e] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x84)[0x7fabe448cfc4] (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x120)[0x7fabe448d8a0] ))))) 88-ecvol-client-6: forced unwinding frame type(GlusterFS 3.3) op(READDIRP(40)) called at 2016-11-21 13:23:32.405196 (xid=0x95934)
The message "W [MSGID: 114063] [client-callback.c:110:client_cbk_cache_invalidation] 88-ecvol-client-6: XDR decode of cache_invalidation failed. [Operation not permitted]" repeated 44 times between [2016-11-21 13:23:32.415643] and [2016-11-21 13:23:32.416219]
[2016-11-21 13:23:32.416638] W [MSGID: 114031] [client-rpc-fops.c:2648:client3_3_readdirp_cbk] 88-ecvol-client-6: remote operation failed [Transport endpoint is not connected]
[2016-11-21 13:23:32.419660] I [socket.c:3401:socket_submit_request] 88-ecvol-client-6: not connected (priv->connected = 0)
[2016-11-21 13:23:32.419687] W [rpc-clnt.c:1640:rpc_clnt_submit] 88-ecvol-client-6: failed to submit rpc-request (XID: 0x95935 Program: GlusterFS 3.3, ProgVers: 330, Proc: 42) to rpc-transport (ecvol-client-6)
[2016-11-21 13:23:32.420617] I [rpc-clnt.c:1947:rpc_clnt_reconfig] 88-ecvol-client-6: changing port to 49154 (from 0)
[2016-11-21 13:23:32.425115] I [MSGID: 114057] [client-handshake.c:1446:select_server_supported_programs] 88-ecvol-client-6: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2016-11-21 13:23:32.426228] I [MSGID: 114046] [client-handshake.c:1222:client_setvolume_cbk] 88-ecvol-client-6: Connected to ecvol-client-6, attached to remote volume '/bricks/brick2/b2'.
[2016-11-21 13:23:32.426256] I [MSGID: 114047] [client-handshake.c:1233:client_setvolume_cbk] 88-ecvol-client-6: Server and Client lk-version numbers are not same, reopening the fds
[2016-11-21 13:23:32.426271] I [MSGID: 114042] [client-handshake.c:1053:client_post_handshake] 88-ecvol-client-6: 2 fds open - Delaying child_up until they are re-opened
[2016-11-21 13:23:32.426744] I [MSGID: 114060] [client-handshake.c:817:client3_3_reopendir_cbk] 88-ecvol-client-6: reopendir on <gfid:00000000-0000-0000-0000-000000000001> succeeded (fd = 0)
[2016-11-21 13:23:32.426787] I [MSGID: 114060] [client-handshake.c:817:client3_3_reopendir_cbk] 88-ecvol-client-6: reopendir on <gfid:00000000-0000-0000-0000-000000000001> succeeded (fd = 1)
[2016-11-21 13:23:32.426816] I [MSGID: 114041] [client-handshake.c:675:client_child_up_reopen_done] 88-ecvol-client-6: last fd open'd/lock-self-heal'd - notifying CHILD-UP
[2016-11-21 13:23:32.427005] I [MSGID: 114035] [client-handshake.c:201:client_set_lk_version_cbk] 88-ecvol-client-6: Server lk version = 1
[2016-11-21 13:23:32.542870] I [MSGID: 114018] [client.c:2280:client_rpc_notify] 88-ecvol-client-6: disconnected from ecvol-client-6. Client process will keep trying to connect to glusterd until brick's port is available

Comment 9 Raghavendra Talur 2016-11-22 15:58:23 UTC
debugging info : logs from brick corresponding to reconnection time



[2016-11-21 12:02:47.652049] I [MSGID: 115029] [server-handshake.c:693:server_setvolume] 0-ecvol-server: accepted client from dhcp37-91.lab.eng.blr.redhat.com-20797-2016/11/11-09:54:36:745300-ecvol-client-6-88-0 (version: 3.8.4)
[2016-11-21 12:02:47.859194] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-ecvol-server: disconnecting connection from dhcp37-195.lab.eng.blr.redhat.com-3196-2016/11/21-12:02:31:75507-ecvol-client-6-0-0
[2016-11-21 12:02:47.859377] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-ecvol-server: Shutting down connection dhcp37-195.lab.eng.blr.redhat.com-3196-2016/11/21-12:02:31:75507-ecvol-client-6-0-0
[2016-11-21 12:02:47.939212] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-ecvol-server: disconnecting connection from dhcp37-44.lab.eng.blr.redhat.com-26617-2016/11/21-12:01:59:695317-ecvol-client-6-0-0
[2016-11-21 12:02:47.939474] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-ecvol-server: Shutting down connection dhcp37-44.lab.eng.blr.redhat.com-26617-2016/11/21-12:01:59:695317-ecvol-client-6-0-0
[2016-11-21 12:02:48.222659] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-ecvol-server: disconnecting connection from dhcp37-215.lab.eng.blr.redhat.com-27992-2016/11/21-12:01:54:633639-ecvol-client-6-0-0
[2016-11-21 12:02:48.222889] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-ecvol-server: Shutting down connection dhcp37-215.lab.eng.blr.redhat.com-27992-2016/11/21-12:01:54:633639-ecvol-client-6-0-0
[2016-11-21 12:02:49.073121] I [login.c:76:gf_auth] 0-auth/login: allowed user names: 14b2c0fb-f6bd-42c8-a2bb-0c6172269a28
[2016-11-21 12:02:49.073156] I [MSGID: 115029] [server-handshake.c:693:server_setvolume] 0-ecvol-server: accepted client from dhcp37-44.lab.eng.blr.redhat.com-1548-2016/11/16-06:04:23:72379-ecvol-client-6-40-0 (version: 3.8.4)
[2016-11-21 12:02:49.289578] I [socket.c:3401:socket_submit_request] 0-ecvol-quota: not connected (priv->connected = 0)
[2016-11-21 12:02:49.289606] W [rpc-clnt.c:1640:rpc_clnt_submit] 0-ecvol-quota: failed to submit rpc-request (XID: 0xdd Program: Quota enforcer, ProgVers: 1, Proc: 1) to rpc-transport (ecvol-quota)
[2016-11-21 12:02:49.289990] W [rpc-clnt.c:1640:rpc_clnt_submit] 0-ecvol-quota: failed to submit rpc-request (XID: 0xde Program: Quota enforcer, ProgVers: 1, Proc: 1) to rpc-transport (ecvol-quota)
[2016-11-21 12:02:49.300386] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-ecvol-server: disconnecting connection from dhcp37-124.lab.eng.blr.redhat.com-2385-2016/11/18-05:57:52:325256-ecvol-client-6-27-0
[2016-11-21 12:02:49.300556] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-ecvol-server: Shutting down connection dhcp37-124.lab.eng.blr.redhat.com-2385-2016/11/18-05:57:52:325256-ecvol-client-6-27-0
[2016-11-21 12:02:49.305852] W [rpc-clnt.c:1640:rpc_clnt_submit] 0-ecvol-quota: failed to submit rpc-request (XID: 0xdf Program: Quota enforcer, ProgVers: 1, Proc: 1) to rpc-transport (ecvol-quota)
[2016-11-21 12:02:49.306381] W [rpc-clnt.c:1640:rpc_clnt_submit] 0-ecvol-quota: failed to submit rpc-request (XID: 0xe0 Program: Quota enforcer, ProgVers: 1, Proc: 1) to rpc-transport (ecvol-quota)
[2016-11-21 12:02:49.306528] W [rpc-clnt.c:1640:rpc_clnt_submit] 0-ecvol-quota: failed to submit rpc-request (XID: 0xe1 Program: Quota enforcer, ProgVers: 1, Proc: 1) to rpc-transport (ecvol-quota)
[2016-11-21 12:02:49.312894] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-ecvol-server: disconnecting connection from dhcp37-190.lab.eng.blr.redhat.com-1675-2016/11/16-06:02:35:481106-ecvol-client-6-37-0
[2016-11-21 12:02:49.313043] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-ecvol-server: Shutting down connection dhcp37-190.lab.eng.blr.redhat.com-1675-2016/11/16-06:02:35:481106-ecvol-client-6-37-0
[2016-11-21 12:02:49.318829] W [rpc-clnt.c:1640:rpc_clnt_submit] 0-ecvol-quota: failed to submit rpc-request (XID: 0xe2 Program: Quota enforcer, ProgVers: 1, Proc: 1) to rpc-transport (ecvol-quota)
[2016-11-21 12:02:49.328065] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-ecvol-server: disconnecting connection from dhcp37-44.lab.eng.blr.redhat.com-1548-2016/11/16-06:04:23:72379-ecvol-client-6-37-0
[2016-11-21 12:02:49.328225] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-ecvol-server: Shutting down connection dhcp37-44.lab.eng.blr.redhat.com-1548-2016/11/16-06:04:23:72379-ecvol-client-6-37-0
[2016-11-21 12:02:49.334847] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-ecvol-server: disconnecting connection from dhcp37-89.lab.eng.blr.redhat.com-2380-2016/11/18-05:57:57:587746-ecvol-client-6-27-0
[2016-11-21 12:02:49.334998] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-ecvol-server: Shutting down connection dhcp37-89.lab.eng.blr.redhat.com-2380-2016/11/18-05:57:57:587746-ecvol-client-6-27-0
[2016-11-21 12:02:49.348553] I [login.c:76:gf_auth] 0-auth/login: allowed user names: 14b2c0fb-f6bd-42c8-a2bb-0c6172269a28
[2016-11-21 12:02:49.348582] I [MSGID: 115029] [server-handshake.c:693:server_setvolume] 0-ecvol-server: accepted client from dhcp37-215.lab.eng.blr.redhat.com-9076-2016/11/17-09:25:44:452854-ecvol-client-6-30-0 (version: 3.8.4)
[2016-11-21 12:02:49.362776] W [rpc-clnt.c:1640:rpc_clnt_submit] 0-ecvol-quota: failed to submit rpc-request (XID: 0xe3 Program: Quota enforcer, ProgVers: 1, Proc: 1) to rpc-transport (ecvol-quota)
[2016-11-21 12:02:49.530562] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-ecvol-server: disconnecting connection from dhcp37-215.lab.eng.blr.redhat.com-9076-2016/11/17-09:25:44:452854-ecvol-client-6-27-0
[2016-11-21 12:02:49.530757] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-ecvol-server: Shutting down connection dhcp37-215.lab.eng.blr.redhat.com-9076-2016/11/17-09:25:44:452854-ecvol-client-6-27-0
[2016-11-21 12:02:49.759795] I [login.c:76:gf_auth] 0-auth/login: allowed user names: 14b2c0fb-f6bd-42c8-a2bb-0c6172269a28
[2016-11-21 12:02:49.759836] I [MSGID: 115029] [server-handshake.c:693:server_setvolume] 0-ecvol-server: accepted client from dhcp37-190.lab.eng.blr.redhat.com-29323-2016/11/21-12:02:45:686547-ecvol-client-6-0-0 (version: 3.8.4)
[2016-11-21 12:02:49.770000] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-ecvol-server: disconnecting connection from dhcp37-91.lab.eng.blr.redhat.com-20797-2016/11/11-09:54:36:745300-ecvol-client-6-85-0
[2016-11-21 12:02:49.770101] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-ecvol-server: Shutting down connection dhcp37-91.lab.eng.blr.redhat.com-20797-2016/11/11-09:54:36:745300-ecvol-client-6-85-0
[2016-11-21 12:02:50.773389] I [login.c:76:gf_auth] 0-auth/login: allowed user names: 14b2c0fb-f6bd-42c8-a2bb-0c6172269a28
[2016-11-21 12:02:50.773425] I [MSGID: 115029] [server-handshake.c:693:server_setvolume] 0-ecvol-server: accepted client from dhcp37-190.lab.eng.blr.redhat.com-29331-2016/11/21-12:02:46:695123-ecvol-client-6-0-0 (version: 3.8.4)
[2016-11-21 12:02:50.798089] W [rpc-clnt.c:1640:rpc_clnt_submit] 0-ecvol-quota: failed to submit rpc-request (XID: 0xe4 Program: Quota enforcer, ProgVers: 1, Proc: 1) to rpc-transport (ecvol-quota)
[2016-11-21 12:02:51.914779] I [login.c:76:gf_auth] 0-auth/login: allowed user names: 14b2c0fb-f6bd-42c8-a2bb-0c6172269a28
[2016-11-21 12:02:51.914816] I [MSGID: 115029] [server-handshake.c:693:server_setvolume] 0-ecvol-server: accepted client from dhcp37-195.lab.eng.blr.redhat.com-3234-2016/11/21-12:02:47:834114-ecvol-client-6-0-0 (version: 3.8.4)
[2016-11-21 12:02:52.015092] I [login.c:76:gf_auth] 0-auth/login: allowed user names: 14b2c0fb-f6bd-42c8-a2bb-0c6172269a28
[2016-11-21 12:02:52.015144] I [MSGID: 115029] [server-handshake.c:693:server_setvolume] 0-ecvol-server: accepted client from dhcp37-44.lab.eng.blr.redhat.com-26679-2016/11/21-12:02:47:880064-ecvol-client-6-0-0 (version: 3.8.4)
[2016-11-21 12:02:52.291714] I [login.c:76:gf_auth] 0-auth/login: allowed user names: 14b2c0fb-f6bd-42c8-a2bb-0c6172269a28
[2016-11-21 12:02:52.291747] I [MSGID: 115029] [server-handshake.c:693:server_setvolume] 0-ecvol-server: accepted client from dhcp37-215.lab.eng.blr.redhat.com-28080-2016/11/21-12:02:48:192571-ecvol-client-6-0-0 (version: 3.8.4)
[2016-11-21 12:02:52.931139] I [login.c:76:gf_auth] 0-auth/login: allowed user names: 14b2c0fb-f6bd-42c8-a2bb-0c6172269a28
[2016-11-21 12:02:52.931189] I [MSGID: 115029] [server-handshake.c:693:server_setvolume] 0-ecvol-server: accepted client from dhcp37-195.lab.eng.blr.redhat.com-3242-2016/11/21-12:02:48:843832-ecvol-client-6-0-0 (version: 3.8.4)
[2016-11-21 12:02:53.016233] I [login.c:76:gf_auth] 0-auth/login: allowed user names: 14b2c0fb-f6bd-42c8-a2bb-0c6172269a28
[2016-11-21 12:02:53.016265] I [MSGID: 115029] [server-handshake.c:693:server_setvolume] 0-ecvol-server: accepted client from dhcp37-44.lab.eng.blr.redhat.com-26687-2016/11/21-12:02:48:905484-ecvol-client-6-0-0 (version: 3.8.4)
[2016-11-21 12:02:53.314719] I [login.c:76:gf_auth] 0-auth/login: allowed user names: 14b2c0fb-f6bd-42c8-a2bb-0c6172269a28
[2016-11-21 12:02:53.314761] I [MSGID: 115029] [server-handshake.c:693:server_setvolume] 0-ecvol-server: accepted client from dhcp37-215.lab.eng.blr.redhat.com-28088-2016/11/21-12:02:49:212431-ecvol-client-6-0-0 (version: 3.8.4)
[2016-11-21 12:04:08.626761] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-ecvol-server: disconnecting connection from dhcp37-91.lab.eng.blr.redhat.com-20797-2016/11/11-09:54:36:745300-ecvol-client-6-82-0
[2016-11-21 12:04:08.626943] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-ecvol-server: Shutting down connection dhcp37-91.lab.eng.blr.redhat.com-20797-2016/11/11-09:54:36:745300-ecvol-client-6-82-0
[2016-11-21 12:04:59.797253] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-ecvol-server: disconnecting connection from dhcp37-211.lab.eng.blr.redhat.com-2384-2016/11/18-05:57:55:83077-ecvol-client-6-27-0
[2016-11-21 12:04:59.797468] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-ecvol-server: Shutting down connection dhcp37-211.lab.eng.blr.redhat.com-2384-2016/11/18-05:57:55:83077-ecvol-client-6-27-0
[2016-11-21 12:06:12.950980] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-ecvol-server: disconnecting connection from dhcp37-124.lab.eng.blr.redhat.com-2385-2016/11/18-05:57:52:325256-ecvol-client-6-24-0
[2016-11-21 12:06:12.951132] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-ecvol-server: Shutting down connection dhcp37-124.lab.eng.blr.redhat.com-2385-2016/11/18-05:57:52:325256-ecvol-client-6-24-0
[2016-11-21 12:27:37.778472] E [marker-quota.c:728:_mq_get_metadata] 0-ecvol-marker: dict_get failed.
[2016-11-21 13:23:18.581409] E [upcall-internal.c:570:upcall_cache_invalidate] (-->/usr/lib64/glusterfs/3.8.4/xlator/features/access-control.so(+0xad79) [0x7f68e6639d79] -->/usr/lib64/glusterfs/3.8.4/xlator/features/locks.so(+0xd4f2) [0x7f68e64184f2] -->/usr/lib64/glusterfs/3.8.4/xlator/features/upcall.so(+0x5b3e) [0x7f68e5bccb3e] ) 0-upcall_cache_invalidate: invalid argument: !(gf_uuid_is_null (up_inode_ctx->gfid)) [Invalid argument]
[2016-11-21 13:23:32.442956] I [MSGID: 115029] [server-handshake.c:693:server_setvolume] 0-ecvol-server: accepted client from dhcp37-91.lab.eng.blr.redhat.com-20797-2016/11/11-09:54:36:745300-ecvol-client-6-88-1 (version: 3.8.4)
[2016-11-21 13:23:33.684262] W [socket.c:590:__socket_rwv] 0-tcp.ecvol-server: writev on 10.70.37.91:880 failed (Connection reset by peer)
[2016-11-21 13:23:33.684370] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-ecvol-server: disconnecting connection from dhcp37-91.lab.eng.blr.redhat.com-20797-2016/11/11-09:54:36:745300-ecvol-client-6-88-0
[2016-11-21 13:23:33.684415] I [MSGID: 115013] [server-helpers.c:296:do_fd_cleanup] 0-ecvol-server: fd cleanup on /
[2016-11-21 13:23:33.684751] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-ecvol-server: Shutting down connection dhcp37-91.lab.eng.blr.redhat.com-20797-2016/11/11-09:54:36:745300-ecvol-client-6-88-0

Comment 10 Raghavendra Talur 2016-11-23 13:30:45 UTC
Patch posted at http://review.gluster.org/#/c/15916

Comment 14 Prasad Desala 2017-01-16 09:17:28 UTC
This issue is not reproducible on glusterfs version 3.8.4-11.el7.x86_64. Moving this BZ to Verified.

Comment 16 errata-xmlrpc 2017-03-23 06:20:30 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2017-0486.html