Bug 1397267 - File creation fails with Input/output error + FUSE logs throws "invalid argument: inode [Invalid argument]"
Summary: File creation fails with Input/output error + FUSE logs throws "invalid argum...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: rpc
Version: rhgs-3.2
Hardware: Unspecified
OS: Unspecified
unspecified
high
Target Milestone: ---
: RHGS 3.2.0
Assignee: Raghavendra Talur
QA Contact: Prasad Desala
URL:
Whiteboard:
Depends On: 1385605
Blocks: 1351528
TreeView+ depends on / blocked
 
Reported: 2016-11-22 06:20 UTC by Prasad Desala
Modified: 2017-03-23 06:20 UTC (History)
6 users (show)

Fixed In Version: glusterfs-3.8.4-7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-03-23 06:20:30 UTC
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2017:0486 0 normal SHIPPED_LIVE Moderate: Red Hat Gluster Storage 3.2.0 security, bug fix, and enhancement update 2017-03-23 09:18:45 UTC

Description Prasad Desala 2016-11-22 06:20:50 UTC
Description of problem:
======================
File creation fails with Input/output error.

[root@dhcpxx fuse]# touch script.sh
touch: cannot touch ‘script.sh’: Input/output error
[root@dhcpxxfuse]# touch s1.sh
touch: cannot touch ‘s1.sh’: Input/output error

Version-Release number of selected component (if applicable):
3.8.4-5.el7rhgs.x86_64

How reproducible:
================
only once

Steps to Reproduce:
===================
1) Create a Distributed-Disperse volume and start it.
2) FUSE mount the volume on multiple clients.
3) Create files and directories from mount point.

Few of the file creation are getting failed with IO error.

[root@dhcpxx fuse]# touch script.sh
touch: cannot touch ‘script.sh’: Input/output error
[root@dhcpxxfuse]# touch s1.sh
touch: cannot touch ‘s1.sh’: Input/output error

Actual results:
===============
File creation fails with Input/output error.

Expected results:
=================
File creation should be successful without any errors/issues.

Comment 8 Raghavendra Talur 2016-11-22 15:49:10 UTC
debugging info: Buggy reconnection log messages


grep  88-ecvol-client-6 mnt-fuse.log | grep 13:
[2016-11-21 13:23:32.415643] W [MSGID: 114063] [client-callback.c:110:client_cbk_cache_invalidation] 88-ecvol-client-6: XDR decode of cache_invalidation failed. [Operation not permitted]
[2016-11-21 13:23:32.416609] E [rpc-clnt.c:365:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x192)[0x7fabe46c55b2] (--> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7fabe448b75e] (--> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fabe448b86e] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x84)[0x7fabe448cfc4] (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x120)[0x7fabe448d8a0] ))))) 88-ecvol-client-6: forced unwinding frame type(GlusterFS 3.3) op(READDIRP(40)) called at 2016-11-21 13:23:32.405196 (xid=0x95934)
The message "W [MSGID: 114063] [client-callback.c:110:client_cbk_cache_invalidation] 88-ecvol-client-6: XDR decode of cache_invalidation failed. [Operation not permitted]" repeated 44 times between [2016-11-21 13:23:32.415643] and [2016-11-21 13:23:32.416219]
[2016-11-21 13:23:32.416638] W [MSGID: 114031] [client-rpc-fops.c:2648:client3_3_readdirp_cbk] 88-ecvol-client-6: remote operation failed [Transport endpoint is not connected]
[2016-11-21 13:23:32.419660] I [socket.c:3401:socket_submit_request] 88-ecvol-client-6: not connected (priv->connected = 0)
[2016-11-21 13:23:32.419687] W [rpc-clnt.c:1640:rpc_clnt_submit] 88-ecvol-client-6: failed to submit rpc-request (XID: 0x95935 Program: GlusterFS 3.3, ProgVers: 330, Proc: 42) to rpc-transport (ecvol-client-6)
[2016-11-21 13:23:32.420617] I [rpc-clnt.c:1947:rpc_clnt_reconfig] 88-ecvol-client-6: changing port to 49154 (from 0)
[2016-11-21 13:23:32.425115] I [MSGID: 114057] [client-handshake.c:1446:select_server_supported_programs] 88-ecvol-client-6: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2016-11-21 13:23:32.426228] I [MSGID: 114046] [client-handshake.c:1222:client_setvolume_cbk] 88-ecvol-client-6: Connected to ecvol-client-6, attached to remote volume '/bricks/brick2/b2'.
[2016-11-21 13:23:32.426256] I [MSGID: 114047] [client-handshake.c:1233:client_setvolume_cbk] 88-ecvol-client-6: Server and Client lk-version numbers are not same, reopening the fds
[2016-11-21 13:23:32.426271] I [MSGID: 114042] [client-handshake.c:1053:client_post_handshake] 88-ecvol-client-6: 2 fds open - Delaying child_up until they are re-opened
[2016-11-21 13:23:32.426744] I [MSGID: 114060] [client-handshake.c:817:client3_3_reopendir_cbk] 88-ecvol-client-6: reopendir on <gfid:00000000-0000-0000-0000-000000000001> succeeded (fd = 0)
[2016-11-21 13:23:32.426787] I [MSGID: 114060] [client-handshake.c:817:client3_3_reopendir_cbk] 88-ecvol-client-6: reopendir on <gfid:00000000-0000-0000-0000-000000000001> succeeded (fd = 1)
[2016-11-21 13:23:32.426816] I [MSGID: 114041] [client-handshake.c:675:client_child_up_reopen_done] 88-ecvol-client-6: last fd open'd/lock-self-heal'd - notifying CHILD-UP
[2016-11-21 13:23:32.427005] I [MSGID: 114035] [client-handshake.c:201:client_set_lk_version_cbk] 88-ecvol-client-6: Server lk version = 1
[2016-11-21 13:23:32.542870] I [MSGID: 114018] [client.c:2280:client_rpc_notify] 88-ecvol-client-6: disconnected from ecvol-client-6. Client process will keep trying to connect to glusterd until brick's port is available

Comment 9 Raghavendra Talur 2016-11-22 15:58:23 UTC
debugging info : logs from brick corresponding to reconnection time



[2016-11-21 12:02:47.652049] I [MSGID: 115029] [server-handshake.c:693:server_setvolume] 0-ecvol-server: accepted client from dhcp37-91.lab.eng.blr.redhat.com-20797-2016/11/11-09:54:36:745300-ecvol-client-6-88-0 (version: 3.8.4)
[2016-11-21 12:02:47.859194] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-ecvol-server: disconnecting connection from dhcp37-195.lab.eng.blr.redhat.com-3196-2016/11/21-12:02:31:75507-ecvol-client-6-0-0
[2016-11-21 12:02:47.859377] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-ecvol-server: Shutting down connection dhcp37-195.lab.eng.blr.redhat.com-3196-2016/11/21-12:02:31:75507-ecvol-client-6-0-0
[2016-11-21 12:02:47.939212] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-ecvol-server: disconnecting connection from dhcp37-44.lab.eng.blr.redhat.com-26617-2016/11/21-12:01:59:695317-ecvol-client-6-0-0
[2016-11-21 12:02:47.939474] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-ecvol-server: Shutting down connection dhcp37-44.lab.eng.blr.redhat.com-26617-2016/11/21-12:01:59:695317-ecvol-client-6-0-0
[2016-11-21 12:02:48.222659] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-ecvol-server: disconnecting connection from dhcp37-215.lab.eng.blr.redhat.com-27992-2016/11/21-12:01:54:633639-ecvol-client-6-0-0
[2016-11-21 12:02:48.222889] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-ecvol-server: Shutting down connection dhcp37-215.lab.eng.blr.redhat.com-27992-2016/11/21-12:01:54:633639-ecvol-client-6-0-0
[2016-11-21 12:02:49.073121] I [login.c:76:gf_auth] 0-auth/login: allowed user names: 14b2c0fb-f6bd-42c8-a2bb-0c6172269a28
[2016-11-21 12:02:49.073156] I [MSGID: 115029] [server-handshake.c:693:server_setvolume] 0-ecvol-server: accepted client from dhcp37-44.lab.eng.blr.redhat.com-1548-2016/11/16-06:04:23:72379-ecvol-client-6-40-0 (version: 3.8.4)
[2016-11-21 12:02:49.289578] I [socket.c:3401:socket_submit_request] 0-ecvol-quota: not connected (priv->connected = 0)
[2016-11-21 12:02:49.289606] W [rpc-clnt.c:1640:rpc_clnt_submit] 0-ecvol-quota: failed to submit rpc-request (XID: 0xdd Program: Quota enforcer, ProgVers: 1, Proc: 1) to rpc-transport (ecvol-quota)
[2016-11-21 12:02:49.289990] W [rpc-clnt.c:1640:rpc_clnt_submit] 0-ecvol-quota: failed to submit rpc-request (XID: 0xde Program: Quota enforcer, ProgVers: 1, Proc: 1) to rpc-transport (ecvol-quota)
[2016-11-21 12:02:49.300386] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-ecvol-server: disconnecting connection from dhcp37-124.lab.eng.blr.redhat.com-2385-2016/11/18-05:57:52:325256-ecvol-client-6-27-0
[2016-11-21 12:02:49.300556] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-ecvol-server: Shutting down connection dhcp37-124.lab.eng.blr.redhat.com-2385-2016/11/18-05:57:52:325256-ecvol-client-6-27-0
[2016-11-21 12:02:49.305852] W [rpc-clnt.c:1640:rpc_clnt_submit] 0-ecvol-quota: failed to submit rpc-request (XID: 0xdf Program: Quota enforcer, ProgVers: 1, Proc: 1) to rpc-transport (ecvol-quota)
[2016-11-21 12:02:49.306381] W [rpc-clnt.c:1640:rpc_clnt_submit] 0-ecvol-quota: failed to submit rpc-request (XID: 0xe0 Program: Quota enforcer, ProgVers: 1, Proc: 1) to rpc-transport (ecvol-quota)
[2016-11-21 12:02:49.306528] W [rpc-clnt.c:1640:rpc_clnt_submit] 0-ecvol-quota: failed to submit rpc-request (XID: 0xe1 Program: Quota enforcer, ProgVers: 1, Proc: 1) to rpc-transport (ecvol-quota)
[2016-11-21 12:02:49.312894] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-ecvol-server: disconnecting connection from dhcp37-190.lab.eng.blr.redhat.com-1675-2016/11/16-06:02:35:481106-ecvol-client-6-37-0
[2016-11-21 12:02:49.313043] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-ecvol-server: Shutting down connection dhcp37-190.lab.eng.blr.redhat.com-1675-2016/11/16-06:02:35:481106-ecvol-client-6-37-0
[2016-11-21 12:02:49.318829] W [rpc-clnt.c:1640:rpc_clnt_submit] 0-ecvol-quota: failed to submit rpc-request (XID: 0xe2 Program: Quota enforcer, ProgVers: 1, Proc: 1) to rpc-transport (ecvol-quota)
[2016-11-21 12:02:49.328065] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-ecvol-server: disconnecting connection from dhcp37-44.lab.eng.blr.redhat.com-1548-2016/11/16-06:04:23:72379-ecvol-client-6-37-0
[2016-11-21 12:02:49.328225] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-ecvol-server: Shutting down connection dhcp37-44.lab.eng.blr.redhat.com-1548-2016/11/16-06:04:23:72379-ecvol-client-6-37-0
[2016-11-21 12:02:49.334847] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-ecvol-server: disconnecting connection from dhcp37-89.lab.eng.blr.redhat.com-2380-2016/11/18-05:57:57:587746-ecvol-client-6-27-0
[2016-11-21 12:02:49.334998] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-ecvol-server: Shutting down connection dhcp37-89.lab.eng.blr.redhat.com-2380-2016/11/18-05:57:57:587746-ecvol-client-6-27-0
[2016-11-21 12:02:49.348553] I [login.c:76:gf_auth] 0-auth/login: allowed user names: 14b2c0fb-f6bd-42c8-a2bb-0c6172269a28
[2016-11-21 12:02:49.348582] I [MSGID: 115029] [server-handshake.c:693:server_setvolume] 0-ecvol-server: accepted client from dhcp37-215.lab.eng.blr.redhat.com-9076-2016/11/17-09:25:44:452854-ecvol-client-6-30-0 (version: 3.8.4)
[2016-11-21 12:02:49.362776] W [rpc-clnt.c:1640:rpc_clnt_submit] 0-ecvol-quota: failed to submit rpc-request (XID: 0xe3 Program: Quota enforcer, ProgVers: 1, Proc: 1) to rpc-transport (ecvol-quota)
[2016-11-21 12:02:49.530562] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-ecvol-server: disconnecting connection from dhcp37-215.lab.eng.blr.redhat.com-9076-2016/11/17-09:25:44:452854-ecvol-client-6-27-0
[2016-11-21 12:02:49.530757] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-ecvol-server: Shutting down connection dhcp37-215.lab.eng.blr.redhat.com-9076-2016/11/17-09:25:44:452854-ecvol-client-6-27-0
[2016-11-21 12:02:49.759795] I [login.c:76:gf_auth] 0-auth/login: allowed user names: 14b2c0fb-f6bd-42c8-a2bb-0c6172269a28
[2016-11-21 12:02:49.759836] I [MSGID: 115029] [server-handshake.c:693:server_setvolume] 0-ecvol-server: accepted client from dhcp37-190.lab.eng.blr.redhat.com-29323-2016/11/21-12:02:45:686547-ecvol-client-6-0-0 (version: 3.8.4)
[2016-11-21 12:02:49.770000] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-ecvol-server: disconnecting connection from dhcp37-91.lab.eng.blr.redhat.com-20797-2016/11/11-09:54:36:745300-ecvol-client-6-85-0
[2016-11-21 12:02:49.770101] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-ecvol-server: Shutting down connection dhcp37-91.lab.eng.blr.redhat.com-20797-2016/11/11-09:54:36:745300-ecvol-client-6-85-0
[2016-11-21 12:02:50.773389] I [login.c:76:gf_auth] 0-auth/login: allowed user names: 14b2c0fb-f6bd-42c8-a2bb-0c6172269a28
[2016-11-21 12:02:50.773425] I [MSGID: 115029] [server-handshake.c:693:server_setvolume] 0-ecvol-server: accepted client from dhcp37-190.lab.eng.blr.redhat.com-29331-2016/11/21-12:02:46:695123-ecvol-client-6-0-0 (version: 3.8.4)
[2016-11-21 12:02:50.798089] W [rpc-clnt.c:1640:rpc_clnt_submit] 0-ecvol-quota: failed to submit rpc-request (XID: 0xe4 Program: Quota enforcer, ProgVers: 1, Proc: 1) to rpc-transport (ecvol-quota)
[2016-11-21 12:02:51.914779] I [login.c:76:gf_auth] 0-auth/login: allowed user names: 14b2c0fb-f6bd-42c8-a2bb-0c6172269a28
[2016-11-21 12:02:51.914816] I [MSGID: 115029] [server-handshake.c:693:server_setvolume] 0-ecvol-server: accepted client from dhcp37-195.lab.eng.blr.redhat.com-3234-2016/11/21-12:02:47:834114-ecvol-client-6-0-0 (version: 3.8.4)
[2016-11-21 12:02:52.015092] I [login.c:76:gf_auth] 0-auth/login: allowed user names: 14b2c0fb-f6bd-42c8-a2bb-0c6172269a28
[2016-11-21 12:02:52.015144] I [MSGID: 115029] [server-handshake.c:693:server_setvolume] 0-ecvol-server: accepted client from dhcp37-44.lab.eng.blr.redhat.com-26679-2016/11/21-12:02:47:880064-ecvol-client-6-0-0 (version: 3.8.4)
[2016-11-21 12:02:52.291714] I [login.c:76:gf_auth] 0-auth/login: allowed user names: 14b2c0fb-f6bd-42c8-a2bb-0c6172269a28
[2016-11-21 12:02:52.291747] I [MSGID: 115029] [server-handshake.c:693:server_setvolume] 0-ecvol-server: accepted client from dhcp37-215.lab.eng.blr.redhat.com-28080-2016/11/21-12:02:48:192571-ecvol-client-6-0-0 (version: 3.8.4)
[2016-11-21 12:02:52.931139] I [login.c:76:gf_auth] 0-auth/login: allowed user names: 14b2c0fb-f6bd-42c8-a2bb-0c6172269a28
[2016-11-21 12:02:52.931189] I [MSGID: 115029] [server-handshake.c:693:server_setvolume] 0-ecvol-server: accepted client from dhcp37-195.lab.eng.blr.redhat.com-3242-2016/11/21-12:02:48:843832-ecvol-client-6-0-0 (version: 3.8.4)
[2016-11-21 12:02:53.016233] I [login.c:76:gf_auth] 0-auth/login: allowed user names: 14b2c0fb-f6bd-42c8-a2bb-0c6172269a28
[2016-11-21 12:02:53.016265] I [MSGID: 115029] [server-handshake.c:693:server_setvolume] 0-ecvol-server: accepted client from dhcp37-44.lab.eng.blr.redhat.com-26687-2016/11/21-12:02:48:905484-ecvol-client-6-0-0 (version: 3.8.4)
[2016-11-21 12:02:53.314719] I [login.c:76:gf_auth] 0-auth/login: allowed user names: 14b2c0fb-f6bd-42c8-a2bb-0c6172269a28
[2016-11-21 12:02:53.314761] I [MSGID: 115029] [server-handshake.c:693:server_setvolume] 0-ecvol-server: accepted client from dhcp37-215.lab.eng.blr.redhat.com-28088-2016/11/21-12:02:49:212431-ecvol-client-6-0-0 (version: 3.8.4)
[2016-11-21 12:04:08.626761] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-ecvol-server: disconnecting connection from dhcp37-91.lab.eng.blr.redhat.com-20797-2016/11/11-09:54:36:745300-ecvol-client-6-82-0
[2016-11-21 12:04:08.626943] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-ecvol-server: Shutting down connection dhcp37-91.lab.eng.blr.redhat.com-20797-2016/11/11-09:54:36:745300-ecvol-client-6-82-0
[2016-11-21 12:04:59.797253] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-ecvol-server: disconnecting connection from dhcp37-211.lab.eng.blr.redhat.com-2384-2016/11/18-05:57:55:83077-ecvol-client-6-27-0
[2016-11-21 12:04:59.797468] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-ecvol-server: Shutting down connection dhcp37-211.lab.eng.blr.redhat.com-2384-2016/11/18-05:57:55:83077-ecvol-client-6-27-0
[2016-11-21 12:06:12.950980] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-ecvol-server: disconnecting connection from dhcp37-124.lab.eng.blr.redhat.com-2385-2016/11/18-05:57:52:325256-ecvol-client-6-24-0
[2016-11-21 12:06:12.951132] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-ecvol-server: Shutting down connection dhcp37-124.lab.eng.blr.redhat.com-2385-2016/11/18-05:57:52:325256-ecvol-client-6-24-0
[2016-11-21 12:27:37.778472] E [marker-quota.c:728:_mq_get_metadata] 0-ecvol-marker: dict_get failed.
[2016-11-21 13:23:18.581409] E [upcall-internal.c:570:upcall_cache_invalidate] (-->/usr/lib64/glusterfs/3.8.4/xlator/features/access-control.so(+0xad79) [0x7f68e6639d79] -->/usr/lib64/glusterfs/3.8.4/xlator/features/locks.so(+0xd4f2) [0x7f68e64184f2] -->/usr/lib64/glusterfs/3.8.4/xlator/features/upcall.so(+0x5b3e) [0x7f68e5bccb3e] ) 0-upcall_cache_invalidate: invalid argument: !(gf_uuid_is_null (up_inode_ctx->gfid)) [Invalid argument]
[2016-11-21 13:23:32.442956] I [MSGID: 115029] [server-handshake.c:693:server_setvolume] 0-ecvol-server: accepted client from dhcp37-91.lab.eng.blr.redhat.com-20797-2016/11/11-09:54:36:745300-ecvol-client-6-88-1 (version: 3.8.4)
[2016-11-21 13:23:33.684262] W [socket.c:590:__socket_rwv] 0-tcp.ecvol-server: writev on 10.70.37.91:880 failed (Connection reset by peer)
[2016-11-21 13:23:33.684370] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-ecvol-server: disconnecting connection from dhcp37-91.lab.eng.blr.redhat.com-20797-2016/11/11-09:54:36:745300-ecvol-client-6-88-0
[2016-11-21 13:23:33.684415] I [MSGID: 115013] [server-helpers.c:296:do_fd_cleanup] 0-ecvol-server: fd cleanup on /
[2016-11-21 13:23:33.684751] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-ecvol-server: Shutting down connection dhcp37-91.lab.eng.blr.redhat.com-20797-2016/11/11-09:54:36:745300-ecvol-client-6-88-0

Comment 10 Raghavendra Talur 2016-11-23 13:30:45 UTC
Patch posted at http://review.gluster.org/#/c/15916

Comment 14 Prasad Desala 2017-01-16 09:17:28 UTC
This issue is not reproducible on glusterfs version 3.8.4-11.el7.x86_64. Moving this BZ to Verified.

Comment 16 errata-xmlrpc 2017-03-23 06:20:30 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2017-0486.html


Note You need to log in before you can comment on or make changes to this bug.