Description of problem:
After add-brick, a few rename operations were performed on the mount point while the rebalance process was running. After the rebalance process crashed on all nodes, lookup gives errors on the mount point, and each directory is shown twice in the output.

[root@7-VM2 mvs1]# ls
ls: cannot access mvetc24: Input/output error
ls: cannot access mvetc25: Input/output error
ls: cannot read symbolic link mvetc1: Invalid argument
ls: cannot read symbolic link mvetc2: Invalid argument
ls: cannot read symbolic link mvetc3: Invalid argument
ls: cannot access mvetc4: Input/output error
ls: cannot access mvetc5: Input/output error
ls: cannot access mvetc8: Input/output error
ls: cannot access mvetc9: Input/output error
ls: cannot access mvetc10: Input/output error
ls: cannot access mvetc11: Input/output error
ls: cannot access mvetc12: Input/output error
ls: cannot access mvetc13: Input/output error
ls: cannot access mvetc14: Input/output error
ls: cannot access mvetc15: Input/output error
ls: cannot access mvetc17: Input/output error
ls: cannot access mvetc18: Input/output error
ls: cannot access mvetc19: Input/output error
ls: cannot access mvetc20: Input/output error
ls: cannot access mvetc21: Input/output error
ls: cannot access mvetc22: Input/output error
ls: cannot access mvetc24: Input/output error
ls: cannot access mvetc25: Input/output error
mvetc1   mvetc10  mvetc12  mvetc13  mvetc15  mvetc17  mvetc19  mvetc2   mvetc21  mvetc22  mvetc25  mvetc3  mvetc5  mvetc8
mvetc1   mvetc11  mvetc12  mvetc14  mvetc15  mvetc18  mvetc19  mvetc20  mvetc21  mvetc24  mvetc25  mvetc4  mvetc5  mvetc9
mvetc10  mvetc11  mvetc13  mvetc14  mvetc17  mvetc18  mvetc2   mvetc20  mvetc22  mvetc24  mvetc3   mvetc4  mvetc8  mvetc9

Log snippet:

[2013-11-25 05:31:42.588874] E [rpcsvc.c:448:rpcsvc_check_and_reply_error] 0-rpcsvc: rpc actor failed to complete successfully
[2013-11-25 05:31:42.604652] E [server-helpers.c:779:server_alloc_frame] (-->/usr/lib64/libgfrpc.so.0(rpcsvc_notify+0x103) [0x3c11c08683] (-->/usr/lib64/libgfrpc.so.0(rpcsvc_handle_rpc_call+0x245) [0x3c11c08535] (-->/usr/lib64/glusterfs/3.4.0.44rhs/xlator/protocol/server.so(server3_3_lookup+0xa0) [0x7f1565d01620]))) 0-server: invalid argument: conn
[2013-11-25 05:31:42.604666] E [rpcsvc.c:448:rpcsvc_check_and_reply_error] 0-rpcsvc: rpc actor failed to complete successfully
[2013-11-25 05:31:42.604734] E [server-helpers.c:779:server_alloc_frame] (-->/usr/lib64/libgfrpc.so.0(rpcsvc_notify+0x103) [0x3c11c08683] (-->/usr/lib64/libgfrpc.so.0(rpcsvc_handle_rpc_call+0x245) [0x3c11c08535] (-->/usr/lib64/glusterfs/3.4.0.44rhs/xlator/protocol/server.so(server3_3_statfs+0x8e) [0x7f1565cec43e]))) 0-server: invalid argument: conn
[2013-11-25 05:31:42.604744] E [rpcsvc.c:448:rpcsvc_check_and_reply_error] 0-rpcsvc: rpc actor failed to complete successfully
[2013-11-25 05:31:42.608670] E [server-helpers.c:779:server_alloc_frame] (-->/usr/lib64/libgfrpc.so.0(rpcsvc_notify+0x103) [0x3c11c08683] (-->/usr/lib64/libgfrpc.so.0(rpcsvc_handle_rpc_call+0x245) [0x3c11c08535] (-->/usr/lib64/glusterfs/3.4.0.44rhs/xlator/protocol/server.so(server3_3_lookup+0xa0) [0x7f1565d01620]))) 0-server: invalid argument: conn
[2013-11-25 05:31:42.608692] E [rpcsvc.c:448:rpcsvc_check_and_reply_error] 0-rpcsvc: rpc actor failed to complete successfully
[2013-11-25 05:31:42.655936] I [server-handshake.c:569:server_setvolume] 0-flat-server: accepted client from 7-VM1.lab.eng.blr.redhat.com-19593-2013/11/20-05:50:48:691389-flat-client-2-0 (version: 3.4.0.44rhs)
[2013-11-25 05:31:42.656940] I [server-handshake.c:569:server_setvolume] 0-flat-server: accepted client from rhs-client22.lab.eng.blr.redhat.com-32542-2013/11/18-13:02:44:149392-flat-client-2-1 (version: 3.4.0.44rhs)
[2013-11-25 05:31:42.847764] I [socket.c:3106:socket_submit_reply] 0-tcp.flat-server: not connected (priv->connected = -1)
[2013-11-25 05:31:42.892231] E [rpcsvc.c:1111:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x64x, Program: GlusterFS 3.3, ProgVers: 330, Proc: 14) to rpc-transport (tcp.flat-server)
[2013-11-25 05:32:27.066739] I [server-helpers.c:590:server_log_conn_destroy] 0-flat-server: destroyed connection of 7-VM1.lab.eng.blr.redhat.com-19593-2013/11/20-05:50:48:691389-flat-client-2-0 - Failed to respond to following operations: STATFS - 1
[2013-11-25 05:54:08.304681] I [server-handshake.c:569:server_setvolume] 0-flat-server: accepted client from 7-VM2.lab.eng.blr.redhat.com-16938-2013/11/25-05:54:05:67164-flat-client-2-0 (version: 3.4.0.44rhs)
[2013-11-25 05:56:12.679403] E [posix.c:737:posix_readlink] 0-flat-posix: readlink on /rhs/brick1/f/mv1/mvetc4 failed: Invalid argument
[2013-11-25 05:56:12.679461] I [server-rpc-fops.c:1697:server_readlink_cbk] 0-flat-server: 199: READLINK /mv1/mvetc4 (200ecbd6-1669-4d81-8098-5c3f68d3d86e) ==> (Invalid argument)
[2013-11-25 05:56:12.890553] E [posix.c:737:posix_readlink] 0-flat-posix: readlink on /rhs/brick1/f/mv1/mvetc8 failed: Invalid argument
[2013-11-25 05:56:12.890618] I [server-rpc-fops.c:1697:server_readlink_cbk] 0-flat-server: 202: READLINK /mv1/mvetc8 (90036c71-83ab-40f0-9b5e-21fa795d0619) ==> (Invalid argument)

Version-Release number of selected component (if applicable):
============================================
3.4.0.44rhs-1.el6rhs.x86_64

How reproducible:
==================
Haven't tried.

Steps to Reproduce:
====================
1. Create and mount a DHT volume. Create data from the mount point (directory depth was 10).
2. Add a brick to the volume and start rebalance.
3. While rebalance is in progress, perform rename operations on directories and files.
4. After 44+ hours the rebalance process had crashed on all nodes and the rebalance status was 'failed':

[root@7-VM1 core]# gluster volume rebalance flat status
        Node  Rebalanced-files      size   scanned  failures  skipped  status  run time in secs
------------  ----------------  --------  --------  --------  -------  ------  ----------------
   localhost            832000    13.7GB   5344344         1      228  failed         159836.00
10.70.36.133           1009405    15.7GB   5362837         2      206  failed         159836.00
10.70.36.132            823206    12.9GB   5416604         1      233  failed         159836.00
10.70.36.131                 0    0Bytes   5227829         0        0  failed         159836.00
volume rebalance: flat: success:

5. FUSE mount this volume and run ls from the mount point on the renamed directories. For a few directories it gives the errors shown above.

Actual results:
- directories are listed twice
- got error 'ls: cannot access mvetc24: Input/output error'
- got error 'ls: cannot read symbolic link mvetc1: Invalid argument'
- log has error 'E [posix.c:737:posix_readlink] 0-flat-posix: readlink on /rhs/brick1/f/mv1/mvetc8 failed: Invalid argument'

Expected results:
- Directories should not be listed twice
- lookup should not give error

Additional info:
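To confirm the "listed twice" symptom without reading the ls columns by eye, a small script along these lines could be run against the mount point. This is only a sketch and not part of the original report: the function names `duplicate_entries` and `check_directory` are made up for illustration, and it assumes `os.listdir` passes through whatever entries the FUSE mount returns, including repeats.

```python
import os
import sys
from collections import Counter


def duplicate_entries(names):
    """Return, sorted, every name that occurs more than once in a listing."""
    counts = Counter(names)
    return sorted(name for name, n in counts.items() if n > 1)


def check_directory(path):
    # Assumption: os.listdir reports the raw directory entries, so a name
    # that the mount returns twice also appears twice in this list.
    return duplicate_entries(os.listdir(path))


if __name__ == "__main__":
    # Directory to check; defaults to the current directory.
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for name in check_directory(target):
        print(name)
```

Run from the affected directory on the mount (for example the one shown in the ls output above), it prints each name that is returned more than once.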
Was the brick 100% full at the moment you got the error?
Cloning this to 3.1. To be fixed in a future release.