+++ This bug was initially created as a clone of Bug #772880 +++

Description of problem:
I was running sanity tests on a 2-way replicate system with the 'rdma' transport type. The sanity run hung, but the mountpoint is still accessible.

Version-Release number of selected component (if applicable):
git master with head at 5303f98f674ab5cb600dde0394ff7ddd5ba3c98a

How reproducible:
2/2

Steps to Reproduce:
1. Create a replicate volume with rdma transport type.
2. Start running the sanity tests.

Actual results:
Sanity test hung.

Expected results:
Sanity should not hang.

Additional info:
The following entries are in the client log.

[2012-01-09 23:49:29.518564] W [client3_1-fops.c:373:client3_1_open_cbk] 0-hosdu-client-0: remote operation failed: Transport endpoint is not connected. Path: /run2040/_24876_tiotest.0
[2012-01-09 23:49:29.518586] E [afr-self-heal-data.c:1278:afr_sh_data_open_cbk] 0-hosdu-replicate-0: open of /run2040/_24876_tiotest.0 failed on child hosdu-client-0 (Transport endpoint is not connected)
[2012-01-09 23:49:29.551551] E [afr-self-heal-common.c:2045:afr_self_heal_completion_cbk] 0-hosdu-replicate-0: background data self-heal failed on /run2040/_24876_tiotest.0
[2012-01-09 23:49:29.552042] W [rpc-clnt.c:1478:rpc_clnt_submit] 0-hosdu-client-0: failed to submit rpc-request (XID: 0x238212x Program: GlusterFS 3.1, ProgVers: 310, Proc: 27) to rpc-transport (hosdu-client-0)
[2012-01-09 23:49:29.552065] W [client3_1-fops.c:2249:client3_1_lookup_cbk] 0-hosdu-client-0: remote operation failed: Transport endpoint is not connected. Path: /run2040/_24876_tiotest.1
[2012-01-09 23:49:29.552697] I [afr-common.c:1297:afr_launch_self_heal] 0-hosdu-replicate-0: background data self-heal triggered. path: /run2040/_24876_tiotest.1, reason: lookup detected pending operations
[2012-01-09 23:49:29.552753] W [rpc-clnt.c:1478:rpc_clnt_submit] 0-hosdu-client-0: failed to submit rpc-request (XID: 0x238213x Program: GlusterFS 3.1, ProgVers: 310, Proc: 11) to rpc-transport (hosdu-client-0)
[2012-01-09 23:49:29.552775] W [client3_1-fops.c:373:client3_1_open_cbk] 0-hosdu-client-0: remote operation failed: Transport endpoint is not connected. Path: /run2040/_24876_tiotest.1
[2012-01-09 23:49:29.552791] E [afr-self-heal-data.c:1278:afr_sh_data_open_cbk] 0-hosdu-replicate-0: open of /run2040/_24876_tiotest.1 failed on child hosdu-client-0 (Transport endpoint is not connected)
[2012-01-09 23:49:29.552957] E [afr-self-heal-common.c:2045:afr_self_heal_completion_cbk] 0-hosdu-replicate-0: background data self-heal failed on /run2040/_24876_tiotest.1
[2012-01-09 23:49:29.553187] W [rpc-clnt.c:1478:rpc_clnt_submit] 0-hosdu-client-0: failed to submit rpc-request (XID: 0x238214x Program: GlusterFS 3.1, ProgVers: 310, Proc: 27) to rpc-transport (hosdu-client-0)
[2012-01-09 23:49:29.553214] W [client3_1-fops.c:2249:client3_1_lookup_cbk] 0-hosdu-client-0: remote operation failed: Transport endpoint is not connected. Path: /run2040/_24876_tiotest.2
[2012-01-09 23:49:29.553860] I [afr-common.c:1297:afr_launch_self_heal] 0-hosdu-replicate-0: background data self-heal triggered. path: /run2040/_24876_tiotest.2, reason: lookup detected pending operations
[2012-01-09 23:49:29.553907] W [rpc-clnt.c:1478:rpc_clnt_submit] 0-hosdu-client-0: failed to submit rpc-request (XID: 0x238215x Program: GlusterFS 3.1, ProgVers: 310, Proc: 11) to rpc-transport (hosdu-client-0)
[2012-01-09 23:49:29.553927] W [client3_1-fops.c:373:client3_1_open_cbk] 0-hosdu-client-0: remote operation failed: Transport endpoint is not connected. Path: /run2040/_24876_tiotest.2
[2012-01-09 23:49:29.553943] E [afr-self-heal-data.c:1278:afr_sh_data_open_cbk] 0-hosdu-replicate-0: open of /run2040/_24876_tiotest.2 failed on child hosdu-client-0 (Transport endpoint is not connected)
[2012-01-09 23:49:29.554088] E [afr-self-heal-common.c:2045:afr_self_heal_completion_cbk] 0-hosdu-replicate-0: background data self-heal failed on /run2040/_24876_tiotest.2
[2012-01-09 23:49:29.554285] W [rpc-clnt.c:1478:rpc_clnt_submit] 0-hosdu-client-0: failed to submit rpc-request (XID: 0x238216x Program: GlusterFS 3.1, ProgVers: 310, Proc: 27) to rpc-transport (hosdu-client-0)
[2012-01-09 23:49:29.554310] W [client3_1-fops.c:2249:client3_1_lookup_cbk] 0-hosdu-client-0: remote operation failed: Transport endpoint is not connected. Path: /run2040/_24876_tiotest.3
[2012-01-09 23:49:29.554826] I [afr-common.c:1297:afr_launch_self_heal] 0-hosdu-replicate-0: background data self-heal triggered. path: /run2040/_24876_tiotest.3, reason: lookup detected pending operations
[2012-01-09 23:49:29.554873] W [rpc-clnt.c:1478:rpc_clnt_submit] 0-hosdu-client-0: failed to submit rpc-request (XID: 0x238217x Program: GlusterFS 3.1, ProgVers: 310, Proc: 11) to rpc-transport (hosdu-client-0)
[2012-01-09 23:49:29.554904] W [client3_1-fops.c:373:client3_1_open_cbk] 0-hosdu-client-0: remote operation failed: Transport endpoint is not connected. Path: /run2040/_24876_tiotest.3
[2012-01-09 23:49:29.554921] E [afr-self-heal-data.c:1278:afr_sh_data_open_cbk] 0-hosdu-replicate-0: open of /run2040/_24876_tiotest.3 failed on child hosdu-client-0 (Transport endpoint is not connected)
[2012-01-09 23:49:29.555072] E [afr-self-heal-common.c:2045:afr_self_heal_completion_cbk] 0-hosdu-replicate-0: background data self-heal failed on /run2040/_24876_tiotest.3
[2012-01-09 23:49:29.555262] W [rpc-clnt.c:1478:rpc_clnt_submit] 0-hosdu-client-0: failed to submit rpc-request (XID: 0x238218x Program: GlusterFS 3.1, ProgVers: 310, Proc: 27) to rpc-transport (hosdu-client-0)
[2012-01-09 23:49:29.555287] W [client3_1-fops.c:2249:client3_1_lookup_cbk] 0-hosdu-client-0: remote operation failed: Transport endpoint is not connected. Path: /run2040/p0
[2012-01-09 23:49:29.556156] W [rpc-clnt.c:1478:rpc_clnt_submit] 0-hosdu-client-0: failed to submit rpc-request (XID: 0x238219x Program: GlusterFS 3.1, ProgVers: 310, Proc: 27) to rpc-transport (hosdu-client-0)
[2012-01-09 23:49:29.556179] W [client3_1-fops.c:2249:client3_1_lookup_cbk] 0-hosdu-client-0: remote operation failed: Transport endpoint is not connected. Path: /run2040/p1
[2012-01-09 23:49:29.556902] W [rpc-clnt.c:1478:rpc_clnt_submit] 0-hosdu-client-0: failed to submit rpc-request (XID: 0x238220x Program: GlusterFS 3.1, ProgVers: 310, Proc: 27) to rpc-transport (hosdu-client-0)
[2012-01-09 23:49:29.556925] W [client3_1-fops.c:2249:client3_1_lookup_cbk] 0-hosdu-client-0: remote operation failed: Transport endpoint is not connected. Path: /run2040/p2
[2012-01-09 23:49:29.557709] W [rpc-clnt.c:1478:rpc_clnt_submit] 0-hosdu-client-0: failed to submit rpc-request (XID: 0x238221x Program: GlusterFS 3.1, ProgVers: 310, Proc: 27) to rpc-transport (hosdu-client-0)
[2012-01-09 23:49:29.557732] W [client3_1-fops.c:2249:client3_1_lookup_cbk] 0-hosdu-client-0: remote operation failed: Transport endpoint is not connected. Path: /run2040/p3
[2012-01-09 23:49:29.558451] W [rpc-clnt.c:1478:rpc_clnt_submit] 0-hosdu-client-0: failed to submit rpc-request (XID: 0x238222x Program: GlusterFS 3.1, ProgVers: 310, Proc: 27)

I have attached the statedumps of the client and the first server brick.
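
For anyone triaging a longer copy of the client log, a quick tally of entries by level, reporting function, and translator may show which failure repeats most. The following is only a sketch: the regular expression is inferred from the entries quoted above and is not an official GlusterFS log grammar, and the names ENTRY, SAMPLE, and summarize are made up for illustration.

```python
import re
from collections import Counter

# Assumed entry layout, inferred from the quoted log lines:
#   [timestamp] LEVEL [file:line:function] translator: message
ENTRY = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\] "
    r"(?P<level>[A-Z]) "
    r"\[(?P<src>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\] "
    r"(?P<xlator>\S+): "
)

# Two entries copied from the client log in this report.
SAMPLE = (
    "[2012-01-09 23:49:29.518564] W [client3_1-fops.c:373:client3_1_open_cbk] "
    "0-hosdu-client-0: remote operation failed: Transport endpoint is not "
    "connected. Path: /run2040/_24876_tiotest.0\n"
    "[2012-01-09 23:49:29.518586] E [afr-self-heal-data.c:1278:afr_sh_data_open_cbk] "
    "0-hosdu-replicate-0: open of /run2040/_24876_tiotest.0 failed on child "
    "hosdu-client-0 (Transport endpoint is not connected)\n"
)


def summarize(log_text):
    """Count log entries per (level, function, translator) triple."""
    return Counter(
        (m["level"], m["func"], m["xlator"]) for m in ENTRY.finditer(log_text)
    )


if __name__ == "__main__":
    # Print the most frequent (level, function, translator) triples first.
    for key, count in summarize(SAMPLE).most_common():
        print(count, *key)
```

Because the pattern is matched with finditer, it works on both a line-per-entry log and a copy where entries have been run together on one line.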
Moving out of Big Bend, since RDMA support is not available in Big Bend (2.1).