Bug 849131 - [5303f98f674ab5cb600dde0394ff7ddd5ba3c98a] - gluster fuse client hung during sanity runs
Status: CLOSED CURRENTRELEASE
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: rdma
Version: 2.0
Hardware: x86_64 Linux
Priority: low
Severity: high
Assigned To: Raghavendra G
QA Contact: shylesh
Depends On: 772880 822337
Blocks: 858452
Reported: 2012-08-17 07:55 EDT by Vidya Sakar
Modified: 2015-05-13 13:19 EDT (History)

Doc Type: Bug Fix
Clone Of: 772880
Clones: 858452
Last Closed: 2015-02-13 04:51:09 EST
Attachments: None

Description Vidya Sakar 2012-08-17 07:55:05 EDT
+++ This bug was initially created as a clone of Bug #772880 +++

Description of problem:
I was running sanity tests on a 2-way replicate volume with the 'rdma' transport type. The sanity run hung, but the mountpoint is still accessible.

Version-Release number of selected component (if applicable):
git master with head at 5303f98f674ab5cb600dde0394ff7ddd5ba3c98a

How reproducible:
2/2

Steps to Reproduce:
1. Create a replicate volume with rdma transport type.
2. Start running the sanity tests.
  
Actual results:
The sanity tests hung.

Expected results:
Sanity should not hang.

Additional info:

The following entries appear in the client log:

[2012-01-09 23:49:29.518564] W [client3_1-fops.c:373:client3_1_open_cbk] 0-hosdu-client-0: remote operation failed: Transport endpoint is not connected. Path: /run2040/_24876_tiotest.0
[2012-01-09 23:49:29.518586] E [afr-self-heal-data.c:1278:afr_sh_data_open_cbk] 0-hosdu-replicate-0: open of /run2040/_24876_tiotest.0 failed on child hosdu-client-0 (Transport endpoint is not connected)
[2012-01-09 23:49:29.551551] E [afr-self-heal-common.c:2045:afr_self_heal_completion_cbk] 0-hosdu-replicate-0: background  data self-heal failed on /run2040/_24876_tiotest.0
[2012-01-09 23:49:29.552042] W [rpc-clnt.c:1478:rpc_clnt_submit] 0-hosdu-client-0: failed to submit rpc-request (XID: 0x238212x Program: GlusterFS 3.1, ProgVers: 310, Proc: 27) to rpc-transport (hosdu-client-0)
[2012-01-09 23:49:29.552065] W [client3_1-fops.c:2249:client3_1_lookup_cbk] 0-hosdu-client-0: remote operation failed: Transport endpoint is not connected. Path: /run2040/_24876_tiotest.1
[2012-01-09 23:49:29.552697] I [afr-common.c:1297:afr_launch_self_heal] 0-hosdu-replicate-0: background  data self-heal triggered. path: /run2040/_24876_tiotest.1, reason: lookup detected pending operations
[2012-01-09 23:49:29.552753] W [rpc-clnt.c:1478:rpc_clnt_submit] 0-hosdu-client-0: failed to submit rpc-request (XID: 0x238213x Program: GlusterFS 3.1, ProgVers: 310, Proc: 11) to rpc-transport (hosdu-client-0)
[2012-01-09 23:49:29.552775] W [client3_1-fops.c:373:client3_1_open_cbk] 0-hosdu-client-0: remote operation failed: Transport endpoint is not connected. Path: /run2040/_24876_tiotest.1
[2012-01-09 23:49:29.552791] E [afr-self-heal-data.c:1278:afr_sh_data_open_cbk] 0-hosdu-replicate-0: open of /run2040/_24876_tiotest.1 failed on child hosdu-client-0 (Transport endpoint is not connected)
[2012-01-09 23:49:29.552957] E [afr-self-heal-common.c:2045:afr_self_heal_completion_cbk] 0-hosdu-replicate-0: background  data self-heal failed on /run2040/_24876_tiotest.1
[2012-01-09 23:49:29.553187] W [rpc-clnt.c:1478:rpc_clnt_submit] 0-hosdu-client-0: failed to submit rpc-request (XID: 0x238214x Program: GlusterFS 3.1, ProgVers: 310, Proc: 27) to rpc-transport (hosdu-client-0)
[2012-01-09 23:49:29.553214] W [client3_1-fops.c:2249:client3_1_lookup_cbk] 0-hosdu-client-0: remote operation failed: Transport endpoint is not connected. Path: /run2040/_24876_tiotest.2
[2012-01-09 23:49:29.553860] I [afr-common.c:1297:afr_launch_self_heal] 0-hosdu-replicate-0: background  data self-heal triggered. path: /run2040/_24876_tiotest.2, reason: lookup detected pending operations
[2012-01-09 23:49:29.553907] W [rpc-clnt.c:1478:rpc_clnt_submit] 0-hosdu-client-0: failed to submit rpc-request (XID: 0x238215x Program: GlusterFS 3.1, ProgVers: 310, Proc: 11) to rpc-transport (hosdu-client-0)
[2012-01-09 23:49:29.553927] W [client3_1-fops.c:373:client3_1_open_cbk] 0-hosdu-client-0: remote operation failed: Transport endpoint is not connected. Path: /run2040/_24876_tiotest.2
[2012-01-09 23:49:29.553943] E [afr-self-heal-data.c:1278:afr_sh_data_open_cbk] 0-hosdu-replicate-0: open of /run2040/_24876_tiotest.2 failed on child hosdu-client-0 (Transport endpoint is not connected)
[2012-01-09 23:49:29.554088] E [afr-self-heal-common.c:2045:afr_self_heal_completion_cbk] 0-hosdu-replicate-0: background  data self-heal failed on /run2040/_24876_tiotest.2
[2012-01-09 23:49:29.554285] W [rpc-clnt.c:1478:rpc_clnt_submit] 0-hosdu-client-0: failed to submit rpc-request (XID: 0x238216x Program: GlusterFS 3.1, ProgVers: 310, Proc: 27) to rpc-transport (hosdu-client-0)
[2012-01-09 23:49:29.554310] W [client3_1-fops.c:2249:client3_1_lookup_cbk] 0-hosdu-client-0: remote operation failed: Transport endpoint is not connected. Path: /run2040/_24876_tiotest.3
[2012-01-09 23:49:29.554826] I [afr-common.c:1297:afr_launch_self_heal] 0-hosdu-replicate-0: background  data self-heal triggered. path: /run2040/_24876_tiotest.3, reason: lookup detected pending operations
[2012-01-09 23:49:29.554873] W [rpc-clnt.c:1478:rpc_clnt_submit] 0-hosdu-client-0: failed to submit rpc-request (XID: 0x238217x Program: GlusterFS 3.1, ProgVers: 310, Proc: 11) to rpc-transport (hosdu-client-0)
[2012-01-09 23:49:29.554904] W [client3_1-fops.c:373:client3_1_open_cbk] 0-hosdu-client-0: remote operation failed: Transport endpoint is not connected. Path: /run2040/_24876_tiotest.3
[2012-01-09 23:49:29.554921] E [afr-self-heal-data.c:1278:afr_sh_data_open_cbk] 0-hosdu-replicate-0: open of /run2040/_24876_tiotest.3 failed on child hosdu-client-0 (Transport endpoint is not connected)
[2012-01-09 23:49:29.555072] E [afr-self-heal-common.c:2045:afr_self_heal_completion_cbk] 0-hosdu-replicate-0: background  data self-heal failed on /run2040/_24876_tiotest.3
[2012-01-09 23:49:29.555262] W [rpc-clnt.c:1478:rpc_clnt_submit] 0-hosdu-client-0: failed to submit rpc-request (XID: 0x238218x Program: GlusterFS 3.1, ProgVers: 310, Proc: 27) to rpc-transport (hosdu-client-0)
[2012-01-09 23:49:29.555287] W [client3_1-fops.c:2249:client3_1_lookup_cbk] 0-hosdu-client-0: remote operation failed: Transport endpoint is not connected. Path: /run2040/p0
[2012-01-09 23:49:29.556156] W [rpc-clnt.c:1478:rpc_clnt_submit] 0-hosdu-client-0: failed to submit rpc-request (XID: 0x238219x Program: GlusterFS 3.1, ProgVers: 310, Proc: 27) to rpc-transport (hosdu-client-0)
[2012-01-09 23:49:29.556179] W [client3_1-fops.c:2249:client3_1_lookup_cbk] 0-hosdu-client-0: remote operation failed: Transport endpoint is not connected. Path: /run2040/p1
[2012-01-09 23:49:29.556902] W [rpc-clnt.c:1478:rpc_clnt_submit] 0-hosdu-client-0: failed to submit rpc-request (XID: 0x238220x Program: GlusterFS 3.1, ProgVers: 310, Proc: 27) to rpc-transport (hosdu-client-0)
[2012-01-09 23:49:29.556925] W [client3_1-fops.c:2249:client3_1_lookup_cbk] 0-hosdu-client-0: remote operation failed: Transport endpoint is not connected. Path: /run2040/p2
[2012-01-09 23:49:29.557709] W [rpc-clnt.c:1478:rpc_clnt_submit] 0-hosdu-client-0: failed to submit rpc-request (XID: 0x238221x Program: GlusterFS 3.1, ProgVers: 310, Proc: 27) to rpc-transport (hosdu-client-0)
[2012-01-09 23:49:29.557732] W [client3_1-fops.c:2249:client3_1_lookup_cbk] 0-hosdu-client-0: remote operation failed: Transport endpoint is not connected. Path: /run2040/p3
[2012-01-09 23:49:29.558451] W [rpc-clnt.c:1478:rpc_clnt_submit] 0-hosdu-client-0: failed to submit rpc-request (XID: 0x238222x Program: GlusterFS 3.1, ProgVers: 310, Proc: 27) 
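The log excerpt above is repetitive, so a small script can help show which RPC procedures are failing and how often. What follows is a minimal sketch, not part of the original report: it assumes each entry follows the "[timestamp] LEVEL [file:line:function] message" layout seen in the excerpt, and the names LINE_RE, PROC_RE, SAMPLE and summarize are hypothetical helpers introduced only for illustration. The sample lines are copied from the excerpt.

```python
import re
from collections import Counter

# Assumed layout, taken from the excerpt above:
# [timestamp] LEVEL [file:line:function] message
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+(?P<level>[A-Z])\s+"
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\]\s+(?P<msg>.*)$"
)
# Procedure number as printed by rpc_clnt_submit, e.g. "Proc: 27"
PROC_RE = re.compile(r"Proc: (\d+)")

# Three entries copied verbatim from the client log in this report.
SAMPLE = [
    "[2012-01-09 23:49:29.552042] W [rpc-clnt.c:1478:rpc_clnt_submit] 0-hosdu-client-0: failed to submit rpc-request (XID: 0x238212x Program: GlusterFS 3.1, ProgVers: 310, Proc: 27) to rpc-transport (hosdu-client-0)",
    "[2012-01-09 23:49:29.552065] W [client3_1-fops.c:2249:client3_1_lookup_cbk] 0-hosdu-client-0: remote operation failed: Transport endpoint is not connected. Path: /run2040/_24876_tiotest.1",
    "[2012-01-09 23:49:29.552753] W [rpc-clnt.c:1478:rpc_clnt_submit] 0-hosdu-client-0: failed to submit rpc-request (XID: 0x238213x Program: GlusterFS 3.1, ProgVers: 310, Proc: 11) to rpc-transport (hosdu-client-0)",
]


def summarize(lines):
    """Count log entries per reporting function and failed submits per Proc."""
    by_func = Counter()
    failed_procs = Counter()
    for raw in lines:
        match = LINE_RE.match(raw.strip())
        if not match:
            continue  # skip anything that does not look like a log entry
        by_func[match.group("func")] += 1
        if match.group("func") == "rpc_clnt_submit":
            proc = PROC_RE.search(match.group("msg"))
            if proc:
                failed_procs[int(proc.group(1))] += 1
    return by_func, failed_procs


if __name__ == "__main__":
    by_func, failed_procs = summarize(SAMPLE)
    print("entries per function:", dict(by_func))
    print("failed submits per Proc:", dict(failed_procs))
```

To run it against a full client log, pass the open file object to summarize() instead of SAMPLE. In the excerpt, failed submits with Proc 27 are followed by lookup callbacks and those with Proc 11 by open callbacks, so the per-Proc counts give a rough picture of which operations were hitting the disconnected transport.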


I have attached the statedumps of the client and the first server brick.
Comment 3 Sachidananda Urs 2013-08-08 01:45:22 EDT
Moving out of Big Bend since RDMA support is not available in Big Bend (2.1).
