Description of problem: 2x2 distributed replicate volume, with 1 fuse and 1 nfs client. On fuse was running rdd, ping_pong, fs-perf test one after another in a loop. Rebooted one of the bricks and restarted it. ping_pong running on the fuse client hung. (ping_pong was also getting errors before hanging since it was started) Version-Release number of selected component (if applicable): How reproducible: always Steps to Reproduce: 1. create a distributed replicate volume, start it and mount it 2. run ping_pong on the fuse client 3. bring one of the bricks down and then bring it up after some time. Actual results: ping_piong gets blocked Expected results: ping_pong should not block and continue to get locks Additional info: statedump of the server having blocked locks. [1./export-xfs/mirror.itable.active.1999] gfid=275f3060-b089-43f1-8930-36a3582e0a6c nlookup=579 ref=3 ia_type=1 [xlator.features.locks.mirror-locks.inode] path=/file mandatory=0 posixlk-count=5 posixlk.posixlk[0](ACTIVE)=type=WRITE, whence=0, start=96, len=1, pid = 18890, owner=4e6e35ff74ffffff, transport=0x10dd9690, , granted at Sat Mar 10 03:02:41 2012 posixlk.posixlk[1](ACTIVE)=type=WRITE, whence=0, start=103, len=1, pid = 18899, owner=4e6e35ff74ffffff, transport=0x10dd9690, , granted at Sat Mar 10 03:02:41 2012 posixlk.posixlk[2](ACTIVE)=type=WRITE, whence=0, start=97, len=1, pid = 18890, owner=4e6e35ff74ffffff, transport=0x10dd9690, , granted at Sat Mar 10 03:02:41 2012 posixlk.posixlk[3](ACTIVE)=type=READ, whence=0, start=0, len=0, pid = 18899, owner=4e6e35ff74ffffff, transport=0x10feec10, , granted at Sat Ma r 10 03:09:59 2012 posixlk.posixlk[4](BLOCKED)=type=WRITE, whence=0, start=96, len=1, pid = 18899, owner=4e6e35ff74ffffff, transport=0x10feec10, , blocked at Sat Mar 10 03:09:59 2012 [1./export-xfs/mirror.itable.active.2000] gfid=23599b35-4d8b-401c-a493-768a0c19fad8 nlookup=582 ref=5 ia_type=2 [xlator.features.locks.mirror-locks.inode] path=/samba/lib/ccan/tally mandatory=0 [1./export-xfs/mirror.itable.active.2001] gfid=32121b6f-1080-4410-9022-469dd3f46dc6 nlookup=576 ref=4 : server logs where blocked locks were found: 2012-03-10 02:57:11.368580] I [server.c:52:grace_time_handler] 0-mirror-server: grace timer expired [2012-03-10 02:57:11.368681] I [server-helpers.c:630:server_connection_destroy] 0-mirror-server: destroyed connection of node131-19679-2012/03 /10-02:49:10:171926-mirror-client-3-0 [2012-03-10 02:57:12.368882] I [server.c:52:grace_time_handler] 0-mirror-server: grace timer expired [2012-03-10 02:57:12.368976] I [server-helpers.c:630:server_connection_destroy] 0-mirror-server: destroyed connection of node130-17561-2012/03 /10-02:49:31:050508-mirror-client-3-0 [2012-03-10 02:57:12.368991] I [server.c:52:grace_time_handler] 0-mirror-server: grace timer expired [2012-03-10 02:57:12.369004] I [server-helpers.c:630:server_connection_destroy] 0-mirror-server: destroyed connection of RHEL6.1-8777-2012/03/ 10-02:49:47:085068-mirror-client-3-0 [2012-03-10 03:02:20.755382] I [server.c:622:server_rpc_notify] 0-mirror-server: disconnecting connectionfrom 10.1.11.104:997 [2012-03-10 03:02:20.755525] I [server.c:631:server_rpc_notify] 0-mirror-server: starting a grace timer for tcp.mirror-server [2012-03-10 03:02:31.408647] I [server.c:52:grace_time_handler] 0-mirror-server: grace timer expired [2012-03-10 03:02:31.408772] I [server-helpers.c:474:do_fd_cleanup] 0-mirror-server: fd cleanup on /file [2012-03-10 03:02:31.408810] I [server-helpers.c:474:do_fd_cleanup] 0-mirror-server: fd cleanup on /in [2012-03-10 03:02:31.409164] I [server-helpers.c:630:server_connection_destroy] 0-mirror-server: destroyed connection of RHEL6.1-2225-2012/03/09-02:06:40:494326-mirror-client-3-18 [2012-03-10 03:02:35.888654] I [server.c:622:server_rpc_notify] 0-mirror-server: disconnecting connectionfrom 10.1.11.131:1013 [2012-03-10 03:02:35.888720] I [server.c:631:server_rpc_notify] 0-mirror-server: starting a grace timer for tcp.mirror-server [2012-03-10 03:02:36.947712] I [glusterfsd-mgmt.c:64:mgmt_cbk_spec] 0-mgmt: Volume file changed [2012-03-10 03:02:36.974833] I [server.c:622:server_rpc_notify] 0-mirror-server: disconnecting connectionfrom 10.1.11.130:1009 [2012-03-10 03:02:36.974866] I [server.c:631:server_rpc_notify] 0-mirror-server: starting a grace timer for tcp.mirror-server [2012-03-10 03:02:36.979345] I [server.c:622:server_rpc_notify] 0-mirror-server: disconnecting connectionfrom 10.1.11.145:1017 [2012-03-10 03:02:36.979400] I [server.c:631:server_rpc_notify] 0-mirror-server: starting a grace timer for tcp.mirror-server [2012-03-10 03:02:38.003381] I [glusterfsd-mgmt.c:64:mgmt_cbk_spec] 0-mgmt: Volume file changed [2012-03-10 03:02:38.003455] I [glusterfsd-mgmt.c:1299:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing [2012-03-10 03:02:38.003785] I [glusterfsd-mgmt.c:1299:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing [2012-03-10 03:02:41.882657] I [server-handshake.c:569:server_setvolume] 0-mirror-server: accepted client from 10.1.11.104:993 (version: 3.3.0qa26) [2012-03-10 03:02:41.976401] I [server-handshake.c:569:server_setvolume] 0-mirror-server: accepted client from 10.1.11.131:1011 (version: 3.3.0qa26) [2012-03-10 03:02:42.020283] I [server-handshake.c:569:server_setvolume] 0-mirror-server: accepted client from 10.1.11.130:1008 (version: 3.3.0qa26) [2012-03-10 03:02:42.031175] I [server-handshake.c:569:server_setvolume] 0-mirror-server: accepted client from 10.1.11.145:1013 (version: 3.3.0qa26) [2012-03-10 03:02:46.410339] I [server.c:52:grace_time_handler] 0-mirror-server: grace timer expired [2012-03-10 03:02:46.410444] I [server-helpers.c:630:server_connection_destroy] 0-mirror-server: destroyed connection of node131-19694-2012/03/10-02:56:25:690380-mirror-client-3-0 [2012-03-10 03:02:47.410582] I [server.c:52:grace_time_handler] 0-mirror-server: grace timer expired [2012-03-10 03:02:47.410693] I [server-helpers.c:630:server_connection_destroy] 0-mirror-server: destroyed connection of node130-17622-2012/03/10-02:56:46:569325-mirror-client-3-0 [2012-03-10 03:02:47.410721] I [server.c:52:grace_time_handler] 0-mirror-server: grace timer expired [2012-03-10 03:02:47.410740] I [server-helpers.c:630:server_connection_destroy] 0-mirror-server: destroyed connection of RHEL6.1-8787-2012/03/10-02:57:02:665588-mirror-client-3-0 : Also fuse client is logging too much about the registering grace timer and grce-timer getting expired. 2012-03-10 08:15:51.119192] W [client.c:2011:client_rpc_notify] 46-mirror-client-2: Registering a grace timer [2012-03-10 08:15:51.122198] W [client.c:2011:client_rpc_notify] 42-mirror-client-2: Registering a grace timer [2012-03-10 08:15:51.125312] W [client.c:2011:client_rpc_notify] 40-mirror-client-2: Registering a grace timer [2012-03-10 08:15:51.127181] W [client.c:2011:client_rpc_notify] 38-mirror-client-2: Registering a grace timer [2012-03-10 08:15:51.130176] W [client.c:2011:client_rpc_notify] 32-mirror-client-2: Registering a grace timer [2012-03-10 08:15:51.133261] W [client.c:2011:client_rpc_notify] 31-mirror-client-2: Registering a grace timer [2012-03-10 08:15:51.136168] W [client.c:2011:client_rpc_notify] 34-mirror-client-2: Registering a grace timer [2012-03-10 08:15:51.139166] W [client.c:2011:client_rpc_notify] 33-mirror-client-2: Registering a grace timer [2012-03-10 08:15:51.141172] W [client.c:2011:client_rpc_notify] 28-mirror-client-2: Registering a grace timer [2012-03-10 08:15:51.144165] W [client.c:2011:client_rpc_notify] 23-mirror-client-2: Registering a grace timer [2012-03-10 08:15:51.147170] W [client.c:2011:client_rpc_notify] 22-mirror-client-2: Registering a grace timer [2012-03-10 08:15:51.150174] W [client.c:2011:client_rpc_notify] 16-mirror-client-2: Registering a grace timer [2012-03-10 08:15:51.153377] W [client.c:2011:client_rpc_notify] 29-mirror-client-2: Registering a grace timer [2012-03-10 08:15:51.155177] W [client.c:2011:client_rpc_notify] 24-mirror-client-2: Registering a grace timer [2012-03-10 08:15:51.158166] W [client.c:2011:client_rpc_notify] 52-mirror-client-2: Registering a grace timer [2012-03-10 08:15:51.161253] W [client.c:2011:client_rpc_notify] 21-mirror-client-2: Registering a grace timer [2012-03-10 08:15:51.164197] W [client.c:2011:client_rpc_notify] 13-mirror-client-2: Registering a grace timer [2012-03-10 08:15:51.166181] W [client.c:2011:client_rpc_notify] 9-mirror-client-2: Registering a grace timer [2012-03-10 08:15:52.172189] W [client.c:2011:client_rpc_notify] 54-mirror-client-2: Registering a grace timer [2012-03-10 08:15:52.402671] W [client.c:112:client_grace_timeout] 35-mirror-client-2: client grace timer expired, updating the lk-version to 562 [2012-03-10 08:15:52.402722] W [client.c:112:client_grace_timeout] 3-mirror-client-2: client grace timer expired, updating the lk-version to 1881 [2012-03-10 08:15:52.404982] W [client.c:112:client_grace_timeout] 62-mirror-client-2: client grace timer expired, updating the lk-version to 17 [2012-03-10 08:15:53.178180] W [client.c:2011:client_rpc_notify] 47-mirror-client-2: Registering a grace timer [2012-03-10 08:15:53.410586] W [client.c:112:client_grace_timeout] 57-mirror-client-2: client grace timer expired, updating the lk-version to 118 [2012-03-10 08:15:54.184213] W [client.c:2011:client_rpc_notify] 45-mirror-client-2: Registering a grace timer [2012-03-10 08:15:54.186190] W [client.c:2011:client_rpc_notify] 44-mirror-client-2: Registering a grace timer [2012-03-10 08:15:54.189204] W [client.c:2011:client_rpc_notify] 27-mirror-client-2: Registering a grace timer [2012-03-10 08:15:54.192202] W [client.c:2011:client_rpc_notify] 0-mirror-client-2: Registering a grace timer [2012-03-10 08:15:54.194209] W [client.c:2011:client_rpc_notify] 25-mirror-client-2: Registering a grace timer [2012-03-10 08:15:54.197352] W [client.c:2011:client_rpc_notify] 19-mirror-client-2: Registering a grace timer [2012-03-10 08:15:54.200194] W [client.c:2011:client_rpc_notify] 20-mirror-client-2: Registering a grace timer [2012-03-10 08:15:54.202200] W [client.c:2011:client_rpc_notify] 17-mirror-client-2: Registering a grace timer [2012-03-10 08:15:54.205181] W [client.c:2011:client_rpc_notify] 10-mirror-client-2: Registering a grace timer [2012-03-10 08:15:54.208181] W [client.c:2011:client_rpc_notify] 15-mirror-client-2: Registering a grace timer [2012-03-10 08:15:54.210215] W [client.c:2011:client_rpc_notify] 7-mirror-client-2: Registering a grace timer [2012-03-10 08:15:54.416538] W [client.c:112:client_grace_timeout] 39-mirror-client-2: client grace timer expired, updating the lk-version to 646 [2012-03-10 08:15:54.416586] W [client.c:112:client_grace_timeout] 58-mirror-client-2: client grace timer expired, updating the lk-version to 105
please update these bugs w.r.to 3.3.0qa27, need to work on it as per target milestone set.
Fixed as part of bug 810502. Can you please confirm? Thanks, Vijay
Checked on release-3.3 (git head: 638a4740cc553c96bc01d1dfe4a2b7acf0b406e6). Ran ping_pong from 2 fuse client parallely. Repeated the tests multiple times with both lock-heal on and off. Did not hit the issue and ping_pong completed without hang evertytime on all the mounts. Can be closed.