Description of problem:
=======================
probe of a server is successful but every 30 secs it disconnects and reconnects.

[root@snapshot09 ~]# gluster peer status
Number of Peers: 1

Hostname: snapshot10.lab.eng.blr.redhat.com
Uuid: 0fc68842-27a2-452c-8d4d-6fc8fdfce94a
State: Peer in Cluster (Connected)
[root@snapshot09 ~]#

After 30 Secs:

[root@snapshot09 ~]# gluster peer status
Number of Peers: 1

Hostname: snapshot10.lab.eng.blr.redhat.com
Uuid: 0fc68842-27a2-452c-8d4d-6fc8fdfce94a
State: Peer in Cluster (Disconnected)
[root@snapshot09 ~]#

Again,

[root@snapshot09 ~]# gluster peer status
Number of Peers: 1

Hostname: snapshot10.lab.eng.blr.redhat.com
Uuid: 0fc68842-27a2-452c-8d4d-6fc8fdfce94a
State: Peer in Cluster (Connected)
[root@snapshot09 ~]#

Version-Release number of selected component (if applicable):
==============================================================
glusterfs-3.5qa2-0.425.git9360107.el6rhs.x86_64

How reproducible:
==================
1/1

Steps to Reproduce:
===================
1. Probe a server
2. Check glusterd logs or peer status

Actual results:
===============
Peer is disconnected and reconnected again

Expected results:
=================
Peer should not be disconnected.

Additional info:
================
Log snippet:

[2014-05-06 14:29:02.034813] C [rpc-clnt-ping.c:105:rpc_clnt_ping_timer_expired] 0-management: server 10.70.44.63:24007 has not responded in the last 30 seconds, disconnecting.
[2014-05-06 14:29:11.839313] I [glusterd-handshake.c:712:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 4
[2014-05-06 14:29:11.853211] I [glusterd-handler.c:2301:__glusterd_handle_incoming_friend_req] 0-glusterd: Received probe from uuid: 0fc68842-27a2-452c-8d4d-6fc8fdfce94a
[2014-05-06 14:29:12.046725] I [glusterd-handler.c:3336:glusterd_xfer_friend_add_resp] 0-glusterd: Responded to snapshot10.lab.eng.blr.redhat.com (0), ret: 0
[2014-05-06 14:29:12.057924] I [glusterd-sm.c:495:glusterd_ac_send_friend_update] 0-: Added uuid: 0fc68842-27a2-452c-8d4d-6fc8fdfce94a, host: snapshot10.lab.eng.blr.redhat.com
[2014-05-06 14:29:12.066560] I [glusterd-rpc-ops.c:556:__glusterd_friend_update_cbk] 0-management: Received ACC from uuid: 0fc68842-27a2-452c-8d4d-6fc8fdfce94a
[2014-05-06 14:29:12.089355] I [glusterd-rpc-ops.c:359:__glusterd_friend_add_cbk] 0-glusterd: Received ACC from uuid: 0fc68842-27a2-452c-8d4d-6fc8fdfce94a, host: snapshot10.lab.eng.blr.redhat.com, port: 0
[2014-05-06 14:29:12.099435] I [glusterd-handler.c:2463:__glusterd_handle_friend_update] 0-glusterd: Received friend update from uuid: 0fc68842-27a2-452c-8d4d-6fc8fdfce94a
[2014-05-06 14:29:12.099560] I [glusterd-handler.c:2508:__glusterd_handle_friend_update] 0-: Received uuid: 72f4a446-0ce8-4213-8584-0e4018de17a5, hostname:10.70.44.62
[2014-05-06 14:29:12.099600] I [glusterd-handler.c:2517:__glusterd_handle_friend_update] 0-: Received my uuid as Friend
[2014-05-06 14:29:42.048576] C [rpc-clnt-ping.c:105:rpc_clnt_ping_timer_expired] 0-management: server 10.70.44.63:24007 has not responded in the last 30 seconds, disconnecting.
[2014-05-06 14:29:51.855922] I [glusterd-handshake.c:712:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 4
[2014-05-06 14:29:51.869645] I [glusterd-handler.c:2301:__glusterd_handle_incoming_friend_req] 0-glusterd: Received probe from uuid: 0fc68842-27a2-452c-8d4d-6fc8fdfce94a
[2014-05-06 14:29:52.060106] I [glusterd-handler.c:3336:glusterd_xfer_friend_add_resp] 0-glusterd: Responded to snapshot10.lab.eng.blr.redhat.com (0), ret: 0
[2014-05-06 14:29:52.070335] I [glusterd-sm.c:495:glusterd_ac_send_friend_update] 0-: Added uuid: 0fc68842-27a2-452c-8d4d-6fc8fdfce94a, host: snapshot10.lab.eng.blr.redhat.com
[2014-05-06 14:29:52.076947] I [glusterd-rpc-ops.c:556:__glusterd_friend_update_cbk] 0-management: Received ACC from uuid: 0fc68842-27a2-452c-8d4d-6fc8fdfce94a
[2014-05-06 14:29:52.095418] I [glusterd-rpc-ops.c:359:__glusterd_friend_add_cbk] 0-glusterd: Received ACC from uuid: 0fc68842-27a2-452c-8d4d-6fc8fdfce94a, host: snapshot10.lab.eng.blr.redhat.com, port: 0
[2014-05-06 14:29:52.103927] I [glusterd-handler.c:2463:__glusterd_handle_friend_update] 0-glusterd: Received friend update from uuid: 0fc68842-27a2-452c-8d4d-6fc8fdfce94a
[2014-05-06 14:29:52.104059] I [glusterd-handler.c:2508:__glusterd_handle_friend_update] 0-: Received uuid: 72f4a446-0ce8-4213-8584-0e4018de17a5, hostname:10.70.44.62
[2014-05-06 14:29:52.104100] I [glusterd-handler.c:2517:__glusterd_handle_friend_update] 0-: Received my uuid as Friend
[2014-05-06 14:30:22.062289] C [rpc-clnt-ping.c:105:rpc_clnt_ping_timer_expired] 0-management: server 10.70.44.63:24007 has not responded in the last 30 seconds, disconnecting.
[2014-05-06 14:30:31.871789] I [glusterd-handshake.c:712:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 4
[2014-05-06 14:30:31.882941] I [glusterd-handler.c:2301:__glusterd_handle_incoming_friend_req] 0-glusterd: Received probe from uuid: 0fc68842-27a2-452c-8d4d-6fc8fdfce94a
[2014-05-06 14:30:32.074367] I [glusterd-handler.c:3336:glusterd_xfer_friend_add_resp] 0-glusterd: Responded to snapshot10.lab.eng.blr.redhat.com (0), ret: 0
[2014-05-06 14:30:32.083384] I [glusterd-sm.c:495:glusterd_ac_send_friend_update] 0-: Added uuid: 0fc68842-27a2-452c-8d4d-6fc8fdfce94a, host: snapshot10.lab.eng.blr.redhat.com
[2014-05-06 14:30:32.090139] I [glusterd-rpc-ops.c:556:__glusterd_friend_update_cbk] 0-management: Received ACC from uuid: 0fc68842-27a2-452c-8d4d-6fc8fdfce94a
[2014-05-06 14:30:32.114941] I [glusterd-rpc-ops.c:359:__glusterd_friend_add_cbk] 0-glusterd: Received ACC from uuid: 0fc68842-27a2-452c-8d4d-6fc8fdfce94a, host: snapshot10.lab.eng.blr.redhat.com, port: 0
[2014-05-06 14:30:32.124864] I [glusterd-handler.c:2463:__glusterd_handle_friend_update] 0-glusterd: Received friend update from uuid: 0fc68842-27a2-452c-8d4d-6fc8fdfce94a
[2014-05-06 14:30:32.124981] I [glusterd-handler.c:2508:__glusterd_handle_friend_update] 0-: Received uuid: 72f4a446-0ce8-4213-8584-0e4018de17a5, hostname:10.70.44.62
[2014-05-06 14:30:32.125012] I [glusterd-handler.c:2517:__glusterd_handle_friend_update] 0-: Received my uuid as Friend
[2014-05-06 14:31:02.076382] C [rpc-clnt-ping.c:105:rpc_clnt_ping_timer_expired] 0-management: server 10.70.44.63:24007 has not responded in the last 30 seconds, disconnecting.
[2014-05-06 14:31:12.435312] I [glusterd-rpc-ops.c:359:__glusterd_friend_add_cbk] 0-glusterd: Received ACC from uuid: 0fc68842-27a2-452c-8d4d-6fc8fdfce94a, host: snapshot10.lab.eng.blr.redhat.com, port: 0
[2014-05-06 14:31:12.444011] I [glusterd-handler.c:2463:__glusterd_handle_friend_update] 0-glusterd: Received friend update from uuid: 0fc68842-27a2-452c-8d4d-6fc8fdfce94a
[2014-05-06 14:31:12.444118] I [glusterd-handler.c:2508:__glusterd_handle_friend_update] 0-: Received uuid: 72f4a446-0ce8-4213-8584-0e4018de17a5, hostname:10.70.44.62
[2014-05-06 14:31:12.444145] I [glusterd-handler.c:2517:__glusterd_handle_friend_update] 0-: Received my uuid as Friend
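
For anyone triaging a similar log, here is a rough Python sketch that pulls out the ping-timer expiry entries and measures the time between them. It is only an illustration, not part of glusterfs: the function names and the regular expression are my own, and the regex assumes the "[timestamp] LEVEL [file:line:function]" prefix seen in the snippet above. In that snippet the expiries come about 40 seconds apart, which fits a 30 second ping timeout followed by a reconnect roughly 10 seconds later.

```python
import re
from datetime import datetime

# Assumed prefix layout, taken from the log snippet in this report:
# [YYYY-MM-DD HH:MM:SS.ffffff] LEVEL [file:line:function] ...
LOG_PREFIX = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\] "
    r"(?P<level>[A-Z]) \[(?P<src>[^\]]+)\]"
)


def ping_expiry_times(lines):
    """Return timestamps of level-C rpc_clnt_ping_timer_expired entries."""
    times = []
    for line in lines:
        match = LOG_PREFIX.match(line.strip())
        if match is None:
            continue  # skip anything that does not look like a log entry
        is_critical = match.group("level") == "C"
        is_ping_expiry = "rpc_clnt_ping_timer_expired" in match.group("src")
        if is_critical and is_ping_expiry:
            times.append(
                datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S.%f")
            )
    return times


def gaps_in_seconds(times):
    """Return the gap, in seconds, between each pair of consecutive timestamps."""
    return [round((b - a).total_seconds(), 3) for a, b in zip(times, times[1:])]


# Three entries copied from the log snippet in this report.
SAMPLE = [
    "[2014-05-06 14:29:02.034813] C [rpc-clnt-ping.c:105:rpc_clnt_ping_timer_expired] "
    "0-management: server 10.70.44.63:24007 has not responded in the last 30 seconds, disconnecting.",
    "[2014-05-06 14:29:11.839313] I [glusterd-handshake.c:712:__glusterd_mgmt_hndsk_versions_ack] "
    "0-management: using the op-version 4",
    "[2014-05-06 14:29:42.048576] C [rpc-clnt-ping.c:105:rpc_clnt_ping_timer_expired] "
    "0-management: server 10.70.44.63:24007 has not responded in the last 30 seconds, disconnecting.",
]

if __name__ == "__main__":
    print(gaps_in_seconds(ping_expiry_times(SAMPLE)))
```

A steady gap between expiries points at the ping timer itself rather than at intermittent network trouble, which would normally give irregular gaps.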
Posted patch at http://review.gluster.org/7678
I was able to create volumes on the latest bits; this looks fixed in glusterfs-3.6.0-1.0.el6rhs.x86_64.
Verified with build: glusterfs-3.6.0-1.0.el6rhs.x86_64

Did not observe peer disconnect.

[root@snapshot09 ~]# date ; cat /var/log/glusterfs/etc-glusterfs-glusterd.vol.log | grep " C "
Fri May 9 20:11:05 IST 2014
[root@snapshot09 ~]# date ; cat /var/log/glusterfs/etc-glusterfs-glusterd.vol.log | grep "responded"
Fri May 9 20:11:40 IST 2014
[root@snapshot09 ~]#

After 4 mins

[root@snapshot09 ~]# date ; cat /var/log/glusterfs/etc-glusterfs-glusterd.vol.log | grep " C "
Fri May 9 20:15:07 IST 2014
[root@snapshot09 ~]# date ; cat /var/log/glusterfs/etc-glusterfs-glusterd.vol.log | grep "responded"
Fri May 9 20:15:09 IST 2014
[root@snapshot09 ~]#

Moving the bug to verified state.
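
If the same check needs to be repeated on other nodes, the two greps can be approximated with a small Python helper. This is only a sketch under my own naming (`suspicious_lines` and `log_is_clean` are hypothetical, not gluster tooling). It does the same case-sensitive substring match as the greps, so the informational "Responded to ..." entries are not flagged.

```python
def suspicious_lines(lines, needles=(" C ", "responded")):
    """Return the lines containing any of the substrings the greps look for.

    The match is case-sensitive, like plain grep, so "Responded to ..."
    entries at level I are not reported.
    """
    return [
        line.rstrip("\n")
        for line in lines
        if any(needle in line for needle in needles)
    ]


def log_is_clean(path):
    """Return True when the log at `path` has no matching lines."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return not suspicious_lines(handle)


if __name__ == "__main__":
    # Log path as used in the verification commands in this report.
    print(log_is_clean("/var/log/glusterfs/etc-glusterfs-glusterd.vol.log"))
```

An empty result only shows that no ping-timer disconnect was logged in that file so far; the log should cover at least a few timeout periods before the check means much.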
Setting flags required to add BZs to RHS 3.0 Errata
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHEA-2014-1278.html