Bug 1094608 - Peer is disconnected and reconnected every 30 seconds
Summary: Peer is disconnected and reconnected every 30 seconds
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: glusterd
Version: rhgs-3.0
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: RHGS 3.0.0
Assignee: krishnan parthasarathi
QA Contact: SATHEESARAN
URL:
Whiteboard:
Depends On:
Blocks: 1094655
Reported: 2014-05-06 06:48 UTC by Rahul Hinduja
Modified: 2015-05-13 17:01 UTC (History)

Fixed In Version: glusterfs-3.6.0-1.0.el6rhs
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 1094655
Environment:
Last Closed: 2014-09-22 19:36:26 UTC
Embargoed:


Links
System: Red Hat Product Errata
ID: RHEA-2014:1278
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Red Hat Storage Server 3.0 bug fix and enhancement update
Last Updated: 2014-09-22 23:26:55 UTC

Description Rahul Hinduja 2014-05-06 06:48:33 UTC
Description of problem:
=======================

Probing a server succeeds, but the peer then disconnects and reconnects every 30 seconds.

[root@snapshot09 ~]# gluster peer status
Number of Peers: 1

Hostname: snapshot10.lab.eng.blr.redhat.com
Uuid: 0fc68842-27a2-452c-8d4d-6fc8fdfce94a
State: Peer in Cluster (Connected)
[root@snapshot09 ~]# 

After 30 seconds:

[root@snapshot09 ~]# gluster peer status
Number of Peers: 1

Hostname: snapshot10.lab.eng.blr.redhat.com
Uuid: 0fc68842-27a2-452c-8d4d-6fc8fdfce94a
State: Peer in Cluster (Disconnected)
[root@snapshot09 ~]# 

Again,

[root@snapshot09 ~]# gluster peer status
Number of Peers: 1

Hostname: snapshot10.lab.eng.blr.redhat.com
Uuid: 0fc68842-27a2-452c-8d4d-6fc8fdfce94a
State: Peer in Cluster (Connected)
[root@snapshot09 ~]# 
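
The three outputs above differ only in the State line. As a rough sketch (not part of glusterd or the gluster CLI), the Python below parses text in the `gluster peer status` layout shown above and reports whether each peer is connected. The function name `parse_peer_status` and the dictionary keys are hypothetical, chosen for illustration; the sample text is copied from the second output above.

```python
import re


def parse_peer_status(text):
    """Parse text shaped like `gluster peer status` output into peer dicts.

    Hypothetical helper: it assumes the Hostname/Uuid/State layout shown
    in this report and nothing more.
    """
    peers = []
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Hostname:"):
            # A Hostname line starts a new peer record.
            peers.append({"hostname": line.split(":", 1)[1].strip()})
        elif line.startswith("Uuid:") and peers:
            peers[-1]["uuid"] = line.split(":", 1)[1].strip()
        elif line.startswith("State:") and peers:
            # e.g. "State: Peer in Cluster (Disconnected)"
            m = re.match(r"State:\s*(.*?)\s*\((\w+)\)\s*$", line)
            if m:
                peers[-1]["state"] = m.group(1)
                peers[-1]["connected"] = m.group(2) == "Connected"
    return peers


SAMPLE = """Number of Peers: 1

Hostname: snapshot10.lab.eng.blr.redhat.com
Uuid: 0fc68842-27a2-452c-8d4d-6fc8fdfce94a
State: Peer in Cluster (Disconnected)
"""

if __name__ == "__main__":
    for peer in parse_peer_status(SAMPLE):
        status = "connected" if peer["connected"] else "DISCONNECTED"
        print(peer["hostname"], status)
```

Feeding it the output of repeated `gluster peer status` runs would make the Connected/Disconnected flapping easy to spot without reading each block by eye.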

Version-Release number of selected component (if applicable):
==============================================================

glusterfs-3.5qa2-0.425.git9360107.el6rhs.x86_64


How reproducible:
==================
1/1


Steps to Reproduce:
===================
1. Probe a server 
2. Check glusterd logs or peer status


Actual results:
===============

The peer is disconnected and then reconnected, repeatedly.


Expected results:
=================

Peer should not be disconnected.


Additional info:
================

Log snippet:

[2014-05-06 14:29:02.034813] C [rpc-clnt-ping.c:105:rpc_clnt_ping_timer_expired] 0-management: server 10.70.44.63:24007 has not responded in the last 30 seconds, disconnecting.
[2014-05-06 14:29:11.839313] I [glusterd-handshake.c:712:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 4
[2014-05-06 14:29:11.853211] I [glusterd-handler.c:2301:__glusterd_handle_incoming_friend_req] 0-glusterd: Received probe from uuid: 0fc68842-27a2-452c-8d4d-6fc8fdfce94a
[2014-05-06 14:29:12.046725] I [glusterd-handler.c:3336:glusterd_xfer_friend_add_resp] 0-glusterd: Responded to snapshot10.lab.eng.blr.redhat.com (0), ret: 0
[2014-05-06 14:29:12.057924] I [glusterd-sm.c:495:glusterd_ac_send_friend_update] 0-: Added uuid: 0fc68842-27a2-452c-8d4d-6fc8fdfce94a, host: snapshot10.lab.eng.blr.redhat.com
[2014-05-06 14:29:12.066560] I [glusterd-rpc-ops.c:556:__glusterd_friend_update_cbk] 0-management: Received ACC from uuid: 0fc68842-27a2-452c-8d4d-6fc8fdfce94a
[2014-05-06 14:29:12.089355] I [glusterd-rpc-ops.c:359:__glusterd_friend_add_cbk] 0-glusterd: Received ACC from uuid: 0fc68842-27a2-452c-8d4d-6fc8fdfce94a, host: snapshot10.lab.eng.blr.redhat.com, port: 0
[2014-05-06 14:29:12.099435] I [glusterd-handler.c:2463:__glusterd_handle_friend_update] 0-glusterd: Received friend update from uuid: 0fc68842-27a2-452c-8d4d-6fc8fdfce94a
[2014-05-06 14:29:12.099560] I [glusterd-handler.c:2508:__glusterd_handle_friend_update] 0-: Received uuid: 72f4a446-0ce8-4213-8584-0e4018de17a5, hostname:10.70.44.62
[2014-05-06 14:29:12.099600] I [glusterd-handler.c:2517:__glusterd_handle_friend_update] 0-: Received my uuid as Friend
[2014-05-06 14:29:42.048576] C [rpc-clnt-ping.c:105:rpc_clnt_ping_timer_expired] 0-management: server 10.70.44.63:24007 has not responded in the last 30 seconds, disconnecting.
[2014-05-06 14:29:51.855922] I [glusterd-handshake.c:712:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 4
[2014-05-06 14:29:51.869645] I [glusterd-handler.c:2301:__glusterd_handle_incoming_friend_req] 0-glusterd: Received probe from uuid: 0fc68842-27a2-452c-8d4d-6fc8fdfce94a
[2014-05-06 14:29:52.060106] I [glusterd-handler.c:3336:glusterd_xfer_friend_add_resp] 0-glusterd: Responded to snapshot10.lab.eng.blr.redhat.com (0), ret: 0
[2014-05-06 14:29:52.070335] I [glusterd-sm.c:495:glusterd_ac_send_friend_update] 0-: Added uuid: 0fc68842-27a2-452c-8d4d-6fc8fdfce94a, host: snapshot10.lab.eng.blr.redhat.com
[2014-05-06 14:29:52.076947] I [glusterd-rpc-ops.c:556:__glusterd_friend_update_cbk] 0-management: Received ACC from uuid: 0fc68842-27a2-452c-8d4d-6fc8fdfce94a
[2014-05-06 14:29:52.095418] I [glusterd-rpc-ops.c:359:__glusterd_friend_add_cbk] 0-glusterd: Received ACC from uuid: 0fc68842-27a2-452c-8d4d-6fc8fdfce94a, host: snapshot10.lab.eng.blr.redhat.com, port: 0
[2014-05-06 14:29:52.103927] I [glusterd-handler.c:2463:__glusterd_handle_friend_update] 0-glusterd: Received friend update from uuid: 0fc68842-27a2-452c-8d4d-6fc8fdfce94a
[2014-05-06 14:29:52.104059] I [glusterd-handler.c:2508:__glusterd_handle_friend_update] 0-: Received uuid: 72f4a446-0ce8-4213-8584-0e4018de17a5, hostname:10.70.44.62
[2014-05-06 14:29:52.104100] I [glusterd-handler.c:2517:__glusterd_handle_friend_update] 0-: Received my uuid as Friend
[2014-05-06 14:30:22.062289] C [rpc-clnt-ping.c:105:rpc_clnt_ping_timer_expired] 0-management: server 10.70.44.63:24007 has not responded in the last 30 seconds, disconnecting.
[2014-05-06 14:30:31.871789] I [glusterd-handshake.c:712:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 4
[2014-05-06 14:30:31.882941] I [glusterd-handler.c:2301:__glusterd_handle_incoming_friend_req] 0-glusterd: Received probe from uuid: 0fc68842-27a2-452c-8d4d-6fc8fdfce94a
[2014-05-06 14:30:32.074367] I [glusterd-handler.c:3336:glusterd_xfer_friend_add_resp] 0-glusterd: Responded to snapshot10.lab.eng.blr.redhat.com (0), ret: 0
[2014-05-06 14:30:32.083384] I [glusterd-sm.c:495:glusterd_ac_send_friend_update] 0-: Added uuid: 0fc68842-27a2-452c-8d4d-6fc8fdfce94a, host: snapshot10.lab.eng.blr.redhat.com
[2014-05-06 14:30:32.090139] I [glusterd-rpc-ops.c:556:__glusterd_friend_update_cbk] 0-management: Received ACC from uuid: 0fc68842-27a2-452c-8d4d-6fc8fdfce94a
[2014-05-06 14:30:32.114941] I [glusterd-rpc-ops.c:359:__glusterd_friend_add_cbk] 0-glusterd: Received ACC from uuid: 0fc68842-27a2-452c-8d4d-6fc8fdfce94a, host: snapshot10.lab.eng.blr.redhat.com, port: 0
[2014-05-06 14:30:32.124864] I [glusterd-handler.c:2463:__glusterd_handle_friend_update] 0-glusterd: Received friend update from uuid: 0fc68842-27a2-452c-8d4d-6fc8fdfce94a
[2014-05-06 14:30:32.124981] I [glusterd-handler.c:2508:__glusterd_handle_friend_update] 0-: Received uuid: 72f4a446-0ce8-4213-8584-0e4018de17a5, hostname:10.70.44.62
[2014-05-06 14:30:32.125012] I [glusterd-handler.c:2517:__glusterd_handle_friend_update] 0-: Received my uuid as Friend
[2014-05-06 14:31:02.076382] C [rpc-clnt-ping.c:105:rpc_clnt_ping_timer_expired] 0-management: server 10.70.44.63:24007 has not responded in the last 30 seconds, disconnecting.
[2014-05-06 14:31:12.435312] I [glusterd-rpc-ops.c:359:__glusterd_friend_add_cbk] 0-glusterd: Received ACC from uuid: 0fc68842-27a2-452c-8d4d-6fc8fdfce94a, host: snapshot10.lab.eng.blr.redhat.com, port: 0
[2014-05-06 14:31:12.444011] I [glusterd-handler.c:2463:__glusterd_handle_friend_update] 0-glusterd: Received friend update from uuid: 0fc68842-27a2-452c-8d4d-6fc8fdfce94a
[2014-05-06 14:31:12.444118] I [glusterd-handler.c:2508:__glusterd_handle_friend_update] 0-: Received uuid: 72f4a446-0ce8-4213-8584-0e4018de17a5, hostname:10.70.44.62
[2014-05-06 14:31:12.444145] I [glusterd-handler.c:2517:__glusterd_handle_friend_update] 0-: Received my uuid as Friend
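
To measure the cycle in the snippet above, one option is to pull the timestamps of the `rpc_clnt_ping_timer_expired` entries and take the differences. The Python below is a minimal sketch under that assumption; `ping_expiry_gaps` is a hypothetical name, and the regular expression only matches the log line format visible in this report. The sample lines are the four critical entries copied from the snippet.

```python
import re
from datetime import datetime

# Matches the critical ping-timer entries as they appear in this report.
PING_EXPIRED = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{6})\] C "
    r"\[rpc-clnt-ping\.c:\d+:rpc_clnt_ping_timer_expired\]"
)


def ping_expiry_gaps(log_text):
    """Return the seconds between consecutive ping-timer expiry entries."""
    times = []
    for line in log_text.splitlines():
        m = PING_EXPIRED.match(line)
        if m:
            times.append(datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S.%f"))
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]


SAMPLE_LOG = """\
[2014-05-06 14:29:02.034813] C [rpc-clnt-ping.c:105:rpc_clnt_ping_timer_expired] 0-management: server 10.70.44.63:24007 has not responded in the last 30 seconds, disconnecting.
[2014-05-06 14:29:42.048576] C [rpc-clnt-ping.c:105:rpc_clnt_ping_timer_expired] 0-management: server 10.70.44.63:24007 has not responded in the last 30 seconds, disconnecting.
[2014-05-06 14:30:22.062289] C [rpc-clnt-ping.c:105:rpc_clnt_ping_timer_expired] 0-management: server 10.70.44.63:24007 has not responded in the last 30 seconds, disconnecting.
[2014-05-06 14:31:02.076382] C [rpc-clnt-ping.c:105:rpc_clnt_ping_timer_expired] 0-management: server 10.70.44.63:24007 has not responded in the last 30 seconds, disconnecting.
"""

if __name__ == "__main__":
    for gap in ping_expiry_gaps(SAMPLE_LOG):
        print(f"{gap:.3f} s between ping-timer expiries")
```

On these four entries the gaps come out at about 40 seconds each. Going by the timestamps in the snippet, that is roughly 10 seconds from each disconnect to the next handshake, followed by the 30-second ping timeout.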

Comment 3 krishnan parthasarathi 2014-05-06 07:21:26 UTC
Posted patch at http://review.gluster.org/7678

Comment 4 Ben Turner 2014-05-08 18:15:18 UTC
I was able to create volumes on the latest bits; this looks fixed in glusterfs-3.6.0-1.0.el6rhs.x86_64.

Comment 5 Rahul Hinduja 2014-05-09 06:45:10 UTC
Verified with build: glusterfs-3.6.0-1.0.el6rhs.x86_64

Did not observe peer disconnect.

[root@snapshot09 ~]# date ; cat /var/log/glusterfs/etc-glusterfs-glusterd.vol.log  | grep " C "
Fri May  9 20:11:05 IST 2014
[root@snapshot09 ~]# date ; cat /var/log/glusterfs/etc-glusterfs-glusterd.vol.log  | grep "responded"
Fri May  9 20:11:40 IST 2014
[root@snapshot09 ~]# 

After 4 minutes:

[root@snapshot09 ~]# date ; cat /var/log/glusterfs/etc-glusterfs-glusterd.vol.log  | grep " C "
Fri May  9 20:15:07 IST 2014
[root@snapshot09 ~]# date ; cat /var/log/glusterfs/etc-glusterfs-glusterd.vol.log  | grep "responded"
Fri May  9 20:15:09 IST 2014
[root@snapshot09 ~]# 

Moving the bug to verified state.
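
The `grep " C "` check above matches the level letter anywhere in a line. A slightly stricter variant, sketched in Python below, matches only the level field that follows the timestamp. `lines_at_level` is a hypothetical helper, and the pattern assumes the `[timestamp] LEVEL [file:line:function]` layout seen in the log snippet in the description; the two sample lines are copied from that snippet.

```python
import re

# Captures the single-letter level that follows the bracketed timestamp.
LOG_LINE = re.compile(r"^\[[^\]]+\] ([A-Z]) \[")


def lines_at_level(log_text, level="C"):
    """Return the log lines whose level field equals `level`."""
    matched = []
    for line in log_text.splitlines():
        m = LOG_LINE.match(line)
        if m and m.group(1) == level:
            matched.append(line)
    return matched


SAMPLE_LOG = """\
[2014-05-06 14:29:02.034813] C [rpc-clnt-ping.c:105:rpc_clnt_ping_timer_expired] 0-management: server 10.70.44.63:24007 has not responded in the last 30 seconds, disconnecting.
[2014-05-06 14:29:11.839313] I [glusterd-handshake.c:712:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 4
"""

if __name__ == "__main__":
    critical = lines_at_level(SAMPLE_LOG, "C")
    print(len(critical), "critical line(s) found")
```

An empty result for the contents of /var/log/glusterfs/etc-glusterfs-glusterd.vol.log would correspond to the empty grep output shown above.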

Comment 6 Nagaprasad Sathyanarayana 2014-05-19 10:56:51 UTC
Setting flags required to add BZs to RHS 3.0 Errata

Comment 8 errata-xmlrpc 2014-09-22 19:36:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2014-1278.html

