Bug 1225371

Summary: peers connected in the middle of a transaction are participating in the transaction
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Avra Sengupta <asengupt>
Component: glusterd
Assignee: Avra Sengupta <asengupt>
Status: CLOSED ERRATA
QA Contact: Triveni Rao <trao>
Severity: unspecified
Docs Contact:
Priority: high
Version: unspecified
CC: annair, asengupt, ashah, asrivast, nlevinki, rcyriac, rjoseph, sashinde, vagarwal, vbellur
Target Milestone: ---
Keywords: Triaged
Target Release: RHGS 3.1.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: glusterfs-3.7.0-3.el6rhs.x86_64
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1224292
Environment:
Last Closed: 2015-07-29 04:53:21 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1224290, 1224292
Bug Blocks: 1202842

Description Avra Sengupta 2015-05-27 08:39:15 UTC
+++ This bug was initially created as a clone of Bug #1224292 +++

+++ This bug was initially created as a clone of Bug #1224290 +++

Description of problem:
Peers that have been probed and are awaiting connection can become connected in the middle of a transaction; such peers then participate in the transaction, causing unexpected behaviour in the operation.
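
For illustration, the race is between the asynchronous probe handshake and a transaction that starts concurrently. A rough sketch of the interleaving (host, volume, and snapshot names are placeholders; actually hitting the window depends on timing and is hard in practice, see comment 6):

# "peer probe" returns once the request is accepted; the handshake with
# the new peer completes asynchronously inside glusterd.
gluster peer probe 10.70.33.228 &

# If this transaction starts before the handshake completes, and the peer
# connects while the transaction is in flight, the peer can be pulled into
# a transaction it was never locked for.
gluster snapshot create racesnap cross3 no-timestamp

wait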

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 2 Avra Sengupta 2015-06-09 05:57:24 UTC
Fixed with http://review.gluster.org/10937

Comment 3 Triveni Rao 2015-07-05 11:49:16 UTC
Please provide a proper description of the problem and the steps, so it is clear what needs to be verified.

Comment 4 Triveni Rao 2015-07-06 10:34:43 UTC
This is related to client log messaging:

The message IDs were verified against their corresponding labels.

[2015-07-06 03:33:11.259462] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2015-07-06 03:33:11.270045] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with index 2
[2015-07-06 03:33:11.270614] I [MSGID: 114020] [client.c:2118:notify] 0-lot-client-0: parent translators are ready, attempting connect on transport
[2015-07-06 03:33:11.273115] I [MSGID: 114020] [client.c:2118:notify] 0-lot-client-1: parent translators are ready, attempting connect on transport


[root@casino-vm2 glusterfs]# rpm -qa | grep gluster
gluster-nagios-addons-0.2.3-1.el6rhs.x86_64
glusterfs-client-xlators-3.7.1-6.el6rhs.x86_64
glusterfs-server-3.7.1-6.el6rhs.x86_64
gluster-nagios-common-0.2.0-1.el6rhs.noarch
glusterfs-3.7.1-6.el6rhs.x86_64
glusterfs-api-3.7.1-6.el6rhs.x86_64
glusterfs-cli-3.7.1-6.el6rhs.x86_64
glusterfs-geo-replication-3.7.1-6.el6rhs.x86_64
vdsm-gluster-4.16.20-1.1.el6rhs.noarch
glusterfs-libs-3.7.1-6.el6rhs.x86_64
glusterfs-fuse-3.7.1-6.el6rhs.x86_64
glusterfs-rdma-3.7.1-6.el6rhs.x86_64
[root@casino-vm2 glusterfs]#

Comment 5 Triveni Rao 2015-07-06 10:36:43 UTC
Apologies, this got mixed up with another bug report.
Moving this back to ON_QA and setting the needinfo flag.

Comment 6 Avra Sengupta 2015-07-08 09:50:08 UTC
This issue was found during a code walkthrough session, and is very hard to hit in a normal scenario. We can try running gluster peer probe and gluster snapshot create at the same time, and it should not result in a crash.
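
For example, a rough verification sketch (the peer IP and volume name are taken from the transcript below; the loop count is arbitrary):

# Race probe/detach against snapshot create repeatedly to widen the window;
# glusterd should survive every iteration with no crash or hung transaction.
for i in $(seq 1 20); do
    gluster peer probe 10.70.33.228 &
    gluster snapshot create snap_$i cross3 no-timestamp &
    wait
    echo y | gluster snapshot delete snap_$i   # answer the y/n prompt
    gluster peer detach 10.70.33.228
done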

Comment 7 Triveni Rao 2015-07-09 09:51:42 UTC
Did not see any crashes with respect to glusterd or snapshot operations.


[root@rhs-client4 ~]# gluster snapshot create T_snap cross3 no-timestamp
snapshot create: success: Snap T_snap created successfully
[root@rhs-client4 ~]# 


[root@rhs-client4 ~]# gluster peer probe 10.70.33.228
peer probe: success. 
[root@rhs-client4 ~]# gluster peer status
Number of Peers: 8

Hostname: 10.70.36.22
Uuid: 956ccf69-1213-437e-a4df-9119f70673bb
State: Peer in Cluster (Connected)

Hostname: 10.70.36.21
Uuid: 40b28e62-049a-43d5-9996-810079a1c52c
State: Peer in Cluster (Connected)

Hostname: 10.70.36.5
Uuid: f23bcd77-050f-477e-85d3-23d5166373db
State: Peer in Cluster (Connected)

Hostname: 10.70.34.58
Uuid: 675c67a5-e615-4d18-a17a-6a9310cc2f39
State: Peer in Cluster (Connected)

Hostname: 10.70.36.33
Uuid: 00d62afe-e6bb-4a56-8e9e-92f55f04f8f3
State: Peer in Cluster (Connected)

Hostname: 10.70.36.63
Uuid: 42d988c8-e530-4124-84f8-4985f6ae1907
State: Peer in Cluster (Connected)

Hostname: 10.70.33.210
Uuid: 9b1659df-cba6-403a-ba76-16b154c5d363
State: Peer in Cluster (Connected)

Hostname: 10.70.33.228
Uuid: 995c6370-0905-4a45-9446-034360f3b786
State: Peer in Cluster (Connected)
[root@rhs-client4 ~]# 



[root@rhs-client4 ~]# gluster snapshot activate T_snap
Snapshot activate: T_snap: Snap activated successfully
[root@rhs-client4 ~]# 
[root@rhs-client4 ~]# 
[root@rhs-client4 ~]# gluster snapshot list
snap1_GMT-2015.07.03-00.22.49
snap2_after_corruption_GMT-2015.07.03-23.56.16
T_snap
[root@rhs-client4 ~]# 



[root@rhs-client4 ~]# rpm -qa | grep gluster
samba-vfs-glusterfs-4.1.17-7.el6rhs.x86_64
glusterfs-client-xlators-3.7.1-8.el6rhs.x86_64
glusterfs-server-3.7.1-8.el6rhs.x86_64
gluster-nagios-common-0.2.0-1.el6rhs.noarch
gluster-nagios-addons-0.2.4-2.el6rhs.x86_64
vdsm-gluster-4.16.20-1.2.el6rhs.noarch
python-gluster-3.7.1-6.el6rhs.x86_64
glusterfs-3.7.1-8.el6rhs.x86_64
glusterfs-api-3.7.1-8.el6rhs.x86_64
glusterfs-cli-3.7.1-8.el6rhs.x86_64
glusterfs-ganesha-3.7.1-8.el6rhs.x86_64
glusterfs-rdma-3.7.1-8.el6rhs.x86_64
nfs-ganesha-gluster-2.2.0-3.el6rhs.x86_64
glusterfs-libs-3.7.1-8.el6rhs.x86_64
glusterfs-fuse-3.7.1-8.el6rhs.x86_64
glusterfs-geo-replication-3.7.1-8.el6rhs.x86_64
[root@rhs-client4 ~]# 


[2015-07-09 09:38:59.444342] I [MSGID: 115029] [server-handshake.c:610:server_setvolume] 0-cross3-server: accepted client from rhs-client4.lab.eng.blr.redhat.com-29745-2015/07/09-09:38:58:149062-cross3-snapd-client-0-0 (version: 3.7.1)
[2015-07-09 09:39:55.958320] I [glusterfsd-mgmt.c:56:mgmt_cbk_spec] 0-mgmt: Volume file changed
[2015-07-09 09:39:56.560470] I [glusterfsd-mgmt.c:1512:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
[2015-07-09 09:53:26.383375] I [snapview-server-mgmt.c:27:mgmt_cbk_snap] 0-mgmt: list of snapshots changed
[2015-07-09 09:54:55.190400] I [snapview-server-mgmt.c:27:mgmt_cbk_snap] 0-mgmt: list of snapshots changed

Comment 8 errata-xmlrpc 2015-07-29 04:53:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-1495.html