Bug 987462

Summary: Dist-geo-rep: geo-rep status says the session is stopped, but start says the session is not created
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: M S Vishwanath Bhat <vbhat>
Component: geo-replication
Assignee: Avra Sengupta <asengupt>
Status: CLOSED ERRATA
QA Contact: M S Vishwanath Bhat <vbhat>
Severity: medium
Docs Contact:
Priority: high
Version: 2.1
CC: aavati, amarts, csaba, mzywusko, rhs-bugs
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Fixed In Version: glusterfs-3.4.0.15rhs-1
Doc Type: Bug Fix
Last Closed: 2013-09-23 22:38:45 UTC
Type: Bug

Description M S Vishwanath Bhat 2013-07-23 12:37:02 UTC
Description of problem:
After an rpm upgrade from beta3 to beta5, the output of geo-rep status is inconsistent with the error message returned by geo-rep start.


I had a geo-rep session running with beta3 rpms. I stopped the geo-rep session and then stopped both the master and slave volumes. I then upgraded the rpms and restarted the volumes. Now geo-rep status says the session is stopped, but start says the session has not been created.


Version-Release number of selected component (if applicable):
glusterfs-3.4.0.12rhs.beta5-2.el6rhs.x86_64

How reproducible:
Hit once, not sure if 100% reproducible.


Steps to Reproduce:
1. Create and start a master and slave volume.
2. Create and start a geo-rep session between master and slave.
3. Stop the geo-rep session, then stop the master and slave volumes.
4. Stop glusterd on all nodes.
5. Upgrade the rpms from beta3 to beta5.
6. Now start the master and slave volumes.
7. Run geo-rep start
8. Run geo-rep status
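
The steps above can be sketched as a script. This is only an illustration for a single master node: the volume names come from this report, while the `run` helper, the `DRY_RUN` switch, and the use of `service glusterd stop/start` are assumptions that are not part of the report. The rpm upgrade itself (step 5) is left as a comment because the report does not say how it was performed.

```shell
# Names taken from the report; override through the environment if needed.
MASTER=${MASTER:-hosa-master}
SLAVE=${SLAVE:-falcon::hosa-slave}

# Hypothetical helper: with DRY_RUN=1 (the default) commands are only printed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

repro_steps() {
    # Steps 1-2 (volumes and geo-rep session created and started) are assumed done.
    # Step 3: stop the session, then the master volume. The slave volume is
    # stopped the same way on the slave cluster.
    run gluster volume geo-replication "$MASTER" "$SLAVE" stop
    run gluster --mode=script volume stop "$MASTER"
    # Step 4: stop glusterd (assumed init-script name; repeat on every node).
    run service glusterd stop
    # Step 5: upgrade the rpms from beta3 to beta5 here (method not given in the report).
    # Assumption: glusterd has to be running again before the volumes can start.
    run service glusterd start
    # Step 6: start the master volume (and the slave volume on the slave cluster).
    run gluster volume start "$MASTER"
    # Steps 7-8: the two commands whose results disagree.
    run gluster volume geo-replication "$MASTER" "$SLAVE" start
    run gluster volume geo-replication "$MASTER" "$SLAVE" status
}
```

Calling `repro_steps` prints the command sequence; `DRY_RUN=0 repro_steps` would execute it on the current node.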


Actual results:

[root@spitfire glusterfs-deploy-scripts]# gluster v geo hosa-master falcon::hosa-slave start
Session between hosa-master and falcon::hosa-slave has not been created. Please create session and retry.
geo-replication command failed


[root@spitfire glusterfs-deploy-scripts]# gluster v geo hosa-master falcon::hosa-slave status
NODE                       MASTER         SLAVE                 HEALTH     UPTIME       
----------------------------------------------------------------------------------------
spitfire.blr.redhat.com    hosa-master    falcon::hosa-slave    Stopped    N/A          
harrier.blr.redhat.com     hosa-master    falcon::hosa-slave    Stopped    N/A          
mustang.blr.redhat.com     hosa-master    falcon::hosa-slave    Stopped    N/A          
typhoon.blr.redhat.com     hosa-master    falcon::hosa-slave    Stopped    N/A        



Expected results:
A geo-rep session that was stopped via the CLI should be restartable via start.

Upgrade of rpms should not affect it. There should be consistency between the output of geo-rep status and the output of geo-rep start.
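
The expected consistency can be expressed as a small check over captured CLI output. This is a sketch only: the function names `health_values` and `status_start_consistent` are made up for illustration, and the parsing assumes the five-column status table shown under "Actual results".

```shell
# Print the distinct HEALTH values (4th column) of a status table read on stdin,
# skipping the header row and the dashed separator.
health_values() {
    awk '$1 != "NODE" && $1 !~ /^-+$/ && NF >= 4 { print $4 }' | sort -u
}

# $1 = captured output of "status", $2 = captured output of "start".
# Fails (non-zero) when start claims the session was never created although
# status lists at least one node for that session.
status_start_consistent() {
    rows=$(printf '%s\n' "$1" | health_values | wc -l)
    case $2 in
        *"has not been created"*)
            [ "$rows" -eq 0 ] ;;
        *)
            return 0 ;;
    esac
}
```

Given the two outputs from this report, `status_start_consistent "$status_out" "$start_out"` would fail, because status lists four nodes as Stopped while start reports that no session exists.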

Additional info:
These error messages are present in the glusterd log file.


[2013-07-23 11:56:57.714262] I [run.c:190:runner_log] 0-management: Ran script: /var/lib/glusterd/hooks/1/start/post/S29CTDBsetup.sh --volname=hosa-master --first=yes
[2013-07-23 11:56:57.787916] I [run.c:190:runner_log] 0-management: Ran script: /var/lib/glusterd/hooks/1/start/post/S30samba-start.sh --volname=hosa-master --first=yes
[2013-07-23 11:57:24.577997] I [glusterd-geo-rep.c:2495:glusterd_get_gsync_status_mst_slv] 0-: geo-replication status hosa-master falcon::hosa-slave :session is not active
[2013-07-23 11:57:29.716895] E [glusterd-geo-rep.c:1132:glusterd_op_verify_gsync_start_options] 0-: Session between hosa-master and falcon::hosa-slave has not been created. Please create session and retry.
[2013-07-23 11:57:29.716974] E [glusterd-syncop.c:866:gd_stage_op_phase] 0-management: Staging of operation 'Volume Geo-replication' failed on localhost : Session between hosa-master and falcon::hosa-slave has not been created. Please create session and retry.
[2013-07-23 11:57:48.194817] I [glusterd-geo-rep.c:2495:glusterd_get_gsync_status_mst_slv] 0-: geo-replication status hosa-master falcon::hosa-slave :session is not active
[2013-07-23 12:17:30.643386] E [glusterd-geo-rep.c:1132:glusterd_op_verify_gsync_start_options] 0-: Session between hosa-master and falcon::hosa-slave has not been created. Please create session and retry.
[2013-07-23 12:17:30.643468] E [glusterd-syncop.c:866:gd_stage_op_phase] 0-management: Staging of operation 'Volume Geo-replication' failed on localhost : Session between hosa-master and falcon::hosa-slave has not been created. Please create session and retry.
[2013-07-23 12:27:45.056456] E [glusterd-geo-rep.c:1132:glusterd_op_verify_gsync_start_options] 0-: Session between hosa-master and falcon::hosa-slave has not been created. Please create session and retry.
[2013-07-23 12:27:45.056536] E [glusterd-syncop.c:866:gd_stage_op_phase] 0-management: Staging of operation 'Volume Geo-replication' failed on localhost : Session between hosa-master and falcon::hosa-slave has not been created. Please create session and retry.
[2013-07-23 12:28:07.365539] I [glusterd-geo-rep.c:2495:glusterd_get_gsync_status_mst_slv] 0-: geo-replication status hosa-master falcon::hosa-slave :session is not active
[2013-07-23 12:32:20.032069] E [glusterd-geo-rep.c:1132:glusterd_op_verify_gsync_start_options] 0-: Session between hosa-master and falcon::hosa-slave has not been created. Please create session and retry.
[2013-07-23 12:32:20.032190] E [glusterd-syncop.c:866:gd_stage_op_phase] 0-management: Staging of operation 'Volume Geo-replication' failed on localhost : Session between hosa-master and falcon::hosa-slave has not been created. Please create session and retry.
[2013-07-23 12:32:54.121195] I [glusterd-geo-rep.c:2495:glusterd_get_gsync_status_mst_slv] 0-: geo-replication status hosa-master falcon::hosa-slave :session is not active
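
To see at a glance which functions logged the errors above, the log can be summarised per function. A minimal sketch, assuming the `[timestamp] LEVEL [file:line:function]` layout seen in this excerpt; `errors_by_function` is a hypothetical helper name.

```shell
# Read glusterd log lines on stdin and print "COUNT FUNCTION" for every
# function that logged at level E, most frequent first.
errors_by_function() {
    awk '$3 == "E" {
             split($4, part, ":")   # [file.c:line:function]
             fn = part[3]
             sub(/\]$/, "", fn)     # drop the closing bracket
             count[fn]++
         }
         END { for (f in count) print count[f], f }' | sort -rn
}
```

It would typically be fed the glusterd log, for example `errors_by_function < /var/log/glusterfs/etc-glusterfs-glusterd.vol.log`; that path is an assumption about the default log location and is not stated in this report.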

Comment 1 M S Vishwanath Bhat 2013-07-23 12:40:26 UTC
In addition, create says the session is already created, while delete says the session does not exist.


[root@spitfire glusterfs-deploy-scripts]# gluster v geo hosa-master falcon::hosa-slave create
Session between hosa-master and falcon::hosa-slave is already created.
geo-replication command failed


[root@spitfire glusterfs-deploy-scripts]# gluster v geo hosa-master falcon::hosa-slave delete
Geo-replication session between hosa-master and falcon::hosa-slave does not exist.
geo-replication command failed
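
The contradiction between the subcommands can be made explicit by mapping each message to the session state it implies. This is an illustrative sketch: `implied_state` is a hypothetical helper, and it only knows the three message texts quoted in this report.

```shell
# Print the session state implied by a geo-rep CLI message.
implied_state() {
    case $1 in
        *"is already created"*)
            echo exists ;;
        *"has not been created"*|*"does not exist"*)
            echo missing ;;
        *)
            echo unknown ;;
    esac
}
```

With the messages above, create maps to `exists` while start and delete both map to `missing`, which is the inconsistency being reported.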

Comment 3 Avra Sengupta 2013-08-01 12:03:08 UTC
Fixed by the fix for https://bugzilla.redhat.com/show_bug.cgi?id=980529 in the patch: https://code.engineering.redhat.com/gerrit/#/c/10827/

Comment 4 M S Vishwanath Bhat 2013-08-19 12:58:15 UTC
This is now fixed. I verified it by upgrading from 18rhs to 20rhs, and it works fine.


After upgrade...

[root@pythagoras ]# gluster v geo master euclid::slave status
NODE                         MASTER    SLAVE            HEALTH     UPTIME       
----------------------------------------------------------------------------
pythagoras.blr.redhat.com    master    euclid::slave    Stopped    N/A          
ramanujan.blr.redhat.com     master    euclid::slave    Stopped    N/A          


[root@pythagoras ]# gluster v geo master euclid::slave start
Starting geo-replication session between master & euclid::slave has been successful


[root@pythagoras ]# gluster v geo master euclid::slave status
NODE                         MASTER    SLAVE            HEALTH             UPTIME       
------------------------------------------------------------------------------------
pythagoras.blr.redhat.com    master    euclid::slave    Initializing...    N/A          
ramanujan.blr.redhat.com     master    euclid::slave    Initializing...    N/A          

[root@pythagoras ]# gluster v geo master euclid::slave status
NODE                         MASTER    SLAVE            HEALTH    UPTIME         
-----------------------------------------------------------------------------
pythagoras.blr.redhat.com    master    euclid::slave    Stable    00:03:47       
ramanujan.blr.redhat.com     master    euclid::slave    Stable    00:03:45
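
The manual verification above (Stopped, then Initializing..., then Stable) could be automated by polling status. A sketch under assumptions: `wait_until_stable` is a hypothetical helper, and it relies on the HEALTH value being the fourth column of the table, as in the output shown here.

```shell
# Poll a status command until every node reports HEALTH "Stable".
# Usage: wait_until_stable ATTEMPTS INTERVAL_SECONDS COMMAND [ARGS...]
wait_until_stable() {
    attempts=$1
    interval=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        # Distinct HEALTH values, skipping the header row and the separator.
        health=$("$@" | awk '$1 != "NODE" && $1 !~ /^-+$/ && NF >= 4 { print $4 }' | sort -u)
        if [ "$health" = "Stable" ]; then
            return 0
        fi
        sleep "$interval"
        i=$((i + 1))
    done
    return 1
}
```

For the session in this comment, a call might look like `wait_until_stable 30 10 gluster volume geo-replication master euclid::slave status`; the attempt count and interval are arbitrary example values.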

Comment 5 Scott Haines 2013-09-23 22:38:45 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. 

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1262.html
