Bug 991372 - Dist-geo-rep: geo-rep create returns success, but actually fails with "Not a valid slave volume" message in the logs even for a valid slave
Keywords:
Status: CLOSED DUPLICATE of bug 980529
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: geo-replication
Version: 2.1
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Avra Sengupta
QA Contact: Sudhir D
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2013-08-02 09:16 UTC by M S Vishwanath Bhat
Modified: 2016-06-01 01:56 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-08-21 06:53:35 UTC
Embargoed:


Attachments:
glusterd logs from the "spitfire" node where this command was executed. (87.97 KB, text/x-log)
2013-08-02 09:16 UTC, M S Vishwanath Bhat

Description M S Vishwanath Bhat 2013-08-02 09:16:58 UTC
Created attachment 781894
glusterd logs from the "spitfire" node where this command was executed.

Description of problem:
"geo-rep create force" reports success on the CLI but actually fails: glusterd logs a "Not a valid slave volume" error even though the slave volume is valid. The ssh pem setup is already in place and correct.

Version-Release number of selected component (if applicable):
glusterfs-3.4.0.14rhs-1.el6rhs.x86_64

How reproducible:
Not sure. In my current setup I have hit this many times.

Steps to Reproduce:
1. Create a 2x2 distributed-replicate master volume and start it.

[root@mustang ~]# gluster v i
 
Volume Name: master
Type: Distributed-Replicate
Volume ID: 8e3e1bf0-2858-44ac-b960-d1a23c749b31
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: spitfire:/rhs/bricks/brick1
Brick2: mustang:/rhs/bricks/brick2
Brick3: harrier:/rhs/bricks/brick3
Brick4: typhoon:/rhs/bricks/brick4
Options Reconfigured:
geo-replication.indexing: on
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
changelog.rollover-time: 30
changelog.changelog: on


2. Create a slave volume as a pure 2-way replicated volume.

[root@falcon ~]# gluster v i
 
Volume Name: slave
Type: Replicate
Volume ID: e0f55b95-966b-45b0-92f1-e0ff23bde0e4
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: interceptor:/rhs/bricks/brick1
Brick2: lightning:/rhs/bricks/brick2

Make sure that the total available size of the slave volume is less than the total size of the master volume.

3. Try to create the session with plain "create"; it fails because the total size of the master is greater than the available size of the slave. Then run "create force"; it reports success, but actually fails with "Not a valid slave volume" in the logs.

Below are the commands I executed, in order, with their output.

[root@spitfire glusterfs-deploy-scripts]# gluster v geo master falcon::slave create
Total size of master is greater than available size of slave.
geo-replication command failed
[root@spitfire glusterfs-deploy-scripts]# gluster v geo master falcon::slave create force
Creating geo-replication session between master & falcon::slave has been successful
[root@spitfire glusterfs-deploy-scripts]# gluster v geo master falcon::slave status
No active geo-replication sessions between master and falcon::slave
[root@spitfire glusterfs-deploy-scripts]# gluster v geo master falcon::slave start
Session between master and falcon::slave has not been created. Please create session and retry.
geo-replication command failed
[root@spitfire glusterfs-deploy-scripts]# gluster v geo master falcon::slave config
Geo-replication session between master and falcon::slave does not exist.
geo-replication command failed
[root@spitfire glusterfs-deploy-scripts]# gluster v geo master falcon::slave create force
Creating geo-replication session between master & falcon::slave has been successful
[root@spitfire glusterfs-deploy-scripts]# gluster v geo master falcon::slave create
Total size of master is greater than available size of slave.
geo-replication command failed
[root@spitfire glusterfs-deploy-scripts]# ping -c 1 -w 1 falcon
PING falcon (10.70.43.152) 56(84) bytes of data.
64 bytes from falcon (10.70.43.152): icmp_seq=1 ttl=64 time=0.750 ms

--- falcon ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 1ms
rtt min/avg/max/mdev = 0.750/0.750/0.750/0.000 ms


The machine "falcon" is pingable and has the gsync pem set up properly.
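
Since "create force" prints a success message even when no session is created, a script that drives these commands cannot rely on the create output alone. The following is a minimal sketch, not part of glusterd or the gluster CLI, that runs "create force" and then confirms the session through "status". The function names (session_missing, create_and_verify) and the injectable run parameter are hypothetical helpers for illustration; the marker strings are copied from the CLI output in this report and may differ in other versions.

```python
import subprocess

# Substrings taken from the CLI output in this report; they indicate
# that no geo-replication session exists. Other versions may word
# these messages differently (assumption).
NO_SESSION_MARKERS = (
    "No active geo-replication sessions",
    "has not been created",
    "does not exist",
)


def session_missing(cli_output):
    """Return True if the CLI output says the session does not exist."""
    return any(marker in cli_output for marker in NO_SESSION_MARKERS)


def _run(argv):
    """Run a command and return (returncode, combined stdout and stderr)."""
    proc = subprocess.run(argv, capture_output=True, text=True, check=False)
    return proc.returncode, proc.stdout + proc.stderr


def create_and_verify(master, slave, run=None):
    """Run 'create force', then check 'status' instead of trusting the
    success message. 'run' is a hypothetical hook so the logic can be
    exercised without a gluster installation."""
    run = run or _run
    base = ["gluster", "volume", "geo-replication", master, slave]

    rc, _ = run(base + ["create", "force"])
    if rc != 0:
        return False

    rc, out = run(base + ["status"])
    return rc == 0 and not session_missing(out)
```

On the setup described here, create_and_verify("master", "falcon::slave") would be expected to return False, because the status command answers with "No active geo-replication sessions between master and falcon::slave".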


Actual results:

The error messages seen in the glusterd logs:

[2013-08-02 08:42:56.825438] E [glusterd-geo-rep.c:1662:glusterd_verify_slave] 0-: Not a valid slave
[2013-08-02 08:42:56.825584] E [glusterd-geo-rep.c:1764:glusterd_op_stage_gsync_create] 0-: falcon::slave is not a valid slave volume.
[2013-08-02 08:42:56.825614] E [glusterd-syncop.c:872:gd_stage_op_phase] 0-management: Staging of operation 'Volume Geo-replication Create' failed on localhost : Total size of master is greater than available size of slave.
[2013-08-02 08:43:02.308765] E [glusterd-geo-rep.c:1662:glusterd_verify_slave] 0-: Not a valid slave
[2013-08-02 08:43:02.308996] E [glusterd-geo-rep.c:1764:glusterd_op_stage_gsync_create] 0-: falcon::slave is not a valid slave volume.
[2013-08-02 08:43:03.072122] E [glusterd-geo-rep.c:3758:glusterd_create_essential_dir_files] 0-: Unable to fetch statefile path.
[2013-08-02 08:43:03.072224] E [glusterd-syncop.c:951:gd_commit_op_phase] 0-management: Commit of operation 'Volume Geo-replication Create' failed on localhost : Total size of master is greater than available size of slave.
[2013-08-02 08:43:06.921876] I [glusterd-geo-rep.c:2686:glusterd_get_gsync_status_mst_slv] 0-: geo-replication status master falcon::slave :session is not active
[2013-08-02 08:43:07.159756] I [glusterd-geo-rep.c:2704:glusterd_get_gsync_status_mst_slv] 0-: /var/lib/glusterd/geo-replication/master-slave/ssh%3A%2F%2Froot%4010.70.43.152%3Agluster%3A%2F%2F127.0.0.1%3Aslave.status statefile not present.
[2013-08-02 08:43:19.099710] E [glusterd-geo-rep.c:1182:glusterd_op_verify_gsync_start_options] 0-: Session between master and falcon::slave has not been created. Please create session and retry.
[2013-08-02 08:43:19.099866] E [glusterd-syncop.c:872:gd_stage_op_phase] 0-management: Staging of operation 'Volume Geo-replication' failed on localhost : Session between master and falcon::slave has not been created. Please create session and retry.
[2013-08-02 08:43:22.154938] E [glusterd-geo-rep.c:1943:glusterd_op_stage_gsync_set] 0-: Geo-replication session between master and falcon::slave does not exist.. statefile = /var/lib/glusterd/geo-replication/master-slave/ssh%3A%2F%2Froot%4010.70.43.152%3Agluster%3A%2F%2F127.0.0.1%3Aslave.status
[2013-08-02 08:43:22.155029] E [glusterd-syncop.c:872:gd_stage_op_phase] 0-management: Staging of operation 'Volume Geo-replication' failed on localhost : Geo-replication session between master and falcon::slave does not exist.
[2013-08-02 08:48:26.388897] E [glusterd-geo-rep.c:1662:glusterd_verify_slave] 0-: Not a valid slave
[2013-08-02 08:48:26.389105] E [glusterd-geo-rep.c:1764:glusterd_op_stage_gsync_create] 0-: falcon::slave is not a valid slave volume.
[2013-08-02 08:48:26.389105] E [glusterd-geo-rep.c:1764:glusterd_op_stage_gsync_create] 0-: falcon::slave is not a valid slave volume.
[2013-08-02 08:48:27.264512] E [glusterd-geo-rep.c:3758:glusterd_create_essential_dir_files] 0-: Unable to fetch statefile path.
[2013-08-02 08:48:27.264605] E [glusterd-syncop.c:951:gd_commit_op_phase] 0-management: Commit of operation 'Volume Geo-replication Create' failed on localhost : Total size of master is greater than available size of slave.
[2013-08-02 08:52:52.993660] E [glusterd-geo-rep.c:1662:glusterd_verify_slave] 0-: Not a valid slave
[2013-08-02 08:52:52.993879] E [glusterd-geo-rep.c:1764:glusterd_op_stage_gsync_create] 0-: falcon::slave is not a valid slave volume.
[2013-08-02 08:52:52.993956] E [glusterd-syncop.c:872:gd_stage_op_phase] 0-management: Staging of operation 'Volume Geo-replication Create' failed on localhost : Total size of master is greater than available size of slave.
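
To make the sequence above easier to read, the entries can be filtered by level and grouped by the function that logged them. The sketch below assumes the line layout visible in this report, "[timestamp] LEVEL [file:line:function] domain: message"; the regular expression is inferred from these lines only and is not taken from any glusterfs specification. It also decodes the percent-encoded statefile name that appears in the log. All function names are hypothetical.

```python
import re
from collections import Counter
from urllib.parse import unquote

# Layout inferred from the log lines in this report (assumption):
# [timestamp] LEVEL [file:line:function] domain: message
LOG_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) "
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\] "
    r"(?P<domain>\S*): (?P<msg>.*)$"
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_RE.match(line.strip())
    return match.groupdict() if match else None


def errors_by_function(lines):
    """Count error-level ('E') entries per logging function."""
    counts = Counter()
    for line in lines:
        entry = parse_line(line)
        if entry and entry["level"] == "E":
            counts[entry["func"]] += 1
    return counts


def decode_statefile_name(name):
    """Decode the percent-encoded statefile name used under
    /var/lib/glusterd/geo-replication/."""
    return unquote(name)


# Three lines copied from the log excerpt in this report.
SAMPLE = [
    "[2013-08-02 08:42:56.825438] E [glusterd-geo-rep.c:1662:glusterd_verify_slave] 0-: Not a valid slave",
    "[2013-08-02 08:42:56.825584] E [glusterd-geo-rep.c:1764:glusterd_op_stage_gsync_create] 0-: falcon::slave is not a valid slave volume.",
    "[2013-08-02 08:43:06.921876] I [glusterd-geo-rep.c:2686:glusterd_get_gsync_status_mst_slv] 0-: geo-replication status master falcon::slave :session is not active",
]
```

Calling errors_by_function(SAMPLE) counts one error each for glusterd_verify_slave and glusterd_op_stage_gsync_create and skips the informational line. Decoding the statefile name from the log yields ssh://root@10.70.43.152:gluster://127.0.0.1:slave.status.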



Expected results:
When the pem file setup is present and correct and the total available size of the slave is less than the total size of the master, "geo-rep create" should fail, but "geo-rep create force" should pass and establish a connection.

Additional info:

I have attached the glusterd logs from the master node ("spitfire") where I executed the geo-rep commands.

Comment 2 Avra Sengupta 2013-08-21 06:53:35 UTC

*** This bug has been marked as a duplicate of bug 980529 ***

