Description of problem:
Status of geo-replication workers on the master node is "inconsistent" if the master volume is tiered.

Version-Release number of selected component (if applicable):
GlusterFS 5.2, installed from the source tarball.

How reproducible:
100%

Steps to Reproduce:
1. Set up two nodes. One will host the geo-replication master volume, which must be tiered; the other will host the geo-replication slave volume.

[root@SC-10-10-63-182 log]# glusterfsd --version
glusterfs 5.2

[root@SC-10-10-63-183 log]# glusterfsd --version
glusterfs 5.2

2. On the master node, create a tiered volume:

[root@SC-10-10-63-182 log]# gluster volume info master-volume-1

Volume Name: master-volume-1
Type: Tier
Volume ID: aa95df34-f181-456c-aa26-9756b68ed679
Status: Started
Snapshot Count: 0
Number of Bricks: 2
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distribute
Number of Bricks: 1
Brick1: 10.10.60.182:/exports/master-hot-tier/master-volume-1
Cold Tier:
Cold Tier Type : Distribute
Number of Bricks: 1
Brick2: 10.10.60.182:/exports/master-segment-1/master-volume-1
Options Reconfigured:
features.ctr-sql-db-wal-autocheckpoint: 25000
features.ctr-sql-db-cachesize: 12500
cluster.tier-mode: cache
features.ctr-enabled: on
server.allow-insecure: on
performance.quick-read: off
performance.stat-prefetch: off
nfs.addr-namelookup: off
transport.address-family: inet
nfs.disable: on
cluster.enable-shared-storage: disable
snap-activate-on-create: enable

[root@SC-10-10-63-182 log]# gluster volume status master-volume-1
Status of volume: master-volume-1
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Hot Bricks:
Brick 10.10.60.182:/exports/master-hot-tier
/master-volume-1                            62001     0          Y       15690
Cold Bricks:
Brick 10.10.60.182:/exports/master-segment-
1/master-volume-1                           62000     0          Y       9762
Tier Daemon on localhost                    N/A       N/A        Y       15713

Task Status of Volume master-volume-1
------------------------------------------------------------------------------
There are no active volume tasks

[root@SC-10-10-63-182 log]# gluster volume tier master-volume-1 status
Node                 Promoted files       Demoted files        Status               run time in h:m:s
---------            ---------            ---------            ---------            ---------
localhost            0                    0                    in progress          0:3:40
Tiering Migration Functionality: master-volume-1: success

3. On the slave node, create the slave volume:

[root@SC-10-10-63-183 log]# gluster volume info slave-volume-1

Volume Name: slave-volume-1
Type: Distribute
Volume ID: 569a340b-35f8-4109-8816-720982b11806
Status: Started
Snapshot Count: 0
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: 10.10.60.183:/exports/slave-segment-1/slave-volume-1
Options Reconfigured:
server.allow-insecure: on
performance.quick-read: off
performance.stat-prefetch: off
nfs.addr-namelookup: off
transport.address-family: inet
nfs.disable: on
cluster.enable-shared-storage: disable
snap-activate-on-create: enable

[root@SC-10-10-63-183 log]# gluster volume status slave-volume-1
Status of volume: slave-volume-1
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.10.60.183:/exports/slave-segment-1
/slave-volume-1                             62000     0          Y       2532

Task Status of Volume slave-volume-1
------------------------------------------------------------------------------
There are no active volume tasks

4. Set up passwordless SSH access from the master node to the slave node. Audit log excerpt showing SSH from .182 to .183 succeeding (status=0):

20660 01/21/2019 13:58:54.930122501 1548107934 command: /usr/bin/ssh nasgorep.60.183 /bin/pwd
20660 01/21/2019 13:58:55.021906148 1548107935 status=0 /usr/bin/ssh nasgorep.60.183 /bin/pwd
20694 01/21/2019 13:58:56.169890800 1548107936 command: /usr/bin/ssh -q -oConnectTimeout=5 nasgorep.60.183 /bin/pwd 2>&1
20694 01/21/2019 13:58:56.256032202 1548107936 status=0 /usr/bin/ssh -q -oConnectTimeout=5 nasgorep.60.183 /bin/pwd 2>&1

5. Initialize geo-replication from the master volume to the slave volume:

[root@SC-10-10-63-182 log]# vi /var/log/glusterfs/cmd_history.log
[2019-01-21 21:59:08.942567] : system:: execute gsec_create : SUCCESS
[2019-01-21 21:59:42.722194] : volume geo-replication master-volume-1 nasgorep.60.183::slave-volume-1 create push-pem : SUCCESS
[2019-01-21 21:59:49.527353] : volume geo-replication master-volume-1 nasgorep.60.183::slave-volume-1 start : SUCCESS
[2019-01-21 21:59:55.636198] : volume geo-replication master-volume-1 nasgorep.60.183::slave-volume-1 status detail : SUCCESS

6. Check the status of the geo-replication session.

Actual results:

[root@SC-10-10-63-183 log]# /usr/sbin/gluster-mountbroker status
+-----------+-------------+---------------------------+--------------+--------------------------+
| NODE      | NODE STATUS | MOUNT ROOT                | GROUP        | USERS                    |
+-----------+-------------+---------------------------+--------------+--------------------------+
| localhost | UP          | /var/mountbroker-root(OK) | nasgorep(OK) | nasgorep(slave-volume-1) |
+-----------+-------------+---------------------------+--------------+--------------------------+

[root@SC-10-10-63-182 log]# gluster volume geo-replication master-volume-1 nasgorep.60.183::slave-volume-1 status
MASTER NODE     MASTER VOL         MASTER BRICK                                 SLAVE USER    SLAVE                              SLAVE NODE    STATUS     CRAWL STATUS    LAST_SYNCED
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
10.10.60.182    master-volume-1    /exports/master-hot-tier/master-volume-1     nasgorep      nasgorep.60.183::slave-volume-1    N/A           Stopped    N/A             N/A
10.10.60.182    master-volume-1    /exports/master-segment-1/master-volume-1    nasgorep      nasgorep.60.183::slave-volume-1    N/A           Stopped    N/A             N/A

Expected results:
The status of the geo-replication workers on the master node should be "Active".
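When scripting a test like this, the bad worker state can be detected programmatically. A minimal sketch (a hypothetical helper, not part of gluster) that parses the whitespace-separated status table in the format shown above and flags any worker that is not Active or Passive:

```python
def inactive_workers(status_output):
    """Return (brick, status) pairs for workers that are not Active/Passive.

    Hypothetical helper: parses the table printed by
    `gluster volume geo-replication <master> <slave-url> status`,
    assuming the column order shown above (node, volume, brick,
    slave user, slave, slave node, status, ...) and that master
    nodes are listed by IP address.
    """
    problems = []
    for line in status_output.splitlines():
        fields = line.split()
        # Data rows start with the master node's IP; skip the header
        # row, the dashed separator, and blank lines.
        if len(fields) < 7 or not fields[0][:1].isdigit():
            continue
        brick, status = fields[2], fields[6]
        if status not in ("Active", "Passive"):
            problems.append((brick, status))
    return problems
```

Run against the output above, this would report both bricks with status "Stopped".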
Additional info:

The file /var/log/glusterfs/geo-replication/master-volume-1_10.10.60.183_slave-volume-1/gsyncd.log on the master node explains what went wrong:

[root@SC-10-10-63-182 log]# vi /var/log/glusterfs/geo-replication/master-volume-1_10.10.60.183_slave-volume-1/gsyncd.log
[2019-01-21 21:59:39.347943] W [gsyncd(config-get):304:main] <top>: Session config file not exists, using the default config path=/var/lib/glusterd/geo-replication/master-volume-1_10.10.60.183_slave-volume-1/gsyncd.conf
[2019-01-21 21:59:42.438145] I [gsyncd(monitor-status):308:main] <top>: Using session config file path=/var/lib/glusterd/geo-replication/master-volume-1_10.10.60.183_slave-volume-1/gsyncd.conf
[2019-01-21 21:59:42.454929] I [subcmds(monitor-status):29:subcmd_monitor_status] <top>: Monitor Status Change status=Created
[2019-01-21 21:59:48.756702] I [gsyncd(config-get):308:main] <top>: Using session config file path=/var/lib/glusterd/geo-replication/master-volume-1_10.10.60.183_slave-volume-1/gsyncd.conf
[2019-01-21 21:59:49.4720] I [gsyncd(config-get):308:main] <top>: Using session config file path=/var/lib/glusterd/geo-replication/master-volume-1_10.10.60.183_slave-volume-1/gsyncd.conf
[2019-01-21 21:59:49.239733] I [gsyncd(config-get):308:main] <top>: Using session config file path=/var/lib/glusterd/geo-replication/master-volume-1_10.10.60.183_slave-volume-1/gsyncd.conf
[2019-01-21 21:59:49.475193] I [gsyncd(monitor):308:main] <top>: Using session config file path=/var/lib/glusterd/geo-replication/master-volume-1_10.10.60.183_slave-volume-1/gsyncd.conf
[2019-01-21 21:59:49.868150] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change status=Initializing...
[2019-01-21 21:59:49.868396] I [monitor(monitor):157:monitor] Monitor: starting gsyncd worker slave_node=10.10.60.183 brick=/exports/master-segment-1/master-volume-1
[2019-01-21 21:59:49.871593] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change status=Initializing...
[2019-01-21 21:59:49.871963] I [monitor(monitor):157:monitor] Monitor: starting gsyncd worker slave_node=10.10.60.183 brick=/exports/master-hot-tier/master-volume-1
[2019-01-21 21:59:50.4395] I [monitor(monitor):268:monitor] Monitor: worker died before establishing connection brick=/exports/master-segment-1/master-volume-1
[2019-01-21 21:59:50.7447] I [monitor(monitor):268:monitor] Monitor: worker died before establishing connection brick=/exports/master-hot-tier/master-volume-1
[2019-01-21 21:59:50.8415] I [gsyncd(agent /exports/master-segment-1/master-volume-1):308:main] <top>: Using session config file path=/var/lib/glusterd/geo-replication/master-volume-1_10.10.60.183_slave-volume-1/gsyncd.conf
[2019-01-21 21:59:50.10383] I [gsyncd(agent /exports/master-hot-tier/master-volume-1):308:main] <top>: Using session config file path=/var/lib/glusterd/geo-replication/master-volume-1_10.10.60.183_slave-volume-1/gsyncd.conf
[2019-01-21 21:59:50.14039] I [repce(agent /exports/master-segment-1/master-volume-1):97:service_loop] RepceServer: terminating on reaching EOF.
[2019-01-21 21:59:50.15556] I [changelogagent(agent /exports/master-hot-tier/master-volume-1):72:__init__] ChangelogAgent: Agent listining...
[2019-01-21 21:59:50.15964] I [repce(agent /exports/master-hot-tier/master-volume-1):97:service_loop] RepceServer: terminating on reaching EOF.
[2019-01-21 21:59:55.141768] I [gsyncd(config-get):308:main] <top>: Using session config file path=/var/lib/glusterd/geo-replication/master-volume-1_10.10.60.183_slave-volume-1/gsyncd.conf
[2019-01-21 21:59:55.380496] I [gsyncd(status):308:main] <top>: Using session config file path=/var/lib/glusterd/geo-replication/master-volume-1_10.10.60.183_slave-volume-1/gsyncd.conf
[2019-01-21 21:59:55.625045] I [gsyncd(status):308:main] <top>: Using session config file path=/var/lib/glusterd/geo-replication/master-volume-1_10.10.60.183_slave-volume-1/gsyncd.conf
[2019-01-21 22:00:00.66032] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change status=inconsistent
[2019-01-21 22:00:00.66289] E [syncdutils(monitor):338:log_raise_exception] <top>: FAIL:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 368, in twrap
    tf(*aargs)
  File "/usr/libexec/glusterfs/python/syncdaemon/monitor.py", line 339, in wmon
    slave_host, master, suuid, slavenodes)
TypeError: 'int' object is not iterable

A similar test on GlusterFS 3.12.14 does not show this failure.
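The TypeError in the traceback suggests that, on the tiered-volume code path, the monitor is handed a plain int where an iterable of slave nodes is expected. A minimal illustration of that failure class (hypothetical names, not the actual monitor.py code):

```python
from itertools import cycle

def distribute_workers(bricks, slavenodes):
    """Pair each master brick with a slave node.

    Hypothetical sketch of the monitor's worker distribution, not the
    actual monitor.py: it only works if slavenodes is iterable (e.g. a
    list of hosts). An int, as the tiered path apparently passes, raises
    TypeError: 'int' object is not iterable.
    """
    return list(zip(bricks, cycle(slavenodes)))

bricks = ["/exports/master-hot-tier/master-volume-1",
          "/exports/master-segment-1/master-volume-1"]

# Normal case: every brick gets a slave node.
distribute_workers(bricks, ["10.10.60.183"])

# Failing case, analogous to the traceback above.
try:
    distribute_workers(bricks, 1)
except TypeError as exc:
    print(exc)  # 'int' object is not iterable
```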
The 'tier' feature of GlusterFS has been deprecated, so it will not be possible to fix this in a future release.