Description of problem:
========================
Had a 4-node cluster and created a 2 x (4+2) distributed-disperse volume. Ran load from two NFS-mounted clients. While that was in progress, tried to attach a 4 x 1 (plain distribute) hot tier, and the attach timed out. This left the volume in an inconsistent state: the tier attach had actually completed, but the tier start had not taken place. In other words, the tier process had not started, nor was there a <volname>-tier.log file created under /var/log/glusterfs/. 'gluster v info' showed a healthy tiered volume, with both cold and hot bricks running.

On discussion with Rafi: whenever we do a tier attach, stage 1 sends a request to add the hot tier brick to every node concerned, and on successful completion of that, stage 2 'starts' the tier process. However, if a request times out in the middle of stage 1, between stages 1 and 2, or in the middle of stage 2, there is no rewind/rollback. This leaves us with a volume for which we have no definite clue of how functional it is.

In my case, where stage 1 appeared to have completed successfully, the workaround was fairly simple: run 'gluster v tier start force'. On doing that, I see the errors pasted below popping up in the window, which still gives the feeling that all is not well with my tiered volume. We need a documented process/to-do-steps to recover if we land in such (or worse) a state.

[root@dhcp47-64 ~]#
Broadcast message from systemd-journald.eng.blr.redhat.com (Mon 2016-05-09 14:47:26 IST):

bricks-brick4-nash_tier[1155]: [2016-05-09 09:17:26.799349] M [MSGID: 113075] [posix-helpers.c:1845:posix_health_check_thread_proc] 0-nash-posix: health-check failed, going down

Message from syslogd@dhcp47-64 at May 9 14:47:26 ...
bricks-brick4-nash_tier[1155]:[2016-05-09 09:17:26.799349] M [MSGID: 113075] [posix-helpers.c:1845:posix_health_check_thread_proc] 0-nash-posix: health-check failed, going down

Broadcast message from systemd-journald.eng.blr.redhat.com (Mon 2016-05-09 14:47:56 IST):

bricks-brick4-nash_tier[1155]: [2016-05-09 09:17:56.800744] M [MSGID: 113075] [posix-helpers.c:1851:posix_health_check_thread_proc] 0-nash-posix: still alive! -> SIGTERM

Message from syslogd@dhcp47-64 at May 9 14:47:56 ...
bricks-brick4-nash_tier[1155]:[2016-05-09 09:17:56.800744] M [MSGID: 113075] [posix-helpers.c:1851:posix_health_check_thread_proc] 0-nash-posix: still alive! -> SIGTERM

[root@dhcp47-64 ~]#

Version-Release number of selected component (if applicable):
============================================================
3.7.9-3

How reproducible:
=================
Hit it once

Additional info:
================
[root@dhcp47-64 ~]# gluster v tier nash attach 10.70.47.64:/bricks/brick4/nash_tier 10.70.46.33:/bricks/brick4/nash_tier 10.70.46.121:/bricks/brick4/nash_tier 10.70.47.190:/bricks/brick4/nash_tier
Error : Request timed out
Tier command failed
[root@dhcp47-64 ~]#
[root@dhcp47-64 ~]# gluster v tier nash attach 10.70.47.64:/bricks/brick4/nash_tier 10.70.46.33:/bricks/brick4/nash_tier 10.70.46.121:/bricks/brick4/nash_tier 10.70.47.190:/bricks/brick4/nash_tier
volume attach-tier: failed: Volume nash is already a tier.
Tier command failed
[root@dhcp47-64 ~]#
[root@dhcp47-64 ~]# gluster v info

Volume Name: nash
Type: Tier
Volume ID: 16f0b5a8-913b-42d1-b3a7-e3e9344f5535
Status: Started
Number of Bricks: 16
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distribute
Number of Bricks: 4
Brick1: 10.70.47.190:/bricks/brick4/nash_tier
Brick2: 10.70.46.121:/bricks/brick4/nash_tier
Brick3: 10.70.46.33:/bricks/brick4/nash_tier
Brick4: 10.70.47.64:/bricks/brick4/nash_tier
Cold Tier:
Cold Tier Type : Distributed-Disperse
Number of Bricks: 2 x (4 + 2) = 12
Brick5: 10.70.47.64:/bricks/brick1/nash
Brick6: 10.70.46.121:/bricks/brick1/nash
Brick7: 10.70.46.33:/bricks/brick1/nash
Brick8: 10.70.47.190:/bricks/brick1/nash
Brick9: 10.70.47.64:/bricks/brick2/nash
Brick10: 10.70.46.121:/bricks/brick2/nash
Brick11: 10.70.46.33:/bricks/brick2/nash
Brick12: 10.70.47.190:/bricks/brick2/nash
Brick13: 10.70.47.64:/bricks/brick3/nash
Brick14: 10.70.46.121:/bricks/brick3/nash
Brick15: 10.70.46.33:/bricks/brick3/nash
Brick16: 10.70.47.190:/bricks/brick3/nash
Options Reconfigured:
cluster.tier-mode: cache
features.ctr-enabled: on
performance.readdir-ahead: on
[root@dhcp47-64 ~]#
[root@dhcp47-64 ~]#
[root@dhcp47-64 ~]# gluster v tier nash start
Tiering Migration Functionality: nash: success: Attach tier is successful on nash. use tier status to check the status.
ID: 8950b59b-b423-4d25-911d-0a0eb7c65dce
[root@dhcp47-64 ~]# ps -ef | grep tier
root 1155 1 0 10:31 ? 00:00:58 /usr/sbin/glusterfsd -s 10.70.47.64 --volfile-id nash.10.70.47.64.bricks-brick4-nash_tier -p /var/lib/glusterd/vols/nash/run/10.70.47.64-bricks-brick4-nash_tier.pid -S /var/run/gluster/b93a3815e235a7bab53b4d2d1e796a83.socket --brick-name /bricks/brick4/nash_tier -l /var/log/glusterfs/bricks/bricks-brick4-nash_tier.log --xlator-option *-posix.glusterd-uuid=a34abfd0-300d-4d57-a047-8550c10acec8 --brick-port 49155 --xlator-option nash-server.listen-port=49155
root 5710 1 12 14:41 ? 00:00:02 /usr/sbin/glusterfs -s localhost --volfile-id rebalance/nash --xlator-option *dht.use-readdirp=yes --xlator-option *dht.lookup-unhashed=yes --xlator-option *dht.assert-no-child-down=yes --xlator-option *replicate*.data-self-heal=off --xlator-option *replicate*.metadata-self-heal=off --xlator-option *replicate*.entry-self-heal=off --xlator-option *dht.readdir-optimize=on --xlator-option *tier-dht.xattr-name=trusted.tier.tier-dht --xlator-option *dht.rebalance-cmd=6 --xlator-option *dht.node-uuid=a34abfd0-300d-4d57-a047-8550c10acec8 --xlator-option *dht.commit-hash=3112346234 --socket-file /var/run/gluster/gluster-tier-16f0b5a8-913b-42d1-b3a7-e3e9344f5535.sock --pid-file /var/lib/glusterd/vols/nash/tier/a34abfd0-300d-4d57-a047-8550c10acec8.pid -l /var/log/glusterfs/nash-tier.log
root 5733 5605 0 14:42 pts/0 00:00:00 grep --color=auto tier
[root@dhcp47-64 ~]#
[root@dhcp47-64 ~]#
[root@dhcp47-64 ~]#
Broadcast message from systemd-journald.eng.blr.redhat.com (Mon 2016-05-09 14:47:26 IST):

bricks-brick4-nash_tier[1155]: [2016-05-09 09:17:26.799349] M [MSGID: 113075] [posix-helpers.c:1845:posix_health_check_thread_proc] 0-nash-posix: health-check failed, going down

Message from syslogd@dhcp47-64 at May 9 14:47:26 ...
bricks-brick4-nash_tier[1155]:[2016-05-09 09:17:26.799349] M [MSGID: 113075] [posix-helpers.c:1845:posix_health_check_thread_proc] 0-nash-posix: health-check failed, going down

Broadcast message from systemd-journald.eng.blr.redhat.com (Mon 2016-05-09 14:47:56 IST):

bricks-brick4-nash_tier[1155]: [2016-05-09 09:17:56.800744] M [MSGID: 113075] [posix-helpers.c:1851:posix_health_check_thread_proc] 0-nash-posix: still alive! -> SIGTERM

Message from syslogd@dhcp47-64 at May 9 14:47:56 ...
bricks-brick4-nash_tier[1155]:[2016-05-09 09:17:56.800744] M [MSGID: 113075] [posix-helpers.c:1851:posix_health_check_thread_proc] 0-nash-posix: still alive! -> SIGTERM

[root@dhcp47-64 ~]#
[root@dhcp47-64 ~]# rpm -qa | grep gluster
glusterfs-client-xlators-3.7.9-3.el7rhgs.x86_64
glusterfs-server-3.7.9-3.el7rhgs.x86_64
gluster-nagios-addons-0.2.6-1.el7rhgs.x86_64
python-gluster-3.7.5-19.el7rhgs.noarch
vdsm-gluster-4.16.30-1.3.el7rhgs.noarch
glusterfs-3.7.9-3.el7rhgs.x86_64
glusterfs-api-3.7.9-3.el7rhgs.x86_64
glusterfs-cli-3.7.9-3.el7rhgs.x86_64
glusterfs-geo-replication-3.7.9-3.el7rhgs.x86_64
gluster-nagios-common-0.2.4-1.el7rhgs.noarch
glusterfs-libs-3.7.9-3.el7rhgs.x86_64
glusterfs-fuse-3.7.9-3.el7rhgs.x86_64
glusterfs-rdma-3.7.9-3.el7rhgs.x86_64
[root@dhcp47-64 ~]#
[root@dhcp47-64 ~]#
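For reference, a minimal sketch of the checks used above to confirm the half-attached state, assuming the volume name "nash" and the daemon/log names that appear in the output above; adapt it for other volumes.

# Minimal sketch of the checks above; assumes volume "nash" (adapt as needed).
VOL=nash

# Stage 1 check: did the volume become a tiered volume with hot bricks?
gluster volume info $VOL | grep -E '^Type:|Hot Tier'

# Stage 2 check: is the tier daemon running, and was its log file created?
ps -ef | grep -v grep | grep "volfile-id rebalance/$VOL"
ls -l /var/log/glusterfs/${VOL}-tier.log

# Workaround applied in this report when stage 1 had completed but stage 2 had not:
gluster volume tier $VOL start force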
Please ignore the SIGTERM messages. They were caused by one of the bricks getting deleted by mistake.
Recovery steps, discussed and reviewed with glusterd engineering after recreating the problem. The hot tier would have been attached on either all nodes or none of them. On seeing a timeout:

1. Check whether the graph has become a tiered volume.
   1a. If it has not, rerun attach tier.
   1b. If it has, go to step 2.
2. Check whether the rebalance daemons were created on each server.
   2a. If they were, the attach completed and no further action is needed.
   2b. If they were not, run 'gluster volume tier <vol> start'.

A rough sketch of these steps follows below.
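The sketch below is an illustration only, not an official procedure. It assumes the volume name and hot-tier bricks used in this report; the daemon check in step 2 must be repeated on every server.

#!/bin/bash
# Hedged sketch of the recovery steps above; assumes volume "nash" and the
# hot-tier bricks used in this report. Run the step-2 check on every server.
VOL=nash
HOT_BRICKS="10.70.47.64:/bricks/brick4/nash_tier 10.70.46.33:/bricks/brick4/nash_tier 10.70.46.121:/bricks/brick4/nash_tier 10.70.47.190:/bricks/brick4/nash_tier"

# Step 1: has the graph become a tiered volume?
if ! gluster volume info "$VOL" | grep -q '^Type: Tier'; then
    # 1a. Not tiered yet: the attach did not go through, so rerun it.
    gluster volume tier "$VOL" attach $HOT_BRICKS
else
    # Step 2: were the tier (rebalance) daemons created on this server?
    if ! ps -ef | grep -v grep | grep -q "volfile-id rebalance/$VOL"; then
        # 2b. Daemon missing: start the tier process.
        gluster volume tier "$VOL" start
    fi
    # 2a. Daemon present: the attach completed; nothing further to do here.
fi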
As tier is not being actively developed, I'm closing this bug. Feel free to reopen it if necessary.