Bug 1517131 - Grafana continues to list/display old bricks subsequent to a snapshot restore
Summary: Grafana continues to list/display old bricks subsequent to a snapshot restore
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: web-admin-tendrl-monitoring-integration
Version: rhgs-3.3
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Darshan
QA Contact: Rochelle
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2017-11-24 09:51 UTC by Vinayak Papnoi
Modified: 2017-12-18 04:38 UTC
10 users

Fixed In Version: tendrl-monitoring-integration-1.5.4-11.el7rhgs.noarch
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-12-18 04:38:26 UTC
Embargoed:


Attachments
The volume is 2x2, but the Bricks section shows 8 bricks; the old bricks (/rhs/brick*) are still present (184.17 KB, image/png)
2017-11-24 09:51 UTC, Vinayak Papnoi


Links
System                                    ID              Private  Priority  Status        Summary                           Last Updated
Github Tendrl gluster-integration issues  518             0        None      None          None                              2017-11-29 09:23:54 UTC
Red Hat Product Errata                    RHEA-2017:3478  0        normal    SHIPPED_LIVE  RHGS Web Administration packages  2017-12-18 09:34:49 UTC

Description Vinayak Papnoi 2017-11-24 09:51:00 UTC
Created attachment 1358578 [details]
The volume is 2x2, but the Bricks section shows 8 bricks; the old bricks (/rhs/brick*) are still present

Description of problem:
=======================

After a snapshot restore of a volume, the old bricks are not removed from that volume's brick list in the Grafana dashboard.


Version-Release number of selected component (if applicable):
=============================================================

glusterfs-3.8.4-52.el7rhgs.x86_64
tendrl-grafana-plugins-1.5.4-5.el7rhgs.noarch
tendrl-ansible-1.5.4-1.el7rhgs.noarch
tendrl-selinux-1.5.3-2.el7rhgs.noarch
tendrl-node-agent-1.5.4-5.el7rhgs.noarch
tendrl-monitoring-integration-1.5.4-5.el7rhgs.noarch
tendrl-grafana-selinux-1.5.3-2.el7rhgs.noarch
tendrl-commons-1.5.4-4.el7rhgs.noarch
tendrl-api-1.5.4-2.el7rhgs.noarch
tendrl-api-httpd-1.5.4-2.el7rhgs.noarch
tendrl-notifier-1.5.4-3.el7rhgs.noarch
tendrl-ui-1.5.4-4.el7rhgs.noarch


How reproducible:
=================
1/1


Steps to Reproduce:
===================

1. Create a volume
2. Create a snapshot of the volume
3. Stop the volume, then activate and restore the snapshot
4. Start the volume
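
A minimal Python sketch of the steps above, assuming a working trusted storage pool and the gluster CLI on the path; the server names, brick paths, and snapshot name are illustrative placeholders, not taken from this setup:

import subprocess

def gluster(*args):
    # Run a gluster CLI command; --mode=script suppresses interactive prompts.
    subprocess.run(["gluster", "--mode=script", *args], check=True)

# 1. Create and start a 2x2 distributed-replicate volume.
gluster("volume", "create", "speedster", "replica", "2",
        "server1:/rhs/brick1/b1", "server2:/rhs/brick2/b2",
        "server1:/rhs/brick3/b3", "server2:/rhs/brick4/b4")
gluster("volume", "start", "speedster")

# 2. Create a snapshot of the volume (depending on the gluster version the
#    created snapshot may get a timestamp appended to its name; if so, use
#    that full name in the commands below).
gluster("snapshot", "create", "snap1", "speedster")

# 3. Stop the volume, then activate and restore the snapshot.
gluster("volume", "stop", "speedster")
gluster("snapshot", "activate", "snap1")
gluster("snapshot", "restore", "snap1")

# 4. Start the restored volume; Grafana should now list only the new
#    /run/gluster/snaps/... brick paths, not the old /rhs/brick* ones.
gluster("volume", "start", "speedster")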

Actual results:
===============

The old bricks are still listed for the volume in the Grafana dashboard.


Expected results:
=================

The old bricks should no longer be listed after the snapshot restore.


Additional info:
================

[root@dhcp43-93 glusterfs]# gluster v status speedster
Status of volume: speedster
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick dhcp43-93.lab.eng.blr.redhat.com:/run
/gluster/snaps/4f853d0c5e5c4fb0ad31b1edf038
73b4/brick1/b1                              49152     0          Y       7721 
Brick dhcp41-170.lab.eng.blr.redhat.com:/ru
n/gluster/snaps/4f853d0c5e5c4fb0ad31b1edf03
873b4/brick2/b2                             49152     0          Y       27235
Brick dhcp43-93.lab.eng.blr.redhat.com:/run
/gluster/snaps/4f853d0c5e5c4fb0ad31b1edf038
73b4/brick3/b3                              49153     0          Y       7754 
Brick dhcp41-170.lab.eng.blr.redhat.com:/ru
n/gluster/snaps/4f853d0c5e5c4fb0ad31b1edf03
873b4/brick4/b4                             49153     0          Y       27256
Self-heal Daemon on localhost               N/A       N/A        Y       2882 
Self-heal Daemon on dhcp41-170.lab.eng.blr.
redhat.com                                  N/A       N/A        Y       21118
 
Task Status of Volume speedster
------------------------------------------------------------------------------
There are no active volume tasks
 
[root@dhcp43-93 glusterfs]# gluster v info
Volume Name: speedster
Type: Distributed-Replicate
Volume ID: c4a9eacd-1c97-4c44-97fb-8619bf348dde
Status: Started
Snapshot Count: 254
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: dhcp43-93.lab.eng.blr.redhat.com:/run/gluster/snaps/4f853d0c5e5c4fb0ad31b1edf03873b4/brick1/b1
Brick2: dhcp41-170.lab.eng.blr.redhat.com:/run/gluster/snaps/4f853d0c5e5c4fb0ad31b1edf03873b4/brick2/b2
Brick3: dhcp43-93.lab.eng.blr.redhat.com:/run/gluster/snaps/4f853d0c5e5c4fb0ad31b1edf03873b4/brick3/b3
Brick4: dhcp41-170.lab.eng.blr.redhat.com:/run/gluster/snaps/4f853d0c5e5c4fb0ad31b1edf03873b4/brick4/b4
Options Reconfigured:
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
transport.address-family: inet
nfs.disable: on
features.quota: off
features.inode-quota: off
features.quota-deem-statfs: off
snap-activate-on-create: enable
auto-delete: enable

Comment 4 Darshan 2017-11-24 12:52:44 UTC
The fix for this would involve the following steps:

1. Listen for the "snapshot_restore" gluster API event in tendrl. This event only provides the volume name.

2. When the above event is received, tendrl has to make another get-state call to fetch the current list of bricks for the restored volume.

3. Read the bricks recorded for this volume from the data store (etcd).

4. Find the bricks that are present in the data store but missing from the latest get-state output; these are the bricks to be removed (see the sketch after this list).

5. Remove those bricks from the data store (etcd).

6. Submit a job for monitoring-integration to remove the bricks from Graphite.
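
A minimal Python sketch of the brick diff in step 4; the helper name and the sample data are purely illustrative and not an actual tendrl API:

def find_stale_bricks(etcd_bricks, get_state_bricks):
    """Bricks recorded in the data store (etcd) but absent from the latest get-state output."""
    return set(etcd_bricks) - set(get_state_bricks)

# Bricks stored for the volume before the snapshot restore (old /rhs/brick* paths).
stored = [
    "dhcp43-93.lab.eng.blr.redhat.com:/rhs/brick1/b1",
    "dhcp41-170.lab.eng.blr.redhat.com:/rhs/brick2/b2",
]

# Bricks reported by the post-restore get-state call (snapshot brick paths).
current = [
    "dhcp43-93.lab.eng.blr.redhat.com:/run/gluster/snaps/<snap-id>/brick1/b1",
    "dhcp41-170.lab.eng.blr.redhat.com:/run/gluster/snaps/<snap-id>/brick2/b2",
]

for brick in find_stale_bricks(stored, current):
    # In the actual fix these entries would be deleted from etcd and a job
    # submitted to monitoring-integration to drop the matching Graphite series.
    print("stale brick to remove:", brick)

Without the cleanup in steps 5 and 6, the Graphite series for the old brick paths keep feeding the Grafana brick panels, which is the symptom reported here.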

Comment 6 Bala Konda Reddy M 2017-12-05 16:08:03 UTC
Verified with tendrl-monitoring-integration-1.5.4-11.el7rhgs.noarch

On a successfully imported cluster, I created volumes and a snapshot, then stopped the volume and performed a snapshot restore.

After the snapshot restore, the new bricks are visible on the bricks dashboard.

Hence, marking this bug as verified.

Comment 8 errata-xmlrpc 2017-12-18 04:38:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:3478

