Bug 1517131 - Grafana continues to list/display old bricks subsequent to a snapshot restore
Summary: Grafana continues to list/display old bricks subsequent to a snapshot restore
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: web-admin-tendrl-monitoring-integration
Version: rhgs-3.3
Hardware: x86_64
OS: Linux
Target Milestone: ---
Assignee: Darshan
QA Contact: Rochelle
Depends On:
Reported: 2017-11-24 09:51 UTC by Vinayak Papnoi
Modified: 2017-12-18 04:38 UTC (History)
CC List: 10 users

Fixed In Version: tendrl-monitoring-integration-1.5.4-11.el7rhgs.noarch
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Last Closed: 2017-12-18 04:38:26 UTC
Target Upstream Version:

Attachments (Terms of Use)
The volume is 2x2, but the Bricks section shows 8 bricks; the old bricks (/rhs/brick*) are still present (184.17 KB, image/png)
2017-11-24 09:51 UTC, Vinayak Papnoi

System ID Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2017:3478 normal SHIPPED_LIVE RHGS Web Administration packages 2017-12-18 09:34:49 UTC
Github Tendrl gluster-integration issues 518 None None None 2017-11-29 09:23:54 UTC

Description Vinayak Papnoi 2017-11-24 09:51:00 UTC
Created attachment 1358578 [details]
The volume is 2x2, but the Bricks section shows 8 bricks; the old bricks (/rhs/brick*) are still present

Description of problem:

After a snapshot restore of a volume, the old bricks are not removed from that volume's list of bricks in the Grafana dashboard.

Version-Release number of selected component (if applicable):


How reproducible:

Steps to Reproduce:

1. Create a volume
2. Create a snapshot of the volume
3. Stop the volume, then activate and restore the snapshot
4. Start the volume
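On the Gluster CLI, the reproduction steps above correspond roughly to the following (volume, snapshot, host, and brick-path names are illustrative):

```shell
# 1. Create (and start) a 2x2 distributed-replicate volume
gluster volume create testvol replica 2 \
    server1:/bricks/b1 server2:/bricks/b2 \
    server1:/bricks/b3 server2:/bricks/b4
gluster volume start testvol

# 2. Create a snapshot of the volume
gluster snapshot create snap1 testvol

# 3. Stop the volume, then activate and restore the snapshot
#    (a volume must be stopped before it can be restored)
gluster volume stop testvol
gluster snapshot activate snap1
gluster snapshot restore snap1

# 4. Start the volume again; its bricks now live under
#    /run/gluster/snaps/<snap-uuid>/..., but Grafana still lists
#    the pre-restore brick paths as well
gluster volume start testvol
```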

Actual results:

The old (pre-restore) bricks are still listed for the volume in Grafana.

Expected results:

The old bricks should no longer be listed; only the restored volume's current bricks should appear.

Additional info:

[root@dhcp43-93 glusterfs]# gluster v status speedster
Status of volume: speedster
Gluster process                             TCP Port  RDMA Port  Online  Pid
Brick dhcp43-93.lab.eng.blr.redhat.com:/run
73b4/brick1/b1                              49152     0          Y       7721 
Brick dhcp41-170.lab.eng.blr.redhat.com:/ru
873b4/brick2/b2                             49152     0          Y       27235
Brick dhcp43-93.lab.eng.blr.redhat.com:/run
73b4/brick3/b3                              49153     0          Y       7754 
Brick dhcp41-170.lab.eng.blr.redhat.com:/ru
873b4/brick4/b4                             49153     0          Y       27256
Self-heal Daemon on localhost               N/A       N/A        Y       2882 
Self-heal Daemon on dhcp41-170.lab.eng.blr.
redhat.com                                  N/A       N/A        Y       21118
Task Status of Volume speedster
There are no active volume tasks
[root@dhcp43-93 glusterfs]# gluster v info
Volume Name: speedster
Type: Distributed-Replicate
Volume ID: c4a9eacd-1c97-4c44-97fb-8619bf348dde
Status: Started
Snapshot Count: 254
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Brick1: dhcp43-93.lab.eng.blr.redhat.com:/run/gluster/snaps/4f853d0c5e5c4fb0ad31b1edf03873b4/brick1/b1
Brick2: dhcp41-170.lab.eng.blr.redhat.com:/run/gluster/snaps/4f853d0c5e5c4fb0ad31b1edf03873b4/brick2/b2
Brick3: dhcp43-93.lab.eng.blr.redhat.com:/run/gluster/snaps/4f853d0c5e5c4fb0ad31b1edf03873b4/brick3/b3
Brick4: dhcp41-170.lab.eng.blr.redhat.com:/run/gluster/snaps/4f853d0c5e5c4fb0ad31b1edf03873b4/brick4/b4
Options Reconfigured:
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
transport.address-family: inet
nfs.disable: on
features.quota: off
features.inode-quota: off
features.quota-deem-statfs: off
snap-activate-on-create: enable
auto-delete: enable

Comment 4 Darshan 2017-11-24 12:52:44 UTC
The fix for this involves the following steps:

1. Listen for the "snapshot_restore" Gluster API event in tendrl. This event provides only the volume name.

2. When the above event is received, tendrl makes another get-state call to fetch the current list of bricks for the restored volume.

3. Read the bricks recorded for this volume from the data store (etcd).

4. Any brick that is in the data store but not in the latest get-state output is a stale brick to be removed.

5. Remove those bricks from the data store (etcd).

6. Submit a job for monitoring-integration to remove the bricks from Graphite.
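The diff at the core of steps 3-6 can be sketched as follows. This is a minimal illustration, not the actual tendrl code: the function names, the etcd key layout, and the job tuples are all hypothetical.

```python
def bricks_to_remove(etcd_bricks, get_state_bricks):
    """Return bricks recorded in the data store (etcd) that are absent
    from the latest get-state output for the restored volume."""
    return sorted(set(etcd_bricks) - set(get_state_bricks))

def reconcile_after_restore(volume, etcd_bricks, get_state_bricks):
    """Compute stale bricks and the cleanup work they imply."""
    stale = bricks_to_remove(etcd_bricks, get_state_bricks)
    jobs = []
    for brick in stale:
        # Step 5: delete the brick's entry from etcd
        # (illustrative key path, not the real tendrl layout)
        jobs.append(("etcd_delete", f"/volumes/{volume}/bricks/{brick}"))
        # Step 6: job for monitoring-integration to drop the
        # brick's series from Graphite
        jobs.append(("graphite_delete", brick))
    return stale, jobs
```

The set difference ensures that only bricks which disappeared in the restore are touched; bricks present in both the data store and the get-state output are left alone.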

Comment 6 Bala Konda Reddy M 2017-12-05 16:08:03 UTC
Verified with tendrl-monitoring-integration-1.5.4-11.el7rhgs.noarch

On a successfully imported cluster, created volumes and a snapshot. Stopped the volume and performed a snapshot restore.

I am able to see new bricks on the bricks dashboard after snapshot restore.

Hence marking it as verified

Comment 8 errata-xmlrpc 2017-12-18 04:38:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

