Description of problem:
-----------------------
When an RHGS node is removed from the trusted storage pool, the snapshot daemon still runs on that node.

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
RHGS 3.1 Nightly build

How reproducible:
-----------------
Always

Steps to Reproduce:
-------------------
1. Create a 'Trusted Storage Pool' (gluster cluster) of 3 RHGS nodes (NODE1, NODE2, NODE3).
2. Create a volume of any type with bricks from NODE1 and NODE2, and start the volume.
3. Create a gluster volume snapshot of the volume.
4. Activate the snapshot.
5. Enable USS (user-serviceable snapshots) on the volume.
6. Remove NODE3 from the 'Trusted Storage Pool' (see the command sketch below).

Actual results:
---------------
The snapshot daemon is still running on NODE3.

Expected results:
-----------------
The snapshot daemon should no longer be running, since NODE3 does not host any volumes or bricks.
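A minimal command sketch of the reproduction steps above, assuming gluster 3.7-era CLI syntax. The node names NODE1/NODE2/NODE3, the volume name testvol, the snapshot name snap1 and the brick paths are placeholders, not values taken from this report, and snapshot creation additionally assumes the bricks sit on thin-provisioned LVM:

# From NODE1: build the 3-node trusted storage pool
gluster peer probe NODE2
gluster peer probe NODE3

# Volume with bricks on NODE1 and NODE2 only (a plain distribute volume is enough)
gluster volume create testvol NODE1:/rhs/brick1/b1 NODE2:/rhs/brick1/b1
gluster volume start testvol

# Snapshot the volume, activate it, and enable USS on it
gluster snapshot create snap1 testvol
gluster snapshot list                        # the created name may carry a timestamp suffix
gluster snapshot activate <name-from-list>
gluster volume set testvol features.uss enable

# Detach the brick-less node, then look for a stray snapshot daemon on it
gluster peer detach NODE3
# (on NODE3) ps aux | grep snapd   -> with the fix, no snapd process should remain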
Created attachment 1044922 [details] sosreport from node1 - dhcp37-46
Created attachment 1044923 [details] sosreport from node2 - dhcp37-117
Created attachment 1044924 [details] sosreport from node3 - dhcp37-167
patch available in upstream: http://review.gluster.org/#/c/11509/
(In reply to Gaurav Kumar Garg from comment #7)
> patch available in upstream: http://review.gluster.org/#/c/11509/

This patch is meant for some other problem. I don't think it will solve this problem.
(In reply to Atin Mukherjee from comment #10)
> (In reply to Gaurav Kumar Garg from comment #7)
> > patch available in upstream: http://review.gluster.org/#/c/11509/
>
> This patch is meant for some other problem. I don't think it will solve this
> problem.

Ignore this comment.
Patch upstream at http://review.gluster.org/#/c/11509/
downstream patch url: https://code.engineering.redhat.com/gerrit/56148
Verified this bug with the glusterfs-3.7.1-13 build and it is working as expected. Steps followed are as below:

1) Added the nodes to the cluster:

[root@dhcp35-148 brick1]# gluster peer status
Number of Peers: 3

Hostname: 10.70.35.28
Uuid: a7b191cd-c7f0-4325-90bb-0aee49bff301
State: Peer in Cluster (Connected)

Hostname: 10.70.35.214
Uuid: d2fe96b1-b2d5-4ee6-a166-f677bb2bc3aa
State: Peer in Cluster (Connected)

Hostname: 10.70.35.211
Uuid: 735e1a96-8770-4af6-afac-55acb9789d1c
State: Peer in Cluster (Connected)

2) Created the volume and a snapshot of it, and enabled USS, quota and NFS on the volume.

3) Verified that these services are running on the node to be detached (Hostname: 10.70.35.211):

[root@dhcp35-211 brick1]# ps aux|grep snapd
root      7189  0.0  0.2 530816 21164 ?  Ssl  16:09  0:00 /usr/sbin/glusterfsd -s localhost --volfile-id snapd/testvolume -p /var/lib/glusterd/vols/testvolume/run/testvolume-snapd.pid -l /var/log/glusterfs/snaps/testvolume/snapd.log --brick-name snapd-testvolume -S /var/run/gluster/90e5690e1ba0cf3c897a1744552e980d.socket --brick-port 49168 --xlator-option testvolume-server.listen-port=49168 --no-mem-accounting
root      7326  0.0  0.0 112644   964 pts/0  S+  16:12  0:00 grep --color=auto snapd

[root@dhcp35-211 brick1]# ps aux|grep quota
root      7213  0.0  0.4 459468 32300 ?  Ssl  16:09  0:00 /usr/sbin/glusterfs -s localhost --volfile-id gluster/quotad -p /var/lib/glusterd/quotad/run/quotad.pid -l /var/log/glusterfs/quotad.log -S /var/run/gluster/887f0ad839cfcbb6e3655f06020a40bf.socket --xlator-option *replicate*.data-self-heal=off --xlator-option *replicate*.metadata-self-heal=off --xlator-option *replicate*.entry-self-heal=off
root      7328  0.0  0.0 112640   964 pts/0  S+  16:12  0:00 grep --color=auto quota

[root@dhcp35-211 brick1]# ps aux|grep nfs
root      7199  0.0  0.8 578896 65156 ?  Ssl  16:09  0:00 /usr/sbin/glusterfs -s localhost --volfile-id gluster/nfs -p /var/lib/glusterd/nfs/run/nfs.pid -l /var/log/glusterfs/nfs.log -S /var/run/gluster/865cb336d8dffca50721eacbd0d46180.socket
root      7338  0.0  0.0 112644   960 pts/0  S+  16:12  0:00 grep --color=auto nfs

[root@dhcp35-211 brick1]# gluster volume status
Status of volume: testvolume
Gluster process                                        TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.148:/rhs/brick1/b1                      49165     0          Y       9244
Brick 10.70.35.28:/rhs/brick1/b1                       49165     0          Y       6421
Brick 10.70.35.148:/rhs/brick2/b2                      49166     0          Y       9262
Brick 10.70.35.28:/rhs/brick2/b2                       49166     0          Y       6448
Snapshot Daemon on localhost                           49168     0          Y       7189
NFS Server on localhost                                2049      0          Y       7199
Self-heal Daemon on localhost                          N/A       N/A        Y       7204
Quota Daemon on localhost                              N/A       N/A        Y       7213
Snapshot Daemon on dhcp35-148.lab.eng.blr.redhat.com   49169     0          Y       9650
NFS Server on dhcp35-148.lab.eng.blr.redhat.com        2049      0          Y       10152
Self-heal Daemon on dhcp35-148.lab.eng.blr.redhat.com  N/A       N/A        Y       10160
Quota Daemon on dhcp35-148.lab.eng.blr.redhat.com      N/A       N/A        Y       10601
Snapshot Daemon on 10.70.35.214                        49162     0          Y       6867
NFS Server on 10.70.35.214                             2049      0          Y       7217
Self-heal Daemon on 10.70.35.214                       N/A       N/A        Y       7225
Quota Daemon on 10.70.35.214                           N/A       N/A        Y       7569
Snapshot Daemon on 10.70.35.28                         49169     0          Y       6683
NFS Server on 10.70.35.28                              2049      0          Y       7063
Self-heal Daemon on 10.70.35.28                        N/A       N/A        Y       7071
Quota Daemon on 10.70.35.28                            N/A       N/A        Y       7409

Task Status of Volume testvolume
------------------------------------------------------------------------------
There are no active volume tasks

4) Detached the node (Hostname: 10.70.35.211) from the cluster:

[root@dhcp35-148 brick1]# gluster peer detach 10.70.35.211
peer detach: success

[root@dhcp35-148 brick1]# gluster peer status
Number of Peers: 2

Hostname: 10.70.35.28
Uuid: a7b191cd-c7f0-4325-90bb-0aee49bff301
State: Peer in Cluster (Connected)

Hostname: 10.70.35.214
Uuid: d2fe96b1-b2d5-4ee6-a166-f677bb2bc3aa
State: Peer in Cluster (Connected)

5) Verified that after detaching the node (Hostname: 10.70.35.211) from the cluster, all the services are stopped on that node and no volume is present:

[root@dhcp35-211 brick1]# gluster volume status
No volumes present

[root@dhcp35-211 brick1]# ps aux|grep snapd
root      7467  0.0  0.0 112640   964 pts/0  S+  16:18  0:00 grep --color=auto snapd

[root@dhcp35-211 brick1]# ps aux|grep quota
root      7469  0.0  0.0 112640   964 pts/0  S+  16:18  0:00 grep --color=auto quota

[root@dhcp35-211 brick1]# ps aux|grep nfs
root      7471  0.0  0.0 112640   964 pts/0  S+  16:18  0:00 grep --color=auto nfs

Based on the above observations, marking this bug as verified.
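For anyone re-running the check in step 5, a minimal post-detach sanity check on the detached node. This loop is only a sketch and is not part of the original verification; the daemon names simply mirror the ps output above:

# On the detached node: the volume list should be empty ...
gluster volume status          # expect: "No volumes present"

# ... and none of the per-volume daemons should have survived the detach
for d in snapd quotad nfs; do
    pgrep -fl "gluster.*$d" || echo "$d: not running"
done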
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2015-1845.html