Bug 1175754 - [SNAPSHOT]: if a node goes down before the snap is marked to be deleted, the snaps are propagated to other nodes and glusterd hangs
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: glusterd
Version: 3.6.1
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard: SNAPSHOT
Depends On: 1101463 1104635 1104714
Blocks: glusterfs-3.6.2
 
Reported: 2014-12-18 14:05 UTC by Vijaikumar Mallikarjuna
Modified: 2016-05-11 22:47 UTC
CC List: 7 users

Fixed In Version: glusterfs-3.6.2
Doc Type: Bug Fix
Doc Text:
Clone Of: 1104714
Environment:
Last Closed: 2015-02-11 09:11:39 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Comment 1 Anand Avati 2014-12-18 15:06:45 UTC
REVIEW: http://review.gluster.org/9299 (glusterd/snapshot: Check if LVM device path exists before delete.) posted (#1) for review on release-3.6 by Sachin Pandit (spandit)

Comment 2 Anand Avati 2014-12-18 17:20:57 UTC
REVIEW: http://review.gluster.org/9299 (glusterd/snapshot: Check if LVM device path exists before delete) posted (#2) for review on release-3.6 by Sachin Pandit (spandit)

Comment 3 Anand Avati 2014-12-19 07:20:01 UTC
REVIEW: http://review.gluster.org/9299 (glusterd/snapshot: Check if LVM device path exists before delete.) posted (#3) for review on release-3.6 by Sachin Pandit (spandit)

Comment 4 Anand Avati 2014-12-19 09:00:13 UTC
REVIEW: http://review.gluster.org/9299 (glusterd/snapshot: Check if LVM device path exists before delete) posted (#4) for review on release-3.6 by Sachin Pandit (spandit)

Comment 5 Anand Avati 2014-12-22 12:17:58 UTC
COMMIT: http://review.gluster.org/9299 committed in release-3.6 by Raghavendra Bhat (raghavendra) 
------
commit f9b4ef19d9e80ab4723e947a04c1c094843a0b6f
Author: Avra Sengupta <asengupt>
Date:   Tue Oct 21 08:42:40 2014 +0000

    glusterd/snapshot: Check if LVM device path exists before delete
    
    Check if the LV is present before deleting the LV. In a case where
    the LV is absent (already deleted?), we need not fail the snap delete
    operation.
    
    Also check if the LV is mounted before trying umount. In case it
    isn't mounted, only remove the LV.
    
    Change-Id: I0f5b2674797299d8748c6fac5b091f0caba65ca4
    BUG: 1175754
    Signed-off-by: Avra Sengupta <asengupt>
    Reviewed-on: http://review.gluster.org/8954
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Krishnan Parthasarathi <kparthas>
    Tested-by: Krishnan Parthasarathi <kparthas>
    Signed-off-by: Sachin Pandit <spandit>
    Reviewed-on: http://review.gluster.org/9299
    Reviewed-by: Raghavendra Bhat <raghavendra>
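
The fix described above boils down to two guard checks before the destructive LVM operations. The following is a minimal shell sketch of that ordering only; the actual change is in glusterd's C code, and the device path and mount point below are hypothetical placeholders, not values from this bug:

    #!/bin/sh
    # Sketch only -- the real fix is implemented in glusterd (C), not shell.
    # LV_PATH and MNT are hypothetical placeholders.
    LV_PATH=/dev/vg_snaps/snap_lv               # assumed snapshot LV device path
    MNT=/run/gluster/snaps/snap1/brick1         # assumed brick mount point

    # Guard 1: only attempt umount if the path is actually mounted.
    if mountpoint -q "$MNT"; then
        umount "$MNT"
    fi

    # Guard 2: only attempt lvremove if the LV device still exists.
    # If the LV is already gone (e.g. deleted before the node went down),
    # skip it and let the snapshot delete succeed instead of failing.
    if [ -e "$LV_PATH" ]; then
        lvremove -f "$LV_PATH"
    fi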

Comment 6 Raghavendra Bhat 2015-01-06 10:18:37 UTC
+++ This bug was initially created as a clone of Bug #1104714 +++

+++ This bug was initially created as a clone of Bug #1104635 +++

Description of problem:
=======================

When a snap delete is issued, if the node goes down before the snap is marked to be deleted, then when the node comes back up the snap is propagated back to the other nodes and glusterd hangs.



Version-Release number of selected component (if applicable):
=============================================================

glusterfs-3.6.1


Steps to Reproduce:
===================
1. Set up a 4-node cluster
2. Create a volume
3. Create 256 snapshots of the volume
4. Start deleting the snapshots of the volume in a loop (--mode=script)
5. While snap deletion is in progress, stop and start the glusterd service on one node multiple times (a reproduction sketch follows this list).
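
The following is a minimal reproduction sketch of the steps above; the node names, brick paths, and volume name are assumptions, and the bricks must sit on thinly provisioned LVM for snapshots to work:

    #!/bin/sh
    # Reproduction sketch; node names, brick paths and volume name are assumed.
    gluster peer probe node2 && gluster peer probe node3 && gluster peer probe node4
    gluster volume create vol0 node1:/bricks/b0 node2:/bricks/b0 \
        node3:/bricks/b0 node4:/bricks/b0
    gluster volume start vol0

    # Step 3: create 256 snapshots (256 is the default snap-max-hard-limit).
    for i in $(seq 1 256); do
        gluster snapshot create snap$i vol0
    done

    # Step 4: delete them in a loop; --mode=script suppresses the
    # interactive confirmation prompts.
    for i in $(seq 1 256); do
        gluster --mode=script snapshot delete snap$i
    done

    # Step 5: while the delete loop runs, repeatedly restart glusterd on
    # one node from another shell, e.g.:
    #   ssh node2 'service glusterd restart'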


Actual results:
===============
1. The snapshot commit failed on the node which went down.
2. Once the node is brought back, the snap is present on all the systems and there is no entry in the missed_entry_list.
3. glusterd hangs on the machines which were up.


Expected results:
=================

1. The snapshot delete should fail with a proper error message.
2. Once the node is brought back, the snap should be deleted from all the nodes.
3. glusterd should not hang.

Since this not only hampers the missed-snap functionality but also makes the whole cluster unresponsive, the bug is being raised with urgent severity.

Comment 7 Raghavendra Bhat 2015-02-11 09:11:39 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.6.2, please reopen this bug report.

glusterfs-3.6.2 has been announced on the Gluster Developers mailing list [1]; packages for several distributions should already be available, or will become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

The fix for this bug is likely to be included in all future GlusterFS releases, i.e. releases > 3.6.2.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/5978
[2] http://news.gmane.org/gmane.comp.file-systems.gluster.user
[3] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/6137

