Bug 1098045 - [SNAPSHOT]: recreation of already deleted/restored snaps occur due to optimization
Keywords:
Status: CLOSED EOL
Alias: None
Product: GlusterFS
Classification: Community
Component: snapshot
Version: pre-release
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Assignee: Avra Sengupta
QA Contact:
URL:
Whiteboard: SNAPSHOT
Depends On:
Blocks: 1098103
Reported: 2014-05-15 07:21 UTC by Avra Sengupta
Modified: 2015-10-22 15:40 UTC

Fixed In Version:
Clone Of:
: 1098103 (view as bug list)
Environment:
Last Closed: 2015-10-22 15:40:20 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Avra Sengupta 2014-05-15 07:21:43 UTC
Description of problem:
During snapshot create, if a brick is down but the quorum is met, the snapshot create succeeds. An entry for the missed create is recorded in the missed_snaps_list. The same is done for missed deletes and restores.

There was an optimization: if a create is pending for a brick and a delete/restore is then missed for the same brick, we don't add a new entry to the list for the delete/restore. We just mark the create entry as done. The rationale behind this optimization was that if nothing has been created yet, there is nothing to delete or restore from.
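
A minimal sketch of the optimization described above may help. This is an illustrative model only, not the glusterd implementation: the `MissedEntry` class, the `record_missed_op` helper, and the field names are all hypothetical.

```python
from dataclasses import dataclass

PENDING, DONE = "pending", "done"

@dataclass
class MissedEntry:
    """One entry in a modelled missed_snaps_list (hypothetical layout)."""
    brick: str
    snap: str
    op: str              # "create", "delete" or "restore"
    status: str = PENDING

def record_missed_op(entries, brick, snap, op, optimize=True):
    """Record an operation that a down brick missed (hypothetical helper)."""
    if optimize and op in ("delete", "restore"):
        for entry in entries:
            if (entry.brick, entry.snap, entry.op, entry.status) == (
                brick, snap, "create", PENDING
            ):
                # Optimization: the snap was never created on this brick,
                # so mark the create as done instead of queuing the delete.
                entry.status = DONE
                return entries
    entries.append(MissedEntry(brick, snap, op))
    return entries

entries = []
record_missed_op(entries, "brick1", "snap1", "create")
record_missed_op(entries, "brick1", "snap1", "delete")
print(entries)
```

In this model the list ends up with a single create entry marked done and no delete entry, which is the behaviour the rest of this report depends on.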

The issue with this optimization is that when the create actually happens after the brick comes up, the entry is updated from pending to done only on the node hosting the brick (as that node alone is aware of this change). The update is propagated to the other nodes only during subsequent handshakes following this event, not before.

Suppose that after the create succeeds, the node hosting the brick (and holding the updated entry) goes down again, and a delete/restore command is issued during this time. The other nodes will apply the optimization: instead of adding a delete entry, they mark the pending create entry as done. This leads to an inconsistency: when the node comes back up, it not only still contains the stale snap, but also makes the other nodes believe that it's a new snap.
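
The sequence above can be traced with a small simulation. This is a sketch under stated assumptions: each node's list is modelled as a dictionary, and the names `host_view`, `peer_view`, and `record_missed_delete` are hypothetical, not taken from glusterd.

```python
# Hypothetical model: each node's view maps (brick, snap, op) -> status.
CREATE = ("brick1", "snap1", "create")
DELETE = ("brick1", "snap1", "delete")

def record_missed_delete(view):
    """Record a missed delete using the optimization (assumed logic)."""
    if view.get(CREATE) == "pending":
        view[CREATE] = "done"      # optimization: cancel the pending create
    else:
        view[DELETE] = "pending"   # normal path: queue the delete

# Both nodes start with the missed create recorded as pending.
host_view = {CREATE: "pending"}    # node hosting the brick
peer_view = {CREATE: "pending"}    # any other node
host_snaps = set()                 # snapshots actually present on the host

# The brick comes up and the missed create is performed; only the host knows.
host_snaps.add("snap1")
host_view[CREATE] = "done"

# The host goes down before any handshake; a delete is issued on the peer.
record_missed_delete(peer_view)

# The two views now agree, yet the host still holds the snapshot and
# no delete entry exists anywhere to remove it.
print(host_view == peer_view, DELETE in peer_view, host_snaps)
```

The point of the model is that the two lists look identical after the race, so a later sync has nothing to reconcile even though the snapshot still exists on the host.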


Version-Release number of selected component (if applicable):


How reproducible:
Every time


Steps to Reproduce:
1. Run "pkill gluster" on one node.
2. Issue a create command from another node, and check /var/lib/glusterd/snaps/missed_snaps_list for the missed create entry.
3. Bring back glusterd on the first node.
4. The missed_snaps_list gets synced and the missing snapshot is taken. On the first node, check /var/lib/glusterd/snaps/missed_snaps_list to see that the entry is now updated from pending to done.
5. Run "pkill gluster" on the first node again.
6. Now issue the delete command for the same snap from another node.
7. Check /var/lib/glusterd/snaps/missed_snaps_list on that node to see that a new delete entry is not added, but the create entry is marked as done.
8. Now bring back glusterd on the first node.

Actual results:
When the first node comes back after the delete command is issued, the synced missed_snaps_list does not contain any delete info. Hence the node doesn't delete the stale snap.
The other nodes also don't have any delete entry at this point, so they assume the stale snap info is a new snap and recreate the stale snap, leaving the cluster in an inconsistent state.


Expected results:
The delete info should be correctly propagated, and the stale snap should be deleted. The other nodes should therefore not recreate the already deleted/restored snap.
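
The expected behaviour can be sketched by always queuing the delete entry, with no optimization. This is a hypothetical model, not the actual fix: the dictionary layout and the `record_missed_delete` and `sync_and_replay` helpers are assumptions made for illustration.

```python
# Hypothetical model: a view maps (brick, snap, op) -> status.
CREATE = ("brick1", "snap1", "create")
DELETE = ("brick1", "snap1", "delete")

def record_missed_delete(view):
    """Always queue the delete, regardless of any create entry."""
    view[DELETE] = "pending"

def sync_and_replay(local_view, peer_view, local_snaps):
    """Merge entries missing locally, then perform pending deletes (assumed)."""
    for key, status in peer_view.items():
        local_view.setdefault(key, status)
    for (brick, snap, op), status in list(local_view.items()):
        if op == "delete" and status == "pending":
            local_snaps.discard(snap)               # remove the stale snap
            local_view[(brick, snap, op)] = "done"

# State just before the host goes down for the second time.
host_view = {CREATE: "done"}       # host already performed the create
host_snaps = {"snap1"}
peer_view = {CREATE: "pending"}    # peer has not yet seen the update

# The delete is issued while the host is down, then the host returns.
record_missed_delete(peer_view)
sync_and_replay(host_view, peer_view, host_snaps)
print(host_snaps, host_view[DELETE])
```

Because the delete entry survives in the peer's list, the returning node has something to act on, and the stale snap is removed instead of being advertised as new.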


Additional info:

Comment 1 Anand Avati 2014-05-20 14:40:28 UTC
REVIEW: http://review.gluster.org/7811 (glusterd/snapshot: Removing missed snap optimization) posted (#1) for review on master by Avra Sengupta (asengupt)

Comment 2 Anand Avati 2014-05-28 12:21:47 UTC
REVIEW: http://review.gluster.org/7811 (glusterd/snapshot: Removing missed snap optimization) posted (#2) for review on master by Avra Sengupta (asengupt)

Comment 3 Anand Avati 2014-06-16 11:44:03 UTC
REVIEW: http://review.gluster.org/8077 (DUMMY PATCH FOR LOGS) posted (#1) for review on master by Avra Sengupta (asengupt)

Comment 4 Kaleb KEITHLEY 2015-10-22 15:40:20 UTC
pre-release version is ambiguous and about to be removed as a choice.

If you believe this is still a bug, please change the status back to NEW and choose the appropriate, applicable version for it.

