Bug 1167012 - self-heal-algorithm with option "full" doesn't heal sparse files correctly
Summary: self-heal-algorithm with option "full" doesn't heal sparse files correctly
Keywords:
Status: CLOSED EOL
Alias: None
Product: GlusterFS
Classification: Community
Component: replicate
Version: 3.4.6
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Ravishankar N
QA Contact:
URL:
Whiteboard:
Depends On: 1166020 1179563 1187547 1190633
Blocks:
 
Reported: 2014-11-22 16:18 UTC by Pranith Kumar K
Modified: 2015-10-07 14:01 UTC
CC List: 4 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of: 1166020
Environment:
Last Closed: 2015-10-07 14:01:36 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Pranith Kumar K 2014-11-22 16:18:52 UTC
+++ This bug was initially created as a clone of Bug #1166020 +++

Description of problem:
Here is Lindsay Mathieson's email on gluster-users with the description of the problems she faced.

On 11/18/2014 05:35 PM, Lindsay Mathieson wrote:
>
> I have a VM image which is a sparse file - 512GB allocated, but only 32GB used.
>
> root@vnb:~# ls -lh /mnt/gluster-brick1/datastore/images/100
> total 31G
> -rw------- 2 root root 513G Nov 18 19:57 vm-100-disk-1.qcow2
>
> I switched to full sync and rebooted.
>
> Heal was started on the image and it seemed to be just transferring the full file from node vnb to vng. iftop showed bandwidth at 500 Mb/s.
>
> Eventually the cumulative transfer got to 140GB, which seemed odd as the real file size was 31G. I logged onto the second node (vng) and the *real* file size was up to 191GB.
>
> It looks like the heal is not handling sparse files; rather, it is transferring empty bytes to make up the allocated size. That's a serious problem for the common habit of overcommitting your disk space with VM images, not to mention the inefficiency.
Ah! This problem doesn't exist in diff self-heal :-(, because the checksums of the files match in the sparse regions. In full self-heal we simply read from the source file and write to the sink file. What we can change there is: if the file is sparse and the data read from the source is all zeros (reads return zeros in sparse regions), also read the same region of the stale file and check whether it is all zeros too; if both are zeros, skip the write. I also checked that if the sparse file is created while the other brick is down, the heal does preserve the holes (i.e. the sparse regions). The problem only appears when the file already exists at its full size on both bricks and a full self-heal is performed, as in this case :-(.
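
A rough sketch of that zero-skip idea, for illustration only. This is not the actual AFR self-heal code path; the block size, the function names and the plain pread()/pwrite() loop are assumptions made for the example.

/*
 * Sketch only: copy src_fd to sink_fd block by block, but when a source
 * block is all zeros, skip the write if the corresponding sink block is
 * all zeros as well, so existing holes in the sink are not filled in.
 */
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

#define HEAL_BLOCK (128 * 1024)   /* arbitrary block size for the sketch */

static bool all_zeros(const char *buf, ssize_t len)
{
        for (ssize_t i = 0; i < len; i++)
                if (buf[i] != 0)
                        return false;
        return true;
}

int sparse_aware_full_heal(int src_fd, int sink_fd)
{
        char src_buf[HEAL_BLOCK], sink_buf[HEAL_BLOCK];
        off_t off = 0;
        ssize_t n;

        while ((n = pread(src_fd, src_buf, sizeof(src_buf), off)) > 0) {
                bool skip = false;

                if (all_zeros(src_buf, n)) {
                        /* Source block is a hole or zero data: only write if
                         * the sink block differs from zeros. */
                        ssize_t m = pread(sink_fd, sink_buf, n, off);
                        skip = (m == n) && all_zeros(sink_buf, n);
                }

                if (!skip && pwrite(sink_fd, src_buf, n, off) != n)
                        return -1;

                off += n;
        }
        return (n < 0) ? -1 : 0;
}

Note that the write is skipped only when both sides are already zero, so a region that is zero on the source but holds stale data on the sink still gets overwritten.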

Thanks for your valuable inputs. So basically you found two issues. I will raise two bugs, one for each of the issues you found. I can CC you on the bug so that you can see the update once it is fixed. Do you want to be CCed?

Pranith
>
> thanks,
>
> -- 
> Lindsay
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users
> http://supercolony.gluster.org/mailman/listinfo/gluster-users

Version-Release number of selected component (if applicable):
Reported on 3.5.2, but the issue exists in all releases.

How reproducible:
always

Steps to Reproduce:
1. Create a plain or distributed replicate volume.
2. Create a sparse VM image on the volume.
3. Set cluster.data-self-heal-algorithm to "full" on the volume.
4. Bring one brick down and modify data in the VM.
5. Bring the brick back up so that self-heal runs.
6. The full heal writes the sparse regions out as real data on the healed brick, defeating the space savings of sparse VM images (a quick check is sketched below).
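
Whether sparseness survived on the healed brick can be confirmed by comparing the file's apparent size with the blocks actually allocated (du versus ls -l shows the same thing); a minimal, purely illustrative helper:

/* Illustrative only: report apparent size vs. bytes allocated on disk
 * (st_blocks is counted in 512-byte units). */
#include <stdio.h>
#include <sys/stat.h>

int main(int argc, char **argv)
{
        struct stat st;

        if (argc != 2) {
                fprintf(stderr, "usage: %s <file-on-brick>\n", argv[0]);
                return 1;
        }
        if (stat(argv[1], &st) != 0) {
                perror("stat");
                return 1;
        }

        long long apparent = (long long)st.st_size;
        long long allocated = (long long)st.st_blocks * 512;

        printf("apparent: %lld bytes, allocated: %lld bytes -> %s\n",
               apparent, allocated,
               allocated < apparent ? "still sparse" : "fully allocated");
        return 0;
}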

Comment 1 Niels de Vos 2015-05-17 21:59:08 UTC
GlusterFS 3.7.0 has been released (http://www.gluster.org/pipermail/gluster-users/2015-May/021901.html), and the Gluster project maintains N-2 supported releases. The last two releases before 3.7 are still maintained; at the moment these are 3.6 and 3.5.

This bug has been filed against the 3.4 release and will not get fixed in a 3.4 version any more. Please verify whether newer versions are affected by the reported problem. If that is the case, update the bug with a note, and update the version if you can. If updating the version is not possible, leave a comment in this bug report with the version you tested, and set the "Need additional information the selected bugs from" field below the comment box to "bugs".

If there is no response by the end of the month, this bug will get automatically closed.

Comment 2 Kaleb KEITHLEY 2015-10-07 14:01:36 UTC
GlusterFS 3.4.x has reached end-of-life.

If this bug still exists in a later release, please reopen it and change the version, or open a new bug.

