Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1206461

Summary: sparse file self heal fail under xfs version 2 with speculative preallocation feature on
Product: [Community] GlusterFS
Component: replicate
Version: mainline
Hardware: x86_64
OS: Linux
Status: CLOSED CURRENTRELEASE
Severity: high
Priority: unspecified
Reporter: zhoushicheng <madaozhou>
Assignee: Ravishankar N <ravishankar>
CC: bugs, jiaowopan, madaozhou, mingfan.lu, pkarampu, ravishankar, vbellur
Keywords: Triaged
Fixed In Version: glusterfs-3.8rc2
Doc Type: Bug Fix
Type: Bug
Last Closed: 2016-06-16 12:45:03 UTC

Description zhoushicheng 2015-03-27 07:56:00 UTC
Description of problem:

With GlusterFS 3.6.3 beta1 on an XFS version 2 file system, the test tests/basic/afr/sparse-file-self-heal.t fails.
With both full and diff algorithm, file big2bigger would become sparse file after self heal was done.

If the speculative preallocation feature of XFS is disabled, the test passes, but disabling it may increase fragmentation.

Version-Release number of selected component (if applicable):

glusterfs 3.6.3 beta1
xfs version 2

How reproducible:

Mount xfs file system without setting allocsize and run tests/basic/afr/sparse-file-self-heal.t

Steps to Reproduce:

1. Install GlusterFS 3.6.3 beta1.
2. Mount a device with an XFS file system, without setting allocsize.
3. Run tests/basic/afr/sparse-file-self-heal.t.

Actual results:

big2bigger will become sparse file under both full and diff self heal algorithms.

Expected results:

All tests pass.

Additional info:
Under xfs version 2 with speculative preallocation feature on, the test result is :
---------->>>> test info <<<<<-------------

Test Summary Report
-------------------
./tests/basic/afr/sparse-file-self-heal.t (Wstat: 0 Tests: 64 Failed: 2)
  Failed tests:  33, 61
Files=1, Tests=64, 46 wallclock secs ( 0.03 usr  0.01 sys +  1.99 cusr  2.31 csys =  4.34 CPU)
Result: FAIL
Failed tests  ./tests/basic/afr/sparse-file-self-heal.t

Under xfs version 2 with speculative preallocation feature off:
mount -o allocsize=64k -t xfs device dir
the test result is :
---------->>>> test info <<<<<-------------
[07:53:36] ./tests/basic/afr/sparse-file-self-heal.t .. ok    46257 ms
[07:54:22]
All tests successful.
Files=1, Tests=64, 46 wallclock secs ( 0.03 usr  0.01 sys +  1.96 cusr  2.30 csys =  4.30 CPU)
Result: PASS

Comment 1 Ravishankar N 2015-03-31 08:19:58 UTC
Hi zhoushicheng,

So the test that is failing is:
EXPECT "1" has_holes $B0/${V0}0/big2bigger

The has_holes function reports "1" if [ $((`stat -c '%b*%B-%s' file_name`)) -lt 0 ], that is, if the allocated size (blocks times block size) is smaller than the file's apparent size. I think the failure means that XFS has not freed the preallocated blocks despite the fd being closed by the time we reach this line in the testcase. Could you run the test after lowering speculative_prealloc_lifetime?
(See http://xfs.org/index.php/XFS_FAQ#Q:_How_can_I_speed_up_or_avoid_delayed_removal_of_speculative_preallocation.3F)
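
For readers following along, here is a minimal sketch of a has_holes-style check built from the expression quoted above. It assumes GNU stat, and the helper in the Gluster test suite may be written differently.

```shell
# Sketch of a has_holes-style check (assumes GNU coreutils stat).
# %b = number of allocated blocks, %B = bytes per reported block,
# %s = apparent file size in bytes.
has_holes() {
    # Negative result: fewer bytes are allocated than the file's size,
    # so at least part of the file is a hole.
    if [ $(( $(stat -c '%b*%B-%s' "$1") )) -lt 0 ]; then
        echo "1"
    else
        echo "0"
    fi
}
```

With speculative preallocation still attached to the file, the allocated size can reach or exceed the apparent size, so the check reports "0" even though the file was written sparsely.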

Comment 2 zhoushicheng 2015-03-31 11:29:44 UTC
(In reply to Ravishankar N from comment #1)
> Hi zhoushicheng,
> 
> So the test that is failing is:
> EXPECT "1" has_holes $B0/${V0}0/big2bigger
> 
> The has_holes function returns true if(`stat -c '%b*%B-%s' file_name`)) -lt
> 0). I think this means that XFS has not freed the preallocated blocks
> despite the fd being closed by the time we reach this line in the testcase.
> Could you run the test lowering the speculative_prealloc_lifetime ? 
> (See
> http://xfs.org/index.php/XFS_FAQ#Q:
> _How_can_I_speed_up_or_avoid_delayed_removal_of_speculative_preallocation.3F)

This tunable was reportedly introduced in Linux 3.8, but my kernel is 2.6.32-358.el6.x86_64. I cannot find speculative_prealloc_lifetime under /proc/sys/fs/xfs/, so lowering it does not seem to be an option here.
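
A quick way to check whether a given kernel exposes the tunable is sketched below. The function name and the optional path argument are hypothetical and exist only for illustration; the /proc path is the one mentioned above.

```shell
# Print the XFS speculative preallocation lifetime (in seconds), or
# "unavailable" if the running kernel does not expose the tunable.
# The optional argument overrides the path; it is an assumption of this
# sketch, added so the function can be exercised on any machine.
prealloc_lifetime() {
    path=${1:-/proc/sys/fs/xfs/speculative_prealloc_lifetime}
    if [ -r "$path" ]; then
        cat "$path"
    else
        echo "unavailable"
    fi
}

prealloc_lifetime
```

On a kernel like the 2.6.32 one described here, this would be expected to print "unavailable".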

Comment 3 Ravishankar N 2015-03-31 11:37:12 UTC
Well then you could try adding a sleep in the testcase or doing a drop_caches before `EXPECT "1" has_holes $B0/${V0}0/big2bigger` is hit.

You could also run the testcase manually so that you can compare the ls -lh, du -h and md5sum output for the file on both bricks of the replica. They must be equal.
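
The manual comparison could be scripted roughly as follows. This is only a sketch: same_content is a hypothetical helper, it assumes GNU stat and md5sum, and the two arguments stand for the copies of the file on each brick.

```shell
# Hypothetical helper: succeed only if both copies have the same
# apparent size and the same MD5 checksum. Allocated block counts are
# printed for information but are not part of the verdict.
same_content() {
    a=$1
    b=$2
    size_a=$(stat -c %s "$a")
    size_b=$(stat -c %s "$b")
    echo "size:   $size_a $size_b"
    echo "blocks: $(stat -c %b "$a") $(stat -c %b "$b")"
    sum_a=$(md5sum < "$a")
    sum_b=$(md5sum < "$b")
    [ "$size_a" = "$size_b" ] && [ "$sum_a" = "$sum_b" ]
}
```

Reading a hole returns zeros, so a sparse copy and a fully allocated copy of the same data produce the same checksum; only the block counts differ.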

Comment 4 zhoushicheng 2015-04-01 02:26:19 UTC
After adding echo 3 > /proc/sys/vm/drop_caches before `EXPECT "1" has_holes $B0/${V0}0/big2bigger`, the test has passed.

Regardless of whether big2bigger was sparse after self heal, the checksum was the same.

Comment 5 Ravishankar N 2015-04-01 02:35:55 UTC
Hmm, so this is not a bug per se. It is just that XFS takes more time to free the prealloc'd blocks.
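
Because the blocks are freed eventually, one alternative to a fixed sleep is to retry the check until it succeeds or a timeout expires. The sketch below illustrates that idea only; wait_for is a hypothetical helper, not part of the Gluster test framework and not what the eventual patch does.

```shell
# Hypothetical helper: re-run a command once per second until it prints
# the expected value, giving up after the given number of seconds.
# Usage: wait_for EXPECTED TIMEOUT_SECONDS COMMAND [ARGS...]
wait_for() {
    expected=$1
    timeout=$2
    shift 2
    elapsed=0
    while :; do
        if [ "$("$@")" = "$expected" ]; then
            return 0
        fi
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
}
```

In the test this might be invoked as wait_for 1 30 has_holes $B0/${V0}0/big2bigger, where the 30-second limit is an arbitrary value chosen for illustration.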

By the way, in your bug description, you say "With both full and diff algorithm, file big2bigger would become sparse file after self heal was done." This is the expected behaviour because the test creates big2bigger as a sparse file:

TEST dd if=/dev/urandom of=$M0/big2bigger count=1 bs=1024k 
TEST truncate -s 2M $M0/big2bigger
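
The same layout can be reproduced outside Gluster to see why the file is sparse. In this sketch, make_big2bigger and sizes are hypothetical names, the target path stands in for the file on the $M0 mount, and GNU stat is assumed.

```shell
# Write 1 MiB of random data, then extend the file to 2 MiB without
# writing anything more; the second MiB is a hole.
make_big2bigger() {
    dd if=/dev/urandom of="$1" count=1 bs=1024k 2>/dev/null &&
    truncate -s 2M "$1"
}

# Print the apparent size and the allocated size, both in bytes.
sizes() {
    stat -c 'apparent=%s' "$1"
    echo "allocated=$(( $(stat -c '%b*%B' "$1") ))"
}
```

On most file systems the allocated figure stays close to 1 MiB. On XFS with speculative preallocation it may be reported at or above the apparent size until the preallocated blocks are trimmed, which is what made has_holes report "0".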

Comment 6 zhoushicheng 2015-04-01 02:41:46 UTC
Thanks. big2bigger should indeed be a sparse file after self heal; sorry for my incorrect description.

Comment 7 Ravishankar N 2015-04-01 03:15:27 UTC
(In reply to zhoushicheng from comment #4)
> After adding echo 3 > /proc/sys/vm/drop_caches before `EXPECT "1" has_holes
> $B0/${V0}0/big2bigger`, the test has passed.
> 
Would you like to send a patch that adds this line to the test case, with a comment describing why it is needed? The how-to is here: http://www.gluster.org/community/documentation/index.php/Development_Work_Flow. You can use this bug ID for the patch. Or if you would rather not, I can do it for you.

Comment 8 zhoushicheng 2015-04-01 03:28:48 UTC
(In reply to Ravishankar N from comment #7)
> (In reply to zhoushicheng from comment #4)
> > After adding echo 3 > /proc/sys/vm/drop_caches before `EXPECT "1" has_holes
> > $B0/${V0}0/big2bigger`, the test has passed.
> > 
> Would you like to send a patch to add this line to the test case describing
> why it is needed? The how-to is
> here:http://www.gluster.org/community/documentation/index.php/
> Development_Work_Flow.You can use this bug-id for the patch. Or if you would
> rather not, I can do it for you.

I will send a patch :)

Comment 9 Anand Avati 2015-04-10 05:56:01 UTC
REVIEW: http://review.gluster.org/10185 (test: Fix sparse file self heal test) posted (#1) for review on master by 仕成 周 (madaozhou)

Comment 10 zhoushicheng 2015-04-10 06:00:00 UTC
http://review.gluster.org/#/c/10185/

Comment 11 Anand Avati 2015-04-10 07:24:48 UTC
REVIEW: http://review.gluster.org/10185 (test: Fix sparse file self heal test) posted (#2) for review on master by zhoushicheng (madaozhou)

Comment 12 Anand Avati 2015-04-15 12:52:49 UTC
REVIEW: http://review.gluster.org/10253 (test: Fix sparse file self heal test) posted (#1) for review on master by zhoushicheng (madaozhou)

Comment 13 Anand Avati 2015-04-16 02:52:46 UTC
REVIEW: http://review.gluster.org/10253 (test: Fix sparse file self heal test) posted (#2) for review on master by zhoushicheng (madaozhou)

Comment 14 Anand Avati 2015-04-16 05:45:18 UTC
REVIEW: http://review.gluster.org/10253 (test: Fix sparse file self heal test) posted (#3) for review on master by zhoushicheng (madaozhou)

Comment 15 Anand Avati 2015-05-11 09:49:24 UTC
REVIEW: http://review.gluster.org/10253 (test: Fix sparse file self heal test) posted (#5) for review on master by zhoushicheng (madaozhou)

Comment 16 Anand Avati 2015-05-19 15:34:33 UTC
COMMIT: http://review.gluster.org/10253 committed in master by Vijay Bellur (vbellur) 
------
commit 8f788528e64c4c13e16f7ad2d9f667a3813e08cc
Author: zhoushicheng <madaozhou>
Date:   Fri Apr 10 12:10:26 2015 +0800

    test: Fix sparse file self heal test
    
    This patch solves a problem caused by the XFS speculative preallocation feature:
    the test EXPECT "1" has_holes $B0/${V0}0/big2bigger fails when XFS has not yet freed the preallocated blocks.
    The test passes if this feature is disabled.
    On Linux 3.8 and later, the removal of speculative preallocation can be sped up.
    Otherwise, the test passes if caches are dropped manually, which speeds up the removal.
    
    As in http://review.gluster.org/#/c/10411/, "( cd $M0 ; umount $M0 )" is used to drop caches, which is
    better than "echo 3 > /proc/sys/vm/drop_caches".
    
    The drop caches operation was added to the test:
    tests/basic/afr/sparse-file-self-heal.t
    
    BUG: 1206461
    Change-Id: Ie2c9d1b92fa8307c44498752fdd100eb86f9689c
    Signed-off-by: zhoushicheng <madaozhou>
    Reviewed-on: http://review.gluster.org/10253
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Jeff Darcy <jdarcy>
    Reviewed-by: Ravishankar N <ravishankar>
    Reviewed-by: Vijay Bellur <vbellur>

Comment 17 Ravishankar N 2015-05-28 09:53:06 UTC
Moving BZ to modified since the patch has been merged.

Comment 18 Niels de Vos 2016-06-16 12:45:03 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailing lists [1]. Packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user