Bug 1149943 - duplicate librsync code should likely be removed and linked as a library
Summary: duplicate librsync code should likely be removed and linked as a library
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: core
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Niels de Vos
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2014-10-07 01:35 UTC by Wade Mealing
Modified: 2015-05-14 17:44 UTC
CC List: 9 users

Fixed In Version: glusterfs-3.7.0
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-05-14 17:27:55 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Wade Mealing 2014-10-07 01:35:50 UTC
Description of problem:


I've noticed that there is some code in glusterfs (libglusterfs/src/checksum.c) that looks to be lifted from the librsync code.

Is there an obvious reason why this was done instead of linking directly against librsync?

I see it from this commit:

> commit 239d2cbdb0a4c32df9f21de8385e2c466b934178
> Author: Vikas Gorur <vikas>
> Date:   Thu Sep 17 05:56:24 2009 +0000
>
>      libglusterfs: Add checksum functions.
>
>      gf_rsync_weak_checksum:   Calculates a simple 32-bit checksum.
>      gf_rsync_strong_checksum: Calculates the MD5 checksum.
>
>      The strong checksum function makes use of Christophe Devine's
>      MD5 implementation (adapted from the rsync source code,
>      version 3.0.6. <http://www.samba.org/ftp/rsync/>).
>

I don't know why this wasn't just linked as a C library; for maintainability and to future-proof this product, we should remove this code and link against the relevant librsync library.

Comment 1 Wade Mealing 2014-10-07 02:20:14 UTC
I've noticed that librsync doesn't ship with RHEL and is most commonly pulled in from EPEL.

I guess it will depend on whether there are plans to ship this package or not.

Wade.

Comment 2 Kaleb KEITHLEY 2014-10-07 12:36:33 UTC
There's nothing for us to do here, per se, until librsync is included in the rsync package.

When the source we copied is updated we would probably want to update our copy as well.

Comment 3 Wade Mealing 2014-10-13 01:10:05 UTC
Summarizing an email conversation (correct me if I'm wrong):

Since we now have librsync in the OpenStack online process, it is possible that this librsync could be duplicated into the gluster tags and used that way. Niels de Vos mentioned that this code could likely be removed, as it is only used in the posix implementation (of something).

Can we get confirmation of the game plan from Kaleb on this bugzilla?

Comment 4 Niels de Vos 2014-10-13 20:34:31 UTC
The duplicate rsync functions (gf_rsync_*) are only used in the storage/posix and storage/bd xlators for their rchecksum FOP implementation. As far as I can see, there is no user (client or API) of rchecksum. Removing the rchecksum FOP would be one solution.

Some more discussion and agreement is needed. Maybe rchecksum could be useful for the upcoming bitrot detection?
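
For illustration, an rchecksum-style operation boils down to reading a byte range from a brick file and returning a weak and a strong checksum for it. A rough sketch of that shape (hypothetical code, not the actual xlator implementation; zlib's adler32() and OpenSSL's MD5() stand in for the gf_rsync_* helpers):

    /* Hypothetical sketch of an rchecksum-style computation.
     * Build with: cc sketch.c -lz -lcrypto */
    #include <stdint.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <unistd.h>
    #include <zlib.h>
    #include <openssl/md5.h>

    /* Read `len` bytes at `offset` from `fd` and fill in a weak (32-bit)
     * and a strong (MD5) checksum for that block. 0 on success, -1 on error. */
    static int rchecksum_sketch(int fd, off_t offset, size_t len,
                                uint32_t *weak,
                                unsigned char strong[MD5_DIGEST_LENGTH])
    {
        unsigned char *buf = malloc(len);
        ssize_t nread;

        if (buf == NULL)
            return -1;

        nread = pread(fd, buf, len, offset);
        if (nread < 0) {
            free(buf);
            return -1;
        }

        /* Weak checksum: adler32() as a stand-in for gf_rsync_weak_checksum(). */
        *weak = (uint32_t) adler32(adler32(0L, Z_NULL, 0), buf, (uInt) nread);

        /* Strong checksum: MD5, which gf_rsync_strong_checksum() already uses. */
        MD5(buf, (size_t) nread, strong);

        free(buf);
        return 0;
    }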

Comment 5 Niels de Vos 2014-10-15 16:36:09 UTC
BitRot will not use the rchecksum FOP. However, Pranith mentioned that self-heal currently relies on that.

Pranith, could you check (or find someone who knows how self-heal relies on it) whether we can replace the gf_rsync_* functions with librsync calls?

Comment 6 Pranith Kumar K 2014-10-16 08:30:47 UTC
I don't see any problem in removing the code and linking to librsync. The only thing we may have to verify is whether it gives the same checksum values for the same input data (most probably it will; I am just being paranoid, that's all).

Pranith

Comment 7 Niels de Vos 2014-10-16 08:35:15 UTC
Wade, can you comment on whether the checksum values change? I think you have looked into the changes to the algorithm? In case the checksum values change, we need a (more complicated) coordinated approach for updating all servers in the cluster. (The result of the checksum is sent to the clients for comparing the different blocks of a replicated file on different bricks.)

Comment 8 Wade Mealing 2014-10-20 04:43:44 UTC
It -looks- the same to me, although I'd almost definitely suggest running it through some of the test cases you guys have, to be sure.

Comment 9 Anand Avati 2014-11-02 18:35:13 UTC
REVIEW: http://review.gluster.org/9035 (Replace copied (from rsync) checksum code by adler32() from zlib) posted (#1) for review on master by Niels de Vos (ndevos)
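
Pranith's concern from comment 6 (same checksum values for the same data) can be eyeballed with a small standalone harness along these lines; the rolling sum below is a simplified rsync-style checksum written for illustration, not Gluster's exact copy, so whether the two columns agree on real data is precisely what the existing test cases should confirm:

    /* Print a simplified rsync-style weak checksum next to zlib's adler32()
     * for a few sample buffers. Illustrative only, not Gluster's exact code.
     * Build with: cc cmp.c -lz */
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>
    #include <zlib.h>

    /* Simplified rolling checksum in the style of rsync's get_checksum1(). */
    static uint32_t rsync_style_weak(const unsigned char *buf, size_t len)
    {
        uint32_t s1 = 0, s2 = 0;
        size_t i;

        for (i = 0; i < len; i++) {
            s1 += buf[i];   /* running byte sum */
            s2 += s1;       /* sum of the running sums */
        }
        return (s1 & 0xffff) + (s2 << 16);
    }

    int main(void)
    {
        const char *samples[] = { "", "a", "hello world", "glusterfs" };
        size_t i;

        for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++) {
            const unsigned char *b = (const unsigned char *) samples[i];
            size_t len = strlen(samples[i]);

            printf("%-12s rsync-style=0x%08x adler32=0x%08lx\n", samples[i],
                   (unsigned) rsync_style_weak(b, len),
                   adler32(adler32(0L, Z_NULL, 0), b, (uInt) len));
        }
        return 0;
    }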

Comment 10 Anand Avati 2014-11-21 17:12:24 UTC
COMMIT: http://review.gluster.org/9035 committed in master by Vijay Bellur (vbellur) 
------
commit 66c789765d783bba8fe651e6a26feb5483af71a5
Author: Niels de Vos <ndevos>
Date:   Sun Nov 2 19:15:49 2014 +0100

    Replace copied (from rsync) checksum code by adler32() from zlib
    
    The weak checksum code that is included in libglusterfs has initially
    been copied from the rsync sources. Instead of maintaining a copy of a
    function, we should use a function from a shared library. The algorithm
    seems to be Adler-32, zlib provides an implementation.
    
    The strong checksum function has already been replaced by MD5 from
    OpenSSL. It is time to also remove the comments about the origin of the
    implementation, because it is not correct anymore.
    
    Change-Id: I70c16ae1d1c36b458a035e4adb3e51a20afcf652
    BUG: 1149943
    Reported-by: Wade Mealing <wmealing>
    Signed-off-by: Niels de Vos <ndevos>
    Reviewed-on: http://review.gluster.org/9035
    Reviewed-by: Kaleb KEITHLEY <kkeithle>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Vijay Bellur <vbellur>
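
In effect, the patch reduces the weak checksum to a thin wrapper around zlib. Roughly (a sketch of the idea, not the verbatim change; the exact signature in the tree may differ):

    #include <stddef.h>
    #include <stdint.h>
    #include <zlib.h>

    /* Weak checksum: delegate to zlib's adler32() instead of carrying a
     * copied rsync implementation. adler32(0L, Z_NULL, 0) returns the
     * required initial value for the running checksum. */
    uint32_t gf_rsync_weak_checksum(unsigned char *buf, size_t len)
    {
        return (uint32_t) adler32(adler32(0L, Z_NULL, 0), buf, (uInt) len);
    }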

Comment 12 Wade Mealing 2014-12-15 07:00:58 UTC
The original intention was to move the checksum creation to librsync, in an attempt to avoid a known collision problem in the original rsync code.

So I did a little more research and came up with some news about the adler32 algorithm.

While adler32 doesn't have the problematic issue that I was avoiding with the md5 code, it does have its share of new problems, specifically on small data sets (see http://www.zlib.net/maxino06_fletcher-adler.pdf).

I know that in this particular instance the code is checking for bit-level corruption and not forgery; however, adler32 has similar issues to the MD5/CRC-32 implementation it replaced for data up to a few hundred bytes, because the checksums for these messages have poor coverage of the 32 available bits (see http://en.wikipedia.org/wiki/Adler-32).

I think we should aim to move to sha256. The new instructions mean that sha256 on newer hardware will give significantly better results (see http://www.intel.com/content/dam/www/public/us/en/documents/white-papers/haswell-cryptographic-performance-paper.pdf), and it's pretty hard to beat a 2.67-cycle hash check for a block. I realise not everyone has a Haswell processor, but Intel has also published the specs for a fast SHA-256 software implementation (see http://www.intel.com/content/www/us/en/intelligent-systems/intel-technology/sha-256-implementations-paper.html).

Can we implement sha256 in this manner here?
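
For reference, a sha256-based strong checksum would only be a few lines with OpenSSL's one-shot EVP helper; a minimal sketch (hypothetical function name, and EVP_sha256() could be swapped for EVP_md5() to reproduce the current strong checksum):

    #include <stddef.h>
    #include <openssl/evp.h>

    /* Compute a SHA-256 digest of `buf` with OpenSSL's one-shot EVP helper.
     * `md` must have room for EVP_MAX_MD_SIZE bytes; the actual digest
     * length (32 for SHA-256) is written to *md_len. Returns 1 on success. */
    static int sha256_checksum(const unsigned char *buf, size_t len,
                               unsigned char *md, unsigned int *md_len)
    {
        return EVP_Digest(buf, len, md, md_len, EVP_sha256(), NULL);
    }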

Comment 13 Niels de Vos 2015-01-05 12:01:31 UTC
The difficulty with changing the checksum algorithm is related to upgrades from one stable version to the next. The checksum is calculated on the server side (brick); the client that compares the checksums from multiple servers/bricks would need to know which algorithm(s) the servers/bricks support.

Pranith, as self-heal uses rchecksum(), can you think of any reasonable solution for changing the checksum algorithm from adler32 to sha256 or something similar?

Comment 14 Pranith Kumar K 2015-01-05 12:11:55 UTC
Niels,
  The only problem if we change the checksum algorithm is that, even when the data blocks are the same, data content will be copied from source to sink. So correctness won't be a problem if we change it; just that the performance of self-heal will take a hit. But adler32 was chosen, I believe, as a weak checksum. For the strong checksum it uses md5. So I don't see a big reason to change it as of now, as sha256 is not a weak checksum.

Pranith

Comment 15 Niels de Vos 2015-01-05 12:48:44 UTC
Wade, Pranith has a point there; the function is explicitly called gf_rsync_weak_checksum(). Replacing the implementation with a strong checksum does not really make much sense...

(Changing the algorithm would possibly introduce continuous healing of data that is already correct, in the case where users have different versions installed - performance impact.)

The performance difference between weak/strong checksums may not be very relevant anymore? If that is the case, we could maybe remove gf_rsync_weak_checksum() completely and use the MD5 checksumming from gf_rsync_strong_checksum() for everything. Adding a function to do sha256 just to replace adler32 does not look like a good approach to me.

Comment 16 Wade Mealing 2015-01-27 03:10:39 UTC
I do see very good points made there, Niels.


> (Changing the algorithm would possibly introduce continuous healing of data 
> that is already correct, in the case where users have different versions 
> installed - performance impact.)

Right, if there is an implementation problem, I can understand not changing it mid-release, as it would change customers' expectations/performance, and I'm aware how that can upset people.

> Adding a function to do sha256 just to replace adler32 does not look like a
> good approach to me.

I'm ok with that, although now the function is a little misnamed.


> The performance difference between weak/strong checksums may not be very 
> relevant anymore?

The new Haswell CPUs from Intel have increased performance for SHA checksums (see http://www.intel.com.au/content/dam/www/public/us/en/documents/white-papers/haswell-cryptographic-performance-paper.pdf).

So, TL;DR for now:

Adler32 is probably fine for now. Can we / should we move to stronger checksums in a future release?

Comment 19 Niels de Vos 2015-05-14 17:27:55 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.0, please open a new bug report.

glusterfs-3.7.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/10939
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user

