Description of problem: I've noticed that there is some code in glusterfs (libglusterfs/src/checksum.c) that looks to be lifted from the librsync code. Is there anything obvious about why this was done instead of linking directly to librsync? I see it from this commit:

> commit 239d2cbdb0a4c32df9f21de8385e2c466b934178
> Author: Vikas Gorur <vikas>
> Date: Thu Sep 17 05:56:24 2009 +0000
>
> libglusterfs: Add checksum functions.
>
> gf_rsync_weak_checksum: Calculates a simple 32-bit checksum.
> gf_rsync_strong_checksum: Calculates the MD5 checksum.
>
> The strong checksum function makes use of Christophe Devine's
> MD5 implementation (adapted from the rsync source code,
> version 3.0.6. <http://www.samba.org/ftp/rsync/>).

I don't know why this wasn't just linked as a C library; however, for maintainability and future-proofing of this product, we should remove this copied code and link to the relevant librsync library.
I've noticed that librsync doesn't ship with RHEL and is most commonly pulled in from EPEL. I guess it will depend on whether there are plans to ship this package or not. Wade.
There's nothing for us to do here, per se, until librsync is included in the rsync package. When the source we copied is updated we would probably want to update our copy as well.
Summarizing an email conversation (correct me if I'm wrong): since we now have librsync in the OpenStack online process, it is possible that this librsync could be duplicated into the Gluster tags and used that way. Niels de Vos mentioned that this code could likely be removed, as it is only used in the posix implementation (of something). Can we get confirmation from Kaleb of what the game plan is for this bugzilla?
The duplicated rsync functions (gf_rsync_*) are only used in the storage/posix and storage/bd xlators for their rchecksum FOP implementation. As far as I can see, there is no user (client or API) of rchecksum. Removing the rchecksum FOP would be one solution. Some more discussion and agreement is needed. Maybe rchecksum could be useful for the upcoming bitrot detection?
BitRot will not use the rchecksum FOP. However, Pranith mentioned that self-heal currently relies on it. Pranith, could you check (or find someone who knows how self-heal relies on it) whether we can replace the gf_rsync_* functions with librsync calls?
I don't see any problem in removing the code and linking to librsync. The only thing we may have to find out is whether it gives the same checksum values for the same input data (most probably it will; I am just being paranoid, that's all). Pranith
Wade, can you comment on the change (or not) of the checksum values? I think you have already looked into the differences between the algorithms? In case the checksum values change, we need a (more complicated) coordinated approach for updating all servers in the cluster. (The result of the checksum is sent to the clients for comparing the different blocks of a replicated file on different bricks.)
It *looks* the same to me, although I'd definitely suggest running it through some of the test cases you have, to be sure.
REVIEW: http://review.gluster.org/9035 (Replace copied (from rsync) checksum code by adler32() from zlib) posted (#1) for review on master by Niels de Vos (ndevos)
COMMIT: http://review.gluster.org/9035 committed in master by Vijay Bellur (vbellur)

------

commit 66c789765d783bba8fe651e6a26feb5483af71a5
Author: Niels de Vos <ndevos>
Date: Sun Nov 2 19:15:49 2014 +0100

    Replace copied (from rsync) checksum code by adler32() from zlib

    The weak checksum code that is included in libglusterfs has initially
    been copied from the rsync sources. Instead of maintaining a copy of a
    function, we should use a function from a shared library. The algorithm
    seems to be Adler-32, and zlib provides an implementation.

    The strong checksum function has already been replaced by MD5 from
    OpenSSL. It is time to also remove the comments about the origin of
    the implementation, because they are not correct anymore.

    Change-Id: I70c16ae1d1c36b458a035e4adb3e51a20afcf652
    BUG: 1149943
    Reported-by: Wade Mealing <wmealing>
    Signed-off-by: Niels de Vos <ndevos>
    Reviewed-on: http://review.gluster.org/9035
    Reviewed-by: Kaleb KEITHLEY <kkeithle>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Vijay Bellur <vbellur>
The original intention was to move checksum creation to librsync in an attempt to avoid a known collision problem apparent in the original rsync code. I did a little more research and came up with some news about the adler32 algorithm.

While adler32 doesn't have the problematic issue that I was avoiding with the MD5 code, it does have its share of new problems, specifically on small data sets (see http://www.zlib.net/maxino06_fletcher-adler.pdf). I know that in this particular instance the code is checking for bit-level corruption, not forgery; however, adler32 has similar issues to the MD5/CRC-32 implementation it replaced for data up to a few hundred bytes, because the checksums for such messages have poor coverage of the 32 available bits (see http://en.wikipedia.org/wiki/Adler-32).

I think we should aim to move to sha256. The new instructions mean that sha256 on newer hardware will have significantly better results (see http://www.intel.com/content/dam/www/public/us/en/documents/white-papers/haswell-cryptographic-performance-paper.pdf), and it's pretty hard to beat a 2.67 cycle hash check for a block. I realise not everyone has a Haswell processor, and Intel has published the specs for a fast SHA-256 implementation (see http://www.intel.com/content/www/us/en/intelligent-systems/intel-technology/sha-256-implementations-paper.html). Can we implement sha256 in this manner here?
The difficulty with changing the checksum algorithm is related to upgrades from one stable version to the next. The checksum is calculated on the server side (brick); the client that compares the checksums from multiple servers/bricks would need to know which algorithm(s) the servers/bricks support. Pranith, as self-heal uses rchecksum(), can you think of any reasonable solution for changing the checksum algorithm from adler32 to sha256 or something similar?
Niels, the only problem that will happen if we change the checksum algorithm is that, even when the data blocks are the same, data content will be copied from source to sink. So correctness won't be a problem if we change it; just that the performance of self-heal will take a hit. But adler32 was chosen, I believe, as the weak checksum; for the strong checksum it uses md5. So I don't see a big reason to change it as of now, as sha256 is not a weak checksum. Pranith
Wade, Pranith has a point there: the function is explicitly called gf_rsync_weak_checksum(). Replacing the implementation with a strong checksum does not really make much sense. (Changing the algorithm would possibly introduce continuous healing of data that is already correct, in the case where users have different versions installed, with the resulting performance impact.) The performance difference between weak and strong checksums may not be very relevant anymore? If that is the case, we could maybe remove gf_rsync_weak_checksum() completely and use the MD5 checksumming from gf_rsync_strong_checksum() for everything. Adding a function to do sha256 just to replace adler32 does not look like a good approach to me.
You make some very good points there, Niels.

> (Changing the algorithm would possibly introduce continuous healing of data
> that is already correct, in the case where users have different versions
> installed - performance impact.)

Right, if there is an implementation problem, I can understand not changing it mid-release, as it would change customers' expectations/performance, and I'm aware of how that can upset people.

> Adding a function to do sha256 just to replace adler32 does not look like a
> good approach to me.

I'm ok with that, although the function is now a little misnamed.

> The performance difference between weak/strong checksums may not be very
> relevant anymore?

The new Haswell CPUs from Intel have increased performance for SHA checksums (see http://www.intel.com.au/content/dam/www/public/us/en/documents/white-papers/haswell-cryptographic-performance-paper.pdf).

So, TL;DR for now: adler32 is probably fine. Can we / should we move to stronger checksums in a future release?
This bug is being closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.0, please open a new bug report. glusterfs-3.7.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution. [1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/10939 [2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user