+++ This bug was initially created as a clone of Bug #1378778 +++ Description of problem: Same as summary. Version-Release number of selected component (if applicable): How reproducible: Steps to Reproduce: 1. 2. 3. Actual results: Expected results: Additional info: --- Additional comment from Krutika Dhananjay on 2016-09-23 06:58:22 EDT --- All is not well in the compound fops world. I wrote and ran the following simple test on master: <script> 0 #!/bin/bash 1 . $(dirname $0)/../../include.rc 2 . $(dirname $0)/../../volume.rc 3 4 cleanup 5 6 TEST glusterd 7 TEST pidof glusterd 8 TEST $CLI volume create $V0 replica 3 $H0:$B0/${V0}{0,1,2} 9 TEST $CLI volume start $V0 10 TEST $GFS --volfile-id=$V0 --volfile-server=$H0 $M0 11 12 TEST dd if=/dev/urandom of=$M0/file bs=1024 count=10 2>/dev/null 13 md5sum_file=$(md5sum $M0/file | awk '{print $1}') 14 15 TEST $CLI volume set $V0 cluster.use-compound-fops on 16 17 TEST dd if=$M0/file of=$M0/file-copy bs=1024 count=10 2>/dev/null 18 19 EXPECT_WITHIN $UMOUNT_TIMEOUT "Y" force_umount $M0 20 TEST $GFS --volfile-id=$V0 --volfile-server=$H0 $M0 21 22 EXPECT "$md5sum_file" echo `md5sum $M0/file-copy | awk '{print $1}'` 23 24 EXPECT_WITHIN $UMOUNT_TIMEOUT "Y" force_umount $M0 25 TEST $CLI volume stop $V0 26 TEST $CLI volume delete $V0 27 28 cleanup </script> The md5sum comparison at line 22 fails. The script runs fine when I "comment out" line 15, implying the bug is in compound fops. Will post my findings after the issue is RC'd. -Krutika --- Additional comment from Worker Ant on 2016-10-17 06:21:35 EDT --- REVIEW: http://review.gluster.org/15654 (compound fops: Fix file corruption issue) posted (#1) for review on master by Krutika Dhananjay (kdhananj) --- Additional comment from Worker Ant on 2016-10-17 08:59:33 EDT --- REVIEW: http://review.gluster.org/15654 (compound fops: Fix file corruption issue) posted (#2) for review on master by Krutika Dhananjay (kdhananj) --- Additional comment from Worker Ant on 2016-10-20 02:58:12 EDT --- REVIEW: http://review.gluster.org/15654 (compound fops: Fix file corruption issue) posted (#3) for review on master by Krutika Dhananjay (kdhananj) --- Additional comment from Worker Ant on 2016-10-20 06:55:36 EDT --- REVIEW: http://review.gluster.org/15654 (compound fops: Fix file corruption issue) posted (#4) for review on master by Krutika Dhananjay (kdhananj) --- Additional comment from Worker Ant on 2016-10-24 02:20:45 EDT --- REVIEW: http://review.gluster.org/15654 (compound fops: Fix file corruption issue) posted (#5) for review on master by Krutika Dhananjay (kdhananj)
REVIEW: http://review.gluster.org/15709 (compound fops: Fix file corruption issue) posted (#1) for review on release-3.9 by Krutika Dhananjay (kdhananj)
REVIEW: http://review.gluster.org/15709 (compound fops: Fix file corruption issue) posted (#2) for review on release-3.9 by Krutika Dhananjay (kdhananj)
COMMIT: http://review.gluster.org/15709 committed in release-3.9 by Raghavendra G (rgowdapp) ------ commit 3e4d13f90710d03e28efbc578c82a465fcd486f8 Author: Krutika Dhananjay <kdhananj> Date: Mon Oct 17 15:13:28 2016 +0530 compound fops: Fix file corruption issue Backport of: http://review.gluster.org/#/c/15654/ 1. Address of a local variable @args is copied into state->req in server3_3_compound (). But even after the function has gone out of scope, in server_compound_resume () this pointer is accessed and dereferenced. This patch fixes that. 2. Compound fops, by virtue of NOT having a vector sizer (like the one writev has), ends up having both the header and the data (in case one of its member fops is WRITEV) in the same hdr_iobuf. This buffer was not being preserved through the lifetime of the compound fop, causing it to be overwritten by a parallel write fop, even when the writev associated with the currently executing compound fop is yet to hit the desk, thereby corrupting the file's data. This is fixed by associating the hdr_iobuf with the iobref so its memory remains valid through the lifetime of the fop. 3. Also fixed a use-after-free bug in protocol/client in compound fops cbk, missed by Linux but caught by NetBSD. Finally, big thanks to Pranith Kumar K and Raghavendra Gowdappa for their help in debugging this file corruption issue. Change-Id: I58da39ae544ad81192849926399a971c4c01c986 BUG: 1387984 Signed-off-by: Krutika Dhananjay <kdhananj> Reviewed-on: http://review.gluster.org/15709 NetBSD-regression: NetBSD Build System <jenkins.org> Reviewed-by: Raghavendra G <rgowdapp> Smoke: Gluster Build System <jenkins.org> CentOS-regression: Gluster Build System <jenkins.org>
glusterfs-3.9.0rc2 is released[1] and packages are available for different distributions[2] to test. [1] http://www.gluster.org/pipermail/maintainers/2016-October/001601.html [2] http://www.gluster.org/pipermail/maintainers/2016-October/001605.html and http://www.gluster.org/pipermail/maintainers/2016-October/001606.html
Gluster 3.9 GA is released http://blog.gluster.org/2016/11/announcing-gluster-3-9/