
Bug 809982

Summary: truncation of offset in self-heal
Product: [Community] GlusterFS
Component: replicate
Version: 3.2.5
Status: CLOSED CURRENTRELEASE
Severity: urgent
Priority: unspecified
Hardware: Unspecified
OS: Unspecified
Reporter: Anand Avati <aavati>
Assignee: Pranith Kumar K <pkarampu>
CC: chrisw, gluster-bugs, jdarcy, maillistofyinyin, vbellur
Target Milestone: ---
Target Release: ---
Fixed In Version: 3.2.7
Doc Type: Bug Fix
Type: Bug
Last Closed: 2012-12-11 05:25:20 UTC

Description Anand Avati 2012-04-04 19:29:33 UTC
Description of problem: Casting a 64-bit offset to a void * during self-heal truncates the offset where pointers are 32 bits wide, which corrupts the healed file.


Version-Release number of selected component (if applicable): release-3.2


Additional info:

--snip--


Hello,


 Michael and I ran a battery of tests today and
closed out the two issues identified below (of March
11).


FYI RE the "background-self-heal-only" patch;

 It has been tested now to our satisfaction and
 works as described/intended.


http://midnightcode.org/projects/saturn/code/glusterfs-3.2.6-background-only.patch



FYI RE the 2GB replicate error;

 >>>    2) Of the files that were replicated, not all were
 >>>          corrupted (capped at 2G -- note that we
 >>>          confirmed that this was the first 2G of the
 >>>          source file contents).
 >>>
 >>> So is there a known replicate issue with files
 >>> greater than 2GB?

 We have confirmed this issue and the referenced
 patch appears to correct the problem.  We were
 able to get one particular file to reliably fail at 2GB
 under GlusterFS 3.2.6, and then correctly
 transfer it and many other >2GB files, after
 applying this patch.

 The error stems from putting the off_t (64-bit)
 offset value into a void * cookie value typecast
 as long (signed, 32-bit here) and then restoring it
 into an off_t again.  The tip-off was a recurring offset
 of 18446744071562067968 (0xFFFFFFFF80000000, that is,
 2 GiB sign-extended to 64 bits) seen in the logs. The
 effect is described well here;

http://stackoverflow.com/questions/5628484/unexpected-behavior-from-unsigned-int64

 We can't explain why this issue was intermittent,
 and we're not sure if the rw_sh->offset is the
 correct 64bit offset to use.  However that offset
 appeared to match the cookie value in all tested
 pre-failure states.  Please advise if there is a
 better (more correct) off_t offset to use.


http://midnightcode.org/projects/saturn/code/glusterfs-3.2.6-2GB.patch



Thanks for your help,

--snip--

Comment 1 Pranith Kumar K 2012-06-11 10:49:30 UTC
We took the patch for the 32-bit offset truncation. The background-self-heal-only option was not taken in.

Comment 2 Jeff Darcy 2012-10-31 13:30:04 UTC
ASSIGNED until the specific patches in Gerrit appear here.

Comment 3 Jeff Darcy 2012-10-31 20:56:24 UTC
http://review.gluster.org/3972 addresses this in 3.2, but I don't see anything for 3.3/master.

Comment 4 Pranith Kumar K 2012-11-01 07:30:21 UTC
The bug was present only in 3.2. The self-heal code was reworked extensively for granular self-heal in 3.3, and that rework removed this bug there.

Comment 5 Vijay Bellur 2012-12-11 05:25:20 UTC
Fixed in 3.2.7