Bug 773493
Summary: | File size initially wrong after a replica volumes image failure and repair. | ||
---|---|---|---|
Product: | [Community] GlusterFS | Reporter: | Jeff Byers <jbyers> |
Component: | replicate | Assignee: | Pranith Kumar K <pkarampu> |
Status: | CLOSED DEFERRED | QA Contact: | |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 3.2.5 | CC: | a9016009, bugs, gluster-bugs, rwheeler, vbellur |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | | Doc Type: | Bug Fix |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2014-12-14 19:40:33 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Description
Jeff Byers
2012-01-11 23:30:47 UTC
Hi Jeff,
Thanks for logging the bug. I have doubts about steps 6 and 7 in the steps you provided. How long did you wait after step 6 completed? In step 7, what command are you using to read the file, and how are you verifying that its size is N and not N*2? Do you have a script that tests this?
Pranith

When the full test is running, the client is writing and reading files continuously: it writes a batch of files, then verifies that batch. When reproducing manually, the wait is human time, in seconds or minutes.

The commands that generate and verify the file are homegrown. The generator takes a size and a file name, writes a header, and fills in the rest with a data pattern. The verifier takes the file name, determines the expected size from the file header, reads and verifies the data pattern in 64K chunks, and verifies that the file is the correct size by seeing EOF when it should. Note that the verifier program does not use 'stat' information for the file size; the file contents themselves determine the size.

There are test scripts and programs, but they are tied into a larger system and might be difficult to extract. The file generator and verifier may be freestanding enough, though.
~ Jeff Byers ~

Hi Jeff,
From what I understand about your verifier, it gets the size the generator wrote to the header of the file and then verifies that data of the given size is present and correct. Is this correct?
But this still doesn't clarify Pranith's doubt about how the size is N and not N*2. If the verifier says the data is all right, then the size it reads from the header must be N*2. Where is the size shown as N?
Also, from where are you verifying the files: the mountpoint or the bricks?
If it is possible, can you also provide the file generator and verifier?
Kaushal

(In reply to comment #3)
> From what I understand about your verifier, it gets the size the generator
> wrote to the header of the file and then verifies that the data of given size
> is present and correct. Is this correct?
Jeff: Yes.
> But this still doesn't clarify Pranith's doubt about how size is N not N*2. If
> the verifier says data is alright, then the size it reads from the header must
> be N*2. Where is the size shown as N?
Jeff: I wasn't clear enough. The data was correct up to N, but EOF was reached in the verifier's 64KB read loop before N*2 worth of reads had been done. But if one does a "dir", or waits a while, and the verifier is rerun, all of the data is then there. In our test scripts, we went so far as to add a retry loop and, halfway through the verify retries, did the "dir", which then allowed the verify to pass. We did this until we determined that 'performance.stat-prefetch=off' would resolve the problem completely.
> Also, from where are you verifying the files. From the mountpoint or the
> bricks?
Jeff: All of the I/O is being done from the mount point of the volumes. Our configuration is a 2-way GlusterFS replica volume, with local GlusterFS clients and Samba arbitrated by CTDB, with the host being Windows using CIFS.
> If it is possible, can you also provide the file generator & verifier.
Jeff: We are looking at that, and are going to try to simplify the environment enough that we could send something over to you.
~ Jeff Byers ~

Jeff, can you check if the issue still exists with the 3.3.0 release?
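For anyone trying to reproduce this without the original tools, the generator and verifier described above might be sketched roughly as follows. This is not Jeff's actual program, which was never attached to this report. The header layout (an 8-byte little-endian size), the data pattern, and the names `generate`, `verify`, and `verify_with_retry` are all assumptions made for illustration. What it does take from the thread: the expected size comes from the file header rather than from `stat`, data is read in 64K chunks, an early EOF is reported as a failure, and the retry loop lists the directory partway through (the "dir" step).

```python
import os
import struct
import time

CHUNK = 64 * 1024             # the report says the verifier reads 64K chunks
HEADER = struct.Struct("<Q")  # ASSUMED header: total file size, 8 bytes little-endian


def pattern(offset, length):
    """ASSUMED data pattern: each byte is its absolute file offset mod 251."""
    return bytes((offset + i) % 251 for i in range(length))


def generate(path, size):
    """Write a header holding `size`, then fill the rest of the file with the pattern."""
    if size < HEADER.size:
        raise ValueError("size must be at least the header size")
    with open(path, "wb") as f:
        f.write(HEADER.pack(size))
        offset = HEADER.size
        while offset < size:
            n = min(CHUNK, size - offset)
            f.write(pattern(offset, n))
            offset += n


def verify(path):
    """Return (ok, message). The expected size comes from the header, never from stat()."""
    with open(path, "rb") as f:
        head = f.read(HEADER.size)
        if len(head) < HEADER.size:
            return False, "short header"
        (expected,) = HEADER.unpack(head)
        offset = HEADER.size
        while offset < expected:
            want = min(CHUNK, expected - offset)
            data = f.read(want)
            if len(data) < want:
                # The symptom in this report: EOF before all expected data was read.
                return False, f"early EOF at {offset + len(data)} of {expected}"
            if data != pattern(offset, want):
                return False, f"pattern mismatch in chunk at offset {offset}"
            offset += want
        if f.read(1):
            return False, "data past expected EOF"
    return True, "ok"


def verify_with_retry(path, attempts=6, delay=1.0):
    """Retry verify(); halfway through the retries, list the parent directory."""
    msg = "not run"
    for attempt in range(attempts):
        ok, msg = verify(path)
        if ok:
            return True, msg
        if attempt == attempts // 2 - 1:
            os.listdir(os.path.dirname(os.path.abspath(path)))  # the "dir" step
        time.sleep(delay)
    return False, msg
```

Run against a file on the client mount point, a `verify` result of "early EOF" at roughly half the expected size would correspond to the N versus N*2 symptom discussed above.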
Can you point me to the change list, or where in the source the fix was implemented so that, when I have time, I can add logging to make sure that the code path with the fix was executed? Thanks.
> We are looking at that, and are going to try to
> simplify the environment enough that we could send
> something over to you.
>
> ~ Jeff Byers ~
Jeff,
Could you let us know if you have a simpler test case that can re-create the issue?
Pranith
(In reply to comment #8)
> Could you let us know if you have a simpler test case that can
> re-create the issue?

We believe that we provided some simplified test scripts, and instructions on how to use them, to you guys about a year ago. The test engineer remembers getting authorization to do this and putting something together. Unfortunately, it has been too long, and we cannot find any local copies of what we sent over to you. Do you remember communicating with, and/or receiving anything from, our David McBride?

We do have our original programs and scripts of course, but the work to make them externally useful would need to be repeated. What is the likelihood of any use being made of any test scripts and programs we would provide now?

The version that this bug has been reported against does not get any updates from the Gluster Community anymore. Please verify whether this report is still valid against a current (3.4, 3.5 or 3.6) release and update the version, or close this bug.

If there has been no update before 9 December 2014, this bug will get automatically closed.