Bug 480663
Summary: | data corruption and general brokenness with ramdisks (rd) | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 5 | Reporter: | Bryn M. Reeves <bmr> |
Component: | kernel | Assignee: | Don Howard <dhoward> |
Status: | CLOSED ERRATA | QA Contact: | Red Hat Kernel QE team <kernel-qe> |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 5.2 | CC: | cward, devin.nate, dzickus, hjia, jplans, tao, yann.le-vot |
Target Milestone: | rc | Keywords: | FutureFeature, Triaged |
Target Release: | --- | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | | Doc Type: | Enhancement |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2009-09-02 08:20:37 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
Bug Blocks: | 483701 |
Description
Bryn M. Reeves
2009-01-19 18:40:30 UTC
Updating PM score.

in kernel-2.6.18-141.el5

You can download this test kernel from http://people.redhat.com/dzickus/el5

Please do NOT transition this bugzilla state to VERIFIED until our QE team has sent specific instructions indicating when to do so. However, feel free to provide a comment indicating that this fix has been verified.

~~ Attention - RHEL 5.4 Beta Released! ~~

RHEL 5.4 Beta has been released! There should be a fix present in the Beta release that addresses this particular request. Please test and report back results here, at your earliest convenience. RHEL 5.4 General Availability release is just around the corner!

If you encounter any issues while testing Beta, please describe the issues you have encountered and set the bug into NEED_INFO. If you encounter new issues, please clone this bug to open a new issue and request it be reviewed for inclusion in RHEL 5.4 or a later update, if it is not of urgent severity.

Please do not flip the bug status to VERIFIED. Only post your verification results, and if available, update the Verified field with the appropriate value.

Questions can be posted to this bug or to your customer or partner representative.

Dear RHEL:

I have downloaded kernel-2.6.18-157.el5.src.rpm to review, based on some research I've done on the new brd (replacement ramdisk/rd) implementation. I am interested in it because our company makes heavy use of a ramdisk during certain data transformations and corruption would be bad(tm). I've come across a few issues and do not see the corresponding patches applied in the RHEL 5.4 beta kernel release.

In any event, a quick google of 'ext2 corruption brd', 'ext3 corruption brd', and 'ext4 corruption brd' will show 2 basic threads. They appear around Mar/Apr 2009. In one, there appear to be problems with ext3 and ext4, which appears to be a problem with brd, ext3 jbd/jbd2, and htrees. In the second, there appear to be problems with ext2.
Note: I'm not the author of any of these patches and not a kernel contributor... I've simply googled the info. Please don't apply these patches based on my authority without appropriate RHEL review ;)

Specific links and info below:

1. ext2 corruption (and proposed fix).
   http://patchwork.ozlabs.org/patch/25808/
2. ext3 corruption in brd.c (and proposed fix).
   http://lkml.indiana.edu/hypermail/linux/kernel/0904.1/03531.html
3. ext3/4 corruption evidenced by jbd/jbd2 errors (and proposed fix).
   Proposed fix: http://marc.info/?l=linux-ext4&m=123731584711382&w=2 (this link has 3 other fixes.. I have no idea what they are for).
   Problem description thread: http://osdir.com/ml/linux-ext4/2009-03/msg00274.html
4. ext3/4 corruption evidenced by corrupt htrees.
   http://osdir.com/ml/linux-ext4/2009-03/msg00316.html
   I haven't found any evidence of a fix for this yet.
4a. The above link also refers to a patch titled "Nick's "fs: new inode i_state corruption fix" patch".

Thanks,
Devin Nate

Ran the test script for several days. On i686, I was not able to reproduce the data corruption with kernel 2.6.18-92.el5, and data corruption does not happen on 2.6.18-160.el5. On x86_64, data corruption does not happen either. The patch linux-2.6-misc-backport-new-ramdisk-driver.patch is included in the kernel source package.

Hi Redhat, Hushan;

I took the opportunity after I posted to run the testing which uses fsstress. The command is in a while loop, and retrievable from one of the above links. I can post it if need be. The actual fsstress command is this:

    fsstress -d /mnt/test_file_system/work -p 3 -l 0 -n 100000000 -X &

Anyhow, having run the test, I can confirm corruption in ext2, ext3, and ext4 when running on a ramdisk (brd), and also on a loop device (loop0), regardless of where I put that loop device (on a ramdisk, on a tmpfs /dev/shm, or in a file on an existing ext3 file system). The only time I have been successful is with ext3 mounted data=journal.
Furthermore, I took the opportunity to download the source code for kernel-2.6.18-157.el5 and linux-2.6.30.4 and can confirm that linux-2.6-misc-backport-new-ramdisk-driver.patch is missing code that is now upstream (again, the patch is a one-liner).

There is an additional bug report in RHEL bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=494927

Thanks,
Devin

Hi Devin -

Thanks for the feedback. The ext2/3 patches are not related to this bug. Please open a separate bz to follow those issues. The brd.c patch is related and looks like a good addition to the existing brd patch.

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1243.html