Bug 480663 - data corruption and general brokenness with ramdisks (rd)
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.2
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Don Howard
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks: 483701
 
Reported: 2009-01-19 18:40 UTC by Bryn M. Reeves
Modified: 2018-10-20 03:34 UTC
CC: 7 users

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-09-02 08:20:37 UTC
Target Upstream Version:
Embargoed:




Links
System ID: Red Hat Product Errata RHSA-2009:1243
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: Red Hat Enterprise Linux 5.4 kernel security and bug fix update
Last Updated: 2009-09-01 08:53:34 UTC

Description Bryn M. Reeves 2009-01-19 18:40:30 UTC
Description of problem:
The ramdisk driver has long been known to have issues, particularly with a lot of page reclaim going on (memory pressure/swapping are good ways to help it misbehave). It was replaced upstream in 2.6.25 with a complete rewrite.

From Nick's description of the replacement driver:

   "The old one is really difficult because it effectively implements a block
    device which serves data out of its own buffer cache.  It relies on the 
    dirty bit being set, to pin its backing store in cache, however there are 
    non trivial paths which can clear the dirty bit (eg.  
    try_to_free_buffers()), which had recently led to data corruption.  And
    in general it is completely wrong for a block device driver to do this."

It's quite easy to trigger data corruption and other problems (hangs and oopses have been reported) by stressing the ramdisk driver or the VM.

Version-Release number of selected component (if applicable):
2.6.18-92.el5

How reproducible:
Takes a while.

Steps to Reproduce:
1. Create a number of large ramdisks
2. Configure the system with sufficiently little RAM that swapping will occur
3. Create and mount file systems on the ramdisks
4. Apply load to the ramdisk backed file systems (e.g. repeated tar/untar)
5. Verify the data written to the devices in step (4); a shell sketch of this loop appears below.
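
A minimal sketch of steps 1-5 as a shell loop. The device count, ramdisk size, mount points, and test tarball are assumptions for illustration, not from the original report; the rd driver also needs to be configured with large enough ramdisks (e.g. booting with ramdisk_size=262144 for 256 MB devices) on a box with little enough RAM that it swaps under load:

    #!/bin/bash
    # Steps 1-3: mkfs and mount a number of ramdisks (names/sizes assumed).
    for i in 0 1 2 3; do
        mkfs.ext3 -q /dev/ram$i
        mkdir -p /mnt/rd$i
        mount /dev/ram$i /mnt/rd$i
    done
    # Steps 4-5: apply repeated tar/untar load and verify the written data.
    # /root/testdata.tar is a hypothetical tarball containing testdata/.
    while true; do
        for i in 0 1 2 3; do
            tar -C /mnt/rd$i -xf /root/testdata.tar
            diff -r /root/testdata /mnt/rd$i/testdata || echo "MISMATCH on /dev/ram$i"
            rm -rf /mnt/rd$i/testdata
        done
    done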
  
Actual results:
After some time (which may take several days depending on I/O and memory load), data inconsistencies will appear, normally manifesting as a zero page in the middle of a ramdisk. Occasional hangs/oopses have been reported by end users, but no data is currently available for these cases.
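
One rough way to catch the zero-page symptom directly, independent of any file system, is to fill a ramdisk with a non-zero pattern and compare checksums after a period of memory pressure (the device name and size below are assumptions):

    # Fill /dev/ram0 with 0x55 bytes (16 MB in 4 KB pages), then re-read
    # and compare; a zero page anywhere will change the checksum.
    tr '\0' '\125' < /dev/zero | dd of=/dev/ram0 bs=4k count=4096 2>/dev/null
    sync
    expected=$(tr '\0' '\125' < /dev/zero | dd bs=4k count=4096 2>/dev/null | md5sum | cut -d' ' -f1)
    actual=$(dd if=/dev/ram0 bs=4k count=4096 2>/dev/null | md5sum | cut -d' ' -f1)
    [ "$expected" = "$actual" ] || echo "corruption: /dev/ram0 contents changed"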

Expected results:
Reliable behavior of ramdisks under system load.

Additional info:
The user who reported this behavior has tested with the post-2.6.25 ramdisk rewrite and is unable to reproduce with those kernels.

New driver was merged in commit 9db5579be4bb5320c3248f6acf807aedf05ae143
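
For reference, the rewrite can be inspected in a clone of the mainline tree (commands assume a local git checkout; the file path is the one the rewrite is believed to add):

    git log -1 --oneline 9db5579be4bb5320c3248f6acf807aedf05ae143
    git show 9db5579be4bb5320c3248f6acf807aedf05ae143 -- drivers/block/brd.c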

Comment 5 RHEL Program Management 2009-02-16 15:21:32 UTC
Updating PM score.

Comment 14 Don Zickus 2009-04-27 15:59:09 UTC
in kernel-2.6.18-141.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5
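
A typical way to try such a test kernel (the exact package file name and architecture below are assumptions; adjust to match your system):

    wget http://people.redhat.com/dzickus/el5/kernel-2.6.18-141.el5.x86_64.rpm
    rpm -ivh kernel-2.6.18-141.el5.x86_64.rpm   # -i installs alongside the current kernel
    reboot                                      # then pick the test kernel in GRUB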

Please do NOT transition this bugzilla's state to VERIFIED until our QE team
has sent specific instructions indicating when to do so.  However, feel free
to provide a comment indicating that this fix has been verified.

Comment 16 Chris Ward 2009-07-03 18:21:20 UTC
~~ Attention - RHEL 5.4 Beta Released! ~~

RHEL 5.4 Beta has been released! There should be a fix present in the Beta release that addresses this particular request. Please test and report back results here, at your earliest convenience. RHEL 5.4 General Availability release is just around the corner!

If you encounter any issues while testing Beta, please describe the issues you have encountered and set the bug into NEED_INFO. If you encounter new issues, please clone this bug to open a new issue and request it be reviewed for inclusion in RHEL 5.4 or a later update, if it is not of urgent severity.

Please do not flip the bug status to VERIFIED. Only post your verification results and, if available, update the Verified field with the appropriate value.

Questions can be posted to this bug or your customer or partner representative.

Comment 17 Devin Nate 2009-08-03 20:51:40 UTC
Dear RHEL:

I have downloaded kernel-2.6.18-157.el5.src.rpm to review based on some research I've done on the new brd (replacement ramdisk/rd) implementation. I am interested in it because our company makes heavy use of a ramdisk during certain data transformations and corruption would be bad(tm).

I've come across a few issues and do not see that these patches have been applied in the RHEL 5.4 beta kernel release. In any event, a quick google of 'ext2 corruption brd', 'ext3 corruption brd', and 'ext4 corruption brd' will show 2 basic threads. They appear around Mar/Apr 2009.

In one, there appear to be problems with ext3 and ext4, which look like a problem involving brd, ext3 jbd/jbd2, and htrees.

In the second, there appear to be problems with ext2.

Note: I'm not the author of any of these patches and not a kernel contributor... I've simply googled the info. Please don't apply these patches based on my authority without appropriate RHEL review ;)

Specific links and info below:

1. ext2 corruption (and proposed fix).
http://patchwork.ozlabs.org/patch/25808/

2. ext3 corruption in brd.c (and proposed fix).
http://lkml.indiana.edu/hypermail/linux/kernel/0904.1/03531.html

3. ext3/4 corruption evidenced by jbd/jbd2 errors (and proposed fix)
Proposed fix: http://marc.info/?l=linux-ext4&m=123731584711382&w=2 (this link has 3 other fixes; I have no idea what they are for).
Prob desc thread: http://osdir.com/ml/linux-ext4/2009-03/msg00274.html

4. ext3/4 corruption evidenced by corrupt htrees
http://osdir.com/ml/linux-ext4/2009-03/msg00316.html
I haven't found any evidence of a fix for this yet.

4a. The above link also refers to Nick's "fs: new inode i_state corruption fix" patch.

Thanks,
Devin Nate

Comment 18 Hushan Jia 2009-08-04 06:20:53 UTC
I ran the test script for several days.
On i686 with kernel 2.6.18-92.el5, I was not able to reproduce the data corruption, and it also does not happen on 2.6.18-160.el5.
On x86_64, data corruption likewise does not happen.

The patch linux-2.6-misc-backport-new-ramdisk-driver.patch is included in the kernel source package.

Comment 21 Devin Nate 2009-08-04 15:22:26 UTC
Hi Redhat, Hushan;

After I posted, I took the opportunity to run the testing, which uses fsstress. The command runs in a while loop and is retrievable from one of the above links; I can post it if need be. The actual fsstress command is this:
fsstress -d /mnt/test_file_system/work -p 3 -l 0 -n 100000000 -X &
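
For completeness, a sketch of the kind of wrapper loop described (the device and mount point here are assumptions, not taken from the linked threads):

    mkfs.ext3 -q /dev/ram0
    mkdir -p /mnt/test_file_system
    mount /dev/ram0 /mnt/test_file_system
    mkdir -p /mnt/test_file_system/work
    # Restart fsstress whenever it exits; -l 0 loops indefinitely within a run.
    while true; do
        fsstress -d /mnt/test_file_system/work -p 3 -l 0 -n 100000000 -X
    done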

Anyhow, having run the test, I can confirm corruption on ext2, ext3, and ext4 when running on a ramdisk (brd), and also on a loop device (loop0), regardless of where I put that loop device (on a ramdisk, on a tmpfs /dev/shm, or in a file on an existing ext3 file system). The only configuration where I have been successful (no corruption) is ext3 mounted with data=journal.
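
For reference, full data journalling is selected at mount time; a minimal example, assuming the device and mount point used above:

    mount -t ext3 -o data=journal /dev/ram0 /mnt/test_file_system

data=journal routes file data through the journal as well as metadata, which changes the writeback path substantially and may be why it masks the corruption seen under the default data=ordered mode.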

Furthermore, I took the opportunity to download the source code for kernel-2.6.18-157.el5 and linux-2.6.30.4, and can confirm that linux-2.6-misc-backport-new-ramdisk-driver.patch is missing code that is now upstream (again, the missing fix being a one-liner).

There is an additional bug report in the RHEL bugzilla:
https://bugzilla.redhat.com/show_bug.cgi?id=494927

Thanks,
Devin

Comment 23 Don Howard 2009-08-04 15:56:09 UTC
Hi Devin - 

Thanks for the feedback.

The ext2/3 patches are not related to this bug.  Please open a separate bz to follow those issues.

The brd.c patch is related and looks like a good addition to the existing brd patch.

Comment 27 errata-xmlrpc 2009-09-02 08:20:37 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1243.html

