Bug 155473 - ext3 data corruption under Samba share
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Hardware: All
OS: Linux
Priority: high
Severity: high
Assigned To: Stephen Tweedie
QA Contact: Brian Brock
Depends On:
Blocks: 156321
Reported: 2005-04-20 14:35 EDT by Wendy Cheng
Modified: 2007-11-30 17:07 EST (History)
Fixed In Version: RHSA-2005-663
Doc Type: Bug Fix
Last Closed: 2005-09-28 10:59:14 EDT

Attachments
patch 2-1 (9.62 KB, patch)
2005-04-20 14:41 EDT, Wendy Cheng
patch 2-2 (4.18 KB, patch)
2005-04-20 14:43 EDT, Wendy Cheng
aclbreak.tar.gz (1.80 MB, application/octet-stream)
2005-07-06 06:20 EDT, Bastien Nocera

Description Wendy Cheng 2005-04-20 14:35:46 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4.3) Gecko/20040924

Description of problem:
Based on the fsck log collected from a customer site that reported data corruption on a 1.8TB filesystem on a RHEL 3 system, we tentatively concluded that the issues found in bugzilla 138951 (opened against RHEL 4) were the cause.

The filesystem is exported as an SMB share that is accessed from Windows machines.

The symptoms include "ls" error messages such as:
[root@nycpr350fil graphics]# ls -al
ls: athletic.zip: Input/output error
ls: •DSC_0010.psd: Input/output error
ls: •DSC_0014.psd: Input/output error
ls: WeatherPlusLOGOfeb9.eps: Input/output error
From the /var/log/messages file:
EXT3-fs error (device power2(232,49)): ext3_free_blocks: bit already cleared for block 33103513
This indicates block bitmap corruption. Blocks that are still in use may be marked as available for reuse and subsequently be allocated again as "free" blocks.
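
The failure mode can be illustrated with a toy model. This is only a minimal sketch, not ext3 code: `ToyBitmap`, `parse_error`, and `simulate` are hypothetical names introduced here, and the regular expression assumes only the message format quoted in this report. It shows how a stray free lets two files end up owning the same block, and how the later double free produces a "bit already cleared" complaint.

```python
import re

# Matches the message format quoted in this report (assumption: other
# kernel versions may word the message differently).
ERR_RE = re.compile(
    r"EXT3-fs error \(device (?P<dev>.+?)\): "
    r"ext3_free_blocks: bit already cleared for block (?P<block>\d+)"
)


def parse_error(line):
    """Return (device, block number) for a matching log line, else None."""
    m = ERR_RE.search(line)
    return (m.group("dev"), int(m.group("block"))) if m else None


class ToyBitmap:
    """Hypothetical in-memory block bitmap; True means 'block in use'."""

    def __init__(self, nblocks):
        self.bits = [False] * nblocks

    def allocate(self):
        """Hand out the first free block and mark it in use."""
        for block, used in enumerate(self.bits):
            if not used:
                self.bits[block] = True
                return block
        raise RuntimeError("no free blocks")

    def free(self, block):
        """Clear the bit; return a warning string on a double free."""
        if not self.bits[block]:
            return f"bit already cleared for block {block}"
        self.bits[block] = False
        return None


def simulate():
    bitmap = ToyBitmap(4)
    block_a = bitmap.allocate()     # file A is given a block
    bitmap.free(block_a)            # stray free: A still points at it
    block_b = bitmap.allocate()     # file B is handed the same block
    shared = block_a == block_b     # two files now share one data block
    bitmap.free(block_b)            # B is deleted: bit cleared
    warning = bitmap.free(block_a)  # A is deleted: bit is already clear
    return shared, warning
```

Calling `parse_error()` on the /var/log/messages line above yields the device name and the block number, which can then be compared against the blocks fsck reports as multiply claimed.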

Version-Release number of selected component (if applicable):

How reproducible:
Didn't try

Steps to Reproduce:
1. (occurred twice on a mission-critical production system)

Actual Results:  filesystem corrupted

Expected Results:  no corruption

Additional info:

This has occurred twice on a mission-critical system with a large LUN (1.8TB). Besides the downtime being unacceptable, the fsck time for a LUN of this size is also unmanageable.
Comment 1 Wendy Cheng 2005-04-20 14:41:16 EDT
Created attachment 113428
patch 2-1
Comment 2 Wendy Cheng 2005-04-20 14:43:19 EDT
Created attachment 113429
patch 2-2

Stephen Tweedie backported these two patches into RHEL 3. A RHEL 3 .31EL
beehive-based test kernel with these two patches has been sent to the customer.
Comment 19 Bastien Nocera 2005-07-06 06:17:49 EDT
*** Bug 161056 has been marked as a duplicate of this bug. ***
Comment 20 Bastien Nocera 2005-07-06 06:20:03 EDT
Created attachment 116401

Test case from bug #161056.
Comment 21 Ernie Petrides 2005-07-15 20:22:04 EDT
A fix for this problem has just been committed to the RHEL3 U6
patch pool this evening (in kernel version 2.4.21-32.12.EL).
Comment 24 Red Hat Bugzilla 2005-09-28 10:59:14 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

