Bug 204731 - external scsi filesystem remount read-only during remove data
Status: CLOSED NOTABUG
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.3
Hardware: x86_64 Linux
Priority: medium
Severity: high
Assigned To: Eric Sandeen
QA Contact: Brian Brock
Reported: 2006-08-31 06:38 EDT by Benedikt Schaefer
Modified: 2007-11-16 20:14 EST
Doc Type: Bug Fix
Last Closed: 2006-12-05 08:56:37 EST

Description Benedikt Schaefer 2006-08-31 06:38:59 EDT
Description of problem:
We have a SCSI RAID (ERQ16+) connected through an LSI20320R SCSI adapter to
a UNIWIDE server (UniServer_3326). The filesystem is ext3 and the partition
size is 1TB. We copy 300GB to this filesystem with rsync. After the successful
copy we want to delete all the data with rm -rf, but this fails and the
filesystem is remounted read-only.

Error messages from /var/log/messages:
Aug 30 14:31:33 oss3 kernel: EXT3-fs error (device sdb1): ext3_free_blocks_sb:
bit already cleared for block 16
Aug 30 14:31:33 oss3 kernel: Aborting journal on device sdb1.
Aug 30 14:31:33 oss3 kernel: EXT3-fs error (device sdb1) in
ext3_reserve_inode_write: Journal has aborted
Aug 30 14:31:33 oss3 kernel: EXT3-fs error (device sdb1) in ext3_truncate:
Journal has aborted
Aug 30 14:31:33 oss3 kernel: EXT3-fs error (device sdb1) in
ext3_reserve_inode_write: Journal has aborted
Aug 30 14:31:33 oss3 kernel: EXT3-fs error (device sdb1) in ext3_orphan_del:
Journal has aborted
Aug 30 14:31:33 oss3 kernel: EXT3-fs error (device sdb1) in
ext3_reserve_inode_write: Journal has aborted
Aug 30 14:31:33 oss3 kernel: EXT3-fs error (device sdb1) in ext3_delete_inode:
Journal has aborted
Aug 30 14:31:33 oss3 kernel: __journal_remove_journal_head: freeing b_committed_data
Aug 30 14:31:33 oss3 last message repeated 90 times
Aug 30 14:31:33 oss3 kernel: ext3_abort called.
Aug 30 14:31:33 oss3 kernel: EXT3-fs error (device sdb1): ext3_journal_start_sb:
Detected aborted journal
Aug 30 14:31:33 oss3 kernel: Remounting filesystem read-only
Aug 30 14:31:33 oss3 kernel: __journal_remove_journal_head: freeing b_committed_data
Aug 30 14:31:33 oss3 last message repeated 86 times
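
When checking whether these are the first errors in the log, it can help to
rejoin the wrapped lines and pick out the earliest EXT3-fs error. The following
is a minimal sketch, not part of the original report. It assumes a modern
Python 3 and the traditional syslog timestamp format seen above; the function
names are hypothetical.

```python
import re

# A syslog record starts with "Mon DD HH:MM:SS host "; any other non-empty
# line is treated as the wrapped tail of the previous record (an assumption
# that matches how the excerpt above is wrapped).
SYSLOG_START = re.compile(r"^[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} \S+ ")


def join_wrapped(lines):
    """Merge wrapped continuation lines back into whole log records."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue
        if SYSLOG_START.match(line) or not records:
            records.append(line)
        else:
            records[-1] += " " + line.strip()
    return records


def first_ext3_error(records):
    """Return the first record that reports an EXT3-fs error, or None."""
    for record in records:
        if "EXT3-fs error" in record:
            return record
    return None


def remounted_read_only(records):
    """True if any record announces the read-only remount."""
    return any("Remounting filesystem read-only" in r for r in records)


# Three records copied from the log above, the first one wrapped.
SAMPLE = """\
Aug 30 14:31:33 oss3 kernel: EXT3-fs error (device sdb1): ext3_free_blocks_sb:
bit already cleared for block 16
Aug 30 14:31:33 oss3 kernel: Aborting journal on device sdb1.
Aug 30 14:31:33 oss3 kernel: Remounting filesystem read-only
""".splitlines()
```

Running join_wrapped over the whole of /var/log/messages and then looking at
the records just before the first EXT3-fs error is one way to spot earlier
SCSI or I/O errors, if there are any.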

After the "crash", e2fsck does not seem to help; I have to recreate the filesystem.

Version-Release number of selected component (if applicable):
Server:
 UNIWIDE 3326
  CPUs: 2 x Dual Core AMD Opteron(tm) Processor 870
  MEM: 2GB
  HDD: 1 x ATLAS10K4_36SCA
  SCSI: 2 x SCSI storage controller: LSI Logic / Symbios Logic 53c1030 PCI-X
Fusion-MPT Dual Ultra320 SCSI (rev 08)
  
RAID: EasyRaid Q16+ with 16 x 250GB Hitachi SATA 
      Configured with 2 Raidsets each Raidset with 3 slices (900GB)

OS: RHEL4U2 kernel 2.6.9-22.0.2
    RHEL4U2 kernel 2.6.9.34

How reproducible:
Every time

Steps to Reproduce:
1. connect raid to server
2. boot server
3. mkfs.ext3
4. rsync -av /data /raid (300GB) (maybe earlier)
5. rm -rf /raid
  
Actual results:
The filesystem is remounted read-only.
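
To confirm this state from a script rather than from the kernel log, one
option is to look at the mount options in /proc/mounts. This is a rough
sketch, not something from the original report. It assumes a modern Python 3,
that the read-only flag shows up as "ro" in /proc/mounts, and the /raid mount
point from the steps above; the helper names are hypothetical.

```python
def mount_options(mounts_text, mount_point):
    """Return the option list for mount_point from /proc/mounts-style text.

    Returns None if the mount point is not listed. If it is listed more
    than once, the last entry wins.
    """
    options = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            options = fields[3].split(",")
    return options


def is_read_only(mounts_text, mount_point):
    """True if mount_point is listed and carries the "ro" option."""
    options = mount_options(mounts_text, mount_point)
    return options is not None and "ro" in options


def check_mount(mount_point, mounts_path="/proc/mounts"):
    """Read the live mount table and report whether mount_point is read-only."""
    with open(mounts_path) as handle:
        return is_read_only(handle.read(), mount_point)
```

Calling check_mount("/raid") after step 5 would then report whether the
filesystem has been switched to read-only.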

Expected results:
The data is deleted.

Additional info:
We replaced the RAID, SCSI HBA, SCSI terminator, and SCSI cable, and the
problem still occurs.
We also saw the same problem on another server (TYAN GT24).
Comment 1 Eric Sandeen 2006-12-04 18:11:16 EST
Are the above messages the first errors you see?  Are there any other error
messages before this?

What is the output of e2fsck?  You say it doesn't help; what do you mean by
that?  Does e2fsck fail, or something else?
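
One way to capture that output together with the exit status is sketched
below. It is only an illustration, assuming a modern Python 3, an unmounted
filesystem, and the /dev/sdb1 device name from the log; the function names are
hypothetical. The exit-status meanings are the ones listed in the e2fsck(8)
man page, where the status is a sum of the bits that apply.

```python
import subprocess

# Exit-status bits as documented in e2fsck(8); the status is a bitwise sum.
E2FSCK_EXIT_BITS = {
    1: "file system errors corrected",
    2: "file system errors corrected, system should be rebooted",
    4: "file system errors left uncorrected",
    8: "operational error",
    16: "usage or syntax error",
    32: "e2fsck canceled by user request",
    128: "shared library error",
}


def decode_exit_status(status):
    """Translate an e2fsck exit status into a list of descriptions."""
    if status == 0:
        return ["no errors"]
    return [text for bit, text in sorted(E2FSCK_EXIT_BITS.items()) if status & bit]


def run_readonly_check(device):
    """Run a forced check that changes nothing on disk.

    -f forces the check even if the filesystem looks clean; -n opens the
    filesystem read-only and answers "no" to every question.
    """
    result = subprocess.run(
        ["e2fsck", "-f", "-n", device],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.stdout, decode_exit_status(result.returncode)
```

For example, run_readonly_check("/dev/sdb1") returns the full e2fsck output and
a readable summary of the exit status, which could be pasted into the bug.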

If you have the hardware, is it possible to recreate this on a different type of
storage subsystem?  (SATA drive, or different type of raid, simpler geometry,
or...)  When it fails on the other server, is it the same IO hardware? (hba,
raid etc?)

Thanks,

-Eric
Comment 2 Benedikt Schaefer 2006-12-05 02:31:21 EST
Dear Eric, 
 
Thanks for your answer. 
We found that the error was in the hardware (a defective PCI slot). 
 
best regards 
Benedikt Schaefer 
Comment 3 Benedikt Schaefer 2006-12-05 02:32:02 EST
Dear Eric, 
 
Thanks for your answer. 
We found that the error was in the hardware (a defective PCI slot). 
 
best regards 
Benedikt Schaefer 
Comment 4 Eric Sandeen 2006-12-05 08:56:37 EST
Closing; hardware problem.
