Bug 232677

Summary: ext2/3 filesystem corruption with writable loopback devices on s390
Product: Red Hat Enterprise Linux 5 Reporter: Bryn M. Reeves <bmr>
Component: kernel Assignee: Jan Glauber <jglauber>
Status: CLOSED DUPLICATE QA Contact: Martin Jenner <mjenner>
Severity: high Docs Contact:
Priority: high    
Version: 5.0 CC: dhoward, dzickus, esandeen
Target Milestone: ---   
Target Release: ---   
Hardware: s390x   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2007-05-30 13:58:05 UTC Type: ---
Attachments:
Description                  Flags
corrupt file system image    none
reproducer script            none

Description Bryn M. Reeves 2007-03-16 16:11:05 UTC
Description of problem:
Creating an ext2/3 filesystem on a writable loopback device on s390 leads to
corruption of the resize inode.

Version-Release number of selected component (if applicable):
kernel-2.6.18-8.EL

How reproducible:
100%

Steps to Reproduce:
1. dd if=/dev/zero of=/tmp/img0
2. losetup /dev/loop0 /tmp/img0
3. mke2fs -j /dev/loop0
4. mount /dev/loop0 /mnt
5. ls -R /mnt
6. umount /mnt
7. e2fsck -fn /dev/loop0
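
The steps above can be wrapped in a small function for repeated runs. This is only a sketch and not the reproducer script attached to this bug: the names IMG, LOOP, MNT, run and repro_cycle are made up for illustration, the image size uses the bs=1M count=2 given in comment 6, and the final losetup -d is an added cleanup step. It needs root unless DRY_RUN is set, in which case the commands are only printed.

```shell
#!/bin/sh
# Hypothetical names and defaults; adjust to the test system.
IMG=${IMG:-/tmp/img0}
LOOP=${LOOP:-/dev/loop0}
MNT=${MNT:-/mnt}

# Execute a command, or only print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# One mkfs/mount/fsck cycle; returns the exit status of e2fsck.
# No error handling beyond that: this is a sketch for a scratch system.
repro_cycle() {
    run dd if=/dev/zero of="$IMG" bs=1M count=2   # size taken from comment 6
    run losetup "$LOOP" "$IMG"
    run mke2fs -j "$LOOP"
    run mount "$LOOP" "$MNT"
    run ls -R "$MNT"
    run umount "$MNT"
    run e2fsck -fn "$LOOP"
    status=$?
    run losetup -d "$LOOP"                        # added cleanup, not in the steps
    return "$status"
}
```

Setting DRY_RUN=1 before calling repro_cycle prints the eight commands without touching any device, which is a cheap way to review them before running as root.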

Actual results:
# e2fsck -fn /dev/loop0
e2fsck 1.39 (29-May-2006)
Resize inode not valid.  Recreate? no

Pass 1: Checking inodes, blocks, and sizes
Inode 7, i_blocks is 254, should be 4.  Fix? no

Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information

/dev/loop0: ********** WARNING: Filesystem still has errors **********

/dev/loop0: 11/4096 files (9.1% non-contiguous), 1691/16384 blocks


Expected results:
# e2fsck -fn /dev/loop0
e2fsck 1.39 (29-May-2006)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/loop0: 11/4096 files (9.1% non-contiguous), 1691/16384 blocks
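
The two outputs above can also be told apart without reading the text. With -n, e2fsck leaves any errors it finds uncorrected, and its man page documents the exit status as a bitmask in which 4 means "file system errors left uncorrected" and 8 means "operational error". A minimal sketch of a check built on that (fsck_verdict is a hypothetical helper name, not part of e2fsprogs):

```shell
#!/bin/sh
# Map an e2fsck exit status (a bitmask) to a short verdict.
fsck_verdict() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo clean
    elif [ $((status & 4)) -ne 0 ]; then
        echo corrupt            # errors found and left uncorrected
    elif [ $((status & 8)) -ne 0 ]; then
        echo error              # e2fsck itself could not run properly
    else
        echo "other($status)"
    fi
}
```

Typical use would be to run e2fsck -fn /dev/loop0 and then call fsck_verdict $? on the next line.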


Additional info:
The resize inode is always affected. On other filesystems there are sometimes also
problems in the dtime field of deleted inodes, e.g.:


Resize inode not valid.  Recreate? no

Pass 1: Checking inodes, blocks, and sizes
Inode 7, i_blocks is 254, should be 252.  Fix? no

Deleted inode 17 has zero dtime.  Fix? no

Deleted inode 25 has zero dtime.  Fix? no


I wasn't able to reproduce this with msdos/vfat. Haven't tested with other fs types.

Comment 2 Eric Sandeen 2007-04-20 20:41:16 UTC
Hmm maybe you can bzip2 the /tmp/img0 image & attach it?

Does an image created on s390 check ok on x86, or vice versa?

The whole sequence works fine for me on x86

It's odd that you see an inode 25 on a fresh fs with no files....

Comment 3 Bryn M. Reeves 2007-04-21 02:42:40 UTC
It's s390 specific - I hit it while reproducing bug 232663 (big endian specific)
and found I couldn't reproduce it on ppc.

It sounds a lot like 236605 - fixed in 2.6.18-8.1.3.el5. I've run through about
five mkfs/mount/fsck cycles now without seeing it.



Comment 4 Bryn M. Reeves 2007-04-21 02:53:13 UTC
Created attachment 153232 [details]
corrupt file system image

Corrupt file system image (2M uncompressed)

Comment 5 Eric Sandeen 2007-04-21 03:23:05 UTC
Thanks Bryn.  What exactly was done to create this image?

Comment 6 Bryn M. Reeves 2007-04-21 18:11:00 UTC
As described in comment #1: a dd from /dev/zero to set up the image (bs=1M
count=2), then losetup, then mke2fs -j on the loop device.

It should be quite easy to reproduce on s390 (at least with a kernel prior to
2.6.18-8.1.3.el5). I've still not seen it on the later kernel, but I'll keep
testing for now.

I'm fairly certain this is just another instance of the page_mkclean problem
from bug 236605. When I first saw this it seemed to happen every time, but
running a script 1000 times just now gave me 236 failures on 2.6.18-8.el5. The
same script on 2.6.18-8.1.3.el5 didn't see any corruption in 1000 runs.
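
For reference, a count like the one above can be gathered with a loop along these lines. This is a sketch only and not the attached reproducer: count_failures and failure_rate are hypothetical helper names, and the command passed in is assumed to exit non-zero when e2fsck reports corruption.

```shell
#!/bin/sh
# count_failures N CMD...: run CMD N times, print how many runs exited non-zero.
count_failures() {
    runs=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        "$@" >/dev/null 2>&1 || failures=$((failures + 1))
        i=$((i + 1))
    done
    echo "$failures"
}

# failure_rate FAILURES RUNS: print the percentage with one decimal place.
failure_rate() {
    awk -v f="$1" -v n="$2" 'BEGIN { printf "%.1f\n", 100 * f / n }'
}
```

For the figures quoted in this comment, failure_rate 236 1000 prints 23.6.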


Comment 7 Bryn M. Reeves 2007-04-21 18:18:43 UTC
Created attachment 153245 [details]
reproducer script

Comment 9 Eric Sandeen 2007-04-21 21:46:20 UTC
Thanks Bryn.  If you're pretty sure this is already resolved by 236605, I'm
happy to wait 'til I hear otherwise... :)

Comment 10 RHEL Program Management 2007-04-25 20:38:44 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 11 Bryn M. Reeves 2007-05-30 13:58:05 UTC
Eventually got back to testing this with 2.6.18-8.el5 plus the patch for bug
236605 - same result. I see errors around 25% of the time without the patch, and
they disappear when it's applied - I think it's safe to close this as a dup.


*** This bug has been marked as a duplicate of 236605 ***