Bug 232677

Summary:

ext2/3 filesystem corruption with writable loopback devices on s390

Product:

Red Hat Enterprise Linux 5

Reporter:

Bryn M. Reeves <bmr>

Component:

kernel

Assignee:

Jan Glauber <jglauber>

Status:

CLOSED DUPLICATE

QA Contact:

Martin Jenner <mjenner>

Severity:

high

Docs Contact:

Priority:

high

Version:

5.0

CC:

dhoward, dzickus, esandeen

Target Milestone:

---

Target Release:

---

Hardware:

s390x

OS:

Linux

Whiteboard:

Fixed In Version:

Doc Type:

Bug Fix

Doc Text:

Story Points:

---

Clone Of:

Environment:

Last Closed:

2007-05-30 13:58:05 UTC

Attachments:

Description	Flags
corrupt file system image	none
reproducer script	none

Description Bryn M. Reeves 2007-03-16 16:11:05 UTC

Description of problem:
Creating an ext2/3 filesystem on a writable loopback device on s390 leads to
corruption of the resize inode.

Version-Release number of selected component (if applicable):
kernel-2.6.18-8.el5

How reproducible:
100%

Steps to Reproduce:
1. dd if=/dev/zero of=/tmp/img0
2. losetup /dev/loop0 /tmp/img0
3. mke2fs -j /dev/loop0
4. mount /dev/loop0 /mnt
5. ls -R /mnt
6. umount /mnt
7. e2fsck -fn /dev/loop0

Actual results:
# e2fsck -fn /dev/loop0
e2fsck 1.39 (29-May-2006)
Resize inode not valid.  Recreate? no

Pass 1: Checking inodes, blocks, and sizes
Inode 7, i_blocks is 254, should be 4.  Fix? no

Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information

/dev/loop0: ********** WARNING: Filesystem still has errors **********

/dev/loop0: 11/4096 files (9.1% non-contiguous), 1691/16384 blocks


Expected results:
# e2fsck -fn /dev/loop0
e2fsck 1.39 (29-May-2006)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/loop0: 11/4096 files (9.1% non-contiguous), 1691/16384 blocks


Additional info:
The resize inode is always affected. On other filesystems there are also
sometimes problems in the dtime field of deleted inodes, e.g.:


Resize inode not valid.  Recreate? no

Pass 1: Checking inodes, blocks, and sizes
Inode 7, i_blocks is 254, should be 252.  Fix? no

Deleted inode 17 has zero dtime.  Fix? no

Deleted inode 25 has zero dtime.  Fix? no


I wasn't able to reproduce this with msdos/vfat. Haven't tested with other fs types.

Comment 2 Eric Sandeen 2007-04-20 20:41:16 UTC

Hmm maybe you can bzip2 the /tmp/img0 image & attach it?

Does an image created on s390 check ok on x86, or vice versa?

The whole sequence works fine for me on x86

It's odd that you see an inode 25 on a fresh fs with no files....

Comment 3 Bryn M. Reeves 2007-04-21 02:42:40 UTC

It's s390 specific - I hit it while reproducing bug 232663 (big endian specific)
and found I couldn't reproduce it on ppc.

It sounds a lot like 236605 - fixed in 2.6.18-8.1.3.el5. I've run through about
five mkfs/mount/fsck cycles now without seeing it.

Comment 4 Bryn M. Reeves 2007-04-21 02:53:13 UTC

Created attachment 153232 [details]
corrupt file system image

Corrupt file system image (2M uncompressed)

Comment 5 Eric Sandeen 2007-04-21 03:23:05 UTC

Thanks Bryn.  What exactly was done to create this image?

Comment 6 Bryn M. Reeves 2007-04-21 18:11:00 UTC

As described in comment #1: a dd from /dev/zero to set up the image (bs=1M
count=2), then losetup, then mke2fs -j on the loop device.

It should be quite easy to reproduce on s390 (at least with a kernel prior to
2.6.18-8.1.3.el5). I've still not seen it on the later kernel, but I'll keep
testing for now.

I'm fairly certain this is just another instance of the page_mkclean problem
from bug 236605. When I first saw this it seemed to happen every time, but
running a script 1000 times just now gave me 236 failures on 2.6.18-8.el5. The
same script on 2.6.18-8.1.3.el5 didn't see any corruption in 1000 runs.
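
A repeated run like the one described here could be scripted roughly as follows.
This is only a sketch and is not the attached reproducer script: the function
names, the RUN_REPRO guard and the environment variables are assumptions made
for illustration. It needs root and a free loop device, and it counts a run as
a failure when the e2fsck exit status has bit 4 set (errors left uncorrected).

```shell
#!/bin/sh
# Sketch only; NOT the reproducer attached to this bug.
# Paths and run count follow the report; the variable names are hypothetical.
IMG=${IMG:-/tmp/img0}
LOOP=${LOOP:-/dev/loop0}
MNT=${MNT:-/mnt}
RUNS=${RUNS:-1000}

# e2fsck's exit status is a bit mask; bit 4 means "errors left uncorrected",
# which is what a read-only check (-n) reports for a damaged filesystem.
has_uncorrected_errors() {
    [ $(( $1 & 4 )) -ne 0 ]
}

# One mkfs/mount/fsck cycle. Returns the e2fsck exit status,
# or 8 (operational error) if the setup steps fail.
one_cycle() {
    dd if=/dev/zero of="$IMG" bs=1M count=2 2>/dev/null || return 8
    losetup "$LOOP" "$IMG" || return 8
    mke2fs -q -j "$LOOP"
    mount "$LOOP" "$MNT"
    ls -R "$MNT" >/dev/null
    umount "$MNT"
    e2fsck -fn "$LOOP" >/dev/null 2>&1
    status=$?
    losetup -d "$LOOP"
    return "$status"
}

# Repeat the cycle and report how many runs left filesystem errors behind.
run_all() {
    failures=0
    i=0
    while [ "$i" -lt "$RUNS" ]; do
        one_cycle
        if has_uncorrected_errors "$?"; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures failures in $RUNS runs"
}

# Only touch loop devices when explicitly asked to.
if [ "${RUN_REPRO:-0}" = 1 ]; then
    run_all
fi
```

Invoked as root with RUN_REPRO=1 set, it prints a single summary line that can
be compared between kernels.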

Comment 7 Bryn M. Reeves 2007-04-21 18:18:43 UTC

Created attachment 153245 [details]
reproducer script

Comment 9 Eric Sandeen 2007-04-21 21:46:20 UTC

Thanks Bryn.  If you're pretty sure this is already resolved by 236605, I'm
happy to wait 'til I hear otherwise... :)

Comment 10 RHEL Program Management 2007-04-25 20:38:44 UTC

This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 11 Bryn M. Reeves 2007-05-30 13:58:05 UTC

Eventually got back to testing this with 2.6.18-8.el5 plus the patch for bug
236605 - same result as on 2.6.18-8.1.3.el5. I see errors around 25% of the time
without the patch, and they disappear when it's applied - I think it's safe to
close this as a dup.


*** This bug has been marked as a duplicate of 236605 ***