Bug 441606 - fsck.ext3 fails with * glibc detected * error
Summary: fsck.ext3 fails with * glibc detected * error
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: e2fsprogs
Version: 4.6
Hardware: x86_64
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Eric Sandeen
QA Contact:
URL:
Whiteboard:
Keywords: Regression
Depends On:
Blocks:
Reported: 2008-04-08 23:26 UTC by Francisco Jesus Monserrat Coll
Modified: 2018-10-19 21:04 UTC
Clone Of:
Last Closed: 2008-07-24 19:58:41 UTC


Attachments
Core of fsck.ext3 (81.83 KB, application/octet-stream)
2008-04-09 00:42 UTC, Francisco Jesus Monserrat Coll
strace output of the error (88.90 KB, application/octet-stream)
2008-04-09 00:46 UTC, Francisco Jesus Monserrat Coll
bad filesystem image (3.01 KB, application/x-bzip2)
2008-04-10 18:29 UTC, Eric Sandeen


External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2008:0732 normal SHIPPED_LIVE e2fsprogs bug fix update 2008-07-23 16:46:24 UTC

Description Francisco Jesus Monserrat Coll 2008-04-08 23:26:10 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13

Description of problem:

When trying to repair a disk, fsck.ext3 aborts with a "glibc detected" error reporting an invalid pointer passed to free().

 
Log:

-----------------------------
 fsck.ext3 -y /dev/sdb1 
e2fsck 1.35 (28-Feb-2004)
/home/sces contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Inode 377434 has a bad extended attribute block 762658.  Clear? yes

*** glibc detected *** free(): invalid pointer: 0x0000007fbffffc2b ***
Aborted
---------------------------

 
This is inside a virtual machine, so the disk has no physical errors. The filesystem can still be mounted read-only, and most of its contents are accessible.



Version-Release number of selected component (if applicable):
e2fsprogs-1.35-12.11.el4_6.1 

How reproducible:
Always


Steps to Reproduce:
1. Run fsck.ext3 -y on the damaged filesystem (as in the log above).

Actual Results:


Expected Results:


Additional info:

Comment 1 Eric Sandeen 2008-04-09 00:23:23 UTC
If you set ulimit -c unlimited, do you get a core when it fails?

Or, is there any chance of providing an e2image for the problematic filesystem?

Thanks,
-Eric

Comment 2 Francisco Jesus Monserrat Coll 2008-04-09 00:42:14 UTC
Created attachment 301735
Core of fsck.ext3

Comment 3 Francisco Jesus Monserrat Coll 2008-04-09 00:46:18 UTC
Created attachment 301736
strace output of the error

Comment 4 Eric Sandeen 2008-04-09 01:18:14 UTC
Thanks, I'll look at the core soon.

On the chance that this is something already fixed in a pending update, would
you be willing to try:

http://people.redhat.com/esandeen/e2fsck.test/e2fsck.static ?

(it's a static x86_64 binary)

Thanks,
-Eric

Comment 5 Francisco Jesus Monserrat Coll 2008-04-09 01:49:43 UTC
Same problem:

./e2fsck.static -y /dev/sdb1
e2fsck 1.35 (28-Feb-2004)
/home/sces contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Inode 377434 has a bad extended attribute block 762658.  Clear? yes

*** glibc detected *** free(): invalid pointer: 0x0000007fbffffc25 ***


Comment 6 Eric Sandeen 2008-04-09 02:08:10 UTC
Ok, thanks for checking.

-Eric

Comment 7 Eric Sandeen 2008-04-09 15:27:51 UTC
Which version of glibc do you currently have installed, so I can install a
matching debuginfo here?
Thanks,
-Eric

Comment 8 Francisco Jesus Monserrat Coll 2008-04-09 22:42:58 UTC
The version installed by default on a RHEL 4 system:

glibc-2.3.4-2.39
glibc-2.3.4-2.39
glibc-common-2.3.4-2.3

thanks



Comment 9 Eric Sandeen 2008-04-10 02:29:54 UTC
thanks.  In retrospect that was a silly question for me to ask. :)

I now have an intentionally corrupted image which replicates the problem, so I
should have this sorted soon.

Thanks for all the info,
-Eric

Comment 10 Eric Sandeen 2008-04-10 02:47:18 UTC
Well, turns out to be a simple fix; it was an uninitialized variable, and it's
already fixed upstream.

http://git.kernel.org/?p=fs/ext2/e2fsprogs.git;a=commitdiff;h=86bc90f4f11df090f86dc764a4ea2d6dd5c13ffe

contains the fix.

For what it's worth, the bug was introduced in 1.35-12.7.el4; 1.35-12.6.el4
should not have this problem.  I'll try to move forward with this quickly.

Thanks for all the good bug-reporter work on this one.  :)

-Eric

Comment 11 RHEL Product and Program Management 2008-04-10 03:18:42 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 13 Eric Sandeen 2008-04-10 18:29:34 UTC
Created attachment 302044
bad filesystem image

bzip2-compressed 128M filesystem image with a bad xattr.

# bunzip2 badfs.bz2
# e2fsck -fy badfs

should show the error.
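
When the reproducer above is run from a script rather than by hand, the crash
can be told apart from a normal e2fsck result by looking at the exit status.
The following is only a sketch: fsck_outcome is a made-up helper name, the
labels are simplified from the exit codes documented in the e2fsck man page,
and it assumes the usual shell convention that a status above 128 means the
process was killed by a signal (status minus 128).

```shell
# Hypothetical helper: turn the shell's $? from an e2fsck run into a short label.
fsck_outcome() {
    status=$1
    if [ "$status" -gt 128 ]; then
        # Assumed shell convention: 128 + N means death by signal N.
        # SIGABRT is 6 on Linux, so the glibc abort would show up as 134.
        # Caveat: e2fsck also documents 128 as an exit bit of its own, so a
        # value above 128 is not strictly unambiguous.
        echo "killed by signal $((status - 128))"
    elif [ "$status" -eq 0 ]; then
        echo "clean"
    elif [ "$status" -lt 4 ]; then
        # Bits 1 and 2: errors were corrected (2 also asks for a reboot).
        echo "errors corrected"
    else
        # Bit 4 and above: uncorrected errors, operational or usage errors.
        echo "errors left or operational failure"
    fi
}

# Possible use with the attached image (run e2fsck first, then inspect $?):
#   e2fsck -fy badfs; fsck_outcome $?
```

With the unfixed package the expectation would be a signal result for this
image, and a non-signal result once the fix is installed; neither outcome was
verified here.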

Comment 15 RHEL Product and Program Management 2008-04-10 18:53:36 UTC
This bugzilla has Keywords: Regression.  

Since no regressions are allowed between releases, 
it is also being proposed as a blocker for this release.  

Please resolve ASAP.

Comment 16 James Laska 2008-04-10 19:03:41 UTC
esandeen: is there a reproducible way to create the bad fs image that you've
generated?  Are there other failure scenarios worth considering?

Thanks!

Comment 18 Eric Sandeen 2008-04-10 19:42:22 UTC
I did something like this.

dd if=/dev/zero of=badfs bs=1M count=128
mkfs.ext3 -b 4096 -F badfs
mkdir mnt
mount -o loop,user_xattr badfs mnt
touch mnt/file
for I in `seq 1 16`; do setfattr -n "user.attr$I" -v "fairly long string" mnt/file; done
umount mnt
debugfs -R "stat file" badfs | grep "File ACL"

you'll see:
File ACL: 5143    Directory ACL: 0

So use a hex editor, go to byte offset 5143 x 4096 (block number times block
size) in badfs, and you'll see something like:

02 EA

which is the attribute magic.  Corrupt it by changing it to "EB" or somesuch,
and there you go.  fsck will now go down the code path that tries to free the
uninitialized variable.  There may be other ways to *hit* it, but this is really
a pretty simple flaw with a simple fix, and this should be sufficient.

Thanks,
-Eric
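
The hex-editor step in comment 18 could also be scripted with od and dd. This
is a sketch, not part of the original procedure: block_offset, peek_bytes and
poke_byte are made-up helper names, and the position of the EA byte inside the
block (offset 3) is an assumption based on the magic being stored as a
little-endian 32-bit value, so dump the bytes first and adjust if needed.

```shell
# Byte offset of a filesystem block: block number times block size.
block_offset() {
    echo $(( $1 * $2 ))
}

# Hex-dump COUNT bytes of FILE starting at byte OFFSET, so the magic can be
# checked before anything is changed.  Arguments: FILE OFFSET COUNT.
peek_bytes() {
    od -A n -t x1 -j "$2" -N "$3" "$1"
}

# Overwrite a single byte of FILE at byte OFFSET, in place.
# Arguments: FILE OFFSET OCTAL, where OCTAL is three octal digits.
poke_byte() {
    printf "\\$3" | dd of="$1" bs=1 seek="$2" conv=notrunc 2>/dev/null
}

# Possible use for the image from comment 18 (block 5143, 4096-byte blocks):
#   off=$(block_offset 5143 4096)       # 21065728
#   peek_bytes badfs "$off" 4           # look for the "02 ea" magic first
#   poke_byte badfs $((off + 3)) 353    # octal 353 is 0xEB; offset 3 is assumed
```

conv=notrunc matters here: without it dd would truncate the image after the
byte it writes.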


Comment 24 errata-xmlrpc 2008-07-24 19:58:41 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2008-0732.html

