Bug 441606 - fsck.ext3 fails with *** glibc detected *** error
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: e2fsprogs
Version: 4.6
Hardware: x86_64 Linux
Priority: low Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Eric Sandeen
Keywords: Regression
Reported: 2008-04-08 19:26 EDT by Francisco Jesus Monserrat Coll
Modified: 2010-10-22 19:52 EDT
Fixed In Version: RHBA-2008-0732
Doc Type: Bug Fix
Last Closed: 2008-07-24 15:58:41 EDT

Attachments
Core of fsck.ext3 (81.83 KB, application/octet-stream)
2008-04-08 20:42 EDT, Francisco Jesus Monserrat Coll
strace output of the error (88.90 KB, application/octet-stream)
2008-04-08 20:46 EDT, Francisco Jesus Monserrat Coll
bad filesystem image (3.01 KB, application/x-bzip2)
2008-04-10 14:29 EDT, Eric Sandeen

Description Francisco Jesus Monserrat Coll 2008-04-08 19:26:10 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13

Description of problem:

While trying to repair a disk, fsck.ext3 aborts with a "glibc detected" error caused by an invalid pointer passed to free():

 
Log:

-----------------------------
 fsck.ext3 -y /dev/sdb1 
e2fsck 1.35 (28-Feb-2004)
/home/sces contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Inode 377434 has a bad extended attribute block 762658.  Clear? yes

*** glibc detected *** free(): invalid pointer: 0x0000007fbffffc2b ***
Aborted
---------------------------

 
 This is inside a virtual machine, so the disk has no physical errors. The filesystem can still be mounted read-only, and most of its contents are accessible.



Version-Release number of selected component (if applicable):
e2fsprogs-1.35-12.11.el4_6.1 

How reproducible:
Always


Steps to Reproduce:
1. Run fsck.ext3 -y /dev/sdb1 on the damaged filesystem.

Actual Results:


Expected Results:


Additional info:
Comment 1 Eric Sandeen 2008-04-08 20:23:23 EDT
If you set ulimit -c unlimited, do you get a core when it fails?

Or, is there any chance of providing an e2image for the problematic filesystem?

Thanks,
-Eric
Comment 2 Francisco Jesus Monserrat Coll 2008-04-08 20:42:14 EDT
Created attachment 301735 [details]
Core of fsck.ext3
Comment 3 Francisco Jesus Monserrat Coll 2008-04-08 20:46:18 EDT
Created attachment 301736 [details]
strace output of the error
Comment 4 Eric Sandeen 2008-04-08 21:18:14 EDT
Thanks, I'll look at the core soon.

On the chance that this is something already fixed in a pending update, would
you be willing to try:

http://people.redhat.com/esandeen/e2fsck.test/e2fsck.static ?

(it's a static x86_64 binary)

Thanks,
-Eric
Comment 5 Francisco Jesus Monserrat Coll 2008-04-08 21:49:43 EDT
Same problem:

./e2fsck.static -y /dev/sdb1
e2fsck 1.35 (28-Feb-2004)
/home/sces contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Inode 377434 has a bad extended attribute block 762658.  Clear? yes

*** glibc detected *** free(): invalid pointer: 0x0000007fbffffc25 ***
Comment 6 Eric Sandeen 2008-04-08 22:08:10 EDT
Ok, thanks for checking.

-Eric
Comment 7 Eric Sandeen 2008-04-09 11:27:51 EDT
Which version of glibc do you currently have installed, so I can install a
debuginfo to match, here?
Thanks,
-Eric
Comment 8 Francisco Jesus Monserrat Coll 2008-04-09 18:42:58 EDT
The version installed by default on a RHEL 4 system:

glibc-2.3.4-2.39
glibc-2.3.4-2.39
glibc-common-2.3.4-2.3

thanks

Comment 9 Eric Sandeen 2008-04-09 22:29:54 EDT
thanks.  In retrospect that was a silly question for me to ask. :)

I've got an intentionally-corrupted image which replicates the problem, now, so
should have this sorted soon.

Thanks for all the info,
-Eric
Comment 10 Eric Sandeen 2008-04-09 22:47:18 EDT
Well, turns out to be a simple fix; it was an uninitialized variable, and it's
already fixed upstream.

http://git.kernel.org/?p=fs/ext2/e2fsprogs.git;a=commitdiff;h=86bc90f4f11df090f86dc764a4ea2d6dd5c13ffe

contains the fix.

For what it's worth, the bug was introduced in 1.35-12.7.el4; 1.35-12.6.el4
should not have this problem.  I'll try to move forward with this quickly.

Thanks for all the good bug-reporter work on this one.  :)

-Eric
Comment 11 RHEL Product and Program Management 2008-04-09 23:18:42 EDT
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.
Comment 13 Eric Sandeen 2008-04-10 14:29:34 EDT
Created attachment 302044 [details]
bad filesystem image

bzip2'd 128M filesystem image with a bad xattr.

# bunzip2 badfs.bz2
# e2fsck -fy badfs

should show the error.
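
One way to tell the crash apart from an ordinary repair run is the exit status of e2fsck. The following is only a sketch and not part of the original report: the function name classify_fsck_status is hypothetical, and it assumes a shell (such as bash or dash) that reports a process killed by SIGABRT as status 134 (128 + 6). The other values follow the bitmask exit codes listed in the e2fsck man page.

```shell
#!/bin/sh
# Hypothetical helper: interpret the exit status of an e2fsck run.
# Assumption: the shell reports death by SIGABRT as 128 + 6 = 134.
classify_fsck_status() {
    status=$1
    if [ "$status" -eq 134 ]; then
        # Must be checked first: 134 also has the "4" bit set.
        echo "aborted (SIGABRT), e.g. glibc detected an invalid free()"
    elif [ "$status" -eq 0 ]; then
        echo "clean"
    elif [ $((status & 4)) -ne 0 ]; then
        echo "errors left uncorrected"
    elif [ $((status & 1)) -ne 0 ]; then
        echo "errors corrected"
    else
        echo "other status $status"
    fi
}

# Usage:
#   e2fsck -fy badfs; classify_fsck_status $?
```

On an affected e2fsprogs build, the run against the attached image would be expected to land in the first branch.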
Comment 15 RHEL Product and Program Management 2008-04-10 14:53:36 EDT
This bugzilla has Keywords: Regression.  

Since no regressions are allowed between releases, 
it is also being proposed as a blocker for this release.  

Please resolve ASAP.
Comment 16 James Laska 2008-04-10 15:03:41 EDT
esandeen: is there a reproducible way to create the bad fs image that you've
generated?  Are there other failure scenarios worth considering?

Thanks!
Comment 18 Eric Sandeen 2008-04-10 15:42:22 EDT
I did something like this.

dd if=/dev/zero of=badfs bs=1M count=128
mkfs.ext3 -b 4096 -F badfs
mkdir mnt
mount -o loop,user_xattr badfs mnt
touch mnt/file
for I in `seq 1 16`; do setfattr -n "user.attr$I" -v "fairly long string" mnt/file; done
umount mnt
debugfs -R "stat file" badfs | grep "File ACL"

you'll see:
File ACL: 5143    Directory ACL: 0

so use a hex editor, go to byte offset 5143 x 4096 = 21065728 (0x1417000) in
badfs, and you'll see something like:

02 EA

which is the attribute magic.  corrupt it by changing it to "EB" or somesuch...
and there you go.  fsck will now go down the codepath that tries to free the
uninit'd variable.  There may be other ways to *hit* it but this is really a
pretty simple flaw, with a simple fix, and this should be sufficient.

Thanks,
-Eric
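
The hex-editor step above can also be scripted. This is a minimal sketch, not taken from the original comment: the function name corrupt_xattr_magic is hypothetical, and it assumes the attribute magic is stored little-endian as 00 00 02 EA, so that the EA byte is the fourth byte of the block. It refuses to write unless it actually finds EA at that position.

```shell
#!/bin/sh
# Hypothetical helper: overwrite the 0xEA byte of the xattr block magic with 0xEB.
# Usage: corrupt_xattr_magic IMAGE BLOCK_NUMBER [BLOCK_SIZE]
corrupt_xattr_magic() {
    image=$1
    block=$2
    blocksize=${3:-4096}

    # Byte offset of the block, e.g. 5143 * 4096 = 21065728.
    offset=$((block * blocksize))

    # Assumption: the magic is little-endian (00 00 02 EA), so EA is byte 3.
    target=$((offset + 3))
    current=$(dd if="$image" bs=1 skip="$target" count=1 2>/dev/null \
        | od -An -tx1 | tr -d ' \n')
    if [ "$current" != "ea" ]; then
        echo "expected ea at offset $target, found '$current'; not modifying $image" >&2
        return 1
    fi

    # '\353' is octal for 0xEB; conv=notrunc patches in place without truncating.
    printf '\353' | dd of="$image" bs=1 seek="$target" conv=notrunc 2>/dev/null
}

# Usage, with the block number reported by debugfs as "File ACL":
#   corrupt_xattr_magic badfs 5143
```

Run it only on a scratch image such as badfs, never on a filesystem holding real data.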
Comment 24 errata-xmlrpc 2008-07-24 15:58:41 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2008-0732.html
