607843 – fsck exits with signal 11 (segmentation fault)

Summary: fsck exits with signal 11 (segmentation fault)

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 5
Classification:	Red Hat
Component:	e2fsprogs
Sub Component:
Version:	5.5
Hardware:	All
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	rc
Target Release:	---
Assignee:	Eric Sandeen
QA Contact:	BaseOS QE - Apps
Docs Contact:
URL:
Whiteboard:
Duplicates (1):	1005192
Depends On:
Blocks:

Reported:	2010-06-25 01:44 UTC by Lachlan McIlroy
Modified:	2018-11-14 12:43 UTC
CC List:	6 users
Fixed In Version:	e2fsprogs-1.39-27.el5
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2011-07-21 09:05:50 UTC
Target Upstream Version:
Embargoed:
Dependent Products:

Attachments
Patch that should fix the problem (1.08 KB, patch) 2010-06-25 01:50 UTC, Lachlan McIlroy
Patch to prevent floating point precision errors (2.20 KB, patch) 2010-06-30 02:00 UTC, Lachlan McIlroy

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2011:1080	0	normal	SHIPPED_LIVE	e2fsprogs bug fix and enhancement update	2011-07-21 09:04:54 UTC

Description Lachlan McIlroy 2010-06-25 01:44:48 UTC

Description of problem:

The customer reports that fsck segfaults every time it checks a particular volume.

# fsck.ext3 /dev/VolGroup99/LogVol00
e2fsck 1.39 (29-May-2006)
sasroot has been mounted 34 times without being checked, check forced.
Pass 1: Checking inodes, blocks, and sizes
Segmentation fault

Version-Release number of selected component (if applicable):
e2fsprogs-1.39-23.el5.x86_64

How reproducible:
Every time the customer runs fsck on the above volume.
 
Additional info:

I asked the customer for an e2image of that volume and ran fsck on it, but it completed without a problem:

e2fsck 1.39 (29-May-2006)
sasroot has been mounted 34 times without being checked, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
sasroot: 298241/10485760 files (0.8% non-contiguous), 2629160/20971520 blocks

I asked them to run fsck under gdb to get a stack trace:

(gdb) run /dev/VolGroup99/LogVol00
Starting program: /sbin/fsck.ext3 /dev/VolGroup99/LogVol00
e2fsck 1.39 (29-May-2006)
sasroot has been mounted 34 times without being checked, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure

Program received signal SIGSEGV, Segmentation fault.
0x000000000041c647 in get_icount_el (icount=0xbe48030, ino=1671252, create=1) at icount.c:257
257                     if (ino == icount->list[mid].ino) {
(gdb) bt
#0  0x000000000041c647 in get_icount_el (icount=0xbe48030, ino=1671252, create=1) at icount.c:257
#1  0x000000000041cd75 in ext2fs_icount_increment (icount=0xbe48030, ino=1671252, ret=0x7fff41c59bce) at icount.c:378
#2  0x000000000040ada4 in check_dir_block (fs=0xbb41e30, db=0x2af49ba21e68, priv_data=0x7fff41c59c50) at pass2.c:1014
#3  0x000000000041a673 in ext2fs_dblist_iterate (dblist=0xbb5dd30, func=0x40a720 <check_dir_block>,  
  priv_data=0x7fff41c59c50) at dblist.c:235
#4  0x000000000040a2d7 in e2fsck_pass2 (ctx=0xbb41b30) at pass2.c:149
#5  0x00000000004030a6 in e2fsck_run (ctx=0xbb41b30) at e2fsck.c:203
#6  0x00000000004023af in main (argc=<value optimized out>, argv=<value optimized out>) at unix.c:1148 

It looks like 'mid' has a bad value and we've run off the end of the list array.

It corresponds to this code in lib/ext2fs/icount.c:get_icount_el():

        while (low <= high) {
#if 0
                mid = (low+high)/2;
#else
                if (low == high)
                        mid = low;
                else {
                        /* Interpolate for efficiency */
                        lowval = icount->list[low].ino;
                        highval = icount->list[high].ino;

                        if (ino < lowval)
                                range = 0;
                        else if (ino > highval)
                                range = 1;
                        else 
                                range = ((float) (ino - lowval)) /
                                        (highval - lowval);
                        mid = low + ((int) (range * (high-low)));
                }
#endif
                if (ino == icount->list[mid].ino) {
                        icount->cursor = mid+1;
                        return &icount->list[mid];
                }
                if (ino < icount->list[mid].ino)
                        high = mid-1;
                else
                        low = mid+1;
        }

So it looks like we might have a floating point error when calculating mid.
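To illustrate the kind of precision loss suspected here: a float has a 24-bit significand, so casting a 32-bit value above 2^24 to float can round it, and the interpolated range is then only approximate. The sketch below is not the attached patch and not code from e2fsprogs. float_rounds() and interp_mid() are hypothetical helpers that show the rounding and one way to compute an interpolated index with integer arithmetic, clamped so it always stays inside [low, high].

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if converting v to float changes its value.
 * The volatile store keeps the value at float precision before comparing. */
static int float_rounds(uint32_t v)
{
    volatile float f = (float)v;
    return (double)f != (double)v;
}

/* Hypothetical helper, not from e2fsprogs: pick a probe index in [low, high]
 * for key, given the values stored at both ends, using 64-bit integer
 * arithmetic instead of float. */
static int interp_mid(int low, int high,
                      uint32_t lowval, uint32_t highval, uint32_t key)
{
    assert(low <= high);

    /* Key at or below the low end, or a degenerate value range:
     * probe the low end (this also avoids dividing by zero). */
    if (key <= lowval || highval <= lowval)
        return low;
    /* Key at or above the high end: probe the high end. */
    if (key >= highval)
        return high;

    /* Here lowval < key < highval, so the quotient is below (high - low). */
    uint64_t off = (uint64_t)(key - lowval) * (uint64_t)(high - low)
                   / (uint64_t)(highval - lowval);
    int mid = low + (int)off;

    /* Defensive clamp: never index outside the searched window. */
    if (mid < low)
        mid = low;
    if (mid > high)
        mid = high;
    return mid;
}
```

For example, float_rounds(16777217u) is true because 2^24 + 1 has no exact float representation, while interp_mid(0, 10, 100, 200, 150) returns 5. Whether the customer's crash came from this exact rounding path is not established in this report.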

Comment 2 Lachlan McIlroy 2010-06-25 01:50:09 UTC

Created attachment 426741 [details]
Patch that should fix the problem

This patch was sourced from:
http://git.kernel.org/?p=fs/ext2/e2fsprogs.git;a=commitdiff;h=641b66bc7ee0a880b0eb0125dff5f8ed8dd5a160

Comment 5 Lachlan McIlroy 2010-06-30 02:00:26 UTC

Created attachment 427842 [details]
Patch to prevent floating point precision errors

This version of the patch fixes two more cases of the same bug.

The first patch was tested by the customer and it allowed e2fsck to run a bit further before hitting the same bug in another bit of code:

Program received signal SIGSEGV, Segmentation fault.
0x0000000000413f0f in get_refcount_el (refcount=0x18f4d730, blk=12354051, create=0) at ea_refcount.c:202
202                     if (blk == refcount->list[mid].ea_blk) {

This patch fixes this case and also a third case found by inspection.

Comment 6 Eric Sandeen 2010-07-05 16:34:25 UTC

Lachlan, thanks for the patch and the upstream submission.

Comment 9 RHEL Program Management 2010-08-09 19:44:20 UTC

This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated in the
current release, Red Hat is unfortunately unable to address this
request at this time. Red Hat invites you to ask your support
representative to propose this request, if appropriate and relevant,
in the next release of Red Hat Enterprise Linux.

Comment 13 Eric Sandeen 2011-01-28 00:10:37 UTC

Built & tagged in e2fsprogs-1.39-27.el5

Comment 18 errata-xmlrpc 2011-07-21 09:05:50 UTC

An advisory has been issued which should resolve the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-1080.html

Comment 19 errata-xmlrpc 2011-07-21 12:38:46 UTC

An advisory has been issued which should resolve the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-1080.html

Comment 20 Soham Chakraborty 2013-10-02 11:12:47 UTC

*** Bug 1005192 has been marked as a duplicate of this bug. ***
