Bug 893904

Summary: xfs_repair segfaults when using ag_stride option.
Product: Red Hat Enterprise Linux 6
Reporter: Steven Haigh <netwiz>
Component: xfsprogs
Assignee: Eric Sandeen <esandeen>
Status: CLOSED ERRATA
QA Contact: Boris Ranto <branto>
Severity: medium
Priority: unspecified
Version: 6.3
CC: branto, dchinner, eguan, esandeen, fharshav, lnovich
Target Milestone: rc
Hardware: All
OS: Linux
Fixed In Version: xfsprogs-3.1.1-11.el6
Doc Type: Bug Fix
Last Closed: 2013-11-21 21:19:21 UTC
Type: Bug
Bug Blocks: 1020438

Description Steven Haigh 2013-01-10 07:49:35 UTC
Description of problem:
xfs_repair segfaults partway through Phase 2 when run with the ag_stride option, which was used to try to speed up the repair.

Version-Release number of selected component (if applicable):
[root@zeus ~]# rpm -qa | grep xfs
xfsprogs-3.1.1-7.el6.x86_64
[root@zeus ~]#

Steps to Reproduce:
[root@zeus ~]# xfs_repair -o bhash=16384 -o ihash=16384 -o ag_stride=16 -t 60 /dev/xvdb
Phase 1 - find and verify superblock...
        - reporting progress in intervals of 1 minute
Phase 2 - using internal log
        - zero log...
XFS: totally zeroed log
Segmentation fault
[root@zeus ~]#

Additional info:
The latest version of xfsprogs (3.1.8) is known to fix this problem.

See:
http://comments.gmane.org/gmane.comp.file-systems.xfs.general/43191
http://blog.jcuff.net/2012/04/turn-xfsrepair-up-to-eleven-with.html

Updating xfsprogs would probably fix a whole lot of issues, as well as increase performance of actual repairs.

Comment 2 Eric Sandeen 2013-01-10 15:33:50 UTC
commit 356357a446424c54851608bb7cebce6cead14590
Author: Christoph Hellwig <hch>
Date:   Fri Mar 2 08:35:08 2012 +0000

    repair: fix incorrect use of thread local data in dir and attr code
    
    The attribute and dirv1 code use pthread thread local data incorrectly in
    a few places, which will make them fail in horrible ways when using the
    ag_stride options.
    
    Replace the use of thread local data with simple local allocations given
    that there is no need to micro-optimize these allocations as much
    as e.g. the extent map.  The added benefit is that we have to allocate
    less memory, and can free it quickly.
    
    Reviewed-by: Dave Chinner <dchinner>
    Reported-by: Tom Crane <T.Crane.uk>
    Tested-by: Tom Crane <T.Crane.uk>
    Signed-off-by: Christoph Hellwig <hch>
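
The pattern the commit message describes, replacing per-thread cached buffers with plain local allocations, can be sketched roughly as follows. This is not the xfs_repair code: `process_ag`, `run_workers`, `struct ag_work`, `SCRATCH_SIZE` and `NUM_WORKERS` are hypothetical names made up for illustration, and the "work" is a stand-in checksum. The only point is that each call allocates and frees its own scratch buffer, so nothing depends on which thread runs it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

#define SCRATCH_SIZE 4096   /* hypothetical scratch-buffer size */
#define NUM_WORKERS  4      /* hypothetical number of parallel workers */

/* Hypothetical per-AG work item; not an xfs_repair structure. */
struct ag_work {
    int agno;       /* allocation group number handled by this worker */
    int checksum;   /* result written by the worker */
};

/*
 * Process one AG with a scratch buffer that is allocated and freed
 * locally. No pthread key or other per-thread state is involved, so
 * the function behaves the same on any thread.
 */
static int process_ag(int agno)
{
    unsigned char *scratch;
    int sum = 0;
    size_t i;

    assert(agno >= 0);

    scratch = calloc(1, SCRATCH_SIZE);
    if (scratch == NULL)
        return -1;

    /* Stand-in for real work: fill the buffer and sum its bytes. */
    memset(scratch, agno & 0xff, SCRATCH_SIZE);
    for (i = 0; i < SCRATCH_SIZE; i++)
        sum += scratch[i];

    free(scratch);  /* released immediately, nothing cached per thread */
    return sum;
}

/* Thread entry point: process the AG described by the work item. */
static void *worker(void *arg)
{
    struct ag_work *w = arg;

    w->checksum = process_ag(w->agno);
    return NULL;
}

/* Run NUM_WORKERS threads; return 0 if every result is as expected. */
static int run_workers(void)
{
    pthread_t tid[NUM_WORKERS];
    struct ag_work work[NUM_WORKERS];
    int i, started = 0, ok = 1;

    for (i = 0; i < NUM_WORKERS; i++) {
        work[i].agno = i + 1;
        work[i].checksum = -1;
        if (pthread_create(&tid[i], NULL, worker, &work[i]) != 0) {
            ok = 0;
            break;
        }
        started++;
    }

    for (i = 0; i < started; i++) {
        pthread_join(tid[i], NULL);
        if (work[i].checksum != work[i].agno * SCRATCH_SIZE)
            ok = 0;
    }
    return ok ? 0 : -1;
}
```

Calling `run_workers()` from a `main` function and checking for a return value of 0 is enough to exercise the sketch. The trade-off matches the one the commit message names: an extra allocation per call, in exchange for less memory held and no reliance on thread-local state.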

Comment 3 RHEL Program Management 2013-01-14 06:48:22 UTC
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.

Comment 21 errata-xmlrpc 2013-11-21 21:19:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1657.html