Bug 1033480

Summary: xfs_repair: fatal error -- could not iget root inode -- error - 117 when repairing broken filesystems
Product: Red Hat Enterprise Linux 7 Reporter: Eryu Guan <eguan>
Component: xfsprogs Assignee: Eric Sandeen <esandeen>
Status: CLOSED CURRENTRELEASE QA Contact: Eryu Guan <eguan>
Severity: high Docs Contact:
Priority: high    
Version: 7.0 CC: branto, dchinner, eguan, jjarvis, linuxdev-kernel-it, qcai, rwheeler
Target Milestone: rc Keywords: Regression
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: xfsprogs-3.2.0-0.6.alpha2.el7 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-06-13 09:53:40 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Bug Depends On:    
Bug Blocks: 807834    
Attachments:
Description Flags
033.full none

Description Eryu Guan 2013-11-22 08:26:41 UTC
Created attachment 827637
033.full

Description of problem:
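xfstests test xfs/033 fails. After the test zeroes the root inode, xfs_repair aborts in Phase 6 with "fatal error -- could not iget root inode -- error - 117".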

=== xfs/033.out.bad ===
QA output created by 033
meta-data=DDEV isize=XXX agcount=N, agsize=XXX blks
data     = bsize=XXX blocks=XXX, imaxpct=PCT
         = sunit=XXX swidth=XXX, unwritten=X
naming   =VERN bsize=XXX
log      =LDEV bsize=XXX blocks=XXX
realtime =RDEV extsz=XXX blocks=XXX, rtextents=XXX
Corrupting root inode - setting bits to 0
Wrote X.XXKb (value 0x0)
Phase 1 - find and verify superblock...
Phase 2 - using <TYPEOF> log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
bad magic number 0x0 on inode INO
bad version number 0x0 on inode INO
bad magic number 0x0 on inode INO, resetting magic number
bad version number 0x0 on inode INO, resetting version number
imap claims a free inode INO is in use, correcting imap and clearing inode
cleared root inode INO
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
root inode lost
        - check for inodes claiming duplicate blocks...
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
reinitializing root directory
xfs_imap_to_bp: xfs_trans_read_buf() returned error 117.
cache_node_purge: refcount was 1, not zero (node=0x1e3c220)

fatal error -- could not iget root inode -- error - 117
_check_xfs_filesystem: filesystem on /dev/sda6 is inconsistent (c) (see /var/lib/xfstests/results//xfs/033.full)
_check_xfs_filesystem: filesystem on /dev/sda6 is inconsistent (r) (see /var/lib/xfstests/results//xfs/033.full)
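
For readers unfamiliar with the number in the fatal message: on Linux, errno 117 is EUCLEAN ("Structure needs cleaning"), which XFS uses to report corrupted metadata. The sketch below only pulls the code out of the log line and names it. The function name and the one-entry lookup table are illustrative and not part of xfsprogs.

```python
import re

# Assumption: Linux errno numbering, where 117 is EUCLEAN
# ("Structure needs cleaning"). Other platforms may number it differently.
LINUX_ERRNO_NAMES = {117: "EUCLEAN"}

# Matches lines such as:
#   fatal error -- could not iget root inode -- error - 117
FATAL_RE = re.compile(r"fatal error -- (?P<what>.+?) -- error - (?P<code>\d+)")


def decode_fatal(line):
    """Return (message, errno number, symbolic name or None), or None if no match."""
    m = FATAL_RE.search(line)
    if m is None:
        return None
    code = int(m.group("code"))
    return m.group("what"), code, LINUX_ERRNO_NAMES.get(code)


print(decode_fatal("fatal error -- could not iget root inode -- error - 117"))
```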

Version-Release number of selected component (if applicable):
xfsprogs-3.2.0-0.1.alpha1.el7

How reproducible:
always

Steps to Reproduce:
1. Run xfstests test xfs/033 (check xfs/033) on an XFS filesystem

Actual results:
The test fails: xfs_repair exits with the fatal error shown above, and the filesystem is left inconsistent.

Expected results:
test passes

Comment 4 Eryu Guan 2014-01-10 13:19:02 UTC
xfs/033 still fails with xfsprogs-3.2.0-0.4.alpha2.el7, but with a different error message.

*** xfs_repair -n output ***
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
xfs_inode_buf_verify: XFS_CORRUPTION_ERROR
bad magic number 0x0 on inode 32
bad version number 0x0 on inode 32
bad magic number 0x0 on inode 32, would reset magic number
bad version number 0x0 on inode 32, would reset version number
imap claims a free inode 32 is in use, would correct imap and clear inode
would clear root inode 32
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
root inode would be lost
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
bad magic number 0x0 on inode 32, would reset magic number
bad version number 0x0 on inode 32, would reset version number
        - agno = 2
        - agno = 3
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
would reinitialize root directory
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
No modify flag set, skipping filesystem flush and exiting.
*** end xfs_repair output
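
In no-modify mode (-n), xfs_repair only reports what it would change, so the lines containing "would" are the ones that describe pending repairs. A minimal sketch that filters them out of a captured log follows. The helper name is hypothetical, and the sample lines are copied from the output above.

```python
def pending_repairs(log_text):
    """Collect the distinct 'would ...' lines from xfs_repair -n output, in order."""
    seen = []
    for raw in log_text.splitlines():
        line = raw.strip()
        # Keep each reported action once, even if repair prints it in two phases.
        if "would " in line and line not in seen:
            seen.append(line)
    return seen


SAMPLE = """\
bad magic number 0x0 on inode 32
bad magic number 0x0 on inode 32, would reset magic number
imap claims a free inode 32 is in use, would correct imap and clear inode
would clear root inode 32
root inode would be lost
bad magic number 0x0 on inode 32, would reset magic number
would reinitialize root directory
"""

for line in pending_repairs(SAMPLE):
    print(line)
```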

Comment 5 Eryu Guan 2014-01-17 07:37:01 UTC
*** Bug 1054636 has been marked as a duplicate of this bug. ***

Comment 7 Eric Sandeen 2014-02-19 21:59:26 UTC
commit dd9093de944cd802427bd42953ad5ccc1d5fb875
Author: Dave Chinner <dchinner>
Date:   Mon Feb 3 11:55:36 2014 +1100

    xfs_repair: fix discontiguous directory block support
    
seems to have fixed this, though I'm not quite sure why.

Comment 8 Eric Sandeen 2014-02-21 18:29:15 UTC
Ok, that commit only masked the problem.  Dave sent 2 patches upstream which should fix this:

[PATCH 1/2] libxfs: contiguous buffers are not discontigous
[PATCH 2/2] libxfs: clear stale buffer errors on write
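
Judging only from the second patch title, the failure mode may be that a cached buffer keeps the error recorded by an earlier failed read even after valid contents are written to it, so a later cache hit still reports error 117. The toy model below illustrates that idea under this assumption. It is not libxfs code, and every class and function name in it is made up for illustration.

```python
EFSCORRUPTED = 117  # EUCLEAN on Linux, the code seen in the fatal error message


class Buf:
    """Toy cached buffer: holds data plus the error from its last I/O."""

    def __init__(self):
        self.data = b"\x00" * 4  # zeroed contents, standing in for the wiped inode
        self.error = 0


def read_buf(buf):
    """Verify on read; record an error if the contents look corrupt."""
    if buf.data == b"\x00" * 4:
        buf.error = EFSCORRUPTED
    return buf.error


def write_buf(buf, data, clear_stale_error):
    """Store new contents; optionally reset the error left by an earlier read."""
    buf.data = data
    if clear_stale_error:
        buf.error = 0


def cached_lookup(buf):
    """A cache hit returns the buffer's recorded error without re-verifying."""
    return buf.error


def scenario(clear_stale_error):
    buf = Buf()
    read_buf(buf)                                   # early read fails verification
    write_buf(buf, b"GOOD", clear_stale_error)      # contents are rebuilt later
    return cached_lookup(buf)                       # later lookup of the same buffer


print(scenario(False))  # stale error survives the write
print(scenario(True))   # error cleared on write
```

In this model the only difference between the failing and passing runs is whether the write path resets the recorded error, which is the behavior the patch title describes.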

Comment 9 Eric Sandeen 2014-02-21 19:02:22 UTC
*** Bug 1059250 has been marked as a duplicate of this bug. ***

Comment 11 Eryu Guan 2014-02-25 03:13:13 UTC
Confirmed that xfs/033 passes with xfsprogs-3.2.0-0.6.alpha2.el7.

Set to VERIFIED.

Comment 12 Ludek Smid 2014-06-13 09:53:40 UTC
This request was resolved in Red Hat Enterprise Linux 7.0.

Contact your manager or support representative in case you have further questions about the request.