Bug 510823 - "Structure needs cleaning" when reading xfs_fsr'd files from an XFS partition (extent count for ino XYZ data fork too low (6) for file format)
Summary: "Structure needs cleaning" when reading xfs_fsr'd files from an XFS partitio...
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 13
Hardware: x86_64
OS: Linux
low
medium
Target Milestone: ---
Assignee: Eric Sandeen
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2009-07-11 01:22 UTC by Cristian Ciupitu
Modified: 2010-11-01 22:17 UTC (History)
6 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-11-01 22:17:34 UTC
Type: ---
Embargoed:


Attachments (Terms of Use)
xfs_metadump /dev/mapper/luks-Videos /tmp/luks-Videos.xfs_metadump (compressed with lzma) (634.84 KB, application/octet-stream)
2009-07-11 01:22 UTC, Cristian Ciupitu
no flags Details
Big hammer patch (567 bytes, patch)
2009-07-11 04:43 UTC, Eric Sandeen
no flags Details | Diff
Repair output w/ patch on xfsprogs-3.0.1 (4.02 KB, text/plain)
2009-07-11 04:44 UTC, Eric Sandeen
no flags Details
Revised patch to print the correct inode nr (568 bytes, application/force-download)
2009-07-11 04:53 UTC, Eric Sandeen
no flags Details
Revised patch to only clear problematic inodes! (566 bytes, patch)
2009-07-11 05:05 UTC, Eric Sandeen
no flags Details | Diff
xfs_metadump /dev/mapper/luks-Music /tmp/luks-Music.xfs_metadump (compressed with lzma) (4.82 MB, application/octet-stream)
2009-12-25 23:36 UTC, Cristian Ciupitu
no flags Details
xfs_metadump /dev/sda3 /tmp/sda3.xfs_metadump (compressed with lzma) (5.04 MB, application/octet-stream)
2009-12-30 05:58 UTC, Cristian Ciupitu
no flags Details
xfs_metadump /dev/mapper/luks-Videos /tmp/luks-Videos.xfs_metadump (compressed with xz) (5.26 MB, application/octet-stream)
2010-03-28 21:52 UTC, Cristian Ciupitu
no flags Details

Description Cristian Ciupitu 2009-07-11 01:22:56 UTC
Created attachment 351316 [details]
xfs_metadump /dev/mapper/luks-Videos /tmp/luks-Videos.xfs_metadump (compressed with lzma)

Description of problem:
When I'm trying to read some files located on a XFS partition, I'm getting a "Structure needs cleaning" error. I have run:
# xfs_check /dev/mapper/luks-Videos:
extent count for ino 2339503222 data fork too low (6) for file format
extent count for ino 2339503223 data fork too low (6) for file format
extent count for ino 2775216080 data fork too low (6) for file format

and
# xfs_repair -n /dev/mapper/luks-Videos
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
No modify flag set, skipping filesystem flush and exiting.

but the error still persists.


Version-Release number of selected component (if applicable):
kernel-2.6.29.5-191.fc11.x86_64
xfsprogs-3.0.0-2.fc11.x86_64


How reproducible:
Only this time.


Steps to Reproduce:
I have no idea how to reproduce. When the file system got corrupted I was changing the metadata (SELinux context, owner, permission) of the files. I don't remember exactly which one I was changing, but it might have been the permissions. The commands that I have used are: 
# chcon -R -t public_content_t /mnt/Videos
# chown -R root.root /mnt/Videos
# find /mnt/Videos -type f -exec chmod 444 {} \; -o -type d -exec chmod 755 {} \;
The strange thing is that I have also run xfs_fsr before this metadata change and there were no errors.

Run "find /mnt/Videos -inum 2339503222 -o -inum 2339503223 -o -inum 2775216080" on the filesystem. You can use the attached metadata dump.

  
Actual results:
find: `....': Structure needs cleaning


Expected results:
No errors.


Additional info:
I have attached a LZMA compressed metadump of the partition.

Kernel errors from /var/log/messages:
Jul 10 23:22:45 hermes kernel: Filesystem "dm-11": corrupt inode 2339503222 (btree).  Unmount and run xfs_repair.
Jul 10 23:22:45 hermes kernel: Filesystem "dm-11": XFS internal error xfs_iformat_btree at line 625 of file fs/xfs/xfs_inode.c.  Caller 0xffffffffa0b1320c
Jul 10 23:22:45 hermes kernel:
Jul 10 23:22:45 hermes kernel: Pid: 9304, comm: find Tainted: P           2.6.29.5-191.fc11.x86_64 #1
Jul 10 23:22:45 hermes kernel: Call Trace:
Jul 10 23:22:45 hermes kernel: [<ffffffffa0b0aacc>] xfs_error_report+0x41/0x43 [xfs]
Jul 10 23:22:45 hermes kernel: [<ffffffffa0b1320c>] ? xfs_iformat+0x2bb/0x451 [xfs]
Jul 10 23:22:45 hermes kernel: [<ffffffffa0b12eb6>] xfs_iformat_btree+0x114/0x1af [xfs]
Jul 10 23:22:45 hermes kernel: [<ffffffffa0b1320c>] xfs_iformat+0x2bb/0x451 [xfs]
Jul 10 23:22:45 hermes kernel: [<ffffffffa0b1344e>] xfs_iread+0xac/0x16e [xfs]
Jul 10 23:22:45 hermes kernel: [<ffffffffa0b0f0c5>] xfs_iget+0x324/0x4ab [xfs]
Jul 10 23:22:45 hermes kernel: [<ffffffffa0b2684d>] xfs_lookup+0x82/0xb1 [xfs]
Jul 10 23:22:45 hermes kernel: [<ffffffffa0b2f4c5>] xfs_vn_lookup+0x45/0x84 [xfs]
Jul 10 23:22:45 hermes kernel: [<ffffffff810dcb7c>] do_lookup+0xcf/0x165
Jul 10 23:22:45 hermes kernel: [<ffffffff810dd753>] __link_path_walk+0x5d0/0x707
Jul 10 23:22:45 hermes kernel: [<ffffffff810ddc88>] path_walk+0x4c/0x8f
Jul 10 23:22:45 hermes kernel: [<ffffffff810deb90>] do_path_lookup+0x10e/0x169
Jul 10 23:22:45 hermes kernel: [<ffffffff810df65b>] ? getname+0x15a/0x1c2
Jul 10 23:22:45 hermes kernel: [<ffffffff810e00af>] user_path_at+0x56/0x93
Jul 10 23:22:45 hermes kernel: [<ffffffff810cb808>] ? virt_to_head_page+0xe/0x31
Jul 10 23:22:45 hermes kernel: [<ffffffff810e9e5c>] ? mntput_no_expire+0x36/0x150
Jul 10 23:22:45 hermes kernel: [<ffffffff810d8787>] vfs_lstat_fd+0x21/0x4c
Jul 10 23:22:45 hermes kernel: [<ffffffff810d6124>] ? __fput+0x191/0x1a0
Jul 10 23:22:45 hermes kernel: [<ffffffff810d88ca>] sys_newfstatat+0x2c/0x4c
Jul 10 23:22:45 hermes kernel: [<ffffffff810dc928>] ? path_put+0x22/0x26
Jul 10 23:22:45 hermes kernel: [<ffffffff810882ea>] ? audit_syscall_entry+0x11e/0x14a
Jul 10 23:22:45 hermes kernel: [<ffffffff813aba79>] ? trace_hardirqs_on_thunk+0x3a/0x3c
Jul 10 23:22:45 hermes kernel: [<ffffffff8101133a>] system_call_fastpath+0x16/0x1b

xfs_db info:
# for i in 2339503222 2339503223 2775216080; do xfs_db -c "inode $i" -c "print" /dev/mapper/luks-Videos ; done > /tmp/xfs_db
core.magic = 0x494e
core.mode = 0100664
core.version = 2
core.format = 3 (btree)
core.nlinkv2 = 1
core.onlink = 0
core.projid = 0
core.uid = 500
core.gid = 0
core.flushiter = 25
core.atime.sec = Sun May 31 11:24:37 2009
core.atime.nsec = 000000000
core.mtime.sec = Fri May 29 02:35:25 2009
core.mtime.nsec = 000000000
core.ctime.sec = Fri Jul 10 22:36:02 2009
core.ctime.nsec = 727133615
core.size = 2575625124
core.nblocks = 628816
core.extsize = 0
core.nextents = 6
core.naextents = 0
core.forkoff = 12
core.aformat = 1 (local)
core.dmevmask = 0
core.dmstate = 0
core.newrtbm = 0
core.prealloc = 0
core.realtime = 0
core.immutable = 0
core.append = 0
core.sync = 0
core.noatime = 0
core.nodump = 0
core.rtinherit = 0
core.projinherit = 0
core.nosymlinks = 0
core.extsz = 0
core.extszinherit = 0
core.nodefrag = 0
core.filestream = 0
core.gen = 2683712888
next_unlinked = null
u.bmbt.level = 1
u.bmbt.numrecs = 1
u.bmbt.keys[1] = [startoff] 1:[0]
u.bmbt.ptrs[1] = 1:105179562
a.sfattr.hdr.totsize = 56
a.sfattr.hdr.count = 1
a.sfattr.list[0].namelen = 7
a.sfattr.list[0].valuelen = 42
a.sfattr.list[0].root = 0
a.sfattr.list[0].secure = 1
a.sfattr.list[0].name = "selinux"
a.sfattr.list[0].value = "unconfined_u:object_r:public_content_t:s0\000"

core.magic = 0x494e
core.mode = 0100664
core.version = 2
core.format = 3 (btree)
core.nlinkv2 = 1
core.onlink = 0
core.projid = 0
core.uid = 500
core.gid = 0
core.flushiter = 17
core.atime.sec = Fri Jul 10 22:08:52 2009
core.atime.nsec = 000000000
core.mtime.sec = Sun Jun  7 18:04:15 2009
core.mtime.nsec = 000000000
core.ctime.sec = Fri Jul 10 22:38:07 2009
core.ctime.nsec = 502009843
core.size = 2575881497
core.nblocks = 628879
core.extsize = 0
core.nextents = 6
core.naextents = 0
core.forkoff = 12
core.aformat = 1 (local)
core.dmevmask = 0
core.dmstate = 0
core.newrtbm = 0
core.prealloc = 0
core.realtime = 0
core.immutable = 0
core.append = 0
core.sync = 0
core.noatime = 0
core.nodump = 0
core.rtinherit = 0
core.projinherit = 0
core.nosymlinks = 0
core.extsz = 0
core.extszinherit = 0
core.nodefrag = 0
core.filestream = 0
core.gen = 2683712881
next_unlinked = null
u.bmbt.level = 1
u.bmbt.numrecs = 1
u.bmbt.keys[1] = [startoff] 1:[0]
u.bmbt.ptrs[1] = 1:107705356
a.sfattr.hdr.totsize = 56
a.sfattr.hdr.count = 1
a.sfattr.list[0].namelen = 7
a.sfattr.list[0].valuelen = 42
a.sfattr.list[0].root = 0
a.sfattr.list[0].secure = 1
a.sfattr.list[0].name = "selinux"
a.sfattr.list[0].value = "unconfined_u:object_r:public_content_t:s0\000"

core.magic = 0x494e
core.mode = 0100664
core.version = 2
core.format = 3 (btree)
core.nlinkv2 = 1
core.onlink = 0
core.projid = 0
core.uid = 500
core.gid = 0
core.flushiter = 21
core.atime.sec = Fri Jul 10 21:57:54 2009
core.atime.nsec = 000000000
core.mtime.sec = Sun Jun  7 17:59:08 2009
core.mtime.nsec = 000000000
core.ctime.sec = Fri Jul 10 22:39:55 2009
core.ctime.nsec = 781008470
core.size = 2423492915
core.nblocks = 591675
core.extsize = 0
core.nextents = 6
core.naextents = 0
core.forkoff = 12
core.aformat = 1 (local)
core.dmevmask = 0
core.dmstate = 0
core.newrtbm = 0
core.prealloc = 0
core.realtime = 0
core.immutable = 0
core.append = 0
core.sync = 0
core.noatime = 0
core.nodump = 0
core.rtinherit = 0
core.projinherit = 0
core.nosymlinks = 0
core.extsz = 0
core.extszinherit = 0
core.nodefrag = 0
core.filestream = 0
core.gen = 537204218
next_unlinked = null
u.bmbt.level = 1
u.bmbt.numrecs = 1
u.bmbt.keys[1] = [startoff] 1:[0]
u.bmbt.ptrs[1] = 1:103075306
a.sfattr.hdr.totsize = 56
a.sfattr.hdr.count = 1
a.sfattr.list[0].namelen = 7
a.sfattr.list[0].valuelen = 42
a.sfattr.list[0].root = 0
a.sfattr.list[0].secure = 1
a.sfattr.list[0].name = "selinux"
a.sfattr.list[0].value = "unconfined_u:object_r:public_content_t:s0\000"

Comment 1 Cristian Ciupitu 2009-07-11 01:29:52 UTC
I don't know if this matters, but the filesystem was mounted with relatime,nosuid,nodev,logbsize=262144.

Comment 2 Cristian Ciupitu 2009-07-11 01:39:14 UTC
More information about the problematic inodes:
# for i in 2339503222 2339503223 2775216080; do xfs_db -c "inode $i" -c "type data" -c "print" -c "type text" -c "print" /dev/mapper/luks-Videos ; echo; done

00: 494e81b4 02030000 000001f4 00000000 00000001 00000000 00000000 00000019
20: 4a223ec5 00000000 4a1f1fbd 00000000 4a579822 2b572daf 00000000 9984eba4
40: 00000000 00099850 00000000 00000006 00000c01 00000000 00000000 9ff63578
60: ffffffff 00010001 00000000 00000000 00000000 00000000 00000000 00000000
80: 00000000 00000000 00000000 00000000 00000000 0644e9aa 00000000 00000000
a0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 003d0200
c0: 072a0473 00380100 072a0473 656c696e 7578756e 636f6e66 696e6564 5f753a6f
e0: 626a6563 745f723a 7075626c 69635f63 6f6e7465 6e745f74 3a733000 00000000
00:  49 4e 81 b4 02 03 00 00 00 00 01 f4 00 00 00 00  IN..............
10:  00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 19  ................
20:  4a 22 3e c5 00 00 00 00 4a 1f 1f bd 00 00 00 00  J.......J.......
30:  4a 57 98 22 2b 57 2d af 00 00 00 00 99 84 eb a4  JW...W..........
40:  00 00 00 00 00 09 98 50 00 00 00 00 00 00 00 06  .......P........
50:  00 00 0c 01 00 00 00 00 00 00 00 00 9f f6 35 78  ..............5x
60:  ff ff ff ff 00 01 00 01 00 00 00 00 00 00 00 00  ................
70:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
80:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
90:  00 00 00 00 06 44 e9 aa 00 00 00 00 00 00 00 00  .....D..........
a0:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
b0:  00 00 00 00 00 00 00 00 00 00 00 00 00 3d 02 00  ................
c0:  07 2a 04 73 00 38 01 00 07 2a 04 73 65 6c 69 6e  ...s.8.....selin
d0:  75 78 75 6e 63 6f 6e 66 69 6e 65 64 5f 75 3a 6f  uxunconfined.u.o
e0:  62 6a 65 63 74 5f 72 3a 70 75 62 6c 69 63 5f 63  bject.r.public.c
f0:  6f 6e 74 65 6e 74 5f 74 3a 73 30 00 00 00 00 00  ontent.t.s0.....

00: 494e81b4 02030000 000001f4 00000000 00000001 00000000 00000000 00000011
20: 4a5791c4 00000000 4a2bd6ef 00000000 4a57989f 1dec0ff3 00000000 9988d519
40: 00000000 0009988f 00000000 00000006 00000c01 00000000 00000000 9ff63571
60: ffffffff 00010001 00000000 00000000 00000000 00000000 00000000 00000000
80: 00000000 00000000 00000000 00000000 00000000 066b740c 00000000 00000000
a0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 003d0200
c0: 072a0473 00380100 072a0473 656c696e 7578756e 636f6e66 696e6564 5f753a6f
e0: 626a6563 745f723a 7075626c 69635f63 6f6e7465 6e745f74 3a733000 00000000
00:  49 4e 81 b4 02 03 00 00 00 00 01 f4 00 00 00 00  IN..............
10:  00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 11  ................
20:  4a 57 91 c4 00 00 00 00 4a 2b d6 ef 00 00 00 00  JW......J.......
30:  4a 57 98 9f 1d ec 0f f3 00 00 00 00 99 88 d5 19  JW..............
40:  00 00 00 00 00 09 98 8f 00 00 00 00 00 00 00 06  ................
50:  00 00 0c 01 00 00 00 00 00 00 00 00 9f f6 35 71  ..............5q
60:  ff ff ff ff 00 01 00 01 00 00 00 00 00 00 00 00  ................
70:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
80:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
90:  00 00 00 00 06 6b 74 0c 00 00 00 00 00 00 00 00  .....kt.........
a0:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
b0:  00 00 00 00 00 00 00 00 00 00 00 00 00 3d 02 00  ................
c0:  07 2a 04 73 00 38 01 00 07 2a 04 73 65 6c 69 6e  ...s.8.....selin
d0:  75 78 75 6e 63 6f 6e 66 69 6e 65 64 5f 75 3a 6f  uxunconfined.u.o
e0:  62 6a 65 63 74 5f 72 3a 70 75 62 6c 69 63 5f 63  bject.r.public.c
f0:  6f 6e 74 65 6e 74 5f 74 3a 73 30 00 00 00 00 00  ontent.t.s0.....

00: 494e81b4 02030000 000001f4 00000000 00000001 00000000 00000000 00000015
20: 4a578f32 00000000 4a2bd5bc 00000000 4a57990b 2e8d3e56 00000000 90739133
40: 00000000 0009073b 00000000 00000006 00000c01 00000000 00000000 200515fa
60: ffffffff 00010001 00000000 00000000 00000000 00000000 00000000 00000000
80: 00000000 00000000 00000000 00000000 00000000 0624cdea 00000000 00000000
a0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 003d0200
c0: 072a0473 00380100 072a0473 656c696e 7578756e 636f6e66 696e6564 5f753a6f
e0: 626a6563 745f723a 7075626c 69635f63 6f6e7465 6e745f74 3a733000 00000000
00:  49 4e 81 b4 02 03 00 00 00 00 01 f4 00 00 00 00  IN..............
10:  00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 15  ................
20:  4a 57 8f 32 00 00 00 00 4a 2b d5 bc 00 00 00 00  JW.2....J.......
30:  4a 57 99 0b 2e 8d 3e 56 00 00 00 00 90 73 91 33  JW.....V.....s.3
40:  00 00 00 00 00 09 07 3b 00 00 00 00 00 00 00 06  ................
50:  00 00 0c 01 00 00 00 00 00 00 00 00 20 05 15 fa  ................
60:  ff ff ff ff 00 01 00 01 00 00 00 00 00 00 00 00  ................
70:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
80:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
90:  00 00 00 00 06 24 cd ea 00 00 00 00 00 00 00 00  ................
a0:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
b0:  00 00 00 00 00 00 00 00 00 00 00 00 00 3d 02 00  ................
c0:  07 2a 04 73 00 38 01 00 07 2a 04 73 65 6c 69 6e  ...s.8.....selin
d0:  75 78 75 6e 63 6f 6e 66 69 6e 65 64 5f 75 3a 6f  uxunconfined.u.o
e0:  62 6a 65 63 74 5f 72 3a 70 75 62 6c 69 63 5f 63  bject.r.public.c
f0:  6f 6e 74 65 6e 74 5f 74 3a 73 30 00 00 00 00 00  ontent.t.s0.....

Comment 3 Cristian Ciupitu 2009-07-11 01:44:56 UTC
Yet more information about the problematic inodes (bmap):
# for i in 2339503222 2339503223 2775216080; do xfs_db -c "inode $i" -c "bmap" /dev/mapper/luks-Videos ; echo; done

data offset 0 startblock 144376995 (4/10159267) count 132941 flag 0
data offset 132941 startblock 142689524 (4/8471796) count 124096 flag 0
data offset 257037 startblock 72876027 (2/5767163) count 119047 flag 0
data offset 376084 startblock 74065920 (2/6957056) count 118696 flag 0
data offset 494780 startblock 79347078 (2/12238214) count 118367 flag 0
data offset 613147 startblock 105163894 (3/4500598) count 15668 flag 0

data offset 0 startblock 8196997 (0/8196997) count 123938 flag 0
data offset 123938 startblock 221519933 (6/20193341) count 122787 flag 0
data offset 246725 startblock 39569279 (1/6014847) count 121824 flag 0
data offset 368549 startblock 82250220 (2/15141356) count 120848 flag 0
data offset 489397 startblock 74539922 (2/7431058) count 119526 flag 0
data offset 608923 startblock 107685401 (3/7022105) count 19955 flag 0

data offset 0 startblock 252167887 (7/17286863) count 120452 flag 0
data offset 120452 startblock 74684141 (2/7575277) count 118288 flag 0
data offset 238740 startblock 47701535 (1/14147103) count 117616 flag 0
data offset 356356 startblock 75260617 (2/8151753) count 117028 flag 0
data offset 473384 startblock 41107902 (1/7553470) count 116662 flag 0
data offset 590046 startblock 103073678 (3/2410382) count 1628 flag 0

Comment 4 Eric Sandeen 2009-07-11 04:43:41 UTC
Created attachment 351324 [details]
Big hammer patch

This patch adds the same check as xfs_check did; when it finds this inode it will clear it out.  That's a bit extreme, I don't -think- that it's really corrupted pe se; if the files are critical you might try just commenting out the corruption check in the kernel, and see if you can copy off the problematic files...

I'll attach the xfs_repair output from my run w/ this patch so you can see what it did; about 8 inodes got nuked.

You could make an un-obfuscated metadump image, restore it to a -file-, run repair on that, mount it, and see what you've got, if you want to do a safe dry run.

Ideally it'd be nice to just pack the extents back into the inode body, I'll see if we can do that.  But this patch should clear out the problem inodes.

Comment 5 Eric Sandeen 2009-07-11 04:44:37 UTC
Created attachment 351325 [details]
Repair output w/ patch on xfsprogs-3.0.1

This is what repair did w/ that patch in place.

Comment 6 Eric Sandeen 2009-07-11 04:49:00 UTC
Note that patch kills off 8 inodes; I think you only found 3 problematic ones via xfs_check so perhaps it's over-zealous or just plain wrong... I'd hold off using it unless you're anxious & brave ;)

Comment 7 Eric Sandeen 2009-07-11 04:53:20 UTC
Created attachment 351326 [details]
Revised patch to print the correct inode nr

last patch was printing wrong inode nrs, oops.

Comment 8 Eric Sandeen 2009-07-11 05:05:47 UTC
Created attachment 351329 [details]
Revised patch to only clear problematic inodes!

ok 3rd time's the charm; with this patch (with the check in the right place, oops) xfs_repair finds (and clears) the same 3 problematic inodes as xfs_check does.

Comment 9 Eric Sandeen 2009-07-11 16:42:21 UTC
I've sent that patch for repair to the list, though I'd still like to know how we -got- here ...

http://oss.sgi.com/archives/xfs/2009-07/msg00103.html

-Eric

Comment 10 Cristian Ciupitu 2009-07-12 01:34:43 UTC
[just for the record]
I disabled the error checking code from the xfs kernel module and I mounted the corrupted filesystem read-only. Then I copied the 3 problematic files without any issues. I can't say for sure that their content is intact because I don't have any checksums/hashes, but a short visual inspection proved that the videos are fine.

Comment 11 Douglas Clowes 2009-10-10 02:06:57 UTC
This has also happened to me on FC10 with video files in two xfs LVMs.

The files were created from at least as early as 20090825 by mythtv - I am not sure that it couldn't have been earlier if mythtv could have removed these files. The latest was created on 20090930 with 18 files and no apparent pattern except that two bad files on the one day on different LVMs is not uncommon. There are quite a lot of files which are not affected.

My xfs/kernel update history, from yum.log, is:
Feb 27 12:13:35 Installed: kernel-2.6.27.15-170.2.24.fc10.x86_64
Feb 27 12:23:00 Installed: kernel-2.6.27.15-170.2.24.fc10.x86_64
Mar 02 07:16:26 Installed: xfsprogs-2.10.2-1.fc10.x86_64
Mar 02 07:16:26 Installed: xfsdump-2.2.48-1.fc9.x86_64
Mar 07 09:58:31 Installed: kernel-2.6.27.19-170.2.35.fc10.x86_64
Mar 07 09:58:57 Installed: kernel-2.6.27.19-170.2.35.fc10.x86_64
Apr 04 11:18:33 Installed: kernel-2.6.27.21-170.2.56.fc10.x86_64
Apr 04 11:19:27 Installed: kernel-2.6.27.21-170.2.56.fc10.x86_64
May 30 09:24:05 Installed: kernel-2.6.27.24-170.2.68.fc10.x86_64
May 30 09:24:16 Installed: kernel-2.6.27.24-170.2.68.fc10.x86_64
Jun 27 10:09:48 Installed: kernel-2.6.27.25-170.2.72.fc10.x86_64
Jun 27 10:10:01 Installed: kernel-2.6.27.25-170.2.72.fc10.x86_64
Aug 08 11:44:40 Installed: kernel-2.6.27.29-170.2.78.fc10.x86_64
Aug 08 11:45:47 Installed: kernel-2.6.27.29-170.2.78.fc10.x86_64
Aug 22 08:32:26 Installed: kernel-2.6.27.29-170.2.79.fc10.x86_64
Aug 22 08:33:22 Installed: kernel-2.6.27.29-170.2.79.fc10.x86_64
Sep 12 10:54:34 Installed: kernel-2.6.27.30-170.2.82.fc10.x86_64
Sep 12 10:55:45 Installed: kernel-2.6.27.30-170.2.82.fc10.x86_64
Oct 05 11:44:04 Installed: kernel-2.6.27.35-170.2.94.fc10.x86_64
Oct 05 11:44:57 Installed: kernel-2.6.27.35-170.2.94.fc10.x86_64

This possibly points to a change in kernel-2.6.27.29-170.2.79.fc10.x86_64 on Aug 22 as introducing a change - or maybe earlier if it wasn't triggered or the files were removed or ...

Like the OP, I also run xfs_fsr, so maybe these files could have had more extents (needing a btree) and been reduced to produce this situation, rather than being created like this. Excuse me if it just doesn't work like this since I am less than experienced here.

I suspect that, as the OP notes, disabling this check in the kernel will reveal the data because this isn't really an error - the extents exist and the data is in them, they're just not in the expected structure. If this is the case, the worst thing you could do would be to destroy the data by clearing the inode. Far better to either fix the structure or to ignore the condition as not an error, both. Just my $0.02 :)

Comment 12 Eric Sandeen 2009-10-10 03:24:27 UTC
(In reply to comment #11)

> I suspect that, as the OP notes, disabling this check in the kernel will reveal
> the data because this isn't really an error - the extents exist and the data is
> in them, they're just not in the expected structure. If this is the case, the
> worst thing you could do would be to destroy the data by clearing the inode.
> Far better to either fix the structure or to ignore the condition as not an
> error, both. Just my $0.02 :)  

It's a tough call, normally this would be a bug/corruption but I think in this particular case, there is some codepath where even though the nr. of extents has been reduced, the inode format hasn't been changed, so you're right, it's sorta-kinda valid - just not how xfs was designed.  :)

It probably would be better to -fix- the problem than to junk the inodes in xfs_repair, but IMHO the kernel should keep flagging it as an error.

I'll need to block out some time to sort out how to make repair do the right thing; in the meantime the repair patch "at least" gets the filesystem clean again (at the expense of a few files...)

FWIW I think it's unlikely that any kernel in the F10 timeframe would have introduced this; this part of xfs really hasn't changed much.  I think it's just a corner case bug in the extent handling.

Comment 13 Eric Sandeen 2009-10-10 03:48:55 UTC
Douglas, any chance that you also changed xattr metadata on these files?

Comment 14 Douglas Clowes 2009-10-10 10:40:23 UTC
Just confirming that I patched xfs_inode.c to disable the check and all of the files are fine. I have copied them all, deleted the old ones and gone back to the stock kernel. They're video files and they look as expected - no missing bits.

The patch, for those who want it is:

diff -uNrp kernel-2.6.27.orig/fs/xfs/xfs_inode.c kernel-2.6.27.new/fs/xfs/xfs_inode.c
--- kernel-2.6.27.orig/fs/xfs/xfs_inode.c	2009-10-10 12:30:11.000000000 +1100
+++ kernel-2.6.27.new/fs/xfs/xfs_inode.c	2009-10-10 14:56:34.000000000 +1100
@@ -630,8 +630,8 @@ xfs_iformat_btree(
 	 * or the number of extents is greater than the number of
 	 * blocks.
 	 */
-	if (unlikely(XFS_IFORK_NEXTENTS(ip, whichfork) <= ifp->if_ext_max
-	    || XFS_BMDR_SPACE_CALC(nrecs) >
+	if (unlikely(/*XFS_IFORK_NEXTENTS(ip, whichfork) <= ifp->if_ext_max
+	    || */XFS_BMDR_SPACE_CALC(nrecs) >
 			XFS_DFORK_SIZE(dip, ip->i_mount, whichfork)
 	    || XFS_IFORK_NEXTENTS(ip, whichfork) > ip->i_d.di_nblocks)) {
 		xfs_fs_repair_cmn_err(CE_WARN, ip->i_mount,

I have stopped running xfs_fsr for the time being.

Comment 15 Eric Sandeen 2009-10-10 14:30:41 UTC
One other thing for interested parties to note is that ifp->if_ext_max also changes;* so it seems somewhat likely that if_ext_max got modified incorrectly, leaving enough room in the inode that btree was not necessary.

* there is a sliding divider between data & attr space in the inodes w/ attr2 format

Comment 16 Eric Sandeen 2009-12-18 20:16:35 UTC
Douglas, now that you've stopped running fsr, has the problem gone away?

Thanks,
-Eric

Comment 17 Douglas Clowes 2009-12-24 13:11:35 UTC
Eric,

Since I have stopped running fsr there has not been a recurrence of the problem.

Temper that information with the knowledge that unless there was a change to precipitate the problem, it ran for quite some time before precipitating the problem above.

Regards,

Douglas

Comment 18 Eric Sandeen 2009-12-24 17:32:21 UTC
Douglas, thanks - I think there's something real here, but just haven't gotten to the bottom of it yet.

-Eric

Comment 19 Cristian Ciupitu 2009-12-25 23:08:08 UTC
The problem still exists for kernel-2.6.31.6-166.fc12.x86_64.

Comment 20 Cristian Ciupitu 2009-12-25 23:36:40 UTC
Created attachment 380368 [details]
xfs_metadump /dev/mapper/luks-Music /tmp/luks-Music.xfs_metadump (compressed with lzma)

Comment 21 Cristian Ciupitu 2009-12-25 23:43:38 UTC
Some details:

# xfs_check  /dev/mapper/luks-Music
extent count for ino 7723 data fork too low (6) for file format
agi unlinked bucket 25 is 18967193 in ag 3 (inode=421620377)
link count mismatch for inode 151967493 (name ?), nlink 2, counted 3
link count mismatch for inode 421620377 (name ?), nlink 0, counted 1


# xfs_repair -n /dev/mapper/luks-Music
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
extent count for ino 7723 data fork too low (6) for file format
bad data fork in inode 7723
would have cleared inode 7723
        - agno = 1
        - agno = 2
7fc8175fe710: Badness in key lookup (length)
bp=(bno 182497856, len 16384 bytes) key=(bno 182497856, len 8192 bytes)
        - agno = 3
        - agno = 4
        - agno = 5
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
entry "~uTorrentPartFile_668C77E8.dat" at block 0 offset 48 in directory inode 7722 references free inode 7723
	would clear inode number in entry at offset 48...
extent count for ino 7723 data fork too low (6) for file format
bad data fork in inode 7723
would have cleared inode 7723
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
entry "~uTorrentPartFile_668C77E8.dat" in directory inode 7722 points to free inode 7723, would junk entry
bad hash table for directory inode 7722 (no data entry): would rebuild
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
disconnected dir inode 421620377, would move to lost+found
Phase 7 - verify link counts...
would have reset inode 421620377 nlinks from 0 to 2
No modify flag set, skipping filesystem flush and exiting.

Comment 22 Eric Sandeen 2009-12-28 17:50:45 UTC
Cristian, are you still running xfs_fsr over this filesystem?  Trying to decide whether that may be the root cause, or related to it.

Thanks,
-Eric

Comment 23 Cristian Ciupitu 2009-12-28 19:45:53 UTC
I run xfs_fsr on all my XFS filesystems.

Comment 24 Cristian Ciupitu 2009-12-28 22:22:04 UTC
Some debug information generated with:
# for i in 7722 7723 151967493 421620377; do xfs_db -c "inode $i" -c "print" /dev/mapper/luks-Music ; echo; done

core.magic = 0x494e
core.mode = 040775
core.version = 2
core.format = 2 (extents)
core.nlinkv2 = 19
core.onlink = 0
core.projid = 0
core.uid = 500
core.gid = 500
core.flushiter = 19
core.atime.sec = Sat Dec 26 01:05:53 2009
core.atime.nsec = 160701007
core.mtime.sec = Fri Dec 25 23:20:54 2009
core.mtime.nsec = 257576994
core.ctime.sec = Sat Dec 26 01:05:53 2009
core.ctime.nsec = 170701083
core.size = 4096
core.nblocks = 1
core.extsize = 0
core.nextents = 1
core.naextents = 0
core.forkoff = 12
core.aformat = 1 (local)
core.dmevmask = 0
core.dmstate = 0
core.newrtbm = 0
core.prealloc = 0
core.realtime = 0
core.immutable = 0
core.append = 0
core.sync = 0
core.noatime = 0
core.nodump = 0
core.rtinherit = 0
core.projinherit = 0
core.nosymlinks = 0
core.extsz = 0
core.extszinherit = 0
core.nodefrag = 0
core.filestream = 0
core.gen = 1008644418
next_unlinked = null
u.bmx[0] = [startoff,startblock,blockcount,extentflag] 0:[0,516,1,0]
a.sfattr.hdr.totsize = 56
a.sfattr.hdr.count = 1
a.sfattr.list[0].namelen = 7
a.sfattr.list[0].valuelen = 42
a.sfattr.list[0].root = 0
a.sfattr.list[0].secure = 1
a.sfattr.list[0].name = "selinux"
a.sfattr.list[0].value = "unconfined_u:object_r:public_content_t:s0\000"

core.magic = 0x494e
core.mode = 0100664
core.version = 2
core.format = 3 (btree)
core.nlinkv2 = 1
core.onlink = 0
core.projid = 0
core.uid = 500
core.gid = 500
core.flushiter = 16
core.atime.sec = Fri Dec 25 22:54:56 2009
core.atime.nsec = 087577150
core.mtime.sec = Fri Dec 25 22:54:56 2009
core.mtime.nsec = 087577150
core.ctime.sec = Fri Dec 25 22:54:56 2009
core.ctime.nsec = 087577150
core.size = 7903796
core.nblocks = 1880
core.extsize = 0
core.nextents = 6
core.naextents = 0
core.forkoff = 12
core.aformat = 1 (local)
core.dmevmask = 0
core.dmstate = 0
core.newrtbm = 0
core.prealloc = 0
core.realtime = 0
core.immutable = 0
core.append = 0
core.sync = 0
core.noatime = 0
core.nodump = 0
core.rtinherit = 0
core.projinherit = 0
core.nosymlinks = 0
core.extsz = 0
core.extszinherit = 0
core.nodefrag = 0
core.filestream = 0
core.gen = 1008644418
next_unlinked = null
u.bmbt.level = 1
u.bmbt.numrecs = 1
u.bmbt.keys[1] = [startoff] 1:[0]
u.bmbt.ptrs[1] = 1:27403191
a.sfattr.hdr.totsize = 56
a.sfattr.hdr.count = 1
a.sfattr.list[0].namelen = 7
a.sfattr.list[0].valuelen = 42
a.sfattr.list[0].root = 0
a.sfattr.list[0].secure = 1
a.sfattr.list[0].name = "selinux"
a.sfattr.list[0].value = "unconfined_u:object_r:public_content_t:s0\000"

core.magic = 0x494e
core.mode = 040755
core.version = 2
core.format = 1 (local)
core.nlinkv2 = 2
core.onlink = 0
core.projid = 0
core.uid = 500
core.gid = 500
core.flushiter = 28
core.atime.sec = Sat Dec 26 01:05:53 2009
core.atime.nsec = 066700918
core.mtime.sec = Tue Nov 17 03:50:26 2009
core.mtime.nsec = 475625881
core.ctime.sec = Sat Dec 26 01:05:53 2009
core.ctime.nsec = 066700918
core.size = 72
core.nblocks = 0
core.extsize = 0
core.nextents = 0
core.naextents = 0
core.forkoff = 12
core.aformat = 1 (local)
core.dmevmask = 0
core.dmstate = 0
core.newrtbm = 0
core.prealloc = 0
core.realtime = 0
core.immutable = 0
core.append = 0
core.sync = 0
core.noatime = 0
core.nodump = 0
core.rtinherit = 0
core.projinherit = 0
core.nosymlinks = 0
core.extsz = 0
core.extszinherit = 0
core.nodefrag = 0
core.filestream = 0
core.gen = 3501057926
next_unlinked = null
u.sfdir2.hdr.count = 2
u.sfdir2.hdr.i8count = 0
u.sfdir2.hdr.parent.i4 = 151108920
u.sfdir2.list[0].namelen = 21
u.sfdir2.list[0].offset = 0x60
u.sfdir2.list[0].name = "Desperado - Theme.mp3"
u.sfdir2.list[0].inumber.i4 = 151967494
u.sfdir2.list[1].namelen = 31
u.sfdir2.list[1].offset = 0x80
u.sfdir2.list[1].name = "Shakespeare in Love - Theme.mp3"
u.sfdir2.list[1].inumber.i4 = 151967495
a.sfattr.hdr.totsize = 56
a.sfattr.hdr.count = 1
a.sfattr.list[0].namelen = 7
a.sfattr.list[0].valuelen = 42
a.sfattr.list[0].root = 0
a.sfattr.list[0].secure = 1
a.sfattr.list[0].name = "selinux"
a.sfattr.list[0].value = "unconfined_u:object_r:public_content_t:s0\000"

core.magic = 0x494e
core.mode = 040755
core.version = 2
core.format = 1 (local)
core.nlinkv2 = 0
core.onlink = 0
core.projid = 0
core.uid = 500
core.gid = 500
core.flushiter = 26
core.atime.sec = Tue Nov 17 03:50:23 2009
core.atime.nsec = 856625848
core.mtime.sec = Tue Nov 17 03:50:23 2009
core.mtime.nsec = 842626205
core.ctime.sec = Tue Nov 17 03:50:26 2009
core.ctime.nsec = 475625881
core.size = 6
core.nblocks = 0
core.extsize = 0
core.nextents = 0
core.naextents = 0
core.forkoff = 12
core.aformat = 1 (local)
core.dmevmask = 0
core.dmstate = 0
core.newrtbm = 0
core.prealloc = 0
core.realtime = 0
core.immutable = 0
core.append = 0
core.sync = 0
core.noatime = 0
core.nodump = 0
core.rtinherit = 0
core.projinherit = 0
core.nosymlinks = 0
core.extsz = 0
core.extszinherit = 0
core.nodefrag = 0
core.filestream = 0
core.gen = 871547417
next_unlinked = null
u.sfdir2.hdr.count = 0
u.sfdir2.hdr.i8count = 0
u.sfdir2.hdr.parent.i4 = 151967493
a.sfattr.hdr.totsize = 56
a.sfattr.hdr.count = 1
a.sfattr.list[0].namelen = 7
a.sfattr.list[0].valuelen = 42
a.sfattr.list[0].root = 0
a.sfattr.list[0].secure = 1
a.sfattr.list[0].name = "selinux"
a.sfattr.list[0].value = "unconfined_u:object_r:public_content_t:s0\000"

Comment 25 Cristian Ciupitu 2009-12-30 05:58:42 UTC
Created attachment 380895 [details]
xfs_metadump /dev/sda3 /tmp/sda3.xfs_metadump (compressed with lzma)

Comment 26 Cristian Ciupitu 2009-12-30 06:13:01 UTC
I also have this problem with /dev/sda3.

# xfs_check /dev/sda3
extent count for ino 407239594 data fork too low (7) for file format

# xfs_repair -n /dev/sda3
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
extent count for ino 407239594 data fork too low (7) for file format
bad data fork in inode 407239594
would have cleared inode 407239594
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
entry "[CENSORED].avi" at block 1 offset 208 in directory inode 337915797 references free inode 407239594
	would clear inode number in entry at offset 208...
extent count for ino 407239594 data fork too low (7) for file format
bad data fork in inode 407239594
would have cleared inode 407239594
        - agno = 2
        - agno = 3
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
entry "[CENSORED].avi" in directory inode 337915797 points to free inode 407239594, would junk entry
bad hash table for directory inode 337915797 (no data entry): would rebuild
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
No modify flag set, skipping filesystem flush and exiting.

# for i in 407239594 337915797; do  xfs_db -c "inode $i" -c "print" /dev/sda3 -c "type data" -c "print" -c "type text" -c "print"; echo; done
core.magic = 0x494e
core.mode = 0100664
core.version = 2
core.format = 3 (btree)
core.nlinkv2 = 1
core.onlink = 0
core.projid = 0
core.uid = 500
core.gid = 500
core.flushiter = 7
core.atime.sec = Mon Dec 14 02:01:26 2009
core.atime.nsec = 347739000
core.mtime.sec = Thu Nov 19 02:32:45 2009
core.mtime.nsec = 344750000
core.ctime.sec = Wed Dec 23 05:26:10 2009
core.ctime.nsec = 173022406
core.size = 732743680
core.nblocks = 178895
core.extsize = 0
core.nextents = 7
core.naextents = 1
core.forkoff = 15
core.aformat = 2 (extents)
core.dmevmask = 0
core.dmstate = 0
core.newrtbm = 0
core.prealloc = 0
core.realtime = 0
core.immutable = 0
core.append = 0
core.sync = 0
core.noatime = 0
core.nodump = 0
core.rtinherit = 0
core.projinherit = 0
core.nosymlinks = 0
core.extsz = 0
core.extszinherit = 0
core.nodefrag = 0
core.filestream = 0
core.gen = 2666163675
next_unlinked = null
u.bmbt.level = 1
u.bmbt.numrecs = 1
u.bmbt.keys[1] = [startoff] 1:[0]
u.bmbt.ptrs[1] = 1:5752896
a.bmx[0] = [startoff,startblock,blockcount,extentflag] 0:[0,25452469,1,0]
00: 494e81b4 02030000 000001f4 000001f4 00000001 00000000 00000000 00000007
20: 4b258056 14ba1378 4b04922d 148c77b0 4b318dd2 0a501cc6 00000000 2bacc800
40: 00000000 0002bacf 00000000 00000007 00010f02 00000000 00000000 9eea6ddb
60: ffffffff 00010001 00000000 00000000 8fc05c74 00000000 00b8e800 00000b00
80: 2f4045cc 00000000 01448000 00002b97 f4e05216 00000000 01e8ac00 000031be
a0: 00000000 0057c840 027b4000 00006cd2 63404f80 00000000 031a4000 00006e43
c0: 67e03b60 00000000 00000000 00330100 07250473 656c696e 7578756e 00000000
e0: 00000000 0000308b f6a00001 745f723a 75736572 5f686f6d 655f743a 73300000
00:  49 4e 81 b4 02 03 00 00 00 00 01 f4 00 00 01 f4  IN..............
10:  00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 07  ................
20:  4b 25 80 56 14 ba 13 78 4b 04 92 2d 14 8c 77 b0  K..V...xK.....w.
30:  4b 31 8d d2 0a 50 1c c6 00 00 00 00 2b ac c8 00  K1...P..........
40:  00 00 00 00 00 02 ba cf 00 00 00 00 00 00 00 07  ................
50:  00 01 0f 02 00 00 00 00 00 00 00 00 9e ea 6d db  ..............m.
60:  ff ff ff ff 00 01 00 01 00 00 00 00 00 00 00 00  ................
70:  8f c0 5c 74 00 00 00 00 00 b8 e8 00 00 00 0b 00  ...t............
80:  2f 40 45 cc 00 00 00 00 01 44 80 00 00 00 2b 97  ..E......D......
90:  f4 e0 52 16 00 00 00 00 01 e8 ac 00 00 00 31 be  ..R...........1.
a0:  00 00 00 00 00 57 c8 40 02 7b 40 00 00 00 6c d2  .....W........l.
b0:  63 40 4f 80 00 00 00 00 03 1a 40 00 00 00 6e 43  c.O...........nC
c0:  67 e0 3b 60 00 00 00 00 00 00 00 00 00 33 01 00  g............3..
d0:  07 25 04 73 65 6c 69 6e 75 78 75 6e 00 00 00 00  ...selinuxun....
e0:  00 00 00 00 00 00 30 8b f6 a0 00 01 74 5f 72 3a  ......0.....t.r.
f0:  75 73 65 72 5f 68 6f 6d 65 5f 74 3a 73 30 00 00  user.home.t.s0..

core.magic = 0x494e
core.mode = 040775
core.version = 2
core.format = 2 (extents)
core.nlinkv2 = 6
core.onlink = 0
core.projid = 0
core.uid = 500
core.gid = 500
core.flushiter = 85
core.atime.sec = Wed Dec 23 05:51:45 2009
core.atime.nsec = 455223030
core.mtime.sec = Wed Dec 23 05:51:40 2009
core.mtime.nsec = 689223375
core.ctime.sec = Wed Dec 30 07:28:37 2009
core.ctime.nsec = 973371696
core.size = 8192
core.nblocks = 3
core.extsize = 0
core.nextents = 3
core.naextents = 0
core.forkoff = 12
core.aformat = 1 (local)
core.dmevmask = 0
core.dmstate = 0
core.newrtbm = 0
core.prealloc = 0
core.realtime = 0
core.immutable = 0
core.append = 0
core.sync = 0
core.noatime = 1
core.nodump = 0
core.rtinherit = 0
core.projinherit = 0
core.nosymlinks = 0
core.extsz = 0
core.extszinherit = 0
core.nodefrag = 0
core.filestream = 0
core.gen = 4119265208
next_unlinked = null
u.bmx[0-2] = [startoff,startblock,blockcount,extentflag] 0:[0,21494161,1,0] 1:[1,21494160,1,0] 2:[8388608,21494178,1,0]
a.sfattr.hdr.totsize = 56
a.sfattr.hdr.count = 1
a.sfattr.list[0].namelen = 7
a.sfattr.list[0].valuelen = 42
a.sfattr.list[0].root = 0
a.sfattr.list[0].secure = 1
a.sfattr.list[0].name = "selinux"
a.sfattr.list[0].value = "unconfined_u:object_r:public_content_t:s0\000"
00: 494e41fd 02020000 000001f4 000001f4 00000006 00000000 00000000 00000055
20: 4b3193d1 1b2226f6 4b3193cc 2914b6cf 4b3ae505 3a047930 00000000 00002000
40: 00000000 00000003 00000000 00000003 00000c01 00000000 00000040 f586ffb8
60: ffffffff 00000000 00000000 000028ff 32200001 00000000 00000200 000028ff
80: 32000001 00000001 00000000 000028ff 34400001 84782e73 66761424 2f970000
a0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
c0: 00000000 00380100 072a0473 656c696e 7578756e 636f6e66 696e6564 5f753a6f
e0: 626a6563 745f723a 7075626c 69635f63 6f6e7465 6e745f74 3a733000 00000000
00:  49 4e 41 fd 02 02 00 00 00 00 01 f4 00 00 01 f4  INA.............
10:  00 00 00 06 00 00 00 00 00 00 00 00 00 00 00 55  ...............U
20:  4b 31 93 d1 1b 22 26 f6 4b 31 93 cc 29 14 b6 cf  K1......K1......
30:  4b 3a e5 05 3a 04 79 30 00 00 00 00 00 00 20 00  K.....y0........
40:  00 00 00 00 00 00 00 03 00 00 00 00 00 00 00 03  ................
50:  00 00 0c 01 00 00 00 00 00 00 00 40 f5 86 ff b8  ................
60:  ff ff ff ff 00 00 00 00 00 00 00 00 00 00 28 ff  ................
70:  32 20 00 01 00 00 00 00 00 00 02 00 00 00 28 ff  2...............
80:  32 00 00 01 00 00 00 01 00 00 00 00 00 00 28 ff  2...............
90:  34 40 00 01 84 78 2e 73 66 76 14 24 2f 97 00 00  4....x.sfv......
a0:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
b0:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
c0:  00 00 00 00 00 38 01 00 07 2a 04 73 65 6c 69 6e  .....8.....selin
d0:  75 78 75 6e 63 6f 6e 66 69 6e 65 64 5f 75 3a 6f  uxunconfined.u.o
e0:  62 6a 65 63 74 5f 72 3a 70 75 62 6c 69 63 5f 63  bject.r.public.c
f0:  6f 6e 74 65 6e 74 5f 74 3a 73 30 00 00 00 00 00  ontent.t.s0.....

Comment 27 Cristian Ciupitu 2009-12-30 06:14:10 UTC
Comment on attachment 380895 [details]
xfs_metadump /dev/sda3 /tmp/sda3.xfs_metadump (compressed with lzma)

The dump was made while the partition was mounted read-only.

Comment 28 chellwig@redhat.com 2010-02-06 18:25:04 UTC
This should be fixed by upstream commit e09f98606dcc156de1146c209d45a0d6d5f51c3f

Comment 29 Cristian Ciupitu 2010-02-07 21:00:59 UTC
Is xfsprogs-3.0.3-2.fc12.x86_64 (the latest) good enough or should I use a newer version?

Comment 30 Eric Sandeen 2010-02-07 21:20:45 UTC
It was actually a kernelside fix that avoids the problem you see.

I'll get that into F12 kernels today; will take a while 'til the next respin gets out.

-Eric

Comment 31 Eric Sandeen 2010-02-07 22:18:35 UTC
Ok, kylem was nice enough to put this in F11 and F12 for me.

* Sun Feb 07 2010 Kyle McMartin <kyle> 2.6.32.8-48.rc2
- xfs: xfs_swap_extents needs to handle dynamic fork offsets (rhbz#510823)
  from sandeen.

Comment 32 Kyle McMartin 2010-02-07 22:53:57 UTC
Builds submitted, should be in all kernels as of 6PM EST 2010-02-07.

cheers, Kyle

Comment 33 Cristian Ciupitu 2010-02-08 01:07:21 UTC
(In reply to comment #30)
> It was actually a kernelside fix that avoids the problem you see.

I knew that the kernel was changed, but I thought that xfsprogs might need to be updated, too. Thank you for all your help!

(In reply to comment #32)
Thank you for starting the build!

Comment 34 Fedora Update System 2010-02-09 22:14:37 UTC
kernel-2.6.31.12-174.2.17.fc12 has been submitted as an update for Fedora 12.
http://admin.fedoraproject.org/updates/kernel-2.6.31.12-174.2.17.fc12

Comment 35 Cristian Ciupitu 2010-02-10 19:00:48 UTC
I have booted with the new 2.6.32.7-37.fc12.x86_64 kernel, run xfs_fsr on two partitions and xfs_check hasn't reported any errors, so the fix seems to be working.

Comment 36 Fedora Update System 2010-02-11 12:03:44 UTC
kernel-2.6.31.12-174.2.19.fc12 has been submitted as an update for Fedora 12.
http://admin.fedoraproject.org/updates/kernel-2.6.31.12-174.2.19.fc12

Comment 37 Fedora Update System 2010-02-16 13:17:53 UTC
kernel-2.6.31.12-174.2.19.fc12 has been pushed to the Fedora 12 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 38 Eric Sandeen 2010-03-10 02:45:57 UTC
Not sure why bodhi didn't close this one... fix is pushed to stable.

Comment 39 Cristian Ciupitu 2010-03-28 21:52:08 UTC
Created attachment 403143 [details]
xfs_metadump /dev/mapper/luks-Videos /tmp/luks-Videos.xfs_metadump (compressed with xz)

Comment 40 Cristian Ciupitu 2010-03-28 22:03:41 UTC
I'm reopening the bug because I've encountered it again after resizing /dev/mapper/luks-Videos and running xfs_fsr on it.

[root@hermes ~]# xfs_repair -n -v /dev/mapper/luks-Videos 
Phase 1 - find and verify superblock...
        - block cache size set to 353136 entries
Phase 2 - using internal log
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
extent count for ino 3762551802 data fork too low (6) for file format
bad data fork in inode 3762551802
would have cleared inode 3762551802
        - agno = 8
        - agno = 9
        - agno = 10
        - agno = 11
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
entry "00000.m2ts" at block 0 offset 408 in directory inode 3762551786 references free inode 3762551802
	would clear inode number in entry at offset 408...
extent count for ino 3762551802 data fork too low (6) for file format
bad data fork in inode 3762551802
would have cleared inode 3762551802
        - agno = 8
        - agno = 9
        - agno = 10
        - agno = 11
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
entry "00000.m2ts" in directory inode 3762551786 points to free inode 3762551802, would junk entry
bad hash table for directory inode 3762551786 (no data entry): would rebuild
        - agno = 8
        - agno = 9
        - agno = 10
        - agno = 11
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
No modify flag set, skipping filesystem flush and exiting.

        XFS_REPAIR Summary    Mon Mar 29 00:14:06 2010

Phase		Start		End		Duration
Phase 1:	03/29 00:14:02	03/29 00:14:02	
Phase 2:	03/29 00:14:02	03/29 00:14:03	1 second
Phase 3:	03/29 00:14:03	03/29 00:14:04	1 second
Phase 4:	03/29 00:14:04	03/29 00:14:06	2 seconds
Phase 5:	Skipped
Phase 6:	03/29 00:14:06	03/29 00:14:06	
Phase 7:	03/29 00:14:06	03/29 00:14:06	

Total run time: 4 seconds

Comment 41 Dave Chinner 2010-03-29 01:35:42 UTC
I can't really tell what went wrong from the metadump - it only tells me what repair has already told us. Do you have the output of xfs_fsr from that run? Are there any other log entries along the lines of "format incompatible for exchange" in the logs during the fsr run?

FWIW, can you run xfs_fsr with the "-v -d" options in future and save the output so we might be able to correlate the actions of fsr with the resultant corruption?

Comment 42 Cristian Ciupitu 2010-03-29 07:48:50 UTC
Unfortunately I don't have any xfs_fsr logs. By the way, I'm running 2.6.32.9-70.fc12.x86_64 right now. I have the following entries in /var/log/messages, but I think that dm-14 is another filesystem (I don't know how partitions are mapped to dms):

Mar 28 05:15:40 hermes kernel: Filesystem "dm-14": fs/xfs/xfs_dfrag.c: inode 0x57739b format is incompatible for exchanging.
Mar 28 05:16:28 hermes kernel: Filesystem "dm-14": fs/xfs/xfs_dfrag.c: inode 0x57746a format is incompatible for exchanging.
Mar 28 05:26:37 hermes kernel: Filesystem "dm-14": fs/xfs/xfs_dfrag.c: inode 0x626f56 format is incompatible for exchanging.
Mar 28 05:27:11 hermes kernel: Filesystem "dm-14": fs/xfs/xfs_dfrag.c: inode 0x626f73 format is incompatible for exchanging.
Mar 28 05:27:38 hermes kernel: Filesystem "dm-14": fs/xfs/xfs_dfrag.c: inode 0x626f78 format is incompatible for exchanging.
Mar 28 05:28:07 hermes kernel: Filesystem "dm-14": fs/xfs/xfs_dfrag.c: inode 0x626f79 format is incompatible for exchanging.
Mar 28 05:28:24 hermes kernel: Filesystem "dm-14": fs/xfs/xfs_dfrag.c: inode 0x6982f6 format is incompatible for exchanging.
Mar 28 05:28:43 hermes kernel: Filesystem "dm-14": fs/xfs/xfs_dfrag.c: inode 0x802d06d4 format is incompatible for exchanging.
Mar 28 05:31:32 hermes kernel: Filesystem "dm-14": fs/xfs/xfs_dfrag.c: inode 0x802d0b0e format is incompatible for exchanging.
Mar 28 05:45:19 hermes kernel: Filesystem "dm-14": fs/xfs/xfs_dfrag.c: inode 0x80305be6 format is incompatible for exchanging.
Mar 28 05:45:40 hermes kernel: Filesystem "dm-14": fs/xfs/xfs_dfrag.c: inode 0x80305bfe format is incompatible for exchanging.
Mar 28 05:45:58 hermes kernel: Filesystem "dm-14": fs/xfs/xfs_dfrag.c: inode 0x80305c12 format is incompatible for exchanging.
Mar 28 05:46:07 hermes kernel: Filesystem "dm-14": fs/xfs/xfs_dfrag.c: inode 0x80306dda format is incompatible for exchanging.
Mar 28 05:46:26 hermes kernel: Filesystem "dm-14": fs/xfs/xfs_dfrag.c: inode 0x811a8273 format is incompatible for exchanging.
Mar 28 05:55:04 hermes kernel: Filesystem "dm-14": fs/xfs/xfs_dfrag.c: inode 0x57739b format is incompatible for exchanging.
Mar 28 05:55:07 hermes kernel: Filesystem "dm-14": fs/xfs/xfs_dfrag.c: inode 0x57746a format is incompatible for exchanging.
Mar 28 05:55:26 hermes kernel: Filesystem "dm-14": fs/xfs/xfs_dfrag.c: inode 0x626f56 format is incompatible for exchanging.
Mar 28 05:56:01 hermes kernel: Filesystem "dm-14": fs/xfs/xfs_dfrag.c: inode 0x626f73 format is incompatible for exchanging.
Mar 28 05:56:27 hermes kernel: Filesystem "dm-14": fs/xfs/xfs_dfrag.c: inode 0x626f78 format is incompatible for exchanging.
Mar 28 05:56:55 hermes kernel: Filesystem "dm-14": fs/xfs/xfs_dfrag.c: inode 0x626f79 format is incompatible for exchanging.
Mar 28 05:57:12 hermes kernel: Filesystem "dm-14": fs/xfs/xfs_dfrag.c: inode 0x6982f6 format is incompatible for exchanging.
Mar 28 05:57:31 hermes kernel: Filesystem "dm-14": fs/xfs/xfs_dfrag.c: inode 0x802d06d4 format is incompatible for exchanging.
Mar 28 05:57:55 hermes kernel: Filesystem "dm-14": fs/xfs/xfs_dfrag.c: inode 0x802d0b0e format is incompatible for exchanging.
Mar 28 05:58:03 hermes kernel: Filesystem "dm-14": fs/xfs/xfs_dfrag.c: inode 0x80305be6 format is incompatible for exchanging.
Mar 28 05:58:23 hermes kernel: Filesystem "dm-14": fs/xfs/xfs_dfrag.c: inode 0x80305bfe format is incompatible for exchanging.
Mar 28 05:58:43 hermes kernel: Filesystem "dm-14": fs/xfs/xfs_dfrag.c: inode 0x80305c12 format is incompatible for exchanging.
Mar 28 05:58:51 hermes kernel: Filesystem "dm-14": fs/xfs/xfs_dfrag.c: inode 0x80306dda format is incompatible for exchanging.
Mar 28 05:59:11 hermes kernel: Filesystem "dm-14": fs/xfs/xfs_dfrag.c: inode 0x811a8273 format is incompatible for exchanging.

I'll try to keep logs of xfs_fsr runs from now on.

Comment 43 Eric Sandeen 2010-11-01 18:45:33 UTC
Have you seen further problems?  Not sure what to do with this bug at this point.

Comment 44 Cristian Ciupitu 2010-11-01 22:06:59 UTC
I'm using kernel-2.6.34.7-61.fc13.x86_64 and xfsprogs-3.1.1-7.fc13.x86_64 right now and everything seems to be fine after running xfs_fsr.

Though xfs_fsr didn't do too much. I got a lot "insufficient freespace for: " because my partitions are almost full.

Comment 45 Eric Sandeen 2010-11-01 22:17:34 UTC
I don't know that we can do anything further with this bug; if you come up with problems again in the future, please re-open, or file a new one, and we'll pursue it.

Thanks,
-Eric


Note You need to log in before you can comment on or make changes to this bug.