Bug 1211084 - [XFS] kernel: XFS (dm-0): xfs_log_force: error -5 returned [NEEDINFO]
Summary: [XFS] kernel: XFS (dm-0): xfs_log_force: error -5 returned
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 20
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: fs-maint
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2015-04-12 21:23 UTC by Cristian Ciupitu
Modified: 2015-06-30 00:20 UTC (History)
CC List: 9 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-06-30 00:20:30 UTC
Type: Bug
Embargoed:
kernel-team: needinfo?



Description Cristian Ciupitu 2015-04-12 21:23:45 UTC
Description of problem:
I encountered an error on an XFS partition while the partition was mounted. I unmounted it and ran `xfs_repair -n` on it.

If it matters, the partition has very little free space left (a couple of MB), and over the last week I've had to remove or hardlink a few files because it kept filling up with new files.

Version-Release number of selected component (if applicable):
kernel-3.19.3-100.fc20.x86_64

How reproducible:
Only once.

Steps to Reproduce:
I have no idea.

Actual results:
kernel: XFS (dm-0): xfs_log_force: error -5 returned

Also:

    # xfs_repair -n /dev/mapper/oldHermesVG-homeVol
    Phase 1 - find and verify superblock...
    Phase 2 - using internal log
            - scan filesystem freespace and inode maps...
    sb_ifree 185804, counted 185877
    sb_fdblocks 8399, counted 24
            - found root inode chunk
    Phase 3 - for each AG...
            - scan (but don't clear) agi unlinked lists...
            - process known inodes and perform inode discovery...
            - agno = 0
            - agno = 1
            - agno = 2
            - agno = 3
            - agno = 4
            - agno = 5
            - process newly discovered inodes...
    Phase 4 - check for duplicate blocks...
            - setting up duplicate extent list...
            - check for inodes claiming duplicate blocks...
            - agno = 0
            - agno = 3
            - agno = 2
            - agno = 4
            - agno = 5
            - agno = 1
    No modify flag set, skipping phase 5
    Phase 6 - check inode connectivity...
            - traversing filesystem ...
            - traversal finished ...
            - moving disconnected inodes to lost+found ...
    Phase 7 - verify link counts...
    No modify flag set, skipping filesystem flush and exiting.


Expected results:
No errors.

Additional info:

Comment 1 Eric Sandeen 2015-04-13 18:54:26 UTC
There's nothing to go on here.

> [XFS] kernel: XFS (dm-0): xfs_log_force: error -5 returned

is a very generic message essentially meaning that the filesystem has encountered a runtime error and has shut down.  Do you have the dmesg or kernel logs leading to this event?

Thanks,
-Eric

Comment 2 Cristian Ciupitu 2015-04-14 09:59:53 UTC
I ran `journalctl --dmesg` and found entries like this:

03:05:01 hermes kernel: XFS: Internal error XFS_WANT_CORRUPTED_GOTO at line 1411 of file fs/xfs/libxfs/xfs_alloc.c.  Caller xfs_alloc_ag_vextent+0xad/0x120 [xfs]
03:05:01 hermes kernel: CPU: 1 PID: 3234 Comm: kworker/u16:1 Tainted: G           OE  3.19.3-100.fc20.x86_64 #1
03:05:01 hermes kernel: Hardware name:                  /DZ77BH-55K, BIOS BHZ7710H.86A.0100.2013.0517.0942 05/17/2013
03:05:01 hermes kernel: Workqueue: writeback bdi_writeback_workfn (flush-253:0)
03:05:01 hermes kernel:  0000000000000000 0000000096fc3795 ffff8805bb25f418 ffffffff8175affa
03:05:01 hermes kernel:  0000000000000000 ffff8805bb25f578 ffff8805bb25f438 ffffffffa01e87ce
03:05:01 hermes kernel:  ffffffffa01a8b1d ffffffffa01a5a6b ffff8805bb25f4a8 ffffffffa01a74dd
03:05:01 hermes kernel: Call Trace:
03:05:01 hermes kernel:  [<ffffffff8175affa>] dump_stack+0x45/0x57
03:05:01 hermes kernel:  [<ffffffffa01e87ce>] xfs_error_report+0x3e/0x40 [xfs]
03:05:01 hermes kernel:  [<ffffffffa01a8b1d>] ? xfs_alloc_ag_vextent+0xad/0x120 [xfs]
03:05:01 hermes kernel:  [<ffffffffa01a5a6b>] ? xfs_alloc_lookup_eq+0x1b/0x20 [xfs]
03:05:01 hermes kernel:  [<ffffffffa01a74dd>] xfs_alloc_ag_vextent_size+0x2fd/0x620 [xfs]
03:05:01 hermes kernel:  [<ffffffffa01a8b1d>] xfs_alloc_ag_vextent+0xad/0x120 [xfs]
03:05:01 hermes kernel:  [<ffffffffa01a98c9>] xfs_alloc_vextent+0x429/0x5e0 [xfs]
03:05:01 hermes kernel:  [<ffffffffa01b9d1f>] xfs_bmap_btalloc+0x3af/0x710 [xfs]
03:05:01 hermes kernel:  [<ffffffffa01ba08e>] xfs_bmap_alloc+0xe/0x10 [xfs]
03:05:01 hermes kernel:  [<ffffffffa01bab45>] xfs_bmapi_write+0x4f5/0xb10 [xfs]
03:05:01 hermes kernel:  [<ffffffffa01f31ab>] xfs_iomap_write_allocate+0x14b/0x360 [xfs]
03:05:01 hermes kernel:  [<ffffffffa01de656>] xfs_map_blocks+0x1c6/0x230 [xfs]
03:05:01 hermes kernel:  [<ffffffffa01df8d3>] xfs_vm_writepage+0x1a3/0x5f0 [xfs]
03:05:01 hermes kernel:  [<ffffffff811a2087>] __writepage+0x17/0x50
03:05:01 hermes kernel:  [<ffffffff811a29e5>] write_cache_pages+0x245/0x4b0
03:05:01 hermes kernel:  [<ffffffff811a2070>] ? global_dirtyable_memory+0x50/0x50
03:05:01 hermes kernel:  [<ffffffff811a2c9d>] generic_writepages+0x4d/0x80
03:05:01 hermes kernel:  [<ffffffffa01de462>] xfs_vm_writepages+0x42/0x50 [xfs]
03:05:01 hermes kernel:  [<ffffffff811a45ce>] do_writepages+0x1e/0x40
03:05:01 hermes kernel:  [<ffffffff8123ac80>] __writeback_single_inode+0x40/0x220
03:05:01 hermes kernel:  [<ffffffff8123b651>] writeback_sb_inodes+0x1c1/0x400
03:05:01 hermes kernel:  [<ffffffff8123b92f>] __writeback_inodes_wb+0x9f/0xd0
03:05:01 hermes kernel:  [<ffffffff8123c193>] wb_writeback+0x263/0x2f0
03:05:01 hermes kernel:  [<ffffffff8123e80e>] bdi_writeback_workfn+0x2de/0x470
03:05:01 hermes kernel:  [<ffffffff810b3878>] process_one_work+0x148/0x3d0
03:05:01 hermes kernel:  [<ffffffff810b3f3b>] worker_thread+0x11b/0x460
03:05:01 hermes kernel:  [<ffffffff810b3e20>] ? rescuer_thread+0x320/0x320
03:05:01 hermes kernel:  [<ffffffff810b9128>] kthread+0xd8/0xf0
03:05:01 hermes kernel:  [<ffffffff810b9050>] ? kthread_create_on_node+0x1b0/0x1b0
03:05:01 hermes kernel:  [<ffffffff817622d8>] ret_from_fork+0x58/0x90
03:05:01 hermes kernel:  [<ffffffff810b9050>] ? kthread_create_on_node+0x1b0/0x1b0
03:05:01 hermes kernel: XFS (dm-0): page discard on page ffffea001abe5440, inode 0x18236aef, offset 86016.

Dave Chinner told me on #xfs that the free-space discrepancy can sometimes happen if the filesystem is not cleanly unmounted, although if I remember correctly that was not my case. I ran xfs_repair at his suggestion, and things seem to be back to normal. Nevertheless, I have an xfs_metadump of the partition if you're interested in it.

Comment 3 Cristian Ciupitu 2015-04-14 12:17:28 UTC
I've also had the same issue with another ~500GB partition which has been extended in increments of GBs a few times:

Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
sb_ifree 22516, counted 22120
        - found root inode chunk

Comment 4 Eric Sandeen 2015-04-14 12:56:08 UTC
so that's here:

xfs_alloc_ag_vextent_size()
{
...
        XFS_WANT_CORRUPTED_GOTO(rlen <= flen, error0);

Was that the first/earliest corruption message you see?

Comment 5 Cristian Ciupitu 2015-04-14 13:41:11 UTC
Yes, at least for the current boot.

Comment 6 Fedora Kernel Team 2015-04-28 18:31:13 UTC
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There are a large number of bugs to go through and several of them have gone stale.  Because of this, we are doing a mass bug update across all of the Fedora 20 kernel bugs.

Fedora 20 has now been rebased to 3.19.5-100.fc20.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 21, and are still experiencing this issue, please change the version to Fedora 21.

If you experience different issues, please open a new bug report for those.

Comment 7 Fedora End Of Life 2015-05-29 13:47:29 UTC
This message is a reminder that Fedora 20 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 20. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora  'version'
of '20'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 20 reached end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 8 Fedora End Of Life 2015-06-30 00:20:30 UTC
Fedora 20 changed to end-of-life (EOL) status on 2015-06-23. Fedora 20 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

