1092853 – XFS (sdc6): xlog_write: reservation ran out. Need to up reservation

Summary: XFS (sdc6): xlog_write: reservation ran out. Need to up reservation

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	kernel
Sub Component:
Version:	20
Hardware:	x86_64
OS:	Linux
Priority:	unspecified
Severity:	high
Target Milestone:	---
Assignee:	Kernel Maintainer List
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2014-04-30 06:27 UTC by Cristian Ciupitu
Modified:	2015-02-24 16:27 UTC
CC List:	10 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2015-02-24 16:27:24 UTC
Type:	Bug
Embargoed:
Dependent Products:

Description Cristian Ciupitu 2014-04-30 06:27:35 UTC

Description of problem:
An XFS partition became unavailable after it filled up.

Version-Release number of selected component (if applicable):
kernel-3.14.1-200.fc20.x86_64

How reproducible:
Once

Steps to Reproduce:
1. ABRT wrote something to /var/tmp making it full

Actual results:
Apr 30 00:18:11 hermes kernel: XFS (sdc6): xlog_write: reservation summary:
                                 trans type  = FSYNC_TS (36)
                                 unit res    = 9640 bytes
                                 current res = -4 bytes
                                 total reg   = 0 bytes (o/flow = 0 bytes)
                                 ophdrs      = 0 (ophdr space = 0 bytes)
                                 ophdr + reg = 0 bytes
                                 num regions = 0
Apr 30 00:18:11 hermes kernel: XFS (sdc6): xlog_write: reservation ran out. Need to up reservation
Apr 30 00:18:11 hermes kernel: XFS (sdc6): xfs_do_force_shutdown(0x2) called from line 1999 of file fs/xfs/xfs_log.c.  Return address = 0xffffffffa01c36e8
Apr 30 00:18:11 hermes abrt-hook-ccpp[10684]: Write error: Input/output error
Apr 30 00:18:11 hermes abrt-hook-ccpp[10684]: Error writing '/var/tmp/abrt/ccpp-2014-04-30-00:18:04-8659.new/coredump'
Apr 30 00:18:11 hermes kernel: XFS (sdc6): Log I/O Error Detected.  Shutting down filesystem
Apr 30 00:18:11 hermes kernel: XFS (sdc6): Please umount the filesystem and rectify the problem(s)
Apr 30 00:18:11 hermes kernel: Buffer I/O error on device sdc6, logical block 225044
Apr 30 00:18:11 hermes kernel: lost page write due to I/O error on sdc6
Apr 30 00:18:11 hermes kernel: Buffer I/O error on device sdc6, logical block 225009
Apr 30 00:18:11 hermes kernel: lost page write due to I/O error on sdc6
Apr 30 00:18:11 hermes kernel: Buffer I/O error on device sdc6, logical block 225010
Apr 30 00:18:11 hermes kernel: lost page write due to I/O error on sdc6
Apr 30 00:18:11 hermes kernel: Buffer I/O error on device sdc6, logical block 225011
Apr 30 00:18:11 hermes kernel: lost page write due to I/O error on sdc6
Apr 30 00:18:11 hermes kernel: Buffer I/O error on device sdc6, logical block 225012
Apr 30 00:18:11 hermes kernel: lost page write due to I/O error on sdc6
Apr 30 00:18:11 hermes kernel: Buffer I/O error on device sdc6, logical block 225053
Apr 30 00:18:11 hermes kernel: lost page write due to I/O error on sdc6
Apr 30 00:18:11 hermes kernel: Buffer I/O error on device sdc6, logical block 225054
Apr 30 00:18:11 hermes kernel: lost page write due to I/O error on sdc6
Apr 30 00:18:11 hermes kernel: Buffer I/O error on device sdc6, logical block 225055
Apr 30 00:18:11 hermes kernel: lost page write due to I/O error on sdc6
Apr 30 00:18:11 hermes kernel: Buffer I/O error on device sdc6, logical block 225056
Apr 30 00:18:11 hermes kernel: lost page write due to I/O error on sdc6
Apr 30 00:18:40 hermes kernel: XFS (sdc6): xfs_log_force: error 5 returned.
Apr 30 00:19:11 hermes kernel: XFS (sdc6): xfs_log_force: error 5 returned.
Apr 30 00:19:41 hermes kernel: XFS (sdc6): xfs_log_force: error 5 returned.
Apr 30 00:20:11 hermes kernel: XFS (sdc6): xfs_log_force: error 5 returned.
Apr 30 00:20:41 hermes kernel: XFS (sdc6): xfs_log_force: error 5 returned.


Expected results:
No errors

Additional info:
SMART showed no errors for the hard-disk.
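
For anyone triaging similar logs, the reservation summary above can be checked mechanically. The following is only a sketch: the function names are made up for illustration, and it assumes that a negative "current res" means the transaction used that many more bytes than its log reservation provided.

```python
import re

# Matches the two size fields printed in the "reservation summary" block.
SUMMARY_FIELD = re.compile(r"(unit res|current res)\s*=\s*(-?\d+) bytes")

def parse_reservation(log_text):
    """Return the first 'unit res' and 'current res' values (bytes) found in log_text."""
    values = {}
    for name, number in SUMMARY_FIELD.findall(log_text):
        # Keep only the first occurrence so repeated log copies do not matter.
        values.setdefault(name, int(number))
    return values

def reservation_overrun(log_text):
    """Bytes by which the reservation was exceeded, or 0 if it was not exceeded."""
    current = parse_reservation(log_text).get("current res")
    if current is None or current >= 0:
        return 0
    return -current

sample = """\
XFS (sdc6): xlog_write: reservation summary:
  trans type  = FSYNC_TS (36)
  unit res    = 9640 bytes
  current res = -4 bytes
"""
print(parse_reservation(sample))
print(reservation_overrun(sample))
```

On the log in this report, the sketch reports a 9640-byte unit reservation exceeded by 4 bytes.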

Comment 1 Cristian Ciupitu 2014-04-30 06:29:22 UTC

According to dchinner, this is addressed by a commit in 3.15-rc1: fe4c224 ("xfs: inode log reservations are still too small")

Comment 2 Paulo Fessel 2014-06-03 02:49:46 UTC

Description of problem:
An XFS partition became unavailable without prior warning.

Version-Release number of selected component (if applicable):
kernel-3.14.4-200.fc20.x86_64

How reproducible:
Once

Steps to reproduce:
1. I was downloading some torrent files when I was suddenly bitten by this problem too. The only difference in my case is that my XFS file system had plenty of free space:

[pfessel@wotan ~]$ LANG=C df -h /home
Filesystem      Size  Used Avail Use% Mounted on
/dev/md126p1    1.9T  1.2T  637G  66% /home

[pfessel@wotan ~]$ LANG=C df -i /home
Filesystem        Inodes  IUsed     IFree IUse% Mounted on
/dev/md126p1   390676224 412656 390263568    1% /home

[pfessel@wotan ~]$ mount | grep home
/dev/md126p1 on /home type xfs (rw,noatime,nodiratime,seclabel,attr2,inode64,sunit=512,swidth=512,noquota)

Jun  2 22:12:48 wotan kernel: [23518.275534] XFS (md126p1): xlog_write: reservation summary:
Jun  2 22:12:48 wotan kernel: [23518.275534]   trans type  = FSYNC_TS (36)
Jun  2 22:12:48 wotan kernel: [23518.275534]   unit res    = 9640 bytes
Jun  2 22:12:48 wotan kernel: [23518.275534]   current res = -4 bytes
Jun  2 22:12:48 wotan kernel: [23518.275534]   total reg   = 0 bytes (o/flow = 0 bytes)
Jun  2 22:12:48 wotan kernel: [23518.275534]   ophdrs      = 0 (ophdr space = 0 bytes)
Jun  2 22:12:48 wotan kernel: [23518.275534]   ophdr + reg = 0 bytes
Jun  2 22:12:48 wotan kernel: [23518.275534]   num regions = 0
Jun  2 22:12:48 wotan kernel: [23518.275534] 
Jun  2 22:12:48 wotan kernel: [23518.275556] XFS (md126p1): xlog_write: reservation ran out. Need to up reservation
Jun  2 22:12:48 wotan kernel: [23518.275566] XFS (md126p1): xfs_do_force_shutdown(0x2) called from line 1999 of file fs/xfs/xfs_log.c.  Return address = 0xffffffffa0d346e8
Jun  2 22:12:48 wotan kernel: [23518.275577] XFS (md126p1): Log I/O Error Detected.  Shutting down filesystem
Jun  2 22:12:48 wotan kernel: [23518.275582] XFS (md126p1): Please umount the filesystem and rectify the problem(s)
Jun  2 22:12:48 wotan kernel: XFS (md126p1): xlog_write: reservation summary:
  trans type  = FSYNC_TS (36)
  unit res    = 9640 bytes
  current res = -4 bytes
  total reg   = 0 bytes (o/flow = 0 bytes)
  ophdrs      = 0 (ophdr space = 0 bytes)
  ophdr + reg = 0 bytes
  num regions = 0

Jun  2 22:12:48 wotan kernel: XFS (md126p1): xlog_write: reservation ran out. Need to up reservation
Jun  2 22:12:48 wotan kernel: XFS (md126p1): xfs_do_force_shutdown(0x2) called from line 1999 of file fs/xfs/xfs_log.c.  Return address = 0xffffffffa0d346e8
Jun  2 22:12:48 wotan kernel: XFS (md126p1): Log I/O Error Detected.  Shutting down filesystem
Jun  2 22:12:48 wotan kernel: XFS (md126p1): Please umount the filesystem and rectify the problem(s)
Jun  2 22:13:04 wotan kernel: [23535.018581] device p22p1 left promiscuous mode
Jun  2 22:13:04 wotan kernel: device p22p1 left promiscuous mode
Jun  2 22:13:17 wotan kernel: [23548.129223] XFS (md126p1): xfs_log_force: error 5 returned.
Jun  2 22:13:17 wotan kernel: XFS (md126p1): xfs_log_force: error 5 returned.
Jun  2 22:13:48 wotan kernel: [23578.192475] XFS (md126p1): xfs_log_force: error 5 returned.
(...)

I unmounted the filesystem, remounted it to replay the log, and then ran xfs_repair on it. It appears to have recovered after this. The filesystem resides on a RAID volume which had one of its disks replaced less than a week ago, when I moved the contents from an ext4 filesystem to this new XFS filesystem. The disks are OK and show no SMART errors. Here is /proc/mdstat for /home:

md126 : active raid1 sde1[2] sdd1[0]
      1953382464 blocks super 1.2 [2/2] [UU]
      bitmap: 0/15 pages [0KB], 65536KB chunk
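
The recovery sequence described in this comment (unmount, remount to replay the log, then xfs_repair) could be scripted roughly as follows. This is a hedged sketch, not a recommended procedure: the function names are hypothetical, the second umount is an assumption added because xfs_repair operates on an unmounted filesystem, and the script only prints the commands unless dry_run is set to False.

```python
import subprocess

def recovery_commands(device, mountpoint):
    """Build the command sequence; nothing is executed here."""
    return [
        ["umount", mountpoint],          # release the shut-down filesystem
        ["mount", device, mountpoint],   # mounting replays the XFS log
        ["umount", mountpoint],          # assumption: unmount again before repair
        ["xfs_repair", device],          # check and repair the filesystem
    ]

def run_recovery(device, mountpoint, dry_run=True):
    """Print each command and, unless dry_run is True, run it as well."""
    for cmd in recovery_commands(device, mountpoint):
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the sequence at the first failing command.
            subprocess.run(cmd, check=True)

# Device and mount point taken from the df output in this comment.
run_recovery("/dev/md126p1", "/home")
```

Running it for real requires root and that no process is still using the mount point.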

Comment 3 Cristian Ciupitu 2014-06-03 12:26:21 UTC

This was supposedly fixed in commit fe4c224aa1ffa4352849ac5f452de7132739bee2 *

* https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=fe4c224aa1ffa4352849ac5f452de7132739bee2
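
A rough way to tell whether a given kernel is new enough to contain that commit is to compare its version with 3.15, the release named in comment 1. The sketch below uses hypothetical helper names and is only a heuristic: it assumes the fix is present in every kernel from 3.15 onward, and it cannot tell whether a distribution has backported the fix to an older kernel.

```python
import re

# Commit fe4c224 is reported in this bug as being in 3.15-rc1.
FIX_UPSTREAM = (3, 15)

def kernel_version(release):
    """Extract (major, minor) from a string such as 'kernel-3.14.1-200.fc20.x86_64'."""
    match = re.search(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"no version found in {release!r}")
    return int(match.group(1)), int(match.group(2))

def likely_has_fix(release):
    """True if the upstream version is 3.15 or later (backports are not detected)."""
    return kernel_version(release) >= FIX_UPSTREAM

for release in ("kernel-3.14.1-200.fc20.x86_64", "3.17.2-200.fc20"):
    print(release, likely_has_fix(release))
```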

Comment 4 Paulo Fessel 2014-06-15 04:31:40 UTC

It just exploded on me again:

Jun 15 01:11:22 wotan kernel: [43299.625792] XFS (md126p1): xlog_write: reservation summary:
Jun 15 01:11:22 wotan kernel: [43299.625792]   trans type  = FSYNC_TS (36)
Jun 15 01:11:22 wotan kernel: [43299.625792]   unit res    = 9640 bytes
Jun 15 01:11:22 wotan kernel: [43299.625792]   current res = -4 bytes
Jun 15 01:11:22 wotan kernel: [43299.625792]   total reg   = 0 bytes (o/flow = 0 bytes)
Jun 15 01:11:22 wotan kernel: [43299.625792]   ophdrs      = 0 (ophdr space = 0 bytes)
Jun 15 01:11:22 wotan kernel: [43299.625792]   ophdr + reg = 0 bytes
Jun 15 01:11:22 wotan kernel: [43299.625792]   num regions = 0
Jun 15 01:11:22 wotan kernel: [43299.625792] 
Jun 15 01:11:22 wotan kernel: [43299.625811] XFS (md126p1): xlog_write: reservation ran out. Need to up reservation
Jun 15 01:11:22 wotan kernel: [43299.625817] XFS (md126p1): xfs_do_force_shutdown(0x2) called from line 1999 of file fs/xfs/xfs_log.c.  Return address = 0xffffffffa044b6e8
Jun 15 01:11:22 wotan kernel: [43299.625823] XFS (md126p1): Log I/O Error Detected.  Shutting down filesystem
Jun 15 01:11:22 wotan kernel: [43299.625825] XFS (md126p1): Please umount the filesystem and rectify the problem(s)
Jun 15 01:11:22 wotan kernel: XFS (md126p1): xlog_write: reservation summary:
  trans type  = FSYNC_TS (36)
  unit res    = 9640 bytes
  current res = -4 bytes
  total reg   = 0 bytes (o/flow = 0 bytes)
  ophdrs      = 0 (ophdr space = 0 bytes)
  ophdr + reg = 0 bytes
  num regions = 0

Since Cristian reports that it has been fixed in the above commit, is it possible to backport it to the current fc20 kernel? So far it seems to have been harmless, but I really do fear losing something one of these days.

Comment 5 Netbulae 2014-06-15 20:12:04 UTC

I have now hit the same problem 3 times on my /home. Is there anything I can do other than compiling kernel 3.15.7?

Running xfs_repair fixes it for a couple of minutes, and I removed some directories/files, but it keeps happening and makes my system unusable.

kernel-3.14.7-200.fc20.x86_64

Jun 15 21:01:22 mobieltje kernel: [45049.282218] XFS (sdb1): xlog_write: reservation summary:
Jun 15 21:01:22 mobieltje kernel: [45049.282218]   trans type  = FSYNC_TS (36)
Jun 15 21:01:22 mobieltje kernel: [45049.282218]   unit res    = 9640 bytes
Jun 15 21:01:22 mobieltje kernel: [45049.282218]   current res = -4 bytes
Jun 15 21:01:22 mobieltje kernel: [45049.282218]   total reg   = 0 bytes (o/flow = 0 bytes)
Jun 15 21:01:22 mobieltje kernel: [45049.282218]   ophdrs      = 0 (ophdr space = 0 bytes)
Jun 15 21:01:22 mobieltje kernel: [45049.282218]   ophdr + reg = 0 bytes
Jun 15 21:01:22 mobieltje kernel: [45049.282218]   num regions = 0
Jun 15 21:01:22 mobieltje kernel: [45049.282218]
Jun 15 21:01:22 mobieltje kernel: [45049.282238] XFS (sdb1): xlog_write: reservation ran out. Need to up reservation
Jun 15 21:01:22 mobieltje kernel: [45049.282247] XFS (sdb1): xfs_do_force_shutdown(0x2) called from line 1999 of file fs/xfs/xfs_log.c.  Return address = 0xffffffffa080c6e8
Jun 15 21:01:22 mobieltje kernel: [45049.282364] XFS (sdb1): Log I/O Error Detected.  Shutting down filesystem
Jun 15 21:01:22 mobieltje kernel: [45049.282368] XFS (sdb1): Please umount the filesystem and rectify the problem(s)
Jun 15 21:01:22 mobieltje kernel: XFS (sdb1): xlog_write: reservation summary:
  trans type  = FSYNC_TS (36)
  unit res    = 9640 bytes
  current res = -4 bytes
  total reg   = 0 bytes (o/flow = 0 bytes)
  ophdrs      = 0 (ophdr space = 0 bytes)
  ophdr + reg = 0 bytes
  num regions = 0

Jun 15 21:01:22 mobieltje kernel: XFS (sdb1): xlog_write: reservation ran out. Need to up reservation
Jun 15 21:01:22 mobieltje kernel: XFS (sdb1): xfs_do_force_shutdown(0x2) called from line 1999 of file fs/xfs/xfs_log.c.  Return address = 0xffffffffa080c6e8
Jun 15 21:01:22 mobieltje kernel: XFS (sdb1): Log I/O Error Detected.  Shutting down filesystem
Jun 15 21:01:22 mobieltje kernel: XFS (sdb1): Please umount the filesystem and rectify the problem(s)
Jun 15 21:01:25 mobieltje kernel: [45052.507045] XFS (sdb1): xfs_log_force: error 5 returned.
Jun 15 21:01:25 mobieltje kernel: XFS (sdb1): xfs_log_force: error 5 returned.

Comment 6 Paulo Fessel 2014-06-15 21:10:39 UTC

Netbulae, good to see we aren't alone on this. What was your workload at the time you got the errors? Was it something write-intensive or just normal usage? In my case I'm not sure, but I always seem to get it when downloading torrents that contain gigabyte-sized files.

Comment 7 Netbulae 2014-06-16 09:34:42 UTC

I was torrenting Fedora DVD ISOs. Once I stopped downloading, I no longer got the error.

Comment 8 Justin M. Forbes 2014-11-13 16:00:34 UTC

*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 20 kernel bugs.

Fedora 20 has now been rebased to 3.17.2-200.fc20.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 21, and are still experiencing this issue, please change the version to Fedora 21.

If you experience different issues, please open a new bug report for those.

Comment 9 Fedora Kernel Team 2015-02-24 16:21:28 UTC

*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 20 kernel bugs.

Fedora 20 has now been rebased to 3.18.7-100.fc20.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 21, and are still experiencing this issue, please change the version to Fedora 21.

If you experience different issues, please open a new bug report for those.

Comment 10 GV 2015-02-24 16:26:22 UTC

This bug seems to be fixed in F21.

Comment 11 Josh Boyer 2015-02-24 16:27:24 UTC

Thanks!
