Bug 1042968

Summary: xfs filesystem reports no space left on device when there is plenty of space free
Product: [Fedora] Fedora
Reporter: Andrew J. Schorr <aschorr>
Component: kernel
Assignee: fs-maint
Status: CLOSED INSUFFICIENT_DATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified
Version: 19
CC: dchinner, esandeen, gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda
Target Milestone: ---
Flags: jforbes: needinfo?
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2014-03-10 14:44:45 UTC
Type: Bug

Description Andrew J. Schorr 2013-12-13 16:49:07 UTC
Description of problem: An XFS filesystem has 17GB of free space according to df, but an attempt to write a small file fails with "No space left on device".


Version-Release number of selected component (if applicable):
kernel-3.10.11-200.fc19.x86_64
xfsprogs-3.1.10-2.fc19.x86_64


How reproducible:
Here is what I observed:
bash-4.2$ cd /extra_disk/tmp
bash-4.2$ df -h .
Filesystem                    Size  Used Avail Use% Mounted on
/dev/mapper/vg_os-extra_disk   57G   40G   17G  71% /extra_disk
bash-4.2$ stat -f .
  File: "."
    ID: fd0200000000 Namelen: 255     Type: xfs
Block size: 4096       Fundamental block size: 4096
Blocks: Total: 14758368   Free: 4319293    Available: 4319293
Inodes: Total: 59113472   Free: 59020864
bash-4.2$ echo hello > fubar
bash: fubar: No space left on device
[root@ti130 ~]# umount /extra_disk
[root@ti130 ~]# xfs_repair /dev/vg_os/extra_disk && echo YES
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 5
        - agno = 8
        - agno = 2
        - agno = 6
        - agno = 7
        - agno = 3
        - agno = 4
        - agno = 9
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
done
YES
[root@ti130 ~]# mount /extra_disk
bash-4.2$ echo hello > /extra_disk/tmp/fubar

So xfs_repair somehow fixed the problem, or perhaps simply unmounting and remounting the filesystem did.

I then tested with dd, and I am currently able to write a 17GB file to
that filesystem.  So the remount and/or xfs_repair fixed the problem,
at least for the moment.
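
As a sanity check on the numbers above, the free block count from stat -f can be converted to bytes with plain shell arithmetic. This is only a sketch: free_bytes is a hypothetical helper name, and the two inputs are copied from the stat -f output in this report.

```shell
#!/bin/sh
# free_bytes is a hypothetical helper, not part of any tool.
# $1 = free blocks, $2 = block size in bytes (both as shown by `stat -f`).
free_bytes() {
    echo $(( $1 * $2 ))
}

# Values taken from the stat -f output in this report.
bytes=$(free_bytes 4319293 4096)
echo "free bytes: $bytes"
# Whole GiB, rounded down by integer division.
echo "free GiB (floor): $(( bytes / 1024 / 1024 / 1024 ))"
```

That works out to 17691824128 bytes, a little under 16.5 GiB, which is presumably what df -h rounds up to the 17G it displays. Either way, the block counters agree that a one-line file should fit.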

I also note that performance on the system got terrible as dd came close to filling the filesystem.  The system became unresponsive as the partition approached full.


Steps to Reproduce:
1. I am not certain.

Actual results:
df shows 17GB free, but I cannot write a file.

Expected results:
I should be able to write a file if there is lots of free space.

Additional info:

Comment 1 Josh Boyer 2013-12-13 16:54:11 UTC
Now that you've fixed it with xfs_repair I'm not sure there's much that can be done here.  Assigning to the fs people just in case.

Comment 2 Andrew J. Schorr 2013-12-13 16:59:43 UTC
I'm not certain it's actually fixed.  The xfs_repair output did not seem to indicate that it was actually fixing anything, did it?  We shall see...

Comment 3 Eric Sandeen 2013-12-13 18:57:44 UTC
Yeah, repair seemed to find no problems.

This could maybe be preallocation behavior; my other thought is whether it's terribly fragmented freespace, so much so that there's no room to allocate a contiguous chunk of blocks for new inodes.  (In that case it would have been interesting to try to simply "touch" a new file w/ no content.)

Can you try:

# xfs_db -r -c "sb 0" -c "p" -c "freesp" /dev/vg_os/extra_disk

(preferably when unmounted) & add the results here?
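
The "touch" idea above can be turned into a small probe that separates the two cases: creating an empty file (which needs an inode but no data blocks) versus writing content into it (which needs a data block). This is a rough sketch, not something used in this report. probe_enospc is a hypothetical helper name, and it does not distinguish "No space left on device" from other failures such as permission errors, so the error message printed on stderr still needs to be read.

```shell
#!/bin/sh
# probe_enospc is a hypothetical helper.
# $1 = directory on the filesystem under test.
# Prints "create-failed", "write-failed", or "ok".
probe_enospc() {
    dir=$1
    f="$dir/enospc_probe.$$"
    # Step 1: create an empty file (needs an inode, no data blocks).
    if ! touch "$f"; then
        echo "create-failed"
        return 1
    fi
    # Step 2: write a few bytes (needs a data block).
    if ! echo hello > "$f"; then
        rm -f "$f"
        echo "write-failed"
        return 2
    fi
    rm -f "$f"
    echo "ok"
}

# Example: probe_enospc /extra_disk/tmp
```

If the failure shows up as "create-failed" while df still reports free blocks, that would point toward inode allocation rather than data block allocation.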

Comment 4 Andrew J. Schorr 2013-12-13 20:14:42 UTC
[root@ti130 ajs]# umount /extra_disk
[root@ti130 ajs]# xfs_db -r -c "sb 0" -c "p" -c "freesp" /dev/vg_os/extra_disk
magicnum = 0x58465342
blocksize = 4096
dblocks = 14778368
rblocks = 0
rextents = 0
uuid = e0e51123-d4f5-4691-990c-d2ff83f8bb80
logstart = 4194308
rootino = 128
rbmino = 129
rsumino = 130
rextsize = 1
agblocks = 1531904
agcount = 10
rbmblocks = 0
logblocks = 20000
versionnum = 0xb4a4
sectsize = 512
inodesize = 256
inopblock = 16
fname = "\000\000\000\000\000\000\000\000\000\000\000\000"
blocklog = 12
sectlog = 9
inodelog = 8
inopblog = 4
agblklog = 21
rextslog = 0
inprogress = 0
imax_pct = 25
icount = 92608
ifree = 0
fdblocks = 4327529
frextents = 0
uquotino = 0
gquotino = 0
qflags = 0
flags = 0
shared_vn = 0
inoalignmt = 2
unit = 0
width = 0
dirblklog = 0
logsectlog = 0
logsectsize = 0
logsunit = 1
features2 = 0xa
bad_features2 = 0xa
   from      to extents  blocks    pct
      1       1      24      24   0.00
      2       3      14      38   0.00
    256     511       1     489   0.01
    512    1023       1     592   0.01
   1024    2047       5    7592   0.18
   2048    4095       7   19538   0.45
   4096    8191       7   42593   0.98
   8192   16383       2   23105   0.53
  16384   32767       2   53075   1.23
  32768   65535       2   96505   2.23
  65536  131071       3  268774   6.21
 131072  262143       1  189100   4.37
 262144  524287       1  435462  10.06
 524288 1048575       2 1739155  40.19
1048576 1531904       1 1451487  33.54

Comment 5 Eric Sandeen 2013-12-13 21:26:38 UTC
Freespace doesn't look fragmented at this point; you have some very large extents of free blocks (the table at the end).
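
To put a number on that, the freesp histogram from Comment 4 can be summed with awk. This is a minimal sketch: summarize_freesp is a hypothetical helper name, and the 65536-block threshold is an arbitrary choice for "very large", not something defined by XFS.

```shell
#!/bin/sh
# summarize_freesp is a hypothetical helper.
# Reads "from to extents blocks pct" rows on stdin and prints two numbers:
# total free blocks, and free blocks in buckets whose "from" is >= $1.
summarize_freesp() {
    awk -v min="$1" '
        $1 ~ /^[0-9]+$/ {            # skip the header row
            total += $4
            if ($1 >= min) big += $4
        }
        END { printf "%d %d\n", total, big }
    '
}

# Histogram rows copied from Comment 4.
summarize_freesp 65536 <<'EOF'
   from      to extents  blocks    pct
      1       1      24      24   0.00
      2       3      14      38   0.00
    256     511       1     489   0.01
    512    1023       1     592   0.01
   1024    2047       5    7592   0.18
   2048    4095       7   19538   0.45
   4096    8191       7   42593   0.98
   8192   16383       2   23105   0.53
  16384   32767       2   53075   1.23
  32768   65535       2   96505   2.23
  65536  131071       3  268774   6.21
 131072  262143       1  189100   4.37
 262144  524287       1  435462  10.06
 524288 1048575       2 1739155  40.19
1048576 1531904       1 1451487  33.54
EOF
# prints "4327529 4083978"
```

The total matches fdblocks = 4327529 from the superblock dump, and roughly 94% of the free blocks sit in extents of at least 65536 blocks.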

ifree = 0, meaning (IIRC) all clusters of inodes are fully in use, and a new cluster needs to be allocated for the next inode ... but other than that, I don't see that there should be any problem.
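
For scale, the size of that next allocation can be estimated from the superblock fields. The sketch below assumes XFS allocates inodes in chunks of 64, which is not stated anywhere in this report; chunk_size is a hypothetical helper name, and the calculation ignores alignment (inoalignmt) and any per-AG constraints.

```shell
#!/bin/sh
# Assumption: XFS allocates inodes in chunks of 64 (not stated in this report).
INODES_PER_CHUNK=64

# chunk_size is a hypothetical helper.
# $1 = inodesize in bytes, $2 = blocksize in bytes.
# Prints the bytes and the whole filesystem blocks one inode chunk occupies.
chunk_size() {
    bytes=$(( INODES_PER_CHUNK * $1 ))
    blocks=$(( (bytes + $2 - 1) / $2 ))   # round up to whole blocks
    echo "$bytes $blocks"
}

# inodesize = 256 and blocksize = 4096, from the superblock dump in Comment 4.
chunk_size 256 4096    # prints "16384 4"
```

Under that assumption a new chunk needs only 4 contiguous blocks, far smaller than the free extents listed in the histogram in Comment 4.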

I'm not sure how to investigate this further since it looks ok and is working now.  If you hit it again, try the "touch" test, perhaps.

And/or try tracing the simple test (touch or echo); in one console:

# trace-cmd record -e xfs\*
<tracing starts>

In a 2nd console:
# touch foo; echo bar > baz # or whatever fails

In the first console:
<Ctrl-C>
# trace-cmd report > trace_report.txt

(Also - I guess it's not a preallocation issue since df showed free space).

Comment 6 Justin M. Forbes 2014-01-03 22:11:37 UTC
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 19 kernel bugs.

Fedora 19 has now been rebased to 3.12.6-200.fc19.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 20, and are still experiencing this issue, please change the version to Fedora 20.

If you experience different issues, please open a new bug report for those.

Comment 7 Justin M. Forbes 2014-03-10 14:44:45 UTC
*********** MASS BUG UPDATE **************

This bug has been in a needinfo state for more than 1 month and is being closed with insufficient data due to inactivity. If this is still an issue with Fedora 19, please feel free to reopen the bug and provide the additional information requested.