Bug 509671 - hugetlbfs: bad .st_blocks after failed over-write()
Status: CLOSED UPSTREAM
Product: Fedora
Classification: Fedora
Component: kernel
Version: 11
Hardware: All Linux
Priority: low
Severity: medium
Assigned To: Eric Sandeen
QA Contact: Fedora Extras Quality Assurance
Reported: 2009-07-04 18:18 EDT by John Reiser
Modified: 2009-07-30 16:58 EDT
Doc Type: Bug Fix
Last Closed: 2009-07-10 23:04:58 EDT

Description John Reiser 2009-07-04 18:18:07 EDT
Description of problem: .st_blocks is (-1K) or (-2K) or (-4K) after a write to an existing zero-length file that resides on hugetlbfs.  All write()s fail because hugetlbfs does not support write(), but that's no reason to clobber .st_blocks, which must always be zero.


Version-Release number of selected component (if applicable):
kernel-2.6.29.5-191.fc11.x86_64

How reproducible: always


Steps to Reproduce:
1. Create and mount a hugetlbfs filesystem; for example (as superuser):
    mkdir -p /huge
    chmod a+rwx /huge
    mount -t hugetlbfs nodev /huge
    echo 5  > /proc/sys/vm/nr_hugepages
2. Create an empty file in the hugetlbfs file system, then attempt to extend it to non-zero length; for example:
    cd /huge
    > foo    ## bash-ism creates empty file foo
    date  > foo
3. Check the file statistics:  
    /usr/bin/stat foo
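The same steps can be sketched in C for anyone who prefers a single test function over shell commands. This is only an illustration: the helper name overwrite_and_stat() and the "hello" payload are hypothetical, and the path is assumed to live on an already mounted hugetlbfs (on any other filesystem the write simply succeeds).

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper mirroring "> path; date > path; stat path".
 * Stores the result of write() in *written and st_blocks (viewed as a
 * signed value) in *blocks. Returns 0 on success, -1 if open() or
 * fstat() fails. A failing write() is not treated as an error here,
 * because on hugetlbfs it is expected to fail.
 */
int overwrite_and_stat(const char *path, int64_t *blocks, ssize_t *written)
{
    static const char msg[] = "hello\n";   /* stand-in for the output of date */
    struct stat st;
    int fd;

    /* "> path": create the file empty */
    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0664);
    if (fd < 0)
        return -1;
    close(fd);

    /* "date > path": reopen the existing file with truncation, then write */
    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0664);
    if (fd < 0)
        return -1;
    *written = write(fd, msg, sizeof msg - 1);

    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    close(fd);

    *blocks = (int64_t)st.st_blocks;
    return 0;
}
```

Calling overwrite_and_stat("/huge/foo", &blocks, &written) on an affected kernel would be expected to leave blocks negative; on a fixed kernel, or on an ordinary filesystem, it should be zero or positive.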

Actual results:
   $ > foo
   $ date  > foo
   date: write error: Invalid argument
   $ /usr/bin/stat foo
     File: `foo'
     Size: 0         	Blocks: 18446744073709547520 IO Block: 2097152 regular empty file
where the "Blocks:  18446744073709547520" is -4096 (-4K) as a two's-complement 64-bit integer: 2^64 - 4096, i.e. one 2097152-byte huge page counted in 512-byte blocks.
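
To check the arithmetic behind that reading, here is a small sketch. The function names are made up for illustration; the 512-byte unit is the one st_blocks is reported in on Linux, and 2097152 is the "IO Block" size shown by stat above.

```c
#include <stdint.h>

/*
 * Hypothetical helper: view an unsigned 64-bit block count as a
 * two's-complement signed value without relying on
 * implementation-defined conversions.
 */
int64_t blocks_as_signed(uint64_t st_blocks)
{
    if (st_blocks > (uint64_t)INT64_MAX)
        return -(int64_t)(~st_blocks) - 1;
    return (int64_t)st_blocks;
}

/* Hypothetical helper: number of 512-byte blocks covering one page. */
uint64_t blocks_per_page(uint64_t page_bytes)
{
    return page_bytes / 512;
}
```

With these, blocks_as_signed(18446744073709547520ULL) evaluates to -4096, and blocks_per_page(2097152) evaluates to 4096, so the bogus value corresponds to exactly one huge page subtracted from zero.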

Expected results:
  Size: 0         	Blocks: 0          IO Block: 2097152 regular empty file

Additional info: Writing the file at the same time as creation works:
   $ date  > bar
   date: write error: Invalid argument
   $ /usr/bin/stat bar
     Size: 0         	Blocks: 0          IO Block: 2097152 regular empty file
Comment 1 John Reiser 2009-07-04 18:25:49 EDT
strace can be used to verify that /usr/bin/stat reports the actual value of .st_blocks:
  $ strace -v -e trace=lstat /usr/bin/stat foo
  lstat("foo", {st_dev=makedev(0, 20), st_ino=110497, st_mode=S_IFREG|0664, st_nlink=1, st_uid=500, st_gid=500, st_blksize=2097152, st_blocks=18446744073709547520, st_size=0, st_atime=2009/07/04-15:08:55, st_mtime=2009/07/04-15:08:59, st_ctime=2009/07/04-15:08:59}) = 0

/bin/ls also corroborates the broken .st_blocks:
  $ ls -l
   total 9223372036854773760     ##### bad space utilization
   -rw-rw-r--. 1 jreiser jreiser 0 2009-07-04 15:12 bar
   -rw-rw-r--. 1 jreiser jreiser 0 2009-07-04 15:08 foo
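
The "total" line is consistent with the same bad value. As a rough sketch (ls_total_1k() is a hypothetical name, and this assumes ls is reporting in its default 1024-byte units), the total is the sum of st_blocks over the listed files, converted from 512-byte to 1024-byte units:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: sum st_blocks (512-byte units) over n files and
 * convert to the 1024-byte units assumed for the "total" line.
 * The sum wraps modulo 2^64, like any unsigned 64-bit arithmetic.
 */
uint64_t ls_total_1k(const uint64_t *st_blocks, size_t n)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += st_blocks[i];
    return sum / 2;
}
```

For bar (0 blocks) and foo (18446744073709547520 blocks) this gives 9223372036854773760, matching the listing above.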
Comment 2 Eric Sandeen 2009-07-08 15:38:24 EDT
Ok, this was a regression upstream; I'll send a patch as soon as my smtp server comes back ;)  Here's the fix.  Is this something you need fixed in F11, or is it just an oddity and you can wait 'til it comes around via upstream?

Thanks,
-Eric

---- patch ----

As reported in Red Hat bz #509671, i_blocks accounting for files on
hugetlbfs goes wrong when doing something like:

   $ > foo
   $ date  > foo
   date: write error: Invalid argument
   $ /usr/bin/stat foo
     File: `foo'
     Size: 0          Blocks: 18446744073709547520 IO Block: 2097152 regular
...

This is because hugetlb_unreserve_pages() is unconditionally 
removing blocks_per_huge_page(h) on each call rather than using
the freed amount.  If there were 0 blocks, it goes negative,
resulting in the above.

This is a regression from commit
a5516438959d90b071ff0a484ce4f3f523dc3152

which did:

-	inode->i_blocks -= BLOCKS_PER_HUGEPAGE * freed;
+	inode->i_blocks -= blocks_per_huge_page(h);

so just put back the freed multiplier, and it's all happy again.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: William Lee Irwin III <wli@holomorphy.com>
---

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index d0351e3..cafdcee 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2370,7 +2370,7 @@ void hugetlb_unreserve_pages(struct inode *inode,
 	long chg = region_truncate(&inode->i_mapping->private_list, offset);
 
 	spin_lock(&inode->i_lock);
-	inode->i_blocks -= blocks_per_huge_page(h);
+	inode->i_blocks -= (blocks_per_huge_page(h) * freed);
 	spin_unlock(&inode->i_lock);
 
 	hugetlb_put_quota(inode->i_mapping, (chg - freed));
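
The effect of the one-line change can be modelled in user space. The following is only a sketch under stated assumptions, not kernel code: struct model_inode, unreserve_buggy() and unreserve_fixed() are hypothetical stand-ins, the block counter is modelled as an unsigned 64-bit value, and the 2 MiB huge page size is taken from the stat output in this report.

```c
#include <stdint.h>

/* Assumed: one 2097152-byte huge page in 512-byte blocks. */
#define MODEL_BLOCKS_PER_HUGE_PAGE 4096u

/* Hypothetical stand-in for the i_blocks field of an inode. */
struct model_inode {
    uint64_t i_blocks;
};

/* Pre-fix behaviour: one page's worth of blocks is always subtracted. */
void unreserve_buggy(struct model_inode *ino, long freed)
{
    (void)freed;   /* ignored, which is the bug */
    ino->i_blocks -= MODEL_BLOCKS_PER_HUGE_PAGE;
}

/* Post-fix behaviour: subtract only what was actually freed. */
void unreserve_fixed(struct model_inode *ino, long freed)
{
    ino->i_blocks -= (uint64_t)MODEL_BLOCKS_PER_HUGE_PAGE * (uint64_t)freed;
}
```

Starting from i_blocks == 0 with freed == 0, the buggy variant wraps around to 18446744073709547520, the value seen in the report, while the fixed variant leaves the counter at 0.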
Comment 3 John Reiser 2009-07-10 22:45:20 EDT
(In reply to comment #2)
> Is this something you need fixed in F11, or is
> it just an oddity and you can wait 'til it comes around via upstream?

I can wait until the next upstream release, or until Fedora 12, whichever is sooner.
Comment 4 Eric Sandeen 2009-07-10 23:04:58 EDT
Ok, thanks, I've sent the patch upstream and Andrew said he'd pick it up.  I won't churn the F11 kernel if it's not critical to you.

Thanks for the report and the nice reproducer!

-Eric
