Bug 720876 - [ext4/xfstests 204] fails of ENOSPC blocking tests on small filesystem
Summary: [ext4/xfstests 204] fails of ENOSPC blocking tests on small filesystem
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.8
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Eric Sandeen
QA Contact: Filesystem QE
URL:
Whiteboard:
Depends On: 660638
Blocks:
 
Reported: 2011-07-13 04:13 UTC by Eryu Guan
Modified: 2015-12-28 12:59 UTC
CC List: 7 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of: 660638
Environment:
Last Closed: 2014-06-02 13:05:16 UTC
Target Upstream Version:



Description Eryu Guan 2011-07-13 04:13:08 UTC
Clone for ext4: test 204 sometimes fails on ext4; ext3 does not have this issue.
It seems to fail more reliably on the debug kernel.

It's not a regression; we've seen this failure before but could not reproduce it easily.

Kernel version is 2.6.18-273.el5

+++ This bug was initially created as a clone of Bug #660638 +++

Description of problem:
Xfstests 204 and 205 both fail on the xfs filesystem that they create for testing, because dd reports 'No space left on device'.

Version-Release number of selected component (if applicable):
kernel-2.6.18-233, 2.6.18-235

How reproducible:
Always

Steps to Reproduce:
1. Run tests 204 and 205 on xfs:
TEST_PARAM_RUNTESTS="204 205" make
2. Watch the output
  
Actual results:
'No space left on device', or 'disk full' (which is just a masked 'No space left on device')

Expected results:
Both tests pass

Additional info:
Examples in beaker:
https://beaker.engineering.redhat.com/logs/2010/11/349/34951/68915/773568/2418515///test_log--kernel-filesystems-xfs-xfstests-204.log
https://beaker.engineering.redhat.com/logs/2010/11/349/34951/68915/773568/2418554///test_log--kernel-filesystems-xfs-xfstests-205.log

--- Additional comment from dchinner on 2010-12-07 18:44:38 EST ---

Is this a regression due to commit c8786c0c ("xfs: Fix speculative allocation beyond eof")?

--- Additional comment from branto on 2010-12-08 13:37:12 EST ---

No, it is reproducible in kernel-2.6.18-227, which should not contain the fix. It is even reproducible in kernel-2.6.18-194.
I guess it was not reported earlier because it was considered to be a bug in the test case.

--- Additional comment from dchinner on 2010-12-09 23:07:59 EST ---

(In reply to comment #2)
> No, it is reproducible in kernel-2.6.18-227, which should not contain the fix.
> It is even reproducible in kernel-2.6.18-194.
> I guess it was not reported earlier because it was considered to be a bug in
> the test case.

Ok, so this has already been considered a WONTFIX issue for the RHEL 5 codebase?

--- Additional comment from branto on 2010-12-10 05:50:50 EST ---

(In reply to comment #3)
> Ok, so this has already been considered a WONTFIX issue for the RHEL 5 codebase?

I'm not entirely sure, but my Bugzilla search for WONTFIX RHEL 5 bugs containing the word "xfs" in comments didn't return anything similar to this, so I guess this has not been considered a WONTFIX yet.

Comment 1 Eric Sandeen 2011-07-13 19:41:06 UTC
Do you have any links to the ext4 test failures?

Thanks,
-Eric

Comment 2 Eric Sandeen 2011-07-13 21:05:05 UTC
Ok, I can occasionally hit this:

+./204: line 53: echo: write error: No space left on device
+./204: line 53: echo: write error: No space left on device
 *** done
Ran: 204
Failures: 204
Failed 1 of 1 tests

The fs is full:

  File: "/mnt/scratch"
    ID: 3289da11f01b3f29 Namelen: 255     Type: ext2/ext3
Block size: 4096       Fundamental block size: 4096
Blocks: Total: 25788      Free: 0          Available: 0
Inodes: Total: 26624      Free: 4113

but after a bit it's got a little space again:

  File: "/mnt/scratch/"
    ID: 3289da11f01b3f29 Namelen: 255     Type: ext2/ext3
Block size: 4096       Fundamental block size: 4096
Blocks: Total: 25788      Free: 2130       Available: 799
Inodes: Total: 26624      Free: 4113

so we're not doing as much flushing as we need to at ENOSPC time I guess.  We did get pretty close to full, but not all the way there.



When the test completes normally, it ends with:

  File: "/mnt/scratch/"
    ID: 5e6c61cb40b13e Namelen: 255     Type: ext2/ext3
Block size: 4096       Fundamental block size: 4096
Blocks: Total: 25788      Free: 753        Available: 0
Inodes: Total: 26624      Free: 4113

and a bit later, it has freed still more:

  File: "/mnt/scratch/"
    ID: 5e6c61cb40b13e Namelen: 255     Type: ext2/ext3
Block size: 4096       Fundamental block size: 4096
Blocks: Total: 25788      Free: 2107       Available: 776
Inodes: Total: 26624      Free: 4113

I don't think this is terribly critical; running full is ill-advised, and transient ENOSPC is somewhat acceptable, but it's probably worth looking into a bit.
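
A note on reading the stat output above, where Free: 753 appears alongside Available: 0: ext filesystems hold back a reserve of blocks for root, and the "Available" figure for unprivileged users is the free count minus that reserve, clamped at zero. The sketch below is a user-space illustration only, not kernel code. The names avail_blocks and print_sample are hypothetical, and the reserve of 1331 blocks is an assumption inferred from the outputs above (2130 - 799 and 2107 - 776); it is not reported anywhere in this bug.

```c
#include <stdio.h>

/*
 * Hypothetical helper: blocks available to unprivileged users, given the
 * free block count and the root reserve. Models the clamp-at-zero
 * accounting seen in the stat output; not taken from the kernel source.
 */
unsigned long avail_blocks(unsigned long free_blocks,
                           unsigned long reserved_blocks)
{
        if (free_blocks < reserved_blocks)
                return 0;
        return free_blocks - reserved_blocks;
}

/* Print one sample in the style of the stat output above. */
void print_sample(unsigned long free_blocks, unsigned long reserved_blocks)
{
        printf("Free: %lu  Available: %lu\n", free_blocks,
               avail_blocks(free_blocks, reserved_blocks));
}
```

Under the assumed reserve, avail_blocks(753, 1331) gives 0, and avail_blocks(2107, 1331) gives 776, matching the two samples from the normally completed run.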

Comment 3 Eric Sandeen 2011-07-13 21:53:28 UTC
ext4 makes an attempt to sync data as it nears ENOSPC to clear out any speculative allocation.  In rhel6 & upstream we have:

        if (free_blocks < 2 * dirty_blocks)
                writeback_inodes_sb_if_idle(sb);


but in rhel5 we have:

        if (free_blocks < 4 * dirty_blocks) {
                struct backing_dev_info *bdi;
                /* writeback_inodes_sb_if_idle() upstream */
                bdi = &bdev_get_queue(sb->s_bdev)->backing_dev_info;
                if (!writeback_in_progress(bdi))
                        sync_inodes_sb(sb, 0);
        }

Because RHEL5 doesn't have that upstream infrastructure, I guess it's not working quite the same way; I can't make this test fail on RHEL6.
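
As far as the trigger goes, the two conditions quoted in this comment differ only in the multiplier applied to dirty_blocks. A minimal user-space sketch, assuming a hypothetical helper near_enospc() and made-up block counts, shows that the RHEL5 factor of 4 fires at a higher free-block level than the factor of 2 used in RHEL6 and upstream. It models the comparison only and says nothing about how effective the flush that follows is.

```c
#include <stdbool.h>

/*
 * Hypothetical model of the trigger condition only: returns true when the
 * free block count has dropped below factor * dirty_blocks. The kernel
 * code then starts writeback; that part is not modelled here.
 */
bool near_enospc(unsigned long long free_blocks,
                 unsigned long long dirty_blocks,
                 unsigned int factor)
{
        return free_blocks < (unsigned long long)factor * dirty_blocks;
}
```

With 100 free and 30 dirty blocks (illustrative numbers), near_enospc(100, 30, 4) is true while near_enospc(100, 30, 2) is false, so a lower threshold is not what makes RHEL5 fail more often.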

Comment 4 Eryu Guan 2011-07-23 06:43:53 UTC
(In reply to comment #3)
> Because RHEL5 doesn't have that upstream infrastructure, I guess it's not
> working quite the same way; I can't make this test fail on RHEL6.

RHEL6 has this problem too, but it's harder to hit.
I'll file a new bug to track the RHEL6 issue when I hit this failure next time on RHEL6.

Comment 5 Eryu Guan 2011-07-24 04:16:13 UTC
(In reply to comment #4)
> RHEL6 has this problem too, but it's harder to hit.
> I'll file a new bug to track the RHEL6 issue when I hit this failure next time
> on RHEL6.

I hit it again today on RHEL6.1; please see Bug 725201.

Comment 6 RHEL Program Management 2012-01-09 14:35:19 UTC
This request was evaluated by Red Hat Product Management for inclusion in Red Hat Enterprise Linux 5.8, and Red Hat does not plan to fix this issue in the currently developed update.

Contact your manager or support representative in case you need to escalate this bug.

Comment 8 RHEL Program Management 2012-10-30 05:49:47 UTC
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.

Comment 12 RHEL Program Management 2014-03-07 13:36:58 UTC
This bug/component is not included in scope for RHEL-5.11.0 which is the last RHEL5 minor release. This Bugzilla will soon be CLOSED as WONTFIX (at the end of RHEL5.11 development phase (Apr 22, 2014)). Please contact your account manager or support representative in case you need to escalate this bug.

Comment 13 RHEL Program Management 2014-06-02 13:05:16 UTC
Thank you for submitting this request for inclusion in Red Hat Enterprise Linux 5. We've carefully evaluated the request, but are unable to include it in RHEL5 stream. If the issue is critical for your business, please provide additional business justification through the appropriate support channels (https://access.redhat.com/site/support).

