Bug 745576

Summary: libguestfs (or qemu?) hangs if sparse file runs out of disk space
Product: [Community] Virtualization Tools Reporter: Richard W.M. Jones <rjones>
Component: libguestfs Assignee: Richard W.M. Jones <rjones>
Status: NEW --- QA Contact:
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: unspecified CC: collura, esandeen, virt-maint
Target Milestone: --- Keywords: Reopened
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2025-10-17 00:09:56 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Richard W.M. Jones 2011-10-12 18:32:07 UTC
Description of problem:

On this machine, /tmp has 6.6G of space free.

If I create a sparse file larger than this and then fill it
up using libguestfs, instead of getting an error when the
host runs out of space, I get a hang.

Here is how to reproduce this:

$ cd /tmp
$ df -h /tmp
Filesystem                  Size  Used Avail Use% Mounted on
/dev/mapper/vg_pin-lv_root   45G   38G  6.6G  86% /
                                        ^^^^
  # 6.6G free, so create a sparse file bigger than this.
  # In the example below, I'm using a 10G disk:

$ rm -f test1.img
$ truncate -s 10G test1.img
$ guestfish -a test1.img -x <<EOF
run
part-disk /dev/sda mbr
mkfs ext2 /dev/sda1
mount-options "" /dev/sda1 /
upload /dev/zero /zero
EOF

Eventually this uses up all available space, but instead
of getting an error, it just hangs.
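
A rough way to watch this from a second shell on the host, assuming
the reproducer above, is to compare the file's allocated blocks
against its apparent size:

$ df -h /tmp                         # Avail shrinks towards zero
$ du -h test1.img                    # allocated blocks grow until the host is full
$ du -h --apparent-size test1.img    # apparent size stays at 10G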

Version-Release number of selected component (if applicable):

1.13.20

How reproducible:

100%

Comment 1 Richard W.M. Jones 2011-10-21 14:21:17 UTC
This is not at all trivial to fix.

We can pass the -drive ...,werror=report option to qemu.
However this doesn't do anything useful.
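
For reference, the drive option in question looks roughly like this on
the appliance's qemu command line (the file=, format= and if= values
here are only illustrative; werror=/rerror= are the relevant parts):

  -drive file=/tmp/test1.img,format=raw,if=virtio,werror=report,rerror=report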

ENOSPC errors on the host are passed up to the guest as
I/O errors.

When writing, ext4 simply does not pass I/O errors up to
userspace.  The write(2) and close(2) system calls succeed
as if nothing were wrong, while the kernel message log
fills up with "Buffer I/O error on device vda" errors.

Adding the -o errors=panic mount option also does precisely
nothing.  No panic, behaves same as above.
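
For the record, that attempt amounts to changing a single line in the
guestfish reproducer above (a sketch, nothing more):

  mount-options "errors=panic" /dev/sda1 /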

Comment 2 Eric Sandeen 2013-07-26 16:13:05 UTC
Just saw this one.

ext4 errors=XXX only handles metadata errors; a data IO error just looks like, say, a bad block, and there's no reason to abort the fs.  (Although I'm not sure why we don't hit any metadata errors in this case... detected inconsistencies trip it, but now that I think of it, I'm not sure whether metadata IO errors do; I need to look.)

Anyway, as far as the inner fs is concerned, we're just pushing data to the buffer cache, which succeeds just fine:

# dd if=/dev/zero of=file3 bs=1M count=10
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.147922 s, 70.9 MB/s

If we tried direct IO it'd fail, though, because block allocation fails:

# dd if=/dev/zero of=file3 bs=1M count=10 oflag=direct
dd: writing `file3': Input/output error

As far as the inner fs is concerned, there's still space left in its 4G.  Neither write nor close would be expected to return an error on a buffered write.  fsync, OTOH, should and does return an error:

# xfs_io -f -c "pwrite 0 16m" -c "fsync" mytestfile
wrote 16777216/16777216 bytes at offset 0
16 MiB, 4096 ops; 0.0000 sec (63.332 MiB/sec and 16213.0496 ops/sec)
fsync: Input/output error


ext4 has a mount option to treat data errors more severely:

data_err=ignore(*)      Just print an error message if an error occurs
                        in a file data buffer in ordered mode.
data_err=abort          Abort the journal if an error occurs in a file
                        data buffer in ordered mode.
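
In the guestfish reproducer above one could try this by switching to
ext4 (data_err= only applies to a journalled filesystem in ordered
mode; the ext2 used there has no journal) and changing the mount
line.  An untested sketch:

  mkfs ext4 /dev/sda1
  mount-options "data_err=abort" /dev/sda1 /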


Anyway, handling ENOSPC errors on thinly provisioned storage is definitely something that still needs work...

Comment 3 Red Hat Bugzilla 2025-10-17 00:09:56 UTC
This product has been discontinued or is no longer tracked in Red Hat Bugzilla.

Comment 4 Alasdair Kergon 2025-10-17 12:52:13 UTC
Reopening because Virtualization Tools has not been discontinued.