Description of problem:

On this machine, /tmp has 6.6G of space free. If I create a larger sparse file, then fill it up using libguestfs, instead of getting an error when I run out of space I get a hang.

Here is how to reproduce this:

$ cd /tmp
$ df -h /tmp
Filesystem                  Size  Used Avail Use% Mounted on
/dev/mapper/vg_pin-lv_root   45G   38G  6.6G  86% /
                                         ^^^^

# 6.6G free, so create a sparse file bigger than this.
# In the example below, I'm using a 10G disk:
$ rm -f test1.img
$ truncate -s 10G test1.img

$ guestfish -a test1.img -x <<EOF
run
part-disk /dev/sda mbr
mkfs ext2 /dev/sda1
mount-options "" /dev/sda1 /
upload /dev/zero /zero
EOF

Eventually this uses up all available space, but instead of getting an error, it just hangs.

Version-Release number of selected component (if applicable):
1.13.20

How reproducible:
100%
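(A side note for anyone running the reproducer, not from the report itself: test1.img is sparse, so its apparent size and its actual host-side allocation differ. From a second terminal you can watch the allocation climb until /tmp fills up:

$ ls -lh test1.img    # apparent size: stays at 10G
$ du -h test1.img     # blocks actually allocated: grows during the upload

These are just standard coreutils commands and don't affect the reproducer.)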
This is not at all trivial to fix.

We can pass the -drive ...,werror=report option to qemu. However, this doesn't do anything useful: ENOSPC errors on the host are passed up to the guest as I/O errors, and when writing, ext4 simply does not pass I/O errors up to userspace. The write(2) and close(2) system calls return OK as if nothing were wrong, while the kernel message log fills up with "Buffer I/O error on device vda" errors.

Adding the -o errors=panic mount option also does precisely nothing: no panic; it behaves the same as above.
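For reference, this is roughly the shape of the qemu option involved (a sketch; the exact -drive parameters libguestfs generates may differ):

-drive file=test1.img,cache=writeback,werror=report,rerror=report

qemu's other werror policies are ignore, stop and enospc. werror=enospc (qemu's default) pauses the guest on host ENOSPC instead of reporting it to the guest, which avoids silent corruption but, if it is what libguestfs ends up with, would turn the out-of-space condition into exactly the kind of hang described above.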
Just saw this one. ext4 errors=XXX only handles metadata errors; data I/O errors just look like, say, a bad block, and there's no reason to abort the fs. (Although I'm not sure why we don't hit some metadata errors in this case... Detected inconsistencies trip it, but now that I think of it, I'm not sure if metadata I/O errors do; I need to look.)

Anyway, as far as the inner fs is concerned we're just pushing data to the buffer cache, which succeeds just fine:

# dd if=/dev/zero of=file3 bs=1M count=10
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.147922 s, 70.9 MB/s

If we tried direct I/O it'd fail, though, because block allocation fails:

# dd if=/dev/zero of=file3 bs=1M count=10 oflag=direct
dd: writing `file3': Input/output error

As far as the inner fs is concerned, there's still space left in its 4G. Neither write nor close would be expected to return an error on a buffered write. fsync, OTOH, should and does return an error:

# xfs_io -f -c "pwrite 0 16m" -c "fsync" mytestfile
wrote 16777216/16777216 bytes at offset 0
16 MiB, 4096 ops; 0.0000 sec (63.332 MiB/sec and 16213.0496 ops/sec)
fsync: Input/output error

ext4 does have a mount option to treat data errors more severely:

data_err=ignore(*)  Just print an error message if an error occurs
                    in a file data buffer in ordered mode.
data_err=abort      Abort the journal if an error occurs in a file
                    data buffer in ordered mode.

Anyway, handling ENOSPC errors on thinly provisioned storage is definitely something that still needs work...
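Following on from that, one experiment worth trying (an untested sketch, based on the reproducer from the original report): reformat the guest filesystem as ext4 and mount it with data_err=abort, so a write error on a data buffer in ordered mode aborts the journal instead of just being logged:

$ guestfish -a test1.img -x <<EOF
run
part-disk /dev/sda mbr
mkfs ext4 /dev/sda1
mount-options "data_err=abort" /dev/sda1 /
upload /dev/zero /zero
EOF

Note mkfs ext4 rather than ext2 here: data_err only applies to ext4's ordered journalling mode, and the original reproducer used ext2, which has no journal at all.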