Bug 495683

Summary:

[btrfs] hopeless ENOSPC handling and excessive administration costs [Was: yum and rpm crash with SIGBUS]

Product:

[Fedora] Fedora

Reporter:

Matěj Cepl <mcepl>

Component:

kernel

Assignee:

Josef Bacik <jbacik>

Status:

CLOSED UPSTREAM

QA Contact:

Fedora Extras Quality Assurance <extras-qa>

Severity:

medium

Priority:

low

Version:

CC:

farrellj, ffesti, itamar, jbacik, kernel-maint, mcepl, pmatilai, tomek

Hardware:

All

OS:

Linux

Doc Type:

Bug Fix

Last Closed:

2010-05-05 17:47:53 UTC

Attachments:

Description	Flags
backtrace	none
/var/log/messages	none
/var/log/dmesg	none
screenshot of kernel crash	none
right side of the screenshot	none

Description Matěj Cepl 2009-04-14 12:07:28 UTC

Created attachment 339471 [details]
backtrace

Description of problem:
yum crashes with a SIGBUS signal; see the attached backtrace.
Rebooting helps.
Sometimes it crashes claiming there is not enough space on disk, which is not true (there is still at least 1GB free on each partition of my machine).
Version-Release number of selected component (if applicable):
upgraded to the latest Rawhide (but I cannot give you the exact rpm versions, because rpm crashes ;))

How reproducible:
Once it happens, always; a reboot is needed.

Steps to Reproduce:
1. yum upgrade
  
Actual results:
crash

Expected results:


Additional info:

Comment 1 Panu Matilainen 2009-04-14 12:52:48 UTC

What platform and what filesystem is /var on? When did it start, e.g. did it work with F11-beta and start crashing only later or ...? You should be able to dig out at least some version information from yum.log and such.

Comment 2 Jindrich Novy 2009-04-14 13:12:47 UTC

The reporter said his /var partition is btrfs. At first glance there is no apparent bug in RPM, and from the nature of the error it seems filesystem-related. Let's ask the kernel/btrfs guys whether this could be btrfs-related.

Comment 3 Matěj Cepl 2009-04-14 13:28:21 UTC

Yes, that could be interesting as well. /var is BTRFS, attaching /var/log/messages, /var/log/dmesg. Anything else?

Comment 4 Matěj Cepl 2009-04-14 13:33:51 UTC

Created attachment 339488 [details]
/var/log/messages

Comment 5 Matěj Cepl 2009-04-14 13:37:55 UTC

Created attachment 339489 [details]
/var/log/dmesg

Comment 6 Matěj Cepl 2009-04-14 13:40:06 UTC

... and yes, this worked pretty well from F11Beta on, with constant updates, until Friday or something like that.

Comment 7 Matěj Cepl 2009-04-14 13:43:04 UTC

... and yes, df -h claims that there is 2.1GB of free space on /var

Comment 8 Panu Matilainen 2009-04-14 13:55:18 UTC

Whatever df says, your /var/log/messages has a pile of "no space left" errors, including:
Apr 14 15:09:18 viklef auditd[3737]: Record was not written to disk (No space left on device)
Apr 14 15:09:18 viklef auditd[3737]: Audit daemon has no space left on logging partition
Apr 14 15:09:18 viklef auditd[3737]: Audit daemon is suspending logging due to no space left on logging partition.

Other "fun" in there includes:
Apr 14 15:08:12 viklef kernel: btrfsck[3597]: segfault at 110 ip 0000000000402c49 sp 00007ffffbc3dfa0 error 4 in btrfsck[400000+21000]
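
A minimal sketch of how such lines could be pulled out of a log, assuming only that the out-of-space messages contain the phrase "no space left" in some capitalization (as the auditd lines quoted above do). The function name `find_enospc_lines` and the `sample` list are made up for illustration; the sample entries are copied from the excerpt above.

```python
import re

# Case-insensitive match covers both "No space left on device" and
# "no space left on logging partition" as seen in the quoted messages.
ENOSPC_PATTERN = re.compile(r"no space left", re.IGNORECASE)


def find_enospc_lines(lines):
    """Return the log lines that mention running out of space."""
    return [line.rstrip("\n") for line in lines if ENOSPC_PATTERN.search(line)]


# Sample lines copied from the /var/log/messages excerpt in this comment.
sample = [
    "Apr 14 15:09:18 viklef auditd[3737]: Record was not written to disk (No space left on device)\n",
    "Apr 14 15:09:18 viklef auditd[3737]: Audit daemon has no space left on logging partition\n",
    "Apr 14 15:08:12 viklef kernel: btrfsck[3597]: segfault at 110 ip 0000000000402c49 "
    "sp 00007ffffbc3dfa0 error 4 in btrfsck[400000+21000]\n",
]

for hit in find_enospc_lines(sample):
    print(hit)
```

To scan a real log, an open file object can be passed in place of `sample`, since the function only iterates over lines.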

Comment 9 Josef Bacik 2009-04-14 14:58:54 UTC

This is one of the gotchas with btrfs: it doesn't handle ENOSPC well at all.  The way btrfs works is that it carves chunks of your disk out for different things, mainly metadata and data.  So df will say you have 2GB of space left, but that's likely all data space while the metadata area is fully used, which is what your messages suggest.  If you are going to use btrfs, make sure you don't go past 75% full, just to be on the safe side.
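
The effect described here can be illustrated with a deliberately simplified toy model. This is not btrfs's allocator: the class `ToyChunkedFS`, the two fixed pools, and the 4096-byte `metadata_cost` are all assumptions invented for the sketch. It only shows how a df-style number can stay large while every write fails.

```python
import errno


class ToyChunkedFS:
    """Toy model: space is pre-split into a data pool and a metadata pool."""

    def __init__(self, data_bytes, metadata_bytes):
        self.data_free = data_bytes
        self.metadata_free = metadata_bytes

    def df_free(self):
        # Assumption for this toy: the df-style figure looks at the data pool only.
        return self.data_free

    def create_file(self, size, metadata_cost=4096):
        # Every new file consumes some metadata as well as data
        # (metadata_cost is a hypothetical figure).
        if metadata_cost > self.metadata_free or size > self.data_free:
            raise OSError(errno.ENOSPC, "No space left on device")
        self.metadata_free -= metadata_cost
        self.data_free -= size


# 2 GiB of data space still free, but the metadata pool is exhausted.
fs = ToyChunkedFS(data_bytes=2 * 1024**3, metadata_bytes=0)
print("df-style free bytes:", fs.df_free())
try:
    fs.create_file(5)
except OSError as exc:
    print("write failed:", exc.strerror)
```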

Comment 10 Josef Bacik 2009-04-14 20:50:19 UTC

OK, my previous post was not clear.  I'm speaking of btrfs as it stands right now, in all of its buggy glory.  btrfs does not have proper ENOSPC handling, so bad/unexpected things will happen if you start going past the 75% full mark.  This will change in the future, but currently you do this at your own risk.
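
A rough way to watch for the 75% mark is to compute usage from `os.statvfs` (Unix only). This is a sketch under assumptions: the helper names `percent_used` and `over_threshold` are made up here, and the figure comes from the same overall counters df reads, so per the comments above it may still look fine on btrfs when metadata space is already exhausted.

```python
import os


def percent_used(path):
    """Percentage of the filesystem at `path` that is in use, per statvfs."""
    st = os.statvfs(path)
    total = st.f_blocks * st.f_frsize
    free = st.f_bfree * st.f_frsize
    if total == 0:
        # Some pseudo filesystems report zero blocks.
        return 0.0
    return 100.0 * (total - free) / total


def over_threshold(path, limit=75.0):
    """True when usage exceeds `limit` percent (75 is the rule of thumb above)."""
    return percent_used(path) > limit


print("/ is %.1f%% used, over 75%%: %s" % (percent_used("/"), over_threshold("/")))
```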

Comment 11 Matěj Cepl 2009-04-14 21:24:27 UTC

Created attachment 339581 [details]
screenshot of kernel crash

So, I switched to runlevel 1 and tried btrfsck /dev/vg00/lvVar, and that worked flawlessly. Encouraged by this success, I tried btrfs-vol -b /var and I got this ... :)

Comment 12 Matěj Cepl 2009-04-14 21:27:36 UTC

Created attachment 339583 [details]
right side of the screenshot

Comment 13 Matěj Cepl 2009-04-15 04:57:04 UTC

Well, after the next reboot btrfs-vol -b /var didn't crash, but it didn't work either ... after 6+ hours of running on a 10GB LVM LV (without any report about anything ... it could really use some --verbose option) I rebooted, because I needed the computer. Is it supposed to be THAT slow, or did I get into some kind of endless loop or something? What information could help you? strace?

Comment 14 Matěj Cepl 2009-04-16 07:34:15 UTC

Just want to confirm that after I did rm /var/cache/yum/*/packages/*.rpm the system behaves sanely (including yum).

Comment 15 Josef Bacik 2009-04-16 12:24:25 UTC

Hey, that's good news, I would have expected rm to crap its pants.

Comment 16 Chuck Ebbert 2009-04-20 17:02:58 UTC

Why is this marked as an F11 release blocker? btrfs is still highly experimental...

Comment 17 Bug Zapper 2009-06-09 13:47:50 UTC

This bug appears to have been reported against 'rawhide' during the Fedora 11 development cycle.
Changing version to '11'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 18 Jason Farrell 2009-10-20 19:57:07 UTC

I just ran into this problem during a modestly-sized f12beta yum update where there was enough space to finish, but apparently not enough for btrfs in low-diskspace situations.

kernel claims I'm out of space with 27% still free:

# df -hT /
Filesystem    Type    Size  Used Avail Use% Mounted on
/dev/sda2    btrfs    3.7G  3.1G  659M  83% /
# echo asdf > asdf
bash: echo: write error: No space left on device
# tail -1 /var/log/messages
Oct 20 14:43:20 localhost kernel: no space left, need Oct 20 14:49:44 localhost kernel: no space left, need 4096, 0 delalloc bytes, 3101159424 bytes_used, 4096 bytes_reserved, 0 bytes_pinned, 0 bytes_readonly, 34668544 may use 3135832064 total

I'll be using ext4 again on my netbook for f12 final, since I can't really justify putting a larger ssd in it that btrfs won't puke on (currently)

Comment 19 Jason Farrell 2009-10-20 20:03:27 UTC

correction: 17% free (not 27%), but still, with 659M free, it shouldn't be claiming to be out of space. not until near-zero, like ext.

Comment 20 Jason Farrell 2009-10-20 20:06:04 UTC

and as far as I can tell, btrfs has no equivalent of ext3/4's default 5% reserved blocks.
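
For comparison, the reserve on a mounted filesystem can be estimated from `os.statvfs` (Unix only): `f_bfree` counts all free blocks, while `f_bavail` counts only those available to unprivileged users, so the difference approximates the reserved area. The helper name `reserved_bytes` is made up for this sketch, and what a given filesystem reports in these fields is up to its driver.

```python
import os


def reserved_bytes(path):
    """Free space that statvfs reports as withheld from unprivileged users."""
    st = os.statvfs(path)
    # f_bfree: all free blocks; f_bavail: free blocks usable by non-root.
    return (st.f_bfree - st.f_bavail) * st.f_frsize


print("reserved on /:", reserved_bytes("/"), "bytes")
```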

Comment 21 Josef Bacik 2009-10-20 20:07:00 UTC

What kernel are you running?  I just recently pushed a whole host of ENOSPC fixes that should fix this problem.

Comment 22 Josef Bacik 2009-10-20 20:12:30 UTC

Well shit, it looks like a new kernel hasn't been built since I committed those changes.  Whenever a new kernel gets built it should fix this problem.  Please note, though, that the way btrfs works is it carves out chunks for data and metadata, but df only reports data.  So even with the fully functioning ENOSPC patches you will still see df saying you have, say, 3-5% free when really that space is in use by metadata.

Comment 23 Jason Farrell 2009-10-20 20:40:56 UTC

I'll keep an eye out for btrfs in future kernel changelogs, and test the lowlimit again, thanks.

IMO, if a filesystem reports it has 5% free space, it should mean very close to that *5%* is free for user data, as this is what everyone expects. If this isn't possible, for whatever reason, then perhaps there *should* be a way to tune a reserved metadata percentage as off-limits (even to root, as opposed to ext).

Comment 24 Josef Bacik 2009-10-20 20:44:21 UTC

Yes, well, welcome to using a development fs.  We're still trying to decide whether to tell the user they have 100% used if they've used all their data space or not, since it's a much more complex question to answer than df allows.

Comment 25 Jason Farrell 2009-10-22 16:14:20 UTC

Unfortunately the new kernel (2.6.31.5-91.rc1.fc12.i686.PAE) with the ENOSPC btrfs patches didn't help matters much. I'm still running out of space at about the same point, with 617M still being reported as "free" (out of 3.7G). Is that much space really being reserved for metadata as a minimum?

[root@aao ~]# tail -1 /var/log/messages
Oct 22 12:10:19 localhost kernel: no space left, need 131072, 8192 delalloc bytes, 3135758336 bytes_used, 0 bytes_reserved, 0 bytes_pinned, 0 bytes_readonly, 0 may use 3135832064 total
[root@aao ~]# df -hT / ; mount|grep btr
Filesystem    Type    Size  Used Avail Use% Mounted on
/dev/sda2    btrfs    3.7G  3.1G  617M  84% /
/dev/sda2 on / type btrfs (rw,ssd_spread)

(If it makes a difference, I am using the ssd_spread mount option (old, slow ssd))
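
The kernel message can be checked by hand. The sketch below parses the "no space left" lines quoted in this report and subtracts the in-use counters from the total. Two things are assumptions here: that each field means what its name suggests, and that remaining room is simply total minus used, reserved, pinned, readonly and "may use" (the kernel's real accounting may differ). The names `parse_enospc` and `room_left` are made up for illustration.

```python
import re

FIELDS = re.compile(
    r"no space left, need (?P<need>\d+), (?P<delalloc>\d+) delalloc bytes, "
    r"(?P<bytes_used>\d+) bytes_used, (?P<bytes_reserved>\d+) bytes_reserved, "
    r"(?P<bytes_pinned>\d+) bytes_pinned, (?P<bytes_readonly>\d+) bytes_readonly, "
    r"(?P<may_use>\d+) may use (?P<total>\d+) total"
)


def parse_enospc(line):
    """Return the numeric fields of a 'no space left' kernel line, or None."""
    match = FIELDS.search(line)
    if match is None:
        return None
    return {name: int(value) for name, value in match.groupdict().items()}


def room_left(fields):
    # Assumed accounting: everything not used, reserved, pinned, read-only
    # or earmarked ("may use") counts as room.
    in_use = (fields["bytes_used"] + fields["bytes_reserved"]
              + fields["bytes_pinned"] + fields["bytes_readonly"]
              + fields["may_use"])
    return fields["total"] - in_use


# Lines copied from the Oct 22 and Oct 20 log excerpts in this report.
OCT22 = ("Oct 22 12:10:19 localhost kernel: no space left, need 131072, "
         "8192 delalloc bytes, 3135758336 bytes_used, 0 bytes_reserved, "
         "0 bytes_pinned, 0 bytes_readonly, 0 may use 3135832064 total")
OCT20 = ("Oct 20 14:49:44 localhost kernel: no space left, need 4096, "
         "0 delalloc bytes, 3101159424 bytes_used, 4096 bytes_reserved, "
         "0 bytes_pinned, 0 bytes_readonly, 34668544 may use 3135832064 total")

for line in (OCT22, OCT20):
    fields = parse_enospc(line)
    print("need", fields["need"], "room", room_left(fields))
```

Under that accounting the Oct 22 line leaves 73728 bytes against a request for 131072, and the Oct 20 line leaves 0 against a request for 4096. The total of 3135832064 bytes is about 2.9 GiB, noticeably less than the 3.7G that df reports for the device.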

Comment 26 Josef Bacik 2009-10-22 16:40:06 UTC

Ok you really are out of space then, it's working as expected.  That last 17% is reserved for metadata.

Comment 27 Josef Bacik 2009-10-22 16:40:52 UTC

btw you can run btrfs-vol -b on your device to rebalance it and you will be able to reclaim some of that space, but only so much as is free and not actually being used by metadata.

Comment 28 Jason Farrell 2009-10-22 20:57:51 UTC

Thanks Josef... My initial "btrfs-vol -b /" resulted in a kernel panic. I subsequently freed a bunch of space that was just there to find the limit, then btrfsck'd, rebalanced again, and it completed without error in 37mins.

In any case, it's a shame that btrfs isn't currently suited very well for small devices -- mostly in 1st-gen netbooks or CF-IDE adapters -- where every byte counts. btrfs' purported performance edge on SSDs doesn't quite outweigh the loss of 650M out of 3.7G to metadata. I'll probably return to ext4 for f12 final, and save btrfs for when I eventually buy a more recent 80GB+ indilinx or intel SSD.

Comment 29 Matěj Cepl 2009-10-23 17:25:59 UTC

I tried btrfs-vol -b /
on my testing Rawhide machine and this
http://mcepl.fedorapeople.org/tmp/btrfs-vol-crap/ is the result :)

Using
kernel 2.6.31.1-56.fc12
btrfs-progs 0.19-9

Comment 30 Matěj Cepl 2009-10-24 17:05:41 UTC

So, after surviving a collision with bug 530108, I have managed to install kernel-2.6.31.5-91.rc1.fc12 and the behavior of BTRFS is significantly better -- btrfsck ends without errors.

However, how long should it take for btrfs-vol -b to finish on a 6GB drive living on LVM in a KVM virtual machine? Mine didn't finish even after 11 hours of occupying almost 100% CPU of the virtual machine and almost that much on the host.

I tried to attach strace to it, but it failed with the error message saying that pattach was forbidden (even when strace was run as root with SELinux in the Permissive mode), and gdb just froze on "Attaching to the process <number>".

Next orders, sir?

Comment 31 Bug Zapper 2010-04-27 13:40:35 UTC

This message is a reminder that Fedora 11 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 11.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '11'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 11's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 11 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 32 Josef Bacik 2010-05-05 17:47:53 UTC

Ok things have changed greatly upstream, I'm going to close this bz.  Try F13, it should work much better when it comes to -ENOSPC.