Bug 759304

Summary:

btrfs: poor performance

Product:

[Fedora] Fedora

Reporter:

Jan Kratochvil <jan.kratochvil>

Component:

kernel

Assignee:

Zach Brown <zab>

Status:

CLOSED INSUFFICIENT_DATA

QA Contact:

Fedora Extras Quality Assurance <extras-qa>

Severity:

medium

Docs Contact:

Priority:

unspecified

Version:

CC:

blair, gansalmon, itamar, jan.kratochvil, jforbes, jonathan, kernel-maint, kxra, madhu.chinakonda, sweil

Target Milestone:

---

Keywords:

Reopened

Target Release:

---

Hardware:

x86_64

OS:

Linux

Whiteboard:

Fixed In Version:

Doc Type:

Bug Fix

Doc Text:

Story Points:

---

Clone Of:

Environment:

Last Closed:

2012-11-14 14:58:35 UTC

Attachments:

Description	Flags
xz of libbfd.a	none

Description Jan Kratochvil 2011-12-01 23:23:45 UTC

Description of problem:
Installed my test server on top of btrfs but it is not usable as the testsuites are timing out.

Version-Release number of selected component (if applicable):
kernel-3.1.2-1.fc16.x86_64

How reproducible:
Steadily.

Steps to Reproduce:
nice -n20 ionice -c3 du &>/dev/null &
time ls -l /bin

Actual results:
real	0m12.114s
Waiting 10+ seconds for a command to finish is common.

Also the btrfs processes seem to use CPU a lot:
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 1441 root      20   0     0    0    0 R 56.9  0.0  28:49.98 btrfs-endio-wri
  629 root      20   0     0    0    0 R 54.9  0.0  23:18.88 btrfs-delayed-m
29126 root      20   0     0    0    0 R 53.0  0.0   9:30.76 flush-btrfs-4

FYI using it on top of LUKS on HDD.
My other machine with SSD runs fine but it is not the test server.

Expected results:
less than 1 sec

Additional info:
I was using ext3 on F14 before, and with nice/ionice the system was perfectly usable even for interactive work.  The machine was more usable even without nice/ionice.
A GDB testsuite run (3 parallel runs) now gets 13 (!) timeout results.
I will have to reinstall it on ext4 as the results are not usable this way.
Thanks for all the work; this is just FYI in case you have some patches.

Comment 1 Jan Kratochvil 2011-12-04 12:48:24 UTC

I found that even in single-user mode
  cp -ax --sparse=always / /new/ &
runs at only ~5MB/s (the disk does ~100MB/s), copying my 600GB btrfs->ext4 for 2 days(!).

top - 14:45:33 up 14:43,  2 users,  load average: 2.75, 2.93, 2.97
Mem:   5982924k total,  5718436k used,   264488k free,   381000k buffers
Swap:        0k total,        0k used,        0k free,  4793812k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 1529 root      20   0     0    0    0 R 27.3  0.0 204:52.57 btrfs-delayed-m
  618 root      20   0     0    0    0 R 10.3  0.0  41:29.13 btrfs-transacti
 4106 root      20   0     0    0    0 S  7.7  0.0   0:46.03 kworker/6:3
 1483 root      20   0  127m  11m  500 S  2.6  0.2  21:07.71 cp
  610 root      20   0     0    0    0 S  2.3  0.0  11:39.47 btrfs-endio-0
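
For scale, the figures above (600GB to copy, ~5MB/s observed, ~100MB/s raw disk speed) can be turned into rough copy times. This is only a sketch that assumes a steady rate and 1GB = 1000MB; copy_hours is a hypothetical helper name, not part of any tool.

```shell
# copy_hours SIZE_GB RATE_MB_S
# Whole hours (rounded down) needed to copy SIZE_GB at a steady RATE_MB_S.
# Assumes 1 GB = 1000 MB; real throughput will fluctuate.
copy_hours() {
    echo $(( $1 * 1000 / $2 / 3600 ))
}

copy_hours 600 5      # observed btrfs rate: prints 33
copy_hours 600 100    # raw disk rate: prints 1
```

So at the observed rate the copy is a matter of days, while the raw disk speed would finish it in under two hours.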

Comment 2 Dave Jones 2012-03-22 16:42:15 UTC

[mass update]
kernel-3.3.0-4.fc16 has been pushed to the Fedora 16 stable repository.
Please retest with this update.

Comment 3 Dave Jones 2012-03-22 16:46:49 UTC

[mass update]
kernel-3.3.0-4.fc16 has been pushed to the Fedora 16 stable repository.
Please retest with this update.

Comment 5 Jan Kratochvil 2012-06-03 19:19:34 UTC

kernel-3.3.6-3.fc16.x86_64
Still poor performance; this time a mock --install of many packages made the machine unresponsive for several minutes on an X220 with an Intel SSD.

top - 21:17:05 up 12 days,  5:19, 24 users,  load average: 3.52, 1.88, 1.24
Tasks: 359 total,   2 running, 355 sleeping,   2 stopped,   0 zombie
Cpu0  :  0.0%us,  4.9%sy,  0.0%ni, 94.2%id,  0.0%wa,  0.0%hi,  1.0%si,  0.0%st
Cpu1  :  1.0%us,  4.9%sy,  0.0%ni, 93.1%id,  0.0%wa,  0.0%hi,  1.0%si,  0.0%st
Cpu2  :  0.0%us, 84.2%sy,  0.0%ni,  0.0%id, 14.9%wa,  1.0%hi,  0.0%si,  0.0%st
Cpu3  :  0.0%us,  5.9%sy,  0.0%ni, 94.1%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   7928160k total,  7644644k used,   283516k free,      104k buffers
Swap:  8910844k total,  1056624k used,  7854220k free,  5729560k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 3112 root      20   0     0    0    0 R 65.2  0.0   3:10.81 flush-btrfs-4
30864 root      20   0     0    0    0 S 18.8  0.0   0:01.83 kworker/2:0
  397 root      20   0 15504 1552 1016 R  2.0  0.0   0:00.07 top
31031 root      20   0     0    0    0 S  1.0  0.0   0:00.81 kworker/1:1

Comment 6 Josef Bacik 2012-06-04 13:51:45 UTC

Can you try this git tree and tell me if it works better?


git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-next.git

Comment 7 Jan Kratochvil 2012-06-04 14:04:39 UTC

It would be easier with a kernel-next.rpm build.

Comment 8 Jan Kratochvil 2012-06-10 16:45:45 UTC

Tried:
commit af1e297c1beea9d424c09ec0b120226c6b21680d
Author: Josef Bacik <josef>
Date:   Fri Jun 8 15:26:47 2012 -0400

and it works pretty well now, the same as kernel-3.3.6-3.fc16.x86_64; I do not see a difference.  There are sometimes lockups, but only up to ~3 seconds.

But kernel-3.3.6-3.fc16.x86_64 has also worked pretty well now compared to Comment 5.
In Comment 5 the machine had been running for 12 days, using swap etc.; on a freshly booted box I cannot reproduce it.

kernel-3.3.6-3.fc16.x86_64 seems to be better than the Comment 0 kernel,
kernel-3.1.2-1.fc16.x86_64.

Also I do not have a comparable ext4 box.

Comment 9 Jan Kratochvil 2012-06-10 16:53:12 UTC

Two cases where the system was not very responsive during mock --install on the upstream GIT snapshot kernel:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
19881 root      20   0     0    0    0 D  4.0  0.0   0:04.40 kworker/1:0
13981 root      20   0     0    0    0 S  2.6  0.0   0:02.22 kworker/2:1
 1656 root      20   0     0    0    0 S  1.4  0.0   0:12.56 flush-btrfs-4
12012 root      20   0     0    0    0 D  1.2  0.0   0:03.61 btrfs-endio-wri
16584 root      20   0  118m 1284 1052 D  1.2  0.0   0:00.94 tar
18564 root      20   0     0    0    0 S  1.2  0.0   0:07.15 kworker/0:2
18732 root      20   0     0    0    0 D  1.2  0.0   0:01.62 btrfs-endio-wri
16348 root      20   0     0    0    0 S  1.1  0.0   0:03.27 kworker/3:2
12510 root      20   0  123m  19m 7668 S  1.0  0.2   0:39.05 Xorg
18733 root      20   0     0    0    0 D  1.0  0.0   0:01.40 btrfs-endio-wri
16585 root      20   0 35724  568  460 S  0.8  0.0   0:02.31 pigz
18734 root      20   0     0    0    0 D  0.8  0.0   0:01.38 btrfs-endio-wri
12007 root      20   0     0    0    0 S  0.7  0.0   0:02.52 btrfs-worker-2
14005 root      20   0     0    0    0 D  0.7  0.0   0:01.90 btrfs-endio-wri
16355 root      20   0     0    0    0 D  0.4  0.0   0:01.80 btrfs-endio-wri
  554 root      20   0     0    0    0 S  0.3  0.0   0:03.26 btrfs-submit-1

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
15007 root      20   0     0    0    0 S  6.5  0.0   0:01.13 kworker/2:0
12012 root      20   0     0    0    0 S  2.0  0.0   0:04.02 btrfs-endio-wri

Comment 10 Jan Kratochvil 2012-08-08 15:50:58 UTC

Tried to write out a 2GB file on an Intel SSD with mencoder and the system was locked up for ~15 minutes, maybe 20 minutes:

kernel-3.4.7-1.fc16.x86_64

top - 17:36:53 up  8:40, 21 users,  load average: 14.25, 7.59, 3.28
Tasks: 352 total,   4 running, 348 sleeping,   0 stopped,   0 zombie
Cpu0  :  4.0%us, 19.8%sy,  0.0%ni, 40.6%id, 32.7%wa,  0.0%hi,  3.0%si,  0.0%st
Cpu1  :  0.0%us, 15.8%sy,  0.0%ni, 56.4%id, 26.7%wa,  0.0%hi,  1.0%si,  0.0%st
Cpu2  :  0.0%us, 50.0%sy,  0.0%ni, 37.0%id, 13.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  :  0.0%us, 16.0%sy,  0.0%ni, 75.0%id,  9.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   7928068k total,  7726608k used,   201460k free,       48k buffers
Swap:  8910844k total,   112608k used,  8798236k free,  5940344k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 1169 root      20   0     0    0    0 R 22.8  0.0   0:58.49 flush-btrfs-4
 5436 root      20   0     0    0    0 R  9.9  0.0   0:06.05 kworker/2:0
 5446 lace      20   0  326m  35m 4752 D  4.0  0.5   0:20.03 mencoder
 5408 root      20   0     0    0    0 S  3.0  0.0   0:05.45 kworker/3:1

Moreover, even after the file was closed/written at 17:41:20.200789259, the system was still unusable until 17:47:34, i.e. for ~6 minutes.

All applications were locked up most of the time.

Comment 11 Jan Kratochvil 2012-08-20 14:03:32 UTC

GDB builds are also too slow:

$ cat /proc/`pidof ranlib`/stack
[<ffffffffa0172715>] wait_current_trans+0xa5/0x110 [btrfs]
[<ffffffffa0173cc8>] start_transaction+0x128/0x330 [btrfs]
[<ffffffffa01741b3>] btrfs_start_transaction+0x13/0x20 [btrfs]
[<ffffffffa017f70d>] btrfs_rename+0x15d/0x680 [btrfs]
[<ffffffff81190aa6>] vfs_rename+0x2f6/0x4b0
[<ffffffff81193d33>] sys_renameat+0x1f3/0x220
[<ffffffff81193d7b>] sys_rename+0x1b/0x20
[<ffffffff816047e9>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff

kernel-3.4.7-1.openvpn.fc16.x86_64
(rebuilt with Bug 834808 Comment 4 patch for nfs-openvpn lock ups)
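
To see whether a stuck process keeps waiting in the same place, the stack above can be sampled repeatedly and reduced to function names. A minimal sketch, assuming the /proc/PID/stack line format shown above and GNU sed; frames and blocked_in are hypothetical helper names.

```shell
# frames: read /proc/<pid>/stack text on stdin, print one function name per line.
# Lines without a "symbol+0xoffset" part (such as the trailing
# 0xffffffffffffffff entry) are skipped.
frames() {
    sed -n 's/^\[<[0-9a-f]*>] \([A-Za-z0-9_.]*\)+0x.*/\1/p'
}

# blocked_in PID: print the innermost kernel frame of PID.
# Reading /proc/<pid>/stack usually requires root.
blocked_in() {
    frames < "/proc/$1/stack" | head -n 1
}
```

Running blocked_in "$(pidof ranlib)" once a second during a hang would show whether the process stays in wait_current_trans, as in the trace above.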

Comment 12 Jan Kratochvil 2012-10-06 20:54:13 UTC

Created attachment 622824 [details]
xz of libbfd.a

Reproducer:
kernel-3.5.4-2.fc17.x86_64
$ sync; time cp -p /tmp/libbfd.a /tmp/libbfd2.a; time sync
btrfs SSD i7-2620M: real 0m0.076s + real 1m7.290s
ext4  HDD i7-920  : real 0m0.343s + real 0m3.190s
$ sync; time ranlib /tmp/libbfd.a; time sync
btrfs SSD i7-2620M: real 1m6.679s + real 0m1.291s
ext4  HDD i7-920  : real 0m0.364s + real 0m2.335s
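
The cp part of the reproducer above can be wrapped in a small script so the copy time and the following sync time are reported separately on any box. This is a sketch only: it assumes GNU date (for %N), and elapsed_ms and copy_then_sync are hypothetical helper names.

```shell
# elapsed_ms CMD [ARGS...]
# Run the command and print its wall-clock time in milliseconds.
# Assumes GNU date, which supports %N (nanoseconds).
elapsed_ms() {
    local start end
    start=$(date +%s%N)
    "$@" >/dev/null 2>&1
    end=$(date +%s%N)
    echo $(( (end - start) / 1000000 ))
}

# copy_then_sync SRC DST
# Print "cp_ms sync_ms" for one copy of SRC to DST.
copy_then_sync() {
    local src=$1 dst=$2 cp_ms sync_ms
    sync    # flush earlier dirty data so it is not charged to this run
    cp_ms=$(elapsed_ms cp -p "$src" "$dst")
    sync_ms=$(elapsed_ms sync)
    echo "$cp_ms $sync_ms"
}
```

For example, copy_then_sync /tmp/libbfd.a /tmp/libbfd2.a prints two numbers corresponding to the two "real" times quoted above, in milliseconds.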

Comment 13 Jan Kratochvil 2012-10-07 13:08:35 UTC

ranlib hangs and stops the whole system on:
rename("/tmp/stbkRwSX", "/tmp/libbfd.a"

Comment 14 Dave Jones 2012-10-23 15:30:45 UTC

# Mass update to all open bugs.

Kernel 3.6.2-1.fc16 has just been pushed to updates.
This update is a significant rebase from the previous version.

Please retest with this kernel, and let us know if your problem has been fixed.

In the event that you have upgraded to a newer release and the bug you reported
is still present, please change the version field to the newest release you have
encountered the issue with.  Before doing so, please ensure you are testing the
latest kernel update in that release and attach any new and relevant information
you may have gathered.

If you are not the original bug reporter and you still experience this bug,
please file a new report, as it is possible that you may be seeing a
different problem. 
(Please don't clone this bug, a fresh bug referencing this bug in the comment is sufficient).

Comment 15 Justin M. Forbes 2012-11-14 14:58:35 UTC

With no response, we are closing this bug under the assumption that it is no longer an issue. If you still experience this bug, please feel free to reopen the bug report.

Comment 16 kxra 2013-04-02 14:46:59 UTC

In this mailing list thread you claim that this bug was closed "without any human reply", but the only reason it was closed is that you did not post to say whether or not it is still a bug.
https://lists.fedoraproject.org/pipermail/devel/2013-January/176334.html

If it is still a bug, it definitely needs to be fixed. Can you please retest?

Comment 17 Jan Kratochvil 2013-04-02 15:00:48 UTC

I retested it in Comment 5 and Comment 8 and there was no change.

Automatically asking for mass retesting of all the open Bugs (see for example Comment 3) is "not friendly" of the kernel maintainers.  Retesting makes sense if the maintainer is aware of a specific fix (which s/he should mention) which could fix the bug.

And I even provided an easy self-contained reproducer in Comment 12 so that the maintainer can retest it on his/her own.  So there should be no retest request to the reporter until the maintainer believes s/he has fixed it.  S/he needs a reproducer for the fix anyway.  There was not even a confirmation that Comment 12 makes the bug reproducible for the maintainer.

And I cannot even (easily) test it anymore, as I had to switch both my boxes back to ext4 because I do not have time to wait 15+ minutes for a simple compilation.