Bug 984236
Summary: out of memory failure on lvm2 thinp volume when creating filesystems
Product: [Fedora] Fedora
Component: lvm2
Status: CLOSED EOL
Severity: unspecified
Priority: unspecified
Version: 19
Hardware: Unspecified
OS: Unspecified
Reporter: Chris Murphy <bugzilla>
Assignee: Zdenek Kabelac <zkabelac>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: agk, bmarzins, bmr, bugzilla, dwysocha, gansalmon, heinzm, itamar, jonathan, kernel-maint, lvm-team, madhu.chinakonda, msnitzer, prajnoha, prockai, zkabelac
Doc Type: Bug Fix
Type: Bug
Last Closed: 2015-02-18 13:59:54 UTC

Created attachment 773172 [details]
journalctl -xb
Before mkfs.btrfs:

    [root@f19s ~]# free -m
                 total       used       free     shared    buffers     cached
    Mem:          3936       1416       2519          0         33        334
    -/+ buffers/cache:       1048       2888
    Swap:         7999          0       7999

Within a minute after mkfs.btrfs:

    [root@f19s ~]# free -m
                 total       used       free     shared    buffers     cached
    Mem:          3936       3848         87          0          0          8
    -/+ buffers/cache:       3840         95
    Swap:         7999         53       7946

Shell hangs, and then ssh connection is closed by remote host shortly thereafter.

Created attachment 773194 [details]
journalctl -xb multi-user.target
Attachments 773171 and 773172 are from a single-user command-line boot. This one is from multi-user.target, which takes quite a bit longer to recover from.

Created attachment 773207 [details]
journalctl -b (mkfs.xfs)
Under multi-user.target, I'm able to get it to happen with mkfs.xfs also. So I don't think this is btrfs-specific. I think it could be an LVM thinp issue.

This isn't reproducing in qemu-kvm pointed to a 16TB (thin provisioned) qcow2 file on the same baremetal machine as prior tests. Works with both xfs and btrfs, and the host free memory doesn't go below 1500MB, including what's consumed for the VM. So I think this is an LVM thinp bug. Either it should work as well as a qcow2 file for a VM, or I should get some kind of "not possible" message when creating the 16TB virtual LV.

Hmm looks like dmeventd memory leak (some leaks were fixed in upstream)
Please could you try to reproduce without monitoring (lvm.conf monitoring = 0) (and make sure dmeventd is not running).
(Also what have been the parameters for thin pool - 'lvs -a -o all')

    [root@f19s ~]# lvcreate -l 102900 -T vg1/thinp
      device-mapper: remove ioctl on failed: Device or resource busy
      Logical volume "thinp" created

    [ 267.473860] bio: create slab <bio-1> at 1
    [ 267.660102] device-mapper: ioctl: unable to remove open device vg1-thinp
    [ 268.025393] bio: create slab <bio-1> at 1
    [ 268.071759] device-mapper: thin: Data device (dm-1) discard unsupported: Disabling discard passdown.

    [root@f19s ~]# lvs
      LV     VG  Attr      LSize   Pool  Origin Data%  Move Log Copy%  Convert
      brick1 vg1 Vwi-a-tz-  16.00t thinp          0.00
      thinp  vg1 twi-a-tz- 401.95g                0.00

/etc/lvm/lvm.conf monitoring set to 0 and confirmed dmeventd isn't running after a reboot. mkfs.btrfs still takes a very long time, I can't ssh into the computer. However, systemd doesn't kill mkfs.btrfs this time, and the formatting completes successfully. Attaching new dmesg showing errors.

Created attachment 773396 [details]
dmesg monitoring=0
This is with the regular kernel, not the debug kernel. Is there any difference with the debug kernel for any of this?

So, a few issues:

The ioctl errors are a known issue. A workaround for non-clustered use should be deployed upstream soon. The errors are 100% harmless and can be ignored for now.

Dmeventd seems to be consuming quite a lot of memory; hopefully this will be addressed with a new lvm2 release.

As for the kernel question: are you using the distro debug kernel, or your own build? There are dm thinp driver debug kernel options which are ONLY meant to be used by developers, and they consume a huge amount of memory for various validation data structures. So unless you are a dm thinp developer, avoid those options and make sure they are not enabled in your kernel. Otherwise you would need either much more physical memory or significantly smaller devices.
(So yes, there could be a huge difference between debug and non-debug kernels.)

(In reply to Zdenek Kabelac from comment #9)
> As for kernel question - are you using distro debug kernel - or your own
> build ?

Distro debug kernel.

> There are dm thinp driver debug kernel options which are ONLY meant to be
> used by developers and they consume huuuuge amount of memory for various
> validation data structures - so unless you are dm thinp developer avoid
> using those options - so make sure they are not enabled in your kernel.
> Otherwise you would need either to use much more physical memory, or
> significantly reduce device sizes.
> (So yes there could be a huge difference between debug-nondebug kernel)

Yeah, I mean in terms of the quality of dmesg for the purpose of this bug. If it's no different, then I'll use the non-debug kernel.

If there is no big need for snapshots, i.e. the primary purpose is to provision space, increasing --chunksize at thin pool creation might greatly reduce memory requirements. The default size is 64K, so using 512K (or even bigger) might mean a big speedup in your case.
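
As a rough illustration of the chunk-size advice above, the sketch below counts how many chunks a 16 TiB virtual LV divides into at a few chunk sizes. It is only arithmetic on the numbers quoted in this report; `chunk_count` is a hypothetical helper name, and how memory use actually scales with the chunk count is not stated in this bug.

```shell
#!/bin/sh
# chunk_count SIZE_TIB CHUNK_KIB
# Prints how many chunks of CHUNK_KIB KiB fit in SIZE_TIB TiB.
# This is an upper bound on the chunks a thin LV of that virtual size
# could ever have mapped; it says nothing about real memory consumption.
chunk_count() {
    # 1 TiB = 1024 * 1024 * 1024 KiB
    echo $(( $1 * 1024 * 1024 * 1024 / $2 ))
}

# 64 KiB is the default named in the comment above, 512 KiB the suggested
# value, and 1048576 KiB (1 GiB) the stated maximum chunk size.
for chunk_kib in 64 512 1048576; do
    printf '%8s KiB chunks: %s\n' "$chunk_kib" "$(chunk_count 16 "$chunk_kib")"
done
```

For the 16 TiB LV in this report, moving from 64 KiB to 512 KiB chunks cuts the count from 268435456 to 33554432, an eightfold reduction.
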
(For a lot of snapshots, a smaller chunk is an advantage since small pieces can be shared, but for mostly space provisioning, larger chunks reduce the memory footprint significantly.)

There's no present need for snapshots. I'm just looking for an alternative to qcow2 files to present large virtual storage to two VMs for glusterfs familiarization. It seems like the host lvm presenting a virtual device to the VM is a bit more efficient than the VM writing btrfs to a qcow2 file on an XFS file system managed by the host.

Then you should consider maybe even bigger chunk sizes. It just depends on how big a provisioning unit you need, since a write of a single byte obviously provisions a whole chunk. The max supported chunk size is 1GiB, but this may have a negative impact when zeroing is enabled at the same time. Again, if you do not need this feature, then large chunk sizes with zeroing disabled (the default is enabled) will give you maximum speed. Upstream will support 'profiles' with some reasonable defaults for some typical use cases.

Is there still anything we can do better here? Otherwise I think we should close this BZ.

Uncertain, I haven't recently done significant testing on very large virtual sized LVs backed by such small amounts of physical storage.

This message is a notice that Fedora 19 is now at end of life. Fedora has stopped maintaining and issuing updates for Fedora 19. It is Fedora's policy to close all bug reports from releases that are no longer maintained. Approximately 4 (four) weeks from now this bug will be closed as EOL if it remains open with a Fedora 'version' of '19'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 19 is end of life.
If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.

Fedora 19 changed to end-of-life (EOL) status on 2015-01-06. Fedora 19 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

Created attachment 773171 [details]
dmesg

Description of problem:
mkfs.btrfs fails dramatically on a 16TB LVM thinp virtual volume.

Version-Release number of selected component (if applicable):
Up to date Fedora 19, plus
kernel-3.10.0-1.fc20.x86_64.debug
lvm2-2.02.98-9.fc19.x86_64
lvm2-libs-2.02.98-9.fc19.x86_64
btrfs-progs-0.20.rc1.20130308git704a08c-1.fc19.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Create a 16TB virtual logical volume backed by a 400GB VG thin pool, i.e. a 500GB HDD is partitioned, one partition (sda8) is made into a PV and added to a VG, a thin pool is created from all VG extents, and a 16TB virtual LV named /dev/vg1/brick1 is created in it.
2. mkfs.btrfs /dev/vg1/brick1

Actual results:
Looks like a lot of out of memory errors:

    [ 335.751711] Out of memory: Kill process 164 (systemd-journal) score 0 or sacrifice child
    [ 335.754041] Killed process 164 (systemd-journal) total-vm:348984kB, anon-rss:0kB, file-rss:964kB
    [ 335.999673] Out of memory: Kill process 167 (lvmetad) score 0 or sacrifice child
    [ 336.001970] Killed process 167 (lvmetad) total-vm:100096kB, anon-rss:0kB, file-rss:1788kB
    [ 336.509314] Out of memory: Kill process 295 (bash) score 0 or sacrifice child
    [ 336.511635] Killed process 387 (mkfs.btrfs) total-vm:14044kB, anon-rss:0kB, file-rss:508kB

Subsequently, a possible circular locking dependency is detected.

Expected results:
Not this. Works with mkfs.xfs.

Additional info:
Intel(R) Core(TM)2 Duo CPU T9300 @ 2.50GHz
4GB memory
8GB swap
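
The reproduction steps in the description can be sketched as the dry-run script below. It only prints the commands unless `RUN=1` is set. The `pvcreate` and `vgcreate` lines and the `lvcreate -V` line that creates the 16T thin volume are assumptions, since the report does not show those commands; the pool command and its extent count are the ones quoted elsewhere in this bug. `run` and `reproduce` are hypothetical helper names.

```shell
#!/bin/sh
# Dry-run sketch of the reproduction steps. Commands are printed, not
# executed, unless RUN=1 is set. Running them for real requires root and
# destroys any data on the partition.
DEV=${DEV:-/dev/sda8}   # partition named in the report
VG=${VG:-vg1}
POOL=${POOL:-thinp}
LV=${LV:-brick1}

# Print the command, or execute it when RUN=1.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

reproduce() {
    run pvcreate "$DEV"                           # assumed; not shown in the report
    run vgcreate "$VG" "$DEV"                     # assumed; not shown in the report
    run lvcreate -l 102900 -T "$VG/$POOL"         # pool command quoted in the report
    run lvcreate -V 16T -T "$VG/$POOL" -n "$LV"   # assumed; not shown in the report
    run mkfs.btrfs "/dev/$VG/$LV"
}

reproduce
```

Keeping the default dry-run mode makes it easy to check the command sequence against a different VG or device before touching real storage.
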