Additional info:
reporter: libreport-2.1.9
list_del corruption. next->prev should be ffff88017cf31958, but was ffff88021f5dddb8
Modules linked in: bnep bluetooth rfkill snd_hda_codec_hdmi kvm_amd kvm crc32_pclmul crc32c_intel mxm_wmi ghash_clmulni_intel snd_hda_codec_realtek microcode serio_raw fam15h_power k10temp edac_core edac_mce_amd r8169 snd_hda_intel mii snd_hda_codec sp5100_tco i2c_piix4 snd_hwdep snd_seq snd_seq_device snd_pcm snd_page_alloc snd_timer snd soundcore wmi shpchp acpi_cpufreq mperf nfsd auth_rpcgss nfs_acl lockd sunrpc ata_generic pata_acpi btrfs libcrc32c xor zlib_deflate raid6_pq radeon i2c_algo_bit drm_kms_helper ttm pata_atiixp drm i2c_core
CPU: 6 PID: 232 Comm: btrfs-transacti Not tainted 3.11.6-300.fc20.x86_64 #1
Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./990FX Professional, BIOS L1.95A 07/04/2013
 0000000000000009 ffff88021f5ddce8 ffffffff8164894b ffff88021f5ddd30
 ffff88021f5ddd20 ffffffff8106715d ffff880220a5b000 ffff88021f5dddc8
 ffff88017cf31a00 ffff88017ce36ea8 ffff88017cf31958 ffff88021f5ddd80
Call Trace:
 [<ffffffff8164894b>] dump_stack+0x45/0x56
 [<ffffffff8106715d>] warn_slowpath_common+0x7d/0xa0
 [<ffffffff810671cc>] warn_slowpath_fmt+0x4c/0x50
 [<ffffffff81310f42>] __list_del_entry+0x82/0xd0
 [<ffffffffa025fe4e>] btrfs_run_ordered_operations+0xce/0x2a0 [btrfs]
 [<ffffffffa02471eb>] btrfs_flush_all_pending_stuffs+0x3b/0x40 [btrfs]
 [<ffffffffa0247e4f>] btrfs_commit_transaction+0x20f/0x950 [btrfs]
 [<ffffffffa023f72d>] transaction_kthread+0x18d/0x220 [btrfs]
 [<ffffffffa023f5a0>] ? verify_parent_transid+0x150/0x150 [btrfs]
 [<ffffffff81088650>] kthread+0xc0/0xd0
 [<ffffffff81088590>] ? insert_kthread_work+0x40/0x40
 [<ffffffff81657aac>] ret_from_fork+0x7c/0xb0
 [<ffffffff81088590>] ? insert_kthread_work+0x40/0x40
Created attachment 822051
File: dmesg
I believe it's fixed in 931aa87791af46640a46b11fa503a119e36943ec.

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=931aa87791af46640a46b11fa503a119e36943ec

The fix is in Linux 3.13-rc3, but not in 3.12.5. There is no "cc: stable" in the commit. I think the patch should be backported.
What is the process to get this backported? My server is crashing about once a week with this error.
(In reply to Geert Jansen from comment #3)

The official backporting requires the appropriate access. If you just want to recompile the kernel, here's the outline of the process (sorry, no time for detailed instructions).

yumdownloader --source kernel
rpm -i kernel*.src.rpm
save the patch from the link
put it in ~/rpmbuild/SOURCES
list it in ~/rpmbuild/SPECS/kernel*.spec
increment the kernel revision in ~/rpmbuild/SPECS/kernel*.spec
rebuild the kernel package with "rpmbuild -ba" or (safer but slower) with mock
install the recompiled kernel
reboot and make sure the new kernel is being loaded by grub
enjoy the result
watch for kernel upgrades and don't reboot into the unfixed kernels
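The command-line part of the outline above could be scripted roughly as follows. This is only a sketch under assumptions: the patch file name btrfs-fix.patch and the RUN dry-run switch are hypothetical, and the spec edits stay manual because the patch numbering depends on the spec file shipped in the source RPM.

```shell
#!/bin/sh
# Sketch of the rebuild outline. By default it only prints the commands.
# PATCH and RUN are hypothetical names introduced for this example.
PATCH="${PATCH:-btrfs-fix.patch}"   # patch saved from the upstream link
RUN="${RUN-echo}"                   # set RUN= (empty) to really execute

prepare_sources() {
    # Fetch the kernel source RPM and unpack it into ~/rpmbuild.
    $RUN yumdownloader --source kernel
    $RUN rpm -i kernel*.src.rpm
    # Put the patch next to the other kernel sources.
    $RUN cp "$PATCH" "$HOME/rpmbuild/SOURCES/"
}

build_package() {
    # Build binary and source packages from the (manually edited) spec.
    $RUN rpmbuild -ba "$HOME"/rpmbuild/SPECS/kernel*.spec
}

prepare_sources
# Manual step when executing for real: list the patch in the spec file
# and increment the kernel revision before calling build_package.
build_package
```

Installing the resulting packages and checking the grub entry are left out on purpose, since they depend on which kernel subpackages are installed on the machine.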
Created attachment 839241
Patch from the Linux git repository
Did anyone actually test that patch on top of a 3.11 kernel? The upstream commit in comment #2 says it fixes an error that was introduced with commit b02441999efcc6152b87cd58e7970bb7843f76cf "Btrfs: don't wait for the completion of all the ordered extents". That referenced commit is in 3.13-rc3 as well. So the patch was fixing something that supposedly is only broken in 3.13, and that broken commit wasn't brought back to 3.11.y or 3.12.y. I'm not sure this patch will fix anything. Josef?
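One way to check which releases contain a given commit is git's ancestry test. A minimal sketch, assuming a local clone that has the relevant release tags; the repository path and the helper name commit_in_release are illustrative and not taken from this report.

```shell
# commit_in_release REPO COMMIT TAG
# Exit status 0 if COMMIT is an ancestor of TAG in REPO, non-zero
# otherwise (including when the commit or tag is unknown to the clone).
commit_in_release() {
    git -C "$1" merge-base --is-ancestor "$2" "$3"
}

# Example (assumes ./linux is a clone that has the v3.13-rc3 tag):
# commit_in_release ./linux b02441999efcc6152b87cd58e7970bb7843f76cf v3.13-rc3 \
#     && echo "present" || echo "absent or unknown"
```

Note that this only answers the question for the exact commit hash. A stable backport is a separate commit with its own hash, so an ancestry test against a stable tag will not find it.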
The patch does not apply to 3.12. The function btrfs_wait_all_ordered_extents has been renamed to btrfs_wait_ordered_roots and has gained an extra "nr" argument, so I have no idea whether this patch still fixes the issue. I will ask on the linux-btrfs mailing list.
Posted the question upstream: http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg29914.html
As pointed out on the mailing list, a different patch was provided by Josef Bacik a few days ago. The patch is here:

http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg29917.html

I've attached the patch to this bugzilla. It applies cleanly on the 3.12.5 kernel. I'm building it now on my system and will see if it resolves the issues.
Created attachment 840745
New patch suggested by btrfs-linux mailing list
Did your test of the patch work?
I have been using kernel 3.12.6-300.fc20 from updates-testing for 11 days now, and no crashes so far.
OK, 3.12.6 contains:

commit 486d1e163be2d32150a053c7ac3fc853ba6fd998
Author: Josef Bacik <jbacik>
Date:   Mon Oct 28 09:13:25 2013 -0400

    Btrfs: take ordered root lock when removing ordered operations inode

    commit 93858769172c4e3678917810e9d5de360eb991cc upstream.

which is the patch that was suggested. That's already in stable updates, so closing this out. Thanks much!
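Because a stable backport quotes the upstream hash in its commit message, as in the log entry above, a search of the commit messages can locate it. A sketch, assuming a clone of the stable tree with the relevant tags; the helper name find_backport is made up for this example.

```shell
# find_backport REPO UPSTREAM_HASH RANGE
# Lists the commits in RANGE whose message mentions UPSTREAM_HASH.
find_backport() {
    git -C "$1" log --oneline --fixed-strings --grep="$2" "$3"
}

# Example (assumes ./linux-stable has the v3.12.5 and v3.12.6 tags):
# find_backport ./linux-stable 93858769172c4e3678917810e9d5de360eb991cc v3.12.5..v3.12.6
```

An empty result only means that no commit message in the range mentions the hash; it does not prove the change is missing.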