Created attachment 1783266 [details]
dmesg from live usb

Description of problem:
I use Fedora 34 with btrfs on my notebook. A friend brought me his HP Mini 210 notebook; I replaced its HDD with my SSD and tried to boot from it. Unfortunately, the HP Mini has some hardware malfunction. I returned my SSD to my notebook and booted, and then I saw some btrfs errors in dmesg.

I made a bootable USB with Fedora 34 on another computer, booted my notebook from it, and tried to repair btrfs by mounting my home partition with

sudo mount /dev/sda3 /mnt

then

sudo btrfs scrub status /mnt

My installed system is Fedora 34 with kernel 5.11.18; the kernel on the bootable Fedora USB is 5.11.12-300.fc34.x86_64.

Scrub reports 148867 uncorrectable errors. I ran

sudo smartctl --test=short /dev/sda

without any errors, then mapped a network drive and tried to copy my files to it. Then I saw some "tainted" messages in dmesg like

------------[ cut here ]------------ WARNING: CPU: 6 PID: 52636 at fs/btrfs/inode.c:1706 run_delalloc_nocow+0x6d7/0x950 Modules linked in: vfat fat mmc_block nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw iptable_security ip_set nf_tables nfnetlink ip6table_filter ip6_tables iptable_filter cmac bnep snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi ledtrig_audio snd_hda_intel snd_intel_dspcfg soundwire_intel soundwire_generic_allocation amdgpu snd_soc_core snd_compress snd_pcm_dmaengine soundwire_cadence snd_hda_codec uvcvideo intel_rapl_msr ath10k_pci snd_hda_core intel_rapl_common ath10k_core videobuf2_vmalloc ac97_bus videobuf2_memops snd_hwdep btusb edac_mce_amd snd_seq videobuf2_v4l2 mac80211 iommu_v2 btrtl kvm_amd videobuf2_common gpu_sched btbcm i2c_algo_bit snd_seq_device kvm btintel drm_ttm_helper videodev snd_pcm
bluetooth ttm ath snd_timer mc drm_kms_helper snd cfg80211 irqbypass ecdh_generic ecc soundcore cec libarc4 acer_wmi sparse_keymap rapl rfkill sp5100_tco pcspkr joydev wmi_bmof i2c_piix4 k10temp acer_wireless acpi_cpufreq drm zram ip_tables nls_utf8 isofs squashfs rtsx_pci_sdmmc mmc_core hid_multitouch crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel serio_raw uas usb_storage ccp r8169 rtsx_pci video wmi pinctrl_amd i2c_hid sunrpc be2iscsi bnx2i cnic uio cxgb4i cxgb4 tls cxgb3i cxgb3 mdio libcxgbi libcxgb qla4xxx iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi loop fuse CPU: 6 PID: 52636 Comm: btrfs-transacti Not tainted 5.11.12-300.fc34.x86_64 #1 Hardware name: Acer Aspire A315-41/Metapod_RR, BIOS V1.18 06/18/2020 RIP: 0010:run_delalloc_nocow+0x6d7/0x950 Code: 8b 78 40 e8 ab 95 ff ff 4c 8b 4c 24 48 4c 8b 54 24 58 85 c0 41 89 c5 74 49 0f 88 19 fe ff ff 80 7c 24 6e 00 0f 84 6b fb ff ff <0f> 0b e9 64 fb ff ff 48 8b 4c 24 40 48 8b 14 24 41 b9 0f 00 00 00 RSP: 0018:ffffba8cc3e57910 EFLAGS: 00010202 RAX: 0000000000000001 RBX: 0000000000000002 RCX: 0000000000040897 RDX: 0000000000040896 RSI: 7887d2e8f9dea8ec RDI: 000000000002f140 RBP: 000000011e706000 R08: ffffba8cc3e578c8 R09: 0000000000040000 R10: 0000000000040000 R11: 0000000000000000 R12: 0000000000040000 R13: 0000000000000001 R14: 0000000000000000 R15: ffff92589352fd20 FS: 0000000000000000(0000) GS:ffff9258a9d80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1a938c4000 CR3: 0000000145de4000 CR4: 00000000003506e0 Call Trace: btrfs_run_delalloc_range+0x60/0x5b0 ? find_lock_delalloc_range+0x1c3/0x1e0 writepage_delalloc+0x99/0x150 __extent_writepage+0xd1/0x2e0 extent_write_cache_pages.constprop.0+0x24c/0x3f0 extent_writepages+0x33/0x90 do_writepages+0x31/0xb0 ? 
__wake_up_common_lock+0x7a/0x90 __filemap_fdatawrite_range+0xa7/0xe0 btrfs_fdatawrite_range+0x1b/0x50 btrfs_write_out_cache+0x55a/0x5b0 btrfs_start_dirty_block_groups+0x212/0x540 btrfs_commit_transaction+0xb1/0xa80 ? start_transaction+0xce/0x580 transaction_kthread+0x12b/0x190 ? btrfs_cleanup_transaction.isra.0+0x560/0x560 kthread+0x11b/0x140 ? kthread_associate_blkcg+0xa0/0xa0 ret_from_fork+0x22/0x30 ---[ end trace 2a4974bb7fcdff95 ]--- ------------[ cut here ]------------ WARNING: CPU: 6 PID: 52636 at fs/btrfs/inode.c:1048 cow_file_range+0x2e0/0x400 Modules linked in: vfat fat mmc_block nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw iptable_security ip_set nf_tables nfnetlink ip6table_filter ip6_tables iptable_filter cmac bnep snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi ledtrig_audio snd_hda_intel snd_intel_dspcfg soundwire_intel soundwire_generic_allocation amdgpu snd_soc_core snd_compress snd_pcm_dmaengine soundwire_cadence snd_hda_codec uvcvideo intel_rapl_msr ath10k_pci snd_hda_core intel_rapl_common ath10k_core videobuf2_vmalloc ac97_bus videobuf2_memops snd_hwdep btusb edac_mce_amd snd_seq videobuf2_v4l2 mac80211 iommu_v2 btrtl kvm_amd videobuf2_common gpu_sched btbcm i2c_algo_bit snd_seq_device kvm btintel drm_ttm_helper videodev snd_pcm bluetooth ttm ath snd_timer mc [ 9156.206119] drm_kms_helper snd cfg80211 irqbypass ecdh_generic ecc soundcore cec libarc4 acer_wmi sparse_keymap rapl rfkill sp5100_tco pcspkr joydev wmi_bmof i2c_piix4 k10temp acer_wireless acpi_cpufreq drm zram ip_tables nls_utf8 isofs squashfs rtsx_pci_sdmmc mmc_core hid_multitouch crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel serio_raw uas usb_storage ccp r8169 rtsx_pci video wmi pinctrl_amd i2c_hid sunrpc be2iscsi bnx2i cnic 
uio cxgb4i cxgb4 tls cxgb3i cxgb3 mdio libcxgbi libcxgb qla4xxx iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi loop fuse CPU: 6 PID: 52636 Comm: btrfs-transacti Tainted: G W 5.11.12-300.fc34.x86_64 #1 Hardware name: Acer Aspire A315-41/Metapod_RR, BIOS V1.18 06/18/2020 RIP: 0010:cow_file_range+0x2e0/0x400 Code: 4c 89 e7 e8 e2 c8 01 00 44 8b 7c 24 08 eb 27 48 85 d2 0f 85 a7 00 00 00 49 8b 8c 24 48 02 00 00 48 83 f9 01 0f 84 6f fd ff ff <0f> 0b 48 89 2c 24 41 bf ea ff ff ff 48 8b 4c 24 10 48 8b 54 24 28 RSP: 0018:ffffba8cc3e57818 EFLAGS: 00010206 RAX: 0000000000001000 RBX: 0000000000040000 RCX: 000000000000012e RDX: 000000000000012e RSI: fffff7634460e540 RDI: ffff92588cb96e28 RBP: 0000000000000000 R08: ffffba8cc3e57a9c R09: ffffba8cc3e57af0 R10: ffff92584e862000 R11: 0000000000000000 R12: ffff92588cb96e28 R13: ffff925810cee000 R14: ffff92588cb96e70 R15: ffffba8cc3e57af0 FS: 0000000000000000(0000) GS:ffff9258a9d80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1a938c4000 CR3: 0000000145de4000 CR4: 00000000003506e0 Call Trace: ? handle_bug+0x3a/0xa0 ? count_range_bits+0x14c/0x1b0 fallback_to_cow+0xb2/0x2a0 run_delalloc_nocow+0x4dd/0x950 btrfs_run_delalloc_range+0x60/0x5b0 ? find_lock_delalloc_range+0x1c3/0x1e0 writepage_delalloc+0x99/0x150 __extent_writepage+0xd1/0x2e0 extent_write_cache_pages.constprop.0+0x24c/0x3f0 extent_writepages+0x33/0x90 do_writepages+0x31/0xb0 ? __wake_up_common_lock+0x7a/0x90 __filemap_fdatawrite_range+0xa7/0xe0 btrfs_fdatawrite_range+0x1b/0x50 btrfs_write_out_cache+0x55a/0x5b0 btrfs_start_dirty_block_groups+0x212/0x540 btrfs_commit_transaction+0xb1/0xa80 ? start_transaction+0xce/0x580 transaction_kthread+0x12b/0x190 ? btrfs_cleanup_transaction.isra.0+0x560/0x560 kthread+0x11b/0x140 ? 
kthread_associate_blkcg+0xa0/0xa0 ret_from_fork+0x22/0x30 ---[ end trace 2a4974bb7fcdff96 ]---

Version-Release number of selected component (if applicable):
# Fedora-LXDE-Live-x86_64-34-1.2.iso: 1371701248 bytes
SHA256 (Fedora-LXDE-Live-x86_64-34-1.2.iso) = ac166d49f404a315cd3922a8e014e4cba19cf8f37242c1a4fd9c3ecba4bc269a

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:
Copying files stops with a trace in dmesg.

Expected results:
1. The copy process may skip broken files, but should copy all intact ones without stopping and without error messages in dmesg.
2. If the problem is solved in the latest kernel, is it possible to make an updated Fedora live USB with this code?

Additional info:
The problem happened before the call trace above. It would be best to have a complete dmesg for this. Use 'journalctl -b $bootnum' or --since=, along with the -k option, to extract kernel messages from the journal for prior boots.

>scrub gets 148867 uncorrectable errors

Scrub reports details of these errors in dmesg as well, so without that we're not sure whether the problem is metadata or data related, or some combination of the two. Best to attach this as a file to the bug report.

Please do 'btrfs check --readonly' while booted from a Fedora installation image/USB stick, and attach it as a file to the bug report. Thanks.
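The journal query described above could look roughly like this. It is a minimal sketch, not a definitive procedure: `-k`, `-b`, `-D` and `--no-pager` are standard journalctl options, while the function name `save_kernel_log` and the file names in the usage note are made up for illustration.

```shell
# Save the kernel messages of one boot to a file that can be attached
# to the bug report.
#   $1: boot offset as understood by `journalctl -b` (for example -1)
#   $2: output file
#   $3: optional journal directory, for reading the journal of a system
#       that is only mounted (for example from a live USB)
save_kernel_log() {
    if [ -n "${3:-}" ]; then
        journalctl -D "$3" -k -b "$1" --no-pager > "$2"
    else
        journalctl -k -b "$1" --no-pager > "$2"
    fi
}
```

`journalctl --list-boots` (optionally with `-D <dir>`) shows which boot offsets exist. For example, `save_kernel_log -1 dmesg-prev-boot.txt` would save the kernel log of the previous boot of the running system. When booted from a live USB with the installed system's filesystem mounted, the third argument can point at that system's journal directory, for example `/mnt/root/var/log/journal`. That path is an assumption (default Fedora layout with a `root` subvolume and a persistent journal) and needs adjusting to wherever var/log/journal actually is.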
Created attachment 1784169 [details]
result of sudo btrfs check --readonly /dev/sda3 after booting from live usb

1. Yesterday, before you answered, I did "sudo mount /dev/sda3", then ran "sudo btrfs scrub start /mnt" several times, and each time extracted the file names from dmesg (the text between "(path: " and ")") and deleted those files. Now I have 48327 uncorrectable errors according to btrfs scrub.
2. Regarding "Use 'journalctl -b $bootnum'": I don't know how to do that on the current system before the btrfs filesystem is repaired. If you know how to do it after booting from a USB stick, please tell me. My main btrfs partition is /dev/sda3.
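The extraction of file names between "(path: " and ")" can be sketched as a small filter. This assumes that each affected file shows up in the kernel log on a line ending in "(path: <file>)", as described in the comment above; the function name `scrub_error_paths` is hypothetical.

```shell
# Read kernel log lines on stdin and print the unique file paths found
# in a trailing "(path: ...)" suffix. Lines without that suffix are
# ignored.
scrub_error_paths() {
    sed -n 's/.*(path: \(.*\))$/\1/p' | sort -u
}
```

A possible use is `sudo dmesg | scrub_error_paths > damaged-files.txt`. The printed paths are taken verbatim from the log, so they are expected to be relative to the subvolume rather than to the mount point, and the list should be reviewed before deleting anything.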
I reconnected the SSD with the broken btrfs to a second computer running Fedora 34 with kernel 5.12.6. After backing up all needed information, I ran "sudo btrfs scrub start" 30 or 40 times, each time deleting all files listed in dmesg and trying again, until only 2 uncorrectable errors remained. Then I tried

sudo btrfs check --repair /dev/sdb3 > btrfs_check_repair_sdb3.txt

but it can't repair all errors. Is it possible to repair the btrfs file system, or do I need to reformat the partition to get btrfs without errors?
Created attachment 1787624 [details]
After several btrfs scrub start runs and deleting all files with errors listed in dmesg output, I get only 2 uncorrectable errors
Created attachment 1787625 [details] dmesg with 2 errors after last btrfs scrub
Created attachment 1787626 [details] btrfs check --repair can't repair all errors from 1st run, errors exist
Created attachment 1787627 [details] btrfs check --repair can't repair all errors from 2nd run, errors still exist
In dmesg I now only have this:

BTRFS info (device sdb3): disk space caching is enabled
BTRFS info (device sdb3): has skinny extents
BTRFS info (device sdb3): bdev /dev/sdb3 errs: wr 0, rd 0, flush 0, corrupt 32152762, gen 0
BTRFS info (device sdb3): enabling ssd optimizations
BTRFS info (device sdb3): checking UUID tree
From the last btrfs check, I think the real problems are fixed, but I'm not sure about the many isize messages. Could you update to btrfs-progs 5.12.1 and do:

btrfs check --readonly /dev/sdb3
btrfs check --readonly --mode=lowmem /dev/sdb3

And report the output of both? These are currently different check implementations, so we might get more information about residual issues. If you do still see "reset isize for dir" messages, please take a btrfs-image of the file system:

btrfs-image -c9 -t4 --ss /dev/sdb3 /path/to/bug1960738-btrfs.image

The -ss option will hash filenames; short filenames result in messages about hash collisions, don't worry about it. This is a metadata only image, it doesn't contain any file contents, and is just used by developers for debugging.
Created attachment 1787824 [details] txt file after execute "sudo btrfs check --readonly /dev/sdb3 > btrfs_check_readonly_sdb3.txt"
Created attachment 1787825 [details] output of "sudo btrfs check --readonly /dev/sdb3 > btrfs_check_readonly_sdb3.txt"
Created attachment 1787826 [details] txt file after execute "sudo btrfs check --readonly --mode=lowmem /dev/sdb3 > btrfs_check_readonly_lowmem_sdb3.txt"
Created attachment 1787827 [details] output of "sudo btrfs check --readonly --mode=lowmem /dev/sdb3 > btrfs_check_readonly_lowmem_sdb3.txt"
(In reply to Chris Murphy from comment #9)
> From the last btrfs check, I think the real problems are fixed, but I'm not
> sure about the many isize messages. Could you update to btrfs-progs 5.12.1

Thank you for the advice.

1. btrfs-progs is already updated to 5.12.1:

sudo dnf install btrfs-progs
Last metadata expiration check: 4:03:15 ago on Thu 27 May 2021 20:12:22.
Package btrfs-progs-5.12.1-1.fc34.x86_64 is already installed.
Dependencies resolved.
Nothing to do.
Complete!

> and do:
>
> btrfs check --readonly /dev/sdb3
> btrfs check --readonly --mode=lowmem /dev/sdb3

2. I updated the kernel to kernel-core-5.12.7-300.fc34.x86_64 and executed

sudo btrfs check --readonly /dev/sdb3
sudo btrfs check --readonly --mode=lowmem /dev/sdb3

Partial logs and txt files are in the attachments. They say "cache and super generation don't match, space cache will be invalidated".

sudo btrfs scrub start /dev/sdb3 says "no errors found":

scrub started on /dev/sdb3, fsid 3b47ef70-9dd5-4e57-8003-570eaa7f514a (pid=2229)
[xx@yyyy 20210514_bad_btrfs]$ sudo btrfs scrub status /dev/sdb3
[sudo] password for al:
UUID: 3b47ef70-9dd5-4e57-8003-570eaa7f514a
Scrub started: Fri May 28 14:48:30 2021
Status: finished
Duration: 0:01:16
Total to scrub: 19.69GiB
Rate: 265.41MiB/s
Error summary: no errors found

Is this OK, or must I do some repair? How do I do it?

> And report the output of both? These are currently different check
> implementations, so we might get more information about residual issues. If
> you do still see "reset isize for dir" messages, please take a btrfs-image
> of the file system:

There are no "reset isize for dir" messages in the logs any more. Only "cache and super generation don't match, space cache will be invalidated" messages remain.
OK my mistake. I thought that "reset isize" messages were persisting after --repair, but they are gone now. Your file system is OK, nothing more needs to be done. You can optionally reset the statistics by using the -z option with the 'btrfs device stats' command; see 'man btrfs device' for more info.

>cache and super generation don't match, space cache will be invalidated

This message is bogus, it's a known bug that will be fixed soon.
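For the statistics mentioned above, a small filter can show which counters are still non-zero before they are reset. This is a sketch that assumes the usual `btrfs device stats` output of one "[device].counter value" pair per line; the function name `nonzero_device_stats` is hypothetical.

```shell
# Read `btrfs device stats` output on stdin and print only the counters
# whose value is not zero. Exit status is 0 if any were found, 1 otherwise.
nonzero_device_stats() {
    awk '$2 != 0 { print; found = 1 } END { exit !found }'
}
```

With the filesystem mounted at /mnt, `sudo btrfs device stats /mnt | nonzero_device_stats` would list leftover counters such as the `corrupt 32152762` value seen in dmesg earlier, and `sudo btrfs device stats -z /mnt` resets them to zero after printing.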
OK hold on, I'm completely confused by the two attachments with "partial_log" in their file names that seemed to have come from the btrfs check command being redirected to std out? I am unfamiliar with this usage that results in two outputs for one command. But it looks like the partial_log files show problems still, even after you did 'btrfs check --repair' in comment 3. So it does seem there are unfixed problems still. Please create a btrfs image per comment 9. And we'll go from there. I've started an upstream thread. https://lore.kernel.org/linux-btrfs/CAJCQCtRnxq2mKOkjQzOedjnh9oxsNOFKoP92pjxDGwuUw1AOYg@mail.gmail.com/T/#u
(In reply to Chris Murphy from comment #9)
> btrfs-image -c9 -t4 --ss /dev/sdb3 /path/to/bug1960738-btrfs.image

OK. Thank you for your time. There is a typo in "--ss". I ran

sudo btrfs-image -c9 -t4 -ss /dev/sdb3 /home/xx/bug/20210514_bad_btrfs/bug1960738-btrfs.image

and got some warnings like

WARNING: cannot find a hash collision for '8618', generating garbage, it won't match indexes
WARNING: cannot find a hash collision for '1843', generating garbage, it won't match indexes
Please try to download the image from https://cloud.mail.ru/public/Uf3R/paBECvbN2 . It's 82.3 MB.
>WARNING: cannot find a hash collision for '8618', generating garbage, it won't match indexes

These are expected for short filenames. There's nothing wrong with the image.
The remaining errors following --repair take this form for original mode:

unresolved ref dir 81792 index 0 namelen 33 name sd_bus_message_get_signature.3.gz filetype 1 errors 6, no dir index, no inode ref
root 257 inode 1637367 errors 2001, no inode item, link count wrong

And take this form for lowmem mode:

ERROR: root 257 DIR INODE [69956] size 56 not equal to 39

Honestly, I can't make heads or tails of this. Since the btrfs image is taken, the developers have all the info they need to discover the problem, and you can just back up, mkfs, and restore. I don't think your data is at risk, so it probably isn't urgent. But I think you'd rather have a clean file system sooner rather than later. You can back up and restore however is convenient, including using btrfs send/receive. Btrfs doesn't propagate corruption, but I do recommend doing `journalctl -fk` to follow kernel messages during the backup so you can see if there are any btrfs related errors; they'll indicate which files *aren't* backed up if they happen to be corrupt.
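The suggestion to follow kernel messages during the backup could be scripted roughly as below. It is a sketch under assumptions: `journalctl -fk` is the command from the comment above, `cp -a` stands in for whatever backup tool is actually used, and the function name `backup_with_kernel_log` and the log file are hypothetical.

```shell
# Copy $1 to $2 while recording followed kernel messages in $3, then
# print any lines in that log that mention btrfs.
# Returns the exit status of the copy.
backup_with_kernel_log() {
    journalctl -fk --no-pager > "$3" &
    follower=$!
    status=0
    cp -a "$1" "$2" || status=$?
    sleep 1                                # let the last messages arrive
    kill "$follower" 2>/dev/null || true   # stop following the journal
    wait "$follower" 2>/dev/null || true
    grep -i 'btrfs' "$3" || true           # no match is not a failure
    return "$status"
}
```

An invocation might look like `backup_with_kernel_log /mnt/home /run/media/backup/home kernel-during-backup.log`, with both paths being placeholders. `journalctl -f` also prints a few of the most recent earlier messages when it starts, so the first lines in the log may predate the copy.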
> Honestly, I can't make heads or tails of this. Since the btrfs image is
> taken, the developers have all the info they need to discover the problem;
> and you can just backup, mkfs, and restore.

Thank you very much, Chris!

> I don't think your data is at risk so it probably isn't urgent.

Yes, there is no urgency here. I only want to get a USB-bootable ISO of Fedora 34 with a fresh kernel and btrfs-progs once this issue is resolved.

> system at some point sooner than later. You can backup/restore however
> convenient, including using btrfs send/receive. Btrfs doesn't propagate
> corruption, but I do recommend doing `journalctl -fk` to follow kernel
> messages during the backup

Thank you for the tips and tricks.
This message is a reminder that Fedora Linux 34 is nearing its end of life. Fedora will stop maintaining and issuing updates for Fedora Linux 34 on 2022-06-07. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a 'version' of '34'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, change the 'version' to a later Fedora Linux version. Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora Linux 34 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora Linux, you are encouraged to change the 'version' to a later version prior to this bug being closed.
Fedora Linux 34 entered end-of-life (EOL) status on 2022-06-07. Fedora Linux 34 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. Thank you for reporting this bug and we are sorry it could not be fixed.