As part of a btrfs evaluation test, I am trying to remove a defective disk from a btrfs filesystem using the command "btrfs dev delete /dev/sde1 /". I am pretty sure this worked under el7.0 or el7.1, but with el7.2 this command has been grinding away for 20 hours now without any sign of progress. From the btrfs documentation it is not clear how to tell whether it is making progress, is completely stuck, or is somewhere in between. I do see some I/O happening between disks, but without any clear pattern (such as 100% busy moving data, or 100% busy trying to read the disk-to-be-removed). Actually, I am pretty sure I do not see any write activity at all. I also see a severe slowdown of the machine: "ssh root@daq11" takes several minutes instead of a few seconds.

This is not the expected behaviour. With this btrfs filesystem being fully RAID1, I expected the bad disk to be released immediately (it does not contain any unique data), followed by a rebalance to re-raid1 (re-duplicate) the data. (OK, rebalance first, release the bad disk second.)

(If instead btrfs is trying to rebalance the data by reading it from the bad disk - there is a reason I am removing it: it is defective, with growing bad sectors and severe SMART warnings - then the btrfs developers should be fired: clearly they did not consider the most obvious use cases.)

(To remember: I am evaluating btrfs as a replacement for RAID1+ext4 for use in high-availability systems - the main requirement is uninterrupted operation if any 1 disk completely fails.)

Some configuration details (disk /dev/sde is the one being removed). BTW, I expected the "GiB Used" counters to change as the device removal and implied rebalance makes progress, but I do not see any numbers change at all.

[root@daq11 ~]# btrfs fi df /
Data, RAID1: total=656.00GiB, used=643.80GiB
System, RAID1: total=32.00MiB, used=160.00KiB
Metadata, RAID1: total=60.00GiB, used=51.16GiB
GlobalReserve, single: total=512.00MiB, used=0.00B
[root@daq11 ~]#

[root@daq11 ~]# btrfs fi show
Label: 'centos_daq11'  uuid: 8ef30d1e-8671-4f99-9032-3fb1ca9ccf99
        Total devices 6 FS bytes used 694.96GiB
        devid    1 size 1.75TiB used 263.00GiB path /dev/sda3
        devid    2 size 1.75TiB used 264.00GiB path /dev/sdb3
        devid    5 size 0.00B used 318.03GiB path /dev/sde1
        devid    6 size 1.82TiB used 318.03GiB path /dev/sdf1
        devid    8 size 1.75TiB used 263.00GiB path /dev/sdd3
        devid    9 size 1.75TiB used 6.00GiB path /dev/sdc3

btrfs-progs v3.19.1

[root@daq11 ~]# btrfs dev usage /
/dev/sda3, ID: 1
   Device size:     1.75TiB
   Data,RAID1:    236.00GiB
   Metadata,RAID1: 27.00GiB
   Unallocated:     1.49TiB

/dev/sdb3, ID: 2
   Device size:     1.75TiB
   Data,RAID1:    241.00GiB
   Metadata,RAID1: 23.00GiB
   Unallocated:     1.49TiB

/dev/sdc3, ID: 9
   Device size:     1.75TiB
   Data,RAID1:      6.00GiB
   Unallocated:     1.74TiB

/dev/sdd3, ID: 8
   Device size:     1.75TiB
   Data,RAID1:    247.00GiB
   Metadata,RAID1: 16.00GiB
   Unallocated:     1.49TiB

/dev/sde1, ID: 5
   Device size:     1.82TiB
   Data,RAID1:    290.00GiB
   Metadata,RAID1: 28.00GiB
   System,RAID1:   32.00MiB
   Unallocated:    16.00EiB

/dev/sdf1, ID: 6
   Device size:     1.82TiB
   Data,RAID1:    292.00GiB
   Metadata,RAID1: 26.00GiB
   System,RAID1:   32.00MiB
   Unallocated:     1.51TiB
[root@daq11 ~]#

[root@daq11 ~]# rpm -q btrfs-progs
btrfs-progs-3.19.1-1.el7.x86_64
[root@daq11 ~]# uname -a
Linux daq11.triumf.ca 3.10.0-327.3.1.el7.x86_64 #1 SMP Wed Dec 9 14:09:15 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
[root@daq11 ~]#

K.O.
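One way to check whether the delete is making any progress at all is to sample the per-device usage periodically and compare the numbers, since the "used" figure for devid 5 (/dev/sde1) should shrink as chunks are relocated. A minimal sketch, using only the commands already shown above (the 10-minute interval is an arbitrary choice):

    # sample per-device usage every 10 minutes; compare successive outputs
    while true; do
        date
        btrfs fi show /     # "used" on devid 5 should shrink if the delete is progressing
        sleep 600
    done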
Btrfs doesn't yet have a device 'faulty' state like md/mdadm, even upstream. It will try to read from and write to defective devices indefinitely, and maybe the resulting flood of retries is what's slowing things down.
https://btrfs.wiki.kernel.org/index.php/Project_ideas#Take_device_with_heavy_IO_errors_offline_or_mark_as_.22unreliable.22

'dev delete <dev>' does not consider the specified device actually deleted (or ignorable) until all of its data is replicated on other devices, i.e. a 3rd copy must be created before sde1 is considered no longer necessary and the device is released.

Instead, physically remove the device, or issue 'echo 1 > /sys/block/device-name/device/delete', and then use 'btrfs dev delete missing' to initiate the replication of the missing data that was on the bad device to the remaining devices.

Alternatively, when replacing the bad device, it's better to use 'btrfs replace', either with the -r option (mostly ignore the bad device unless needed), or after physically removing or sysfs-deleting it first.
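Spelled out as a command sequence, the suggestion might look like the sketch below. This is only a sketch: the device name sde and the mount point / are taken from this report, /dev/sdg1 is a hypothetical replacement disk, and whether 'dev delete missing' is accepted on a filesystem mounted before the device disappeared is exactly what gets tested later in this thread.

    # drop the failing disk at the SCSI layer so the kernel stops retrying I/O to it
    echo 1 > /sys/block/sde/device/delete

    # ask btrfs to rebuild the now-missing copies onto the remaining devices
    btrfs dev delete missing /

    # alternative: replace onto a spare disk (/dev/sdg1 is hypothetical here);
    # -r reads from the bad device only when no other good mirror exists
    btrfs replace start -r /dev/sde1 /dev/sdg1 /
    btrfs replace status /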
I do not think your instructions will work.

a) If I physically remove the disk, it will not become "missing" in btrfs; instead the syslog will fill with disk errors.

b) If I "echo 1 > /sys/block/.../delete", I think the same thing will happen.

I suspect the only way to mark a disk as missing (permitting "btrfs dev delete missing") is through a reboot, but as we already know, RHEL7.2 will not boot from a degraded btrfs filesystem. A catch-22 if there ever was one. K.O.

P.S. I see all this as a very bad sign. Obviously the btrfs authors failed to think through the most simple failure scenario (a dead disk). It makes one wonder what other failure modes they ignored or dismissed as "an exercise for the user" (as in "restore from backup and start from scratch" - I do read the btrfs mailing lists).

P.P.S. As for my machine with the stuck "btrfs dev delete": after 4 days of "maybe it just takes a very, very long time, let's wait", the machine died (no ping). K.O.
(In reply to Konstantin Olchanski from comment #3)

> a) If I physically remove the disk, it will not become "missing" in btrfs,
> instead the syslog will fill with disk errors.
> b) If I "echo 1 > /sys/block/.../delete", I think the same thing will happen.

Every time I've tried either of these, 'btrfs fi show' has always immediately displayed the missing device as missing.

> I suspect the only way to mark a disk as missing (permitting "btrfs dev
> delete missing") is through a reboot, but as we already know, RHEL7.2 will
> not boot from a degraded btrfs filesystem.

Try it first? I've done this a bunch of times, and in the normal case it does work. When it doesn't work, it's because something else is wrong - and thus an edge case - and that requires supplying a lot of state information, because "it doesn't work" by itself is just totally not revealing.
I only tried to simulate disk failure by disconnecting the disk under el7.0; I did not try with el7.1 and el7.2. I am pretty sure I did not see the disconnected disk go "missing" then. I will try again with el7.2 early next week when I can physically access the machine.

BTW, what you say is inconsistent with the btrfs documentation:
https://btrfs.wiki.kernel.org/index.php/Using_Btrfs_with_Multiple_Devices says:

"btrfs device delete missing tells btrfs to remove the first device that is described by the filesystem metadata but not present when the FS was mounted."

Which I read as: to use "btrfs dev delete missing", the btrfs filesystem has to be unmounted, then remounted in degraded mode. For the "/" filesystem this means the machine has to be rebooted.

A search for "missing" in the btrfs wiki (https://btrfs.wiki.kernel.org/index.php?title=Special%3ASearch&search=missing&go=Go) does not show any additional information on what "missing" means, what it does, and how one gets there. K.O.
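Under that reading of the documentation, only the root filesystem forces a reboot; for a non-root filesystem the unmount/remount cycle can be done live. A minimal sketch, assuming a hypothetical btrfs volume mounted at /data whose member /dev/sdX1 has been disconnected, with /dev/sdY1 a surviving member:

    umount /data
    mount -o degraded /dev/sdY1 /data    # mount with the dead device absent
    btrfs dev delete missing /data       # re-replicate the missing copies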
We are both right. Physically disconnecting the disk does make "btrfs fi show /" report "Some devices missing" (as Chris M. says), but all other commands still see the disconnected device, and btrfs (as I remember) still tries to write to it.

Now, from this state, I confirm that "delete missing" does not work:

[root@daq11 ~]# btrfs dev delete missing /
ERROR: error removing the device 'missing' - no missing devices found to remove
[root@daq11 ~]#
[root@daq11 ~]# btrfs dev delete /dev/sde1 /
ERROR: error removing the device '/dev/sde1' - No such file or directory
[root@daq11 ~]#

Here is additional information:

[root@daq11 ~]# btrfs fi show /
Label: 'centos_daq11'  uuid: 8ef30d1e-8671-4f99-9032-3fb1ca9ccf99
        Total devices 6 FS bytes used 699.64GiB
        devid    1 size 1.75TiB used 263.00GiB path /dev/sda3
        devid    2 size 1.75TiB used 264.00GiB path /dev/sdb3
        devid    6 size 1.82TiB used 310.03GiB path /dev/sdf1
        devid    8 size 1.75TiB used 263.00GiB path /dev/sdd3
        devid    9 size 1.75TiB used 26.00GiB path /dev/sdc3
        *** Some devices missing

btrfs-progs v3.19.1

[root@daq11 ~]# btrfs dev usage /
/dev/sda3, ID: 1
   Device size:     1.75TiB
   Data,RAID1:    236.00GiB
   Metadata,RAID1: 27.00GiB
   Unallocated:     1.49TiB

/dev/sdb3, ID: 2
   Device size:     1.75TiB
   Data,RAID1:    241.00GiB
   Metadata,RAID1: 23.00GiB
   Unallocated:     1.49TiB

/dev/sdc3, ID: 9
   Device size:     1.75TiB
   Data,RAID1:     26.00GiB
   Unallocated:     1.72TiB

/dev/sdd3, ID: 8
   Device size:     1.75TiB
   Data,RAID1:    247.00GiB
   Metadata,RAID1: 16.00GiB
   Unallocated:     1.49TiB

/dev/sde1, ID: 5
   Device size:       0.00B
   Data,RAID1:    276.00GiB
   Metadata,RAID1: 28.00GiB
   System,RAID1:   32.00MiB
   Unallocated:     1.52TiB

/dev/sdf1, ID: 6
   Device size:     1.82TiB
   Data,RAID1:    284.00GiB
   Metadata,RAID1: 26.00GiB
   System,RAID1:   32.00MiB
   Unallocated:     1.52TiB
[root@daq11 ~]#
With help from Chris M., my catch-22 is resolved:

a) disconnect the disk that is to be removed from btrfs
b) reboot with "rd.shell" and "rd.break=pre-init" (I type them in the grub editor from the grub menu)
c) get the "emergency shell" (it appears right before the infinite wait for the btrfs uuid)
d) # mount -o degraded /dev/sdb3 /sysroot
e) # btrfs dev delete missing /sysroot
f) watch the progress of the btrfs data balancer (see the sketch below); this will take some time

Would be nice if the normal "btrfs dev delete" were fixed some day. K.O.
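One way to watch that progress from the rescue environment is to re-run the usage commands periodically and watch the per-device numbers shift. A sketch (the interval is arbitrary, and whether a second shell is available in the emergency environment depends on the setup):

    while true; do
        btrfs fi show /sysroot      # remaining devices' "used" grows as chunks are relocated
        btrfs dev usage /sysroot
        sleep 600
    done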
Made a typo in previous message: "rd.break=pre-mount", not "pre-init". K.O.
Additional information. Back on January 5th, I booted the machine in single-user mode and left it running "btrfs delete missing /". Today "btrfs delete" finally completed: around 300 GB of data rearranged in 20 days. This must be a speed record of sorts - 15 GB per day, or about 0.2 Mbytes/sec.

With btrfs no longer degraded, I rebooted the machine in multi-user mode (a degraded btrfs will not boot, remember?), into the latest kernel:

[root@daq11 ~]# uname -a
Linux daq11.triumf.ca 3.10.0-327.4.5.el7.x86_64 #1 SMP Mon Jan 25 22:07:14 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

Now running "btrfs dev delete" to liberate one more disk; expect an update in 20 days. Impressive! K.O.
For the record: "btrfs dev delete" never completed. After 1 month (I am patient), I ended up reinstalling the OS (to move "/" from btrfs on 6xHDD to xfs on an SSD) and erasing the btrfs disks (complete data loss, had this been actual data).

In summary, btrfs in el7.2 is useless junk (and I do not care if it works oh so well on the SSD in your laptop). K.O.
OK to close this bug; I do not see how I can close it myself. K.O.
My btrfs evaluation is complete: btrfs in el7.2 is unusable. I will be using ZFS instead. K.O.
close this bug already. nobody but bots left at red hat? K.O.
(In reply to Konstantin Olchanski from comment #13)

> close this bug already. nobody but bots left at red hat? K.O.

Apologies for the lack of attention on this bug; it had been mis-assigned. However, I'm afraid that btrfs did not exit tech preview in RHEL7 and has been deprecated. No further fixes will be provided.