Bug 1307091
Summary:          fstrim failing on mdadm raid 5 device
Product:          Red Hat Enterprise Linux 7
Component:        kernel
Sub component:    Multiple Devices (MD)
Version:          7.2
Hardware:         x86_64
OS:               Linux
Status:           CLOSED ERRATA
Severity:         medium
Priority:         unspecified
Keywords:         Reopened
Reporter:         John Pittman <jpittman>
Assignee:         Jes Sorensen <Jes.Sorensen>
QA Contact:       Zhang Xiaotian <xiaotzha>
CC:               jshortt, meverett, xiaotzha, xni, yizhan
Target Milestone: rc
Target Release:   7.3
Fixed In Version: kernel-3.10.0-429.el7
Doc Type:         Bug Fix
Cloned to:        1318231 (view as bug list)
Last Closed:      2016-11-03 15:32:07 UTC
Type:             Bug
Bug Blocks:       1203710, 1274397, 1313485
Description
John Pittman 2016-02-12 17:02:05 UTC

---

John, I don't see any mention of you setting devices_handle_discard_safely here when loading the raid5 module. Per upstream, discard isn't enabled by default on raid5, so the behavior you list here is normal. I don't see any problem here.

Jes

---

Data from smartctl, 6 identical drives:

    Device Model:     Crucial_CT1024M550SSD1
    Serial Number:    14110C0B0346
    LU WWN Device Id: 5 00a075 10c0b0346
    Firmware Version: MU01
    User Capacity:    1 024 209 543 168 bytes [1,02 TB]
    Sector Sizes:     512 bytes logical, 4096 bytes physical
    Rotation Rate:    Solid State Device
    Device is:        Not in smartctl database [for details use: -P showall]
    ATA Version is:   ACS-2, ATA8-ACS T13/1699-D revision 6
    SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
    Local Time is:    Mon Feb 15 07:49:53 2016 CET
    SMART support is: Available - device has SMART capability.
    SMART support is: Enabled

Does not work with less than 5 disks. Created a 4-disk raid5:

    mdadm --create --verbose /dev/md0 --level=5 --raid-devices=4 /dev/sde1 /dev/sdf1 /dev/sdg1 /dev/sdh1
    mkfs.xfs -f /dev/md0
    mount /dev/md0 /storage
    fstrim -v /storage
    fstrim: /storage1: the discard operation is not supported

With the 2 spare drives I created a raid0:

    mdadm --create --verbose /dev/md1 --level=0 --raid-devices=2 /dev/sdi1 /dev/sdj1
    mkfs.xfs /dev/md1
    mkdir /storage1
    mount /dev/md1 /storage1/
    fstrim -v /storage1
    fstrim: /storage1: the discard operation is not supported

...and a raid1:

    umount /storage1
    mdadm --stop /dev/md1
    mdadm --zero-superblock /dev/sdi1 /dev/sdj1
    mdadm --create --verbose /dev/md1 --level=1 --raid-devices=2 /dev/sdi1 /dev/sdj1
    mkfs.xfs -f /dev/md1
    mount /dev/md1 /storage1/
    fstrim -v /storage1
    /storage1: 953,3 GiB (1023574020096 bytes) trimmed

---

For raid0 you need to enable this module parameter: devices_discard_performance. We had a problem with bad devices suffering from extremely slow discard speeds, and there is no way to test for it.
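Discard on md RAID 4/5/6 is gated behind the raid456 module parameter devices_handle_discard_safely, which defaults to off because some SSDs return unpredictable data after TRIM, which would corrupt parity. A minimal sketch for enabling it (requires root; the check runs when the array is started, so set the parameter before assembling the array, or re-assemble afterwards; the modprobe.d file name below is just an example):

```shell
# Runtime toggle for the raid456 discard gate; lost on reboot.
PARAM=/sys/module/raid456/parameters/devices_handle_discard_safely

if [ -w "$PARAM" ]; then
    echo Y > "$PARAM"
else
    echo "raid456 not loaded or insufficient privileges; skipping" >&2
fi

# Persistent variant: set the option when the module loads, e.g. in
# /etc/modprobe.d/raid456.conf (example path):
#   options raid456 devices_handle_discard_safely=Y
```

After enabling it, whether the array actually advertises discard can be checked with `lsblk --discard /dev/md0` (non-zero DISC-GRAN/DISC-MAX columns) before running fstrim again.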
Jes

---

Zhang Yi did some tests and his results are as follows:

    RAID5: 8 partitions: FAIL
    RAID5: 7 partitions: FAIL
    RAID5: 6 partitions: FAIL
    RAID5: 5 partitions: PASS
    RAID5: 4 partitions: PASS

I did some more digging, and I can reproduce this with the upstream kernel as well, so this is not RHEL specific. Need to investigate why the number of devices impacts this.

---

I think I know what is wrong here. The kernel validates various limits before enabling discard support; however, in one of the cases it compares sectors with bytes instead of sectors with sectors. I have posted a patch upstream and will follow up once I hear back from the upstream maintainer.

Jes

---

Patch has been accepted upstream. It should get pulled in automatically once we sync md for 7.3.

Jes

---

Hi Jes, thanks again. I'm making this bug public since there is no identifying information present, so the customer can follow along. Also, once the changes get added into 7.3 and the customer upgrades, will the raid device need to be rebuilt? Or will we be fine just updating and rebooting into the new kernel?

Best Regards,
John Pittman
Global Support Services
Red Hat Inc.

---

Patch(es) committed on kernel repository and an interim kernel build is undergoing testing.

Patch(es) available on kernel-3.10.0-429.el7.

---

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-2574.html
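The sectors-versus-bytes mix-up described above also explains why the failure depends on the number of devices: the raid5 stripe grows in bytes with every added data disk, while the device limit it was compared against is a count of 512-byte sectors. The shell arithmetic below is purely illustrative; the chunk size and max_discard_sectors values are assumptions, not figures taken from this bug. It shows how a bytes-vs-sectors comparison can flip from pass to fail as disks are added, while the corrected sectors-vs-sectors comparison stays true throughout:

```shell
# Hypothetical values, for illustration only.
CHUNK_BYTES=$((512 * 1024))               # a common mdadm chunk size, in bytes
MAX_DISCARD_SECTORS=$((2 * 1024 * 1024))  # assumed device limit, in 512-byte sectors

for DISKS in 4 5 6 7 8; do
    DATA_DISKS=$((DISKS - 1))             # raid5: one disk's worth of parity
    STRIPE_BYTES=$((CHUNK_BYTES * DATA_DISKS))

    # Buggy comparison: a sector count against a byte count.
    if [ "$MAX_DISCARD_SECTORS" -ge "$STRIPE_BYTES" ]; then
        buggy=enabled; else buggy=disabled; fi

    # Fixed comparison: both sides in 512-byte sectors.
    if [ "$MAX_DISCARD_SECTORS" -ge $((STRIPE_BYTES / 512)) ]; then
        fixed=enabled; else fixed=disabled; fi

    echo "$DISKS disks: buggy check=$buggy, fixed check=$fixed"
done
```

With these assumed numbers the buggy check flips to "disabled" once the stripe's byte count outgrows the sector-denominated limit, which is the same shape of device-count dependence seen in the test matrix above. Because the fix only changes this enablement check, nothing about the on-disk layout changes.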