Bug 1307091

Summary: fstrim failing on mdadm raid 5 device
Product: Red Hat Enterprise Linux 7
Component: kernel
Kernel sub component: Multiple Devices (MD)
Version: 7.2
Hardware: x86_64
OS: Linux
Reporter: John Pittman <jpittman>
Assignee: Jes Sorensen <Jes.Sorensen>
QA Contact: Zhang Xiaotian <xiaotzha>
CC: jshortt, meverett, xiaotzha, xni, yizhan
Status: CLOSED ERRATA
Severity: medium
Priority: unspecified
Keywords: Reopened
Target Milestone: rc
Target Release: 7.3
Fixed In Version: kernel-3.10.0-429.el7
Doc Type: Bug Fix
Type: Bug
Related bug: 1318231 (view as bug list)
Bug Blocks: 1203710, 1274397, 1313485
Last Closed: 2016-11-03 15:32:07 UTC

Description John Pittman 2016-02-12 17:02:05 UTC
Description of problem:

fstrim command failing on mdadm raid 5 device.

Version-Release number of selected component (if applicable):

kernel-3.10.0-327.4.5.el7.x86_64

How reproducible:

Create an mdadm device at the levels mentioned, mount it, and run fstrim on the mount point.

- Successful:

# vgcreate testtrim /dev/sdg1
# lvcreate -l 100%FREE -n lvtrimtest testtrim
# mkfs.xfs /dev/testtrim/lvtrimtest
# mount /dev/testtrim/lvtrimtest /storage
# fstrim -v /storage
/storage: 953,4 GiB (1023706796032 bytes) trimmed

- Failed:

# mdadm --create --verbose /dev/md0 --level=5 --raid-devices=6 /dev/sde1 /dev/sdf1 /dev/sdg1 /dev/sdh1 /dev/sdi1 /dev/sdj1
# mkfs.xfs /dev/md0
# mount /dev/md0 /storage
# fstrim -v /storage
fstrim: /storage: the discard operation is not supported
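
For reference, the error comes back from the kernel's FITRIM ioctl rather than from the fstrim binary itself; that can be confirmed with something along these lines (expected behavior, not captured from this box):

# strace -e trace=ioctl fstrim -v /storage
(expect to see the FITRIM ioctl fail with EOPNOTSUPP, followed by the same fstrim error as above)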

Actual results:

fstrim fails, reporting that the discard operation is not supported.

Expected results:

fstrim should succeed and report the amount of space trimmed.

Additional info:

Host: scsi0 Channel: 02 Id: 00 Lun: 00
  Vendor: DELL     Model: PERC H730        Rev: 4.24
  Type:   Direct-Access                    ANSI  SCSI revision: 05

Host: scsi0 Channel: 00 Id: 06 Lun: 00
  Vendor: ATA      Model: Crucial_CT1024M5 Rev: MU01
  Type:   Direct-Access                    ANSI  SCSI revision: 06

[john@host]$ cat etc/modprobe.d/raid456.conf 
options raid456 devices_handle_discard_safely=Y

md0, discard_granularity: 4194304
sde, discard_granularity: 4096
sdf, discard_granularity: 4096
sdg, discard_granularity: 4096
sdh, discard_granularity: 4096
sdi, discard_granularity: 4096
sdj, discard_granularity: 4096
md0, discard_max_bytes: 134217216
sde, discard_max_bytes: 134217216
sdf, discard_max_bytes: 134217216
sdg, discard_max_bytes: 134217216
sdh, discard_max_bytes: 134217216
sdi, discard_max_bytes: 134217216
sdj, discard_max_bytes: 134217216
md0, discard_zeroes_data: 0
sde, discard_zeroes_data: 1
sdf, discard_zeroes_data: 1
sdg, discard_zeroes_data: 1
sdh, discard_zeroes_data: 1
sdi, discard_zeroes_data: 1
sdj, discard_zeroes_data: 1
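
For reference, values like the ones above can be collected with a small loop over the queue limits in sysfs (device names as used in this report):

for f in discard_granularity discard_max_bytes discard_zeroes_data; do
    for dev in md0 sde sdf sdg sdh sdi sdj; do
        echo "$dev, $f: $(cat /sys/block/$dev/queue/$f)"
    done
done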

 cat /sys/module/raid456/parameters/devices_handle_discard_safely
Y

[john@host]$ cat proc/mdstat 
Personalities : [raid6] [raid5] [raid4] 
md0 : active raid5 sdf1[1] sdi1[4] sdh1[3] sde1[0] sdg1[2] sdj1[6]
      5000360960 blocks super 1.2 level 5, 512k chunk, algorithm 2 [6/6] [UUUUUU]
      bitmap: 0/8 pages [0KB], 65536KB chunk

unused devices: <none>

/dev/md0: UUID="<omitted>" TYPE="xfs" 

/dev/sde1: UUID="<omitted>" UUID_SUB="<omitted>" LABEL="0" TYPE="linux_raid_member" 
/dev/sdi1: UUID="<omitted>" UUID_SUB="<omitted>" LABEL="0" TYPE="linux_raid_member" 
/dev/sdg1: UUID="<omitted>" UUID_SUB="<omitted>" LABEL="0" TYPE="linux_raid_member" 
/dev/sdf1: UUID="<omitted>" UUID_SUB="<omitted>" LABEL="0" TYPE="linux_raid_member" 
/dev/sdj1: UUID="<omitted>" UUID_SUB="<omitted>" LABEL="0" TYPE="linux_raid_member" 
/dev/sdh1: UUID="<omitted>" UUID_SUB="<omitted>" LABEL="0" TYPE="linux_raid_member"

[john@host]$ cat mount | grep md0
/dev/md0 on /storage type xfs (rw,relatime,attr2,inode64,sunit=1024,swidth=5120,noquota)

Please let me know if I can be of assistance.

John

Comment 2 Jes Sorensen 2016-02-13 03:51:34 UTC
John,

I don't see any mention here of you setting devices_handle_discard_safely
when loading the raid5 module. Per upstream, discard isn't enabled by default
on raid5, so the behavior you describe is expected.
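
For reference, enabling it looks roughly like this; as far as I can tell the discard decision is made when the array is started, so the array has to be re-assembled (or the host rebooted) after changing the parameter:

echo Y > /sys/module/raid456/parameters/devices_handle_discard_safely      # runtime
echo 'options raid456 devices_handle_discard_safely=Y' > /etc/modprobe.d/raid456.conf   # persistent
umount /storage
mdadm --stop /dev/md0
mdadm --assemble /dev/md0 /dev/sde1 /dev/sdf1 /dev/sdg1 /dev/sdh1 /dev/sdi1 /dev/sdj1
mount /dev/md0 /storage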

I don't see any problem here.

Jes

Comment 6 jdeffenb 2016-02-15 14:35:44 UTC
Data from smartctl, 6 identical drives.

Device Model:     Crucial_CT1024M550SSD1
Serial Number:    14110C0B0346
LU WWN Device Id: 5 00a075 10c0b0346
Firmware Version: MU01
User Capacity:    1 024 209 543 168 bytes [1,02 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    Solid State Device
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ACS-2, ATA8-ACS T13/1699-D revision 6
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Mon Feb 15 07:49:53 2016 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled


It also does not work with fewer than 5 disks:

Created a 4-disk RAID 5:
mdadm --create --verbose /dev/md0 --level=5 --raid-devices=4 /dev/sde1 /dev/sdf1 /dev/sdg1 /dev/sdh1
mkfs.xfs -f /dev/md0
mount /dev/md0 /storage
fstrim -v /storage
fstrim: /storage1: the discard operation is not supported

With the 2 spare drives I created a RAID 0:
mdadm --create --verbose /dev/md1 --level=0 --raid-devices=2 /dev/sdi1 /dev/sdj1
mkfs.xfs /dev/md1
mkdir /storage1
mount /dev/md1 /storage1/
fstrim -v /storage1
fstrim: /storage1: the discard operation is not supported

...and a RAID 1:
umount /storage1
mdadm --stop /dev/md1
mdadm --zero-superblock /dev/sdi1 /dev/sdj1
mdadm --create --verbose /dev/md1 --level=1 --raid-devices=2 /dev/sdi1 /dev/sdj1
mkfs.xfs -f /dev/md1 
mount /dev/md1 /storage1/
fstrim -v /storage1
/storage1: 953,3 GiB (1023574020096 bytes) trimmed

Comment 7 Jes Sorensen 2016-02-15 15:48:25 UTC
For raid0 you need to enable this module parameter:
devices_discard_performance

We had a problem with bad devices suffering from extremely slow discard
speeds, and there is no way to test for it.
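
A rough sketch of turning it on, assuming the parameter lives in the raid0 module and, like the raid456 one, is only evaluated when the array is started (so the array needs to be re-assembled afterwards):

umount /storage1
mdadm --stop /dev/md1
modprobe -r raid0
modprobe raid0 devices_discard_performance=Y
echo 'options raid0 devices_discard_performance=Y' > /etc/modprobe.d/raid0.conf   # persist across reboots
mdadm --assemble /dev/md1 /dev/sdi1 /dev/sdj1
mount /dev/md1 /storage1
fstrim -v /storage1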

Jes

Comment 11 Jes Sorensen 2016-02-16 20:44:15 UTC
Zhang Yi did some tests and his results are as follows:

RAID5: 8 partitions: FAIL
RAID5: 7 partitions: FAIL
RAID5: 6 partitions: FAIL
RAID5: 5 partitions: PASS
RAID5: 4 partitions: PASS

I did some more digging, and I can reproduce this with the upstream kernel
as well, so this is not RHEL-specific.

Need to investigate why the number of devices impacts this.

Comment 12 Jes Sorensen 2016-02-16 21:46:39 UTC
I think I know what is wrong here.

The kernel validates several limits before enabling discard support. However,
in one of those checks it compares a value in sectors against a value in
bytes, instead of comparing sectors with sectors.
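
Using the limits from the description above, the mismatch can be illustrated roughly like this (shell arithmetic only, not the kernel code; the array's full stripe is 5 data disks * 512 KiB chunk rounded up to a power of two, which is the 4194304 shown as md0's discard_granularity):

stripe_bytes=4194304
max_discard_sectors=$((134217216 / 512))    # 262143, from discard_max_bytes above
echo "buggy check (sectors vs bytes):   $((max_discard_sectors >= stripe_bytes))"
echo "fixed check (sectors vs sectors): $((max_discard_sectors >= stripe_bytes / 512))"

The buggy comparison prints 0, so discard stays disabled; the corrected one prints 1. The full stripe grows with the number of data disks, which would explain why the result varies with the device count.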

I have posted a patch upstream and will follow up once I hear back from the
upstream maintainer.

Jes

Comment 13 Jes Sorensen 2016-02-17 18:34:46 UTC
Patch has been accepted upstream. It should get pulled in automatically
once we sync md for 7.3.

Jes

Comment 14 John Pittman 2016-02-23 13:42:09 UTC
Hi Jes, thanks again. I'm making this bug public since no identifying information is present, so the customer can follow along.

Also, once the changes get added into 7.3 and the customer upgrades, will the raid device need to be rebuilt?  Or will we be fine just updating and rebooting into the new kernel?
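
In other words, once the fixed kernel lands, would something like the following be all that's required on the customer side?

yum update kernel
reboot
# after boot:
mount /dev/md0 /storage
fstrim -v /storage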

Best Regards,

John Pittman
Global Support Services
Red Hat Inc.

Comment 15 Rafael Aquini 2016-06-09 02:21:45 UTC
Patch(es) committed to the kernel repository; an interim kernel build is undergoing testing.

Comment 17 Rafael Aquini 2016-06-10 15:06:09 UTC
Patch(es) available on kernel-3.10.0-429.el7

Comment 21 errata-xmlrpc 2016-11-03 15:32:07 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-2574.html