Bug 958096 - TRIM 'discard' option not working on SSD LVM partition
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 19
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: LVM and device-mapper development team
QA Contact: Fedora Extras Quality Assurance
Reported: 2013-04-30 11:42 UTC by Stephen Gallagher
Modified: 2013-04-30 13:26 UTC (History)

Doc Type: Bug Fix
Last Closed: 2013-04-30 13:19:05 UTC
Type: Bug


Attachments:
Complete dmesg output (69.22 KB, text/plain)
2013-04-30 11:42 UTC, Stephen Gallagher

Description Stephen Gallagher 2013-04-30 11:42:05 UTC
Description of problem:
'dmesg' reports that dm-2 does not support the 'discard' option that it was mounted with, though the device does support it.

Version-Release number of selected component (if applicable):
kernel-3.9.0-0.rc8.git0.2.fc19.x86_64

How reproducible:
Every time

Steps to Reproduce:
1. Add the 'discard' mount option to /etc/fstab
2. Reboot
3. Check 'dmesg'
  
Actual results:
'dmesg' reports:
[  +0.035082] EXT4-fs (dm-3): mounting with "discard" option, but the device does not support discard
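
To list every device affected rather than a single line, the kernel log can be scanned for this warning. The following is a minimal sketch, assuming the log text has already been captured (for example from the attached dmesg output); the function name is hypothetical and the pattern only covers the ext4 wording quoted above.

```python
import re

# Pattern for the ext4 warning quoted in this report; the device name is captured.
_DISCARD_WARNING = re.compile(
    r'EXT4-fs \((?P<dev>[^)]+)\): mounting with "discard" option, '
    r"but the device does not support discard"
)


def devices_rejecting_discard(dmesg_text):
    """Return device names from ext4 'does not support discard' warnings.

    Names are returned in order of first appearance, without duplicates.
    """
    found = []
    for line in dmesg_text.splitlines():
        match = _DISCARD_WARNING.search(line)
        if match and match.group("dev") not in found:
            found.append(match.group("dev"))
    return found
```

Feeding it the complete dmesg text shows whether only dm-3 or other filesystems as well triggered the warning.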


Expected results:
The partitions should be properly mounted with 'discard' to avoid SSD degradation over time.

Additional info:
lsblk:

NAME                                                 MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda                                                    8:0    0 238.5G  0 disk  
├─sda1                                                 8:1    0   500M  0 part  /boot
├─sda2                                                 8:2    0  81.9G  0 part  
│ └─luks-d6ac6d01-397e-4b7d-b715-9f98ccfe88d6 (dm-0) 253:0    0  81.9G  0 crypt 
│   ├─fedora_sgallagh520-swap (dm-1)                 253:1    0   3.8G  0 lvm   [SWAP]
│   ├─fedora_sgallagh520-root (dm-2)                 253:2    0  39.3G  0 lvm   /
│   └─fedora_sgallagh520-home (dm-3)                 253:3    0  48.8G  0 lvm   /home
└─sda3                                                 8:3    0 156.1G  0 part  
  └─fedora_sgallagh520-root (dm-2)                   253:2    0  39.3G  0 lvm   /
sr0                                                   11:0    1  1024M  0 rom   

/etc/fstab (comments trimmed):
/dev/mapper/fedora_sgallagh520-root /                       ext4 defaults,x-systemd.device-timeout=0,noatime,nodiratime,discard 1 1
UUID=6d99bfa9-b81e-47bd-a62e-0a92baf2f7d0 /boot                   ext4    defaults,noatime,nodiratime,discard        1 2
/dev/mapper/fedora_sgallagh520-home /home                   ext4    defaults,x-systemd.device-timeout=0,noatime,nodiratime,discard 1 2
/dev/mapper/fedora_sgallagh520-swap swap                    swap    defaults,x-systemd.device-timeout=0,noatime,nodiratime,discard 0 0
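
For cross-checking against the dmesg warnings, the entries that request 'discard' can be pulled out of an fstab like the one above. This is a rough sketch that assumes the usual whitespace-separated fstab layout with the options in the fourth field; the function name is hypothetical.

```python
def discard_mounts(fstab_text):
    """Return (device, mountpoint) pairs whose option list contains 'discard'."""
    result = []
    for line in fstab_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 4:
            continue
        # The fourth field is a comma-separated list of mount options.
        if "discard" in fields[3].split(","):
            result.append((fields[0], fields[1]))
    return result
```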


`mount`:

proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime,seclabel)
devtmpfs on /dev type devtmpfs (rw,nosuid,seclabel,size=1960036k,nr_inodes=490009,mode=755)
securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
selinuxfs on /sys/fs/selinux type selinuxfs (rw,relatime)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev,seclabel)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,seclabel,gid=5,mode=620,ptmxmode=000)
tmpfs on /run type tmpfs (rw,nosuid,nodev,seclabel,mode=755)
tmpfs on /sys/fs/cgroup type tmpfs (rw,nosuid,nodev,noexec,seclabel,mode=755)
cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd)
pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpuacct,cpu)
cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
cgroup on /sys/fs/cgroup/net_cls type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)
/dev/mapper/fedora_sgallagh520-root on / type ext4 (rw,noatime,nodiratime,seclabel,discard,data=ordered)
systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=34,pgrp=1,timeout=300,minproto=5,maxproto=5,direct)
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,seclabel)
tmpfs on /tmp type tmpfs (rw,seclabel)
debugfs on /sys/kernel/debug type debugfs (rw,relatime)
mqueue on /dev/mqueue type mqueue (rw,relatime,seclabel)
configfs on /sys/kernel/config type configfs (rw,relatime)
binfmt_misc on /proc/sys/fs/binfmt_misc type binfmt_misc (rw,relatime)
/dev/sda1 on /boot type ext4 (rw,noatime,nodiratime,seclabel,discard,data=ordered)
/dev/mapper/fedora_sgallagh520-home on /home type ext4 (rw,noatime,nodiratime,seclabel,discard,data=ordered)

pvs/vgs/lvs:
  PV                                                    VG                 Fmt  Attr PSize   PFree  
  /dev/mapper/luks-d6ac6d01-397e-4b7d-b715-9f98ccfe88d6 fedora_sgallagh520 lvm2 a--   81.89g      0 
  /dev/sda3                                             fedora_sgallagh520 lvm2 a--  156.09g 146.09g
  VG                 #PV #LV #SN Attr   VSize   VFree  
  fedora_sgallagh520   2   3   0 wz--n- 237.98g 146.09g
  LV   VG                 Attr      LSize  Pool Origin Data%  Move Log Copy%  Convert
  home fedora_sgallagh520 -wi-ao--- 48.83g                                           
  root fedora_sgallagh520 -wi-ao--- 39.29g                                           
  swap fedora_sgallagh520 -wi-ao---  3.77g 

[sgallagh@sgallagh520:~]$ sudo hdparm -i /dev/sda

/dev/sda:

 Model=M4-CT256M4SSD2, FwRev=040H, SerialNo=0000000012500920B519
 Config={ Fixed DTR>10Mbs }
 RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=0
 BuffType=unknown, BuffSize=unknown, MaxMultSect=16, MultSect=16
 CurCHS=65535/1/63, CurSects=16515009, LBA=yes, LBAsects=500118192
 IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
 PIO modes:  pio0 pio3 pio4 
 DMA modes:  mdma0 mdma1 mdma2 
 UDMA modes: udma0 udma1 udma2 udma3 udma4 *udma5 
 AdvancedPM=yes: unknown setting WriteCache=enabled
 Drive conforms to: unknown:  ATA/ATAPI-3,4,5,6,7

 * signifies the current active mode


[sgallagh@sgallagh520:~]$ sudo hdparm -I /dev/sda

/dev/sda:

ATA device, with non-removable media
        Model Number:       M4-CT256M4SSD2                          
        Serial Number:      0000000012500920B519
        Firmware Revision:  040H    
        Transport:          Serial, ATA8-AST, SATA 1.0a, SATA II Extensions, SATA Rev 2.5, SATA Rev 2.6, SATA Rev 3.0
Standards:
        Used: unknown (minor revision code 0x0028) 
        Supported: 9 8 7 6 5 
        Likely used: 9
Configuration:
        Logical         max     current
        cylinders       16383   65535
        heads           16      1
        sectors/track   63      63
        --
        CHS current addressable sectors:   16515009
        LBA    user addressable sectors:  268435455
        LBA48  user addressable sectors:  500118192
        Logical  Sector size:                   512 bytes
        Physical Sector size:                   512 bytes
        Logical Sector-0 offset:                  0 bytes
        device size with M = 1024*1024:      244198 MBytes
        device size with M = 1000*1000:      256060 MBytes (256 GB)
        cache/buffer size  = unknown
        Form Factor: 2.5 inch
        Nominal Media Rotation Rate: Solid State Device
Capabilities:
        LBA, IORDY(can be disabled)
        Queue depth: 32
        Standby timer values: spec'd by Standard, with device specific minimum
        R/W multiple sector transfer: Max = 16  Current = 16
        Advanced power management level: 254
        DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 *udma5 
             Cycle time: min=120ns recommended=120ns
        PIO: pio0 pio1 pio2 pio3 pio4 
             Cycle time: no flow control=120ns  IORDY flow control=120ns
Commands/features:
        Enabled Supported:
           *    SMART feature set
                Security Mode feature set
           *    Power Management feature set
           *    Write cache
           *    Look-ahead
           *    Host Protected Area feature set
           *    WRITE_BUFFER command
           *    READ_BUFFER command
           *    NOP cmd
           *    DOWNLOAD_MICROCODE
           *    Advanced Power Management feature set
                SET_MAX security extension
           *    48-bit Address feature set
           *    Device Configuration Overlay feature set
           *    Mandatory FLUSH_CACHE
           *    FLUSH_CACHE_EXT
           *    SMART error logging
           *    SMART self-test
           *    General Purpose Logging feature set
           *    WRITE_{DMA|MULTIPLE}_FUA_EXT
           *    64-bit World wide name
           *    IDLE_IMMEDIATE with UNLOAD
                Write-Read-Verify feature set
           *    WRITE_UNCORRECTABLE_EXT command
           *    {READ,WRITE}_DMA_EXT_GPL commands
           *    Segmented DOWNLOAD_MICROCODE
           *    Gen1 signaling speed (1.5Gb/s)
           *    Gen2 signaling speed (3.0Gb/s)
           *    Gen3 signaling speed (6.0Gb/s)
           *    Native Command Queueing (NCQ)
           *    Phy event counters
           *    NCQ priority information
           *    DMA Setup Auto-Activate optimization
                Device-initiated interface power management
           *    Software settings preservation
           *    SMART Command Transport (SCT) feature set
           *    SCT Write Same (AC2)
           *    SCT Error Recovery Control (AC3)
           *    SCT Features Control (AC4)
           *    SCT Data Tables (AC5)
           *    Data Set Management TRIM supported (limit 8 blocks)
           *    Deterministic read data after TRIM
Security: 
        Master password revision code = 65534
                supported
        not     enabled
        not     locked
                frozen
        not     expired: security count
                supported: enhanced erase
        2min for SECURITY ERASE UNIT. 2min for ENHANCED SECURITY ERASE UNIT. 
Logical Unit WWN Device Identifier: 500a07510920b519
        NAA             : 5
        IEEE OUI        : 00a075
        Unique ID       : 10920b519
Checksum: correct
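
The two TRIM lines are easy to miss in output this long. A small sketch for extracting them from saved `hdparm -I` text follows; it assumes the layout shown above, where enabled features are prefixed with '*', and the function name is hypothetical.

```python
def trim_features(hdparm_text):
    """Return enabled feature lines from `hdparm -I` output that mention TRIM."""
    features = []
    for line in hdparm_text.splitlines():
        stripped = line.strip()
        # Enabled features are marked with a leading '*' in the feature list.
        if stripped.startswith("*") and "TRIM" in stripped:
            features.append(stripped.lstrip("*").strip())
    return features
```

An empty result suggests the drive itself does not advertise TRIM; a non-empty one, as here, means the limitation is somewhere above the drive in the block stack.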

Comment 1 Stephen Gallagher 2013-04-30 11:42:57 UTC
Created attachment 741839
Complete dmesg output

Comment 2 Josh Boyer 2013-04-30 12:45:19 UTC
So the SSD might support TRIM, but as far as I know LVM doesn't pass that down to the underlying block devices by default.  What is issue_discards set to in /etc/lvm/lvm.conf?

Comment 3 Stephen Gallagher 2013-04-30 12:52:21 UTC
It is set to zero. I take it I should set it to one?

Comment 4 Alasdair Kergon 2013-04-30 13:07:11 UTC
lvm.conf issue_discards controls LVM's *application-level* use:

    # Issue discards to a logical volumes's underlying physical volume(s) when
    # the logical volume is no longer using the physical volumes' space (e.g.
    # lvremove, lvreduce, etc).

Comment 5 Zdenek Kabelac 2013-04-30 13:13:34 UTC
(In reply to comment #2)
> So the SSD might support TRIM, but as far as I know LVM doesn't pass that
> down to the underlying block devices by default.  What is issue_discards set
> to in /etc/lvm/lvm.conf?

You are mixing several different things together here.

If you use your SSD as a PV and create an LV on that PV, TRIM commands will be passed through. You can easily check this: just create the device and look at the sysfs entries for your dm device.
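
As a sketch of that sysfs check: the block layer exposes a `queue/discard_max_bytes` attribute per device, and a value of 0 is commonly taken to mean that the device does not accept discards. The helper names and the `sys_block` parameter below are hypothetical; the parameter exists only so the functions can be pointed at a test directory.

```python
from pathlib import Path


def discard_max_bytes(device, sys_block="/sys/block"):
    """Read queue/discard_max_bytes for a block device such as 'dm-3' or 'sda'."""
    path = Path(sys_block) / device / "queue" / "discard_max_bytes"
    return int(path.read_text().strip())


def supports_discard(device, sys_block="/sys/block"):
    """Treat a non-zero discard_max_bytes as 'device accepts discards'."""
    return discard_max_bytes(device, sys_block) > 0
```

Comparing the result for sda, dm-0 (the LUKS mapping) and dm-3 would show at which layer discard support disappears.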

The lvm.conf entry is purely for discarding space when you remove an LV. That operation is irreversible, so users should be really careful before turning it on. When you create e.g. an ext4 filesystem on an empty LV, the space is discarded by default anyway, so this setting is only useful in some special cases.


Now back to the originally reported issue: the problem is not a normal dm device but your encrypted LUKS device. Discarding data on an encrypted device can reveal some information to an attacker (e.g. many SSDs return zeros for trimmed space, which shows which blocks are unused), so discard support is not enabled by default for encrypted storage.

See e.g. this email for more info about 'allow_discards' on dm-crypt devices:
http://www.linux-archive.org/device-mapper-development/552400-dm-crypt-add-mapping-table-option-allowing-discard-requests.html
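
Whether an active mapping already has the option can be read from its device-mapper table, where dm-crypt lists optional parameters after the fixed fields. Below is a minimal sketch that inspects one table line as printed by `dmsetup table`; the function name is hypothetical, and the field positions assume the documented dm-crypt layout (start, length, "crypt", cipher, key, IV offset, device, offset, option count, options).

```python
def crypt_allows_discards(table_line):
    """Return True if a dm-crypt table line carries the allow_discards option."""
    fields = table_line.split()
    # `dmsetup table` without a device argument prefixes each line with "name:".
    if fields and fields[0].endswith(":"):
        fields = fields[1:]
    if len(fields) < 8 or fields[2] != "crypt":
        raise ValueError("not a dm-crypt table line")
    # fields[8] is the number of optional parameters; they follow from fields[9].
    return "allow_discards" in fields[9:]
```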

Comment 6 Alasdair Kergon 2013-04-30 13:17:31 UTC
If potential side-channel attacks do not bother you, you can use cryptsetup's --allow-discards option.

http://asalor.blogspot.co.uk/2011/08/trim-dm-crypt-problems.html
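
For volumes unlocked at boot, the equivalent is usually set in the options field of /etc/crypttab rather than on the command line. The sketch below appends an option to a single crypttab line. It is an illustration under assumptions: the helper name is hypothetical, and the default option name 'discard' is taken from current crypttab(5) documentation; older releases used 'allow-discards', so check the man page on the system in question before relying on it.

```python
def add_crypttab_option(line, option="discard"):
    """Return a crypttab line with `option` present in its fourth (options) field.

    crypttab fields are: name, device, key file, options. A missing key-file
    field is filled with 'none' so the options field lands in the right place.
    """
    fields = line.split()
    if len(fields) < 2:
        raise ValueError("crypttab line needs at least a name and a device")
    while len(fields) < 3:
        fields.append("none")
    options = fields[3].split(",") if len(fields) > 3 else []
    if option not in options:
        options.append(option)
    return " ".join(fields[:3] + [",".join(options)])
```

The function leaves the line unchanged when the option is already present, so it can be applied repeatedly.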

Comment 7 Josh Boyer 2013-04-30 13:26:28 UTC
Thanks for chiming in and correcting my misunderstandings.  Really appreciate it.

