Bug 1487421 - PM961 NVME Controller Reset [NEEDINFO]
Summary: PM961 NVME Controller Reset
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 26
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2017-08-31 21:20 UTC by Dominic Robinson
Modified: 2019-04-22 11:01 UTC
CC: 25 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-05-29 12:10:37 UTC
labbott: needinfo? (development-K9RvgheM1OmXW9pm)


Attachments
Turn off deepest power saving mode for pm961 drives (783 bytes, patch)
2017-09-01 00:20 UTC, Dominic Robinson
no flags
Disable pm961 deeper sleep #2 (783 bytes, patch)
2017-09-01 18:48 UTC, Dominic Robinson
no flags

Description Dominic Robinson 2017-08-31 21:20:01 UTC
Description of problem:
Since upgrading to the 4.11 kernel series (now on 4.12) I have been experiencing intermittent resets with my NVMe drive, a Samsung PM961. This doesn't happen with a fresh Fedora 25 install, Windows, or Ubuntu.

When this happens it's like the NVMe drive has been put into a locked state: the drive no longer presents itself to the bus, i.e. it doesn't show in the BIOS, won't boot, etc. Only a hard power-off cures this; a reset won't do.

To me it looks like a case of this:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1678184

Looking over various other bug reports, it looks like you may have introduced an APST quirk for the SM961, but not the PM961, which is more common in Europe; essentially the 960, SM961 and PM961 are all the same drive.

When this happens the filesystem becomes read-only and a kernel panic sets in shortly after, so I cannot provide logs, for obvious reasons.

Version-Release number of selected component (if applicable):
Fedora 26 4.12.8-300.fc26.x86_64

How reproducible:
This is an intermittent thing but usually occurs anywhere between 5 and 30 minutes from booting.


Steps to Reproduce:
1. Use a pm961 drive
2. Install Fedora
3. Boot, wait for the filesystem to become read-only then panic.

Actual results:
Drive disappears, filesystem becomes read only, kernel panic.

Expected results:
Drive stays connected, filesystem is writable, no kernel panic.

Additional info:

Comment 1 Dominic Robinson 2017-08-31 21:29:31 UTC
Also, I'm not sure how useful SMART data is on NVMe drives, but here's the output:

sudo smartctl /dev/nvme0n1 -a
[sudo] password for dominic: 
smartctl 6.5 2016-05-07 r4318 [x86_64-linux-4.12.8-300.fc26.x86_64] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       SAMSUNG MZVLW256HEHP-00000
Serial Number:                      XXXXXXXXXXXXXX
Firmware Version:                   CXB7001Q
PCI Vendor/Subsystem ID:            0x144d
IEEE OUI Identifier:                0x002538
Total NVM Capacity:                 256,060,514,304 [256 GB]
Unallocated NVM Capacity:           0
Controller ID:                      2
Number of Namespaces:               1
Namespace 1 Size/Capacity:          256,060,514,304 [256 GB]
Namespace 1 Utilization:            255,877,681,152 [255 GB]
Namespace 1 Formatted LBA Size:     512
Local Time is:                      Thu Aug 31 22:27:43 2017 BST
Firmware Updates (0x16):            3 Slots, no Reset required
Optional Admin Commands (0x0017):   Security Format Frmw_DL *Other*
Optional NVM Commands (0x001f):     Comp Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat
Warning  Comp. Temp. Threshold:     77 Celsius
Critical Comp. Temp. Threshold:     80 Celsius

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     7.60W       -        -    0  0  0  0        0       0
 1 +     6.00W       -        -    1  1  1  1        0       0
 2 +     5.10W       -        -    2  2  2  2        0       0
 3 -   0.0400W       -        -    3  3  3  3      210    1500
 4 -   0.0050W       -        -    4  4  4  4     2200    6000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02, NSID 0x1)
Critical Warning:                   0x00
Temperature:                        36 Celsius
Available Spare:                    100%
Available Spare Threshold:          10%
Percentage Used:                    3%
Data Units Read:                    7,431,775 [3.80 TB]
Data Units Written:                 7,659,188 [3.92 TB]
Host Read Commands:                 74,514,720
Host Write Commands:                120,638,330
Controller Busy Time:               395
Power Cycles:                       1,246
Power On Hours:                     782
Unsafe Shutdowns:                   71
Media and Data Integrity Errors:    0
Error Information Log Entries:      30
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0
Temperature Sensor 1:               36 Celsius

Error Information (NVMe Log 0x01, max 64 entries)
Num   ErrCount  SQId   CmdId  Status  PELoc          LBA  NSID    VS
  0         30     0  0x0018  0x4004  0x02c            0     0     -
  1         29     0  0x0017  0x4004  0x02c            0     0     -
  2         28     0  0x0018  0x4004  0x02c            0     0     -
  3         27     0  0x0017  0x4004  0x02c            0     0     -
  4         26     0  0x0018  0x4004  0x02c            0     0     -
  5         25     0  0x0017  0x4004  0x02c            0     0     -
  6         24     0  0x0018  0x4004  0x02c            0     0     -
  7         23     0  0x0017  0x4004  0x02c            0     0     -
  8         22     0  0x0018  0x4004  0x02c            0     0     -
  9         21     0  0x0017  0x4004  0x02c            0     0     -
 10         20     0  0x009f  0x4004      -            0     0     -
 11         19     0  0x0094  0x4004      -            0     0     -
 12         18     0  0x005f  0x4004      -            0     0     -
 13         17     0  0x0016  0x4004  0x02c            0     0     -
 14         16     0  0x0015  0x4004  0x02c            0     0     -
 15         15     0  0x00c2  0x4004  0x02c            0     0     -
... (14 entries not shown)

Error count has not increased since building the machine a year ago.

Comment 2 Dominic Robinson 2017-08-31 21:51:09 UTC
I've been digging some more - definitely looks like the apst issue, looking at the steps taken to debug here: https://www.mail-archive.com/kernel-packages@lists.launchpad.net/msg236507.html

I've been able to get the following output:
[dominic@hell01-ws01 ~]$ sudo nvme get-feature -f 0x0c -H /dev/nvme0n1
[sudo] password for dominic: 
get-feature:0xc (Autonomous Power State Transition), Current value:0x000001
	Autonomous Power State Transition Enable (APSTE): Enabled
	Auto PST Entries	.................
	Entry[ 0]   
	.................
	Idle Time Prior to Transition (ITPT): 86 ms
	Idle Transition Power State   (ITPS): 3
	.................
	Entry[ 1]   
	.................
	Idle Time Prior to Transition (ITPT): 86 ms
	Idle Transition Power State   (ITPS): 3
	.................
	Entry[ 2]   
	.................
	Idle Time Prior to Transition (ITPT): 86 ms
	Idle Transition Power State   (ITPS): 3
	.................
	Entry[ 3]   
	.................
	Idle Time Prior to Transition (ITPT): 410 ms
	Idle Transition Power State   (ITPS): 4
	.................
	Entry[ 4]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[ 5]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[ 6]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[ 7]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[ 8]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[ 9]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[10]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[11]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[12]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[13]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[14]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[15]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[16]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[17]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[18]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[19]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[20]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[21]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[22]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[23]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[24]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[25]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[26]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[27]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[28]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[29]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[30]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[31]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................

As you can see, the populated entries transition to the non-operational power states after very short idle times: 86 ms to state 3 and 410 ms to the deepest state 4. The remaining entries are unused.
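For reference, those ITPT values can be reproduced from the Supported Power States table in comment 1: the 4.12-era kernel programs each idle time as roughly 50x the state's total transition latency (entry + exit). A minimal sketch of that arithmetic (my reading of the kernel's nvme_configure_apst; the helper name is mine):

```python
# Sketch: derive the ITPT values shown above from the smartctl
# Ent_Lat/Ex_Lat columns (microseconds). The idle time is ~50x the
# total transition latency, expressed in milliseconds (i.e.
# total_us / 20, rounded up).

def itpt_ms(ent_lat_us, ex_lat_us):
    total_us = ent_lat_us + ex_lat_us
    return (total_us + 19) // 20

print(itpt_ms(210, 1500))   # PS3 -> 86 ms, matching entries 0-2 above
print(itpt_ms(2200, 6000))  # PS4 -> 410 ms, matching entry 3 above
```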

Comment 3 Dominic Robinson 2017-09-01 00:20:01 UTC
Created attachment 1320767 [details]
Turn off deepest power saving mode for pm961 drives

Compiling this now, will report back.

Comment 4 Dominic Robinson 2017-09-01 18:48:51 UTC
Created attachment 1321084 [details]
Disable pm961 deeper sleep #2

Ok late night - mistakes were made.

New patch attached; it compiled successfully. Going to test it for a day or so to see if the problem is fixed.

I'm not a kernel developer so I have limited understanding of what's going on; I'd greatly appreciate any input.

Comment 5 Dominic Robinson 2017-09-01 19:01:41 UTC
Also worth noting that I've just built a production system using a couple of these drives with RHEL; I'd prefer this bug not to filter down into any backports.

Comment 6 Dominic Robinson 2017-09-04 09:52:16 UTC
Can confirm that the attached patch has fixed this issue for me on Fedora 26.

Can we look at including this please?

Comment 7 Andy Lutomirski 2017-09-04 15:55:22 UTC
Upstream here :)

Can you give the relevant line of lspci -nn output?  More importantly, can you tell us what kind of computer this is and give dmidecode output?  The kernel logs when the device fails would be nice, too.

Comment 8 Dominic Robinson 2017-09-04 17:21:33 UTC
Hi Andy

Yes, the lspci output (the device ID and vendor ID are included in the above patch) is:

01:00.0 Non-Volatile memory controller [0108]: Samsung Electronics Co Ltd NVMe SSD Controller SM961/PM961 [144d:a804]

dmidecode output here:
https://www.dcrdev.com/dmidecode.txt

Mainboard is a MSI B150I GAMING Pro, coupled with a i3 6100 cpu.

It's difficult for me to provide logs; I don't think the relevant entries are getting written to disk before the drive goes offline and the filesystem becomes read-only. The only indication that something is wrong in the logs is that fsck was run frequently within a short space of time, i.e. after the hard resets I performed.

I mean this is really the only evidence of it happening:
https://www.dcrdev.com/17_08_29_13_40_24_0863.jpg

^ but that's symptomatic 

What I can say, though, is that it definitely appears to be this deeper sleep mode being triggered; since applying my patch I'm not having this issue.

It seems like you're already implementing this workaround upstream: https://github.com/torvalds/linux/blob/v4.12/drivers/nvme/host/pci.c#L2074 -- except only when coupled with select Dell mainboards.
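For anyone reading along, the workaround linked above is a per-device quirk table entry (NVME_QUIRK_NO_DEEPEST_PS) that leaves the deepest non-operational power state out of the APST table. A rough Python sketch of the effect (the function and data layout are mine, not the kernel's):

```python
# Conceptual sketch of NVME_QUIRK_NO_DEEPEST_PS: when the quirk matches
# a device (e.g. 144d:a804 here), the deepest non-operational power
# state is excluded from the APST candidates.

def apst_candidate_states(states, no_deepest_ps=False):
    """states: list of (ps_id, non_operational) pairs, shallowest first."""
    candidates = [ps for ps, non_op in states if non_op]
    if no_deepest_ps and candidates:
        candidates.pop()  # drop the deepest state (PS4 on this drive)
    return candidates

# PM961: PS0-2 are operational, PS3-4 non-operational (see comment 1)
pm961 = [(0, False), (1, False), (2, False), (3, True), (4, True)]
print(apst_candidate_states(pm961))                      # [3, 4]
print(apst_candidate_states(pm961, no_deepest_ps=True))  # [3]
```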

Comment 9 Andy Lutomirski 2017-09-05 22:40:01 UTC
Lovely.  I wonder how widespread this issue is.

Comment 10 Dominic Robinson 2017-09-05 23:02:22 UTC
I don't know, but at least one other person has reported this issue against this drive on non-Dell hardware here:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1678184

I've got a couple of these drives in raid1 on my rhel server, they have different firmware. I built the machine fairly recently, but whilst I was setting up I had to use a Fedora live image to chroot into the system and encountered some strange issues around the filesystem; in hindsight it was probably this issue.
smartctl 6.5 2016-05-07 r4318 [x86_64-linux-3.10.0-693.1.1.el7.x86_64] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       SAMSUNG MZVLW256HEHP-00000
Serial Number:                      XXXXXXXX
Firmware Version:                   CXB7401Q
PCI Vendor/Subsystem ID:            0x144d
IEEE OUI Identifier:                0x002538
Total NVM Capacity:                 256,060,514,304 [256 GB]
Unallocated NVM Capacity:           0
Controller ID:                      2
Number of Namespaces:               1
Namespace 1 Size/Capacity:          256,060,514,304 [256 GB]
Namespace 1 Utilization:            184,719,413,248 [184 GB]
Namespace 1 Formatted LBA Size:     512
Local Time is:                      Tue Sep  5 23:44:31 2017 BST
Firmware Updates (0x16):            3 Slots, no Reset required
Optional Admin Commands (0x0017):   Security Format Frmw_DL *Other*
Optional NVM Commands (0x001f):     Comp Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat
Warning  Comp. Temp. Threshold:     68 Celsius
Critical Comp. Temp. Threshold:     71 Celsius


Fortunately NVMe APST isn't implemented on RHEL yet, but as you can see there are similar entries:
nvme get-feature -f 0x0c -H /dev/nvme0n1
get-feature:0xc (Autonomous Power State Transition), Current value:00000000
	Autonomous Power State Transition Enable (APSTE): Disabled
	Auto PST Entries	.................
	Entry[ 0]   
	.................
	Idle Time Prior to Transition (ITPT): 60 ms
	Idle Transition Power State   (ITPS): 3
	.................
	Entry[ 1]   
	.................
	Idle Time Prior to Transition (ITPT): 60 ms
	Idle Transition Power State   (ITPS): 3
	.................
	Entry[ 2]   
	.................
	Idle Time Prior to Transition (ITPT): 60 ms
	Idle Transition Power State   (ITPS): 3
	.................
	Entry[ 3]   
	.................
	Idle Time Prior to Transition (ITPT): 9940 ms
	Idle Transition Power State   (ITPS): 4
	.................
	Entry[ 4]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[ 5]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[ 6]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[ 7]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[ 8]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[ 9]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[10]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[11]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[12]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[13]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[14]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[15]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[16]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[17]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[18]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[19]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[20]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[21]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[22]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[23]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[24]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[25]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[26]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[27]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[28]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[29]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[30]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................
	Entry[31]   
	.................
	Idle Time Prior to Transition (ITPT): 0 ms
	Idle Transition Power State   (ITPS): 0
	.................

Do you think there's any chance the workaround I posted could be merged? I appreciate it's not necessarily the best thing to have lots of tiny hardcoded workarounds for specific hardware. I'm happy to offer my time to get to the root cause if that's what's needed. Unfortunately I've got quite a lot invested in these drives, having set up several systems.

Comment 11 Dominic Robinson 2017-09-05 23:08:41 UTC
^ Also the rhel server I was referring to is on a completely different platform ASRock Rack E3C236D2I C236 mainboard / Xeon E3-1245v6 cpu.

Comment 12 Andy Lutomirski 2017-09-05 23:10:26 UTC
I've reached out to Samsung.  The problem with applying your patch is that it would cause a fairly large power consumption regression on laptops.

In the mean time, you should be able to work around the issue by booting with nvme_core.default_ps_max_latency_us=5500 or so.
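To see why a value around 5500 does the trick: the parameter caps the total (entry + exit) latency of any power state APST is allowed to use. With the PM961 latencies from the smartctl output in comment 1, that keeps PS3 but excludes PS4. A small sketch (names are mine):

```python
# Sketch: which non-operational power states fit under a given
# nvme_core.default_ps_max_latency_us cap, using the PM961 latencies
# from comment 1 (entry + exit, in microseconds).

TOTAL_LAT_US = {3: 210 + 1500, 4: 2200 + 6000}  # PS3: 1710, PS4: 8200

def allowed_states(max_latency_us):
    return [ps for ps, lat in sorted(TOTAL_LAT_US.items())
            if lat <= max_latency_us]

print(allowed_states(5500))  # [3] -> PS4 (8200 us) is excluded
print(allowed_states(0))     # []  -> APST effectively disabled
```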

Comment 13 Dominic Robinson 2017-09-06 00:11:40 UTC
Thanks.

FYI - Installed the stock Fedora kernel, added that parameter, and within minutes had the same issue. I've just disabled APST altogether for now by setting it to 0, which works as expected.

Comment 14 Laura Abbott 2018-02-28 03:47:00 UTC
We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale. The kernel moves very fast so bugs may get fixed as part of a kernel update. Due to this, we are doing a mass bug update across all of the Fedora 26 kernel bugs.
 
Fedora 26 has now been rebased to 4.15.4-200.fc26.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.
 
If you have moved on to Fedora 27, and are still experiencing this issue, please change the version to Fedora 27.
 
If you experience different issues, please open a new bug report for those.

Comment 15 Fedora End Of Life 2018-05-03 09:02:20 UTC
This message is a reminder that Fedora 26 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 26. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora  'version'
of '26'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 26 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version prior to this bug being closed, as described in the policy above.

Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.

Comment 16 Fedora End Of Life 2018-05-29 12:10:37 UTC
Fedora 26 changed to end-of-life (EOL) status on 2018-05-29. Fedora 26
is no longer maintained, which means that it will not receive any
further security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

Comment 17 Denis Auroux 2018-07-14 13:20:25 UTC
Problem still occurs with Fedora 28, especially worse with 4.17 kernels.

Comment 18 Denis Auroux 2018-07-14 13:25:40 UTC
See also 
https://bbs.archlinux.org/viewtopic.php?id=238547
This is not Fedora-specific.

My problem is on a Thinkpad X1 Yoga with a Samsung PM960 series 512GB NVMe SSD. 

The kernel option nvme_core.default_ps_max_latency_us=6000, found somewhere else, helped avoid the issue with the later 4.16 kernels, but the 4.17 kernels now break it again.  Will try changing to 5500 as suggested here, but I'm not expecting any miracles.

Comment 19 Denis Auroux 2018-08-22 15:00:16 UTC
After a month of usage, nvme_core.default_ps_max_latency_us=200 is stable, no more ext4 errors and crashes... but battery life has shrunk quite a lot.

Would it be too much to ask for this to be reopened against Fedora 28 and looked into seriously?  It's a major kernel bug given how common these Samsung NVMe SSDs are these days.  We shouldn't have to choose between random crashes that may or may not corrupt the filesystem and decent battery life.

Comment 20 Denis Auroux 2018-09-02 11:26:13 UTC
I'm now having ext4 errors again with kernel 4.17.19, even with nvme_core.default_ps_max_latency_us=200.  Getting worse and worse!!

Comment 21 Andy Lutomirski 2018-09-02 15:51:03 UTC
It's not clear to me that there's anything the kernel can really do about this.  From some past experience with these issues, the root cause *seems* to be that there are a handful of laptops out there with bona fide electrical problems.  For whatever reason, they're exacerbated by NVMe APST, but that's not the root cause.  Further confounding anyone's ability to test anything, at least some of the affected laptops only seem to be affected depending on whether they're plugged in, which makes it extremely hard to tell what's going on.

It's also the case that it's basically impossible for a genuine kernel bug to exist here.  The kernel is merely asking the hardware politely to save power.  If the hardware screws it up, it is a hardware problem.  At best the kernel could try to work around it, but it's not clear when or how to do this.

Comment 22 Denis Auroux 2018-09-02 17:40:29 UTC
It's a serious bug that only occurs with recent versions of the kernel, on pretty common hardware (since Lenovo and Dell both seem to commonly use Samsung SSDs). Perhaps it's not strictly speaking a kernel bug, but if the kernel doesn't work properly on fairly widespread hardware then it's a problem for the kernel. For a production system, random filesystem crashes are just not acceptable.

Pre-APST kernels were very power efficient on laptops with these Samsung SSDs -- my experience in terms of battery life was that APST support didn't improve battery life but caused crashes, and working around it with the latency parameter avoided crashes but degraded battery life significantly.  I'm sure that APST is meant to be the right way to manage power on most NVMe SSDs, but for the sake of everyone with a Samsung controller, it would be nice to have an option to just bypass APST entirely and return to older kernels' behavior [not sure what that means exactly], rather than having to tinker with latency parameters and hope that they're just right.

Or are we supposed to throw away a nearly new laptop that works well in all other settings? Return to kernel 4.8 or thereabouts (can't remember when this nonsense started exactly)? Switch to a different distribution?

Anyway -- I appreciate that this may not be easy to fix, but I want to make sure that developers are aware this is an ongoing issue and is further exacerbated by recent kernel changes -- I don't want this bug to be swept under the rug due to "Fedora 26 is EOL" and "the bug report is old" when in fact it is getting worse and worse with newer kernels (at least on my system).

Comment 23 Denis Auroux 2018-09-02 19:04:18 UTC
Update: discovered that Samsung has a firmware update for these SSDs. Hard to know what it covers (found no clear indication that it addresses the APST issue), but who knows.  I've just upgraded my Samsung 512 GB PM960 m.2 disk (model MZVLW512HMJP-000L7) from firmware 6L7QCXY7 to 7L7QCXY7.  We'll see if this helps.

(Of course a firmware fix on Samsung's end would be really the right way to deal with this. Doesn't mean it's happening, but keeping fingers crossed).

Sorry, I should have checked for this firmware upgrade before resuming my periodic screaming at the kernel over this issue.  For now I'll keep the very aggressive and power-hungry latency setting (200) because I *really* need my system to be stable in the coming weeks, but will take a chance and continue to boot 4.17.19 (or subsequent once available); will report again if crashes continue to occur with the new firmware and this setting.

Comment 24 Andy Lutomirski 2018-09-02 21:35:51 UTC
I'd appreciate a report on whether the firmware helps.

But "screaming at the kernel" won't get too far.  As the kernel person who implemented APST in the first place, I'm quite confident about this...

And, for what it's worth, despite your experience of not saving too much power, there are a lot of systems where APST makes a shockingly large difference.  It's not just the power saved in the SSD itself -- various systems seem to require that the SSD goes to sleep before the PCIe link goes into a deep ASPM state, and they require that the PCIe links all be in deep ASPM states before the CPU package goes into a deep PC state, and they need that deep PC state to get good battery life.  Apparently Intel also suggests that failing to use deep PC states may adversely affect the lifespan of the system as a whole, too.

Comment 25 Josh Harness 2018-09-19 14:12:37 UTC
Hey Denis - I'm having the same issue and would also like to try the firmware update. I'm having trouble finding it on Samsung's site. Where did you find it for your model? I'm using a Samsung PM951 NVMe SSD.

Comment 26 Denis Auroux 2018-09-19 15:19:27 UTC
I'm still using the power-hungry latency setting (200) as I can't afford extra crashes at the moment, so I'm not sure exactly how much the upgrade helped.  

Qualitatively it seems to have helped some: kernel 4.17.19, even with this very low latency setting, crashed roughly twice a week before the firmware upgrade, and ran for 11 days after the firmware upgrade before producing an APST-related ext4 filesystem crash. The fact that it still crashed, though, indicates that the firmware update didn't sort things out completely. I'm now running 4.18.5, which has been well-behaved for 5 days so far.

I didn't get the firmware upgrade directly from Samsung's site; I got it from Lenovo (after rebooting into Windows). If you have a Thinkpad, look up the Lenovo NVMe SSD firmware update utility.  I am under the impression that Samsung's firmware upgrade tool only works for the SSDs they sell directly to consumers; if your Samsung SSD was an OEM product (shipped with your machine) then you're expected to get the upgrade from your machine's manufacturer. (But keep looking on both sides in case I'm wrong about this.)

Denis

Comment 27 Josh Harness 2018-09-21 15:51:49 UTC
Very helpful - thanks Denis!

Comment 28 Volker Braun 2019-04-22 11:01:44 UTC
I'm also hitting this bug. Just updated the firmware, which can be done under Linux with nvme-cli. My old firmware version was 3L7QCXB7:

[root@zen ~]# nvme list
Node             SN                   Model                                    Namespace Usage                      Format           FW Rev  
---------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme0n1     S35ENX0HC04495       SAMSUNG MZVLW256HEHP-000L7               1          62.89  GB / 256.06  GB    512   B +  0 B   5L7QCXB7
[root@zen ~]# nvme id-ctrl /dev/nvme0 | grep fr
fr        : 3L7QCXB7
frmw      : 0x16

Download and unzip firmware from
https://pcsupport.lenovo.com/gb/en/products/laptops-and-netbooks/thinkpad-t-series-laptops/thinkpad-t470s/downloads/ds119265

Figure out the firmware for your model:

[root@zen ~]# grep MZVLW256HEHP-000L7 FWNV30/fwwinsd.pro 
"SAMSUNG MZVLW256HEHP-000L7","4L7QCXB7","5L7QCXB7","5L7QCXB7_NF_ENC.bin","RaidFWUpdate_V1_1_6.exe","","S","SAMSUNG"

Upload and commit the firmware:

[root@zen ~]# nvme fw-download /dev/nvme0 --fw=FWNV30/SAMSUNG/5L7QCXB7_NF_ENC.bin
[root@zen ~]# nvme fw-commit /dev/nvme0 --slot=0 --action=1

Now reboot your computer. Note: "echo 1 >
/sys/class/nvme/nvme0/reset_controller", as suggested in the
nvme-fw-commit manpage, was not sufficient.

After a reboot you have the new version:

[root@zen ~]# nvme id-ctrl /dev/nvme0 | grep fr
fr        : 5L7QCXB7
frmw      : 0x16

