Created attachment 1407934
journal
Description of problem:
In some cases (I've tested 2 different desktops so far; the issue appeared on 1 of them), Fedora can't resume after suspend when launched from a DVD in an external USB drive.
Version-Release number of selected component (if applicable):
kernel-4.16.0-0.rc4.git0.1.fc28.x86_64
How reproducible:
Always on affected systems
Steps to Reproduce:
1. Run Fedora Live from a DVD in an external USB drive
2. Suspend
3. Resume
Actual results:
No applications can be launched after resuming. Attempting another suspend (which does not actually happen) crashes GNOME Shell and prevents the user from logging in again.
Expected results:
The system should continue to work just fine after resuming from suspend.
Proposed as a Blocker for 28-final by Fedora user frantisekz using the blocker tracking app because: There is no particular criterion that this would violate, but it affects the ability to run the installer or reboot after the installer has finished. We don't know what percentage of users would be affected by this. However, since Fedora now suspends after 20 minutes of idling by default, this issue is more likely to be noticed.
I am really not a fan of this. Suspend and hibernate tend to be flaky and the reasons why they fail tend to be very very hardware specific and may be linked to firmware. Unless suspend and hibernate are considered blocking features I'm against this being a blocker.
Laura: the other thing under discussion would be to resolve this by suppressing the auto-suspend behaviour for live sessions.
Discussed at 2018-03-19 Fedora 28 blocker review meeting: https://meetbot-raw.fedoraproject.org/fedora-blocker-review/2018-03-19/f28-blocker-review.2018-03-19-16.02.html . We agreed to delay the decision on the blocker status of this bug. We're definitely worried about this situation, but we want to do two things before making a decision:
1) Call for wider testing of suspending live sessions, so we can get a better indication of just how broken it is in real-life use
2) Re-start the desktop@ discussion about whether auto-suspend of live sessions should be disabled 'on principle'
We'll come back and look at this again in future meetings, hopefully with more data and input. Note, we're pretty solidly agreed that as well as "fix all problems with suspending live sessions", "disable auto-suspend of live sessions" would be a sufficient resolution of this issue.
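As a rough illustration of what "disable auto-suspend of live sessions" could amount to on the GNOME side, here is a minimal sketch. It is not the mechanism the live image actually uses. It assumes the idle action is controlled by the `sleep-inactive-ac-type` and `sleep-inactive-battery-type` keys of the `org.gnome.settings-daemon.plugins.power` schema, and the helper names are made up for this example.

```python
import shutil
import subprocess

# Assumption: these GNOME settings control what happens after the idle timeout.
SCHEMA = "org.gnome.settings-daemon.plugins.power"
KEYS = ("sleep-inactive-ac-type", "sleep-inactive-battery-type")


def disable_auto_suspend_commands():
    """Return the gsettings invocations that set both idle actions to 'nothing'."""
    return [["gsettings", "set", SCHEMA, key, "nothing"] for key in KEYS]


def apply(commands):
    """Run the commands for the current user; skip quietly if gsettings is missing."""
    if shutil.which("gsettings") is None:
        return False
    for argv in commands:
        subprocess.run(argv, check=True)
    return True


if __name__ == "__main__":
    # Only print the commands; call apply() to actually change the settings.
    for argv in disable_auto_suspend_commands():
        print(" ".join(argv))
```

Running the commands this way only changes the settings of the current user. A live image would more plausibly ship the same values as a system-wide default, but that packaging detail is outside this sketch.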
Discussed during the 2018-03-26 blocker review meeting: [1] The decision to delay the classification of this bug as a blocker was made as we wish to gather more testing data and also restart the discussion on whether auto-suspend should be disabled on live images before making a decision here. [1] https://meetbot.fedoraproject.org/fedora-blocker-review/2018-03-26/f28-blocker-review.2018-03-26-16.01.txt
On a laptop, suspend needs to work. I would consider that a blocking feature. Now, the uninstalled livecd scenario is all sorts of special, and I'd be in favor of getting rid of it, so we don't have to have these kinds of discussion so often.
Discussed during the 2018-04-02 blocker review meeting: [1] The decision to delay the classification of this bug as a blocker was made as we are still gathering data on it and need more time to do so. We will make a decision once we have more information. [1] https://meetbot.fedoraproject.org/fedora-blocker-review/2018-04-02/f28-blocker-review.2018-04-02-16.00.txt
(In reply to Matthias Clasen from comment #6)
> On a laptop, suspend needs to work. I would consider that a blocking feature.
Given Fedora kernels' proximity to upstream, and upstream's lack of testing on laptop hardware, let alone with suspend, I think a policy of blocking on suspend will translate into many blocker requests for different model-specific causes. Check out how deep in the weeds I was with my suspend bug with upstream. I lost count of how many kernels I had to build, and it took over two months of back and forth with upstream. In the end they said that even though a kernel regression revealed the problem, ultimately it's firmware, even though the problem doesn't happen on non-OEM Windows 10.
https://bugzilla.kernel.org/show_bug.cgi?id=185521
(In reply to Matthias Clasen from comment #6)
> On a laptop, suspend needs to work. I would consider that a blocking feature.
We never blocked on that, and kernel folks never wanted to block on that. Of course it would be nice to be able to make it work every time and block on it, I'm just describing the current state of things.
> Now, the uninstalled livecd scenario is all sorts of special, and I'd be in
> favor of getting rid of it, so we don't have to have these kinds of
> discussion so often.
I'm not sure I understand, are you OK with disabling autosuspend on Live images?
To me it doesn't make much sense to have autosuspend enabled on the live image by default. Isn't the live image used only for trying the system out once and then installing it? Does anyone use the live system as a daily workstation? If you are installing the system and autosuspend is enabled, will it suspend while installing? (Could that break the installation?) If there are problems on some systems, it's better not to enable autosuspend on the live image (only on the installed system, and perhaps only on battery by default?). If autosuspend only has problems on the live image, it doesn't make much sense to promote this to a blocker. Of course it would be great if all this worked without issues, but it shouldn't be a blocker for the release.
It's enabled on the beta so I'm sure we'll be seeing some feedback very soon now. Aside from such facts, I prefer autosuspend being a blocker bug because there's no Fedora policy, or resources to support suspend by default right now. Take the feature to FESCO and get it approved the usual way, in fact I'm wondering why this was changed without it being an explicit system wide feature change. It's a rather substantial change. But insofar as it's enabled now anyway, I think it's better to leave it enabled on the live, because otherwise it's bait and switch behavior to disable it on the live and then (essentially surreptitiously) have it enabled once installed.
"in fact I'm wondering why this was changed without it being an explicit system wide feature change. It's a rather substantial change."
It's an upstream change that's part of GNOME 3.28.

"because otherwise it's bait and switch behavior to disable it on the live and then (essentially surreptitiously) have it enabled once installed."
That's a weird description of something we already do for lots of other things, like update notification, for instance.

"If you are installing the system and the autosuspend is enabled, it will suspend while installing? (could break the installation?)"
No, the installer inhibits suspend.
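To illustrate what "inhibits suspend" means in practice, here is a minimal sketch of one general way a long-running task can hold a sleep inhibitor: wrapping it in `systemd-inhibit`. This is an assumption made for illustration and is not necessarily how the installer does it. The helper name and the reason string are hypothetical.

```python
def inhibited_command(command, why="Installation in progress"):
    """Wrap `command` so that sleep is blocked for as long as it runs.

    Uses the systemd-inhibit CLI; the inhibitor lock is released
    automatically when the wrapped command exits.
    """
    return [
        "systemd-inhibit",
        "--what=sleep",   # block suspend and hibernate
        "--mode=block",   # hold the lock outright rather than just delaying sleep
        f"--why={why}",   # human-readable reason
        *command,
    ]


if __name__ == "__main__":
    # Only print the command line; pass it to subprocess.run() to execute it.
    print(" ".join(inhibited_command(["sleep", "60"])))
```

While the wrapped command is running, `systemd-inhibit --list` shows the active lock.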
(In reply to Adam Williamson from comment #12)
> "in fact I'm wondering why this was changed without it being an explicit
> system wide feature change. It's a rather substantial change."
>
> It's an upstream change that's part of GNOME 3.28.
So? There are plenty of upstream changes in other projects that still must follow the change policy as far as I'm aware. Why is GNOME exempt from that change policy? Is it de facto exempt? I don't see an exception list.
> "because otherwise it's bait and switch behavior to disable it on the live
> and then (essentially surreptitiously) have it enabled once installed."
>
> That's a weird description of something we already do for lots of other
> things, like update notification, for instance.
We're not going to put your computer in an unsupported mode while LiveOS booted for absolutely no good reason we can think of, but we will put it into an unsupported mode after you've installed it. The only thing this is missing is canned laughter.
vs
While LiveOS booted, we're going to download a bunch of repo metadata and software updates that you probably can't even install because the overlay will likely blow up if you try.
One is fail danger with no coherency why LiveOS should be exempt from the default behavior. The other example is fail safe. Completely different things, not even remotely comparable.
>booted, we're going ^not
Test: Fedora live media - suspend / resume works fine on Dell Latitude E6420.
Matthias: "On a laptop, suspend needs to work. I would consider that a blocking feature." Just to amplify Kamil's and Chris's responses to this: we have never treated suspend in this way ("it's release blocking if suspend is known to fail on any laptop") and we do not have the resources to treat it that way. We have two generalist kernel engineers. They cannot be expected to fix any arbitrary suspend bug that pops up on our release timeframes. This is not a practical approach.
I've opened a FESCo ticket. https://pagure.io/fesco/issue/1873
Test case: Fedora live media - suspend / resume works fine on:
Base Board Information
    Manufacturer: MSI
    Product Name: 990XA-GD55 (MS-7640)
Processor Information
    Socket Designation: CPU 1
    Type: Central Processor
    Family: FX
    Manufacturer: AMD
    Version: AMD FX(tm)-8150 Eight-Core Processor
Memory Array Mapped Address
    Starting Address: 0x00000000000
    Ending Address: 0x003FFFFFFFF
    Range Size: 16 GB
lspci -nn
00:00.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD/ATI] RD9x0/RX980 Host Bridge [1002:5a14] (rev 02)
00:02.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD/ATI] RD890/RD9x0/RX980 PCI to PCI bridge (PCI Express GFX port 0) [1002:5a16]
00:05.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD/ATI] RD890/RD9x0/RX980 PCI to PCI bridge (PCI Express GPP Port 1) [1002:5a19]
00:07.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD/ATI] RD890/RD9x0/RX980 PCI to PCI bridge (PCI Express GPP Port 3) [1002:5a1b]
00:11.0 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 SATA Controller [IDE mode] [1002:4390] (rev 40)
00:12.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller [1002:4397]
00:12.2 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller [1002:4396]
00:13.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller [1002:4397]
00:13.2 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller [1002:4396]
00:14.0 SMBus [0c05]: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 SMBus Controller [1002:4385] (rev 42)
00:14.1 IDE interface [0101]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 IDE Controller [1002:439c] (rev 40)
00:14.2 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 Azalia (Intel HDA) [1002:4383] (rev 40)
00:14.3 ISA bridge [0601]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 LPC host controller [1002:439d] (rev 40)
00:14.4 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 PCI to PCI Bridge [1002:4384] (rev 40)
00:14.5 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI2 Controller [1002:4399]
00:16.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller [1002:4397]
00:16.2 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller [1002:4396]
00:18.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 0 [1022:1600]
00:18.1 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 1 [1022:1601]
00:18.2 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 2 [1022:1602]
00:18.3 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 3 [1022:1603]
00:18.4 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 4 [1022:1604]
00:18.5 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 5 [1022:1605]
01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Lexa PRO [Radeon RX 550] [1002:699f] (rev c7)
01:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Device [1002:aae0]
02:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [10ec:8168] (rev 06)
03:00.0 USB controller [0c03]: NEC Corporation uPD720200 USB 3.0 Host Controller [1033:0194] (rev 04)
For those who want to report resume success rate here: please always specify how you booted Fedora Workstation Live, whether from a USB flash drive, an integrated (SATA) optical drive, or an external (USB) optical drive. Thanks.
Sorry for not specifying: in my tests I have tried with a USB flash drive only.
We agreed to enable automatic suspend only when on battery power (laptop unplugged) in F29, and to disable it altogether in F28. We could opt to additionally disable it on live media. I'm not sure if that's really necessary, as users could just as well complain that their battery was drained because we didn't suspend when left unplugged, but it would certainly avoid cases like this. Chris seemed to object to treating live media differently up above. Other thoughts?
(In reply to Adam Williamson from comment #16)
> Matthias: "On a laptop, suspend needs to work. I would consider that a
> blocking feature."
>
> Just to amplify Kamil's and Chris's responses to this: we have never treated
> suspend in this way ("it's release blocking if suspend is known to fail on
> any laptop") and we do not have the resources to treat it that way. We have
> two generalist kernel engineers. They cannot be expected to fix any
> arbitrary suspend bug that pops up on our release timeframes. This is not a
> practical approach.
Suspend should surely be a blocker on Workstation, I would class that under the "menu sanity" criterion... it's core functionality and can't be rationalized away. Hibernate should not be, because we know that it rarely works, and we don't expose it in the user interface. It can be treated the same way we treat other blocker bugs: if the problem affects a large number of machines, then we have a blocker. If the problem only affects a few users, or specific hardware, or niche cases like when running from an external DVD drive that relies on USB power, that probably is not a blocker. I've never seen Fedora Workstation fail to suspend, and the complaints I've heard seem to be hardware-specific and certainly not widespread, so I think we're already in a good enough state here. Certainly "this does not work, but only on some particular hardware" should never be a release blocker. If a new kernel breaks suspend for a large number of users and nobody knows how to fix it, I think the Workstation WG would very likely insist on going back to the older kernel, but I'm not aware of any suspend problem of such severity ever occurring in the entire history of Fedora Workstation. I guess the kernel folks might have a different perspective from having to deal with the suspend issues, but from our high-level workstation view, I think we're already fairly satisfied with how things are going.
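The auto-suspend policy described at the start of this comment (off altogether in F28, on only when running on battery in F29, optionally off on live media) can be written down as a small decision function. This is only a sketch of the stated rules; the function and parameter names are hypothetical, and treating releases after F29 the same as F29 is an assumption.

```python
def auto_suspend_enabled(release, on_battery, live_session=False, disable_on_live=False):
    """Return True if the session should auto-suspend when idle.

    release          Fedora release number, e.g. 28 or 29
    on_battery       True if the machine is running unplugged
    live_session     True when booted from live media
    disable_on_live  the optional extra rule floated in the comment
    """
    if disable_on_live and live_session:
        return False
    if release <= 28:
        # F28: disabled altogether.
        return False
    # F29 (and, by assumption, later): only when on battery power.
    return on_battery


if __name__ == "__main__":
    for release, on_battery in [(28, True), (29, False), (29, True)]:
        print(release, on_battery, auto_suspend_enabled(release, on_battery))
```

With `disable_on_live=True`, a live session never auto-suspends regardless of release or power source, which is the variant that would avoid cases like the one reported here.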
Discussed during the 2018-04-09 blocker review meeting: [1] The decision to classify this bug as a RejectedBlocker was made as the Workstation WG voted today to disable auto-suspend for F28 Final: https://pagure.io/fedora-workstation/issue/42 . As this was proposed as a blocker due to the auto-suspend behaviour and that will be taken out, we no longer have any grounds to accept it. [1] https://meetbot.fedoraproject.org/fedora-blocker-review/2018-04-09/f28-blocker-review.2018-04-09-16.01.txt
*********** MASS BUG UPDATE ************** We apologize for the inconvenience. There are a large number of bugs to go through and several of them have gone stale. Due to this, we are doing a mass bug update across all of the Fedora 28 kernel bugs. Fedora 28 has now been rebased to 4.17.7-200.fc28. Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel. If you experience different issues, please open a new bug report for those.
*********** MASS BUG UPDATE ************** This bug is being closed with INSUFFICIENT_DATA as there has not been a response in 5 weeks. If you are still experiencing this issue, please reopen and attach the relevant data from the latest kernel you are running and any data that might have been requested previously.