Bug 1346908 - systemd-hibernate fails without explanation
Summary: systemd-hibernate fails without explanation
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: systemd
Version: 23
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: systemd-maint
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-06-15 15:49 UTC by Patrick O'Callaghan
Modified: 2016-06-18 09:20 UTC (History)
CC List: 16 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-06-17 16:24:35 UTC
Type: Bug


Attachments
Journal entry from failed hibernation (6.84 KB, text/plain)
2016-06-15 15:49 UTC, Patrick O'Callaghan
Another journal extract (4.57 KB, text/plain)
2016-06-16 22:51 UTC, Patrick O'Callaghan

Description Patrick O'Callaghan 2016-06-15 15:49:57 UTC
Created attachment 1168431
Journal entry from failed hibernation

Description of problem:
When running some processes (e.g. Google Chrome) hibernation is inhibited but no explanation is given. If Chrome is killed, then hibernation works as expected.

Version-Release number of selected component (if applicable):
systemd-222-14.fc23.x86_64

How reproducible:
Every time

Steps to Reproduce:
1. Run Google Chrome
2. Attempt to hibernate the system (e.g. via DE menu selection)

Actual results:
Screen blanks, then returns after a few seconds.

Expected results:
System hibernates.

Additional info:
(Reported originally at https://bugzilla.redhat.com/show_bug.cgi?id=1346811)

There seems to be a system policy that some non-privileged processes can inhibit hibernation (though it doesn't appear to be documented for the average user). However if such inhibition happens, there is no user-level indication aside from the screen blanking and restoring, and the journal entry appears to report the fact as a failure. See the attached file.

It should be pointed out that in identical circumstances, system suspend *does* work, so whether or not this is actually policy is even more unclear.

Comment 1 Rex Dieter 2016-06-15 16:04:12 UTC
Try running:

$ systemd-inhibit

and post the output

Comment 2 Rex Dieter 2016-06-15 16:07:42 UTC
That said, if inhibitors are active (and hibernation is currently blocked), I wouldn't expect systemd to say hibernate is available at the moment either.  For example, I'd expect this to not return "yes":

$ qdbus --system org.freedesktop.login1 /org/freedesktop/login1 org.freedesktop.login1.Manager.CanHibernate

 But, that may just be an implementation quirk.

Comment 3 Patrick O'Callaghan 2016-06-15 21:42:58 UTC
(In reply to Rex Dieter from comment #1)
> Try running:
> 
> $ systemd-inhibit
> 
> and post the output

$ systemd-inhibit
     Who: Telepathy (UID 1000/poc, PID 2508/mission-control)
    What: shutdown:sleep
     Why: Disconnecting IM accounts before suspend/shutdown...
    Mode: delay

     Who: Screen Locker (UID 1000/poc, PID 1899/ksmserver)
    What: sleep
     Why: Ensuring that the screen gets locked before going to sleep
    Mode: delay

     Who: PowerDevil (UID 1000/poc, PID 1722/kded5)
    What: handle-power-key:handle-suspend-key:handle-hibernate-key:handle-lid-switch
     Why: KDE handles power events
    Mode: block

     Who: NetworkManager (UID 0/root, PID 972/NetworkManager)
    What: sleep
     Why: NetworkManager needs to turn off networks
    Mode: delay

4 inhibitors listed.
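
A minimal sketch of how the listing above can be checked for inhibitors that actually block sleep, as opposed to merely delaying it. It assumes the plain "Who/What/Why/Mode" layout shown in this comment (other systemd versions may print a different format), and the function names are hypothetical helpers, not part of any systemd API.

```python
# Sample copied from the `systemd-inhibit` output quoted in this comment.
SAMPLE = """\
     Who: Telepathy (UID 1000/poc, PID 2508/mission-control)
    What: shutdown:sleep
     Why: Disconnecting IM accounts before suspend/shutdown...
    Mode: delay

     Who: Screen Locker (UID 1000/poc, PID 1899/ksmserver)
    What: sleep
     Why: Ensuring that the screen gets locked before going to sleep
    Mode: delay

     Who: PowerDevil (UID 1000/poc, PID 1722/kded5)
    What: handle-power-key:handle-suspend-key:handle-hibernate-key:handle-lid-switch
     Why: KDE handles power events
    Mode: block

     Who: NetworkManager (UID 0/root, PID 972/NetworkManager)
    What: sleep
     Why: NetworkManager needs to turn off networks
    Mode: delay

4 inhibitors listed.
"""


def parse_inhibitors(text):
    """Split the listing into one dict per inhibitor (hypothetical helper)."""
    records, current = [], {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line ends the current record.
            if current:
                records.append(current)
                current = {}
            continue
        key, sep, value = line.partition(":")
        if sep and key in ("Who", "What", "Why", "Mode"):
            current[key] = value.strip()
    if current:
        records.append(current)
    return records


def blocking(records, operation):
    """Return the 'Who' of every inhibitor in block mode for `operation`."""
    return [
        r["Who"]
        for r in records
        if r.get("Mode") == "block" and operation in r.get("What", "").split(":")
    ]


records = parse_inhibitors(SAMPLE)
print(blocking(records, "sleep"))  # prints []
```

In this listing every "sleep" inhibitor is in delay mode, and the only block-mode entry (PowerDevil) covers key and lid handling rather than sleep itself.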

Comment 4 Patrick O'Callaghan 2016-06-15 21:43:58 UTC
(In reply to Rex Dieter from comment #2)
> That said, if inhibitors are active (and hibernation is currently blocked),
> I wouldn't expect systemd to say hibernate is available at the moment
> either.  For example, I'd expect this to not return "yes":
> 
> $ qdbus --system org.freedesktop.login1 /org/freedesktop/login1
> org.freedesktop.login1.Manager.CanHibernate
> 
>  But, that may just be an implementation quirk.

$ qdbus --system org.freedesktop.login1 /org/freedesktop/login1 org.freedesktop.login1.Manager.CanHibernate
na

Comment 5 Patrick O'Callaghan 2016-06-15 22:38:38 UTC
As mentioned earlier, I also don't understand why inhibiting hibernation doesn't also inhibit suspension, or is that a separate bug?

Comment 6 Rex Dieter 2016-06-15 23:56:18 UTC
Given that systemd-inhibit is not listing chrome, that seems to be a false alarm as the cause.

The real issue appears to be why CanHibernate isn't returning 'yes', meaning systemd will refuse to hibernate, for whatever reason, possibly: it thinks you don't have permission, or the system is not capable, or <some better explanation>.

Comment 7 Patrick O'Callaghan 2016-06-16 10:56:34 UTC
(In reply to Rex Dieter from comment #6)
> Given that systemd-inhibit is not listing chrome, that seems to be a false
> alarm as the cause.

If the presence of <A> means that <B> always fails, and the absence of <A> means that <B> always succeeds, it's reasonable to conclude that <A> causes the failure of <B>, but of course it might be an indirect cause (A causes C which causes B). I have no idea why or how, I'm just stating what happens on my system. I'm out of ideas on what to try.

> The real issue appears to be why CanHibernate isn't returning 'yes', meaning
> systemd will refuse to hibernate, for whatever reason, possibly: it thinks
> you don't have permission, or the system is not capable, or <some better
> explanation>.

It's as if the presence of a Chrome process is affecting the result of CanHibernate, but is not being properly registered.

(and once again: suspend always works, whether or not Chrome is running)

Comment 8 Patrick O'Callaghan 2016-06-16 22:49:48 UTC
Just got the same failure but with a somewhat different journal entry (attached).

Comment 9 Patrick O'Callaghan 2016-06-16 22:51:23 UTC
Created attachment 1168858
Another journal extract

Same failure, but a different journal entry.

Comment 10 Zbigniew Jędrzejewski-Szmek 2016-06-16 23:58:43 UTC
When checking for hibernation (the result that the CanHibernate call returns), inhibitors are *not* taken into account. CanHibernate only checks whether it would be possible to hibernate.

What CanHibernate checks is that free swap is enough to fit all data present in memory (i.e. ram - buffers - cache < swap size - swap used). So I'd guess that chrome consumes enough memory to go over this boundary, and when you kill it, memory usage goes down.
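
As a rough illustration of the rule that free swap must be large enough to hold the data in memory, the comparison could be sketched as below. This follows only the description in this comment; the exact fields and any safety margin that systemd uses are not stated here and may differ. The function name and the sample figures are hypothetical.

```python
def enough_free_swap(mem_total_kb, buffers_kb, cached_kb, swap_size_kb, swap_used_kb):
    """Return True if the data held in memory would fit into free swap.

    Hypothetical helper mirroring the description in the comment:
    data in memory = ram - buffers - cache, free swap = size - used.
    """
    in_memory_kb = mem_total_kb - buffers_kb - cached_kb
    free_swap_kb = swap_size_kb - swap_used_kb
    return in_memory_kb <= free_swap_kb


# Made-up figures (in kB), not taken from the reporter's machine.
print(enough_free_swap(8_000_000, 500_000, 2_500_000, 8_000_000, 1_000_000))  # prints True
print(enough_free_swap(8_000_000, 500_000, 2_500_000, 8_000_000, 4_000_000))  # prints False
```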

I guess we could extend the API to have a CanHibernate2 call which would try to return a reason and/or take inhibitors into account, but that'd require more discussion upstream, and buy in from gnome or other desktop people.

Comment 11 Zbigniew Jędrzejewski-Szmek 2016-06-17 00:06:53 UTC
I take my previous comment back after looking at the attached logs (it wasn't wrong, but it just doesn't apply here).

Hibernation starts, and fails in the kernel. So for some reason the kernel cannot finish hibernation successfully. What systemd does, is equivalent to writing "echo disk >/sys/power/state", so once the kernel starts writing memory to swap, things are out of our hands. Unfortunately I don't see anything in the logs which would explain the reason.
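
For reference, the same sysfs file can be read (not written) to see which sleep states the kernel advertises. A minimal sketch, assuming the usual space-separated content such as "freeze mem disk"; the helper names are hypothetical and nothing here triggers hibernation.

```python
from pathlib import Path


def supported_sleep_states(path="/sys/power/state"):
    """Return the sleep states listed in the sysfs file, or [] if unreadable."""
    try:
        return Path(path).read_text().split()
    except OSError:
        return []


def kernel_supports_hibernation(path="/sys/power/state"):
    """'disk' is the state name used for hibernation (suspend to disk)."""
    return "disk" in supported_sleep_states(path)


if __name__ == "__main__":
    print(supported_sleep_states())
    print(kernel_supports_hibernation())
```

Note that "disk" being listed only says the kernel offers hibernation at all; it says nothing about whether a particular attempt will succeed.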

I'll reassign the bug to the kernel, maybe they have an idea why hibernation fails halfway without explanation.


To answer previous comment:
(In reply to Patrick O'Callaghan from comment #5)
> As mentioned earlier, I also don't understand why inhibiting hibernation
> doesn't also inhibit suspension, or is that a separate bug?
Hibernation and suspend have completely different properties. In particular it is possible to suspend without any swap. So hibernation and suspend can be enabled/disabled separately.

Comment 12 Patrick O'Callaghan 2016-06-17 09:02:30 UTC
(In reply to Zbigniew Jędrzejewski-Szmek from comment #11)
> I take my previous comment back after looking at the attached logs (it
> wasn't wrong, but it just doesn't apply here).
> 
> Hibernation starts, and fails in the kernel. So for some reason the kernel
> cannot finish hibernation successfully. What systemd does, is equivalent to
> writing "echo disk >/sys/power/state", so once the kernel starts writing
> memory to swap, things are out of our hands. Unfortunately I don't see
> anything in the logs which would explain the reason.
> 
> I'll reassign the bug to the kernel, maybe they have an idea why hibernation
> fails halfway without explanation.

OK.

> To answer previous comment:
> (In reply to Patrick O'Callaghan from comment #5)
> > As mentioned earlier, I also don't understand why inhibiting hibernation
> > doesn't also inhibit suspension, or is that a separate bug?
> Hibernation and suspend have completely different properties. In particular
> it is possible to suspend without any swap. So hibernation and suspend can be
> enabled/disabled separately.

That makes sense, thanks for the clarification. However if the problem in this case isn't being caused by lack of swap, the important difference is something else.

For the record, this is my current swap state (with Chrome running):

$ swapon -s
Filename                                Type            Size    Used    Priority
/dev/sda6                               partition       8200188 4008392 -1

The system has 16GB of RAM, so potentially swap could run out though it's never happened.

Comment 13 Zbigniew Jędrzejewski-Szmek 2016-06-17 12:18:12 UTC
Can you paste the output of 'free'?

Comment 14 Patrick O'Callaghan 2016-06-17 12:34:37 UTC
(In reply to Zbigniew Jędrzejewski-Szmek from comment #13)
> Can you paste the output of 'free'?

$ free
              total        used        free      shared  buff/cache   available
Mem:       16361232    10606076      496588      563580     5258568     4748624
Swap:       8200188     3678888     4521300

Comment 15 Zbigniew Jędrzejewski-Szmek 2016-06-17 13:20:41 UTC
You have more memory used than total swap, so there's no way you can hibernate.
In this state, what does org.freedesktop.login1.Manager.CanHibernate return?
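
The comparison made here can be reproduced from the 'free' output in comment 14. A small sketch, with a hypothetical parser that assumes the column layout shown in that comment (values in kB):

```python
# Output of `free` as pasted in comment 14.
FREE_OUTPUT = """\
              total        used        free      shared  buff/cache   available
Mem:       16361232    10606076      496588      563580     5258568     4748624
Swap:       8200188     3678888     4521300
"""


def parse_free(text):
    """Map each row label ('Mem', 'Swap') to a dict of column name -> kB."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    rows = {}
    for line in lines[1:]:
        label, *values = line.split()
        # The Swap row has fewer columns; zip stops at the shorter sequence.
        rows[label.rstrip(":")] = dict(zip(header, map(int, values)))
    return rows


rows = parse_free(FREE_OUTPUT)
mem_used = rows["Mem"]["used"]
swap_total = rows["Swap"]["total"]
print(mem_used, swap_total, mem_used > swap_total)  # prints 10606076 8200188 True
```

With Chrome running, used memory (10606076 kB) exceeds even the total size of the swap partition (8200188 kB), before accounting for the swap that is already in use.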

Comment 16 Rex Dieter 2016-06-17 15:07:52 UTC
The CanHibernate question was asked in comment #2 and answered in comment #4: it returns na (or maybe that was a typo for no).

So, it appears this is really just a case of (at least) insufficient swap to reliably support hibernation.

Comment 17 Patrick O'Callaghan 2016-06-17 15:41:10 UTC
(In reply to Zbigniew Jędrzejewski-Szmek from comment #15)
> You have more memory used than total swap, so there's no way you can
> hibernate.
> In this state, what does org.freedesktop.login1.Manager.CanHibernate return?

Good call, now that I see it. Here is the equivalent after killing Chrome:

$ free
              total        used        free      shared  buff/cache   available
Mem:       16361232     5038296     6568368       47364     4754568    10830256
Swap:       8200188     3710516     4489672

So in fact it's not a bug but a feature. Now if only some part of the system or DE had told me that directly, e.g. via a meaningful message in the journal instead of the bald error indication, I wouldn't have wasted everyone's time.

Thanks to you and Rex.

Comment 18 Rex Dieter 2016-06-17 15:59:07 UTC
How are you initiating hibernation?  In plasma at least, the code is supposed to check if CanHibernate != no, before offering in the UI.

Comment 19 Patrick O'Callaghan 2016-06-17 16:13:50 UTC
(In reply to Rex Dieter from comment #18)
> How are you initiating hibernation?  In plasma at least, the code is
> supposed to check if CanHibernate != no, before offering in the UI.

I've tried both the Plasma menu item (under Launcher->Leave) and directly via "systemctl hibernate".

If Plasma only checks on startup, then it will think hibernation is possible as Chrome isn't running yet. It would need to check periodically, or at least when the user selects the option, in which case it could give a useful message.

Comment 20 Rex Dieter 2016-06-17 16:24:35 UTC
OK, thanks.

Hrm... checking periodically... not sure if that's practical, but will consider poking upstream about that.


Given the facts gathered so far, may as well close->notabug at this point

Comment 21 Patrick O'Callaghan 2016-06-17 17:04:50 UTC
(In reply to Rex Dieter from comment #20)
> OK, thanks.
> 
> Hrm... checking periodically... not sure if that's practical, but will
> consider poking upstream about that.
> 
> 
> Given the facts gathered so far, may as well close->notabug at this point

Agreed, a periodic check doesn't make sense. However a warning when hibernation is attempted would be useful and doesn't appear to be difficult to achieve.

Comment 22 Patrick O'Callaghan 2016-06-17 17:12:36 UTC
I tried adding a swap file, but the problem is still there:

$ cat /proc/swaps 
Filename                                Type            Size    Used    Priority
/dev/sda6                               partition       8200188 3659972 -1
/swap-extra                             file            8388604 0       -2
[poc@bree ~]$ free
              total        used        free      shared  buff/cache   available
Mem:       16361232     9775080     1084560      523432     5501592     5613148
Swap:      16588792     3659972    12928820

i.e. hibernation still fails with Chrome running, so maybe the problem is elsewhere.

Comment 23 Rex Dieter 2016-06-17 17:13:39 UTC
depends, what does CanHibernate call say now?

Comment 24 Patrick O'Callaghan 2016-06-17 17:33:58 UTC
(In reply to Rex Dieter from comment #23)
> depends, what does CanHibernate call say now?

$ qdbus --system org.freedesktop.login1 /org/freedesktop/login1 org.freedesktop.login1.Manager.CanHibernate
na

(Note that the result is "na", not "no". This is as before).

Comment 25 Rex Dieter 2016-06-17 17:44:31 UTC
According to
https://www.freedesktop.org/wiki/Software/systemd/logind/

If "na" is returned the operation is not available because hardware, kernel or drivers do not support it.
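
To make the distinction between the possible answers explicit, here is a small lookup sketch. The "na" wording is the one quoted above; the descriptions of the other values are paraphrased from the logind documentation and should be checked against it. The helper names are hypothetical.

```python
# Possible string results of CanHibernate and similar Can* calls.
CAN_RESULT_MEANINGS = {
    "yes": "supported, and the caller may do it without further authentication",
    "challenge": "supported, but the caller must authenticate first",
    "no": "supported, but the caller is not allowed to do it",
    "na": "not available because hardware, kernel or drivers do not support it",
}


def describe_can_result(result):
    """Return a human-readable explanation for a Can* result string."""
    return CAN_RESULT_MEANINGS.get(result, "unrecognized result: " + repr(result))


def is_available(result):
    """True only when the operation could actually be carried out."""
    return result in ("yes", "challenge")


print(describe_can_result("na"))
print(is_available("na"))  # prints False
```

A check of the form "result != no" would treat "na" as acceptable, whereas a check like is_available above would not.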

Comment 26 Patrick O'Callaghan 2016-06-17 22:54:35 UTC
(In reply to Rex Dieter from comment #25)
> According to
> https://www.freedesktop.org/wiki/Software/systemd/logind/
> 
> If "na" is returned the operation is not available because hardware, kernel
> or drivers do not support it.

When Chrome is not running, the test returns "yes" and hibernation works. When it is running, the result is "na" and hibernation doesn't work.

This cannot be related to hardware, kernel or drivers.

Comment 27 Zbigniew Jędrzejewski-Szmek 2016-06-18 00:47:24 UTC
Only one partition is used for hibernation, the kernel cannot resume from multiple swap devices. Adding a new swap device can still help, because if enough swapped-out pages are moved to the new device, there might be enough free space on the main swap partition to hibernate.
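
To illustrate why the extra swap file did not help directly, the free space can be computed per device from the /proc/swaps listing in comment 22. This is a sketch with a hypothetical parser; which device the kernel actually uses for the hibernation image is not stated in this report, so treating /dev/sda6 as that device is an assumption.

```python
# /proc/swaps as pasted in comment 22 (sizes in kB).
PROC_SWAPS = """\
Filename                                Type            Size    Used    Priority
/dev/sda6                               partition       8200188 3659972 -1
/swap-extra                             file            8388604 0       -2
"""


def free_swap_by_device(text):
    """Return {device: free kB} for every entry after the header line."""
    free = {}
    for line in text.strip().splitlines()[1:]:
        name, _type, size, used, _priority = line.split()
        free[name] = int(size) - int(used)
    return free


free = free_swap_by_device(PROC_SWAPS)
print(free["/dev/sda6"])   # prints 4540216
print(sum(free.values()))  # prints 12928820
```

The total matches the free swap reported by 'free' in comment 22, but only the 4540216 kB on the single hibernation device would count, assuming that device is /dev/sda6.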

Comment 28 Patrick O'Callaghan 2016-06-18 09:20:55 UTC
(In reply to Zbigniew Jędrzejewski-Szmek from comment #27)
> Only one partition is used for hibernation, the kernel cannot resume from
> multiple swap devices. Adding a new swap device can still help, because if
> enough swapped-out pages are moved to the new device, there might be enough
> free space on the main swap partition to hibernate.

I see, so unless there is a way to move swapped pages around to create space, there is no solution to this problem other than killing processes until it works.

All that remains is to give the user enough feedback so he knows what's going on.

