Bug 984124 - Hibernate fails intermittently
Hibernate fails intermittently
Status: CLOSED INSUFFICIENT_DATA
Product: Fedora
Classification: Fedora
Component: systemd (Show other bugs)
19
Unspecified Unspecified
unspecified Severity unspecified
: ---
: ---
Assigned To: systemd-maint
Fedora Extras Quality Assurance
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2013-07-12 18:08 EDT by Dimitris
Modified: 2013-09-13 15:42 EDT (History)
10 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-09-13 15:42:02 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)

  None (edit)
Description Dimitris 2013-07-12 18:08:06 EDT
Description of problem:

Every second attempt to hibernate fails.

Version-Release number of selected component (if applicable):

systemd-204-9.fc19.x86_64

How reproducible:

Every (second) time.

Steps to Reproduce:
1. Press hibernate key (Thinkpad X200s here, so Fn-F12).

Actual results:

Alternates (exactly one of each) between:

- System goes into hibernation.

- Screen turns off, then back on with GNOME shell lock activated.  System stays running.  In the logs:

Jul 12 14:30:12 gaspode systemd[1]: Starting Sleep.
Jul 12 14:30:12 gaspode systemd[1]: Reached target Sleep.
Jul 12 14:30:12 gaspode systemd[1]: Starting Hibernate...
Jul 12 14:30:12 gaspode systemd[1]: Starting Sleep.
Jul 12 14:30:12 gaspode systemd[1]: Reached target Sleep.
Jul 12 14:30:12 gaspode systemd[1]: Starting Hibernate...
Jul 12 14:30:12 gaspode kernel: PM: Hibernation mode set to 'platform'
Jul 12 14:30:12 gaspode systemd-sleep[6746]: Suspending system...
Jul 12 14:30:12 gaspode kernel: PM: Marking nosave pages: [mem 0x0009e000-0x000fffff]
Jul 12 14:30:12 gaspode kernel: PM: Marking nosave pages: [mem 0xbd1a1000-0xbd1a6fff]
Jul 12 14:30:12 gaspode kernel: PM: Marking nosave pages: [mem 0xbd2b7000-0xbd30efff]
Jul 12 14:30:12 gaspode kernel: PM: Marking nosave pages: [mem 0xbd3c6000-0xbd6fefff]
Jul 12 14:30:12 gaspode kernel: PM: Marking nosave pages: [mem 0xbd700000-0xffffffff]
Jul 12 14:30:12 gaspode kernel: PM: Basic memory bitmaps created
Jul 12 14:30:14 gaspode kernel: PM: Syncing filesystems ... done.
Jul 12 14:30:14 gaspode systemd[1]: systemd-hibernate.service: main process exited, code=exited, status=1/FAILURE
Jul 12 14:30:14 gaspode systemd[1]: Failed to start Hibernate.
Jul 12 14:30:14 gaspode systemd[1]: Dependency failed for Hibernate.
Jul 12 14:30:14 gaspode systemd[1]: Service sleep.target is not needed anymore. Stopping.
Jul 12 14:30:14 gaspode systemd[1]: Unit systemd-hibernate.service entered failed state.
Jul 12 14:30:14 gaspode systemd[1]: Stopping Sleep.
Jul 12 14:30:14 gaspode systemd[1]: Stopped target Sleep.
Jul 12 14:30:14 gaspode systemd-logind[670]: Operation finished.
Jul 12 14:30:15 gaspode kernel: Freezing user space processes ... (elapsed 0.01 seconds) done.
Jul 12 14:30:14 gaspode NetworkManager[789]: <info> wake requested (sleeping: yes  enabled: yes)
Jul 12 14:30:14 gaspode NetworkManager[789]: <info> waking up and re-enabling...
Jul 12 14:30:14 gaspode NetworkManager[789]: <info> WWAN now enabled by management service
Jul 12 14:30:14 gaspode NetworkManager[789]: <info> (enp0s25): device state change: unmanaged -> unavailable (reason 'managed') [10 20 2]
Jul 12 14:30:14 gaspode NetworkManager[789]: <info> (enp0s25): bringing up device.
Jul 12 14:30:15 gaspode kernel: PM: Preallocating image memory... 
Jul 12 14:30:15 gaspode kernel: Restarting tasks ... done.
Jul 12 14:30:15 gaspode kernel: PM: Basic memory bitmaps freed


Expected results:

- Hibernate evert time.

Additional info:

Starting this on systemd, need some help (Fedora newbie) for additional debugging to move towards root cause.
Comment 1 Dimitris 2013-07-12 18:36:15 EDT
OK, changing summary as this isn't happening strictly with every other attempt;  In fact I've managed to elicit both consecutive failures and consecutive successes.
Comment 2 Michal Schmidt 2013-07-15 06:58:41 EDT
Is the problem reproducible when you invoke hibernation using this command as root?:
echo disk > /sys/power/state
Comment 3 Dimitris 2013-07-15 14:26:35 EDT
In the past couple of days I haven't been able to repdoruce this with any one of:

- Fn-F12
- systemctl hibernate
- echo disk > /sys/power/state

Since I reported this the kernel, among other potential locations of this bug, has been upgraded (currently 3.9.9-302.fc19.x86_64).

If I'm understanding the log snippet above correctly, the error was that /usr/lib/systemd/systemd-sleep failed, without anything related to its failure being logged. including by the kernel.  Is there any way to increase its log level?  Would increasing systemd's overall log level, currently default (info?) capture more debugging from the systemd-sleep binary?
Comment 4 Dimitris 2013-07-15 21:06:18 EDT
Of course the first time I try to hibernate after posting my previous comment Mr Murphy drops in:

Jul 15 15:44:53 gaspode systemd[1]: Starting Sleep.
Jul 15 15:44:53 gaspode systemd[1]: Reached target Sleep.
Jul 15 15:44:53 gaspode systemd[1]: Starting Hibernate...
Jul 15 15:44:53 gaspode kernel: PM: Hibernation mode set to 'platform'
Jul 15 15:44:53 gaspode systemd-sleep[9305]: Suspending system...
Jul 15 15:44:53 gaspode kernel: PM: Marking nosave pages: [mem 0x0009e000-0x000fffff]
Jul 15 15:44:53 gaspode kernel: PM: Marking nosave pages: [mem 0xbd1a1000-0xbd1a6fff]
Jul 15 15:44:53 gaspode kernel: PM: Marking nosave pages: [mem 0xbd2b7000-0xbd30efff]
Jul 15 15:44:53 gaspode kernel: PM: Marking nosave pages: [mem 0xbd3c6000-0xbd6fefff]
Jul 15 15:44:53 gaspode kernel: PM: Marking nosave pages: [mem 0xbd700000-0xffffffff]
Jul 15 15:44:53 gaspode kernel: PM: Basic memory bitmaps created
Jul 15 15:44:55 gaspode kernel: PM: Syncing filesystems ... done.
Jul 15 15:44:55 gaspode kernel: Freezing user space processes ... (elapsed 0.01 seconds) done.
Jul 15 15:44:55 gaspode systemd[1]: systemd-hibernate.service: main process exited, code=exited, status=1/FAILURE
Jul 15 15:44:55 gaspode systemd[1]: Failed to start Hibernate.
Jul 15 15:44:55 gaspode systemd[1]: Dependency failed for Hibernate.
Jul 15 15:44:55 gaspode systemd[1]: Service sleep.target is not needed anymore. Stopping.
Jul 15 15:44:55 gaspode systemd[1]: Unit systemd-hibernate.service entered failed state.
Jul 15 15:44:55 gaspode systemd[1]: Stopping Sleep.
Jul 15 15:44:55 gaspode systemd[1]: Stopped target Sleep.
Comment 5 Michal Schmidt 2013-07-16 05:28:22 EDT
(In reply to Dimitris from comment #3)
> If I'm understanding the log snippet above correctly, the error was that
> /usr/lib/systemd/systemd-sleep failed, without anything related to its
> failure being logged. including by the kernel.  Is there any way to increase
> its log level?

You can increase the debug level just for systemd-hibernate.service by saving this fragment as /etc/systemd/system/systemd-hibernate.service.d/debug.conf :

[Service]
Environment=SYSTEMD_LOG_LEVEL=debug

Then perform "systemctl daemon-reload" to make sure the change takes effect.
Comment 6 Dimitris 2013-07-16 13:36:43 EDT
> You can increase the debug level just for systemd-hibernate.service by
> saving this fragment as
> /etc/systemd/system/systemd-hibernate.service.d/debug.conf :
> 
> [Service]
> Environment=SYSTEMD_LOG_LEVEL=debug
> 
> Then perform "systemctl daemon-reload" to make sure the change takes effect.

Done, however looking at the log above and systemd-sleep's source, it looks to me like this is failing in the kernel:

> Jul 15 15:44:53 gaspode systemd-sleep[9305]: Suspending system...

http://cgit.freedesktop.org/systemd/systemd/tree/src/sleep/sleep.c#n112 says:

>        log_struct(LOG_INFO,
>                   MESSAGE_ID(SD_MESSAGE_SLEEP_START),
>                   "MESSAGE=Suspending system...",
>                   note,
>                   NULL);
>
>        r = write_state(f, states);
>        if (r < 0)
>                return r;
>
>        log_struct(LOG_INFO,
>                   MESSAGE_ID(SD_MESSAGE_SLEEP_STOP),
>                   "MESSAGE=System resumed.",
>                   note,
>                   NULL);
Comment 7 Michal Schmidt 2013-07-16 13:47:32 EDT
(In reply to Dimitris from comment #6)
> Done, however looking at the log above and systemd-sleep's source, it looks
> to me like this is failing in the kernel:

Yes. You should be seeing this additional message now:

                log_debug("Failed to write '%s' to /sys/power/state: %s",
                          *state, strerror(-k));

I am curious about the strerror printed.

Have you ever managed to reproduce it with the "echo disk > /sys/power/state"?
Comment 8 Dimitris 2013-07-16 14:15:38 EDT
> Yes. You should be seeing this additional message now:
> 
>                 log_debug("Failed to write '%s' to /sys/power/state: %s",
>                           *state, strerror(-k));
> 
> I am curious about the strerror printed.

Haven't reproduced it with debug yet.

> 
> Have you ever managed to reproduce it with the "echo disk >
> /sys/power/state"?

Not yet.  I'll keep trying both.
Comment 9 Dimitris 2013-09-13 13:35:14 EDT
I haven't been able to reproduce this for a while, at least since the 3.10-series F19 kernels (just upgraded to 3.10.11-200.fc19.x86_64).

It *seems* to correlate with the modularization of the mei_me driver and associated (on this machine anyway) muting of related log storms; see bug 917081.
Comment 10 Lennart Poettering 2013-09-13 15:42:02 EDT
OK, will close then, feel free to reopen if you manage to reproduce the issue!

Note You need to log in before you can comment on or make changes to this bug.