Bug 722234

Summary: PM: Error -12 creating hibernation image
Product: [Fedora] Fedora
Reporter: Stefan Assmann <sassmann>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED WONTFIX
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Priority: unspecified
Version: 15
CC: bojan, gansalmon, itamar, jfeeney, jonathan, jskarvad, kernel-maint, madhu.chinakonda, nagyt, pknirsch, richard, stuart, wyverald
Hardware: i386
OS: Linux
Doc Type: Bug Fix
Last Closed: 2012-06-06 16:01:01 UTC
Attachments:
Description       Flags
pm-suspend.log    none
dmesg             none

Description Stefan Assmann 2011-07-14 17:28:18 UTC
Created attachment 513225 [details]
pm-suspend.log

Description of problem:
Sometimes hibernate fails with
[18537.067268] PM: freeze of devices complete after 655.199 msecs
[18537.068258] PM: late freeze of devices complete after 0.984 msecs
[18537.068470] ACPI: Preparing to enter system sleep state S4
[18537.089050] PM: Saving platform NVS memory
[18537.089655] Disabling non-boot CPUs ...
[18537.089900] PM: Creating hibernation image:
[18537.090006] PM: Need to copy 134895 pages
[18537.090006] PM: Normal pages needed: 119823 + 1024, available pages: 111693
[18537.090006] PM: Not enough free memory
[18537.090006] PM: Error -12 creating hibernation image

It seems there's not enough memory. How can this be avoided?
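
The failing check in the log above is plain arithmetic: 119823 + 1024 = 120847 pages are needed, against 111693 available, which is a shortfall of 9154 pages (roughly 36 MiB, assuming the usual 4 KiB page size on i386). A minimal sketch for pulling that number out of a kernel log line; the function name pm_shortfall_pages is made up for illustration and is not part of pm-utils or the kernel:

```shell
#!/bin/sh
# Parse a "PM: Normal pages needed: A + B, available pages: C" log line
# and print A + B - C. A positive result is the shortfall in pages;
# zero or a negative result means the check would have passed.
pm_shortfall_pages() {
    printf '%s\n' "$1" |
    sed -n 's/.*Normal pages needed: \([0-9][0-9]*\) + \([0-9][0-9]*\), available pages: \([0-9][0-9]*\).*/\1 \2 \3/p' |
    while read -r needed reserved available; do
        echo $(( $needed + $reserved - $available ))
    done
}

# Line taken from the log in this report.
pm_shortfall_pages '[18537.090006] PM: Normal pages needed: 119823 + 1024, available pages: 111693'
```

A line that does not match the expected format produces no output, so the function can be fed arbitrary dmesg lines.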

Version-Release number of selected component (if applicable):
pm-utils-1.4.1-8.fc15.i686
kernel-PAE-2.6.38.8-35.fc15.i686

Steps to Reproduce:
1. hibernate the system
  
Actual results:
does not reliably hibernate

# cat /proc/swaps 
Filename                                Type            Size    Used    Priority
/dev/sda4                               partition       2658752 160824  0

Comment 1 Stefan Assmann 2011-07-14 17:29:40 UTC
Created attachment 513227 [details]
dmesg

Comment 2 Stefan Assmann 2011-07-18 10:58:46 UTC
The PM: messages are from the kernel. CCing mjg for any clues :)

Comment 3 Matthew Garrett 2011-07-18 11:48:21 UTC
You either need more swap, or alternatively may have some success playing with /sys/power/image_size.

Comment 4 Stefan Assmann 2011-07-18 11:54:58 UTC
I don't think it's a swap space issue, since I have 2.5GB of swap and only 1GB of RAM. Also, this happens frequently with only a few applications running and swap space barely touched. It's weird.

cat /sys/power/image_size
414097408
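
The value shown is in bytes. As far as the kernel's sysfs power documentation describes it, image_size is an upper limit the hibernation code tries to stay under (by default about 2/5 of RAM), and writing 0 asks for the smallest image possible. A small sketch to display the value in whole MiB; image_size_mib is a hypothetical helper name, and the optional path argument is only there so the function can be tried on a copy of the file:

```shell
#!/bin/sh
# Print the hibernation image size limit in whole MiB.
# $1: file to read (assumption: defaults to the real sysfs file).
image_size_mib() {
    file=${1:-/sys/power/image_size}
    bytes=$(cat "$file") || return 1
    echo $(( $bytes / 1048576 ))
}

# Usage on a live system:
#   image_size_mib
```

For the value above, 414097408 bytes comes out as 394 MiB, which is in line with the 2/5 default on a machine with 1GB of RAM.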

Comment 5 Jaroslav Škarvada 2011-07-19 12:37:23 UTC
Stefan,

you can try to increase /sys/power/image_size, but AFAIK hibernation should work even if this setting is wrong, so this is probably not your case.

Maybe you are running processes that have mlocked too much memory. Try to kill them or increase your RAM.

According to the logs, this is not pm-utils' fault. Reassigning to kernel for further investigation.

Comment 6 Stefan Assmann 2011-07-27 10:28:48 UTC
linux-2.6 commit bea3864fb627d110933cfb8babe048b63c4fc76e fixed it for me.

Comment 7 YANG Xudong 2011-11-18 05:41:32 UTC
I'm having this issue too in Fedora 16. My laptop has 2GB RAM and 4GB swap in an LVM partition; the error message is exactly the same as Stefan's save the numbers. Half of the time the laptop successfully hibernates and the other half of the time it says PM: error -12.

$ rpm -q kernel
kernel-3.1.0-7.fc16.i686
$ rpm -q pm-utils
pm-utils-1.4.1-12.fc16.i686
$ cat /proc/swaps
Filename				Type		Size	Used	Priority
/dev/dm-0                               partition	4128764	61568	0


Excerpt of /var/log/messages:

[14332.563315] PM: freeze of devices complete after 4808.362 msecs
[14332.564559] PM: late freeze of devices complete after 1.240 msecs
[14332.564726] ACPI: Preparing to enter system sleep state S4
[14332.580583] PM: Saving platform NVS memory
[14332.580966] Disabling non-boot CPUs ...
[14332.582688] CPU 1 is now offline
[14332.596512] Extended CMOS year: 2000
[14332.596599] PM: Creating hibernation image:
[14332.597005] PM: Need to copy 273455 pages
[14332.597005] PM: Not enough free memory
[14332.597005] PM: Error -12 creating hibernation image

Comment 8 Stefan Assmann 2011-11-18 16:33:48 UTC
Yang,
you might try
echo 0 > /sys/power/image_size

Comment 9 YANG Xudong 2011-11-22 03:46:54 UTC
Thanks, it worked. But it seems that I have to rewrite this image size every time I want to hibernate...

Comment 10 Stefan Assmann 2011-11-22 08:29:04 UTC
Yang, correct, this is just a workaround.

Comment 11 nagyt234 2011-11-30 08:05:25 UTC
(In reply to comment #9)
> Thanks, it worked. But it seems that I have to rewrite this image size every
> time I want to hibernate...

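You can do it automatically by creating a small executable file, e.g. "00_workaround_suspend_bug", in the directory "/etc/pm/sleep.d":

#!/bin/sh
# pm-utils hook: $1 is the action being performed (e.g. hibernate).
case "$1" in
  # Ask for the smallest possible image right before hibernating.
  hibernate) echo 0 > /sys/power/image_size;;
esac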

Comment 12 nagyt234 2011-12-02 07:42:14 UTC
I tried "echo 0 > /sys/power/image_size", but it does not solve the problem. Generally the first attempt to hibernate fails and the second one works. Somehow the number of pages needed always decreases between the two attempts (see the logs below). Also, the value of /sys/power/image_size is always reset by the system:

$ cat /sys/power/image_size
1409955840

I have 3GB RAM and 6GB almost empty swap space.

Log of the first try:

Dec  1 19:33:27 suse1 kernel: [163587.290339] ACPI: Preparing to enter system sleep state S4
Dec  1 19:33:27 suse1 kernel: [163587.295424] PM: Saving platform NVS memory
Dec  1 19:33:27 suse1 kernel: [163587.306630] Disabling non-boot CPUs ...
Dec  1 19:33:27 suse1 kernel: [163587.340783] Unmapping cpu 1 from all nodes
Dec  1 19:33:27 suse1 kernel: [163587.341901] CPU 1 is now offline
Dec  1 19:33:27 suse1 kernel: [163587.341906] SMP alternatives: switching to UP code
Dec  1 19:33:27 suse1 kernel: [163587.348774] PM: Creating hibernation image:
Dec  1 19:33:27 suse1 kernel: [163587.349006] PM: Need to copy 393685 pages
Dec  1 19:33:27 suse1 kernel: [163587.349006] PM: Normal pages needed: 146427 + 1024, available pages: 138726
Dec  1 19:33:27 suse1 kernel: [163587.349006] PM: Not enough free memory
Dec  1 19:33:27 suse1 kernel: [163587.349006] PM: Error -12 creating hibernation image

Log of the second try:

Dec  2 08:22:21 suse1 kernel: [163639.877688] ACPI: Preparing to enter system sleep state S4
Dec  2 08:22:21 suse1 kernel: [163639.882792] PM: Saving platform NVS memory
Dec  2 08:22:21 suse1 kernel: [163639.894033] Disabling non-boot CPUs ...
Dec  2 08:22:21 suse1 kernel: [163639.928418] Unmapping cpu 1 from all nodes
Dec  2 08:22:21 suse1 kernel: [163639.929535] CPU 1 is now offline
Dec  2 08:22:21 suse1 kernel: [163639.929539] SMP alternatives: switching to UP code
Dec  2 08:22:21 suse1 kernel: [163639.936688] PM: Creating hibernation image:
Dec  2 08:22:21 suse1 kernel: [163639.937043] PM: Need to copy 369293 pages
Dec  2 08:22:21 suse1 kernel: [163639.937043] PM: Normal pages needed: 102157 + 1024, available pages: 143240
Dec  2 08:22:21 suse1 kernel: [163639.937043] PM: Restoring platform NVS memory
Dec  2 08:22:21 suse1 kernel: [163639.937043] Enabling non-boot CPUs ...
Dec  2 08:22:21 suse1 kernel: [163639.963492] SMP alternatives: switching to SMP code
Dec  2 08:22:21 suse1 kernel: [163639.969188] Booting Node 0 Processor 1 APIC 0x1
Dec  2 08:22:21 suse1 kernel: [163639.956968] Initializing CPU#1
Dec  2 08:22:21 suse1 kernel: [163639.956968] Mapping cpu 1 to node 0
Dec  2 08:22:21 suse1 kernel: [163640.102652] CPU1 is up
Dec  2 08:22:21 suse1 kernel: [163640.103053] ACPI: Waking up from system sleep state S4

---------------------

Interestingly, the log of the successful hibernation is timestamped as if it had happened at the time of the next wake-up.

Note that I did not do anything between the two tries; I only logged in again and started hibernation again.

Comment 13 Stuart D Gathman 2012-03-03 20:20:21 UTC
I have 2GB RAM and 20GB swap, which is hardly used. (No question of enough swap space.)

Mar  2 15:56:43 elissa kernel: [78696.135622] PM: freeze of devices complete after 402.243 msecs
Mar  2 15:56:43 elissa kernel: [78696.136711] PM: late freeze of devices complete after 1.085 msecs
Mar  2 15:56:43 elissa kernel: [78696.136845] ACPI: Preparing to enter system sleep state S4
Mar  2 15:56:43 elissa kernel: [78696.141035] PM: Saving platform NVS memory
Mar  2 15:56:43 elissa kernel: [78696.141038] Disabling non-boot CPUs ...
Mar  2 15:56:43 elissa kernel: [78696.142547] CPU 1 is now offline
Mar  2 15:56:43 elissa kernel: [78696.143216] Extended CMOS year: 2000
Mar  2 15:56:43 elissa kernel: [78696.143320] PM: Creating hibernation image:
Mar  2 15:56:43 elissa kernel: [78696.144010] PM: Need to copy 202229 pages
Mar  2 15:56:43 elissa kernel: [78696.144010] PM: Not enough free memory
Mar  2 15:56:43 elissa kernel: [78696.144010] PM: Error -12 creating hibernation image

Comment 14 Dave Jones 2012-04-11 15:45:54 UTC
Bojan, does this problem look similar to the one your latest patches fixed ?

Comment 15 Bojan Smojver 2012-04-11 22:29:44 UTC
(In reply to comment #14)
> Bojan, does this problem look similar to the one your latest patches fixed ?

This happens before the image writing even starts, so almost certainly a different problem and one that would not be affected by that patch.

The problem also has nothing to do with the size of swap. When the hibernation image is created, it needs to fit into non-high memory on 32-bit machines (plus some reserved pages for I/O and device drivers). If there isn't enough space there, hibernation will fail like this.

The reason the second try may work may be that in the first try, the hibernation code evicted some stuff out of memory, which then helped with the second try. Of course, this should probably happen the first time - the code should just try harder, I guess.

The reason you are seeing messages that look like a thaw during hibernation is that this is what really happens. Just before the snapshot of the image is taken, all non-boot CPUs are disabled and devices are frozen. After the snapshot is taken, non-boot CPUs are re-enabled and devices are unfrozen, so that the image can be written to disk. You can find info about that here: http://www.kernel.org/doc/Documentation/power/swsusp.txt
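
To look at the low-memory constraint described above from userspace: 32-bit kernels built with highmem support report LowTotal and LowFree in /proc/meminfo. A minimal sketch that prints those two fields; lowmem_report is a hypothetical helper name, and free low memory is only a rough hint, since the kernel's own check depends on what can be freed or copied at snapshot time:

```shell
#!/bin/sh
# Print the LowTotal and LowFree lines (values in kB) from a meminfo-style file.
# $1: file to read (assumption: defaults to /proc/meminfo).
# On kernels without highmem (e.g. 64-bit) these fields are absent,
# so the function prints nothing.
lowmem_report() {
    file=${1:-/proc/meminfo}
    awk '/^LowTotal:|^LowFree:/ { print $1, $2, $3 }' "$file"
}

# Usage on a live system:
#   lowmem_report
```

Comparing LowFree before a failed and a successful attempt may help confirm whether low memory is the limiting factor on a given machine.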

Comment 16 Bojan Smojver 2012-04-12 00:21:31 UTC
One more comment: these problems will likely go away if you switch to a 64-bit kernel. So, unless you have a CPU that cannot do 64-bit, that's probably the most reasonable thing to do.

Comment 17 Bojan Smojver 2012-04-12 23:06:33 UTC
(In reply to comment #15)
> When hibernation image is
> created, it needs to fit into non-high memory on 32-bit machines (+ some
> reserved pages for I/O and device drivers).

Part of it, that is. Depending on which memory is actually free at the time of hibernation.

Comment 18 Stuart D Gathman 2012-06-06 23:13:26 UTC
Can we bump the version to Fedora 16?  Or is this "won't fix ever because 32-bit Intel is obsolete"?

Comment 19 Josh Boyer 2012-06-07 13:16:01 UTC
(In reply to comment #18)
> Can we bump the version to Fedora 16?  Or is this "won't fix ever because
> 32-bit Intel is obsolete"?

From Bojan's comments it seems there isn't anything that can really be done to fix the issue as it is due to memory fragmentation on 32-bit kernels.  Aside from switching to a 64-bit kernel, or somehow forcing memory to get freed up, I don't see much progress being made on this bug.  We could bump it to rawhide if someone was willing to work on it.

32-bit Intel is not obsolete, but if you are running a 32-bit kernel on 64-bit hardware I would highly recommend you switch to a 64-bit kernel.

Comment 20 Stuart D Gathman 2012-06-08 18:45:56 UTC
It should at least be possible to check whether hibernate is possible, and if memory is too fragmented, show an informative error (maybe suggesting suspend or shutdown instead).  At least suspend works reliably on 32-bit.

Comment 21 Josh Boyer 2012-06-09 13:44:22 UTC
(In reply to comment #20)
> It should at least be possible to check whether hibernate is possible, and
> if memory is too fragmented, show an informative error (maybe suggesting
> suspend or shutdown instead).  At least suspend works reliably on 32-bit.

I don't disagree.  However, that is something that needs to happen upstream, and I doubt we're going to get around to implementing that with our current workload.  If you or Bojan (or whoever) want to pursue that, I'm sure many people would be appreciative.