Bug 181420

Summary: swsusp broken with swap on LVM
Product: [Fedora] Fedora
Reporter: Jeremy Katz <katzj>
Component: kernel
Assignee: Dave Jones <davej>
Status: CLOSED RAWHIDE
QA Contact: Brian Brock <bbrock>
Severity: medium
Docs Contact:
Priority: medium
Version: rawhide
CC: lists, ndbecker2, pfrields, wtogami
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2006-02-15 22:07:39 UTC
Type: ---
Bug Blocks: 150222    

Description Jeremy Katz 2006-02-13 21:42:47 UTC
swap on LVM used to be fine for swsusp (in 2.6.15-ish), but with changes in
2.6.16-rc*, that's regressed and the kernel is unable to find the swap device.

Setting the resume device manually by echoing the major/minor of the swap
device into /sys/power/resume seems to "fix" it.

Comment 1 Andrew Duggan 2006-02-14 22:05:47 UTC
This may be related to this commit:

http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blobdiff;h=0479c9be7d71501be8fa96c0e31ff49f361b6921;hp=d760a6a719f0215fd44846d127f48aa42d7359e5;hb=1adf6c8ea916bc4a2587a881ec7715fece63fb5e;f=kernel/power/swsusp.c

I've been testing Dave J's 200n series kernels on FC4 (that's what my laptop
runs, more or less).  Even though FC4 does not support hibernation, I rebuild the
kernel src rpm with CONFIG_SOFTWARE_SUSPEND turned on.  (Yes, I am running a rawhide
version of mkinitrd so the resume logic works, and I have my rc.sysinit patched
to clear out any leftover suspend signatures on the swap partition.)

But anyway, I notice that starting with this commit, a resume=/dev/<SWAP> (which
in my case is /dev/hda3) must be on the kernel command line or it will not
suspend to disk.  This is because nothing sets swsusp_resume_device
to the dev_t of the swap partition to be used if resume= does not appear on
the kernel command line.  Furthermore, that processing happens before anything
in the initrd init script runs, so the LVM is not set up and the call to
name_to_dev_t() fails to find your LVM-based swap in /sys/block/... So
swsusp_resume_device remains 0, and this

+	if (!swsusp_resume_device)
+		return -ENODEV;

keeps your machine from being able to write the image to the swap on the LVM.
Now since my swap is NOT on an LVM, it can find it and all is well.  I have
grepped the entire kernel tree and swsusp_resume_device is set nowhere
except in the init code based on the command line and in the resume_store()
function, which provides the write method for the /sys/power/resume file.  That
file has the primary function of triggering a resume, and all of the previous wisdom
has been not to touch /sys/power/resume unless you are sure you want a resume
to kick off, and if you even thought about having any filesystems mounted, you
could kiss your data bye-bye ;-)
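
To make the failure path described above concrete, here is a small userspace
model of it.  This is only a sketch, not the kernel code: the model_* names,
the device numbers and the LVM path are made up for illustration, and only
swsusp_resume_device, name_to_dev_t() and the -ENODEV check come from the
discussion above.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Userspace model only; this is NOT the kernel implementation. */
typedef unsigned int model_dev_t;

/* 0 means "no resume device known", as in the quoted check. */
static model_dev_t swsusp_resume_device;

/*
 * Hypothetical stand-in for name_to_dev_t(): it only knows devices that
 * already exist when the command line is parsed.  LVM volumes are set up
 * later by the initrd, so any lookup for them returns 0 here.
 */
static model_dev_t model_name_to_dev_t(const char *name)
{
    if (name != NULL && strcmp(name, "/dev/hda3") == 0)
        return (3u << 8) | 3u;  /* illustrative encoding of major 3, minor 3 */
    return 0;
}

/* Models the one-time resume= handling that runs before the initrd script. */
static void model_parse_resume_arg(const char *arg)
{
    swsusp_resume_device = model_name_to_dev_t(arg);
}

/* Models the check quoted from the diff: no device means no suspend. */
static int model_suspend(void)
{
    if (!swsusp_resume_device)
        return -ENODEV;
    return 0;
}
```

In this model, model_parse_resume_arg("/dev/hda3") followed by model_suspend()
returns 0, while a missing resume= argument or an LVM path that is not yet
resolvable leaves the device at 0 and model_suspend() returns -ENODEV.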

I have put off raising any sort of fuss on lkml or with Pavel or Raphael, since they
just concluded a very lengthy flame fest over the implementation of user-space
swsusp vs. suspend2 (plus there was talk about eliminating suspend-to-disk
altogether), and I didn't feel like getting burnt at the moment.

Writing the major and minor to /sys/power/resume should fix the problem, but
people need to know that it should be safe only as long as they don't have a
swsusp image on the device in question...although I've not been brave enough to try it.
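
For reference, a rough sketch of how the string for /sys/power/resume could be
built from a block device node.  The helper names are hypothetical, and the
"major:minor" text format is an assumption based on the workaround in the
description.  The sketch deliberately stops short of writing to
/sys/power/resume, since that write can kick off a resume.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/*
 * Format a device number as "major:minor" into buf.
 * Returns 0 on success, -1 if the buffer is too small.
 * (Hypothetical helper, not part of any existing tool.)
 */
static int format_resume_arg(dev_t dev, char *buf, size_t len)
{
    int n = snprintf(buf, len, "%u:%u", major(dev), minor(dev));
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Look up the device number behind a block device node, for example an
 * LVM swap volume once the initrd has activated it, and format it.
 * Returns -1 if the path does not exist or is not a block device.
 */
static int resume_arg_for_path(const char *path, char *buf, size_t len)
{
    struct stat st;

    if (stat(path, &st) != 0 || !S_ISBLK(st.st_mode))
        return -1;
    return format_resume_arg(st.st_rdev, buf, len);
}
```

The resulting string is what would be echoed into /sys/power/resume, and only
after checking that the device holds no swsusp image.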

I guess the question is whether the "resume $swsuspdev" that the nash script is
calling echoes the right value into /sys/power/resume.  If it does, then
swsusp_resume_device in kernel/power/disk.c should be all
set up for swsusp to use, but that doesn't seem to be the case.

I'm just about through a patch to fix this for me, but it probably won't meet
anybody else's criteria. 

The sad part is that the introduction of /sys/power/image_size (and the default of
500 MB) makes the system so much more responsive after a resume.  There is a big
wow factor, but it doesn't work, or so it seems, for anyone with swap on LVM.

Of course I could be completely wrong...



Comment 2 Jeremy Katz 2006-02-15 22:07:39 UTC
davej got a fix for this in 1953