Description of problem:
Hibernating in RHEL 6.1 and 6.2 pre-beta seems to cause crashes on resume. On the next boot, the system then prompts for the root password to run a manual fsck.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Boot system and log in
2. Start some applications, such as OpenOffice, Firefox, Thunderbird, Evince, Rhythmbox
3. Hibernate system
4. Resume from hibernation
Actual results:
About 30% of the time the system will kernel panic when resuming.
Expected results:
Normal resume from hibernate
Can you please add in details on the file system used (ext4 I guess?), the IO stack and type of storage?
ext4 on LVM with full-disk LUKS encryption, on a local 500 GB SATA disk (in a ThinkPad T520 laptop)
Sounds like LUKS might be losing the write barrier/flush requests?
Tough to say without more debugging. I think I'd start by assigning this to a device-mapper developer.
If you're up for more testing and have some hardware to do it on, I'd start with a very simple storage stack and then add things to it, testing along the way, until you can see which layer/component seems to cause the problem.
If it's ext4 on a plain partition, I'll perk up. :)
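Since the panic only shows up on roughly 30% of resumes, a handful of clean resumes on a simpler stack does not prove much. A rough sketch of how many clean hibernate/resume cycles each configuration would need is below. It assumes every resume fails independently with the same probability, which may not hold in practice, and the list of stacks and its order are only a hypothetical test plan, not something taken from this report.

```python
import math


def cycles_needed(failure_rate, confidence=0.95):
    """Smallest n such that n consecutive clean resumes would occur by chance
    with probability at most (1 - confidence) if the bug were still present.

    Assumes each resume fails independently with probability failure_rate.
    """
    if not 0.0 < failure_rate < 1.0:
        raise ValueError("failure_rate must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - failure_rate))


# Hypothetical test order: start simple, then add one layer at a time.
STACKS = [
    ["ext4", "partition"],
    ["ext4", "LVM", "partition"],
    ["ext4", "LVM", "LUKS", "partition"],
]

if __name__ == "__main__":
    n = cycles_needed(0.30)  # 30% is the rate quoted in the report
    for stack in STACKS:
        print(f"{' on '.join(stack)}: {n} clean hibernate/resume cycles")
```

With the reported 30% rate this comes out to 9 clean cycles per configuration for 95% confidence, so a layer should not be ruled out after only two or three good resumes.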
It could be that a FLUSH is lost somewhere, that the ordering is wrong, or that a flush of the workqueue is missing (dmcrypt uses internal threads, but DM core should send a flush only when there is no IO in flight).
Seems to need more debugging. If flush is correctly backported, I do not think the problem is in dmcrypt. (It simply forwards the flush to the underlying device, the same as the linear target. DM core should wait for previous IOs, so the flush is sent when dmcrypt has empty encryption queues.)
Is the hibernation code properly fixed to send a flush when saving the memory image to encrypted swap?
What is corrupted first: the memory image loaded from swap during resume, or the filesystem?
(I would try hibernating and then, instead of resuming, running fsck from a live CD. If the fs is not corrupted, then the memory image in swap is corrupted and the fs corruption is just a consequence.)
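The live CD check described above could be scripted roughly like this. It is only a sketch: the helper names are made up for illustration, and the device path passed to `check_read_only` is whatever the unlocked root volume is called on the test machine. It relies on the documented fsck exit status being a bit mask, and on `e2fsck -n` opening the filesystem read-only and answering "no" to every question.

```python
import subprocess

# Exit status bits as documented in fsck(8); the status is the sum of these.
FSCK_FLAGS = {
    1: "filesystem errors corrected",
    2: "system should be rebooted",
    4: "filesystem errors left uncorrected",
    8: "operational error",
    16: "usage or syntax error",
    32: "checking canceled by user request",
    128: "shared-library error",
}


def decode_fsck_status(status):
    """Translate an fsck exit status into a list of human-readable conditions."""
    if status == 0:
        return ["no errors"]
    return [msg for bit, msg in sorted(FSCK_FLAGS.items()) if status & bit]


def check_read_only(device):
    """Run a forced, read-only e2fsck on device and decode the result.

    -f forces a check even if the filesystem is marked clean;
    -n opens it read-only, so nothing on disk is modified.
    """
    result = subprocess.run(["e2fsck", "-f", "-n", device])
    return decode_fsck_status(result.returncode)
```

For example, `check_read_only("/dev/mapper/vg-root")` (a hypothetical device name) returning `["no errors"]` would point at the saved memory image rather than the filesystem. Because the filesystem was never unmounted before hibernating, a read-only check skips journal replay, so some reported inconsistencies may be harmless.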
Are you requesting I try to reproduce that?
Is there any update on this request?
Can you attach the backtrace you get on resume?
Which kernel are you testing 6.2 with? Make sure that it's -199 or later.
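A quick way to confirm the build number is to parse the kernel release string. The sketch below assumes the usual RHEL 6 form, such as `2.6.32-199.el6.x86_64`, where the number after the first hyphen is the build. The function names are made up for illustration, and the comparison is only meaningful between kernels of the same base version.

```python
import platform
import re

MIN_BUILD = 199  # threshold mentioned in the comment above


def rhel_build_number(release):
    """Return the build number from a release such as '2.6.32-199.el6.x86_64'."""
    match = re.match(r"^\d+\.\d+\.\d+-(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1))


def is_new_enough(release, minimum=MIN_BUILD):
    """True if the release's build number is at least minimum."""
    return rhel_build_number(release) >= minimum


if __name__ == "__main__":
    release = platform.release()  # same string that `uname -r` prints
    try:
        verdict = "OK" if is_new_enough(release) else "too old"
    except ValueError:
        verdict = "not in the expected format"
    print(f"{release}: {verdict}")
```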