| Summary: | hibernate cause kernel panic | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Caspar Zhang <czhang> | ||||||||
| Component: | kernel | Assignee: | Stanislaw Gruszka <sgruszka> | ||||||||
| Status: | CLOSED ERRATA | QA Contact: | Guangze Bai <gbai> | ||||||||
| Severity: | high | Docs Contact: | |||||||||
| Priority: | medium | ||||||||||
| Version: | 6.1 | CC: | arozansk, gbai, linville, mishu, prarit, qcai, sgruszka, yshao, yuchen | ||||||||
| Target Milestone: | rc | ||||||||||
| Target Release: | --- | ||||||||||
| Hardware: | All | ||||||||||
| OS: | Linux | ||||||||||
| Whiteboard: | |||||||||||
| Fixed In Version: | kernel-2.6.32-211.el6 | Doc Type: | Bug Fix | ||||||||
| Doc Text: |
Cause
Attempting to hibernate on certain laptops, including the Lenovo T400
and X200.
Consequence
Kernel could panic occasionally.
|
Story Points: | --- | ||||||||
| Clone Of: | |||||||||||
| : | 746169 (view as bug list) | Environment: | |||||||||
| Last Closed: | 2011-12-06 13:21:44 UTC | Type: | --- | ||||||||
| Regression: | --- | Mount Type: | --- | ||||||||
| Documentation: | --- | CRM: | |||||||||
| Verified Versions: | Category: | --- | |||||||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||||
| Bug Depends On: | |||||||||||
| Bug Blocks: | 702988, 746169, 748554 | ||||||||||
| Attachments: |
|
||||||||||
|
Description
Caspar Zhang
2011-05-04 04:55:32 UTC
Since RHEL 6.1 External Beta has begun, and this bug remains unresolved, it has been rejected as it is not proposed as exception or blocker. Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux.
This seems similar to bug 613493.
*** Bug 698061 has been marked as a duplicate of this bug. ***
Can you check this upstream commit 2e725a065b0153f0c449318da1923a120477633d "PM / Hibernate: Return error code when alloc_image_page() fails"? It could help with the first trace, where we are trying to free bits which are not allocated (in such an out-of-memory case hibernate will just fail instead of panicking). No idea about the second trace.
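The idea described in the comment above (fail with an error code on allocation failure instead of carrying on and later freeing things that were never allocated) can be illustrated with a small user-space sketch. This is an analogy only, not the kernel code from the referenced commit: `struct image`, `image_alloc`, `image_free`, `budget_alloc`, and `alloc_budget` are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define NPAGES_MAX 8

/* Hypothetical stand-in for a hibernation image: tracks what was really allocated. */
struct image {
    void  *pages[NPAGES_MAX];
    size_t allocated;          /* number of valid entries in pages[] */
};

/* Test allocator (assumption): succeeds alloc_budget times, then returns NULL. */
static size_t alloc_budget = 0;

static void *budget_alloc(size_t size)
{
    if (alloc_budget == 0)
        return NULL;
    alloc_budget--;
    return malloc(size);
}

/* Allocate `want` pages; return -ENOMEM on the first failure instead of continuing. */
static int image_alloc(struct image *img, size_t want, void *(*alloc_fn)(size_t))
{
    assert(want <= NPAGES_MAX);
    img->allocated = 0;
    for (size_t i = 0; i < want; i++) {
        void *p = alloc_fn(4096);
        if (p == NULL)
            return -ENOMEM;    /* caller aborts; only allocated pages are recorded */
        img->pages[img->allocated++] = p;
    }
    return 0;
}

/* Free only the pages that were actually allocated. */
static void image_free(struct image *img)
{
    while (img->allocated > 0)
        free(img->pages[--img->allocated]);
}
```

A caller would check the return value of `image_alloc` and, on `-ENOMEM`, call `image_free` and give up cleanly, which mirrors "hibernate will just fail" rather than touching unallocated state.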
Technical note added. If any revisions are required, please edit the "Technical Notes" field
accordingly. All revisions will be proofread by the Engineering Content Services team.
New Contents:
Cause
Attempting to hibernate on certain laptops, including the Lenovo T400
and X200.
Consequence
Kernel could panic occasionally.
That's what I get so far:
> =============================================================================
> BUG kmalloc-128 (Not tainted): Redzone overwritten
> -----------------------------------------------------------------------------
>
> INFO: 0xffff880036dcd210-0xffff880036dcd217. First byte 0x0 instead of 0xcc
> INFO: Allocated in alloc_vmap_area+0x57/0x380 age=52867 cpu=1 pid=2472
> INFO: Freed in i915_gem_execbuffer2+0xe8/0x210 [i915] age=52869 cpu=1 pid=2472
> INFO: Slab 0xffffea0000c004d8 objects=20 used=17 fp=0xffff880036dcd960 flags=0x20000000000083
> INFO: Object 0xffff880036dcd190 @offset=400 fp=0x(null)
>
> Bytes b4 0xffff880036dcd180: 75 f8 ff ff 00 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a u<F8><FF><FF>....ZZZZZZZZ
> Object 0xffff880036dcd190: 00 70 91 05 00 c9 ff ff 00 c0 91 05 00 c9 ff ff .p...<C9><FF><FF>.<C0>...<C9><FF><FF>
> Object 0xffff880036dcd1a0: 06 00 00 00 00 00 00 00 a9 d1 dc 36 00 88 ff ff ........<A9><D1><DC>6..<FF><FF>
> Object 0xffff880036dcd1b0: d0 db dc 36 00 88 ff ff 70 22 4a 57 00 88 ff ff <D0><DB><DC>6..<FF><FF>p"JW..<FF><FF>
> Object 0xffff880036dcd1c0: c8 d8 dc 36 00 88 ff ff 00 02 20 00 00 00 ad de <C8><D8><DC>6..<FF><FF>......<AD><DE>
> Object 0xffff880036dcd1d0: d8 d8 dc 36 00 88 ff ff a0 d9 dc 36 00 88 ff ff <D8><D8><DC>6..<FF><FF>.<D9><DC>6..<FF><FF>
> Object 0xffff880036dcd1e0: f8 d7 57 76 00 88 ff ff f0 d8 dc 36 00 88 ff ff <F8><D7>Wv..<FF><FF><F0><D8><DC>6..<FF><FF>
> Object 0xffff880036dcd1f0: b0 cd 15 81 ff ff ff ff 6b 6b 6b 6b 6b 6b 6b 6b <B0><CD>..<FF><FF><FF><FF>kkkkkkkk
> Object 0xffff880036dcd200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> Redzone 0xffff880036dcd210: 00 00 00 00 00 00 00 00 ........
> Padding 0xffff880036dcd250: 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZ
> Pid: 4, comm: ksoftirqd/0 Not tainted 2.6.32 #3
> Call Trace:
> <IRQ> [<ffffffff81174f52>] ? print_trailer+0x102/0x170
> [<ffffffff811755ae>] ? check_bytes_and_report+0xfe/0x140
> [<ffffffff8115cdc2>] ? rcu_free_va+0x12/0x20
> [<ffffffff8117782a>] ? check_object+0x6a/0x250
> [<ffffffff8115cdc2>] ? rcu_free_va+0x12/0x20
> [<ffffffff81178013>] ? __slab_free+0x1f3/0x320
> [<ffffffff8115cdc2>] ? rcu_free_va+0x12/0x20
> [<ffffffff811782ae>] ? kfree+0x16e/0x1d0
> [<ffffffff8115cdc2>] ? rcu_free_va+0x12/0x20
> [<ffffffff810f0b3d>] ? __rcu_process_callbacks+0x12d/0x3e0
> [<ffffffff810f0e1b>] ? rcu_process_callbacks+0x2b/0x50
> [<ffffffff81075bfd>] ? __do_softirq+0xdd/0x200
> [<ffffffff8100c30c>] ? call_softirq+0x1c/0x30
> <EOI> [<ffffffff8100dfdd>] ? do_softirq+0xad/0xe0
> [<ffffffff81075530>] ? ksoftirqd+0x80/0x120
> [<ffffffff810754b0>] ? ksoftirqd+0x0/0x120
> [<ffffffff81095a16>] ? kthread+0x96/0xa0
> [<ffffffff8100c20a>] ? child_rip+0xa/0x20
> [<ffffffff8100bb50>] ? restore_args+0x0/0x30
> [<ffffffff81095980>] ? kthread+0x0/0xa0
> [<ffffffff8100c200>] ? child_rip+0x0/0x20
> FIX kmalloc-128: Restoring 0xffff880036dcd210-0xffff880036dcd217=0xcc
This seems to point at the i915 driver. Do the other laptops (T400 and X200) also have Intel graphics hardware?
(In reply to comment #20)
> This seems to point at the i915 driver. Do the other laptops (T400 and X200)
> also have Intel graphics hardware?
Yes.
There are at least two different issues here. One is a race between swapping and hibernation; the others are related to the graphics driver. I can fix the former; we have upstream patches for that. For the graphics driver problems, I will open separate bug report(s).
Created attachment 528161 [details]
checkmem.c
Simple program to check for memory corruption in user space.
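The contents of checkmem.c are not reproduced in this report. As a rough sketch of how such a user-space check could work, the code below fills a buffer with a position-dependent pattern and later counts the words that no longer match. The pattern and the function names (`pattern_at`, `fill_pattern`, `count_corrupted`) are assumptions made for illustration and are not taken from the attachment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical pattern: the expected value of word i depends only on its index. */
static uint64_t pattern_at(size_t i)
{
    return (uint64_t)i * 0x9E3779B97F4A7C15ULL;
}

/* Fill buf[0..count) with the pattern before hibernating. */
static void fill_pattern(uint64_t *buf, size_t count)
{
    assert(buf != NULL);
    for (size_t i = 0; i < count; i++)
        buf[i] = pattern_at(i);
}

/* After resume: return how many words no longer match the pattern. */
static size_t count_corrupted(const uint64_t *buf, size_t count)
{
    size_t bad = 0;

    for (size_t i = 0; i < count; i++)
        if (buf[i] != pattern_at(i))
            bad++;
    return bad;
}
```

In a real test the process would fill a large buffer, stay alive across the hibernate/resume cycle, and call `count_corrupted` afterwards; a nonzero result indicates that user-space memory was modified behind the program's back.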
Created attachment 528163 [details]
test_hib.sh
Script that can be used to reproduce this bug. It hibernates, reboots, and resumes in a loop and checks memory using the previously attached program. When corruption is encountered, it will print an error and sleep forever, or the system will crash.
Created attachment 528166 [details]
0001-PM-Hibernate-Fix-memory-corruption-related-to-swap.patch
Proposed fix.
Patch(es) available on kernel-2.6.32-211.el6
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory, and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.
http://rhn.redhat.com/errata/RHSA-2011-1530.html