Description of problem:
When running a Fedora 12 Xen 32bit PAE PV guest with 1024 MB or more of memory, so that HIGHMEM is used, the guest crashes quite easily with a CONFIG_HIGHPTE xen_set_pte() related bug/race in the kernel.

Version-Release number of selected component (if applicable):
Fedora 12, 2.6.31.* 32bit PAE Xen PV kernels.

How reproducible:
Always.

Steps to Reproduce:
1. Install a 32bit PAE Xen PV Fedora 12 guest and give it 1 GB or more of memory.
2. Start a kernel-compilation loop in the guest.
3. Wait; the guest crashes, usually within 15-30 minutes.

Actual results:
The guest kernel crashes.

Expected results:
The guest works OK without crashes.

Additional info:
The traceback on the guest kernel usually looks like this:

BUG: unable to handle kernel paging request at c024fc18
IP: [<c0405031>] xen_set_pte+0x78/0x80
*pdpt = 00000002243c0027
Oops: 0003 [#1] SMP
last sysfs file: /sys/devices/system/cpu/cpu1/cache/index2/shared_cpu_map
Modules linked in: sunrpc ipv6 dm_multipath xen_netfront xen_blkfront [last unloaded: microcode]

Pid: 30, comm: kswapd0 Tainted: G W (2.6.31.12-174.2.3.fc12.i686.PAE #1)
EIP: 0061:[<c0405031>] EFLAGS: 00010296 CPU: 1
EIP is at xen_set_pte+0x78/0x80
EAX: 00000000 EBX: c024fc18 ECX: 80000001 EDX: afded063
ESI: 80000001 EDI: 004f42ed EBP: ecba9db0 ESP: ecba9da0
 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0069
Process kswapd0 (pid: 30, ti=ecba8000 task=ecba0000 task.ti=ecba8000)
Stack:
 afded063 fffffe50 00039ae6 00000063 ecba9dd8 c042a731 80000000 f57ff000
<0> 00000036 39ae6000 00000000 80000000 00000000 0888a000 ecba9df0 c0404a0a
<0> 00000163 80000000 d51a5220 0000000f ecba9e14 c04b51fc 00000001 c2d0cae0
Call Trace:
 [<c042a731>] ? kmap_atomic_prot+0xef/0x111
 [<c0404a0a>] ? xen_kmap_atomic_pte+0x2f/0x36
 [<c04b51fc>] ? page_check_address+0xad/0x179
 [<c04b539c>] ? page_referenced_one+0x40/0x10e
 [<c04b6160>] ? page_referenced+0xa8/0x127
 [<c04a3cb2>] ? shrink_active_list+0x168/0x1e0
 [<c04a4a02>] ? shrink_zone+0x27f/0x291
 [<c04a4edb>] ? kswapd+0x386/0x52b
 [<c04a380a>] ? isolate_pages_global+0x0/0x1ba
 [<c0450db9>] ? autoremove_wake_function+0x0/0x34
 [<c04a4b55>] ? kswapd+0x0/0x52b
 [<c0450b0f>] ? kthread+0x70/0x75
 [<c0450a9f>] ? kthread+0x0/0x75
 [<c0409c07>] ? kernel_thread_helper+0x7/0x10
Code: c6 05 44 03 a3 c0 00 8b 3d a4 02 a3 c0 89 55 f0 e8 e8 fe 01 00 48 0f 94 c0 0f b6 c0 8d 3c 38 89 3d a4 02 a3 c0 89 73 04 8b 55 f0 <89> 13 5e 5b 5e 5f 5d c3 55 89 e5 57 56 53 83 ec 0c 0f 1f 44 00
EIP: [<c0405031>] xen_set_pte+0x78/0x80 SS:ESP 0069:ecba9da0
CR2: 00000000c024fc18
---[ end trace a7919e7f17c0a727 ]---

The Xen hypervisor has this in the "xm dmesg" log:

(XEN) mm.c:1816:d1 Bad type (saw 0000000028000001 != exp 00000000e0000000) for mfn 1afded (pfn 39ae6)
(XEN) mm.c:649:d1 Error getting mfn 1afded (pfn 39ae6) from L1 entry 80000001afded063 for dom1
(XEN) mm.c:3346:d1 ptwr_emulate: could not get_page_from_l1e()

Discussion about this bug on the xen-devel mailing list:
http://lists.xensource.com/archives/html/xen-devel/2010-02/msg00412.html

The Xen guys will post a patch upstream shortly. The other fix is to disable CONFIG_HIGHPTE in the kernel, or to use the patch from the link above, which basically does the same thing.
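For anyone comparing the guest oops with the hypervisor log: the L1 entry Xen reports (80000001afded063) matches the ESI (80000001) and EDX (afded063) values in the oops as its high and low halves. The sketch below decodes such an entry using the standard x86 PAE page-table entry layout (frame number in bits 12..51, flag bits in the low bits, NX in bit 63). The function names `decode_pte` and `pte_from_registers` are made up for illustration and do not come from the kernel or from Xen.

```python
# Decode a 64-bit x86 PAE page-table entry into its frame number and flags.
# Helper names are hypothetical; the bit layout is the standard PAE one.

PTE_FLAGS = {
    0: "PRESENT",
    1: "RW",
    2: "USER",
    3: "PWT",
    4: "PCD",
    5: "ACCESSED",
    6: "DIRTY",
    7: "PAT",      # meaning of bit 7 in a 4 KiB (L1) entry
    8: "GLOBAL",
    63: "NX",
}

FRAME_MASK = 0x000FFFFFFFFFF000  # bits 12..51 hold the frame number


def pte_from_registers(high, low):
    """Combine the two 32-bit halves of a PAE entry into one 64-bit value."""
    return (high << 32) | low


def decode_pte(entry):
    """Return (frame_number, [flag names]) for a 64-bit PAE entry."""
    frame = (entry & FRAME_MASK) >> 12
    flags = [name for bit, name in sorted(PTE_FLAGS.items())
             if entry & (1 << bit)]
    return frame, flags


if __name__ == "__main__":
    # ESI and EDX as printed in the oops above.
    entry = pte_from_registers(0x80000001, 0xAFDED063)
    frame, flags = decode_pte(entry)
    print("entry=%016x frame=%x flags=%s" % (entry, frame, ",".join(flags)))
```

Run on the values from this report, the frame number comes out as 1afded, the same mfn that appears in the "Bad type" and "Error getting mfn" lines of the Xen log, with the present, writable, accessed, dirty and NX bits set.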
Upstream patch here: http://lists.xensource.com/archives/html/xen-devel/2010-02/msg01224.html
A workaround has been committed upstream:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff_plain;h=14315592009c17035cac81f4954d5a1f4d71e489

And the real fix is queued:
http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commitdiff_plain;h=9be6abceadd654543648b799ae430f30e09959ac
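Until a kernel with one of these fixes is installed, whether a guest is exposed depends on whether its kernel was built with CONFIG_HIGHPTE, as noted in the original report. A minimal sketch for checking that from inside the guest follows. It assumes the build configuration is available as /boot/config-<release>, which is the usual Fedora location but is an assumption here. The helper names `config_option_state` and `highpte_enabled` are invented for this example.

```python
# Check whether a kernel build configuration enables CONFIG_HIGHPTE.
# Helper names and the /boot/config-<release> path are assumptions.
import platform
import re


def config_option_state(config_text, option):
    """Return the value assigned to `option`, or None if it is not set.

    Lines such as "# CONFIG_HIGHPTE is not set" do not match, because
    the option name must appear at the very start of the line.
    """
    pattern = re.compile(r"^%s=(.*)$" % re.escape(option), re.MULTILINE)
    match = pattern.search(config_text)
    return match.group(1) if match else None


def highpte_enabled(config_text):
    """True if the configuration text contains CONFIG_HIGHPTE=y."""
    return config_option_state(config_text, "CONFIG_HIGHPTE") == "y"


if __name__ == "__main__":
    path = "/boot/config-" + platform.release()  # assumed location
    with open(path) as config_file:
        enabled = highpte_enabled(config_file.read())
    print("CONFIG_HIGHPTE enabled:", enabled)
```

If this prints True on a 32bit PAE PV guest with 1 GB or more of memory, the guest matches the conditions described in this report.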
This message is a reminder that Fedora 12 is nearing its end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 12. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '12'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 12's end of life. Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 12 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug to the applicable version. If you are unable to change the version, please add a comment here and someone will do it for you. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Fedora 12 changed to end-of-life (EOL) status on 2010-12-02. Fedora 12 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. Thank you for reporting this bug and we are sorry it could not be fixed.