Bug 412691 - kernel-xen panic when X shuts down

Summary: kernel-xen panic when X shuts down

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 5
Classification:	Red Hat
Component:	kernel-xen
Sub Component:
Version:	5.2
Hardware:	All
OS:	Linux
Priority:	high
Severity:	high
Target Milestone:	rc
Target Release:	---
Assignee:	Rik van Riel
QA Contact:	Martin Jenner
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2007-12-05 19:49 UTC by Prarit Bhargava
Modified:	2009-01-20 20:03 UTC
CC List:	14 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2009-01-20 20:03:54 UTC
Target Upstream Version:
Embargoed:
Dependent Products:

Attachments
x86 numa: Fix the overflow of physical addresses. (1.84 KB, patch) 2008-10-22 21:52 UTC, Rik van Riel
kernel patch with the fixes (3.50 KB, patch) 2008-11-18 20:45 UTC, Rik van Riel

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHSA-2009:0225	0	normal	SHIPPED_LIVE	Important: Red Hat Enterprise Linux 5.3 kernel security and bug fix update	2009-01-20 16:06:24 UTC

Description Prarit Bhargava 2007-12-05 19:49:22 UTC

Description of problem:

When X (RHGB or otherwise) shuts down, kernel-xen panics.

Version-Release number of selected component (if applicable): 2.6.18-58.el5xen


How reproducible: 100%


Steps to Reproduce:
1. Boot kernel-xen.
2. startx
3. End the X session

or

1. Boot kernel-xen with rhgb
2. When rhgb completes the system will panic.
  
Actual results:

dhcp83-120.boston.redhat.com login: ------------[ cut here ]------------
kernel BUG at arch/i386/mm/pageattr.c:130!
invalid opcode: 0000 [#1]
SMP 
last sysfs file: /class/drm/card0/dev
Modules linked in: i915 drm netloop netbk blktap blkbk ipt_MASQUERADE iptable_nat ip_nat bridge autofs4 hidp nfs lockd fscache nfs_acl rfcomm l2cap bluetooth sunrpc ip_conntrack_netbios_ns ipt_REJECT xt_state ip_conntrack nfnetlink iptable_filter ip_tables ip6t_REJECT xt_tcpudp ip6table_filter ip6_tables x_tables ipv6 dm_multipath video sbs backlight i2c_ec button battery asus_acpi ac parport_pc lp parport sr_mod cdrom snd_hda_intel snd_hda_codec snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss pcspkr snd_pcm serial_core serio_raw i2c_i801 i2c_core snd_timer snd soundcore snd_page_alloc e1000e sg dm_snapshot dm_zero dm_mirror dm_mod pata_marvell ata_piix libata sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd
CPU:    0
EIP:    0061:[<c04142ba>]    Not tainted VLI
EFLAGS: 00210046   (2.6.18-58.el5xen #1) 
EIP is at change_page_attr+0x69/0x65f
eax: c0668700   ebx: c1cceaa0   ecx: 00000000   edx: c0668700
esi: 15555000   edi: 00000001   ebp: 00000000   esp: ebfd9ec4
ds: 007b   es: 007b   ss: 0069
Process X (pid: 3690, ti=ebfd9000 task=c037e000 task.ti=ebfd9000)
Stack: 7e342067 00000000 00000001 c1cceaa0 00000000 00000000 ebfd9f6c 00000000 
       00000000 003f0000 b7f82000 00000000 b7d82000 ebfd9f6c 00000000 00000000 
       ed1fb200 c1000ac0 00000000 14db1065 ed1fb248 eb995284 ed1fb238 ed1fb200 
Call Trace:
 [<c05fe861>] do_page_fault+0x6d5/0xbd8
 [<c05fe8da>] do_page_fault+0x74e/0xbd8
 [<c0532ccb>] unmap_page_from_agp+0x16/0x19
 [<c0532cef>] agp_generic_destroy_page+0x21/0x44
 [<c0532be8>] agp_free_memory+0x9e/0xd4
 [<c0532cce>] agp_generic_destroy_page+0x0/0x44
 [<c0531f42>] agp_release+0x7e/0x143
 [<c04682eb>] __fput+0x9c/0x167
 [<c0465dd7>] filp_close+0x4e/0x54
 [<c040534f>] syscall_call+0x7/0xb
 =======================
 =======================
Code: 44 24 10 00 00 00 00 e9 bd 05 00 00 8b 54 24 0c 8b 02 c1 e8 1e 8b 14 85 4c 35 6d c0 8b 82 0c 12 00 00 05 80 37 00 00 39 c2 75 08 <0f> 0b 82 00 8b 82 61 c0 8b 44 24 0c e8 01 e5 03 00 89 44 24 24 
EIP: [<c04142ba>] change_page_attr+0x69/0x65f SS:ESP 0069:ebfd9ec4
 <0>Kernel panic - not syncing: Fatal exception
 BUG: warning at arch/i386/kernel/smp-xen.c:529/smp_call_function() (Not tainted)
 [<c040e703>] smp_call_function+0x59/0xfe
 [<c040e7bb>] smp_send_stop+0x13/0x1e
 [<c041ca57>] panic+0x4c/0x175
 [<c0405fc3>] die+0x267/0x29b
 [<c0406513>] do_invalid_op+0x0/0x9d
 [<c04065a4>] do_invalid_op+0x91/0x9d
 [<c04142ba>] change_page_attr+0x69/0x65f
 [<c0457952>] __handle_mm_fault+0xf46/0x104e
 [<c0406dad>] do_IRQ+0xa5/0xae
 [<c04054d3>] error_code+0x2b/0x30
 [<c04142ba>] change_page_attr+0x69/0x65f
 [<c05fe861>] do_page_fault+0x6d5/0xbd8
 [<c05fe8da>] do_page_fault+0x74e/0xbd8
 [<c0532ccb>] unmap_page_from_agp+0x16/0x19
 [<c0532cef>] agp_generic_destroy_page+0x21/0x44
 [<c0532be8>] agp_free_memory+0x9e/0xd4
 [<c0532cce>] agp_generic_destroy_page+0x0/0x44
 [<c0531f42>] agp_release+0x7e/0x143
 [<c04682eb>] __fput+0x9c/0x167
 [<c0465dd7>] filp_close+0x4e/0x54
 [<c040534f>] syscall_call+0x7/0xb
 =======================
(XEN) Domain 0 crashed: rebooting machine in 5 seconds.


Expected results: Something nice should happen instead of a panic.  Maybe
someone will send me flowers.  Or chocolate.  I like chocolate.

Additional info:  Seen on Johannesburg system with 4G of memory.  (Note that
these systems require DIMMs to be added _one-at-a-time_.)

I've just started looking at this, so no patch (yet).  Will work with
clalance/sct/other-xen-persons to get issue resolved.

Comment 1 Prarit Bhargava 2007-12-05 20:02:49 UTC

A workaround appears to be:

1.  open the /etc/X11/xorg.conf file
2.  replace
     Driver      "i810"
with
     Driver      "vesa"

This results in no panic during X shutdown.

Comment 2 Bill Burns 2008-04-30 15:59:49 UTC

Do you know if this is still an issue with later kernels? Some fixes that affect
X have been made that might have addressed this. Can you possibly try the -91
kernel? Thanks.

Comment 4 Gary Case 2008-04-30 17:36:22 UTC

It's still broken. I installed my 8GB DQ35JO system using the snap7 tree (-91)
and i386 arch gets stuck in an endless boot loop because of the panic.

Comment 5 Gary Case 2008-04-30 17:48:54 UTC

It appears that we no longer load the 'i810' driver. We've switched to 'intel', but
the same problem occurs. The workaround of switching to 'vesa' still works.

Comment 6 Bill Burns 2008-04-30 17:53:51 UTC

Ok, thanks for the info.

Comment 11 Fal Diabate 2008-09-08 19:08:51 UTC

Per Ron's request: the latest BIOS version for JO and MP is 0942. Please see this URL for download info.
http://downloadcenter.intel.com/Filter_Results.aspx?strTypes=all&ProductID=2784&OSFullname=OS+Independent&strOSs=38

Regards,
Fal

Comment 12 Rik van Riel 2008-09-26 20:12:19 UTC

static int
__change_page_attr(struct page *page, pgprot_t prot)
{
        pte_t *kpte;
        unsigned long address;
        struct page *kpte_page;

        BUG_ON(PageHighMem(page));
        address = (unsigned long)page_address(page);

This is the BUG being triggered in comment #0.

On the system in question, does the i810 DRM driver by chance mmap physical memory at addresses higher than 1GB into the frame buffer?
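For context on the BUG_ON above: on 32-bit x86 with the default 3G/1G split, only about the first 896MB of memory is permanently mapped into the kernel's address space, and pages above that boundary are highmem with no permanent kernel virtual address. The following is a minimal user-space sketch of that classification, assuming 4KB pages and an 896MB boundary. The names `LOWMEM_LIMIT_MB` and `pfn_is_highmem` are illustrative, not kernel symbols, and the sketch ignores the pseudo-physical frame numbering a Xen dom0 kernel uses.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4KB pages and an 896MB lowmem boundary (default i386 split). */
#define PAGE_SHIFT_BITS 12
#define LOWMEM_LIMIT_MB 896ULL

/* Returns true when the page frame number lies at or above the lowmem
 * boundary, i.e. when a PageHighMem()-style check would be expected to fire. */
static bool pfn_is_highmem(uint64_t pfn)
{
    uint64_t lowmem_pages = (LOWMEM_LIMIT_MB << 20) >> PAGE_SHIFT_BITS;
    return pfn >= lowmem_pages;
}
```

Under these assumptions a frame at the 1GB mark (pfn 0x40000) counts as highmem, which is the case the question above is asking about.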

Comment 13 Rik van Riel 2008-10-06 22:49:57 UTC

I spent some time reading this code and I don't understand some things:
- why is the i915 driver using the agp memory code?
- if it is using the agp memory code, why did it never call map_page_into_agp(), which should have also run into the BUG_ON in __change_page_attr() ?

Comment 15 Rik van Riel 2008-10-15 17:57:15 UTC

On another Johannesburg system (the one I tried to reproduce the X shutdown bug on), X manages to crash the system very badly at startup.

Every time I start up X, I get a different hypervisor panic.  This leads me to believe that X, with help from the i915 driver, is corrupting hypervisor memory.

I tried running X straced (over serial console), but the last few thousand lines of strace output are all gettimeofday syscalls.  Presumably X is prodding the hardware through mmaps in-between the gettimeofday calls and one of the memory writes is causing the hypervisor to panic.

Comment 17 Rik van Riel 2008-10-15 18:11:41 UTC

Another data point: when the hypervisor is limited to 3GB memory, things work normally.  Going to 4GB or more causes things to break.

Comment 18 Rik van Riel 2008-10-22 21:52:31 UTC

Created attachment 321212
x86 numa: Fix the overflow of physical addresses.

First potentially relevant changeset I found while combing through upstream.

Comment 19 Rik van Riel 2008-10-23 17:16:14 UTC

Progress!  With the hypervisor patch, my test system no longer crashes on X startup.

Instead, I get the same dom0 kernel panic as in comment #0 when X exits.

Comment 20 Rik van Riel 2008-10-23 17:29:10 UTC

I posted the patches for review this morning.

Comment 21 Rik van Riel 2008-10-23 17:42:06 UTC

Umm, wrong browser tab.  Moving the *other* bug to POST now :)

Comment 22 Keve Gabbert 2008-10-28 20:49:53 UTC

Don, please add this to your agenda for RH-Intel Virtualization meeting.
This issue is blocking the Certification of DQ35MP.

Comment 23 Jiang, Yunhong 2008-11-17 14:38:58 UTC

I'm not sure whether the following changeset in Xen's Linux tree is related to this bug, but I'm sure this changeset is needed for Intel platforms to work, even if not for this bug.

BTW, regarding comment #13 ("if it is using the agp memory code, why did it never call
map_page_into_agp(), which should have also run into the BUG_ON in __change_page_attr() ?"): that is not always the case. When map_page_into_agp() is called, the page has just been allocated and is fine. However, when unmap_page_from_agp() is called, the page is obtained from gart_to_virt(), which is not always correct.

If that is what happens for this page, then I suspect the cause is in agp_allocate_memory() in drivers/char/agp/generic.c: at "new->memory[i] = virt_to_gart(addr);", memory is declared as unsigned long *, so some information from virt_to_gart() may be lost.
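The suspected information loss can be illustrated in user space. On i386, unsigned long is 32 bits wide, so an address above 4G stored in an unsigned long slot keeps only its low 32 bits. This is a sketch under that assumption: `i386_ulong`, `store_gart_addr` and `addr_was_truncated` are hypothetical names for illustration and are not part of the AGP code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: uint32_t stands in for the 32-bit "unsigned long" of i386. */
typedef uint32_t i386_ulong;

/* Store a 64-bit address in a 32-bit slot, as an "unsigned long *" array
 * would on i386; the upper 32 bits are silently dropped. */
static i386_ulong store_gart_addr(uint64_t machine_addr)
{
    return (i386_ulong)machine_addr;
}

/* True when the stored value no longer matches the original address. */
static bool addr_was_truncated(uint64_t machine_addr)
{
    return (uint64_t)store_gart_addr(machine_addr) != machine_addr;
}
```

Any address below 4G round-trips unchanged, while one above 4G comes back pointing at a different, lower location.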

I will attach the details of the patch also.

# HG changeset patch
# User kfraser
# Date 1182429682 -3600
# Node ID 02a46885bd90a4d936338c135023b511318c7aa2
# Parent  c8c9bc0b7e29e804c09d4375a0e655cda826a9e4
linux: fix agp address handling, namely intel-agp

Make sure machine addresses are in fact constrained to 32 bits, and
assumptions about multi-page extents being contiguous are being met.

Generic parts of the patch are in 2.6.22-rc4.

Signed-off-by: Jan Beulich <jbeulich>

Comment 24 Jiang, Yunhong 2008-11-17 14:56:25 UTC

It seems I can't attach the patch, so here is the URL for it.
http://xenbits.xensource.com/linux-2.6.18-xen.hg?rev/02a46885bd90

Comment 25 Rik van Riel 2008-11-17 15:08:24 UTC

Thank you for finding that patch, Yunhong!  Together with the hypervisor patch, that may make things work again.

Of course, I will have to change the patch a bit so the kABI stays the same (we cannot get rid of two exported symbols in-between RHEL updates), but that looks doable.

Comment 26 Jiang, Yunhong 2008-11-17 15:12:12 UTC

So, Rik, I'm waiting for your test results.

Also, please check the following URL: http://lkml.org/lkml/2007/4/2/186

Most of the rest is similar to the patch in comments 32, but I'm not sure whether
the following chunk is also needed.

 @@ -206,7 +207,7 @@ static void i8xx_destroy_pages(void *add
  global_flush_tlb();
  put_page(page);
  unlock_page(page);
- free_pages((unsigned long)addr, 2);
+ __free_pages(page, 2);
  atomic_dec(&agp_bridge->current_memory_agp);
 }

Comment 27 Jiang, Yunhong 2008-11-17 22:55:15 UTC

Any update on this issue? Is it working now?

Comment 28 Jiang, Yunhong 2008-11-18 15:33:40 UTC

We tried the patch on our side, and it seems the issue is caused by a wrong E820 table.

When we populated 4G of memory, the E820 table was reported as follows, which means only 512M of memory is usable by the OS. As Xen will only reserve min(memory/16, 128M) for the DMA buffer, only 32M is reserved (please check compute_dom0_nr_pages() in arch/x86/domain_build.c for details).
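The arithmetic behind the 32M figure can be checked with a small sketch of the min(memory/16, 128M) rule as stated in this comment. The function name `dma_reserve_mb` is hypothetical, and all values are in megabytes for readability rather than in pages as the hypervisor code uses.

```c
#include <stdint.h>

/* Reserve = min(memory / 16, 128MB), as described in the comment above.
 * Input and result are both in megabytes. */
static uint64_t dma_reserve_mb(uint64_t memory_mb)
{
    uint64_t share = memory_mb / 16;
    return share < 128 ? share : 128;
}
```

With 512M reported as usable, the one-sixteenth share is 32M, well under the 128M cap, so the share is what gets reserved.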

One potential improvement for the patch in comment #23 is to add some error handling to the map_page_into_agp(page) macro defined in include/asm-i386/mach-xen/asm/agp.h, so that it returns failure if xen_create_contiguous_region() fails, and also to update agp_generic_alloc_page() to handle such a failure. But even with that enhancement, the system still can't start X Window.


Nov 11 16:01:45 localhost kernel: BIOS-provided physical RAM map:
Nov 11 16:01:45 localhost kernel:  BIOS-e820: 0000000000000000 - 000000000009f800 (usable)
Nov 11 16:01:45 localhost kernel:  BIOS-e820: 000000000009f800 - 00000000000a0000 (reserved)
Nov 11 16:01:45 localhost kernel:  BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
Nov 11 16:01:45 localhost kernel:  BIOS-e820: 0000000000100000 - 0000000020d73000 (usable)
Nov 11 16:01:45 localhost kernel:  BIOS-e820: 0000000020d73000 - 00000000cd174000 (reserved)
Nov 11 16:01:45 localhost kernel:  BIOS-e820: 00000000cd174000 - 00000000cdbfd000 (usable)
Nov 11 16:01:45 localhost kernel:  BIOS-e820: 00000000cdbfd000 - 00000000cdca2000 (ACPI NVS)
Nov 11 16:01:45 localhost kernel:  BIOS-e820: 00000000cdca2000 - 00000000ceeca000 (usable)
Nov 11 16:01:45 localhost kernel:  BIOS-e820: 00000000ceeca000 - 00000000ceecc000 (reserved)
Nov 11 16:01:45 localhost kernel:  BIOS-e820: 00000000ceecc000 - 00000000cef84000 (usable)
Nov 11 16:01:45 localhost kernel:  BIOS-e820: 00000000cef84000 - 00000000cefe5000 (ACPI NVS)
Nov 11 16:01:45 localhost kernel:  BIOS-e820: 00000000cefe5000 - 00000000cefea000 (usable)
Nov 11 16:01:45 localhost kernel:  BIOS-e820: 00000000cefea000 - 00000000ceff3000 (ACPI data)
Nov 11 16:01:45 localhost kernel:  BIOS-e820: 00000000ceff3000 - 00000000ceff4000 (usable)
Nov 11 16:01:45 localhost kernel:  BIOS-e820: 00000000ceff4000 - 00000000cefff000 (ACPI data)
Nov 11 16:01:45 localhost kernel:  BIOS-e820: 00000000cefff000 - 00000000cf000000 (usable)
Nov 11 16:01:45 localhost kernel:  BIOS-e820: 00000000cf000000 - 00000000d0000000 (reserved)
Nov 11 16:01:45 localhost kernel:  BIOS-e820: 00000000f0000000 - 00000000f8000000 (reserved)
Nov 11 16:01:45 localhost kernel:  BIOS-e820: 00000000ffc00000 - 0000000100000000 (reserved)
Nov 11 16:01:45 localhost kernel:  BIOS-e820: 0000000100000000 - 000000012c000000 (usable)
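To judge how much of a map like the one above is usable below a given address, the usable ranges can be summed. The following is a minimal sketch, assuming the entries have already been parsed into start/end pairs. `struct e820_range` and `usable_bytes_below` are illustrative names, not kernel or Xen definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* One parsed e820 entry: the half-open range [start, end) and whether the
 * BIOS marked it "usable". */
struct e820_range {
    uint64_t start;
    uint64_t end;
    int usable;
};

/* Sum the usable bytes that lie below `limit` (for example 4GB).
 * Ranges that straddle the limit are clipped to it. */
static uint64_t usable_bytes_below(const struct e820_range *map, size_t n,
                                   uint64_t limit)
{
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++) {
        if (!map[i].usable || map[i].start >= limit)
            continue;
        uint64_t end = map[i].end < limit ? map[i].end : limit;
        total += end - map[i].start;
    }
    return total;
}
```

Calling it with a limit of 0x100000000 counts only memory below 4G, so the large usable range that starts at 4G in the log above is excluded.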

Comment 29 Rik van Riel 2008-11-18 20:43:27 UTC

Apparently, there are more bugs lurking somewhere :(

------------[ cut here ]------------
kernel BUG at arch/i386/mm/pageattr.c:156!
invalid opcode: 0000 [#1]
SMP 
last sysfs file: /class/drm/card0/dev
Modules linked in: i915 drm netloop netbk blktap blkbk ipt_MASQUERADE iptable_nat ip_nat bridge autofs4 hidp nfs lockd fscache nfs_acl rfcomm l2cap bluetooth sunrpc ip_conntrack_netbios_ns ipt_REJECT xt_state ip_conntrack nfnetlink iptable_filter ip_tables ip6t_REJECT xt_tcpudp ip6table_filter ip6_tables x_tables ipv6 xfrm_nalgo crypto_api cpufreq_ondemand acpi_cpufreq dm_multipath scsi_dh video hwmon backlight sbs i2c_ec button battery asus_acpi ac parport_pc lp parport sr_mod cdrom sg snd_hda_intel snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm i2c_i801 e1000e snd_timer snd_page_alloc snd_hwdep snd i2c_core soundcore serio_raw serial_core pcspkr dm_snapshot dm_zero dm_mirror dm_log dm_mod pata_marvell ahci libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
CPU:    0
EIP:    0061:[<c0416981>]    Not tainted VLI
EFLAGS: 00210046   (2.6.18-124.el5.bz412691.2xen #1) 
EIP is at change_page_attr+0x571/0x7e0
eax: 00000000   ebx: 06881063   ecx: 80000002   edx: 80000002
esi: 06881063   edi: c00412e8   ebp: c1672820   esp: ebbf5eb0
ds: 007b   es: 007b   ss: 0069
Process X (pid: 7339, ti=ebbf5000 task=ecd85000 task.ti=ebbf5000)
Stack: 00000000 80000002 00000001 c17a2ba0 00000000 00000000 c1672820 00000000 
       00000000 c985d000 c00412e8 00000000 00000000 c1672000 00000004 00026174 
       00000063 80000000 00000001 00000000 00000000 06881063 80000002 00000001 
Call Trace:
 [<c053f493>] unmap_page_from_agp+0x27/0x2b
 [<c053f4b8>] agp_generic_destroy_page+0x21/0x44
 [<c053f39f>] agp_free_memory+0x9e/0xd4
 [<c053f497>] agp_generic_destroy_page+0x0/0x44
 [<c053e6e6>] agp_release+0x7e/0x143
 [<c046fc1b>] __fput+0x9c/0x167
 [<c046d609>] filp_close+0x4e/0x54
 [<c046e835>] sys_close+0x71/0xa8
 [<c0405413>] syscall_call+0x7/0xb
 =======================
Code: 89 54 24 04 89 f3 8b 4c 24 04 89 04 24 89 74 24 54 89 54 24 58 8b 07 8b 57 04 f0 0f c7 0f 75 f5 8b 6c 24 18 8b 45 0c 85 c0 75 08 <0f> 0b 9c 00 f2 cf 62 c0 8b 54 24 18 48 89 42 0c eb 08 0f 0b 9f 
EIP: [<c0416981>] change_page_attr+0x571/0x7e0 SS:ESP 0069:ebbf5eb0
 <0>Kernel panic - not syncing: Fatal exception
 BUG: warning at arch/i386/kernel/smp-xen.c:529/smp_call_function() (Not tainted)
 [<c0410983>] smp_call_function+0x59/0xfe
 [<c0410a3b>] smp_send_stop+0x13/0x1e
 [<c041f68b>] panic+0x4c/0x171
 [<c04060a5>] die+0x262/0x296
 [<c04065f5>] do_invalid_op+0x0/0x9d
 [<c0406686>] do_invalid_op+0x91/0x9d
 [<c0416981>] change_page_attr+0x571/0x7e0
 [<c04517af>] __generic_file_aio_write_nolock+0x4a6/0x52a
 [<c04be286>] avc_has_perm+0x3a/0x44
 [<c0405597>] error_code+0x2b/0x30
 [<c0416981>] change_page_attr+0x571/0x7e0
 [<c053f493>] unmap_page_from_agp+0x27/0x2b
 [<c053f4b8>] agp_generic_destroy_page+0x21/0x44
 [<c053f39f>] agp_free_memory+0x9e/0xd4
 [<c053f497>] agp_generic_destroy_page+0x0/0x44
 [<c053e6e6>] agp_release+0x7e/0x143
 [<c046fc1b>] __fput+0x9c/0x167
 [<c046d609>] filp_close+0x4e/0x54
 [<c046e835>] sys_close+0x71/0xa8
 [<c0405413>] syscall_call+0x7/0xb
 =======================
(XEN) Domain 0 crashed: rebooting machine in 5 seconds.

Comment 30 Rik van Riel 2008-11-18 20:45:20 UTC

Created attachment 323960
kernel patch with the fixes

The kernel side patch I used, in addition to the Xen hypervisor patch from the other attachment.

With both of these patches, I still get the oops.

Comment 31 Rik van Riel 2008-11-18 21:23:59 UTC

Heh, this may have been a logic inversion.  Let me try again with this:

diff -r1.1.2.1 linux-2.6-xen-agp-paddr-overflow.patch
50c50
< +	if (xen_create_contiguous_region((unsigned long)page_address(page), 0, 32))
---
> +	if (!xen_create_contiguous_region((unsigned long)page_address(page), 0, 32))

Comment 32 Rik van Riel 2008-11-18 21:54:29 UTC

No luck, still the same bug :(

------------[ cut here ]------------
kernel BUG at arch/i386/mm/pageattr.c:156!
invalid opcode: 0000 [#1]
SMP 
last sysfs file: /class/drm/card0/dev
Modules linked in: i915 drm netloop netbk blktap blkbk ipt_MASQUERADE iptable_nat ip_nat bridge autofs4 hidp nfs lockd fscache nfs_acl rfcomm l2cap bluetooth sunrpc ip_conntrack_netbios_ns ipt_REJECT xt_state ip_conntrack nfnetlink iptable_filter ip_tables ip6t_REJECT xt_tcpudp ip6table_filter ip6_tables x_tables ipv6 xfrm_nalgo crypto_api cpufreq_ondemand acpi_cpufreq dm_multipath scsi_dh video hwmon backlight sbs i2c_ec button battery asus_acpi ac parport_pc lp parport sr_mod cdrom sg snd_hda_intel snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss serial_core snd_pcm snd_timer i2c_i801 snd_page_alloc snd_hwdep snd soundcore e1000e serio_raw i2c_core pcspkr dm_snapshot dm_zero dm_mirror dm_log dm_mod pata_marvell ahci libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
CPU:    0
EIP:    0061:[<c0416981>]    Not tainted VLI
EFLAGS: 00210046   (2.6.18-124.el5.bz412691.3xen #1) 
EIP is at change_page_attr+0x571/0x7e0
eax: 00000000   ebx: 06be3063   ecx: 80000002   edx: 80000002
esi: 06be3063   edi: c0041a20   ebp: c1672820   esp: eb4d0eb0
ds: 007b   es: 007b   ss: 0069
Process X (pid: 7350, ti=eb4d0000 task=ecd63550 task.ti=eb4d0000)
Stack: 00000000 80000002 00000001 c17a4880 00000000 00000000 c1672820 00000000 
       00000000 c9944000 c0041a20 00000000 00000000 c1672000 00000004 00026510 
       00000063 80000000 00000001 00000000 00000000 06be3063 80000002 00000001 
Call Trace:
 [<c053f493>] unmap_page_from_agp+0x27/0x2b
 [<c053f4b8>] agp_generic_destroy_page+0x21/0x44
 [<c053f39f>] agp_free_memory+0x9e/0xd4
 [<c053f497>] agp_generic_destroy_page+0x0/0x44
 [<c053e6e6>] agp_release+0x7e/0x143
 [<c046fc1b>] __fput+0x9c/0x167
 [<c046d609>] filp_close+0x4e/0x54
 [<c046e835>] sys_close+0x71/0xa8
 [<c0405413>] syscall_call+0x7/0xb
 =======================
Code: 89 54 24 04 89 f3 8b 4c 24 04 89 04 24 89 74 24 54 89 54 24 58 8b 07 8b 57 04 f0 0f c7 0f 75 f5 8b 6c 24 18 8b 45 0c 85 c0 75 08 <0f> 0b 9c 00 f2 cf 62 c0 8b 54 24 18 48 89 42 0c eb 08 0f 0b 9f 
EIP: [<c0416981>] change_page_attr+0x571/0x7e0 SS:ESP 0069:eb4d0eb0
 <0>Kernel panic - not syncing: Fatal exception
 BUG: warning at arch/i386/kernel/smp-xen.c:529/smp_call_function() (Not tainted)
 [<c0410983>] smp_call_function+0x59/0xfe
 [<c0410a3b>] smp_send_stop+0x13/0x1e
 [<c041f68b>] panic+0x4c/0x171
 [<c04060a5>] die+0x262/0x296
 [<c04065f5>] do_invalid_op+0x0/0x9d
 [<c0406686>] do_invalid_op+0x91/0x9d
 [<c0416981>] change_page_attr+0x571/0x7e0
 [<c040e7e9>] generic_get_mtrr+0x21/0x42
 [<c04517af>] __generic_file_aio_write_nolock+0x4a6/0x52a
 [<c04be286>] avc_has_perm+0x3a/0x44
 [<c0405597>] error_code+0x2b/0x30
 [<c0416981>] change_page_attr+0x571/0x7e0
 [<c053f493>] unmap_page_from_agp+0x27/0x2b
 [<c053f4b8>] agp_generic_destroy_page+0x21/0x44
 [<c053f39f>] agp_free_memory+0x9e/0xd4
 [<c053f497>] agp_generic_destroy_page+0x0/0x44
 [<c053e6e6>] agp_release+0x7e/0x143
 [<c046fc1b>] __fput+0x9c/0x167
 [<c046d609>] filp_close+0x4e/0x54
 [<c046e835>] sys_close+0x71/0xa8
 [<c0405413>] syscall_call+0x7/0xb
 =======================
(XEN) Domain 0 crashed: rebooting machine in 5 seconds.

Comment 33 Jiang, Yunhong 2008-11-19 03:32:57 UTC

Can you please try adding dom0_mem=512M to grub's Xen entry when the machine is populated with 4G of memory? That should work around this issue.

Also, I think the patch in comment 31 is not needed, since xen_create_contiguous_region() returns 0 on success. But maybe we need to add a check, so that if xen_create_contiguous_region() fails, agp_generic_alloc_page() fails as well.
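The return convention matters for the check being discussed: with 0 meaning success, a failure test has to be written as "nonzero", and negating the call reverses its meaning. Below is a minimal sketch of the suggested failure propagation. Both functions are hypothetical stand-ins written for this illustration; they are not the real xen_create_contiguous_region() or agp_generic_alloc_page().

```c
#include <stddef.h>

/* Hypothetical stand-in for a call that returns 0 on success and a
 * negative value on failure, the convention described above. */
static int fake_create_contiguous_region(int should_fail)
{
    return should_fail ? -12 : 0;   /* -12 mirrors -ENOMEM */
}

/* Hypothetical allocation wrapper: hands back the page on success and
 * NULL when the region call reports an error. */
static void *alloc_page_checked(void *page, int should_fail)
{
    if (fake_create_contiguous_region(should_fail) != 0)
        return NULL;                /* propagate the failure to the caller */
    return page;
}
```

Writing the condition as `!fake_create_contiguous_region(...)` would instead take the error path on success.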

Thanks
Yunhong Jiang

Comment 34 Rik van Riel 2008-11-19 15:31:57 UTC

Booting with dom0_mem=512M does indeed avoid the bug.  Of course, that is probably not an acceptable thing to do for RHEL :)

Comment 35 Don Dugger 2008-11-19 16:21:59 UTC

Can you boot native Linux on that machine and attach the E820 memory map reported to the BZ entry?  I'd like to verify that you are seeing the same BIOS issue we are.  With 4G of RAM in your machine the E820 map should be showing <1G available.

Comment 36 Rik van Riel 2008-11-19 17:00:52 UTC

Here is the e820 map as printed out by the Xen hypervisor:

(XEN) Xen-e820 RAM map:
(XEN)  0000000000000000 - 000000000009dc00 (usable)
(XEN)  000000000009dc00 - 00000000000a0000 (reserved)
(XEN)  00000000000e0000 - 0000000000100000 (reserved)
(XEN)  0000000000100000 - 00000000cdc90000 (usable)
(XEN)  00000000cdc90000 - 00000000cdcf6000 (ACPI NVS)
(XEN)  00000000cdcf6000 - 00000000ceec6000 (usable)
(XEN)  00000000ceec6000 - 00000000ceec8000 (reserved)
(XEN)  00000000ceec8000 - 00000000cef7a000 (usable)
(XEN)  00000000cef7a000 - 00000000cefe5000 (ACPI NVS)
(XEN)  00000000cefe5000 - 00000000cefe7000 (usable)
(XEN)  00000000cefe7000 - 00000000ceff3000 (ACPI data)
(XEN)  00000000ceff3000 - 00000000ceff4000 (usable)
(XEN)  00000000ceff4000 - 00000000cefff000 (ACPI data)
(XEN)  00000000cefff000 - 00000000cf000000 (usable)
(XEN)  00000000cf000000 - 00000000d0000000 (reserved)
(XEN)  00000000f0000000 - 00000000f8000000 (reserved)
(XEN)  00000000ffc00000 - 0000000100000000 (reserved)
(XEN)  0000000100000000 - 000000022c000000 (usable)
(XEN) System RAM: 8110MB (8305356kB)

The e820 map seen by the non-xen kernel is the same:

BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009dc00 (usable)
 BIOS-e820: 000000000009dc00 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 00000000cdc90000 (usable)
 BIOS-e820: 00000000cdc90000 - 00000000cdcf6000 (ACPI NVS)
 BIOS-e820: 00000000cdcf6000 - 00000000ceec6000 (usable)
 BIOS-e820: 00000000ceec6000 - 00000000ceec8000 (reserved)
 BIOS-e820: 00000000ceec8000 - 00000000cef7a000 (usable)
 BIOS-e820: 00000000cef7a000 - 00000000cefe5000 (ACPI NVS)
 BIOS-e820: 00000000cefe5000 - 00000000cefe7000 (usable)
 BIOS-e820: 00000000cefe7000 - 00000000ceff3000 (ACPI data)
 BIOS-e820: 00000000ceff3000 - 00000000ceff4000 (usable)
 BIOS-e820: 00000000ceff4000 - 00000000cefff000 (ACPI data)
 BIOS-e820: 00000000cefff000 - 00000000cf000000 (usable)
 BIOS-e820: 00000000cf000000 - 00000000d0000000 (reserved)
 BIOS-e820: 00000000f0000000 - 00000000f8000000 (reserved)
 BIOS-e820: 00000000ffc00000 - 0000000100000000 (reserved)
 BIOS-e820: 0000000100000000 - 000000022c000000 (usable)
8000MB HIGHMEM available.
896MB LOWMEM available.

Comment 37 Rik van Riel 2008-11-19 18:09:34 UTC

FYI, here is the dmidecode info on the BIOS:

Handle 0x0005, DMI type 0, 24 bytes.
BIOS Information
        Vendor: Intel Corp.
        Version: JOQ3510J.86A.0942.2008.0807.1958
        Release Date: 08/07/2008
        Address: 0xF0000
        Runtime Size: 64 kB
        ROM Size: 4096 kB
        Characteristics:
                PCI is supported
                BIOS is upgradeable
                BIOS shadowing is allowed
                Boot from CD is supported
                Selectable boot is supported
                EDD is supported
                8042 keyboard services are supported (int 9h)
                Serial services are supported (int 14h)
                Printer services are supported (int 17h)
                CGA/mono video services are supported (int 10h)
                ACPI is supported
                USB legacy is supported
                ATAPI Zip drive boot is supported
                BIOS boot specification is supported
                Function key-initiated network boot is supported
                Targeted content distribution is supported
        BIOS Revision: 0.0
        Firmware Revision: 0.0

Comment 38 Jiang, Yunhong 2008-11-20 02:10:07 UTC

Rik, according to comment 32, the BUG is now at line 156 of arch/i386/mm/pageattr.c, whereas originally it was at line 130. I had a look at the code and find it a bit strange that it hits line 156. The pgprot_val should have been changed to PAGE_KERNEL_NOCACHE, so we should be in the first "if" statement instead of at the BUG_ON in the "else if" statement.
Anyway, can you add some changes to agp_generic_alloc_page(), so that it fails if map_page_into_agp() fails?

        if (pgprot_val(prot) != pgprot_val(PAGE_KERNEL)) {
                if ((pte_val(*kpte) & _PAGE_PSE) == 0) {
                        set_pte_atomic(kpte, mk_pte(page, prot));
                } else {
                        pgprot_t ref_prot;
                        struct page *split;

                        ref_prot =
                        ((address & LARGE_PAGE_MASK) < (unsigned long)&_etext)
                                ? PAGE_KERNEL_EXEC : PAGE_KERNEL;
                        split = split_large_page(address, prot, ref_prot);
                        if (!split)
                                return -ENOMEM;
                        set_pmd_pte(kpte,address,mk_pte(split, ref_prot));
                        kpte_page = split;
                }
                page_private(kpte_page)++;
        } else if ((pte_val(*kpte) & _PAGE_PSE) == 0) {
                set_pte_atomic(kpte, mk_pte(page, PAGE_KERNEL));
                BUG_ON(page_private(kpte_page) == 0);
                page_private(kpte_page)--;
        } else
                BUG();
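The counter handling in the code quoted above can be modelled in isolation: setting a non-default protection increments page_private() of the page-table page, restoring PAGE_KERNEL decrements it, and a restore with the counter already at zero is the BUG_ON case. The following is a simplified sketch of just that bookkeeping, not of the page-table updates themselves. `struct kpte_model` and both functions are hypothetical names.

```c
#include <stdbool.h>

/* Simplified model of the use count the kernel keeps in page_private()
 * for a page-table page. */
struct kpte_model {
    unsigned long private_count;
};

/* Applying a non-default attribute takes a reference. */
static void model_set_attr(struct kpte_model *k)
{
    k->private_count++;
}

/* Restoring the default attribute drops a reference. Returns false in the
 * situation where the quoted kernel code would hit
 * BUG_ON(page_private(kpte_page) == 0). */
static bool model_restore_attr(struct kpte_model *k)
{
    if (k->private_count == 0)
        return false;
    k->private_count--;
    return true;
}
```

In this model a restore succeeds only if a matching set happened first, which is one way to read a failure at that BUG_ON: a page being unmapped without a recorded earlier mapping.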

Comment 39 Jiang, Yunhong 2008-11-20 14:40:26 UTC

Rik, after more investigation, we have found the reason for the panic. Currently Xen reserves a DMA buffer of at most 128M, while the on-board graphics card requires 256M of memory. With the following patch + the Xen patch + your patch from comments 30+31, everything works quite well.

diff -r b90893077a90 xen/arch/x86/domain_build.c
--- a/xen/arch/x86/domain_build.c       Thu Nov 20 07:29:20 2008 +0800
+++ b/xen/arch/x86/domain_build.c       Thu Nov 20 07:29:39 2008 +0800
@@ -139,7 +139,7 @@ static unsigned long __init compute_dom0
     if ( dom0_nrpages == 0 )
     {
         dom0_nrpages = avail;
-        dom0_nrpages = min(dom0_nrpages / 16, 128L << (20 - PAGE_SHIFT));
+        dom0_nrpages = min(dom0_nrpages / 8, 384L << (20 - PAGE_SHIFT));
         dom0_nrpages = -dom0_nrpages;
     }

There are some alternative methods to achieve this:
a) Update Xen so that when dom0 allocates a page with GFP_DMA32, it gets memory below 4G instead of memory above 4G. This requires changing how Xen sets up the mapping for dom0; however, it seems upstream does not want to accept this solution. See http://article.gmane.org/gmane.comp.emulators.xen.devel/58160 for more discussion.
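The effect of the constants changed in the domain_build.c patch in this comment can be compared against the 256M the graphics hardware is said to need. This is a user-space sketch assuming 4KB pages. `reserve_pages` and `reserve_covers` are illustrative helpers that mirror only the shape of the patched expression, with the divisor and cap passed as parameters.

```c
#include <stdbool.h>

/* Assumption: 4KB pages, as on x86. */
#define PAGE_SHIFT 12

/* min(avail / divisor, cap_mb converted to pages). The shift by
 * (20 - PAGE_SHIFT) converts megabytes to a page count. */
static unsigned long reserve_pages(unsigned long avail_pages,
                                   unsigned long divisor,
                                   unsigned long cap_mb)
{
    unsigned long share = avail_pages / divisor;
    unsigned long cap = cap_mb << (20 - PAGE_SHIFT);
    return share < cap ? share : cap;
}

/* True when a reserve, given in pages, covers `need_mb` megabytes. */
static bool reserve_covers(unsigned long reserve, unsigned long need_mb)
{
    return reserve >= (need_mb << (20 - PAGE_SHIFT));
}
```

For 4G of available memory (1048576 pages), the old constants (16, 128) are capped at 128M and fall short of 256M, while the new constants (8, 384) are capped at 384M and cover it.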

Comment 40 Rik van Riel 2008-11-21 18:29:25 UTC

With all three patches, the oops no longer happens.

Of course, X still does not actually work right, but the oops seems to be gone.

Comment 41 Rik van Riel 2008-11-21 18:47:02 UTC

Doh ... X was just misdetecting the monitor now.

Everything is working now with the 3 patches above.

Comment 45 Don Zickus 2008-12-02 22:18:48 UTC

in kernel-2.6.18-125.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5

Comment 47 John Villalovos 2008-12-09 18:10:23 UTC

Jiang, Yunhong,

Have you been able to validate the kernel mentioned in comment #45?

Comment 48 John Villalovos 2008-12-10 17:45:22 UTC

I emailed Yunhong and they said that the motherboard they were using for testing is currently broken, so they are unable to test this issue.

Comment 50 Gary Case 2009-01-06 19:56:25 UTC

I've received confirmation that this issue has definitely been fixed based on testing with our DQ35JO system.

Comment 51 errata-xmlrpc 2009-01-20 20:03:54 UTC

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-0225.html
