Bug 452756

Summary: panic at intel_i915_configure() when amount of shared video memory set to 1M
Product: Red Hat Enterprise Linux 5
Component: kernel
Version: 5.2
Hardware: i386
OS: Linux
Status: CLOSED WONTFIX
Severity: medium
Priority: medium
Target Milestone: rc
Target Release: ---
Keywords: Regression
Reporter: Flavio Leitner <fleitner>
Assignee: Prarit Bhargava <prarit>
QA Contact: Red Hat Kernel QE team <kernel-qe>
CC: airlied, ajax, akarlsso, dzickus, gasmith, herrold, jvillalo, lkundrak, peterm, tao, zhenyu.z.wang
Doc Type: Bug Fix
Last Closed: 2009-10-07 16:57:40 UTC
Bug Blocks: 525215, 533192
Attachments (description, flags):
boot-with-nommconf.log showing oops backtrace, flags: none
AGP patch, flags: none
xf86 patch, flags: none
xorg-x11-drv-i810 i386 rpm with patch, flags: none
i686 kernel with patch, flags: none

Description Flavio Leitner 2008-06-24 20:35:11 UTC
Description of problem:

Happens at boot on a system that requires RHEL5.[12] to be booted with
pci=nommconf and that has an Intel Corporation 82G33/G31 Express Integrated
Graphics Controller. One other factor plays into this: in the BIOS, you have
to set the amount of RAM reserved for video to 1M.

What happens is that the system crashes (have a look at boot-with-nommconf.log,
attached), and that is not expected. Discussion with Adam Jackson again suggests
that the AGP kernel driver could handle this better (not blow the kernel up, for
example).

There is a BZ, 445592, for Fedora 9 which describes this exact problem. The
backtrace is different because of kernel differences.

Version-Release number of selected component (if applicable):
RHEL 5.2

How reproducible:
Always

Steps to Reproduce:
To reproduce, you probably need this type of gfx chip and a BIOS that will
let you set the amount of RAM to 1MB. This may be difficult to obtain, but
the vendor should be able to help us test.

Additional Info:
Worth pointing out is that this may be considered a regression as RHEL4 
does not exhibit the problem.

Comment 1 Flavio Leitner 2008-06-24 20:35:11 UTC
Created attachment 310182 [details]
boot-with-nommconf.log showing oops backtrace

Comment 2 Flavio Leitner 2008-06-24 20:36:35 UTC
Backtrace:
agpgart: No pre-allocated video memory detected.
BUG: unable to handle kernel paging request at virtual address f887fffc
 printing eip:
c0540aea
*pde = 1faf0067
Oops: 0002 [#1]
SMP
last sysfs file:
Modules linked in:
CPU:    1
EIP:    0060:[<c0540aea>]    Not tainted VLI
EFLAGS: 00010287   (2.6.18-53.el5 #1)
EIP is at intel_i915_configure+0xce/0xee
eax: f887fffc   ebx: 00000000   ecx: fffffffc   edx: 03100001
esi: c0689a38   edi: f8826000   ebp: 00000008   esp: dfad7f08
ds: 007b   es: 007b   ss: 0068
Process swapper (pid: 1, ti=dfad7000 task=dfad6aa0 task.ti=dfad7000)
Stack: e0000008 01140008 00000000 dfbf8840 c053bb96 c068966c c0689640 dffa7c00
       c054be11 c04ed2ac dffa7c48 dffa7c48 c068966c c054bd64 dffa7c48 dffc9200
       c068966c c054be55 00000000 c0681820 c068966c c054b862 c068193c c0681940
Call Trace:
 [<c053bb96>] agp_add_bridge+0x165/0x26f
 [<c054be11>] __driver_attach+0x0/0x6b
 [<c04ed2ac>] pci_device_probe+0x36/0x57
 [<c054bd64>] driver_probe_device+0x42/0x92
 [<c054be55>] __driver_attach+0x44/0x6b
 [<c054b862>] bus_for_each_dev+0x37/0x59
 [<c054bcce>] driver_attach+0x11/0x13
 [<c054be11>] __driver_attach+0x0/0x6b
 [<c054b56a>] bus_add_driver+0x64/0xfd
 [<c04ed3da>] __pci_register_driver+0x47/0x63
 [<c06e75a5>] init+0x17d/0x24a
 [<c0404dee>] ret_from_fork+0x6/0x1c
 [<c06e7428>] init+0x0/0x24a
 [<c06e7428>] init+0x0/0x24a
 [<c0405c3b>] kernel_thread_helper+0x7/0x10
 =======================
Code: 7b c0 8b 40 04 83 78 14 00 74 34 8b 1d b8 e8 7b c0 8d 0c 9d 00 00 00 00 eb
 20 a1 00 e7 7b c0 43 8b 50 24 89 c8 03 05 b4 e8 7b c0 <89> 10 89 c8 83 c1 04 03
 05 b4 e8 7b c0 8b 00 3b 5e 04 7c db e8
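
One way to read the dump above (an interpretation, not something confirmed in this report): the marked bytes `<89> 10` are a 32-bit store of edx to the address in eax, and the faulting address in eax (f887fffc) is exactly ecx (fffffffc, i.e. -4) added to f8880000. That is what a store to entry -1 of a table of 4-byte entries would produce. The sketch below only reproduces that address arithmetic; `gtt_entry_addr` is a hypothetical helper name, and the base f8880000 is inferred from the registers rather than stated anywhere in the log.

```c
#include <stdint.h>

/*
 * Hypothetical helper: address touched by a 32-bit store to entry `index`
 * of a table of 4-byte entries mapped at `table_base`.
 * The arithmetic wraps modulo 2^32, as pointer arithmetic does on i386.
 */
static uint32_t gtt_entry_addr(uint32_t table_base, int32_t index)
{
    return table_base + (uint32_t)index * 4u;
}
```

With a base of 0xf8880000 and an index of -1, this yields the faulting address from the oops; an index of 0 yields the base itself.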


Comment 4 Flavio Leitner 2008-06-24 20:41:39 UTC
The kernel does not crash if you pass pci=nommconf and set the shared video 
memory to 8MB. 


Comment 6 RHEL Program Management 2008-06-24 20:58:02 UTC
This bugzilla has Keywords: Regression.  

Since no regressions are allowed between releases, 
it is also being proposed as a blocker for this release.  

Please resolve ASAP.

Comment 7 Adam Jackson 2008-06-25 14:22:36 UTC
(In reply to comment #4)
> The kernel does not crash if you pass pci=nommconf and set the shared video 
> memory to 8MB. 

Surely nommconf is irrelevant here?  This isn't about PCI config space at all,
it's about not having enough GART space.

Comment 8 Sirius Rayner-Karlsson 2008-06-25 14:29:50 UTC
re comment#7

Correct. There are other reasons why you'd need pci=nommconf that are unrelated
to GART space, but it is still required in order to get far enough in the boot
process that you can hit the GART problem.


Comment 17 Flavio Leitner 2008-08-18 16:48:51 UTC
Prarit, I know that it panics in RHEL_5_2 and RHEL_5_1, though I'm not sure
about RHEL_5_0. Let me check with Anders, then I'll get back to you.
Anyway, the regression flag was set because RHEL_4_X didn't exhibit
this issue.

Flavio

Comment 18 Prarit Bhargava 2008-08-19 12:54:49 UTC
Unfortunately, we do not have G33 hardware internally at RH.  I'm going to try to reproduce on another chipset.

P.

Comment 26 Prarit Bhargava 2008-08-21 12:07:53 UTC
Adding John, jvillalo, the onsite Intel engineer.

John -- here is the situation.  This customer is using an Intel G33 video chipset, which allows the RAM reserved from memory to be set to 1M.  When this is done, the system panics (see comment #2).

A patch went upstream, f443675affe3f16dd428e46f0f7fd3f4d703eeab, to fix this issue but it was later pulled as it broke Intel's DDX.

Intel has not attempted to resubmit the patch -- can you ping internally at Intel to see if they are going to resubmit a patch?

Thanks,

P.

Comment 27 John Villalovos 2008-08-21 13:58:08 UTC
Zhenyu,

Can you help with this?

John

Comment 28 Prarit Bhargava 2008-08-22 12:16:54 UTC
Another option is to do the following:

1.  Check if we are on this particular HW via a DMI check.
2.  If we are on this HW, execute the code in the above (now rejected) commit.

This has, however, a significant side effect -- users would not be able to use X on systems configured in this manner.

Could someone on the customer's side comment on this suggestion and whether or not the solution would be acceptable?

P.

Comment 29 John Villalovos 2008-08-22 13:28:09 UTC
Zhenyu had difficulty resetting their password, so I am posting for them:

------------------------------------
The problem is mine, and it is a known issue. It was fixed upstream in both
the kernel and the X intel video driver, but because the fixed intel video
driver had not been released at that time, people saw compat issues when
using the stable released intel driver. So the fix was reverted in the kernel
and, for compat reasons, later reverted in the video driver as well.

I'm looking into some way to fix it; maybe there's some version number
we can use.

Here are two patches for the kernel agp and intel video drivers. OSV could
include them now, and I'll work on getting them upstream.

-- Open Source Technology Center, Intel ltd. $gpg --keyserver wwwkeys.pgp.net --recv-keys 4D781827



commit f443675affe3f16dd428e46f0f7fd3f4d703eeab
Author: Zhenyu Wang <zhenyu.z.wang>
Date:   Tue Sep 11 15:23:57 2007 -0700

    intel_agp: fix stolen mem range on G33
    
    G33 GTT stolen memory is below graphics data stolen memory and be seperate,
    so don't subtract it in stolen mem counting.
    
diff --git a/drivers/char/agp/intel-agp.c b/drivers/char/agp/intel-agp.c
index 2c9ca2c..581f922 100644
--- a/drivers/char/agp/intel-agp.c
+++ b/drivers/char/agp/intel-agp.c
@@ -506,6 +506,11 @@ static void intel_i830_init_gtt_entries(void)
 			break;
 		}
 	} else {
+		/* G33's GTT stolen memory is separate from gfx data
+		 * stolen memory.
+		 */
+		if (IS_G33)
+			size = 0;
 		switch (gmch_ctrl & I830_GMCH_GMS_MASK) {
 		case I855_GMCH_GMS_STOLEN_1M:
 			gtt_entries = MB(1) - KB(size);



commit 2a8592f2ebcba86b1127aa889155d58a3dc186ca
Author: Zhenyu Wang <zhenyu.z.wang>
Date:   Wed Sep 5 14:52:56 2007 +0800

    Fix G33 GTT stolen mem range
    
    G33 GTT table lives in seperate stolen mem with
    graphics data stolen mem.

diff --git a/src/i830_driver.c b/src/i830_driver.c
index 9fa231d..983be76 100644
--- a/src/i830_driver.c
+++ b/src/i830_driver.c
@@ -483,6 +483,9 @@ I830DetectMemory(ScrnInfoPtr pScrn)
    range = gtt_size + 4;
 
    if (IS_I85X(pI830) || IS_I865G(pI830) || IS_I9XX(pI830)) {
+      /* G33 has seperate GTT stolen mem */
+      if (IS_G33CLASS(pI830))
+	  range = 0;
       switch (gmch_ctrl & I830_GMCH_GMS_MASK) {
       case I855_GMCH_GMS_STOLEN_1M:
 	 memsize = MB(1) - KB(range);
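
The arithmetic that both patches change can be sketched in isolation. This is a minimal, hedged illustration rather than the driver code: `stolen_gtt_entries` is a hypothetical helper, and the 1028 KB figure used in the note below (a 1 MB GTT plus 4 KB) is an assumption about what `size` holds on this hardware; the report itself does not state it.

```c
#include <assert.h>

#define KB(x) ((x) * 1024)
#define MB(x) (KB(KB(x)))

/*
 * Hypothetical helper: number of 4 KB entries covered by stolen memory.
 *   stolen_mb   - BIOS pre-allocated video memory, in MB
 *   reserved_kb - amount subtracted for the GTT ("size" in the AGP patch,
 *                 "range" in the X driver patch)
 * A negative result means the subtraction exceeded the stolen memory.
 */
static int stolen_gtt_entries(int stolen_mb, int reserved_kb)
{
    assert(stolen_mb > 0 && reserved_kb >= 0);

    int bytes = MB(stolen_mb) - KB(reserved_kb);
    return bytes / KB(4);
}
```

Under that assumption, `stolen_gtt_entries(1, 1028)` is negative for the 1M BIOS setting, while `stolen_gtt_entries(8, 1028)` stays positive, which matches comment 4's observation that 8MB does not crash. Setting the subtracted amount to 0, as both patches do for G33, gives `stolen_gtt_entries(1, 0)`, a positive count.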

Comment 30 John Villalovos 2008-08-22 13:30:18 UTC
Created attachment 314802 [details]
AGP patch

Comment 31 John Villalovos 2008-08-22 13:31:07 UTC
Created attachment 314803 [details]
xf86 patch

Comment 33 Prarit Bhargava 2008-08-27 10:42:43 UTC
Could the customer please test the patches in comment #29 and report back whether or not they fix the customer's problem?

P.

Comment 36 Prarit Bhargava 2008-08-28 18:11:53 UTC
Anders, I am attaching a new kernel rpm and a new xorg-x11-drv-i810 package (which contains the i830 driver).  The xorg-x11-drv-i810 package should be updated using 'rpm -Uvh --force'.

1) Boot the system with a "known good" configuration, install, and confirm that
the system is "functional" wrt this bug.

ie) 

5.0 installation with text mode
====================================
BIOS Default Values Loaded
BIOS Memory Settings: AUTO / 376M

Params: linux nofb pci=nommconf

Result: Successful full text mode installation, and subsequent boot.

2) Re-configure the BIOS for 1M of shared video RAM.

ie)

5.0 installation with graphics mode
========================================
BIOS Default Values Loaded
BIOS Memory Settings Manually Changed to:
- DVMT 4.0 Mode         = Fixed
- Fixed Graphics Memory = 248M
- Pre-alloc Mem Size    = 1M
- IGD Memory Size       = 256M

3) Confirm that the above configuration is failing when booting.

4) Switch back to the "known good" configuration, and reboot.

5) Install both new rpms provided.  Reboot.

6) Again re-configure the BIOS for 1M of shared video RAM as in step 2)

7) Reboot the new kernel -- and, of course, let us know the results.

Comment 37 Prarit Bhargava 2008-08-28 18:14:15 UTC
Created attachment 315272 [details]
xorg-x11-drv-i810 i386 rpm with patch

Anders, please use this binary rpm to test.

Comment 38 Prarit Bhargava 2008-08-28 18:16:18 UTC
Created attachment 315273 [details]
i686 kernel with patch

Comment 42 Issue Tracker 2008-09-19 13:52:48 UTC


This event sent from IssueTracker by gasmith 
 issue 186310
it_file 157986

Comment 46 RHEL Program Management 2009-02-16 15:26:42 UTC
Updating PM score.

Comment 51 RHEL Program Management 2009-10-07 16:57:40 UTC
Development Management has reviewed and declined this request.  You may appeal
this decision by reopening this request.