Bug 65384
Summary: | Radeon Mobility M6 and IBM Thinkpad X22 lockup on apm resume | |
---|---|---|---
Product: | [Retired] Red Hat Linux | Reporter: | Robert Spier <rspier>
Component: | XFree86 | Assignee: | Mike A. Harris <mharris>
Status: | CLOSED CURRENTRELEASE | QA Contact: | David Lawrence <dkl>
Severity: | high | Priority: | medium
Version: | 7.3 | CC: | davej, derrien, k.georgiou, mharris, sahai
Hardware: | i686 | OS: | Linux
Doc Type: | Bug Fix | Last Closed: | 2005-04-20 14:49:28 UTC
Description
Robert Spier
2002-05-22 22:24:50 UTC
I played around with this a little more tonight.

If I disable agpgart with 'alias agpgart off' in /etc/modules.conf, then we get some interesting results.

1- The first time I run startx, the kernel oopses, as below.
2- The text console is horked, but I can still type, and if I run startx again, X starts fine, albeit unaccelerated. I can suspend and resume just fine.

Of course, during my playing, I was also able to get the machine to spontaneously reboot on starting the X server, with various combinations of installing agpgart, removing it, and various versions of radeon.o. (Not that any of that is useful, but something is definitely odd.)

[drm:radeon_do_init_cp] *ERROR* PCI GART not yet supported for Radeon!
Unable to handle kernel NULL pointer dereference at virtual address 0000001c
 printing eip:
d88d4fff
*pde = 17934067
*pte = 00000000
Oops: 0000
radeon ds yenta_socket pcmcia_core eepro100 ipchains usb-uhci usbcore ext3 jbd
CPU:    0
EIP:    0010:[<d88d4fff>]    Not tainted
EFLAGS: 00013246

EIP is at radeon_do_cp_idle [radeon] 0x1f (2.4.18-3)
eax: 00000000   ebx: 00000000   ecx: 00000000   edx: 00000001
esi: 00000000   edi: 00000001   ebp: 00000000   esp: d4b5ff44
ds: 0018   es: 0018   ss: 0018
Process X (pid: 1112, stackpage=d4b5f000)
Stack: d4b5ff58 d88d5f85 00000000 00000000 d5297800 00000001 00000001 d5297800
       d52ad3a0 bffff940 40086442 d88d0e14 d42a65e0 d52ad3a0 40086442 bffff940
       40086442 ffffffe7 bffff940 d52ad3a0 c0146547 d42a65e0 d52ad3a0 40086442
Call Trace: [<d88d5f85>] radeon_cp_stop [radeon] 0xf5
[<d88d0e14>] radeon_ioctl [radeon] 0xe4
[<c0146547>] sys_ioctl [kernel] 0x217
[<c0108923>] system_call [kernel] 0x33

Code: 8b 43 1c 83 f8 18 77 19 6a 18 53 e8 01 15 00 00 59 58 8b 43

Ok, interesting; my laptop also has an M6 and resume has never failed for me (but it's no IBM).

Created attachment 58724
To help trace the problem, here is the lspci output for my X22 Laptop.
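
The oops in the description can be cross-checked by hand. The sketch below assumes the "Code:" bytes begin at the faulting EIP, which is how 2.4-era oops output is generally read, and decodes only the first instruction (8b 43 1c). The function name decode_mov_r32_disp8 is hypothetical and exists only for this illustration; it handles just the one instruction form seen here.

```python
# Register numbering used by the x86 ModRM byte (32-bit mode).
REG_NAMES = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_mov_r32_disp8(code, regs):
    """Decode `8b /r` (mov r32, r/m32) with an 8-bit displacement.

    Returns (dest_register, base_register, displacement, effective_address).
    Only the [base + disp8] addressing form is handled; anything else raises.
    """
    opcode, modrm, disp = code[0], code[1], code[2]
    if opcode != 0x8B:
        raise ValueError("not a mov r32, r/m32 instruction")
    mod = modrm >> 6          # addressing mode
    reg = (modrm >> 3) & 0x7  # destination register
    rm = modrm & 0x7          # base register
    if mod != 0b01 or rm == 0b100:
        raise ValueError("only [base + disp8] without a SIB byte is supported")
    if disp >= 0x80:          # disp8 is sign-extended
        disp -= 0x100
    base = REG_NAMES[rm]
    address = (regs[base] + disp) & 0xFFFFFFFF
    return REG_NAMES[reg], base, disp, address


# Values copied from the oops: first three Code: bytes and ebx.
dest, base, disp, address = decode_mov_r32_disp8(
    bytes.fromhex("8b431c"), {"ebx": 0x00000000}
)
print(f"mov {dest}, [{base}+{disp:#x}] -> reads address {address:#010x}")
```

Under those assumptions the instruction loads from ebx+0x1c with ebx equal to zero, which matches the reported NULL pointer dereference at virtual address 0000001c: radeon_do_cp_idle appears to have been handed a NULL structure pointer after the PCI GART error.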
We have the same problem with a Compaq N600c (ATI Technologies Inc Radeon Mobility M6 LY) when we are switching from console to console:

startx
CTRL+ALT+F1
CTRL+ALT+F7

We get back to a corrupted X Window screen and a locked keyboard (the pointer moves but doesn't do anything). If you can log in to the machine remotely, you see:

 PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
1287 root      24  -1 60424  16M  3004 R <  99.4  3.2   9:17 X

My lspci configuration is almost identical to sahai's, except for a different location for the IDE controller's memory.

--- /tmp/x22.lspci.txt Tue May 28 20:42:49 2002
+++ /tmp/robert.lspci.txt Tue May 28 22:35:03 2002
@@ -69,7 +69,7 @@
         Region 2: I/O ports at 0170 [size=8]
         Region 3: I/O ports at 0374
         Region 4: I/O ports at 1860 [size=16]
-        Region 5: Memory at 28000000 (32-bit, non-prefetchable) [size=1K]
+        Region 5: Memory at 18000000 (32-bit, non-prefetchable) [size=1K]
 
 00:1f.3 SMBus: Intel Corp. 82801CA/CAM SMBus (rev 01)
         Subsystem: IBM: Unknown device 0220

I wonder if busmastering gets disabled upon APM resume. This has happened with other hardware in the past. If the laptop has busmastering enabled for video and suspends, then comes back with it disabled, the machine most likely will hang.

The kernel ought to detect that and printk something... not that you can see it in X.
Passing resume=force on the kernel command line will force the busmaster bit back on.

Updated to 2.4.18-4.
Added resume=force to kernel command line.
Updated to latest BIOS.

Had conversation with mharris:

<mharris> Rbrt: It depends on what the real problem turns out to be. If it is indeed bus mastering, it is a BIOS flaw on your machine.
<mharris> In that case you may need to switch to a VT before suspending, and after resuming, run a shell script which enables busmastering again, then switch back to X.
<mharris> setpci <somereallyobscureoptions>
<mharris> setpci -s 1:0.0 4.L=0187
<mharris> Something like that is supposed to do it.
Replace 1:0.0 with the bus ID of your video card.

The lspci -s 1:0.0 -x output has 0x87 at offset 4 already, so I believe bus mastering is on.

But this doesn't help with the problem. Even if I switch to a text VT before suspending, the machine never comes back up. It just sits in "trying to resume" mode, with the suspend light blinking.

Should mharris be added as a CC to this ticket? Do you (RH) have an IBM contact?

Suspend/Resume works fine if I do not enable DRI (i.e. comment out Load "dri" in XF86Config-4).

Another user has reported this also, and it seems his AGP chipset is unsupported, so agpgart won't load. The Radeon driver then tries to use pcigart and fails. pcigart support is enabled in X, but "unsupported". The idea being: if it works, great; if not, no harm done. Apparently our kernel may have missed getting the kernel side of this support though, so it might not work anyway.

The unknown AGP chipset is an Ali one (device id: 1671).

options agpgart agp_try_unsupported=1

fixes the startup crash, but X still gets stuck in a loop somewhere (99% CPU) after a switch to console and back. The laptop is an hp xt6200, btw.

The ix86 kernels don't have PCIGART enabled for the RADEON driver. PCIGART_ENABLED is undefined except on Alphas (drivers/char/drm/radeon_cp.c).

The Thinkpads have i830-flavor AGP GARTs, which are mostly supported (iirc).

Related to or possible duplicate of bug 62067.

*** Bug 62067 has been marked as a duplicate of this bug. ***

Just wanted to report that this bug is still here in the latest Red Hat 9. I am using kernel-2.4.20-20.9 and the XFree86 that comes with RH9 with my IBM X22. The same workaround, disabling 3D acceleration, continues to avoid the lockups on suspend.

Our kernels do have PCIGART enabled for Radeon, unless someone disabled it without letting me know about it. I've CC'd some of our kernel guys for them to comment on it.

PCIGART on radeon should be enabled in Red Hat Linux 9 and later, I believe.
Keep in mind this means "enabled" and not "supported". The difference is that it is provided as-is: if it happens to work for someone, that's great, but if it does not work, we don't consider it a bug. However, if someone debugs the problems they have, solves them, and submits a patch for review, it's possible we might apply their patch to a future kernel build if it doesn't risk any regression.

Back to this particular bug/issue though. This problem seems likely to be something that Charl Botha's DRI-resume patches would resolve; as I understand it, they are a workaround for some broken BIOSes out there. The dri-resume patches are intrusive to both XFree86 and the kernel, however, and there is no intention of applying them to our XFree86 4.3.0 or kernel.

XFree86 4.4.0, once released, will support Charl's dri-resume patch, so this problem will likely be resolved automatically in a future Red Hat Linux release when 4.4.0 gets integrated.

Deferring until 4.4.0 is released, or developmental builds are available in our rawhide tree for future OS development.

Since this bugzilla report was filed, there have been several major updates to the X Window System, which may resolve this issue. Users who have experienced this problem are encouraged to upgrade to the latest version of Fedora Core, which can be obtained from:

http://fedora.redhat.com/download

If this issue turns out to still be reproducible in the latest version of Fedora Core, please file a bug report in the X.Org bugzilla located at http://bugs.freedesktop.org in the "xorg" component.

Once you've filed your bug report to X.Org, if you paste the new bug URL here, Red Hat will continue to track the issue in the centralized X.Org bug tracker, and will review any bug fixes that become available for consideration in future updates.

Setting status to "CURRENTRELEASE".
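
For anyone revisiting the bus-mastering theory from this report: the check "0x87 at offset 4" refers to the PCI command register, where bit 2 is the Bus Master enable flag. Here is a minimal sketch of that decoding, assuming the standard PCI command register bit layout. The names PCI_COMMAND_BITS, decode_pci_command, and bus_master_enabled are illustrative only and do not come from lspci, setpci, or the kernel.

```python
# PCI command register flags (bit position -> name), following the
# conventional PCI configuration-space layout. Assumed, not taken from
# this report.
PCI_COMMAND_BITS = {
    0: "I/O Space",
    1: "Memory Space",
    2: "Bus Master",
    3: "Special Cycles",
    4: "Memory Write and Invalidate",
    5: "VGA Palette Snoop",
    6: "Parity Error Response",
    7: "Stepping / Wait Cycle Control",
    8: "SERR# Enable",
    9: "Fast Back-to-Back",
}

BUS_MASTER_MASK = 1 << 2


def decode_pci_command(value):
    """Return the names of all flags set in a PCI command register value."""
    return [
        name
        for bit, name in sorted(PCI_COMMAND_BITS.items())
        if value & (1 << bit)
    ]


def bus_master_enabled(value):
    """True if the Bus Master bit (bit 2) is set."""
    return bool(value & BUS_MASTER_MASK)


# 0x87 is the byte seen at offset 4 in the lspci -x output quoted above.
print(bus_master_enabled(0x87), decode_pci_command(0x87))
# 0x0187 is the low 16 bits of the value in the suggested setpci command.
print(bus_master_enabled(0x0187), decode_pci_command(0x0187))
```

Both values have bit 2 set, which agrees with the reporter's reading that bus mastering was already on after resume and was probably not the cause of the hang.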