Bug 531825

Summary: KMS:RV250|M9:9000:AGP Suspend/Resume fails (ThinkPad T41)
Product: [Fedora] Fedora
Reporter: Robert de Rooy <rderooy>
Component: xorg-x11-drv-ati
Assignee: Jerome Glisse <jglisse>
Status: CLOSED WONTFIX
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Priority: low
Version: 16
CC: airlied, awilliam, bjaglin, braney.bugzilla4redhat, bugs+redhat, dcantrell, doodle62, jamundso, jglisse, jlaska, maciej.grela, notting, pebolle, rick.hendricksen, sassmann, tomash.brechko, wirawan0, xgl-maint
Keywords: Patch, Triaged
Hardware: i686
OS: Linux
See Also: http://bugzilla.kernel.org/show_bug.cgi?id=16140
Whiteboard: card_R200/m
Doc Type: Bug Fix
Last Closed: 2013-02-13 21:40:55 EST
Attachments:
Description                                          Flags
syslog                                               none
lspci -vvv after resume                              none
RV250 lspci -vvvxxx before suspend                   none
RV250 lspci -vvvxxx after suspend                    none
xorg log with suspend/resume cycle                   none
syslog from -127 kernel                              none
dmesg log file from 2.6.32.9-64.fc12.i686            none
kernel messages in /var/log/messages after resume    none
drm messages in dmesg after resume                   none
radeon: add AGPMode 1 quirk for RV250                none

Description Robert de Rooy 2009-10-29 11:34:17 EDT
Created attachment 366644 [details]
syslog

Description of problem:
Resume fails on a ThinkPad T41 with ATI RV250 running F12 Beta. I upgraded to the latest Radeon driver, X server and kernel from Koji but it made no difference.
It did work with F11 and KMS, so this is a regression.

I am seeing messages like these in the log on resume:
radeon 0000:01:00.0: PCI INT A -> Link[LNKA] -> GSI 11 (level, low) -> IRQ 11
[drm] GPU reset succeed (RBBM_STATUS=0x00000140)
[drm] radeon: cp idle (0x02000000)
[drm] radeon: ring at 0x00000000D0000000
[drm:r100_ring_test] *ERROR* radeon: ring test failed (sracth(0x15E4)=0xCAFEDEAD)
[drm:r100_cp_init] *ERROR* radeon: cp isn't working (-22).
radeon 0000:01:00.0: failled initializing CP (-22).
[drm] LVDS-13: set mode 1400x1050 1e

After which I get a continuous stream of these errors:
[drm:radeon_cs_ioctl] *ERROR* Faild to schedule IB !
[drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(6).

My LVDS display is sometimes garbled, other times black with a mouse cursor, but I can switch to a VT.

Version-Release number of selected component (if applicable):
kernel-2.6.31.5-104.fc12.i686
xorg-x11-drv-ati-6.13.0-0.10.20091006git457646d73.fc12.i686
xorg-x11-server-Xorg-1.7.1-1.fc12.i686  

How reproducible:
every time

Steps to Reproduce:
1. Hibernate
2. Resume
  
Actual results:
black screen with mouse cursor, or corrupted display

Expected results:
Successful resume

Additional info:
It does work when I disable KMS by booting with nomodeset
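
The error messages quoted above form a recognizable signature, so a saved syslog or dmesg dump can be checked for it mechanically. The following is a minimal sketch, not part of the original report; the function name `classify` and the signature labels are hypothetical, and the patterns are taken only from the messages quoted in this description.

```python
import re

# Patterns are taken from the kernel messages quoted in the description.
SIGNATURES = {
    "ring_test_failed": re.compile(r"\*ERROR\* radeon: ring test failed"),
    "cp_init_failed": re.compile(r"\*ERROR\* radeon: cp isn't working \(-?\d+\)"),
    "ib_schedule_failed": re.compile(r"\*ERROR\* radeon: couldn't schedule IB\(\d+\)"),
}


def classify(log_text):
    """Count how often each known failure signature occurs in log_text."""
    counts = {name: 0 for name in SIGNATURES}
    for line in log_text.splitlines():
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


# Lines copied from the description of this bug.
SAMPLE = """\
[drm:r100_ring_test] *ERROR* radeon: ring test failed (sracth(0x15E4)=0xCAFEDEAD)
[drm:r100_cp_init] *ERROR* radeon: cp isn't working (-22).
[drm:radeon_cs_ioctl] *ERROR* Faild to schedule IB !
[drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(6).
"""

if __name__ == "__main__":
    print(classify(SAMPLE))
```

A ring test failure followed by a growing count of IB scheduling failures matches the behaviour described here: the command processor never comes back after resume, so every later submission from X fails.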
Comment 1 Robert de Rooy 2009-10-29 11:35:35 EDT
Created attachment 366646 [details]
lspci -vvv after resume
Comment 2 Dave Airlie 2009-11-05 00:48:02 EST
I've just started a kernel build, 2.6.31.5-121, in Koji.

This should hopefully fix the AGP suspend/resume issues. Can you test it? If you need help getting the kernel, let me know.
Comment 3 Adam Williamson 2009-11-05 03:38:40 EST
CCing the usual suspects: Dave would like to ask whether we should consider this a blocker and jam it in today/tomorrow. The impact is that, without this fix, every AGP chipset Radeon adapter (that includes integrated adapters and mobility adapters that use an AGP bus, not just AGP slot cards, I think) will fail to resume from suspend. Obviously, if we don't put it in as a blocker fix, Dave will ship it as a 0-day update instead.

-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers
Comment 4 cblaauw 2009-11-05 04:01:24 EST
*** Bug 531598 has been marked as a duplicate of this bug. ***
Comment 5 Stefan Assmann 2009-11-05 04:11:59 EST
I'm seeing similar behaviour on a Lenovo T500 (black screen with mouse cursor).
Comment 6 Adam Williamson 2009-11-05 04:27:13 EST
See my comment: we're not surprised that this affects lots of systems. If you could test the fixed kernel, that would be very helpful. It's at:

http://koji.fedoraproject.org/koji/buildinfo?buildID=139823

Thanks!

Comment 7 Robert de Rooy 2009-11-05 04:55:17 EST
I just tried the -122 kernel build. After a suspend/resume cycle I had a black screen with a working mouse cursor. syslog contained the same errors as before.

In other words, it made no difference here.
Comment 8 Matěj Cepl 2009-11-05 12:19:52 EST
Since this bugzilla report was filed, there have been several major updates in various components of the Xorg system, which may have resolved this issue. Users who have experienced this problem are encouraged to upgrade their system to the latest version of their packages (at least F12Beta, but even better if the very latest versions).

Please, if you experience this problem on an up-to-date system, let us know in a comment on this bug, or tell us whether the upgraded system works for you.

If you won't be able to reply in one month, I will have to close this bug as INSUFFICIENT_DATA. Thank you.

[This is a bulk message for all open Fedora Rawhide Xorg-related bugs. I'm adding myself to the CC list for each bug, so I'll see any comments you make after this and do my best to make sure every issue gets proper attention.]
Comment 9 Adam Williamson 2009-11-05 12:29:12 EST
mcepl: from the initial report:

"Resume fails on a ThinkPad T41 with ATI RV250 running F12 Beta. I upgraded to
the latest Radeon driver, X server and kernel from Koji but it made no
difference."

Dave is aware of this and knows that it's a real problem.

Comment 10 cblaauw 2009-11-05 17:08:10 EST
My machine does resume now with the -122 kernel. It's an Acer Travelmate 800 with Radeon rv250.

One thing is still there: after logging in with GDM the mouse pointer is invisible. Changing to a VT and back to X restores the pointer. From that point on the pointer is always OK.
Comment 11 Robert de Rooy 2009-11-06 04:19:19 EST
To make sure I did not screw up, I tested it again today after installing the latest updates, such as the 1.7.1-7 Xserver. It made no difference, it still fails the same way.

I also tried without forcing HPET, but that also made no difference.
Comment 12 Robert de Rooy 2009-11-06 05:11:00 EST
I tried with radeon.agpmode=-1 and with that I can suspend/resume successfully. So somehow AGP is still not being restored properly.
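
When comparing test results it helps to confirm which `radeon.agpmode` value a given boot actually used. The sketch below parses a kernel command line string such as the contents of `/proc/cmdline`. The function names are hypothetical, the "last occurrence wins" behaviour is an assumption, and the meaning of -1 (AGP disabled, card driven as PCI) is taken from later comments in this report.

```python
def radeon_agpmode(cmdline):
    """Return the radeon.agpmode value found in a kernel command line, or None."""
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if key == "radeon.agpmode" and sep:
            try:
                value = int(val)  # assumption: a later occurrence overrides an earlier one
            except ValueError:
                pass  # ignore malformed values
    return value


def describe(mode):
    """Human-readable meaning of an agpmode value, as used in this report."""
    if mode is None:
        return "driver default"
    if mode == -1:
        return "AGP disabled, card driven as PCI"
    return f"AGP {mode}x requested"
```

On a running system this could be called as `describe(radeon_agpmode(open("/proc/cmdline").read()))`.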
Comment 13 Jerome Glisse 2009-11-06 10:06:27 EST
Robert, please attach the output of:
sudo lspci -vvvxxx > rv250-before-suspend
sudo lspci -vvvxxx > rv250-after-resume

Run the first command before suspend and the second after resume. Thanks.
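
Once both dumps exist, the registers that changed across the suspend/resume cycle can be isolated with a plain text diff. This is a minimal sketch using Python's standard `difflib`; the function name `changed_lines` is hypothetical, and the file names are the ones from the commands above.

```python
import difflib


def changed_lines(before_text, after_text):
    """Return only the removed ('-') and added ('+') lines between two lspci dumps."""
    diff = list(difflib.unified_diff(
        before_text.splitlines(),
        after_text.splitlines(),
        fromfile="rv250-before-suspend",
        tofile="rv250-after-resume",
        lineterm="",
        n=0,  # no context lines, only the differences
    ))
    # The first two entries are the '---' / '+++' file headers; skip them,
    # then drop the '@@' hunk markers.
    return [line for line in diff[2:] if line[:1] in ("+", "-")]
```

For example, `changed_lines(open("rv250-before-suspend").read(), open("rv250-after-resume").read())` lists each config-space row or capability line that differs after resume.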
Comment 14 Jerome Glisse 2009-11-06 10:43:33 EST
*** Bug 493254 has been marked as a duplicate of this bug. ***
Comment 15 Robert de Rooy 2009-11-08 02:04:15 EST
Created attachment 368008 [details]
RV250 lspci -vvvxxx before suspend
Comment 16 Robert de Rooy 2009-11-08 02:05:26 EST
Created attachment 368009 [details]
RV250 lspci -vvvxxx after suspend
Comment 17 Matěj Cepl 2009-11-08 05:17:42 EST
Tons of messages

Oct 29 16:18:07 t41 kernel: [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(6).
Oct 29 16:18:07 t41 kernel: [drm:radeon_cs_ioctl] *ERROR* Faild to schedule IB !

in syslog.

Reporter, could we get /var/log/Xorg.0.log attached to this bug report as well, please?

Thank you in advance
Comment 18 Robert de Rooy 2009-11-09 02:40:49 EST
Created attachment 368127 [details]
xorg log with suspend/resume cycle

kernel-2.6.31.5-122.fc12.i686
xorg-x11-drv-ati-6.13.0-0.10.20091006git457646d73.fc12.i686
xorg-x11-server-Xorg-1.7.1-7.fc12.i686
Comment 19 Maciej Grela 2009-11-09 14:43:45 EST
Works for my RV250 with the -127 kernel. Awesome work, guys!
Comment 20 Robert de Rooy 2009-11-10 03:37:32 EST
Created attachment 368343 [details]
syslog from -127 kernel

While looking at the logs, I noticed that just before the ring test error on resume there is another error:

pm_op(): pci_pm_resume+0x0/0x68 returns -16
PM: Device 0000:00:00.0 failed to resume: error -16
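
The negative numbers in these messages are kernel error codes. A small sketch for turning them into symbolic names, assuming the usual Linux errno numbering (the helper name `decode` and the regular expression are hypothetical):

```python
import errno
import os
import re

# Matches "error -16", "returns -16" and "(-22)" as they appear in this report.
ERR_RE = re.compile(r"(?:error |returns |\()(-\d+)\b")


def decode(line):
    """Return (symbolic name, description) for the first error code in line, or None."""
    match = ERR_RE.search(line)
    if match is None:
        return None
    code = -int(match.group(1))
    return errno.errorcode.get(code, "unknown"), os.strerror(code)
```

Under that assumption, -16 decodes to EBUSY (the host bridge 0000:00:00.0 reported itself busy on resume) and the -22 in the CP initialization messages decodes to EINVAL.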
Comment 21 Robert de Rooy 2009-11-10 03:41:01 EST
Obviously that should have read -127 kernel. Call me dyslexic.
Comment 22 Tomash Brechko 2009-11-14 05:15:11 EST
I experience the same problem with RC410 [Radeon Xpress 200M]. In my case suspend/resume is not involved. When I boot the system, gdm starts fine, I log in, and after several seconds I get a blank screen. /var/log/messages contains:

kernel: [drm:radeon_fence_wait] *ERROR* fence(ffff880064d61640:0x00000307) 509ms timeout going to reset GPU
kernel: [drm] CP reset succeed (RBBM_STATUS=0x00000140)
kernel: [drm] radeon: cp idle (0x10000000)
kernel: [drm] radeon: ring at 0x0000000080000000
kernel: [drm:r100_ring_test] *ERROR* radeon: ring test failed (sracth(0x15E4)=0xCAFEDEAD)
kernel: [drm:r100_cp_init] *ERROR* radeon: cp isn't working (-22).
kernel: [drm:r300_gpu_reset] *ERROR* Failed to reset GPU (RBBM_STATUS=0x80010140)

and so on (the system is x86_64). This happens with up-to-date packages (as listed above). Interestingly, the problem is *not* present in kernel-2.6.31.1-56.fc12.x86_64, which was originally installed with Fedora 12 Beta: when I switch to this kernel after all updates, the system runs smoothly.
Comment 23 Jerome Glisse 2009-11-16 04:13:12 EST
Tomash, you are seeing the same message but your bug is different; you are more likely hit by:
https://bugzilla.redhat.com/show_bug.cgi?id=532308

This bug is about AGP not working on resume.
Comment 24 Bug Zapper 2009-11-16 09:36:00 EST
This bug appears to have been reported against 'rawhide' during the Fedora 12 development cycle.
Changing version to '12'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 25 Dave Airlie 2009-11-17 05:39:27 EST
http://kojipkgs.fedoraproject.org/packages/kernel/2.6.31.6/134.fc12/

care to try that?
Comment 26 Robert de Rooy 2009-11-17 05:53:14 EST
already did. No change.
Comment 27 Robert de Rooy 2009-11-20 04:50:54 EST
I upgraded today to

kernel-2.6.31.6-142.fc12.i686
xorg-x11-drv-ati-6.13.0-0.12.20091119git437113124.fc12.i686

The kernel update did not seem to cause any change, but the radeon driver
update did cause one: instead of getting a black background with a
working mouse cursor, I now get the Fedora wallpaper with a working mouse
cursor.

syslog is still getting filled with the same errors as before.
Comment 28 bjaglin 2009-12-16 17:54:19 EST
Still reproducible on a T40p using the latest drm-radeon-testing and ddx from git.

01:00.0 VGA compatible controller: ATI Technologies Inc Radeon RV250 [Mobility FireGL 9000] (rev 02)
Comment 29 bjaglin 2010-01-06 20:01:06 EST
Booting with radeon.agpmode=-1 works around the problem at the price of switching to PCI.
Comment 30 Robert de Rooy 2010-01-14 05:18:06 EST
No change with:

kernel-2.6.32.3-24.fc12.i686
libdrm-2.4.17-1.fc12.i686
xorg-x11-drv-ati-6.13.0-0.20.20091221git4b05c47ac.fc12.i686
xorg-x11-server-Xorg-1.7.4-1.fc12.i686
Comment 31 bjaglin 2010-02-25 05:28:40 EST
No change with 2.6.33 or drm-radeon-testing. Rest is git master.

Is there some more investigation that can be done? This is really a showstopper for switching to KMS with RV250 AGP, and as far as I know KMS left staging in 2.6.33.
Comment 32 Robert de Rooy 2010-03-04 09:48:16 EST
One more strange thing that may help.

When I start the computer and suspend immediately at the GDM screen, it seemingly resumes normally. However, when you then actually try to log in, you have the same problem as described previously: you just get the Fedora wallpaper and a (working) mouse cursor and nothing else.
Comment 33 Robert de Rooy 2010-03-04 10:04:32 EST
Created attachment 397825 [details]
dmesg log file from 2.6.32.9-64.fc12.i686

Well, perhaps this helps. With this sequence I have been able to trigger a WARNING not seen before. This happens after the usual errors, so I'm not sure of the value.

Mar  4 15:41:04 t41 kernel: WARNING: at drivers/gpu/drm/radeon/radeon_fence.c:159 radeon_fence_signaled+0x56/0x83 [radeon]()
Mar  4 15:41:04 t41 kernel: Hardware name: 2373TG5
Mar  4 15:41:04 t41 kernel: Querying an unemited fence : f4fa7ba0 !
Mar  4 15:41:04 t41 kernel: Modules linked in: tun sunrpc cpufreq_ondemand acpi_cpufreq ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 uinput hdaps input_polldev arc4 ecb ath5k mac80211 ath ppdev cfg80211 nsc_ircc snd_intel8x0 snd_intel8x0m parport_pc irda snd_ac97_codec snd_seq crc_ccitt parport ac97_bus thinkpad_acpi rfkill iTCO_wdt iTCO_vendor_support snd_seq_device snd_pcm e1000 snd_timer snd soundcore snd_page_alloc i2c_i801 joydev dm_multipath video output yenta_socket rsrc_nonstatic radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core [last unloaded: scsi_wait_scan]
Mar  4 15:41:04 t41 kernel: Pid: 1512, comm: Xorg Not tainted 2.6.32.9-64.fc12.i686 #1
Mar  4 15:41:04 t41 kernel: Call Trace:
Mar  4 15:41:04 t41 kernel: [<c043a2fd>] warn_slowpath_common+0x6a/0x81
Mar  4 15:41:04 t41 kernel: [<f7daf28c>] ? radeon_fence_signaled+0x56/0x83 [radeon]
Mar  4 15:41:04 t41 kernel: [<c043a352>] warn_slowpath_fmt+0x29/0x2c
Mar  4 15:41:04 t41 kernel: [<f7daf28c>] radeon_fence_signaled+0x56/0x83 [radeon]
Mar  4 15:41:04 t41 kernel: [<f7daf2f7>] radeon_fence_wait+0x3e/0x2a1 [radeon]
Mar  4 15:41:04 t41 kernel: [<f7daf66f>] ? radeon_fence_create+0x21/0xda [radeon]
Mar  4 15:41:04 t41 kernel: [<f7dbda04>] radeon_ib_get+0xf4/0x19e [radeon]
Mar  4 15:41:04 t41 kernel: [<f7dbe87a>] radeon_cs_ioctl+0x80/0x162 [radeon]
Mar  4 15:41:04 t41 kernel: [<f7c9194f>] drm_ioctl+0x251/0x2fa [drm]
Mar  4 15:41:04 t41 kernel: [<f7dbe7fa>] ? radeon_cs_ioctl+0x0/0x162 [radeon]
Mar  4 15:41:04 t41 kernel: [<c04541f5>] ? autoremove_wake_function+0x0/0x34
Mar  4 15:41:04 t41 kernel: [<c0586d10>] ? file_has_perm+0x89/0xa3
Mar  4 15:41:04 t41 kernel: [<f7c916fe>] ? drm_ioctl+0x0/0x2fa [drm]
Mar  4 15:41:04 t41 kernel: [<c04e5826>] vfs_ioctl+0x1d/0x76
Mar  4 15:41:04 t41 kernel: [<c04e5dc0>] do_vfs_ioctl+0x493/0x4d1
Mar  4 15:41:04 t41 kernel: [<c0586fb4>] ? selinux_file_ioctl+0x43/0x46
Mar  4 15:41:04 t41 kernel: [<c04e5e44>] sys_ioctl+0x46/0x66
Mar  4 15:41:04 t41 kernel: [<c040365c>] syscall_call+0x7/0xb
Mar  4 15:41:04 t41 kernel: ---[ end trace d14a0fcc30e3dc30 ]---
Mar  4 15:41:04 t41 kernel: [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(4).
Mar  4 15:41:04 t41 kernel: [drm:radeon_cs_ioctl] *ERROR* Faild to schedule IB !
Mar  4 15:41:04 t41 kernel: ------------[ cut here ]------------
Comment 34 Robert de Rooy 2010-03-04 10:08:20 EST
Comment on attachment 397825 [details]
dmesg log file from 2.6.32.9-64.fc12.i686

should have read kernel 2.6.32.9-64.fc12.i686
Comment 35 Robert de Rooy 2010-03-05 07:46:41 EST
The situation got worse with kernel-2.6.32.9-70

-67 still behaved as before, but with -70, after resume the system just immediately hangs (cannot switch VT), and the display turns slowly from black to white, starting at the edges.

Nothing in syslog about the resume either.
Comment 36 Robert de Rooy 2010-04-15 09:41:13 EDT
With Fedora 13 Beta the situation is back to what it was before Comment 35.
kernel-2.6.33.2-46.fc13.i686

So after resume I get the fedora wallpaper, and a working mouse cursor. But I do not get the unlock dialog box. I can switch to a VT and login, but the situation remains the same when switching back to the X server.

Is there anything I can do to gather more info that would help resolve this issue?

syslog displays the same errors as before

radeon 0000:01:00.0: PCI INT A -> Link[LNKA] -> GSI 11 (level, low) -> IRQ 11
[drm] AGP mode requested: 4
agpgart-intel 0000:00:00.0: AGP 2.0 bridge
agpgart-intel 0000:00:00.0: putting AGP V2 device into 4x mode
radeon 0000:01:00.0: putting AGP V2 device into 4x mode
[drm] GPU reset succeed (RBBM_STATUS=0x00000140)
[drm] radeon: cp idle (0x02000000)
[drm] radeon: ring at 0x00000000D0000000
[drm:r100_ring_test] *ERROR* radeon: ring test failed (sracth(0x15E4)=0xCAFEDEAD)
[drm:r100_cp_init] *ERROR* radeon: cp isn't working (-22).
radeon 0000:01:00.0: failled initializing CP (-22).
e1000 0000:02:01.0: PCI INT A -> Link[LNKA] -> GSI 11 (level, low) -> IRQ 11
pci 0000:00:1e.0: wake-up capability disabled by ACPI
e1000 0000:02:01.0: PME# disabled
serial 00:09: activated
parport_pc 00:0a: activated
nsc-ircc 00:0b: activated
sd 0:0:0:0: [sda] Starting disk
PM: resume of devices complete after 2828.293 msecs
PM: resume devices took 2.829 seconds
PM: Finishing wakeup.
Restarting tasks ... done.
[drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(13).
[drm:radeon_cs_ioctl] *ERROR* Faild to schedule IB !
Comment 37 Robert de Rooy 2010-06-14 11:00:19 EDT
Can we get the patch from https://bugzilla.kernel.org/show_bug.cgi?id=15969 added to the Fedora 13 kernel? It should fix this long standing suspend-resume issue.
Comment 38 bjaglin 2010-07-08 18:14:59 EDT
Tried 2.6.35-rc4 and latest drm-radeon-next, still no improvement.
Comment 39 rick 2010-07-20 11:15:48 EDT
xorg-x11-drv-ati.i686           6.13.0-1.fc13
xorg-x11-server-Xorg.i686       1.8.2-1.fc13   
libdrm.i686                     2.4.21-2.fc13
kernel                          2.6.33.6-147.fc13.i686.PAE

Here sleep works fine, but I do get these errors (at boot):
...
[drm] RAM width 128bits DDR
[TTM] Zone  kernel: Available graphics memory: 442782 kiB.
[TTM] Zone highmem: Available graphics memory: 513154 kiB.
[ttm] Initializing pool allocator.
[drm] radeon: 256M of VRAM memory ready
[drm] radeon: 32M of GTT memory ready.
[drm] radeon: 1 quad pipes, 1 Z pipes initialized.
[drm] radeon: cp idle (0x10000C03)
[drm] Loading R300 Microcode
platform radeon_cp.0: firmware: requesting radeon/R300_cp.bin
[drm] radeon: ring at 0x00000000C0000000
[drm:r100_ring_test] *ERROR* radeon: ring test failed (sracth(0x15E4)=0xCAFEDEAD)
[drm:r100_cp_init] *ERROR* radeon: cp isn't working (-22).
radeon 0000:01:00.0: failled initializing CP (-22).
radeon 0000:01:00.0: Disabling GPU acceleration
[drm:r100_cp_fini] *ERROR* Wait for CP idle timeout, shutting down CP.
[drm] radeon: cp finalized
[drm] GPU reset succeed (RBBM_STATUS=0x00000140)
[drm] radeon: cp finalized
[ttm] Finilizing pool allocator.
[TTM] Zone  kernel: Used memory at exit: 0 kiB.
[TTM] Zone highmem: Used memory at exit: 0 kiB.
[drm] radeon: ttm finalized
[drm] Forcing AGP to PCI mode
[drm] GPU reset succeed (RBBM_STATUS=0x00000140)
[drm] Generation 2 PCI interface, using max accessible memory
[drm] radeon: VRAM 256M
[drm] radeon: VRAM from 0x00000000 to 0x0FFFFFFF
[drm] radeon: GTT 512M
[drm] radeon: GTT from 0x20000000 to 0x3FFFFFFF
[drm] radeon: irq initialized.
[drm] Detected VRAM RAM=256M, BAR=256M
[drm] RAM width 128bits DDR
[TTM] Zone  kernel: Available graphics memory: 442782 kiB.
[TTM] Zone highmem: Available graphics memory: 513154 kiB.
...

messages from sleep:
...
radeon 0000:01:00.0: restoring config space at offset 0xf (was 0x801ff, writing 0x8010b)
radeon 0000:01:00.0: restoring config space at offset 0xc (was 0x0, writing 0xfea00000)
radeon 0000:01:00.0: restoring config space at offset 0x6 (was 0x0, writing 0xfe9e0000)
radeon 0000:01:00.0: restoring config space at offset 0x5 (was 0x1, writing 0xde01)
radeon 0000:01:00.0: restoring config space at offset 0x4 (was 0x8, writing 0xe0000008)
radeon 0000:01:00.0: restoring config space at offset 0x3 (was 0x800000, writing 0x804010)
radeon 0000:01:00.0: restoring config space at offset 0x1 (was 0x2b00000, writing 0x2b00107)
...
radeon 0000:01:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[drm] GPU reset succeed (RBBM_STATUS=0x00000140)
[drm] radeon: 1 quad pipes, 1 Z pipes initialized.
[drm] radeon: cp idle (0x10000C03)
[drm] radeon: ring at 0x0000000020000000
[drm] ring test succeeded in 1 usecs
[drm] ib test succeeded in 0 usecs
...


full dmesg at: http://pastebin.com/d37mB4XF
Comment 40 Paul Bolle 2010-08-08 14:13:06 EDT
(In reply to comment #36)
> So after resume I get the fedora wallpaper, and a working mouse cursor. But I
> do not get the unlock dialog box. I can switch to a VT and login, but the
> situation remains the same when switching back to the X server.

This is basically what still happens with kernel-2.6.36-0.0.rc0.git1.fc15.i686 and xorg-x11-drv-ati-6.13.1-1.20100705git37b348059.fc14.i686 (ie, current rawhide).

(In reply to comment #37)
> Can we get the patch from https://bugzilla.kernel.org/show_bug.cgi?id=15969
> added to the Fedora 13 kernel? It should fix this long standing suspend-resume
> issue.

That patch is commit 10b06122afcc78468bd1d009633cb71e528acdc5. It is part of vanilla v2.6.35 (and later releases) and therefore, I assume, part of current rawhide's kernel, so I'm not sure whether that patch is sufficient.
Comment 41 Paul Bolle 2010-08-08 15:43:18 EDT
Created attachment 437473 [details]
kernel messages in /var/log/messages after resume

Attached are the messages printed by the drm radeon "stuff" (ie, module or subsystem) in /var/log/messages.

(Generated with something like:
sudo tail -60 /var/log/messages | sed -n "/Restarting\ tasks\ .../,$ p" | grep "\[drm:" .)
Comment 42 Paul Bolle 2010-08-08 16:11:36 EDT
Created attachment 437476 [details]
drm messages in dmesg after resume

Messages that show up in dmesg after resume. kernel 2.6.36-0.0.rc0.git1.fc15.i686 was booted with parameter drm.debug=1 (for what that's worth).

(Generated with:
dmesg | sed -n "/Restarting\ tasks\ .../,$ p" | grep "^\[drm:" ).
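
The pipelines used for this attachment and the previous one both keep everything from the "Restarting tasks ..." line onward and then filter for drm messages. A rough Python equivalent, for logs that have already been saved to a file, might look like the sketch below. The function name is hypothetical. It matches the marker as a literal string (the sed pattern treats the dots as wildcards) and, like the first pipeline, accepts `[drm:` anywhere in the line rather than only at the start.

```python
def drm_lines_after_resume(log_text, marker="Restarting tasks ..."):
    """Return lines containing '[drm:' from the first marker line to the end of the log."""
    lines = log_text.splitlines()
    for index, line in enumerate(lines):
        if marker in line:
            return [entry for entry in lines[index:] if "[drm:" in entry]
    return []  # marker not found: no resume recorded in this log
```

Messages printed before tasks are restarted, such as the ring test failure itself, are excluded by this filter, just as they are by the sed range.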
Comment 43 Robert de Rooy 2010-08-09 05:34:51 EDT
I just verified with kernel-2.6.35.1-4.rc1.fc14.i686 and xorg-x11-drv-ati-6.13.0-2.fc13.i686

After a resume, no login prompt, and initially the mouse cursor was a big block of seemingly random corruption until I started moving it about. I can switch to a VT and login.

Here are some possibly relevant messages on resume

$ dmesg |grep -i -e drm -e ttm -e radeon -e agp -e pm:
PM: Registered nosave memory: 000000000009f000 - 00000000000a0000
PM: Registered nosave memory: 00000000000a0000 - 00000000000d2000
PM: Registered nosave memory: 00000000000d2000 - 00000000000d4000
PM: Registered nosave memory: 00000000000d4000 - 00000000000dc000
PM: Registered nosave memory: 00000000000dc000 - 0000000000100000
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.AGP_._PRT]
apm: BIOS version 1.2 Flags 0x03 (Driver version 1.16ac)
apm: overridden by ACPI.
Linux agpgart interface v0.103
agpgart-intel 0000:00:00.0: Intel 855PM Chipset
agpgart-intel 0000:00:00.0: AGP aperture is 256M @ 0xd0000000
PM: Resume from disk failed.
[drm] Initialized drm 1.1.0 20060810
[drm] radeon defaulting to kernel modesetting.
[drm] radeon kernel modesetting enabled.
radeon 0000:01:00.0: power state changed by ACPI to D0
radeon 0000:01:00.0: power state changed by ACPI to D0
radeon 0000:01:00.0: PCI INT A -> Link[LNKA] -> GSI 11 (level, low) -> IRQ 11
[drm] initializing kernel modesetting (RV250 0x1002:0x4C66).
[drm] register mmio base: 0xC0100000
[drm] register mmio size: 65536
agpgart-intel 0000:00:00.0: AGP 2.0 bridge
agpgart-intel 0000:00:00.0: putting AGP V2 device into 4x mode
radeon 0000:01:00.0: putting AGP V2 device into 4x mode
radeon 0000:01:00.0: GTT: 256M 0xD0000000 - 0xDFFFFFFF
radeon 0000:01:00.0: VRAM: 64M 0xE0000000 - 0xE3FFFFFF (32M used)
[drm] radeon: irq initialized.
[drm] Detected VRAM RAM=64M, BAR=128M
[drm] RAM width 64bits DDR
[TTM] Zone  kernel: Available graphics memory: 431160 kiB.
[TTM] Zone highmem: Available graphics memory: 1026780 kiB.
[TTM] Initializing pool allocator.
[drm] radeon: 32M of VRAM memory ready
[drm] radeon: 256M of GTT memory ready.
[drm] Loading R200 Microcode
[drm] radeon: ring at 0x00000000D0000000
[drm] ring test succeeded in 1 usecs
[drm] radeon: ib pool ready.
[drm] ib test succeeded in 0 usecs
[drm] DFP table revision: 3
[drm] Panel ID String: SXGA+ Single (85MHz)    
[drm] Panel Size 1400x1050
[drm] Default TV standard: NTSC
[drm] 27.000000000 MHz TV ref clk
[drm] No TV DAC info found in BIOS
[drm] Default TV standard: NTSC
[drm] 27.000000000 MHz TV ref clk
[drm] Radeon Display Connectors
[drm] Connector 0:
[drm]   VGA
[drm]   DDC: 0x60 0x60 0x60 0x60 0x60 0x60 0x60 0x60
[drm]   Encoders:
[drm]     CRT1: INTERNAL_DAC1
[drm] Connector 1:
[drm]   DVI-D
[drm]   HPD1
[drm]   DDC: 0x64 0x64 0x64 0x64 0x64 0x64 0x64 0x64
[drm]   Encoders:
[drm]     DFP1: INTERNAL_TMDS1
[drm] Connector 2:
[drm]   LVDS
[drm]   Encoders:
[drm]     LCD1: INTERNAL_LVDS
[drm] Connector 3:
[drm]   S-video
[drm]   Encoders:
[drm]     TV1: INTERNAL_DAC2
[drm] radeon: power management initialized
[drm] fb mappable at 0xE0040000
[drm] vram apper at 0xE0000000
[drm] size 1478656
[drm] fb depth is 8
[drm]    pitch is 1408
fbcon: radeondrmfb (fb0) is primary device
fb0: radeondrmfb frame buffer device
drm: registered panic notifier
[drm] Initialized radeon 2.5.0 20080528 for 0000:01:00.0 on minor 0
PM: Syncing filesystems ... done.
PM: Preparing system for mem sleep
PM: Entering mem sleep
radeon 0000:01:00.0: PCI INT A disabled
radeon 0000:01:00.0: Refused to change power state, currently in D0
radeon 0000:01:00.0: power state changed by ACPI to D3
PM: suspend of devices complete after 577.648 msecs
PM: suspend devices took 0.578 seconds
PM: late suspend of devices complete after 62.598 msecs
PM: Saving platform NVS memory
PM: Restoring platform NVS memory
PM: early resume of devices complete after 99.728 msecs
PM: Device 0000:00:00.0 failed to resume async: error -16
radeon 0000:01:00.0: power state changed by ACPI to D0
radeon 0000:01:00.0: power state changed by ACPI to D0
radeon 0000:01:00.0: power state changed by ACPI to D0
radeon 0000:01:00.0: power state changed by ACPI to D0
radeon 0000:01:00.0: PCI INT A -> Link[LNKA] -> GSI 11 (level, low) -> IRQ 11
[drm] AGP mode requested: 4
agpgart-intel 0000:00:00.0: AGP 2.0 bridge
agpgart-intel 0000:00:00.0: putting AGP V2 device into 4x mode
radeon 0000:01:00.0: putting AGP V2 device into 4x mode
radeon 0000:01:00.0: GTT: 256M 0xD0000000 - 0xDFFFFFFF
[drm] radeon: ring at 0x00000000D0000000
[drm] ring test succeeded in 1 usecs
[drm:r100_ib_test] *ERROR* radeon: ib test failed (sracth(0x15E4)=0xCAFEDEAD)
radeon 0000:01:00.0: failled testing IB (-22).
radeon 0000:01:00.0: failled initializing IB (-22).
PM: resume of devices complete after 1311.378 msecs
PM: resume devices took 1.312 seconds
PM: Finishing wakeup.

Two things stand out: the Radeon refused to change power state when entering suspend, and the PCI device 00.0 (Host Bridge) failed to resume async. I'm not sure whether either of these things matters.
Comment 44 Robert de Rooy 2010-10-25 04:14:42 EDT
No change with F14 RC1 compared to F13.

I also tested kernel-2.6.36-1.fc15.i686 on top of F14 RC1, but it did not return at all from suspend (black screen, no keyboard, no ssh)
Comment 45 Paul Bolle 2010-10-25 08:55:10 EDT
0) Please note that the current state of affairs means that for suspend/resume to work (most of the time) the nomodeset kernel option needs to be set. However, on a substantial number of resumes vbetool will then take 100% of CPU. This apparently won't be investigated because:
    The supported suspend/resume configuration for [...] radeon is with
    KMS enabled, at which point vbetool won't be run.
(see: https://bugzilla.redhat.com/show_bug.cgi?id=531874#c6)

Our options currently seem to be either a broken configuration or a sometimes broken, but unsupported, configuration.

1) So is there anything that we can further do to help make the supported configuration (ie, radeon with KMS) work?
Comment 46 Paul Bolle 2010-12-19 09:26:06 EST
0) This is still an issue with current rawhide (ie, 2.6.37-0.rc6.git0.1.fc15.i686). Resume still behaves as described in comment #36:
> So after resume I get the fedora wallpaper, and a working mouse cursor. But I
> do not get the unlock dialog box. I can switch to a VT and login, but the
> situation remains the same when switching back to the X server.

1) Feel free to ask me to provide more details, to test stuff, etc.
Comment 47 Paul Bolle 2011-02-20 18:18:48 EST
Created attachment 479799 [details]
radeon: add AGPMode 1 quirk for RV250

0) After quite some tests and grepping through the kernel code I ran into the radeon AGP mode quirk table. There I saw that a similar system apparently needed agpmode=1 to function correctly. It turns out that agpmode=1 works for my system too.

1) Attached is a trivial patch (against v2.6.38-rc5) to add an AGPMode 1 quirk for this RV250.

2) Is there any point in further examining why agpmode=4 (the default) doesn't work correctly?
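
For readers unfamiliar with the idea, a quirk table maps a specific host bridge and graphics chip combination to a forced AGP rate. The following Python sketch only illustrates that lookup; it is not the kernel code or the attached patch. The PCI IDs (host bridge 8086:3340, RV250 1002:4c66) are the ones reported elsewhere in this thread, the subsystem IDs that a real quirk entry would also match are omitted, and the precedence rule (an explicit user setting wins over a quirk) is an assumption.

```python
from typing import NamedTuple


class AgpQuirk(NamedTuple):
    hostbridge: tuple   # (vendor, device) of the AGP host bridge
    chip: tuple         # (vendor, device) of the graphics chip
    default_mode: int   # AGP rate to force for this combination


# Hypothetical table with a single entry for the machine in this report.
QUIRKS = [
    AgpQuirk((0x8086, 0x3340), (0x1002, 0x4C66), 1),
]


def pick_agp_mode(hostbridge, chip, requested=None, default=4):
    """Pick an AGP rate: explicit request first, then a matching quirk, then the default."""
    if requested is not None:
        return requested  # assumption: radeon.agpmode on the command line wins
    for quirk in QUIRKS:
        if quirk.hostbridge == hostbridge and quirk.chip == chip:
            return quirk.default_mode
    return default
```

With such an entry in place the affected machine would come up at AGP 1x without any kernel parameter, while other systems keep the default of 4.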
Comment 48 Robert de Rooy 2011-03-01 07:04:17 EST
I tested it here and booting with radeon.agpmode=1 solved the suspend/resume issue, and at least prevents us from dropping all the way back to PCI mode (-1).
I also tested with radeon.agpmode=2, but then the suspend/resume issue remains as before. 

From what I have found, the difference between AGP 1x and AGP 2x is that 2x is double pumped, resulting in double the memory bandwidth.
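
As a rough worked example of what the rate multiplier means, the theoretical peak transfer rate can be computed from the nominal AGP figures (a 32-bit data path clocked at about 66.67 MHz). These are nominal values, not measurements from this machine, and the function name is hypothetical.

```python
AGP_CLOCK_HZ = 66_666_667   # nominal AGP base clock, about 66.67 MHz
BUS_WIDTH_BYTES = 4         # 32-bit data path


def agp_peak_mb_per_s(rate):
    """Theoretical peak transfer rate in MB/s (10**6 bytes) for AGP 1x, 2x or 4x."""
    return AGP_CLOCK_HZ * BUS_WIDTH_BYTES * rate / 1e6


if __name__ == "__main__":
    for rate in (1, 2, 4):
        print(f"AGP {rate}x: about {agp_peak_mb_per_s(rate):.0f} MB/s")
```

This gives roughly 267, 533 and 1067 MB/s for 1x, 2x and 4x, so forcing 1x leaves about a quarter of the peak bandwidth available at the default 4x.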

Lastly, I tried World of Padman from the Phoronix Test Suite, to see if there was a noticeable performance difference between the default AGP 4x and AGP 1x, and there is.
Running at the native 1400x1050 in AGP 4x mode gave 4-6 FPS, sometimes peaking at 11 FPS.
But running it the same way in AGP 1x mode gave only 1-2 FPS, with the occasional peak of 4 FPS.

Unplayable in either case, but still a measurable difference.
Comment 49 Paul Bolle 2011-03-21 16:43:42 EDT
This bug is also discussed at http://bugzilla.kernel.org/show_bug.cgi?id=16140 . (It seems I cannot update the external tracker info.)
Comment 50 Paul Bolle 2011-03-25 05:49:57 EDT
(In reply to comment #0)
> It did work with F11 and KMS, so this is a regression.

0) I looked into that a bit (to see if anything can be learned by looking at the changes made at that time). It looks like suspend and resume only worked partially back then.

1) Using kernel 2.6.29-4-167 (from the F11 live image), running in either radeon.agpmode=1 or radeon.agpmode=4, this shows up in the logs after resume:
[...]
[drm:radeon_resume] *ERROR*
[drm] Loading R200 Microcode
[drm] writeback test succeeded in 1 usecs
[drm] LVDS-9: set mode 1400x1050 e
[...]

After that lspci shows that the agpgart (8086:3340) is in AGP rate "x1" and that the RV250 card (1002:4c66) is in AGP rate "<none>". I'm guessing everything is running in AGP rate x1 at that point. (There probably are also other things not running in the same way after resume. Eg, X sometimes crashed shortly after that.) But since X initially returned after resume, things did _appear_ like they were working.

(Please note that the drm related messages in the log, including those shown before suspend, suggest that KMS at that point in time had diverged less from UMS than it has now. But that's just a guess.)

2) Kernel 2.6.31.5-127 (from the F12 live image) fails in the same way as discussed in this report.

3) So unless things got fixed in between, it seems that suspend and resume never really worked correctly with KMS for these machines.
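
The AGP rate check described in item 1) can be automated. The sketch below assumes that `lspci -vv` prints a `Rate=` field on the `Status:` and `Command:` lines of a device's AGP capability, and that it is given the text for a single device. The function name is hypothetical.

```python
import re

# Only Status:/Command: lines that carry a Rate= field belong to the AGP capability.
RATE_RE = re.compile(r"^\s*(Status|Command):.*\bRate=(\S+)", re.MULTILINE)


def agp_rates(lspci_device_text):
    """Map 'Status' (supported rates) and 'Command' (active rate) to their Rate= values."""
    rates = {}
    for kind, rate in RATE_RE.findall(lspci_device_text):
        rates.setdefault(kind, rate)  # keep the first AGP capability found
    return rates
```

Under that assumption, a `Command` value of `<none>` on the graphics card after resume would correspond to the state described in item 1).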
Comment 51 Robert de Rooy 2011-12-05 05:39:01 EST
Problem remains the same with F16.

kernel-3.1.2-1.fc16.i686
xorg-x11-drv-ati-6.14.3-3.20111125git534fb6e41.fc16.i686
Comment 52 Paul Bolle 2011-12-05 06:01:43 EST
(In reply to comment #51)
> Problem remains the same with F16.
> 
> kernel-3.1.2-1.fc16.i686
> xorg-x11-drv-ati-6.14.3-3.20111125git534fb6e41.fc16.i686

0) Perhaps I should send the "quirk" I suggested in attachment #479799 [details] to LKML and the other addresses relevant for Radeon fixes. It's clear it is going nowhere here. If you feel like testing it, please say so, because then I could add "Tested-by" and/or "Reported-by" tags on your name to the patch.

1) Note that this patch doesn't really fix the actual problem. It just makes sure resume works without needing to set a kernel parameter (by defaulting to AGP rate x1).
Comment 53 soren121 2012-02-08 23:10:04 EST
No fix in sight, but there is a workaround: set a primary password in the BIOS. The BIOS will initialize the video card on wake-up and allow Linux to resume normally. Inconvenient? Yes, but it's better than bottlenecking your already out-of-date graphics card. I posted this workaround on the Canonical Launchpad bug report some time ago; I guess it hasn't been shared outside as of yet.
Comment 54 Paul Bolle 2012-02-09 08:48:30 EST
(In reply to comment #53)
> No fix in sight, but there is a workaround: set a primary password in the BIOS.
> The BIOS will initialize the video card on wake-up and allow Linux to resume
> normally. Inconvenient? Yes, but it's better than bottlenecking your already
> out-of-date graphics card. I posted this workaround on the Canonical Launchpad
> bug report some time ago; I guess it hasn't been shared outside as of yet.

You posted an identical comment on bugzilla.kernel.org. So you're cross posting and I guess that's frowned upon.
Comment 55 Paul Bolle 2012-11-23 05:23:57 EST
(In reply to comment #52)
> Perhaps I should send the "quirk" I suggested in attachment #479799 [details]
> to LKML and the other addresses relevant for Radeon fixes.

0) That's commit 45171002b01b2e2ec4f991eca81ffd8430fd0aec ("radeon: add AGPMode 1 quirk for RV250"), see http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=45171002b01b2e2ec4f991eca81ffd8430fd0aec , in current (unreleased) mainline.

(In reply to comment #47)
> Is there any point in further examining why agpmode=4 (the default)
> doesn't work correctly?

1) I was unable to determine this and it seems neither was anyone else. Or maybe no-one cared enough. Anyhow, I guess we'll never know.

2) I suggest to finally close this bug.

3) Anyone with a system running into this bug that doesn't match the values used in this quirk is invited to send a quirk for their system to the DRM maintainers.
Comment 56 Fedora End Of Life 2013-01-16 20:11:22 EST
This message is a reminder that Fedora 16 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 16. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '16'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 16's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 16 is end of life. If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora, you are encouraged to click on 
"Clone This Bug" and open it against that version of Fedora.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 57 Fedora End Of Life 2013-02-13 21:41:09 EST
Fedora 16 changed to end-of-life (EOL) status on 2013-02-12. Fedora 16 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.