Red Hat Bugzilla – Bug 601638
Enabled PCIe ASPM technology causes systems with GeForce 8+ to freeze randomly
Last modified: 2013-01-10 03:09:36 EST
Description of problem:
Version-Release number of selected component (if applicable):
Steps to Reproduce:
Sorry something went wrong while posting...
First I encountered bug 590437. After bypassing that bug, the installation seemed to go normally. Whatever video driver I use, the system freezes up randomly while I work. SSH still works when my computer is frozen, but I cannot init 3, reboot, or halt. In SSH I get loads of these messages:
(using the nvidia driver)
Message from syslogd@host at Jun 8 13:13:56
kernel:Code: e8 89 c0 0f b7 04 42 0f b7 c0 c3 89 d1 b8 00 00 00 00 39 96 84 02 00 00 76 11 48 8b 96 c0 02 00 00 89 c8 c1 e8 02 89 c0 8b 04 82 <f3> c3 39 96 90 02 00 00 76 0c 89 d2 48 8b 86 d8 02 00 00 88 0c
I tried using the vesa driver (nomodeset, vga=0x31B), the nouveau driver, and the rpmfusion nvidia driver. Same results.
If I should provide any more info, I'll be happy to.
ABRT detected the freeze and I sent the reports to kerneloops (using ABRT). It describes the bug as a soft lockup: CPU #n stuck for 61s! It seems I'm not able to find the address where the submitted report can be viewed.
BUG: soft lockup - CPU#2 stuck for 61s! [Xorg:2237]
Modules linked in: fuse cpufreq_ondemand acpi_cpufreq freq_table ipv6 saa7134_alsa mt352 saa7134_dvb videobuf_dvb dvb_core uinput mt20xx tea5767 tda9887 tda8290 tuner snd_hda_codec_realtek snd_hda_intel snd_hda_codec saa7134 ir_common snd_hwdep snd_seq snd_seq_device v4l2_common snd_pcm videodev v4l1_compat v4l2_compat_ioctl32 videobuf_dma_sg snd_timer videobuf_core 8139too ir_core snd 8139cp tveeprom soundcore nvidia(P) i2c_viapro i2c_core microcode shpchp mii snd_page_alloc pata_acpi ata_generic usb_storage pata_via sata_via [last unloaded: scsi_wait_scan]
Pid: 2237, comm: Xorg Tainted: P 220.127.116.11-112.fc13.x86_64 #1 PT890-8237A/OEM
RIP: 0010:[<ffffffffa0472fff>] [<ffffffffa0472fff>] _nv006601rm+0x20/0x22 [nvidia]
RSP: 0000:ffff880001f03c58 EFLAGS: 00003246
RAX: 00000000ffffffff RBX: ffff8800a7c0dc30 RCX: 0000000000000000
RDX: ffffc90016100000 RSI: ffff8800b01f0000 RDI: ffff8800375bc800
RBP: ffffffff8100a4d3 R08: ffff8800b6520000 R09: 0000000000000001
R10: 000000000000010e R11: 0000000000094d19 R12: ffff880001f03bd0
R13: ffff8800375bc800 R14: ffff8800b01f0000 R15: ffffffff8102022f
FS: 00007fa539460840(0000) GS:ffff880001f00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000002631e98 CR3: 00000000b03d5000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process Xorg (pid: 2237, threadinfo ffff8800b56be000, task ffff88009c0e9750)
ffffffffa02f6136 0000000000000b68 000000000000016d ffff8800b01f0000
ffff8800a2278b70 0000000000000200 ffffffffa02be687 ffff8800b56a5000
ffff8800b01f0000 0000000000000200 ffff8800b02e3c00 ffff8800b0258800
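For anyone who wants to count how often these lockups occur in a saved log, the "BUG: soft lockup" line can be picked apart with a regular expression. This is only a rough sketch in Python; the pattern is modelled on the single line quoted above, and the helper name `parse_soft_lockup` is made up for illustration, so other kernel versions may need a different pattern.

```python
import re

# Pattern modelled on the line from this report:
#   BUG: soft lockup - CPU#2 stuck for 61s! [Xorg:2237]
# Assumption: the format is "CPU#<n> stuck for <secs>s! [<comm>:<pid>]".
SOFT_LOCKUP = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>.+):(?P<pid>\d+)\]"
)


def parse_soft_lockup(line):
    """Return the CPU, duration, process name and PID, or None if no match."""
    match = SOFT_LOCKUP.search(line)
    if match is None:
        return None
    return {
        "cpu": int(match["cpu"]),
        "seconds": int(match["secs"]),
        "comm": match["comm"],
        "pid": int(match["pid"]),
    }


if __name__ == "__main__":
    sample = "BUG: soft lockup - CPU#2 stuck for 61s! [Xorg:2237]"
    print(parse_soft_lockup(sample))
```

Running it over each line of a log file and keeping the non-None results gives a quick list of which process was stuck on which CPU.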
Found the same problem on nvnews.net: http://www.nvnews.net/vbulletin/showthread.php?t=149056
Try the kernel option pcie_aspm=off. This fixes the freezes for some people using GF8600 video cards.
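After adding the option to the boot loader's kernel line and rebooting, it may be worth confirming that the running kernel really received it. A minimal sketch in Python, assuming a Linux system where /proc/cmdline is readable; the helper name `aspm_disabled_on_cmdline` is hypothetical and only used here for illustration.

```python
from pathlib import Path


def aspm_disabled_on_cmdline(cmdline):
    """Return True if the boot parameters contain pcie_aspm=off."""
    # Kernel parameters are separated by whitespace, so compare whole tokens
    # rather than doing a substring search.
    return "pcie_aspm=off" in cmdline.split()


if __name__ == "__main__":
    # /proc/cmdline holds the parameters the running kernel was booted with.
    proc_cmdline = Path("/proc/cmdline")
    if proc_cmdline.exists():
        active = aspm_disabled_on_cmdline(proc_cmdline.read_text())
        print("pcie_aspm=off active:", active)
    else:
        print("/proc/cmdline not found; is this a Linux system?")
```

This only checks that the parameter was passed at boot; it does not inspect the ASPM state of individual PCIe devices.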
Ivan, you are a miracle worker! The system passed several YouTube tests and has been stable for at least half an hour now. Since the bug occurs randomly I could just be lucky, but I have a good feeling about this one. Will report back later.
pcie_aspm=off really did the trick!!! Many thanks.
Changing the bug name since we know what causes the bug.
I think the bug should be named something like "Enabled PCIe ASPM technology causes systems with PCIe GeForce 8+ freezes". There are several bug reports on Bugzilla that turn up under the keyword "aspm". Do we need to start a separate thread to put the emphasis on the ASPM error?
Ok, I changed the name.
Please try 18.104.22.168-20 from koji which will disable ASPM if the motherboard does not support it.
I was about to press "Save Changes" to report that everything was fixed when my computer froze up after half an hour of using 22.214.171.124-20.fc13.x86_64 without pcie_aspm=off.
Hence, this bug does not seem fixed with 126.96.36.199-20.fc13.x86_64.
Another irritating thing with this kernel is that my external hard disks keep swapping device names on each boot (sdb1 becomes sdc1 and vice versa), but that is probably another bug, if it is a bug at all...
Remains a problem in fc14...
Looks like nouveau should be disabling aspm on these adapters?
I don't have this problem any more in Fedora 15. I have a new video card, though... So I don't know whether the software is fixed or the hardware is more compatible now...
Old video card: nVidia GF8500 GT
New video card: nVidia GT218