Bug 630765 - System boot hangs often with nouveau error: Pointer to BIT loadval table invalid
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: xorg-x11-drv-nouveau
Version: 13
Hardware: x86_64
OS: Linux
Priority: low
Severity: high
Target Milestone: ---
Assignee: Ben Skeggs
QA Contact: Fedora Extras Quality Assurance
Reported: 2010-09-06 20:58 UTC by Arnold Sutter
Modified: 2011-06-28 13:19 UTC (History)

Last Closed: 2011-06-28 13:19:09 UTC
Type: ---
Embargoed:


Attachments
output of "lspci -v" (15.68 KB, text/plain), 2010-09-07 05:29 UTC, Arnold Sutter
Picture of boot error (79.00 KB, image/jpeg), 2010-09-07 05:30 UTC, Arnold Sutter
dmesg after successful boot (60.67 KB, text/plain), 2010-09-10 09:17 UTC, Arnold Sutter
Hang during booting with nouveau.modeset=1 (285.43 KB, image/jpeg), 2010-09-12 20:41 UTC, Arnold Sutter


Links
Red Hat Bugzilla 609557 (priority: low, status: CLOSED): "Fedora FC13 x86_64 does not boot with acpi on", last updated 2021-02-22 00:41:40 UTC

Description Arnold Sutter 2010-09-06 20:58:32 UTC
Description of problem:
New notebook HP EliteBook 8540w sometimes doesn't boot and has to be power-cycled to reset.

Version-Release number of selected component (if applicable):
up-to-date Fedora 13 (Kernel 2.6.34.6-47.fc13.x86_64, xorg-x11-drv-nouveau-0.0.16-8.20100423git13c1043.fc13.x86_64)

How reproducible:
Power on the notebook. There is a 50% chance that the system hangs early in the kernel load phase.

Steps to Reproduce:
1. Power on the system
The following message appears on the screen: 
  [drm] nouveau 0000:01:00.0: Pointer to BIT loadval table invalid
Sometimes, other related PCI errors appear. 

I've also seen this message (maybe with a previous kernel):
pci 0000:01:00.0: no compatible bridge window for [mem 0xfff80000-0xffffffff pref]

Here's the associated grub.conf entry: 
title Fedora (2.6.34.6-47.fc13.x86_64)
root (hd0,0)
kernel /vmlinuz-2.6.34.6-47.fc13.x86_64 ro root=/dev/mapper/vg0-lv1 rd_LVM_LV=vg0/lv1 rd_LVM_LV=vg0/lv0 rd_NO_LUKS rd_NO_MD rd_NO_DM LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16 KEYTABLE=sg-latin1 quiet nouveau.noaccel=1

Actual results:
The system hangs and has to be hard-reset.

Expected results:
System should boot normally. When it gets over this error (and I don't know yet when this happens), the system boots normally and the display is stable until powered off. 

Additional info:
The EliteBook is equipped with:

01:00.0 VGA compatible controller: nVidia Corporation GT216 [Quadro FX 880M] (rev a2) (prog-if 00 [VGA controller])
        Subsystem: Hewlett-Packard Company Device 1521
        Flags: bus master, fast devsel, latency 0, IRQ 16
        Memory at d2000000 (32-bit, non-prefetchable) [size=16M]
        Memory at c0000000 (64-bit, prefetchable) [size=256M]
        Memory at d0000000 (64-bit, prefetchable) [size=32M]
        I/O ports at 5000 [size=128]
        Expansion ROM at d3080000 [disabled] [size=512K]
        Capabilities: [60] Power Management version 3
        Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [78] Express Endpoint, MSI 00
        Capabilities: [b4] Vendor Specific Information: Len=14 <?>
        Capabilities: [100] Virtual Channel
        Capabilities: [128] Power Budgeting <?>
        Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
        Kernel driver in use: nouveau
        Kernel modules: nvidia, nouveau, nvidiafb


It's very annoying that it sometimes takes up to 5 attempts before the system boots correctly.

System BIOS is F.0A from June 2010 (=current). 

Anything I can test/provide for you to diagnose and fix this annoying problem? 

Best Regards, 

Arnold

Comment 1 Ben Skeggs 2010-09-06 22:36:20 UTC
It's hard to say if nouveau is responsible for this.  The "Pointer to BIT loadval table invalid" message isn't relevant; it's just nouveau complaining that your VBIOS is missing some info we usually use.  Nouveau falls back to some sane defaults in that case.

The other PCI-related errors that sometimes appear sound worrying.  Can you see if this still happens with "nouveau.modeset=0" in your boot options?
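
To confirm which nouveau options a given boot actually used, the kernel command line can be inspected after booting. The following is only a minimal Python sketch: it assumes the options are read from /proc/cmdline, and the helper names parse_cmdline and read_cmdline are made up for illustration. The example string is shortened from the grub.conf entry in the description, with nouveau.modeset=0 appended.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into a dict of option -> value.

    Bare flags such as "quiet" map to None. Quoted values containing
    spaces are not handled, and a repeated option keeps only its last
    value; this is a sketch, not a full parser.
    """
    options = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        options[key] = value if sep else None
    return options


def read_cmdline(path="/proc/cmdline"):
    # /proc/cmdline holds the options the running kernel was booted with.
    with open(path) as handle:
        return handle.read().strip()


# Shortened example based on the grub.conf entry in this report.
example = "ro root=/dev/mapper/vg0-lv1 quiet nouveau.noaccel=1 nouveau.modeset=0"
opts = parse_cmdline(example)
print(opts.get("nouveau.modeset"))
```

On the affected machine, parse_cmdline(read_cmdline()) would show whether nouveau.modeset=0 really took effect for that boot.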

Comment 2 Arnold Sutter 2010-09-07 05:20:26 UTC
It's not producing the error with the "nouveau.modeset=0" boot option set.
But with this setting, I get only the large console fonts and a screen resolution of 600x800 or 480x640.

I'm attaching a screen shot with some other pcieport errors that happened with the above parameter unset.

Comment 3 Arnold Sutter 2010-09-07 05:29:35 UTC
Created attachment 443402 [details]
output of "lspci -v"

Comment 4 Arnold Sutter 2010-09-07 05:30:56 UTC
Created attachment 443403 [details]
Picture of boot error

I've taken this picture illustrating one possible error when the NB is not booting.

Comment 5 Arnold Sutter 2010-09-09 13:15:52 UTC
It looks just like bug https://bugzilla.redhat.com/show_bug.cgi?id=609557

The only difference is that it SOMETIMES boots despite some messages . . .

Comment 6 Ben Skeggs 2010-09-09 22:14:46 UTC
Do any of the suggestions in that bug help in your case too?  Also, can you attach a dmesg log after successfully booting please?

Comment 7 Arnold Sutter 2010-09-10 09:17:45 UTC
Created attachment 446470 [details]
dmesg after successful boot

dmesg output after successful boot

Comment 8 Arnold Sutter 2010-09-10 09:20:12 UTC
Setting "intel_iommu=false" doesn't change anything. 
Setting "acpi=off" does make a difference, but that mode is not an option for me (low res, etc.).

Comment 9 Luigi Pardey 2010-09-11 14:10:14 UTC
I believe nouveau is not responsible for the crash, as its related error keeps showing up even in the dmesg of a successful boot:

> [drm] nouveau 0000:01:00.0: Pointer to BIT loadval table invalid

The PCIe-related errors also show up in the same dmesg:

> pcieport 0000:00:03.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, id=0018(Requester ID)
> pcieport 0000:00:03.0:   device [8086:d138] error status/mask=00004000/00000000

So we need more information to see what is causing it. Can you please attach the output of /var/log/messages and the Xorg log file after booting with nouveau.modeset=0? 
And if there is any way to check the output messages when booting (serial console or verbose text boot) and crashing, I believe they would help.
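
For reference, the error status value in the pcieport line quoted above can be decoded by hand. The sketch below rests on an assumption: that the value is the PCIe AER Uncorrectable Error Status register (as "severity=Uncorrected" suggests), with bit positions as defined in the PCI Express specification. The table is partial, and the name decode_status is made up for illustration.

```python
# Partial map of bit positions in the PCIe AER Uncorrectable Error
# Status register (assumption: this is the register being printed).
UNCORRECTABLE_BITS = {
    4: "Data Link Protocol Error",
    12: "Poisoned TLP",
    13: "Flow Control Protocol Error",
    14: "Completion Timeout",
    15: "Completer Abort",
    16: "Unexpected Completion",
    17: "Receiver Overflow",
    18: "Malformed TLP",
    19: "ECRC Error",
    20: "Unsupported Request",
}


def decode_status(status, mask=0):
    """Return the names of the error bits set in status and not masked."""
    names = []
    for bit in range(32):
        is_set = (status >> bit) & 1
        is_masked = (mask >> bit) & 1
        if is_set and not is_masked:
            names.append(UNCORRECTABLE_BITS.get(bit, "unknown bit %d" % bit))
    return names


# Values from "error status/mask=00004000/00000000" in the dmesg excerpt.
print(decode_status(0x00004000, 0x00000000))
```

Under those assumptions, 00004000 has only bit 14 set, which corresponds to a Completion Timeout reported by the root port at 0000:00:03.0.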



-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers

Comment 10 Arnold Sutter 2010-09-12 20:41:02 UTC
Created attachment 446803 [details]
Hang during booting with nouveau.modeset=1

System hung while booting with the following kernel options

title Fedora (2.6.34.6-54.fc13.x86_64)
	root (hd0,0)
	kernel /vmlinuz-2.6.34.6-54.fc13.x86_64 ro root=/dev/mapper/vg0-lv1 rd_LVM_LV=vg0/lv1 rd_LVM_LV=vg0/lv0 rd_NO_LUKS rd_NO_MD rd_NO_DM LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16 KEYTABLE=sg-latin1 nouveau.noaccel=1 nouveau.modeset=1
	initrd /initramfs-2.6.34.6-54.fc13.x86_64.img

Comment 11 Arnold Sutter 2010-09-12 20:47:19 UTC
When I boot the NB with nouveau.modeset=0, I have no issues booting whatsoever. The problem is that this leaves me with an unusable graphics display at 600x800 resolution and thus makes my NB useless.

When I boot with nouveau.modeset=1 but WITHOUT the "quiet" option, the system hangs at the point shown in the screenshot I attached a few minutes ago.

Note that the system doesn't crash; it just hangs there indefinitely and has to be power-cycled. When I do that, it SOMETIMES boots properly (I probably need up to 5 attempts).

The system doesn't have a serial port, but I do own a USB-to-serial adapter. Please tell me how to boot the system into a serial console with this device if that would help you.

Comment 12 Arnold Sutter 2010-10-13 21:12:10 UTC
It seems that I'm having the same issue with the Fedora-14-Beta-x86_64-Live USB Boot media. 

Is there anything that I can do to have this problem resolved with final F14?

Comment 13 Fabien Archambault 2011-02-23 14:02:15 UTC
Also present with the live image available for the nvidia test day for Fedora 15:
http://fedoraproject.org/wiki/Test_Day:2011-02-22_Nouveau

Comment 14 Bug Zapper 2011-05-31 14:18:23 UTC
This message is a reminder that Fedora 13 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 13.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '13'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 13's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 13 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 15 Bug Zapper 2011-06-28 13:19:09 UTC
Fedora 13 changed to end-of-life (EOL) status on 2011-06-25. Fedora 13 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.

