Bug 927451 - regression: Invalid ROM contents, signature not found, unable to locate usable image
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: xorg-x11-drv-nouveau
Version: 19
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Ben Skeggs
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2013-03-25 23:09 UTC by Chris Murphy
Modified: 2015-02-17 14:54 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-02-17 14:54:16 UTC
Type: Bug
Embargoed:


Attachments
dmesg_3.9.0-0.rc3.git0.3.txt (131.94 KB, text/plain)
2013-03-25 23:11 UTC, Chris Murphy
dmesg_3.9.0-0.rc4.git0.1.txt (120.09 KB, text/plain)
2013-03-25 23:12 UTC, Chris Murphy
proposed patch from Matthew Garrett (13.93 KB, patch)
2013-03-27 20:51 UTC, Bjorn Helgaas
dmesg_3.9.0-rc4_patch-rhbz927451c3 (160.06 KB, text/plain)
2013-03-28 03:49 UTC, Chris Murphy
photo of corrupt display output (191.05 KB, image/jpeg)
2013-03-28 19:58 UTC, Chris Murphy
dmesg_3.9.0-0.rc3.git0.3.fc19.x86_64.debug_2 (420.44 KB, text/plain)
2013-03-29 05:08 UTC, Chris Murphy
dmesg 3.13.0-0.rc6 (121.75 KB, text/plain)
2014-01-01 18:28 UTC, Chris Murphy
dmesg 3.13.0-0.rc6 nouveau.config=NvMSI=0 (119.68 KB, text/plain)
2014-01-01 20:14 UTC, Chris Murphy

Description Chris Murphy 2013-03-25 23:09:37 UTC
Description of problem:
Computer boots with a black screen after GRUB loads kernel and initramfs


Version-Release number of selected component (if applicable):
Fedora 18
xorg-x11-drv-nouveau-1.0.6-1.fc18.x86_64
kernel-3.9.0-0.rc4.git0.1.fc19.x86_64


How reproducible:
Always

Steps to Reproduce:
1. Fedora 18 livecd install, updated with dnf update (updates-testing repo is not enabled)
2. Install kernels from koji
  
Actual results:
Black screen. It is still possible to ssh into the system.

Expected results:
Working display.


Additional info:
This is a regression from kernel-3.9.0-0.rc3.git0.3.fc19.x86_64 which does not exhibit this problem.


Appears to be VBIOS corruption:

[    4.357820] nouveau 0000:01:00.0: enabling device (0006 -> 0007)
[    4.369208] nouveau  [  DEVICE][0000:01:00.0] BOOT0  : 0x084700a2
[    4.369212] nouveau  [  DEVICE][0000:01:00.0] Chipset: G84 (NV84)
[    4.369215] nouveau  [  DEVICE][0000:01:00.0] Family : NV50
[    4.380960] nouveau  [   VBIOS][0000:01:00.0] checking PRAMIN for image...
[    4.380969] nouveau  [   VBIOS][0000:01:00.0] ... signature not found
[    4.380971] nouveau  [   VBIOS][0000:01:00.0] checking PROM for image...
[    4.380999] nouveau  [   VBIOS][0000:01:00.0] ... signature not found
[    4.381037] nouveau  [   VBIOS][0000:01:00.0] checking ACPI for image...
[    4.381039] nouveau  [   VBIOS][0000:01:00.0] ... signature not found
[    4.381042] nouveau  [   VBIOS][0000:01:00.0] checking PCIROM for image...
[    4.381923] nouveau 0000:01:00.0: Invalid ROM contents
[    4.381942] nouveau  [   VBIOS][0000:01:00.0] ... signature not found
[    4.381944] nouveau E[   VBIOS][0000:01:00.0] unable to locate usable image
[    4.381950] nouveau E[  DEVICE][0000:01:00.0] failed to create 0x10000001, -22
[    4.381980] nouveau E[     DRM] failed to create 0x80000080, -22
[    4.390663] ata_id (187) used greatest stack depth: 4088 bytes left
[    4.398205] nouveau: probe of 0000:01:00.0 failed with error -22
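
For anyone comparing logs from different kernels, the probe sequence above can be summarized mechanically. The following is only a minimal sketch, assuming the log format shown in this report; the names `CHECK_RE`, `RESULT_RE`, `vbios_probe_results` and `usable_source` are made up for illustration and are not part of nouveau or any tool.

```python
import re

# Matches e.g. "nouveau  [   VBIOS][0000:01:00.0] checking PRAMIN for image..."
CHECK_RE = re.compile(r"\[\s*VBIOS\]\[[^\]]+\]\s+checking (\w+) for image")
# Matches e.g. "nouveau  [   VBIOS][0000:01:00.0] ... signature not found"
RESULT_RE = re.compile(r"\[\s*VBIOS\]\[[^\]]+\]\s+\.\.\. (.+)$")


def vbios_probe_results(dmesg_text):
    """Return (source, result) pairs in the order the sources were probed."""
    results = []
    current = None
    for line in dmesg_text.splitlines():
        check = CHECK_RE.search(line)
        if check:
            current = check.group(1)
            continue
        result = RESULT_RE.search(line)
        if result and current is not None:
            results.append((current, result.group(1).strip()))
            current = None
    return results


def usable_source(dmesg_text):
    """Return the first source reported as valid, or None if every probe failed."""
    for source, result in vbios_probe_results(dmesg_text):
        if result == "appears to be valid":
            return source
    return None
```

Run against the excerpt above, `usable_source` would return `None`, since every source reports "signature not found".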

Comment 1 Chris Murphy 2013-03-25 23:11:25 UTC
Created attachment 716274 [details]
dmesg_3.9.0-0.rc3.git0.3.txt

dmesg using kernel 3.9.0-0.rc3.git0.3.fc19.x86_64.debug

Comment 2 Chris Murphy 2013-03-25 23:12:14 UTC
Created attachment 716275 [details]
dmesg_3.9.0-0.rc4.git0.1.txt

dmesg using kernel 3.9.0-0.rc4.git0.1.fc19.x86_64

Comment 3 Bjorn Helgaas 2013-03-27 20:51:14 UTC
Created attachment 717285 [details]
proposed patch from Matthew Garrett

If anybody can test these patches, please report the outcome and attach a dmesg log.  Chris reported the problem on this machine:

DMI: Apple Inc. MacBookPro4,1/Mac-F42C89C8, BIOS    MBP41.88Z.00C1.B03.0802271651 02/27/08

Note that a similar report [1] seems to only happen in UEFI mode.  I assume it's not even possible to turn off UEFI mode on a MacBook, but I'm not sure.

[1] http://marc.info/?l=linux-kernel&m=136148818405871

Comment 4 Chris Murphy 2013-03-28 03:49:28 UTC
Created attachment 717418 [details]
dmesg_3.9.0-rc4_patch-rhbz927451c3

I started with mainline 3.9-rc4 (2013-03-23) from kernel.org and applied the patch in comment 3. I get a different result than with either of the previous 3.9.0 Fedora kernels. I now get:

[    6.974311] nouveau  [   VBIOS][0000:01:00.0] checking PLATFORM for image...
[    6.974388] nouveau  [   VBIOS][0000:01:00.0] ... appears to be valid
[    6.974395] nouveau  [   VBIOS][0000:01:00.0] using image from PLATFORM
[    6.974601] nouveau  [   VBIOS][0000:01:00.0] BIT signature found
[    6.974609] nouveau  [   VBIOS][0000:01:00.0] version 60.84.49.03.00

But upon executing systemctl isolate graphical.target, the display becomes semi-corrupt then freezes. Remotely, I'm still able to capture dmesg. There are quite a few entries such as this:

[  415.236546] nouveau E[     PFB][0000:01:00.0] trapped write at 0x00005267c0 on channel 0x0001fed0 [unknown] BAR/PFIFO_WRITE/FB reason: PAGE_NOT_PRESENT

There's no change after 5 minutes, so I rebooted. This behavior is reproduced on every boot of 3.9.0-rc4_patch-rhbz927451c3.

The partial corruption of the display during plymouth or text boot occurs 100% of the time with kernels 3.7 and 3.8; only twice in dozens of boots has kernel 3.8.4 produced the "trapped write" message above, which required a hard reset.

No display corruption or dmesg errors occur with kernels 3.6.10 or 3.6.11.
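
To compare how often the "trapped write" errors occur across kernels, the entries can be pulled out of a captured dmesg. This is a rough sketch that assumes the exact message layout quoted in this comment; `TRAP_RE` and `parse_traps` are illustrative names only.

```python
import re

# Assumed layout, taken from the PFB line quoted in this comment.
TRAP_RE = re.compile(
    r"\[\s*(?P<time>\d+\.\d+)\]\s+nouveau E\[\s*PFB\]\[(?P<dev>[^\]]+)\]\s+"
    r"trapped (?P<op>read|write) at 0x(?P<addr>[0-9a-fA-F]+) "
    r"on channel 0x(?P<chan>[0-9a-fA-F]+) .*reason: (?P<reason>\w+)"
)


def parse_traps(dmesg_text):
    """Return one dict per PFB trap line found in the given dmesg text."""
    traps = []
    for line in dmesg_text.splitlines():
        match = TRAP_RE.search(line)
        if match is None:
            continue
        traps.append({
            "time": float(match.group("time")),   # seconds since boot
            "device": match.group("dev"),         # PCI address
            "op": match.group("op"),              # "read" or "write"
            "address": int(match.group("addr"), 16),
            "channel": int(match.group("chan"), 16),
            "reason": match.group("reason"),
        })
    return traps
```

Counting `len(parse_traps(text))` per boot would give a simple number to compare between kernel versions.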

Comment 5 Chris Murphy 2013-03-28 19:58:08 UTC
Created attachment 717807 [details]
photo of corrupt display output

Example of the corruption with 3.9.0-rc4_patch-rhbz927451c3. Local access, whether text or graphical, is useless, and X becomes unresponsive.

This occurs to a much lesser degree with the 3.7 and 3.8 kernels and with 3.9-rc3, where text mode remains functional: the top 40% of the display is affected during the plymouth splash screen, and X/GNOME is totally unaffected.

Comment 6 Chris Murphy 2013-03-29 05:08:25 UTC
Created attachment 717948 [details]
dmesg_3.9.0-0.rc3.git0.3.fc19.x86_64.debug_2

The debug kernel reported "possible recursive locking detected" along with a lot of video-related issues; this is the first time that has happened with this kernel. It occurred on a warm boot to multi-user.target. At time 418 I tried to isolate graphical.target, and the local screen froze in the state it was in, with an unresponsive keyboard. I was still able to capture this dmesg remotely.

The trapped read and write messages may not be related to this bug. There may be more than one problem.

Comment 7 Michele Baldessari 2014-01-01 18:01:23 UTC
Hi Chris,

is this still an issue with the latest Fedora releases?

The initial problem of:
[    4.381944] nouveau E[   VBIOS][0000:01:00.0] unable to locate usable image
[    4.381950] nouveau E[  DEVICE][0000:01:00.0] failed to create 0x10000001, -22

Should be fixed by:
https://bugs.freedesktop.org/show_bug.cgi?id=70208

and
http://cgit.freedesktop.org/~airlied/linux/commit/?h=drm-next&id=4c60fac111961e7eb71a08928c22b43cff55f1fb

Not sure if there are different ones now.

Thanks,
Michele

Comment 8 Chris Murphy 2014-01-01 18:19:02 UTC
It's not currently happening with 3.13.0-0.rc6.git0.1.fc21.x86_64. I do still get trapped write messages, and screen artifacts until gnome-shell comes up.

[root@f20s ~]# dmesg | grep 01:00.0
[    0.124933] pci 0000:01:00.0: [10de:0407] type 00 class 0x030000
[    0.124950] pci 0000:01:00.0: reg 0x10: [mem 0xd2000000-0xd2ffffff]
[    0.124965] pci 0000:01:00.0: reg 0x14: [mem 0xc0000000-0xcfffffff 64bit pref]
[    0.124980] pci 0000:01:00.0: reg 0x1c: [mem 0xd0000000-0xd1ffffff 64bit]
[    0.124991] pci 0000:01:00.0: reg 0x24: [io  0x7000-0x707f]
[    0.125004] pci 0000:01:00.0: reg 0x30: [mem 0xd3000000-0xd301ffff pref]
[    0.134292] vgaarb: device added: PCI:0000:01:00.0,decodes=io+mem,owns=none,locks=none
[    0.134292] vgaarb: bridge control possible 0000:01:00.0
[    0.377999] pci 0000:01:00.0: Signaling PME through PCIe PME interrupt
[    3.728249] nouveau 0000:01:00.0: enabling device (0006 -> 0007)
[    3.728976] nouveau  [  DEVICE][0000:01:00.0] BOOT0  : 0x084700a2
[    3.728980] nouveau  [  DEVICE][0000:01:00.0] Chipset: G84 (NV84)
[    3.728983] nouveau  [  DEVICE][0000:01:00.0] Family : NV50
[    3.735232] nouveau  [   VBIOS][0000:01:00.0] checking PRAMIN for image...
[    3.735241] nouveau  [   VBIOS][0000:01:00.0] ... signature not found
[    3.735244] nouveau  [   VBIOS][0000:01:00.0] checking PROM for image...
[    3.735270] nouveau  [   VBIOS][0000:01:00.0] ... signature not found
[    3.735273] nouveau  [   VBIOS][0000:01:00.0] checking ACPI for image...
[    3.735276] nouveau  [   VBIOS][0000:01:00.0] ... signature not found
[    3.735278] nouveau  [   VBIOS][0000:01:00.0] checking PCIROM for image...
[    3.735343] nouveau 0000:01:00.0: Invalid ROM contents
[    3.735354] nouveau  [   VBIOS][0000:01:00.0] ... signature not found
[    3.735357] nouveau  [   VBIOS][0000:01:00.0] checking PLATFORM for image...
[    3.735423] nouveau  [   VBIOS][0000:01:00.0] ... appears to be valid
[    3.735426] nouveau  [   VBIOS][0000:01:00.0] using image from PLATFORM
[    3.735519] nouveau  [   VBIOS][0000:01:00.0] BIT signature found
[    3.735523] nouveau  [   VBIOS][0000:01:00.0] version 60.84.49.03.00
[    3.755661] nouveau 0000:01:00.0: irq 46 for MSI/MSI-X
[    3.755674] nouveau  [     PMC][0000:01:00.0] MSI interrupts enabled
[    3.755711] nouveau  [     PFB][0000:01:00.0] RAM type: GDDR3
[    3.755714] nouveau  [     PFB][0000:01:00.0] RAM size: 512 MiB
[    3.755717] nouveau  [     PFB][0000:01:00.0]    ZCOMP: 1892 tags
[    3.769768] nouveau  [    VOLT][0000:01:00.0] GPU voltage: 1130000uv
[    3.794048] nouveau  [  PTHERM][0000:01:00.0] FAN control: none / external
[    3.794059] nouveau  [  PTHERM][0000:01:00.0] fan management: automatic
[    3.794065] nouveau  [  PTHERM][0000:01:00.0] internal sensor: yes
[    3.794091] nouveau  [     CLK][0000:01:00.0] 20: core 169 MHz shader 338 MHz memory 100 MHz
[    3.794096] nouveau  [     CLK][0000:01:00.0] 21: core 283 MHz shader 566 MHz memory 297 MHz
[    3.794101] nouveau  [     CLK][0000:01:00.0] 22: core 375 MHz shader 750 MHz memory 502 MHz
[    3.794106] nouveau  [     CLK][0000:01:00.0] 23: core 470 MHz shader 940 MHz memory 635 MHz
[    3.794159] nouveau  [     CLK][0000:01:00.0] --: core 275 MHz shader 550 MHz memory 302 MHz
[    3.946352] nouveau 0000:01:00.0: fb0: nouveaufb frame buffer device
[    3.946371] nouveau 0000:01:00.0: registered panic notifier
[    3.946393] [drm] Initialized nouveau 1.1.1 20120801 for 0000:01:00.0 on minor 0
[   12.204278] nouveau E[     PFB][0000:01:00.0] trapped write at 0x0000558400 on channel 0x0001fed0 [unknown] BAR/PFIFO_WRITE/FB reason: PAGE_NOT_PRESENT
[   12.243533] nouveau E[     PFB][0000:01:00.0] trapped write at 0x000054a4c0 on channel 0x0001fed0 [unknown] BAR/PFIFO_WRITE/FB reason: PAGE_NOT_PRESENT

Comment 9 Chris Murphy 2014-01-01 18:28:58 UTC
Created attachment 844191 [details]
dmesg 3.13.0-0.rc6

Oops, that was an incomplete filtering. This one includes lines matching drm, nouveau, or pci 01:00.0, and the full dmesg is attached.


[    0.124933] pci 0000:01:00.0: [10de:0407] type 00 class 0x030000
[    0.124950] pci 0000:01:00.0: reg 0x10: [mem 0xd2000000-0xd2ffffff]
[    0.124965] pci 0000:01:00.0: reg 0x14: [mem 0xc0000000-0xcfffffff 64bit pref]
[    0.124980] pci 0000:01:00.0: reg 0x1c: [mem 0xd0000000-0xd1ffffff 64bit]
[    0.124991] pci 0000:01:00.0: reg 0x24: [io  0x7000-0x707f]
[    0.125004] pci 0000:01:00.0: reg 0x30: [mem 0xd3000000-0xd301ffff pref]
[    0.134292] vgaarb: device added: PCI:0000:01:00.0,decodes=io+mem,owns=none,locks=none
[    0.134292] vgaarb: bridge control possible 0000:01:00.0
[    0.377999] pci 0000:01:00.0: Signaling PME through PCIe PME interrupt
[    3.641382] [drm] Initialized drm 1.1.0 20060810
[    3.719269] fb: conflicting fb hw usage nouveaufb vs EFI VGA - removing generic driver
[    3.728249] nouveau 0000:01:00.0: enabling device (0006 -> 0007)
[    3.728557] [drm] hdmi device  not found 1 0 1
[    3.728976] nouveau  [  DEVICE][0000:01:00.0] BOOT0  : 0x084700a2
[    3.728980] nouveau  [  DEVICE][0000:01:00.0] Chipset: G84 (NV84)
[    3.728983] nouveau  [  DEVICE][0000:01:00.0] Family : NV50
[    3.735232] nouveau  [   VBIOS][0000:01:00.0] checking PRAMIN for image...
[    3.735241] nouveau  [   VBIOS][0000:01:00.0] ... signature not found
[    3.735244] nouveau  [   VBIOS][0000:01:00.0] checking PROM for image...
[    3.735270] nouveau  [   VBIOS][0000:01:00.0] ... signature not found
[    3.735273] nouveau  [   VBIOS][0000:01:00.0] checking ACPI for image...
[    3.735276] nouveau  [   VBIOS][0000:01:00.0] ... signature not found
[    3.735278] nouveau  [   VBIOS][0000:01:00.0] checking PCIROM for image...
[    3.735343] nouveau 0000:01:00.0: Invalid ROM contents
[    3.735354] nouveau  [   VBIOS][0000:01:00.0] ... signature not found
[    3.735357] nouveau  [   VBIOS][0000:01:00.0] checking PLATFORM for image...
[    3.735423] nouveau  [   VBIOS][0000:01:00.0] ... appears to be valid
[    3.735426] nouveau  [   VBIOS][0000:01:00.0] using image from PLATFORM
[    3.735519] nouveau  [   VBIOS][0000:01:00.0] BIT signature found
[    3.735523] nouveau  [   VBIOS][0000:01:00.0] version 60.84.49.03.00
[    3.755661] nouveau 0000:01:00.0: irq 46 for MSI/MSI-X
[    3.755674] nouveau  [     PMC][0000:01:00.0] MSI interrupts enabled
[    3.755711] nouveau  [     PFB][0000:01:00.0] RAM type: GDDR3
[    3.755714] nouveau  [     PFB][0000:01:00.0] RAM size: 512 MiB
[    3.755717] nouveau  [     PFB][0000:01:00.0]    ZCOMP: 1892 tags
[    3.769768] nouveau  [    VOLT][0000:01:00.0] GPU voltage: 1130000uv
[    3.794048] nouveau  [  PTHERM][0000:01:00.0] FAN control: none / external
[    3.794059] nouveau  [  PTHERM][0000:01:00.0] fan management: automatic
[    3.794065] nouveau  [  PTHERM][0000:01:00.0] internal sensor: yes
[    3.794091] nouveau  [     CLK][0000:01:00.0] 20: core 169 MHz shader 338 MHz memory 100 MHz
[    3.794096] nouveau  [     CLK][0000:01:00.0] 21: core 283 MHz shader 566 MHz memory 297 MHz
[    3.794101] nouveau  [     CLK][0000:01:00.0] 22: core 375 MHz shader 750 MHz memory 502 MHz
[    3.794106] nouveau  [     CLK][0000:01:00.0] 23: core 470 MHz shader 940 MHz memory 635 MHz
[    3.794159] nouveau  [     CLK][0000:01:00.0] --: core 275 MHz shader 550 MHz memory 302 MHz
[    3.794320] nouveau  [     DRM] VRAM: 512 MiB
[    3.794323] nouveau  [     DRM] GART: 1048576 MiB
[    3.794328] nouveau  [     DRM] TMDS table version 2.0
[    3.794331] nouveau  [     DRM] DCB version 4.0
[    3.794334] nouveau  [     DRM] DCB outp 00: 01000123 00010034
[    3.794337] nouveau  [     DRM] DCB outp 01: 02011210 00000028
[    3.794340] nouveau  [     DRM] DCB outp 02: 02011212 00000030
[    3.794342] nouveau  [     DRM] DCB outp 03: 01011211 0080c070
[    3.794345] nouveau  [     DRM] DCB conn 00: 0040
[    3.794348] nouveau  [     DRM] DCB conn 01: 1120
[    3.820949] nouveau W[     DRM] unknown connector type 20
[    3.820983] nouveau W[     DRM] failed to create encoder 0/1/0: -19
[    3.820987] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[    3.820990] [drm] No driver support for vblank timestamp query.
[    3.848816] nouveau  [     DRM] MM: using CRYPT for buffer copies
[    3.904331] nouveau  [     DRM] allocated 1440x900 fb: 0x70000, bo ffff880137068c00
[    3.904428] fbcon: nouveaufb (fb0) is primary device
[    3.946352] nouveau 0000:01:00.0: fb0: nouveaufb frame buffer device
[    3.946371] nouveau 0000:01:00.0: registered panic notifier
[    3.946393] [drm] Initialized nouveau 1.1.1 20120801 for 0000:01:00.0 on minor 0
[   12.204278] nouveau E[     PFB][0000:01:00.0] trapped write at 0x0000558400 on channel 0x0001fed0 [unknown] BAR/PFIFO_WRITE/FB reason: PAGE_NOT_PRESENT
[   12.243533] nouveau E[     PFB][0000:01:00.0] trapped write at 0x000054a4c0 on channel 0x0001fed0 [unknown] BAR/PFIFO_WRITE/FB reason: PAGE_NOT_PRESENT

Comment 10 Michele Baldessari 2014-01-01 20:09:35 UTC
Hi Chris,

if you boot with 'nouveau.config=NvMSI=0' does this still happen?

thanks,
Michele
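
To confirm that the option actually reached the kernel before comparing results, the command line can be checked. A small sketch follows; it assumes that `nouveau.config` takes a comma-separated list of `Name=value` pairs, and `nouveau_config_options` is a hypothetical helper name.

```python
def nouveau_config_options(cmdline):
    """Extract nouveau.config settings from a kernel command line string.

    Assumption: the value is a comma-separated list of Name=value pairs,
    as in 'nouveau.config=NvMSI=0'.
    """
    prefix = "nouveau.config="
    options = {}
    for token in cmdline.split():
        if not token.startswith(prefix):
            continue
        for item in token[len(prefix):].split(","):
            name, sep, setting = item.partition("=")
            if name:
                # A bare name with no "=" is recorded with a value of None.
                options[name] = setting if sep else None
    return options
```

On a running system the input string could be read from /proc/cmdline; an `NvMSI` entry of `"0"` in the result would indicate the option was passed.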

Comment 11 Chris Murphy 2014-01-01 20:14:57 UTC
Created attachment 844236 [details]
dmesg 3.13.0-0.rc6 nouveau.config=NvMSI=0

Yes, both artifacts and the message:
[   10.812029] nouveau E[     PFB][0000:01:00.0] trapped write at 0x0000525500 on channel 0x0001fed0 [unknown] BAR/PFIFO_WRITE/FB reason: PAGE_NOT_PRESENT

Comment 12 Michele Baldessari 2014-01-01 20:55:59 UTC
Hi Chris,

I've quickly exchanged mails with upstream. Regarding these specific
"trapped write" errors, would you be able to do a bisection to see when
the issue was introduced? I can try to guide you a bit, but it will
require some time and effort on your part.

It might be particularly challenging here because if we go back too far
in time we might see the original errors again, but it is worth a shot
to help upstream pinpoint the commit that introduced the issue.

Let me know if you have the time and inclination to do this and I'll try
to guide you.

Thanks,
Michele
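
For readers unfamiliar with bisection, the idea is a binary search over an ordered history between a known-good and a known-bad kernel (per comment 4, 3.6.11 shows no corruption and 3.7 does). The sketch below only illustrates the search itself; in practice `git bisect` drives this on the kernel tree, and each "test" means building and booting a kernel. `first_bad` and the `is_bad` callback are illustrative names, and the sketch assumes the problem, once introduced, stays present in all later revisions.

```python
def first_bad(revisions, is_bad):
    """Return the first revision for which is_bad() is true.

    revisions: list ordered oldest to newest, where revisions[0] is known
    good and revisions[-1] is known bad.
    is_bad: callable that tests one revision (e.g. boot it and look for
    "trapped write" in dmesg) and returns True if the problem shows up.
    """
    lo, hi = 0, len(revisions) - 1  # lo is always good, hi is always bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(revisions[mid]):
            hi = mid
        else:
            lo = mid
    return revisions[hi]
```

Each iteration halves the remaining range, so the number of kernels to build and boot grows only with the logarithm of the number of commits between the good and bad versions.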

Comment 13 Fedora End Of Life 2015-01-09 17:49:05 UTC
This message is a notice that Fedora 19 is now at end of life. Fedora 
has stopped maintaining and issuing updates for Fedora 19. It is 
Fedora's policy to close all bug reports from releases that are no 
longer maintained. Approximately 4 (four) weeks from now this bug will
be closed as EOL if it remains open with a Fedora 'version' of '19'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 19 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 14 Fedora End Of Life 2015-02-17 14:54:16 UTC
Fedora 19 changed to end-of-life (EOL) status on 2015-01-06. Fedora 19 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

