Bug 1956634 - Locks up and drops back to login screen
Summary: Locks up and drops back to login screen
Status: NEW
Alias: None
Product: Fedora
Classification: Fedora
Component: mesa
Version: 34
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Adam Jackson
QA Contact: Fedora Extras Quality Assurance
Reported: 2021-05-04 06:11 UTC by Tony Cook
Modified: 2021-05-13 13:24 UTC

Type: Bug


Attachments
Journal excerpt (28.53 KB, text/plain)
2021-05-04 08:28 UTC, Tony Cook
Journal filtered as requested (9.82 KB, application/octet-stream)
2021-05-07 14:20 UTC, Tony Cook

Description Tony Cook 2021-05-04 06:11:27 UTC
Description of problem:
The display freezes and, after a long pause, returns to the login screen.

Version-Release number of selected component (if applicable):
0:21.1.1-1.fc34

How reproducible:
Just use a few applications and pretty soon, BINGO!

Steps to Reproduce:
1. No specific steps; just use the GUI and launch enough apps. Usually Google Chrome with a few sites displayed is sufficient.
2. The display freezes.

Actual results:
Display freezes

Expected results:
The display should not freeze; nothing special is expected beyond that.

Additional info:
https://retrace.fedoraproject.org/faf/problems/bthash/?bth=63c632ec3b3a83d87ee178206a6b21377e02c912&bth=a9e4ecd76ceae6a95241cdb3b160481276d565ba&bth=ee88c35c8fa11dd35f1836ba480a8c4ce577e5d6&bth=f3811d45d8d8b8ea04805337233a1cb42005d688&bth=0baaa0c1d5fbf3e0311b007753682e052d530d74&bth=baf36a955606db80c1b338d6eb3b91327e1d81cf&bth=7f2943e2456ee28ab510d14f0988c924481612f8&bth=84797b519a9ca9c9a7acc16d31ba3108dcd7f8c5&bth=6d49a262868f7659e39b646ebc99dda6b8bad28c&bth=2c84a1cd09f9bf5f4ee6750ff20f62d890359a2d

Comment 1 Olivier Fourdan 2021-05-04 06:28:51 UTC
The retrace you point out is a problem with the nouveau DRI driver in Mesa.

Comment 2 Tony Cook 2021-05-04 06:47:54 UTC
That wouldn't surprise me at all, in fact because I have two machines running Fedora 34 and only the one equipped with the Nvidia hardware does this I kinda expected it would be associated. Interestingly I only started using the nouveau driver because the proprietary driver was cocking up too and of course there is nothing to report when that one fails. The Nvidia driver corrupts the screen with mouse pointers and god knows what other bits of odd pixels. Oh well, just trying to do my bit.

Comment 3 Tony Cook 2021-05-04 06:50:53 UTC
The Fedora bugreport tool suggested that in addition to sending it to them that I should put in a bugzilla report too. If you think this is pointless I am quite happy not to bother in the future.

Comment 4 Olivier Fourdan 2021-05-04 07:30:55 UTC
(In reply to Tony Cook from comment #2)
> That wouldn't surprise me at all, in fact because I have two machines
> running Fedora 34 and only the one equipped with the Nvidia hardware does
> this I kinda expected it would be associated. Interestingly I only started
> using the nouveau driver because the proprietary driver was cocking up too
> and of course there is nothing to report when that one fails. The Nvidia
> driver corrupts the screen with mouse pointers and god knows what other bits
> of odd pixels.

OT, but FWIW the issues with the NVIDIA closed-source driver and Xwayland using EGLStream are being worked out; see bug 1949415, the upstream issue https://gitlab.freedesktop.org/xorg/xserver/-/issues/1156 and the merge request https://gitlab.freedesktop.org/xorg/xserver/-/merge_requests/646

> Oh well, just trying to do my bit.

Oh sure, appreciated! I am just moving the bug to where I believe it belongs (and there are a few other similar bugs).

(In reply to Tony Cook from comment #3)
> The Fedora bugreport tool suggested that in addition to sending it to them
> that I should put in a bugzilla report too. If you think this is pointless I
> am quite happy not to bother in the future.

No bother at all, just regular triaging. All bugs should be filed.

Comment 5 Karol Herbst 2021-05-04 07:43:12 UTC
Mind adding which GPU you have, and whether you know of anything specific that might trigger the bug more frequently?

Comment 6 Tony Cook 2021-05-04 07:52:09 UTC
Video streaming appears to be an instigating factor, probably just because the total video throughput goes up I expect.

product: GK104 [GeForce GTX 660 Ti] [10DE:1183]
vendor: NVIDIA Corporation [10DE]
bus info: pci@0000:01:00.0
version: a1
width: 64 bits
clock: 33MHz
capabilities:
	Power Management,
	Message Signalled Interrupts,
	PCI Express,
	vga_controller,
	bus mastering,
	PCI capabilities listing,
	extension ROM
configuration:
	driver: nouveau
	latency: 0
resources:
	irq: 58
	memory: fd000000-fdffffff
	memory: f0000000-f7ffffff
	memory: f8000000-f9ffffff
	ioport: e000(size=128)
	memory: c0000-dffff

Comment 7 Karol Herbst 2021-05-04 07:56:29 UTC
(In reply to Tony Cook from comment #6)
> Video streaming appears to be an instigating factor, probably just because
> the total video throughput goes up I expect.

Ahh yeah, especially if it's video accelerated. That is _most_ likely the multithreading issue we still have in nouveau (and which I am working on). If you have multiple contexts within one application (even GL + VDPAU), you can run into data corruption and GPU crashes because of random garbage in the command stream. Maybe it's just two GL contexts or something; I'm not quite sure yet.

But it could also be a red herring and just a random issue.

Did you follow the video acceleration steps to get the firmware installed, or is this essentially a stock installation?

Comment 8 Tony Cook 2021-05-04 08:01:04 UTC
Bog standard. I normally use the proprietary driver to get the VDPAU extensions. Never tried with the nouveau

Comment 9 Karol Herbst 2021-05-04 08:07:11 UTC
(In reply to Tony Cook from comment #8)
> Bog standard. I normally use the proprietary driver to get the VDPAU
> extensions. Never tried with the nouveau

Yeah, never mind. I just tried it with my GTX 660 and hit at least some bug. Not sure if it's the same as yours :)

I got something like this inside dmesg: 

> [  165.605566] nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
> [  165.612209] nouveau 0000:01:00.0: fifo: runlist 0: scheduled for recovery
> [  165.619139] nouveau 0000:01:00.0: fifo: channel 8: killed
> [  165.624650] nouveau 0000:01:00.0: fifo: engine 7: scheduled for recovery
> [  165.631521] nouveau 0000:01:00.0: fifo: engine 0: scheduled for recovery
> [  165.638408] nouveau 0000:01:00.0: fifo: fault 00 [READ] at 0000000038240000 engine 1b [CE2] client 00 [GPC3/L1_0] reason 04 [UNBOUND_INST_BLOCK] on channel -1 [6142881000 unknown]
> [  165.655712] nouveau 0000:01:00.0: Xwayland[1969]: channel 8 killed!

As your machine doesn't seem to actually crash, can you paste your "dmesg" or "journalctl --dmesg" output from the current boot, if the issue has already happened during it? If it only happened during the previous boot, you can pass "--boot -1" to journalctl to get older logs (or any other number, depending on when it happened).
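The log collection requested above could be sketched roughly as follows. The journalctl options --dmesg, --no-hostname and --boot are real; the helper name nouveau_fifo_errors and its pattern are assumptions, modelled only on the messages quoted in this comment, so other failure modes may need a wider pattern.

```shell
#!/bin/sh
# Hypothetical helper (not part of any tool): read a kernel log on stdin and
# keep only nouveau fifo lines and "channel N killed" lines like those quoted above.
nouveau_fifo_errors() {
    grep -E 'nouveau [^ ]+ (fifo: |.*channel [0-9]+ killed)'
}

# Usage (not executed here):
#   journalctl --dmesg --no-hostname --boot    | nouveau_fifo_errors   # current boot
#   journalctl --dmesg --no-hostname --boot=-1 | nouveau_fifo_errors   # previous boot
```

If the filter prints nothing for a given boot, that boot most likely did not hit this particular recovery path; the unfiltered log is still the safer thing to attach.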

Comment 10 Tony Cook 2021-05-04 08:28:10 UTC
Created attachment 1779283 [details]
Journal excerpt

As requested, better to add it as an attachment I think.

Comment 11 Karol Herbst 2021-05-06 19:47:21 UTC
So I looked into this issue a bit with my GK106 and was just hitting a known "firmware"-related issue, where everything works better if we use NVIDIA's firmware. The way I triggered it very quickly was to open Chromium and resize the window for a couple of seconds; it would crash quite fast.

Mind trying that as well? If it also triggers the issue for you, we could check whether using NVIDIA's firmware helps on your GK104 too, and then I'd at least know which issue you are hitting here.

Comment 12 Tony Cook 2021-05-07 05:52:11 UTC
Well, after your previous remarks made me aware that I could still have some measure of hardware acceleration whilst using nouveau, I found out how you did it, used the tools to extract the firmware from an older Nvidia driver, and copied the files to where they would be found. After that I still had exactly the same failures as before. I can't say whether it is better; maybe it is a bit better, but in the end it still crapped out the same way.

Comment 13 Karol Herbst 2021-05-07 11:03:06 UTC
(In reply to Tony Cook from comment #12)
> Well, after your previous remarks made me aware that I could still have some
> measure of hardware acceleration whilst using nouveau, I found out how you
> did it, used the tools to extract the firmware from an older Nvidia driver,
> and copied the files to where they would be found. After that I still had
> exactly the same failures as before. I can't say whether it is better; maybe
> it is a bit better, but in the end it still crapped out the same way.

The issue is that you'd also need to adjust dracut and force those files to be included in the initramfs. So mind checking dmesg for any firmware loading issues?

The painful part is that if the issue takes hours to trigger, it's just annoying to debug :( Do you know whether the error messages near those "scheduled for recovery" messages are usually the same, or are they always different? Maybe your entire dmesg log could help me figure something out.

Mind sharing the dmesg.xz from "journalctl --dmesg --no-hostname --boot all | grep nouveau | xz > dmesg.xz"? (I used xz because otherwise I get a 100MB file on my machine here).

Comment 14 Tony Cook 2021-05-07 14:20:34 UTC
Created attachment 1780760 [details]
Journal filtered as requested

filtered per: journalctl --dmesg --no-hostname --boot all | grep nouveau | xz > dmesg.xz

This is how I know that the firmware is now being loaded:

Firstly, I used to get:
May 03 15:15:59 Athol kernel: nouveau 0000:01:00.0: Direct firmware load for nouveau/nve4_fuc084 failed with error -2
May 03 15:15:59 Athol kernel: nouveau 0000:01:00.0: Direct firmware load for nouveau/nve4_fuc084d failed with error -2
May 03 15:15:59 Athol kernel: nouveau 0000:01:00.0: msvld: unable to load firmware data
May 03 15:15:59 Athol kernel: nouveau 0000:01:00.0: msvld: init failed, -19

After placing the firmware in /lib/firmware/nouveau, that no longer occurs.

Secondly, vainfo now gives the following, showing hardware support for some video modes:

$ vainfo
libva info: VA-API version 1.11.0
libva info: Trying to open /usr/lib64/dri/nouveau_drv_video.so
libva info: Found init function __vaDriverInit_1_11
libva info: va_openDriver() returns 0
vainfo: VA-API version: 1.11 (libva 2.11.0)
vainfo: Driver version: Mesa Gallium driver 21.0.3 for NVE4
vainfo: Supported profile and entrypoints
      VAProfileMPEG2Simple            :	VAEntrypointVLD
      VAProfileMPEG2Main              :	VAEntrypointVLD
      VAProfileVC1Simple              :	VAEntrypointVLD
      VAProfileVC1Main                :	VAEntrypointVLD
      VAProfileVC1Advanced            :	VAEntrypointVLD
      VAProfileH264ConstrainedBaseline:	VAEntrypointVLD
      VAProfileH264Main               :	VAEntrypointVLD
      VAProfileH264High               :	VAEntrypointVLD
      VAProfileNone                   :	VAEntrypointVideoProc
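The "Direct firmware load ... failed" check described above can be automated with a small filter. This is only a sketch: the helper name failed_nouveau_firmware is hypothetical, and the pattern is taken from the log lines quoted in this comment. On Linux, error -2 corresponds to ENOENT, i.e. the file was not found where the kernel looked for it.

```shell
#!/bin/sh
# Hypothetical helper: read a kernel log on stdin and print each nouveau
# firmware name whose direct load failed, once per name.
failed_nouveau_firmware() {
    sed -n 's|.*Direct firmware load for \(nouveau/[^ ]*\) failed with error.*|\1|p' | sort -u
}

# Usage (not executed here):
#   journalctl --dmesg --no-hostname --boot | failed_nouveau_firmware
```

An empty result for the current boot is consistent with the firmware being found, although it does not by itself prove that every engine initialised.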

Comment 15 Karol Herbst 2021-05-07 14:48:45 UTC
(In reply to Tony Cook from comment #14)

Oh, I think I just wasn't clear enough last time. I meant different firmware, which you enable with nouveau.config=NvGrUseFW=1; this loads even more firmware from the NVIDIA driver.

Comment 16 Tony Cook 2021-05-07 14:52:16 UTC
Haha, I see you are wise in the ways of the GPU. No, I am not familiar with the more arcane powers available to the nouveau elite.

Comment 17 Tony Cook 2021-05-07 14:53:47 UTC
Where exactly do you place this incantation?

Comment 18 Karol Herbst 2021-05-07 15:04:36 UTC
(In reply to Tony Cook from comment #17)
> Where exactly do you place this incantation?

It's a kernel module parameter. Either add nouveau.config=NvGrUseFW=1 to the kernel command line, or put "options nouveau config=NvGrUseFW=1" in a modprobe configuration file (and then regenerate the initramfs and such).
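The two ways of setting the parameter might look like this in practice. This is a sketch under assumptions: the helper names and the file name nouveau-fw.conf are made up for illustration, while /etc/modprobe.d, "dracut -f" and grubby's --update-kernel/--args options are the standard Fedora mechanisms.

```shell
#!/bin/sh
# Hypothetical helpers: print the two equivalent spellings of a nouveau
# "config" option such as NvGrUseFW=1.
nouveau_modprobe_line() {
    # Form used inside a file under /etc/modprobe.d/
    printf 'options nouveau config=%s\n' "$1"
}

nouveau_cmdline_arg() {
    # Form used on the kernel command line
    printf 'nouveau.config=%s\n' "$1"
}

# Usage as root (not executed here); pick ONE of the two variants:
#   nouveau_modprobe_line NvGrUseFW=1 > /etc/modprobe.d/nouveau-fw.conf   # file name is an assumption
#   dracut -f                                                            # regenerate the initramfs
# or:
#   grubby --update-kernel=ALL --args="$(nouveau_cmdline_arg NvGrUseFW=1)"
```

Either variant only takes effect after a reboot, since nouveau is loaded early during boot.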

Comment 19 Tony Cook 2021-05-07 15:09:43 UTC
It's been a while but I found where the kernel parameters go these days and added it, unfortunately it gives:

[    2.691662] nouveau 0000:01:00.0: Direct firmware load for nouveau/nve4_fuc409c failed with error -2
[    2.691680] nouveau 0000:01:00.0: Direct firmware load for nouveau/fuc409c failed with error -2
[    2.691683] nouveau 0000:01:00.0: gr: failed to load fuc409c

I checked and the firmware blob nve4_fuc409c is in /lib/firmware/nouveau, so what next?

Comment 20 Karol Herbst 2021-05-10 12:27:31 UTC
(In reply to Tony Cook from comment #19)
> It's been a while but I found where the kernel parameters go these days and
> added it, unfortunately it gives:
> 
> [    2.691662] nouveau 0000:01:00.0: Direct firmware load for
> nouveau/nve4_fuc409c failed with error -2
> [    2.691680] nouveau 0000:01:00.0: Direct firmware load for
> nouveau/fuc409c failed with error -2
> [    2.691683] nouveau 0000:01:00.0: gr: failed to load fuc409c
> 
> I checked and the firmware blob nve4_fuc409c is in /lib/firmware/nouveau, so
> what next?

Yeah, I am not sure why those files are not picked up automatically. I've created a file for dracut to add them on my GK106. You'd need to replace nve6 with nve4 and regenerate the initramfs with "dracut -f".

> $ cat /etc/dracut.conf.d/nouveau_fw.conf 
> install_items+=" /usr/lib/firmware/nouveau/nve6_fuc409c /usr/lib/firmware/nouveau/nve6_fuc409d /usr/lib/firmware/nouveau/nve6_fuc41ac /usr/lib/firmware/nouveau/nve6_fuc41ad "

And sorry for the late response.
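Adapting the dracut file above to another chipset can be sketched as a small generator. The helper name nouveau_fw_dracut_conf is hypothetical; the four firmware suffixes and the /usr/lib/firmware/nouveau directory are taken from the configuration quoted in this comment, and it is an assumption that the same four files are the right set for nve4.

```shell
#!/bin/sh
# Hypothetical helper: print a dracut.conf.d line that forces the four gr
# firmware files for one chipset (e.g. nve4) into the initramfs.
#   $1 = chipset prefix, $2 = firmware directory (optional)
nouveau_fw_dracut_conf() {
    chip=$1
    dir=${2:-/usr/lib/firmware/nouveau}
    items=""
    for fw in fuc409c fuc409d fuc41ac fuc41ad; do
        items="$items $dir/${chip}_$fw"
    done
    # dracut expects leading and trailing spaces inside the quotes
    printf 'install_items+="%s "\n' "$items"
}

# Usage as root (not executed here):
#   nouveau_fw_dracut_conf nve4 > /etc/dracut.conf.d/nouveau_fw.conf
#   dracut -f
#   lsinitrd | grep nouveau/nve4_fuc     # check that the files made it in
```

Called with nve6, the function reproduces the line quoted above character for character.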

Comment 21 Tony Cook 2021-05-10 12:45:33 UTC
No worries, I'll give it a go first thing in the morning.

Comment 22 Tony Cook 2021-05-12 05:01:54 UTC
All done, now waiting for outcome
PS. the attempt to make extra debug appears to fail, any comment?

[    0.000000] Command line: BOOT_IMAGE=(hd0,msdos1)/vmlinuz-5.11.17-300.fc34.x86_64 root=UUID=e840c76c-d593-4b6d-a04a-d1fb3d365a08 nouveau.config=NvGrUseFW=1 nouveau.debug=DEVICE=debug,PFIFO=debug,PGRAPH=debug,SW=debug,PDISP=debug,DEVINIT=debug ro rhgb quiet
[    0.224797] Kernel command line: BOOT_IMAGE=(hd0,msdos1)/vmlinuz-5.11.17-300.fc34.x86_64 root=UUID=e840c76c-d593-4b6d-a04a-d1fb3d365a08 nouveau.config=NvGrUseFW=1 nouveau.debug=DEVICE=debug,PFIFO=debug,PGRAPH=debug,SW=debug,PDISP=debug,DEVINIT=debug ro rhgb quiet
[    2.547519] nouveau 0000:01:00.0: vgaarb: deactivate vga console
[    2.548686] nouveau 0000:01:00.0: NVIDIA GK104 (0e4030a2)
[    2.689871] nouveau 0000:01:00.0: bios: version 80.04.4b.00.14
[    2.693269] nouveau 0000:01:00.0: fb: 2048 MiB GDDR5
[    2.779877] nouveau 0000:01:00.0: DRM: VRAM: 2048 MiB
[    2.779879] nouveau 0000:01:00.0: DRM: GART: 1048576 MiB
[    2.779882] nouveau 0000:01:00.0: DRM: TMDS table version 2.0
[    2.779884] nouveau 0000:01:00.0: DRM: DCB version 4.0
[    2.779886] nouveau 0000:01:00.0: DRM: DCB outp 00: 01000f02 00020030
[    2.779888] nouveau 0000:01:00.0: DRM: DCB outp 01: 02000f00 00000000
[    2.779890] nouveau 0000:01:00.0: DRM: DCB outp 02: 08011f82 00020030
[    2.779891] nouveau 0000:01:00.0: DRM: DCB outp 03: 02822f62 0f420010
[    2.779893] nouveau 0000:01:00.0: DRM: DCB outp 05: 04833fb6 0f420010
[    2.779894] nouveau 0000:01:00.0: DRM: DCB outp 06: 04033f72 00020010
[    2.779896] nouveau 0000:01:00.0: DRM: DCB conn 00: 00001030
[    2.779898] nouveau 0000:01:00.0: DRM: DCB conn 01: 00020131
[    2.779899] nouveau 0000:01:00.0: DRM: DCB conn 02: 00002261
[    2.779900] nouveau 0000:01:00.0: DRM: DCB conn 03: 00010346
[    2.779902] nouveau 0000:01:00.0: DRM: DCB conn 04: 00000460
[    2.781676] nouveau 0000:01:00.0: DRM: MM: using COPY for buffer copies
[    2.980294] nouveau 0000:01:00.0: DRM: allocated 1920x1200 fb: 0xa0000, bo 0000000078442478
[    2.994863] fbcon: nouveaudrmfb (fb0) is primary device
[    3.138737] nouveau 0000:01:00.0: [drm] fb0: nouveaudrmfb frame buffer device
[    3.190390] [drm] Initialized nouveau 1.3.1 20120801 for 0000:01:00.0 on minor 0
[    5.251740] snd_hda_intel 0000:01:00.1: bound 0000:01:00.0 (ops nv50_audio_component_bind_ops [nouveau])

$ vainfo
libva info: VA-API version 1.11.0
libva info: Trying to open /usr/lib64/dri/nouveau_drv_video.so
libva info: Found init function __vaDriverInit_1_11
libva info: va_openDriver() returns 0
vainfo: VA-API version: 1.11 (libva 2.11.0)
vainfo: Driver version: Mesa Gallium driver 21.0.3 for NVE4
vainfo: Supported profile and entrypoints
      VAProfileMPEG2Simple            :	VAEntrypointVLD
      VAProfileMPEG2Main              :	VAEntrypointVLD
      VAProfileVC1Simple              :	VAEntrypointVLD
      VAProfileVC1Main                :	VAEntrypointVLD
      VAProfileVC1Advanced            :	VAEntrypointVLD
      VAProfileH264ConstrainedBaseline:	VAEntrypointVLD
      VAProfileH264Main               :	VAEntrypointVLD
      VAProfileH264High               :	VAEntrypointVLD
      VAProfileNone                   :	VAEntrypointVideoProc
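Whether the parameter actually reached the running kernel can also be checked without reading the whole boot log. A minimal sketch, assuming a hypothetical helper named cmdline_has_arg that reads a command line on stdin; /proc/cmdline is the standard place where the kernel exposes it.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the kernel command line on stdin contains
# exactly the argument given as $1 (whole-word match, no substring matches).
cmdline_has_arg() {
    tr ' ' '\n' | grep -Fxq -- "$1"
}

# Usage (not executed here):
#   cmdline_has_arg nouveau.config=NvGrUseFW=1 < /proc/cmdline \
#       && echo "NvGrUseFW=1 is on the kernel command line"
```

This only confirms that the argument was passed; whether nouveau then found the firmware still has to be read from the kernel log.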

Comment 23 Karol Herbst 2021-05-12 11:12:02 UTC
(In reply to Tony Cook from comment #22)
> All done, now waiting for outcome

Yeah, good luck with that. If it helps, it would indicate that the firmware issue is indeed more widespread than initially believed. It would at least explain why you are hitting the bug.

> PS. the attempt to make extra debug appears to fail, any comment?
> 

I think for most subsystems the name is written without the leading "P", but I'm not sure; the docs might be outdated. Mostly I just enable debugging for everything, and then you also get the proper channel names.


Comment 24 Tony Cook 2021-05-13 04:17:33 UTC
I have bad news I'm afraid. Since loading the firmware as you suggested I have had no further occurrences of anything untoward with the behaviour of the display. I have now been up continuously since starting at 2021-05-12 05:01:54 UTC and dmesg yields exactly what it did at that time. Now, maybe some other update outside of nouveau has had this effect, perhaps I should revert to bare metal and nouveau to see what happens then.

Comment 25 Karol Herbst 2021-05-13 13:24:33 UTC
(In reply to Tony Cook from comment #24)
> I have bad news I'm afraid. Since loading the firmware as you suggested I
> have had no further occurrences of anything untoward with the behaviour of
> the display. I have now been up continuously since starting at 2021-05-12
> 05:01:54 UTC and dmesg yields exactly what it did at that time. Now, maybe
> some other update outside of nouveau has had this effect, perhaps I should
> revert to bare metal and nouveau to see what happens then.

Okay, thanks for testing! Well, I wouldn't say it's bad news: it could explain a lot of other random issues we are seeing, and we can now assume that this issue doesn't only affect GK106 (which was weird from the start) but can potentially affect all first-generation Kepler cards. I will try to take a look and see if there is anything weird in our firmware.

