Bug 1376107 - kmail etc crash with 16.08.1 using nouveau
Summary: kmail etc crash with 16.08.1 using nouveau
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: xorg-x11-drv-nouveau
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Ben Skeggs
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Duplicates: 1416623
Depends On:
Blocks:
Reported: 2016-09-14 17:07 UTC by Sammy
Modified: 2018-05-07 14:27 UTC (History)
CC List: 21 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-05-07 14:27:18 UTC


Links
System ID Priority Status Summary Last Updated
Qt Bug Tracker QTBUG-41242 None None None 2016-09-19 11:31:57 UTC
FreeDesktop.org 91632 None None None 2016-09-19 11:31:29 UTC

Description Sammy 2016-09-14 17:07:03 UTC
KMail crashes almost immediately when using kdepim 16.08.1.
This seems to originate from one of QtWebEngine, Qt, or nouveau.
I think this bug was reported earlier. Starting kmail with LIBGL_ALWAYS_SOFTWARE=1
prevents the crash. Things work fine with intel graphics.
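
For anyone who wants to apply this workaround per application rather than for the whole session, a minimal shell sketch follows. The helper name run_soft is made up for illustration. LIBGL_ALWAYS_SOFTWARE is the Mesa variable mentioned above; it makes Mesa render in software, so drawing will likely be slower.

```shell
# Run a single command with Mesa software rendering forced.
# run_soft is a hypothetical helper name, not part of any package.
run_soft() {
    env LIBGL_ALWAYS_SOFTWARE=1 "$@"
}

# Example: start kmail without touching the hardware GL driver.
#   run_soft kmail
```

The variable is set only for the launched program, so other applications keep using hardware acceleration.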

Comment 1 Kevin Kofler 2016-09-14 22:06:41 UTC
I already reassigned several QtWebEngine crash reports with full backtraces pointing clearly to Nouveau code to xorg-x11-drv-nouveau. I still have not seen any reply by a Nouveau maintainer.

Comment 2 Herman Grootaers 2016-09-19 02:26:16 UTC
Sorry, I was too quick with saving the changes.

This bug broke my F24 on an AMD-based system on the 18th of September. Not funny, because I like to have no updates waiting.

Is there a way to prevent this kind of situation? I think this bug breaks all kinds of things; akregator is affected in the same way.

Comment 3 Kevin Kofler 2016-09-19 11:30:37 UTC
> This bug broke my F24 on an AMD-based system on the 18th of September.

Wait, with AMD-based, you mean including the GPU? Then it cannot possibly be this bug, which is only on NVidia hardware. Or do you mean AMD CPU + NVidia GPU?


Anybody seeing QtWebEngine (i.e., QupZilla 2, PIM 16.08, etc.) crashes with the Nouveau driver can try these experimental Mesa packages:
https://copr.fedorainfracloud.org/coprs/kkofler/qtwebengine/build/454558/

This applies the experimental patches from:
https://github.com/imirkin/mesa/commits/locking
(except the s/nouveau/noouveau/ hack the author added to allow testing multithreading with Warsow). The feedback in the upstream bug report looks positive. But the author of the patches says they're incomplete. I think it needs to be a really high priority to complete them. The patchset hasn't been touched for a few weeks now, unfortunately.

Comment 4 Rex Dieter 2016-09-19 11:54:28 UTC
Re: comment #2

if you can provide a backtrace for your crash, that would help diagnose what is going wrong (which is most likely different than the nouveau-specific issue being tracked here).

Comment 5 Rex Dieter 2016-09-19 11:54:49 UTC
sorry, clearing needinfo

Comment 6 Sammy 2016-09-19 13:30:43 UTC
Trying the 12.0.3-1.fc24.nouvlock packages from comment #3 seems to solve the
problem (at least up to this point). No crash seen using kmail for 10 minutes to
do a bunch of things. FYI

Comment 7 Herman Grootaers 2016-09-19 14:55:49 UTC
Re: comment 3

Yep, AMD processor with 4 cores; GeForce GTX 650 on the PCI bus for the main or primary screen and Radeon HD 6530D on the motherboard for the secondary screen. Works well when you are editing text from PDF to ODT to TeX, but I need my e-mail and feed reader for the news.

Setup: main -> secondary; repositioning them is not possible due to the space under the screens. Under the secondary is storage; under the main is legroom and thus my keyboard.

Comment 8 Kevin Kofler 2016-09-19 15:27:08 UTC
> GeForce GTX 650 on the PCI bus

So you are probably seeing the Nouveau bug too. Can you please try my Mesa build I linked to above?

Comment 9 Kevin Kofler 2016-09-21 02:12:06 UTC
So, all the users who tried my builds from:
https://copr.fedorainfracloud.org/coprs/kkofler/qtwebengine/build/454558/
so far reported that it fixed the crashes for them. Any chance we can get these patches applied in the official Fedora mesa packages? And also merged into upstream Mesa?

Comment 10 Herman Grootaers 2016-09-22 18:38:38 UTC
Re: comment 9

Well, I have installed the Copr packages (it took me a bit longer to get started) and have been running them for over a day now. No crashes of kmail or akregator detected.

Seems a good plan.

Also, is it possible to get this bug on the F25-beta-blocker or F25-final-blocker list? That way we ensure that KDE is working and not causing a headache when upgrading the system. I caught this because I run a fully patched and updated system, but many people update only once, at the system upgrade.

Comment 11 Sammy 2016-10-11 14:31:01 UTC
I noticed that a new mesa build is coming with some new patches. Could
we get the nouveau patches in? If the update replaces this version, then the
problem will resurface.

Comment 12 Kevin Kofler 2016-10-11 15:34:21 UTC
I can update my build. I could technically also commit the patches to the official package (since I'm a provenpackager), but I really don't want to do that without the maintainer's approval, people would probably not like it.

Comment 13 Sammy 2016-10-11 16:49:20 UTC
That would be good. Thanks.

Comment 14 Kevin Kofler 2016-10-11 21:50:44 UTC
I built an updated mesa in the qtwebengine Copr, but the F25 x86_64 build is failing due to some buildroot inconsistency (mismatched systemd-devel vs. systemd); the other targets succeeded.

Comment 15 Sammy 2016-10-12 16:05:51 UTC
Did they all fail?

Comment 16 Kevin Kofler 2016-10-12 19:21:16 UTC
It's fixed now.

Comment 17 Sammy 2016-11-22 20:41:11 UTC
I am updating to Fedora 25 and could not find the f25 versions of these. Could
you point me to the link please. Thanks.

Comment 18 Kevin Kofler 2016-11-22 20:50:12 UTC
The latest is always at:
https://copr.fedorainfracloud.org/coprs/kkofler/qtwebengine/package/mesa/
(or just enable the kkofler/qtwebengine Copr altogether, but I might want to also put other testing packages there).

Comment 19 info@kobaltwit.be 2016-12-03 16:10:48 UTC
Just to add my voice here: the patches do allow me to use kontact normally with nouveau. Without the patches, it was completely unusable.

So please, even if upstream mesa maintainers say they need more work, add them to the fedora rpms in the interim.

Comment 20 Herman Grootaers 2016-12-07 19:45:56 UTC
Well, I am not sure this is a new bug but something very ugly happened when I upgraded my system to F25.

KDE does not start correctly. It tries to start kwallet, but then for some reason the dbus-daemon goes defunct. Screen 2 and the other screens offer a text login; screen 1, by default the KDE GUI, freezes, and returning from screen 2 gives a new start screen.

Luckily fluxbox works, but I did not get it working with two screens.

Strange behaviour for my secondary screen: it shows the last screen from when I restarted the system for the reboot.

System is up-to date with patches/upgrades.

Systemupgrade went without problems.

I do not know if this is a new bug or something related to my hardware and screen-layout.

In the beginning I got this message: "Could not start D-Bus. Can you call qdbus?" in a window, but after an upgrade or install it disappeared and has not been shown again.

Problem is I cannot get any clues from google.

Please help!

Comment 21 Rex Dieter 2016-12-08 12:58:50 UTC
Re: comment #20

that is a new/different issue (if dbus service isn't starting, you have bigger problems)

Comment 22 Herman Grootaers 2016-12-08 13:52:15 UTC
Yes, I thought so.

But now how to proceed correctly:

dbus has started, but is going defunct (that is crashing) for some reason, and I do not know where to start in finding the starting-point to hunt down this bug, or get help in resolving this bug.

All help is needed, because I do not know enough to start the hunt on my own.

Comment 23 Rex Dieter 2016-12-08 14:02:05 UTC
https://fedoraproject.org/wiki/Communicating_and_getting_help

outlines several methods, I'd suggest using forums and/or irc to start with

Comment 24 Ali Akcaagac 2016-12-13 15:50:43 UTC
I think there is even more to that!

I am experiencing similar crashes and lockups with google-chrome (stable, beta, devel) *since* I switched from Fedora 24 to Fedora 25.

When running google-chrome, after a few minutes (there is no strict rule to it) it freezes and leaves the entire desktop in an unusable state.

I started googling for this issue and found a couple of other people who reported it on either their blogs or some random bug trackers.

I also came across the page from the freedesktop people.

It was said in one bugreport that this is a known issue in the nouveau driver that comes with mesa.

I was curious because I don't use the nouveau drivers from mesa. My system uses the radeon drivers from mesa.

After uninstalling mesa-dri-drivers-13.0.2-1.fc25.x86_64, which forces google-chrome to run in software mode (or disabled-GPU mode), the freezing and locking went away.

So it looks like the error is not limited to nouveau but also extends to the radeon Mesa drivers, or to some other part of Mesa that may trigger this issue for both drivers, or to some other part that deals with DRI.

Comment 25 Anthony Messina 2016-12-18 22:01:45 UTC
(In reply to Kevin Kofler from comment #18)
> The latest is always at:
> https://copr.fedorainfracloud.org/coprs/kkofler/qtwebengine/package/mesa/
> (or just enable the kkofler/qtwebengine Copr altogether, but I might want to
> also put other testing packages there).

Kevin's nouvlock mesa builds fix this issue for me.

Sammy, can you update the version to 25?

Comment 26 Ben Skeggs 2016-12-18 23:52:24 UTC
The patches are definitely not recommended for general use, as the writer of them will tell you himself.

Also a note that I am currently working on fixing these issues.

Comment 27 Rex Dieter 2017-01-27 16:28:35 UTC
*** Bug 1416623 has been marked as a duplicate of this bug. ***

Comment 28 Anthony Messina 2017-01-29 04:26:28 UTC
Kevin, would you be willing to spin up another set of your nouvlock RPMs? After the mesa upgrade (https://bodhi.fedoraproject.org/updates/FEDORA-2017-9c9c0899f9), Kontact and QupZilla break again.

Comment 29 Kevin Kofler 2017-01-29 09:00:27 UTC
Yawn, don't update your mesa? :-)

I'll have a look at yet another new build tonight or tomorrow, after I'm back from DevConf Brno.

Comment 30 Anthony Messina 2017-01-29 18:03:23 UTC
(In reply to Kevin Kofler from comment #29)
> Yawn, don't update your mesa? :-)

Well, I keep hoping that one of these mesa updates will address the issue ;)
 
> I'll have a look at yet another new build tonight or tomorrow, after I'm
> back from DevConf Brno.

Take your time.  I've downgraded the mesa RPMs.

Comment 31 Kevin Kofler 2017-01-29 23:31:07 UTC
I updated my package, but I cannot build it because libglvnd is not in the stable updates. It will have to wait until they fix that fuckup.

Comment 32 info@kobaltwit.be 2017-02-28 18:36:38 UTC
(In reply to Ben Skeggs from comment #26)
> Also a note that I am currently working on fixing these issues.

Any update on this? Is there a place where we can track your work?

Comment 33 info@kobaltwit.be 2017-05-07 12:19:23 UTC
A recent mesa update broke my Fedora 25/nouveau system once again this week :(

I keep hoping the next update will fix this issue, but it appears to be stalled. The last notification from a developer was in December, and it's May now. Any update on this?

Comment 34 Kevin Kofler 2017-05-07 13:39:04 UTC
You have to downgrade to the latest build from my Copr. I am going to build a new one now. Never upgrade Mesa from Fedora, I don't think it will ever be fixed.

Comment 35 Kevin Kofler 2017-05-07 14:49:54 UTC
mesa-17.0.5-2.fc25.nouvlock is now built in my Copr.

Comment 36 Sammy 2017-05-07 15:19:27 UTC
Why don't they accept these patches into Fedora mesa? I switched to using
nvidia rpmfusion binaries instead and have no problems.

Comment 37 info@kobaltwit.be 2017-05-07 16:57:13 UTC
Thanks Kevin for yet another nouvlock build. I was aware I could downgrade. My comment 33 was not intended to criticize you in any way.

By the way, I had also failed to notice the workaround posted by the original reporter:
Starting the misbehaving application(s) with LIBGL_ALWAYS_SOFTWARE=1 also avoids the crashes. I presume this is because the nouveau driver is simply avoided in this case...

So while there are ways around it I really would like some feedback from the Assignee or someone else with sufficient knowledge on whether there are plans to get this fixed or not.

Comment 38 Colin J Thomson 2017-09-08 18:31:04 UTC
FWIW, still seeing this in Fedora 26 and the current mesa/nouveau builds and the latest kernel from koji - 4.12.11-300.fc26.x86_64

I see this in the shell, most likely not helpful at all but here is a snippet:

kontact: pushbuf.c:727: nouveau_pushbuf_data: Assertion `kref' failed.
*** KMail got signal 6 (Exiting)
*** Dead letters dumped.
nouveau: kernel rejected pushbuf: Bad file descriptor
nouveau: ch3: krec 0 pushes 1 bufs 1 relocs 0
nouveau: ch3: buf 00000000 00000002 00000004 00000004 00000000
nouveau: ch3: psh 00000000 0000033604 00000337dc
nouveau:        0x80000671
nouveau:        0x800004b9
nouveau:        0x80ff0e04
nouveau:        0x200504d0
nouveau:        0x00008006

<snip>

[4859:4883:0908/191639.545897:ERROR:gles2_cmd_decoder.cc(2439)] [.RenderWorker-0x55bcc2525380]GL ERROR :GL_OUT_OF_MEMORY : ScopedTextureBinder::dtor: <- error from previous GL command
[4859:4883:0908/191639.545943:ERROR:gles2_cmd_decoder.cc(5202)] Error: 5 for Command kCopySubTextureCHROMIUM
[4859:4883:0908/191639.545982:ERROR:gles2_cmd_decoder.cc(4169)]   GLES2DecoderImpl: Trying to make lost context current.
[4859:4883:0908/191639.545997:ERROR:gles2_cmd_decoder.cc(4169)]   GLES2DecoderImpl: Trying to make lost context current.
[4859:4883:0908/191639.546009:ERROR:gles2_cmd_decoder.cc(4169)]   GLES2DecoderImpl: Trying to make lost context current.
[4859:4883:0908/191639.548041:ERROR:gles2_cmd_decoder.cc(4169)]   GLES2DecoderImpl: Trying to make lost context current.
KCrash: Application 'kontact' crashing...
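
To check whether a saved terminal log shows the same failure, something like the following could be used. This is only a sketch: the function name has_pushbuf_crash is hypothetical, and the two patterns are simply copied from the snippet above, not from any documented list of nouveau errors.

```shell
# Return 0 if the log on stdin contains either nouveau pushbuf message
# seen in the snippet above, 1 otherwise.
# has_pushbuf_crash is a hypothetical helper name.
has_pushbuf_crash() {
    grep -q -e 'nouveau_pushbuf_data: Assertion' \
            -e 'nouveau: kernel rejected pushbuf'
}

# Example (kontact.log is a hypothetical file name):
#   kontact 2>&1 | tee kontact.log
#   has_pushbuf_crash < kontact.log && echo "matches this bug's signature"
```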

Comment 39 Frank Danapfel 2017-09-27 09:23:18 UTC
(In reply to Colin J Thomson from comment #38)
> FWIW, still seeing this in Fedora 26 and the current mesa/nouveau builds and
> the latest kernel from koji - 4.12.11-300.fc26.x86_64
> 
> I see this in the shell, most likely not helpful at all but here is a
> snippet:
> 
> kontact: pushbuf.c:727: nouveau_pushbuf_data: Assertion `kref' failed.
> *** KMail got signal 6 (Exiting)
> *** Dead letters dumped.
> nouveau: kernel rejected pushbuf: Bad file descriptor
> nouveau: ch3: krec 0 pushes 1 bufs 1 relocs 0
> nouveau: ch3: buf 00000000 00000002 00000004 00000004 00000000
> nouveau: ch3: psh 00000000 0000033604 00000337dc
> nouveau:        0x80000671
> nouveau:        0x800004b9
> nouveau:        0x80ff0e04
> nouveau:        0x200504d0
> nouveau:        0x00008006
> 
> <snip>
> 
> [4859:4883:0908/191639.545897:ERROR:gles2_cmd_decoder.cc(2439)]
> [.RenderWorker-0x55bcc2525380]GL ERROR :GL_OUT_OF_MEMORY :
> ScopedTextureBinder::dtor: <- error from previous GL command
> [4859:4883:0908/191639.545943:ERROR:gles2_cmd_decoder.cc(5202)] Error: 5 for
> Command kCopySubTextureCHROMIUM
> [4859:4883:0908/191639.545982:ERROR:gles2_cmd_decoder.cc(4169)]  
> GLES2DecoderImpl: Trying to make lost context current.
> [4859:4883:0908/191639.545997:ERROR:gles2_cmd_decoder.cc(4169)]  
> GLES2DecoderImpl: Trying to make lost context current.
> [4859:4883:0908/191639.546009:ERROR:gles2_cmd_decoder.cc(4169)]  
> GLES2DecoderImpl: Trying to make lost context current.
> [4859:4883:0908/191639.548041:ERROR:gles2_cmd_decoder.cc(4169)]  
> GLES2DecoderImpl: Trying to make lost context current.
> KCrash: Application 'kontact' crashing...

Seeing the same issue on F26. Downgrading the mesa packages to the version from the COPR repo provided by Kevin Kofler fixed it for me:

# dnf repolist
Last metadata expiration check: 18 days, 23:42:38 ago on Free 08. Sep 2017 11:38:22 CEST.
repo id               repo name                                     status
*fedora               Fedora 26 - x86_64                            53.912
kkofler-qtwebengine   Copr repo for qtwebengine owned by kkofler        27
*updates
# sudo dnf downgrade mesa-libgbm-17.1.2-2.fc26.nouvlock mesa-filesystem-17.1.2-2.fc26.nouvlock mesa-libEGL-17.1.2-2.fc26.nouvlock mesa-libGL-17.1.2-2.fc26.nouvlock mesa-libglapi-17.1.2-2.fc26.nouvlock mesa-libwayland-egl-17.1.2-2.fc26.nouvlock mesa-libxatracker-17.1.2-2.fc26.nouvlock mesa-dri-drivers-17.1.2-2.fc26.nouvlock
...
# rpm -qa|grep -i mesa
mesa-libxatracker-17.1.2-2.fc26.nouvlock.x86_64
mesa-libglapi-17.1.2-2.fc26.nouvlock.x86_64
mesa-libEGL-17.1.2-2.fc26.nouvlock.x86_64
mesa-filesystem-17.1.2-2.fc26.nouvlock.x86_64
mesa-dri-drivers-17.1.2-2.fc26.nouvlock.x86_64
mesa-libwayland-egl-17.1.2-2.fc26.nouvlock.x86_64
mesa-libgbm-17.1.2-2.fc26.nouvlock.x86_64
mesa-libGL-17.1.2-2.fc26.nouvlock.x86_64
mesa-libGLU-9.0.0-11.fc26.x86_64
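
After a later dnf upgrade, a quick way to see whether any mesa package has been replaced by a stock Fedora build might look like the sketch below. The helper name unpatched_mesa is made up, and it assumes the patched builds keep the ".nouvlock" tag in their release string as in the listing above. mesa-libGLU is skipped because it has no nouvlock build in that listing.

```shell
# Print mesa packages that do not carry the ".nouvlock" release tag.
# Reads package names, one per line, in the form printed by `rpm -qa`.
# unpatched_mesa is a hypothetical helper name.
unpatched_mesa() {
    grep '^mesa-' \
        | grep -v '^mesa-libGLU-' \
        | grep -v '\.nouvlock\.' \
        || true   # no match is not an error: it means everything is patched
}

# Example:
#   rpm -qa | unpatched_mesa
```

Empty output means every listed mesa package still comes from the patched build.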

Comment 40 Kevin Kofler 2017-09-27 11:06:29 UTC
I pushed updated mesa builds to my Copr. You shouldn't ever have to downgrade mesa, ask me to update the packages instead. (Unfortunately, the Fedora mesa is a moving target, it is updated very frequently for issues affecting maybe a handful users while this showstopper is being systematically ignored.)

Comment 41 Frank Danapfel 2017-09-28 13:31:16 UTC
Kevin, thanks a lot for the updated mesa builds. I've upgraded the mesa packages to the latest version from your Copr repo and everything still works.

Is there anything that can be done to convince the mesa maintainers to stop ignoring this showstopper?

Comment 42 Colin J Thomson 2017-10-07 20:57:30 UTC
Thanks Kevin for keeping up with your builds.
I'm now testing F27 and on occasion get a total lockup of the box and a hard reset is needed. Related or not I don't know.
Here is some info I managed to recover but most likely no useful info at all for the Devs:

[ 7959.189037] WARNING: CPU: 3 PID: 5277 at drivers/gpu/drm/drm_atomic_helper.c:1682 drm_atomic_helper_commit_hw_done+0x93/0xa0 [drm_kms_helper]
[ 7959.189038] Modules linked in: rfcomm ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables cmac bnep hwmon_vid btusb btrtl btbcm btintel bluetooth edac_mce_amd ecdh_generic kvm_amd rfkill kvm irqbypass ppdev joydev snd_hda_codec_via snd_hda_codec_generic k10temp snd_hda_codec_hdmi pcspkr parport_pc acpi_cpufreq snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep parport shpchp asus_atk0110 i2c_nforce2 snd_seq snd_seq_device snd_pcm snd_timer snd soundcore raid1
[ 7959.189055]  nouveau ata_generic video mxm_wmi wmi i2c_algo_bit drm_kms_helper ttm drm serio_raw pata_acpi forcedeth sata_nv pata_amd
[ 7959.189061] CPU: 3 PID: 5277 Comm: kworker/u12:2 Not tainted 4.13.3-301.fc27.x86_64 #1
[ 7959.189062] Hardware name: System manufacturer System Product Name/M4N68T-M-V2, BIOS 1001    12/21/2011
[ 7959.189138] Workqueue: events_unbound nv50_disp_atomic_commit_work [nouveau]
[ 7959.189139] task: ffff8d15f3b58000 task.stack: ffffb6c3475cc000
[ 7959.189144] RIP: 0010:drm_atomic_helper_commit_hw_done+0x93/0xa0 [drm_kms_helper]
[ 7959.189145] RSP: 0018:ffffb6c3475cfda8 EFLAGS: 00010282
[ 7959.189145] RAX: ffff8d15eca5d000 RBX: 0000000000000000 RCX: ffff8d16f0f07000
[ 7959.189146] RDX: ffff8d161c049b00 RSI: 000000000000004c RDI: ffff8d15f7128600
[ 7959.189146] RBP: ffffb6c3475cfdc8 R08: ffffffffc01bad90 R09: 0000000000000004
[ 7959.189147] R10: 0000000000000000 R11: ffff8d1662ea4400 R12: ffff8d16f0d94000
[ 7959.189147] R13: ffff8d15f7128a80 R14: ffff8d15f7128600 R15: ffff8d16f1c77810
[ 7959.189149] FS:  0000000000000000(0000) GS:ffff8d16ffcc0000(0000) knlGS:0000000000000000
[ 7959.189149] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 7959.189150] CR2: 00007fc0fc372140 CR3: 00000001338ea000 CR4: 00000000000006e0
[ 7959.189151] Call Trace:
[ 7959.189174]  nv50_disp_atomic_commit_tail+0x847/0x39e0 [nouveau]
[ 7959.189195]  nv50_disp_atomic_commit_work+0x12/0x20 [nouveau]
[ 7959.189199]  process_one_work+0x193/0x3c0
[ 7959.189201]  worker_thread+0x4a/0x3a0
[ 7959.189203]  kthread+0x125/0x140
[ 7959.189204]  ? process_one_work+0x3c0/0x3c0
[ 7959.189205]  ? kthread_park+0x60/0x60
[ 7959.189208]  ret_from_fork+0x25/0x30
[ 7959.189209] Code: e8 49 8d 7d 30 e8 6e fc fa e7 41 c6 84 24 28 04 00 00 00 49 8b 4e 08 83 c3 01 39 99 38 03 00 00 7f 9d 5b 41 5c 41 5d 41 5e 5d c3 <0f> ff eb c5 f3 c3 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 
[ 7959.189220] ---[ end trace 37278333e2e4231b ]---
[ 7959.190969] nouveau 0000:02:00.0: gr: TRAP ch 2 [003fa63000 Xorg[1026]]
[ 7959.190979] nouveau 0000:02:00.0: gr: GPC0/TPC0/TEX: 80000049
[ 7959.190983] nouveau 0000:02:00.0: gr: GPC0/TPC1/TEX: 80000049
[ 7959.190994] nouveau 0000:02:00.0: fifo: read fault at 0006c80000 engine 00 [GR] client 01 [GPC0/T1_0] reason 02 [PTE] on channel 2 [003fa63000 Xorg[1026]]
[ 7959.191003] nouveau 0000:02:00.0: fifo: channel 2: killed
[ 7959.191004] nouveau 0000:02:00.0: fifo: runlist 0: scheduled for recovery
[ 7959.191008] nouveau 0000:02:00.0: fifo: engine 0: scheduled for recovery
[ 7959.191018] nouveau 0000:02:00.0: Xorg[1026]: channel 2 killed!
[ 7969.254103] [drm:drm_atomic_helper_commit_cleanup_done [drm_kms_helper]] *ERROR* [CRTC:38:head-0] flip_done timed out
[ 7974.198059] nouveau 0000:02:00.0: Xorg[1026]: failed to idle channel 9 [Xorg[1026]]
[ 7974.198162] [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CRTC:38:head-0] flip_done timed out
[ 7989.198122] nouveau 0000:02:00.0: Xorg[1026]: failed to idle channel 9 [Xorg[1026]]
[ 7989.198253] nouveau 0000:02:00.0: fifo: read fault at 0000013000 engine 07 [HOST0] client 07 [HOST_CPU] reason 02 [PTE] on channel 9 [003f611000 Xorg[1026]]
[ 7989.198502] nouveau 0000:02:00.0: fifo: channel 9: killed
[ 7989.198506] nouveau 0000:02:00.0: fifo: runlist 0: scheduled for recovery
[ 7989.198518] nouveau 0000:02:00.0: fifo: engine 7: scheduled for recovery
[ 7989.198649] nouveau 0000:02:00.0: Xorg[1026]: channel 9 killed!
[ 8004.234169] nouveau 0000:02:00.0: Xorg[1026]: failed to idle channel 7 [Xorg[1026]]
[ 8019.234210] nouveau 0000:02:00.0: Xorg[1026]: failed to idle channel 7 [Xorg[1026]]
[ 8019.234355] nouveau 0000:02:00.0: fifo: read fault at 0000013000 engine 07 [HOST0] client 07 [HOST_CPU] reason 02 [PTE] on channel 7 [003f74d000 Xorg[1026]]
[ 8019.234592] nouveau 0000:02:00.0: fifo: channel 7: killed
[ 8019.234597] nouveau 0000:02:00.0: fifo: runlist 0: scheduled for recovery
[ 8019.234874] nouveau 0000:02:00.0: Xorg[1026]: channel 7 killed!

Comment 43 Kevin Kofler 2017-10-08 12:19:15 UTC
I think that (comment #42) is an unrelated upstream bug. It seems to be the same as bug #1463157, though that one is on EL7.

Comment 44 Frank Danapfel 2017-10-17 09:31:56 UTC
The official mesa packages for F26 have been updated again, causing kmail to crash:

$ rpm -qa --last|grep -i mesa
mesa-libxatracker-17.2.2-2.fc26.x86_64        Free 13. Okt 2017 17:06:54 CEST
mesa-libwayland-egl-17.2.2-2.fc26.x86_64      Free 13. Okt 2017 17:06:54 CEST
mesa-libGL-17.2.2-2.fc26.x86_64               Free 13. Okt 2017 17:06:47 CEST
mesa-libEGL-17.2.2-2.fc26.x86_64              Free 13. Okt 2017 17:06:43 CEST
mesa-dri-drivers-17.2.2-2.fc26.x86_64         Free 13. Okt 2017 17:06:43 CEST
mesa-filesystem-17.2.2-2.fc26.x86_64          Free 13. Okt 2017 17:06:39 CEST
mesa-libgbm-17.2.2-2.fc26.x86_64              Free 13. Okt 2017 17:06:38 CEST
mesa-libglapi-17.2.2-2.fc26.x86_64            Free 13. Okt 2017 17:06:25 CEST
mesa-libGLU-9.0.0-11.fc26.x86_64              Maan 04. Sep 2017 17:43:06 CEST

Kevin, could you provide an updated version containing the fix via Copr?

Comment 45 Kevin Kofler 2017-10-20 00:24:21 UTC
Done:
https://copr.fedorainfracloud.org/coprs/kkofler/qtwebengine/build/645030/
mesa-17.2.3-1.fc2[678].nouvlock built, based on mesa-17.2.3-1.fc28 straight from Rawhide.

Comment 46 Sergio Monteiro Basto 2017-10-28 01:29:07 UTC
(In reply to Kevin Kofler from comment #3)
> Anybody seeing QtWebEngine (i.e., QupZilla 2, PIM 16.08, etc.) crashes with
> the Nouveau driver can try these experimental Mesa packages:
> https://copr.fedorainfracloud.org/coprs/kkofler/qtwebengine/build/454558/
> 
> This applies the experimental patches from:
> https://github.com/imirkin/mesa/commits/locking

This link is broken. Anyway, I'm confused: if the patch is for mesa, you should assign the bug to mesa, not to the Nouveau driver, IMO.

> (except the s/nouveau/noouveau/ hack the author added to allow testing
> multithreading with Warsow). The feedback in the upstream bug report looks
> positive. But the author of the patches says they're incomplete. I think it
> needs to be a really high priority to complete them. The patchset hasn't
> been touched for a few weeks now, unfortunately.

Which one is the patch in [1]?

[1] 
http://copr-dist-git.fedorainfracloud.org/cgit/kkofler/qtwebengine/mesa.git/tree/

(In reply to Kevin Kofler from comment #9)
> So, all the users who tried my builds from:
> https://copr.fedorainfracloud.org/coprs/kkofler/qtwebengine/build/454558/
> so far reported that it fixed the crashes for them. Any chance we can get
> these patches applied in the official Fedora mesa packages? And also merged
> into upstream Mesa?

Yes, you should try to upstream the patch (but where is the patch?), again not to the nouveau maintainers but to the mesa maintainers.

Comment 48 Kevin Kofler 2017-10-28 14:57:40 UTC
That's the main one, but there are 2 followups:
http://copr-dist-git.fedorainfracloud.org/cgit/kkofler/qtwebengine/mesa.git/tree/0003-nouveau-more-locking-make-sure-that-fence-work-is-al.patch
http://copr-dist-git.fedorainfracloud.org/cgit/kkofler/qtwebengine/mesa.git/tree/0004-nv30-locking-fixes.patch

There was a fourth patch in the branch (the 0002 that is skipped here), which was a hack changing the driver identifier reported to applications (introducing a misspelling in the word "nouveau": the patch makes nouveau_screen_get_vendor report "noouveau" (sic) instead) so that workarounds for Nouveau (e.g., in the game Warsow) would not be applied anymore. I decided against applying that patch in my builds.

Comment 49 Sergio Monteiro Basto 2017-10-28 16:02:07 UTC
May I change the bug component to mesa?

Comment 50 Sergio Monteiro Basto 2017-10-31 02:24:46 UTC
Kevin, the question is for you: may/should I change the bug component to mesa?

Comment 51 Kevin Kofler 2017-10-31 03:22:50 UTC
No.

Last time I asked a question like that, I was told by the X maintainers that hardware-specific bugs in Mesa should be assigned to the driver whose Mesa part is at fault so that it gets assigned to the right people.

So unless they reassign it themselves or tell you to reassign it, please don't.

Comment 52 Fedora End Of Life 2017-11-16 19:32:41 UTC
This message is a reminder that Fedora 25 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 25. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora 'version'
of '25'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 25 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.

Comment 53 Frank Danapfel 2017-11-17 10:50:11 UTC
Changed Fedora version in bugzilla to 26 to avoid auto-closing of this bug due to F25 becoming EOL soon.

Comment 54 Frank Danapfel 2017-11-17 12:51:21 UTC
@Kevin: unfortunately mesa on F26 has been updated again removing your patches:

$ rpm -qa --last|grep -i mesa
mesa-libwayland-egl-17.2.4-2.fc26.x86_64      Dunn 16. Nov 2017 17:59:00 CET
mesa-libxatracker-17.2.4-2.fc26.x86_64        Dunn 16. Nov 2017 17:58:48 CET
mesa-dri-drivers-17.2.4-2.fc26.x86_64         Dunn 16. Nov 2017 17:58:43 CET
mesa-filesystem-17.2.4-2.fc26.x86_64          Dunn 16. Nov 2017 17:56:15 CET
mesa-libGL-17.2.4-2.fc26.x86_64               Dunn 16. Nov 2017 17:56:11 CET
mesa-libglapi-17.2.4-2.fc26.x86_64            Dunn 16. Nov 2017 17:56:10 CET
mesa-libEGL-17.2.4-2.fc26.x86_64              Dunn 16. Nov 2017 17:56:08 CET
mesa-libgbm-17.2.4-2.fc26.x86_64              Dunn 16. Nov 2017 17:56:07 CET
mesa-libGLU-9.0.0-11.fc26.x86_64              Maan 04. Sep 2017 17:43:06 CEST

Could you therefore provide another update with the fixes via Copr?

Comment 55 Kevin Kofler 2017-11-17 15:41:38 UTC
So it's Groundhog Day again, sigh… :-(

And FESCo also refuses to do anything about this screwed up situation: https://pagure.io/fesco/issue/1785

Fedora stubbornly insists on shipping a non-working Mesa, even more than a whole year after this was brought to the maintainers' attention.

Comment 56 Kevin Kofler 2017-11-17 15:42:18 UTC
If at least they would stop shipping useless updates that do not fix the main showstopper and creating extra work for me that way…

Comment 57 Rex Dieter 2017-11-17 15:47:42 UTC
Comment #55 is not constructive in this thread, please make that the last of that type here.

Comment 58 Kevin Kofler 2017-11-17 16:34:50 UTC
New build completed:
https://copr.fedorainfracloud.org/coprs/kkofler/qtwebengine/build/676273/

Comment 59 Neal Gompa 2017-11-26 18:50:37 UTC
This is still not resolved and persists even in Rawhide.

Comment 60 Colin J Thomson 2018-01-13 22:57:31 UTC
Qt upstream has worked around this bug, and it's working well on F27.

https://bugzilla.redhat.com/show_bug.cgi?id=1350275#c30

