Bug 1678897 - [F29] firefox on wayland crash when selecting text with click-scroll-shift+click
Summary: [F29] firefox on wayland crash when selecting text with click-scroll-shift+click
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Fedora
Classification: Fedora
Component: firefox
Version: 29
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Martin Stransky
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: ffwayland
 
Reported: 2019-02-19 20:19 UTC by Chris Murphy
Modified: 2019-07-26 05:53 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2019-07-26 05:53:34 UTC
Type: Bug
Embargoed:


Attachments
coredumpctl info (48.82 KB, text/plain)
2019-02-28 23:39 UTC, Chris Murphy
crash_bt (500.16 KB, text/plain)
2019-03-01 18:06 UTC, Chris Murphy
ffwayland coredumpctl info (48.47 KB, text/plain)
2019-03-28 18:24 UTC, Chris Murphy

Description Chris Murphy 2019-02-19 20:19:36 UTC
Description of problem:

When I try to select text by clicking to set cursor location, then scrolling to a new location, then shift+clicking to select that range of text - there is a crash.


Version-Release number of selected component (if applicable):
firefox-65.0-4.fc29.x86_64

How reproducible:
100%


Steps to Reproduce:
1. Wayland session; launch "Firefox on Wayland" app icon
2. Open gmail, open a draft or reply to email that has enough text to scroll through; I'm always selecting at least 50 lines.
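3. Click to set the cursor location, scroll to a new location, then shift+click to select that range of text.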

Actual results:

Crash

Expected results:

No crash

Additional info:

- Crash doesn't happen if the regular Firefox app icon is used
- Crash may not happen for smaller ranges of text, untested.
- Crash may only happen in gmail, untested.
- This is the bug alluded to in bug 1676331, for which I'm unable to gather any crash information; maybe someone else will have more luck reproducing it.

This is the crash report the built-in mozilla crash reporter produced.
https://crash-stats.mozilla.com/report/index/e321bb3a-0865-4aff-8923-086a90190219#tab-telemetryenvironment

Comment 1 Martin Stransky 2019-02-20 22:33:30 UTC
Can you please try to obtain a backtrace according to https://fedoraproject.org/wiki/Debugging_guidelines_for_Mozilla_products#Application_crash ?

Comment 2 Chris Murphy 2019-02-20 22:40:21 UTC
That fails, as I described in bug 1676331 comment 3:
https://bugzilla.redhat.com/show_bug.cgi?id=1676331#c3

Comment 3 Chris Murphy 2019-02-22 17:26:37 UTC
I've reproduced the bug in qemu-kvm using virt-manager, although it's more difficult to reproduce there, maybe 1 in 10 attempts. I left gdb running for 11 hours and the "crash_bt" file is still zero bytes; I've done 'coredumpctl gdb' on firefox coredumps before and they take maybe a minute.

Comment 4 Martin Stransky 2019-02-28 12:27:25 UTC
Please install this build when it's done - https://koji.fedoraproject.org/koji/buildinfo?buildID=1217476 - and try to attach the ABRT report of the crash. Thanks.

Comment 5 Chris Murphy 2019-02-28 23:37:13 UTC
Feb 28 16:28:51 flap.local firefox-wayland.desktop[1688]: Exiting due to channel error.
Feb 28 16:28:51 flap.local firefox-wayland.desktop[1688]: Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=35.7221) [>
Feb 28 16:28:51 flap.local firefox-wayland.desktop[1688]: Exiting due to channel error.
Feb 28 16:28:51 flap.local firefox-wayland.desktop[1688]: [Child 3908, Chrome_ChildThread] WARNING: pipe error (33): Connection reset by peer: file /builddir/build/BUILD/fi>
Feb 28 16:28:51 flap.local firefox-wayland.desktop[1688]: [Child 3908, Chrome_ChildThread] WARNING: pipe error (3): Connection reset by peer: file /builddir/build/BUILD/fir>
Feb 28 16:28:51 flap.local firefox-wayland.desktop[1688]: Exiting due to channel error.
Feb 28 16:28:52 flap.local systemd-coredump[4117]: Process 3782 (firefox) of user 1000 dumped core.
[snipped stack trace due to length]                                              
Feb 28 16:28:53 flap.local abrtd[811]: Size of '/var/spool/abrt' >= 5000 MB (MaxCrashReportsSize), deleting old directory 'ccpp-2019-02-14-19:37:06.612833-11893'
Feb 28 16:28:54 flap.local abrt-server[4124]: Package 'firefox' isn't signed with proper key
Feb 28 16:28:54 flap.local abrt-server[4124]: 'post-create' on '/var/spool/abrt/ccpp-2019-02-28-16:28:53.316936-3782' exited with 1
Feb 28 16:28:54 flap.local abrt-server[4124]: Deleting problem directory '/var/spool/abrt/ccpp-2019-02-28-16:28:53.316936-3782'

I'm trying to process this coredump file with `coredumpctl gdb`...

Comment 6 Chris Murphy 2019-02-28 23:39:22 UTC
Created attachment 1539673 [details]
coredumpctl info

Comment 7 Chris Murphy 2019-03-01 00:19:15 UTC
OK, so processing that coredump file with `coredumpctl gdb` also fails: it loads symbols, gets to the segmentation fault, and that's it. gdb and kswapd0 then hog the system and turn it into an unusable hair dryer. But I have a coredump file.

61MB LZ4
https://drive.google.com/open?id=1JVzM26Qy3CfynAT_1vELvSlZWQdEcq9V
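
A possible non-interactive variant of the step above, as a sketch only: dump the core to a file with `coredumpctl dump`, then run gdb in batch mode so it writes a backtrace and exits. The file names core.<pid> and crash_bt.<pid>, the DRY_RUN switch, and the function name are assumptions made for illustration, and the binary path is assumed to be /usr/lib64/firefox/firefox. Nothing here guarantees that gdb will not exhaust memory the way it did above.

```shell
#!/bin/sh
# Sketch only: extract a backtrace from a systemd-coredump entry without an
# interactive gdb session. DRY_RUN=1 (the default here) only prints commands.

FIREFOX_BIN=/usr/lib64/firefox/firefox   # assumed path of the firefox binary

core_backtrace() {
    pid=$1                  # PID as listed by `coredumpctl list`
    core="core.$pid"        # hypothetical output file names
    out="crash_bt.$pid"
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "coredumpctl dump $pid --output=$core"
        echo "gdb -batch -ex 'thread apply all bt' $FIREFOX_BIN $core > $out 2>&1"
    else
        coredumpctl dump "$pid" --output="$core" &&
        gdb -batch -ex 'thread apply all bt' "$FIREFOX_BIN" "$core" > "$out" 2>&1
    fi
}

# 3782 is the firefox PID from the systemd-coredump log line in comment 5.
core_backtrace 3782
```

Running it with DRY_RUN=0 would execute the two commands instead of printing them.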

Comment 8 Martin Stransky 2019-03-01 07:08:49 UTC
I see. That may be caused by the PGO+LTO optimization enabled for F28/29. I'll produce a build without those optimizations.

Comment 9 Martin Stransky 2019-03-01 07:19:13 UTC
New test builds are here, I hope the debug setup goes through koji - https://koji.fedoraproject.org/koji/taskinfo?taskID=33114131

Comment 10 Martin Stransky 2019-03-01 07:23:09 UTC
Anyway, from the backtrace you posted it looks like the bug happens in gtk_im_context module - I'll try to find related parts in widget/gtk. Thanks!

Comment 11 Martin Stransky 2019-03-01 07:26:29 UTC
My candidate is the gtk_im_context_set_surrounding() call from IMContextWrapper.cpp, but I'd need a more detailed trace to be sure.

Comment 12 Martin Stransky 2019-03-01 14:57:03 UTC
Can you please try this build? https://koji.fedoraproject.org/koji/taskinfo?taskID=33114171
It has PGO/LTO and the upstream crash reporter disabled, so ABRT should catch the crash. Also, don't forget to install the debuginfo packages from this build.
Thanks.

Comment 13 Chris Murphy 2019-03-01 18:06:05 UTC
Created attachment 1539885 [details]
crash_bt

OK, super, I think this finally worked.

1. $ firefox-wayland --name ffwayland --safe-mode --ProfileManager
2. Choose a "test" profile that's clean (no settings changes from defaults).
3. Trigger the crash.
4. In the console I see:
   Type 'gdb /usr/lib64/firefox/firefox 16008' to attach your debugger to this thread.
   So I go do that in a separate shell.
5. Collect the crash_bt file as described in the debugging guidelines for Mozilla products wiki; that's the file attached.
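
Step 4 can be partly scripted. The sketch below is an illustration, not part of the wiki procedure: it pulls the PID out of the console hint and prints a batch-mode gdb command that would attach to that process, write all thread backtraces to crash_bt, and detach. The variable names are hypothetical; the hint text is copied from the console line quoted in step 4.

```shell
#!/bin/sh
# Console hint printed by the crashed process (copied from step 4).
hint="Type 'gdb /usr/lib64/firefox/firefox 16008' to attach your debugger to this thread."

# Keep only the quoted gdb command from the hint.
attach_cmd=$(printf '%s\n' "$hint" | sed -n "s/^Type '\(gdb [^']*\)' to attach.*/\1/p")

# The PID is the last word of that command.
pid=${attach_cmd##* }

# Print (do not run) a batch-mode attach that saves the backtrace to crash_bt.
echo "gdb -batch -ex 'thread apply all bt full' -p $pid > crash_bt 2>&1"
```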

Comment 14 Martin Stransky 2019-03-04 07:39:22 UTC
Great, Thanks!

Comment 15 Martin Stransky 2019-03-28 14:21:41 UTC
Hm, I'm still unable to reproduce. Do you run any special desktop localization (locale), any special GTK input module? (GTK_IM_MODULE) What does "gsettings get org.gnome.desktop.interface gtk-im-module" print on the console?
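
One way to collect the requested details in a single pass is sketched below. The function name im_report is made up for this example, XDG_SESSION_TYPE is an extra variable not asked for above, and the gsettings query is skipped when the tool is not installed.

```shell
#!/bin/sh
# Sketch: print the input-method and locale settings relevant to this bug.
im_report() {
    echo "GTK_IM_MODULE=${GTK_IM_MODULE:-<unset>}"
    echo "XDG_SESSION_TYPE=${XDG_SESSION_TYPE:-<unset>}"   # extra, not requested above
    echo "LANG=${LANG:-<unset>}"
    # Query the GNOME setting only if gsettings is available.
    if command -v gsettings >/dev/null 2>&1; then
        printf 'gtk-im-module (gsettings): '
        gsettings get org.gnome.desktop.interface gtk-im-module 2>/dev/null \
            || echo '<unavailable>'
    fi
}

im_report
```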

Comment 16 Chris Murphy 2019-03-28 17:34:50 UTC
(In reply to Martin Stransky from comment #15)
> Hm, I'm still unable to reproduce. Do you run any special desktop
> localization (locale), any special GTK input module? (GTK_IM_MODULE)

$ gsettings get org.gnome.desktop.interface gtk-im-module
'ibus'

$ locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

However, as mentioned in comment 3, I was able to reproduce in a VM booted with Workstation Live ISO. I just tried this again with 20190326 compose of Workstation Live which has Firefox 66.0.1-1 and can't reproduce it there after manually installing Firefox-Wayland in the live environment.

Back on bare metal, which has firefox-66.0.1-1.fc30.x86_64, I also can no longer get a crash. However, if the initial cursor placement point is out of the display area, I must shift+click twice to get a selection. For example:

1. click to place cursor in body of text
2. scroll so the cursor is no longer visible
3. shift+click does nothing
4. 2nd shift+click selects range of text

whereas if I skip step 2, so the cursor is visible, shift+click selects the range on the first try. It seems plausible this is related to the crash: shift+click with a non-visible cursor is somehow causing confusion. Before, it was crashing, and now it's behaving as a one-time no-op.

Comment 17 Chris Murphy 2019-03-28 17:41:42 UTC
(In reply to Chris Murphy from comment #16)
> 1. click to place cursor in body of text
> 2. scroll so the cursor is no longer visible
> 3. shift+click does nothing
> 4. 2nd shift+click selects range of text

This behavior happens with Firefox and Firefox Wayland.

Also, I just realized the crashers were always on Fedora 29. I'm not certain I tested on Fedora 30 until now.

Comment 18 Chris Murphy 2019-03-28 18:22:53 UTC
OK, back on Fedora 29, updated to
firefox-66.0.1-1.fc29.x86_64
firefox-wayland-66.0.1-1.fc29.x86_64

And ffwayland still crashes:

1. click to place cursor in body of text
2. scroll so the cursor is no longer visible
3. shift+click does nothing immediately, but in about 2-3 seconds it crashes
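
Since the same Firefox version crashes on Fedora 29 but not on Fedora 30, it can help to record the dist tag alongside the package versions. A rough sketch, where dist_tag is a hypothetical helper and the rpm query only runs if rpm is present:

```shell
#!/bin/sh
# Print the Fedora dist tag (e.g. fc29) embedded in a package
# name-version-release.arch string.
dist_tag() {
    printf '%s\n' "$1" | sed -n 's/.*\.\(fc[0-9][0-9]*\)\.[^.]*$/\1/p'
}

# List installed builds of the packages discussed in this bug, if rpm exists.
if command -v rpm >/dev/null 2>&1; then
    rpm -q firefox firefox-wayland gtk3 || true
fi

dist_tag firefox-wayland-66.0.1-1.fc29.x86_64   # prints fc29
```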

Comment 19 Chris Murphy 2019-03-28 18:24:19 UTC
Created attachment 1549183 [details]
ffwayland coredumpctl info

firefox-wayland-66.0.1-1.fc29.x86_64

Comment 20 Martin Stransky 2019-03-29 12:56:23 UTC
I suspect it's https://bugzilla.mozilla.org/show_bug.cgi?id=1539773, which is already fixed in Fedora 30. That may also be the reason I can't reproduce it here.

Comment 21 Chris Murphy 2019-03-29 18:13:22 UTC
Confirmed that control-a and right-click select all also cause the crash. Koji has gtk3 3.24.3-1 for fc29, but bodhi says it was obsoleted by 3.24.1-2.

Comment 22 Martin Stransky 2019-07-25 14:24:36 UTC
Can you please try F30 when you get a chance? It may also be related to a different input method/module. Thanks.

Comment 23 Chris Murphy 2019-07-25 20:25:33 UTC
I haven't seen this problem since upgrading to Fedora 30; it's been a couple of months at least.

Comment 24 Martin Stransky 2019-07-26 05:53:34 UTC
Great, Thanks.

