Bug 1332926 - Program terminated with signal SIGSEGV, Segmentation fault.
Summary: Program terminated with signal SIGSEGV, Segmentation fault.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: icecat
Version: 24
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Antonio T. (sagitter)
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-05-04 11:35 UTC by poma
Modified: 2016-07-02 15:24 UTC (History)
CC List: 5 users

Fixed In Version: icecat-38.8.0-5.fc23 icecat-38.8.0-12.fc24
Clone Of:
Environment:
Last Closed: 2016-07-02 15:24:25 UTC
Type: Bug
Embargoed:


Attachments:
coredumpctl gdb (18.79 KB, text/plain), 2016-05-04 11:38 UTC, poma
coredumpctl icecat-38.8 (20.05 KB, text/plain), 2016-05-19 06:49 UTC, poma
coredumpctl icecat-38.8.0-5 (6.93 KB, text/plain), 2016-06-20 23:06 UTC, poma
bt of crash on fc24 (x86_64) (7.19 KB, text/plain), 2016-06-22 09:50 UTC, Jens Lody
workaround for fc > 23 (1.23 KB, patch), 2016-06-26 08:47 UTC, Jens Lody
coredump icecat 38.8.0-12 - video acceleration (57.61 KB, text/plain), 2016-06-30 09:58 UTC, poma

Description poma 2016-05-04 11:35:16 UTC
Description of problem:
Program terminated with signal SIGSEGV, Segmentation fault.

Version-Release number of selected component (if applicable):
icecat-38.7.1-2.fc24.x86_64

How reproducible:
101%

Steps to Reproduce:
1. Command Line: icecat

Actual results:
Program terminated with signal SIGSEGV, Segmentation fault.

Expected results:
Program -NOT- terminated with signal SIGSEGV, Segmentation fault.

Additional info:
$ coredumpctl list 
TIME                            PID   UID   GID SIG PRESENT EXE
Wed 2016-05-04 07:17:48 EDT    1546  1000  1000  11 * /usr/lib64/icecat-38.7.1/icecat
Wed 2016-05-04 07:17:51 EDT    1583  1000  1000  11 * /usr/lib64/icecat-38.7.1/icecat
Wed 2016-05-04 07:17:53 EDT    1621  1000  1000  11 * /usr/lib64/icecat-38.7.1/icecat
Wed 2016-05-04 07:17:55 EDT    1660  1000  1000  11 * /usr/lib64/icecat-38.7.1/icecat

Comment 1 poma 2016-05-04 11:38:40 UTC
Created attachment 1153815 [details]
coredumpctl gdb

Comment 2 poma 2016-05-19 06:49:28 UTC
Created attachment 1159258 [details]
coredumpctl icecat-38.8

Comment 3 Fedora Update System 2016-06-18 08:41:06 UTC
icecat-38.8.0-5.fc23 has been submitted as an update to Fedora 23. https://bodhi.fedoraproject.org/updates/FEDORA-2016-32eb3c0fa5

Comment 4 Fedora Update System 2016-06-18 08:41:14 UTC
icecat-38.8.0-5.fc24 has been submitted as an update to Fedora 24. https://bodhi.fedoraproject.org/updates/FEDORA-2016-0a274f518b

Comment 5 Fedora Update System 2016-06-19 09:28:04 UTC
icecat-38.8.0-5.fc23 has been pushed to the Fedora 23 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-32eb3c0fa5

Comment 6 Fedora Update System 2016-06-19 09:28:51 UTC
icecat-38.8.0-5.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-0a274f518b

Comment 7 poma 2016-06-20 23:06:29 UTC
Created attachment 1170015 [details]
coredumpctl icecat-38.8.0-5

Comment 8 Jens Lody 2016-06-22 09:50:24 UTC
Created attachment 1170652 [details]
bt of crash on fc24 (x86_64)

I attached a gdb-session with the crash on fc24.

The direct cause of the crash looks obvious:

<snip>
0x00007ffff4873bb3 in nsLayoutUtils::GetLastSibling (aFrame=0xe5e5e5e5e5e5e5e5) at /usr/src/debug/icecat-38.8.0/layout/base/nsLayoutUtils.cpp:1586
1586	  if (!aFrame) {
</snip>

That's the code where the crash happens:
<code>
// static
nsIFrame* nsLayoutUtils::GetLastSibling(nsIFrame* aFrame) {
  if (!aFrame) {
    return nullptr;
  }

  nsIFrame* next;
  while ((next = aFrame->GetNextSibling()) != nullptr) {
    aFrame = next;
  }
  return aFrame;
}
</code>

It looks like icecat crashes because it tries to use an invalid pointer (0xe5e5e5e5e5e5e5e5, which looks like uninitialised memory). That value is not NULL, so the (!aFrame) check cannot catch it, and the access that follows must crash.
I don't know the cause yet; it might be gcc6 related.
I can try to dig deeper into it, hopefully this evening (UTC+2).

By the way: debuginfos are now available in icecat 38.8.0-5.

Jens
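
To illustrate the point about the null check: a pointer holding a fill pattern such as 0xe5e5e5e5e5e5e5e5 is simply a non-null value, so a guard like `if (!aFrame)` lets it through and only the later dereference can fault. The following is a minimal sketch under stated assumptions, not IceCat code: it assumes 64-bit pointers, and `Frame`, `kPoison`, `MakePoisonPointer`, `PassesNullCheck` and `DemoNullCheck` are hypothetical names. It deliberately never reads through the bad pointer.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for nsIFrame; not the real class.
struct Frame {
  Frame* next = nullptr;
};

// The pointer value from the backtrace, assuming 64-bit pointers.
constexpr std::uintptr_t kPoison = 0xe5e5e5e5e5e5e5e5ULL;

// Builds a pointer holding the fill pattern without touching the memory behind it.
inline const Frame* MakePoisonPointer() {
  return reinterpret_cast<const Frame*>(kPoison);
}

// Mirrors only the `if (!aFrame)` guard; it never dereferences the pointer.
inline bool PassesNullCheck(const Frame* frame) {
  return frame != nullptr;
}

// Usage: the guard accepts the poison pointer and rejects only a real null.
inline void DemoNullCheck() {
  assert(PassesNullCheck(MakePoisonPointer()));
  assert(!PassesNullCheck(nullptr));
}
```

In other words, a null check protects against null only; it says nothing about whether a non-null pointer still refers to a live object.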

Comment 9 Antonio T. (sagitter) 2016-06-22 09:59:03 UTC
(In reply to Jens Lody from comment #8)
> [...]

Okay.
You are more experienced than me on that. Would you like to co-maintain IceCat?

Comment 10 Jens Lody 2016-06-22 11:08:46 UTC
(In reply to Antonio Trande from comment #9)
> Okay.
> You are more experienced than me on that. Would you like to co-maintain
> Icecat ?

In general: yes.

But please let me think about it a little more; it's a great responsibility, so the decision should be well thought out.

Comment 11 Jens Lody 2016-06-26 08:47:12 UTC
Created attachment 1172466 [details]
workaround for fc > 23

Sorry for the delay.

The direct cause of the crash is that gcc6 seems to be more aggressive in its optimization, and sometimes (not always) optimizes out (at least) the mNextSibling member of the nsIFrame class.

Attached is a patch for the spec file that lowers the optimization level on Fedora > 23 if the arch is not arm (already lowered there) and it is not a special debug build (which uses a special optimization level).

And in answer to your question about co-maintaining: yes.

Comment 12 Fedora Update System 2016-06-26 19:01:33 UTC
icecat-38.8.0-10.fc24 has been submitted as an update to Fedora 24. https://bodhi.fedoraproject.org/updates/FEDORA-2016-16997a2a59

Comment 13 Fedora Update System 2016-06-27 22:56:33 UTC
icecat-38.8.0-5.fc23 has been pushed to the Fedora 23 stable repository. If problems still persist, please make note of it in this bug report.

Comment 14 Fedora Update System 2016-06-28 04:25:31 UTC
icecat-38.8.0-10.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-16997a2a59

Comment 15 Hin-Tak Leung 2016-06-28 22:09:00 UTC
While icecat-38.8.0-10.fc24.x86_64 does not segfault like icecat-38.7.1-2.fc24.x86_64 and 38.8.0-5.fc24.x86_64, it is also noticeably slower than 38.8.0-5.fc23.x86_64, which I was using before upgrading to f24. So I am afraid the workaround is a bit poor.

How about switching to use clang to build?

Comment 16 Hin-Tak Leung 2016-06-29 01:54:05 UTC
I rather think it might be a good idea to try adding -fno-delete-null-pointer-checks to the compiler flags instead of disabling all optimization, as in
https://bugzilla.redhat.com/show_bug.cgi?id=1349856

since this problem is optimization related, and it seems that icecat is trying to access uninitialised memory. In fact the backtrace I got is rather different from the one posted above, which suggests the problem is elsewhere. gcc6 aggressively removing null checks seems to be a good candidate.

see "Optimizations remove null pointer checks for this" in
https://gcc.gnu.org/gcc-6/porting_to.html

How about giving -fno-delete-null-pointer-checks a try? That's fairly simple (same sort of change as switching optimization). I just don't have the CPU power/time to do it.
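
For context on the porting note cited above, here is a small sketch of the kind of code affected. Calling a member function through a null pointer is undefined behaviour in C++, so a compiler is entitled to assume `this` is never null and drop an `if (!this)` test. The fragile form is shown only as a comment; the free function is one possible sturdier shape. `Node`, `ValueOrZero` and `DemoValueOrZero` are hypothetical names for illustration and are not taken from the IceCat sources.

```cpp
#include <cassert>

// Hypothetical type for illustration; not taken from the IceCat sources.
struct Node {
  int value = 0;

  // Fragile pattern (shown as a comment only): calling a member function
  // through a null pointer is undefined behaviour, so the compiler may assume
  // `this` is never null and remove the check.
  //
  //   int ValueOrZero() const { if (!this) return 0; return value; }
};

// Sturdier alternative: test the pointer before any member access happens.
inline int ValueOrZero(const Node* node) {
  return node ? node->value : 0;
}

// Usage: both the valid and the null case are handled without undefined behaviour.
inline void DemoValueOrZero() {
  Node n;
  n.value = 7;
  assert(ValueOrZero(&n) == 7);
  assert(ValueOrZero(nullptr) == 0);
}
```

Passing -fno-delete-null-pointer-checks asks the compiler to keep such checks, but it does not make the original call defined; moving the check in front of the call, as sketched here, avoids relying on the flag.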

Comment 17 Jens Lody 2016-06-29 04:43:30 UTC
(In reply to Hin-Tak Leung from comment #16)
> [...]
> How about giving -fno-delete-null-pointer-checks a try? That's fairly simple
> (same sort of change as switching optimization). I just don't have the CPU
> power/time to do it.

That was the first thing I tried, but it did not work.
Are you sure it's just the optimization issue and not another conflict?

Did you get any meaningful messages when starting from the console?

I had problems with LibreJS, which was disabled in the add-on settings but still visible in the toolbar.

A clean profile is/was much faster, but still slower than Firefox.

Nevertheless: I'm still working on a better solution, but the build time of icecat is quite long and, unfortunately, my time is limited by my "real world" job.

Comment 18 Hin-Tak Leung 2016-06-29 07:44:04 UTC
Your workaround of dropping optimization from -O2 to -O0 does work: at least, icecat-38.8.0-10.fc24.x86_64 does not segfault (and 38.8.0-5.fc24.x86_64 did). However, icecat-38.8.0-10.fc24.x86_64 is noticeably slow and unresponsive.

By slow, I mean it is noticeably slower than how it was on f23 - I only just upgraded from f23 to f24 system-wide yesterday.

I took the easier way out: I tried downgrading the rpm alone to the f23 one, which did not work (icu and vpx dependencies), but I got a binary from ftp.gnu.org/gnu/icecat/ . The pre-built binary is probably less feature-rich (I don't see vpx and icu in there), but it is fast and usable.

Unless there is a code difference between 38.8.0-5.fc24.x86_64 and 38.8.0-10.fc24.x86_64 (what does the -gnu2 stand for?), it seems it is just -O2 vs -O0. Does -O1 give a non-segfaulting binary?

It would be a good idea to see which of these are newly introduced, besides -fno-delete-null-pointer-checks - compare
https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
and
https://gcc.gnu.org/onlinedocs/gcc-5.4.0/gcc/Optimize-Options.html
?

Comment 19 Antonio T. (sagitter) 2016-06-29 14:24:15 UTC
Indeed, with -O1 optimization IceCat looks faster.
Please, try this new release: http://koji.fedoraproject.org/koji/taskinfo?taskID=14699929

Comment 20 Jens Lody 2016-06-29 14:38:37 UTC
I'm currently bisecting the additional "-O2" flags.
I should find the problematic flags this afternoon/evening (UTC+1).

Comment 21 Hin-Tak Leung 2016-06-29 14:50:40 UTC
Oh, just in case it is important: I looked and saw that -fno-delete-null-pointer-checks was present in gcc 5.4, and in fact also in 5.3, which was the version in f23. So the switch is not new; it just works much more aggressively in gcc 6.x. It is therefore not a good idea to compare the switches between 5.x and 6.x. Rather, I hope you are looking at the list of differences between -O1 and -O2 on 6.x alone.
I haven't looked at https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html in detail, but it is *possible* that some of the switches listed in the latter part of the page are switched on at -O2 and not listed in the -O2 list.

Good luck, and I appreciate the time you guys are taking to look at which switch breaks, to make everything all better... eventually !

Comment 22 Antonio T. (sagitter) 2016-06-29 14:56:28 UTC
Good work guys !

Comment 23 Jens Lody 2016-06-29 15:09:59 UTC
(In reply to Hin-Tak Leung from comment #21)
> [...]
> Rather, I hope you are looking at the list
> between -O1 and -O2 on 6.x alone.
> I haven't looked at https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
> in detail, but it is *possible* that some of the switches listed in the
> latter part of the page are switched on at -O2 and not listed in the -O2
> list.
> [...]

I am currently bisecting the switches listed as additional "-O2" optimizations.
Unfortunately I got another SIGSEGV, this time not directly at startup.
But this one might (hopefully) be related to delete-null-pointer-checks, so I will try with this flag turned off (along with six others).
But building icecat is not done in five minutes (unfortunately).
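
The bisection described here can be sketched roughly as follows. This is only an illustration under strong assumptions: exactly one flag is responsible, it triggers the crash regardless of which other flags are enabled, and the `crashes` callback stands in for "rebuild with only these flags enabled and see whether the result segfaults", which in reality is a full build taking hours. `FindCulprit` and `DemoBisect` are hypothetical names, and the demo predicate is a stub rather than a real build.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

using Flags = std::vector<std::string>;

// Halves the candidate list until one flag is left.
// Assumes exactly one culprit that crashes independently of the other flags.
inline std::string FindCulprit(Flags flags,
                               const std::function<bool(const Flags&)>& crashes) {
  while (flags.size() > 1) {
    const auto mid = flags.begin() + static_cast<std::ptrdiff_t>(flags.size() / 2);
    const Flags first(flags.begin(), mid);
    const Flags second(mid, flags.end());
    // If the first half alone reproduces the crash, the culprit is in it.
    flags = crashes(first) ? first : second;
  }
  return flags.empty() ? std::string() : flags.front();
}

// Usage with a stub predicate instead of a real build-and-run step.
inline void DemoBisect() {
  const Flags candidates = {"-fdelete-null-pointer-checks", "-finline-small-functions",
                            "-fstrict-aliasing", "-fcaller-saves"};
  const auto stub = [](const Flags& enabled) {
    return std::find(enabled.begin(), enabled.end(),
                     "-finline-small-functions") != enabled.end();
  };
  assert(FindCulprit(candidates, stub) == "-finline-small-functions");
}
```

With n candidate flags this needs about log2(n) builds instead of n, which matters when a single build is slow. If two flags interact, or more than one flag is at fault, this simple halving is not sufficient.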

Comment 24 Jens Lody 2016-06-29 19:51:01 UTC
I found the problematic switch. I am currently using a local (mock) build of icecat, and it runs really fast again.
This build: http://koji.fedoraproject.org/koji/taskinfo?taskID=14705189 should really fix the startup crash, the slowdown, and the crash I encountered while bisecting.

The problematic switch is:
> -finline-small-functions
>     Integrate functions into their callers when their body is smaller than
> expected function call code (so overall size of program gets smaller). The
> compiler heuristically decides which functions are simple enough to be worth
> integrating in this way. This inlining applies to all functions, even those
> not declared inline.
> 
>     Enabled at level -O2.

I also added "-fno-delete-null-pointer-checks", which indeed avoids (at least) one other crash (most likely more, depending on the workflow).

Jens
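
As an illustration of what -finline-small-functions acts on: small accessors like the one below are typical candidates for being merged into their callers. The sketch also shows the GCC/Clang `noinline` attribute, which keeps one specific function as a real call. This is a narrower tool than the build-wide compiler flags discussed in this report and is not what the package changed; `SketchFrame`, `LastSibling` and `DemoLastSibling` are hypothetical names, and the struct is only a stand-in for nsIFrame.

```cpp
#include <cassert>

// Hypothetical stand-in for nsIFrame; not the real class.
struct SketchFrame {
  SketchFrame* next = nullptr;

  // GCC/Clang attribute: keep this small accessor as an out-of-line call.
  __attribute__((noinline)) SketchFrame* GetNextSibling() const { return next; }
};

// Same shape as the GetLastSibling loop quoted earlier in this report.
inline SketchFrame* LastSibling(SketchFrame* frame) {
  if (!frame) {
    return nullptr;
  }
  while (SketchFrame* next = frame->GetNextSibling()) {
    frame = next;
  }
  return frame;
}

// Usage: walk a three-element sibling chain and handle the null case.
inline void DemoLastSibling() {
  SketchFrame a, b, c;
  a.next = &b;
  b.next = &c;
  assert(LastSibling(&a) == &c);
  assert(LastSibling(nullptr) == nullptr);
}
```

Note that inlining correct code does not change its behaviour; if disabling an inlining option makes a crash disappear, the code most likely relies on undefined behaviour somewhere, and the flag only hides it.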

Comment 25 Fedora Update System 2016-06-30 03:53:40 UTC
icecat-38.8.0-12.fc24 has been submitted as an update to Fedora 24. https://bodhi.fedoraproject.org/updates/FEDORA-2016-118e6f7b5d

Comment 26 poma 2016-06-30 09:58:37 UTC
Created attachment 1174453 [details]
coredump icecat 38.8.0-12 - video acceleration

Comment 27 Antonio T. (sagitter) 2016-06-30 10:18:05 UTC
So far, I have had no problems with icecat-38.8.0-12.fc24.

Comment 28 Jens Lody 2016-06-30 10:19:42 UTC
(In reply to poma from comment #26)
> Created attachment 1174453 [details]
> coredump icecat 38.8.0-12 - video acceleration

That is a new one.
I have an nvidia card in my laptop and can test it this evening.
Can you give me some steps to reproduce the issue?

Comment 29 Fedora Update System 2016-06-30 22:26:25 UTC
icecat-38.8.0-12.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-118e6f7b5d

Comment 30 Hin-Tak Leung 2016-07-01 22:36:48 UTC
I had a crash after about 90 minutes of light use with icecat-38.8.0-12.fc24. Apologies, the backtrace was posted to the wrong bug:

https://bugzilla.redhat.com/show_bug.cgi?id=1347420#c19

So unfortunately icecat-38.8.0-12.fc24 is still not usable for me.

Comment 31 Jens Lody 2016-07-02 07:51:39 UTC
(In reply to Hin-Tak Leung from comment #30)
> I had a crash after about 90 minutes of light use with icecat-38.8.0-12.fc24
> . Apologies the backtrace was posted to the wrong bug:
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=1347420#c19
> 
> So unfortunately icecat-38.8.0-12.fc24 is still not usable for me.

Is there a way to reproduce this crash?
I keep icecat open in gdb for several days with lots of tabs open (currently ~70 tabs, grouped using "AutoGroup" and "Tab Groups Helper").
Your crash seems to have happened somewhere in the JavaScript-related sources.
Do you use LibreJS or another JS blocker?
I use uMatrix and uBlockOrigin and block all scripts on all sites by default.

Comment 32 Hin-Tak Leung 2016-07-02 14:41:46 UTC
I have No Script, Spyblock and LibreJS; the latter two shipped with icecat, the former came from Fedora. I don't actively use the latter two (LibreJS seems very zealous!), but I tend to occasionally unblock things with No Script.

LibreJS seems quite annoying as it even breaks github :-). I use a different browser when I access github...

Sorry, I don't have a way of reproducing the crash; I have switched back to using the pre-built binary from www.gnu.org. OTOH, I am hanging on to the core file (3.3GB! in /var/spool/abrt/ccpp-2016-07-01-05:34:35-2542) for a while and would be happy to run any gdb commands you want me to run.

Comment 33 Fedora Update System 2016-07-02 15:24:20 UTC
icecat-38.8.0-12.fc24 has been pushed to the Fedora 24 stable repository. If problems still persist, please make note of it in this bug report.

