Bug 156647 - (gcc -O2) elinks segfault on ppc & ia64
Product: Fedora
Classification: Fedora
Component: elinks
Hardware: All Linux
Priority: medium Severity: medium
Assigned To: Karel Zak
Depends On:
Blocks: FC4Blocker
Reported: 2005-05-02 16:58 EDT by Jeremy Katz
Modified: 2007-11-30 17:11 EST (History)
Fixed In Version: 0.10.3-2
Doc Type: Bug Fix
Last Closed: 2005-05-10 08:37:08 EDT

Attachments
Add debugging printouts to find_in_cache() (1.16 KB, patch)
2005-05-07 18:11 EDT, Miloslav Trmač
Log from -O0 (2.88 KB, text/plain)
2005-05-07 18:12 EDT, Miloslav Trmač
Log from -O2 (2.72 KB, text/plain)
2005-05-07 18:13 EDT, Miloslav Trmač

Description Jeremy Katz 2005-05-02 16:58:52 EDT
elinks seems to segfault on ppc when going to http://gate.crashing.org/~benh/xorg

#0  0x10058934 in doc_loading_callback ()
#1  0x100507c0 in connect_info ()
#2  0x100507c0 in connect_info ()
Comment 1 Karel Zak 2005-05-05 15:01:05 EDT
... and on ia64 too.
Comment 2 Miloslav Trmač 2005-05-07 18:11:00 EDT
Created attachment 114132
Add debugging printouts to find_in_cache()

The failure on ppc goes away after replacing -O2 with -O0 (s/-O2/-O0/) in
CFLAGS="-O2 -g -W -Wall $(getconf LFS_CFLAGS)"

Maybe the attached data will help somebody figure it out; I don't know the
code at all. Applying the attached patch shows that find_in_cache() starts
returning NULL with -O2.
Comment 3 Miloslav Trmač 2005-05-07 18:12:03 EDT
Created attachment 114133
Log from -O0
Comment 4 Miloslav Trmač 2005-05-07 18:13:46 EDT
Created attachment 114134
Log from -O2

Both logs are from running 'elinks http://gate.crashing.org/~benh/xorg 2>$log'.

I wasn't able to test elinks on ia64.
Comment 5 Karel Zak 2005-05-07 20:10:09 EDT
Note: I think it's possible to test it on an arbitrary HTML page. I had the problem
with elinks from the current FC4 and with upstream version 0.10.5 on pages like
Comment 6 Warren Togami 2005-05-08 00:15:57 EDT
If this happens with -O2 and not -O0, shouldn't this be assigned to gcc?
Comment 7 Jakub Jelinek 2005-05-09 05:56:02 EDT
Generally, if something works with -O0 and does not with -O2, it is more often
an application bug than a GCC bug.  Only when you debug it and prove it is indeed
a GCC bug should it be reassigned to GCC.
Particularly in this case, the bug goes away with -O2 -fno-strict-aliasing,
and there are 94 places where GCC warns about aliasing problems:
grep warning.*type-punned elinks.log | sort -u | wc -l
Plus there are several places where the code violates those rules but GCC does
not warn. Say in find_in_cache, all the lists.h macros used there are buggy.
And error.h even shows that the authors see the problems, but for some unknown
reason can't admit it is their bug and not a compiler bug:
/* This function does nothing, except making compiler not to optimize certains
 * spots of code --- this is useful when that particular optimization is buggy.
 * So we are just workarounding buggy compilers. */
/* This function should be always used only in context of compiler version
 * specific macros. */
void do_not_optimize_here(void *x);

#if defined(__GNUC__) && __GNUC__ == 2 && __GNUC_MINOR__ <= 7
#define do_not_optimize_here_gcc_2_7(x) do_not_optimize_here(x)
#else
#define do_not_optimize_here_gcc_2_7(x)
#endif

#if defined(__GNUC__) && __GNUC__ == 3
#define do_not_optimize_here_gcc_3_x(x) do_not_optimize_here(x)
#else
#define do_not_optimize_here_gcc_3_x(x)
#endif

#if defined(__GNUC__) && __GNUC__ == 3 && __GNUC_MINOR__ == 3
#define do_not_optimize_here_gcc_3_3(x) do_not_optimize_here(x)
#else
#define do_not_optimize_here_gcc_3_3(x)
#endif

The lists implementation is broken by design, it just can't work that way.
You can't access the same object through aliasing incompatible types.
But lists.h is doing that a lot, it sometimes accesses next/prev as void *,
sometimes as struct cache_entry *, etc.
Cleanest fix IMHO would be to use a void *next; void *prev; structure and
put that structure as first field into the various structures that are chained
into lists, say:
struct cache_entry
{
  struct list_head_elinks head;
  ...
};
and then the macros use cached->head.prev, etc.  What will also work
is just making the prev/next pointers void *, but directly in the structure, say
struct cache_entry
{
  void *next; void *prev;
  ...
};
and have
struct list_head_elinks
{
  void *next; void *prev;
};

But writing/reading through a void ** pointer and then writing/reading through
a struct cache_entry ** pointer is a violation of ISO C99 6.5 (paragraphs 6 and 7).
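
The first suggestion above (a link structure embedded as the first field) can
be sketched roughly as follows. This is only an illustration and not the
actual elinks lists.h: the helper names init_list, add_to_list, del_from_list,
find_in_list and cache_demo, and the int id payload, are assumptions made up
for the example. The point is that next/prev are only ever read and written as
void * members of struct list_head_elinks, so no object is accessed through
two incompatible pointer types.

```c
#include <assert.h>
#include <stddef.h>

/* Generic link structure, as suggested in the comment above. */
struct list_head_elinks {
    void *next;
    void *prev;
};

/* The head is the first member, so a pointer to the head can be
 * converted to a pointer to the enclosing cache_entry and back. */
struct cache_entry {
    struct list_head_elinks head;
    int id; /* hypothetical payload, stands in for the real fields */
};

/* Make an empty circular list; the sentinel points to itself. */
static void init_list(struct list_head_elinks *list)
{
    list->next = list;
    list->prev = list;
}

/* Insert item directly after the sentinel. */
static void add_to_list(struct list_head_elinks *list,
                        struct list_head_elinks *item)
{
    struct list_head_elinks *first = list->next;

    item->next = first;
    item->prev = list;
    first->prev = item;
    list->next = item;
}

/* Unlink item from whatever list it is on. */
static void del_from_list(struct list_head_elinks *item)
{
    struct list_head_elinks *next = item->next;
    struct list_head_elinks *prev = item->prev;

    next->prev = prev;
    prev->next = next;
    item->next = NULL;
    item->prev = NULL;
}

/* Walk the list; only real entries (never the sentinel) are
 * converted to struct cache_entry *. */
static struct cache_entry *find_in_list(struct list_head_elinks *list, int id)
{
    struct list_head_elinks *pos;

    for (pos = list->next; pos != list; pos = pos->next) {
        struct cache_entry *ce = (struct cache_entry *) pos;

        if (ce->id == id)
            return ce;
    }
    return NULL;
}

/* Exercises the helpers; returns 0 if every lookup behaves as expected. */
int cache_demo(void)
{
    struct list_head_elinks cache;
    struct cache_entry a = { { NULL, NULL }, 1 };
    struct cache_entry b = { { NULL, NULL }, 2 };

    init_list(&cache);
    add_to_list(&cache, &a.head);
    add_to_list(&cache, &b.head);
    if (find_in_list(&cache, 1) != &a) return 1;
    if (find_in_list(&cache, 2) != &b) return 2;

    del_from_list(&a.head);
    if (find_in_list(&cache, 1) != NULL) return 3;
    if (find_in_list(&cache, 2) != &b) return 4;
    return 0;
}
```

Since nothing here is type-punned, a build with -O2 and strict aliasing enabled
should behave the same as one with -O0. The list macros would then be written
in terms of item->head.next and item->head.prev.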
Comment 8 Miloslav Trmač 2005-05-10 08:37:08 EDT
Jakub, thanks again.
