Red Hat Bugzilla – Bug 156647
(gcc -O2) elinks segfault on ppc & ia64
Last modified: 2007-11-30 17:11:05 EST
elinks seems to segfault on ppc when going to http://gate.crashing.org/~benh/xorg
#0 0x10058934 in doc_loading_callback ()
#1 0x100507c0 in connect_info ()
#2 0x100507c0 in connect_info ()
... and on ia64 too.
Created attachment 114132
Add debugging printouts to find_in_cache()
The failure on ppc goes away after changing
CFLAGS="-O2 -g -W -Wall $(getconf LFS_CFLAGS)"
Maybe the attached data will help somebody figure it out,
I don't know the code at all: applying the attached patch
shows that find_in_cache() starts returning NULL with -O2.
Created attachment 114133
Log from -O0
Created attachment 114134
Log from -O2
Both logs are from running 'elinks http://gate.crashing.org/~benh/xorg 2>$log';
I wasn't able to test elinks on ia64.
Note: I think it's possible to test this on an arbitrary HTML page. I had a problem
with elinks from current FC4 and with upstream version 0.10.5 on pages like
If this happens with -O2 and not -O0, shouldn't this be assigned to gcc?
Generally, if something works with -O0 and does not with -O2, it is more often
an application bug than a GCC bug. Only once you have debugged it and proved it
is indeed a GCC bug should it be reassigned to GCC.
Particularly in this case, the bug goes away with -O2 -fno-strict-aliasing,
and there are 94 places where GCC warns about aliasing problems:
grep warning.*type-punned elinks.log | sort -u | wc -l
Plus there are several places where the code violates the aliasing rules but GCC
does not warn. Say in find_in_cache, all the lists.h macros used there are buggy.
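For anyone unfamiliar with the warning being grepped for, here is a minimal sketch of the kind of access that triggers it, together with a well-defined alternative. This is not elinks code: the function names are made up for illustration, and the example assumes a 32-bit IEEE 754 float. The violating variant is kept out of the build with #if 0.

```c
#include <stdint.h>
#include <string.h>

#if 0
/* Violating variant (illustration only): a float object is read through a
 * uint32_t lvalue. With -O2 -Wall, GCC typically reports something like
 * "dereferencing type-punned pointer will break strict-aliasing rules". */
uint32_t float_bits_punned(float f)
{
    return *(uint32_t *)&f;
}
#endif

/* Well-defined variant: copy the object representation instead of
 * reinterpreting the pointer. Assumes sizeof(float) == sizeof(uint32_t). */
uint32_t float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

Building with -fno-strict-aliasing makes the first variant behave as the author expected, which matches the observation that the crash goes away with that flag, but it only hides the problem in the source.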
And error.h even shows that the authors see the problems, but for some unknown
reason can't admit that it is their bug and not a compiler bug:
/* This function does nothing, except making compiler not to optimize certains
* spots of code --- this is useful when that particular optimization is buggy.
* So we are just workarounding buggy compilers. */
/* This function should be always used only in context of compiler version
* specific macros. */
void do_not_optimize_here(void *x);
#if defined(__GNUC__) && __GNUC__ == 2 && __GNUC_MINOR__ <= 7
#define do_not_optimize_here_gcc_2_7(x) do_not_optimize_here(x)
#endif
#if defined(__GNUC__) && __GNUC__ == 3
#define do_not_optimize_here_gcc_3_x(x) do_not_optimize_here(x)
#endif
#if defined(__GNUC__) && __GNUC__ == 3 && __GNUC_MINOR__ == 3
#define do_not_optimize_here_gcc_3_3(x) do_not_optimize_here(x)
#endif
The lists implementation is broken by design; it just can't work that way.
You can't access the same object through aliasing-incompatible types.
But lists.h does that a lot: it sometimes accesses next/prev as void *,
sometimes as struct cache_entry *, etc.
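To make the pattern concrete, here is a rough sketch of what such a mixed access looks like. The struct and the macro are hypothetical stand-ins, not the real lists.h definitions; the violating macro is disabled with #if 0 and a variant that sticks to the declared field type follows it.

```c
#include <stddef.h>

/* Hypothetical stand-in for a list element; not the real elinks struct. */
struct cache_entry {
    struct cache_entry *next;
    struct cache_entry *prev;
    int id;
};

#if 0
/* Violating pattern (illustration only): e->next is an object of type
 * `struct cache_entry *`, but it is stored through an lvalue of type
 * `void *`. At -O2 the compiler may assume this store cannot affect a
 * later read of e->next done through the declared type. */
#define generic_set_next(e, n) (*(void **)&(e)->next = (void *)(n))
#endif

/* Well-defined variant: every access uses the declared type of the field. */
void typed_set_next(struct cache_entry *e, struct cache_entry *n)
{
    e->next = n;
    if (n != NULL)
        n->prev = e;
}
```

The typed variant gives up genericity, which is presumably why the macros were written the other way; the suggestions that follow keep the macros generic without mixing pointer types.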
The cleanest fix IMHO would be to use a structure holding void *next; void *prev;
and put that structure as the first field in the various structures that are
chained into lists, say:
struct list_head_elinks head;
and then the macros would use cached->head.prev, etc. What will also work
is to just make the prev/next pointers void *, but directly in the structure, say
void *next; void *prev;
void *next; void *prev;
But writing/reading through a void ** pointer and then writing/reading through a
struct cache_entry ** pointer is a violation of ISO C99 6.5 (paragraphs 6 and 7).
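A minimal sketch of the first suggestion, assuming a circular list with a separate head node. Only the names list_head_elinks, head, next, prev and cache_entry come from the comment above; the helper functions and the id field are hypothetical and this is not a patch against the real lists.h. Every access to next/prev goes through an lvalue of type void *, and the struct pointer conversions rely on head being the first member (C99 6.7.2.1 paragraph 13).

```c
#include <stddef.h>

struct list_head_elinks {
    void *next;
    void *prev;
};

/* Hypothetical list element: the link structure must be the first field. */
struct cache_entry {
    struct list_head_elinks head;
    int id;
};

/* An empty circular list points at itself. */
void list_init(struct list_head_elinks *list)
{
    list->next = list;
    list->prev = list;
}

/* Insert item (any struct whose first member is a list_head_elinks)
 * right after the list head. */
void list_add_front(struct list_head_elinks *list, void *item)
{
    struct list_head_elinks *x = item;
    struct list_head_elinks *first = list->next;

    x->next = first;
    x->prev = list;
    first->prev = x;
    list->next = x;
}

/* Walk the list and return the entry with the given id, or NULL. */
struct cache_entry *find_by_id(struct list_head_elinks *list, int id)
{
    void *p;

    for (p = list->next; p != (void *)list;
         p = ((struct list_head_elinks *)p)->next) {
        struct cache_entry *ce = p;
        if (ce->id == id)
            return ce;
    }
    return NULL;
}
```

Because the stored pointers are always read and written as void *, this should stay well-defined at -O2 without needing -fno-strict-aliasing.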
Jakub, thanks again.