Description of problem:
I have a collection of object files that manipulate a variable in thread-local
storage (TLS). Some of these object files were compiled with "-fpic" and some
were not. The final executable is being linked using "-Wl,--export-dynamic".
I find that the object files compiled without "-fpic" see one address for the
TLS variable, while the object files compiled with "-fpic" see a different
address. Changing the value at one address has no effect on the other, of
course. It's like I have two completely different variables. Of the two, only
the one seen by the non-fpic code has the expected initial value. That suggests
that the non-fpic one is the "right" one in some sense, and that the
pic-compiled code is getting a "wrong" address located somewhere else.
If I compile everything with "-fpic", the problem goes away. If I compile
nothing with "-fpic", the problem goes away. If I omit "-Wl,--export-dynamic"
at link time, the problem goes away. If I don't make the variable thread-local,
the problem goes away. Only with all of these things going on at the same time
does the bug appear.
Perhaps now you understand why the summary for this report is so densely worded.
It's not clear to me whether this is a gcc bug, a linker bug, or perhaps even a
runtime loader bug. I'm starting this with gcc, somewhat arbitrarily. We might
need to reassign it if we determine the root cause is elsewhere.
Version-Release number of selected component (if applicable):
How reproducible:
The problem as described is 100% reproducible. I will attach a small collection
of files, including a test script, that can be used to demonstrate the bug.
Steps to Reproduce:
1. Unpack the "bug.tar.gz" archive attached to this report.
2. Run the "run" script.
Actual results:
The TLS variable has a different address in main.o and init.o. Only the one in
main.o has the proper initial value (92), and assigning to one does not affect
the other:
main.c: 7: in main(): before init: *0xb7effa9c == 92
init.c: 9: in init(): initial value: *0xb7effaa4 == -1209008392
init.c:11: in init(): after assignment: *0xb7effaa4 == 14
main.c: 9: in main(): after init: *0xb7effa9c == 92
Expected results:
All object files should agree on the variable's address and should see its
initial value as 92. After init() changes this to 14, main() should also see
the value as 14:
main.c: 7: in main(): before init: *0xb7f46a9c == 92
init.c: 9: in init(): initial value: *0xb7f46a9c == 92
init.c:11: in init(): after assignment: *0xb7f46a9c == 14
main.c: 9: in main(): after init: *0xb7f46a9c == 14
Additional info:
It doesn't matter which object is compiled with "-fpic". Whichever object was
compiled using "-fpic", that's the object that will see the copy of the variable
that was not initialized to 92.
I named a few actions above that eliminate the problem (e.g. removing the linker
flag or not mixing pic/non-pic). However, none of these are really viable
workarounds for me. The example I've attached to this bug report is *much*
simplified from the original context in which I'm seeing the problem. The
original context is an instrumented build of gnome-panel for the Cooperative Bug
Isolation Project (http://www.cs.wisc.edu/cbi/). gnome-panel does not use
"-fpic" but does use "-Wl,--export-dynamic", whereas CBI's instrumentation
infrastructure has to be compiled using "-fpic" since it is sometimes linked in
with shared libraries. So in this original context, I really don't have any
choice but to mix pic/non-pic and to use the "--export-dynamic" linker flag.
Until this bug is resolved somehow, I cannot post working instrumented
gnome-panel packages. :-(
Created attachment 131294
source files and script to demonstrate the problem
For FC-5 binutils the patch reversion can't be done (the bogus patch was applied
only after FC-5 froze), so only the last hunk in bfd/elf32-i386.c, together
with the ld/testsuite/ld-i386/tlsbin.dd fix, is needed.
Should be fixed in binutils-22.214.171.124.2-4 in rawhide.
Thank you for the speedy response to this somewhat obscure problem, Jakub!
Regarding the fix in rawhide, can you tell me if this affects build-time tools
only, or is this also a change to the run-time loader? That is, if I were to
use the rawhide binutils to *create* the executable, would it work properly on a
different machine that was still using the standard FC5 binutils without this fix?
Yes, the bug is only in binutils, not in glibc or gcc.
Ah, OK. Somehow I'd convinced myself that binutils included the runtime loader.
I see now that glibc provides that. So a binutils-only bug means it only needs
to be fixed on the developer's machine. Got it.