Bug 90002

Summary: binary compatibility for '_res' broken in glibc 2.3.x
Product: [Retired] Red Hat Linux
Reporter: Gurusamy Sarathy <gsar>
Component: glibc
Assignee: Jakub Jelinek <jakub>
Status: CLOSED CURRENTRELEASE
QA Contact: Brian Brock <bbrock>
Severity: medium
Priority: medium
Version: 9
CC: bdraco, drepper.fsp, fweimer, nick
Hardware: All   
OS: Linux   
Fixed In Version: 2.3.2-27.9.4
Doc Type: Bug Fix
Last Closed: 2003-11-20 12:30:39 EST

Description Gurusamy Sarathy 2003-04-30 19:55:46 EDT
Description of problem:
The new glibc in RedHat 9 results in "incorrectly built binary"
errors when attempting to run executables that were built under
RedHat 6.2 (and possibly later versions--I haven't checked
if versions later than 6.2 have the same issue).

There appears to be no way (other than with low-level tricks
using dlopen/dlsym) to build a single binary that will work
under both RedHat 9 and under RedHat 6.x versions.  Vendors
are forced to build two separate distributions; one for
RedHat 9 and another for earlier versions.

Note that this affects code that follows all the rules about
not explicitly declaring _res and including the right
headers.  The headers in RedHat 6.x implement _res as a
variable, not as a #define that dereferences the pointer
returned by a function.  So this is tantamount to
discontinuing support for running RedHat 6.x binaries under
RedHat 9.
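For illustration, here is a minimal sketch of the macro-style definition described above, as it appears in current glibc headers, where <resolv.h> defines _res as (*__res_state()). The helper name res_is_accessor_backed is hypothetical and not part of glibc. A binary built against the older variable-style header instead references a data symbol named _res, which is what the new library no longer supplies.

```c
#include <sys/types.h>
#include <netinet/in.h>
#include <resolv.h>

/* Hypothetical helper (not a glibc function): returns 1 when the _res
 * provided by <resolv.h> is the object returned by __res_state().
 * Assumes a header that defines _res as (*__res_state()). */
int res_is_accessor_backed(void)
{
    /* With the macro-style header, &_res expands to &(*__res_state()),
     * so the compiled code calls a function instead of referencing a
     * data symbol named _res. */
    return &_res == __res_state();
}
```

On a system with the macro-style header this is expected to return 1; with a header that declares _res only as a plain variable, the snippet would not compile as written.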

I see that this issue was already raised as bug#89286, but
was closed without really citing any justification for
why compatibility was broken.

Finally, I fail to see the benefit of this heavy-handed policy
of forcing a recompile of binaries that are single-threaded,
and therefore have nothing to gain from the new-fangled
_res implementation.


Version-Release number of selected component (if applicable):
glibc-2.3.2-11.9

How reproducible:
Every time.

Steps to Reproduce:
/* build this on RedHat 6.2 and run it on RedHat 9.0
 * to see the problem.
 * For a real-world example, try sendmail 8.12.9.
 */
#include <resolv.h>
#include <stdio.h>

int main(int ac, char **av) {
    printf("%lx\n", _res.options);
    return 0;
}

    
Actual results:
When the program is built on RedHat 6.2 and run on RedHat 9,
it produces the following message on stdout:
    Incorrectly built binary which accesses errno, h_errno or _res directly. Needs to be fixed.

Expected results:
The program should not produce any errors.

If you insist on producing errors, they should go to stderr,
not stdout.

Additional info:
Comment 1 Jakub Jelinek 2003-06-10 02:50:44 EDT
For stderr vs. stdout, I completely agree and changed _dl_printf to _dl_error_printf
in my copy.
The error occurs because /lib/tls/libc.so.6 does not have the _res, errno and
h_errno variables, and cannot have them without slowing all programs down considerably.
You can compile programs using _res on RHL 6.2 such that they will run
on RHL 6.2 and RHL 9, you just need to pass -D_REENTRANT.
Comment 2 Gurusamy Sarathy 2003-06-10 12:00:18 EDT
>You can compile programs using _res on RHL 6.2 such that they will run
>on RHL 6.2 and RHL 9, you just need to pass -D_REENTRANT.

I tried this with the three-line test case, and it doesn't work.

    pepper% gcc -D_REENTRANT res-test.c -o res-test
    caliper% ./res-test
    Incorrectly built binary which accesses errno, h_errno or _res directly. Needs to be fixed.

'pepper' is a RedHat 6.2 box, and 'caliper' is RedHat 9 with
glibc-2.3.2-11.9.
Comment 3 Scott Johnson 2003-09-27 05:00:35 EDT
It seems that Free Pascal is affected by this as well.  Not that it affects RH
much, but as of today I've removed RH9 support from my supported platforms list
until it's fixed.  The same code compiles fine under Debian and others.  Under RH,
I get this:

Linking playtime
/usr/lib/fpc/1.0.10/units/linux/inet/inet.o(.text+0x3a2): In function `_INET$$_$$_THOST_$$_NAMELOOKUP$STRING':
: undefined reference to `h_errno'
/usr/lib/fpc/1.0.10/units/linux/inet/inet.o(.text+0x40d): In function `_INET$$_$$_THOST_$$_ADDRESSLOOKUP$THOSTADDR':
: undefined reference to `h_errno'
/usr/lib/fpc/1.0.10/units/linux/inet/inet.o(.text+0x742): In function `_INET$$_$$_TNET_$$_NAMELOOKUP$STRING':
: undefined reference to `h_errno'
/usr/lib/fpc/1.0.10/units/linux/inet/inet.o(.text+0x79a): In function `_INET$$_$$_TNET_$$_ADDRESSLOOKUP$LONGINT':
: undefined reference to `h_errno'
/usr/lib/fpc/1.0.10/units/linux/inet/inet.o(.text+0xa6c): In function `_INET$$_$$_TSERVICE_$$_NAMELOOKUP$STRING$STRING':
: undefined reference to `h_errno'
/usr/lib/fpc/1.0.10/units/linux/inet/inet.o(.text+0xb37): more undefined references to `h_errno' follow
playtime.pas(1644) Error: Error while linking
Closing script ppas.sh
Comment 4 Jakub Jelinek 2003-09-27 05:11:38 EDT
That's a Free Pascal bug, though. We cannot do anything about it.
Comment 5 Scott Johnson 2003-09-27 05:27:07 EDT
It's been submitted to the folks at Free Pascal as well; however, it appears
only on RH9.  Other platforms are running as expected.  I respectfully remove
myself from the cc list and will discontinue support for RH9 in future
releases. Thanks for the response.
Comment 6 Jakub Jelinek 2003-10-01 16:44:16 EDT
macro@freepascal.org wrote:
(I work on the FPC unix runtime, and will operate as contact for this problem)

Show me a unified way of accessing errno across all common Linux distros and
*BSD and I'll commit it.

The problem with Unix (POSIX actually) is that it never says what a symbol is:
a macro or a variable/function.

And only true symbols (as opposed to macros) are accessible.
Comment 7 Jakub Jelinek 2003-10-01 16:46:51 EDT
#include <errno.h>
int *get_errno_address (void)
{
  return &errno;
}
is portable and you can use that get_errno_address fn in non-C/C++/ObjC languages
(which cannot #include <errno.h> themselves).

(for errno; for h_errno s/errno.h/netdb.h/g;s/errno/h_errno/g ).
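Applying that substitution literally gives the h_errno variant below. This is a sketch of the same idea, not code taken from glibc or FPC; the function name get_h_errno_address simply follows from the substitution.

```c
#include <netdb.h>

/* Accessor for h_errno, obtained by applying the substitution above to
 * get_errno_address. Because it is compiled by a C compiler, it works
 * whether <netdb.h> defines h_errno as a macro or as a variable. */
int *get_h_errno_address(void)
{
    return &h_errno;
}
```

A non-C caller would link against the object file containing this function and read or write the resolver error code through the returned pointer.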
Comment 8 Marco van de Voort 2003-10-02 03:01:58 EDT
Yes, I think that (Jakub's) is the solution. Such routines (getters and
setters) should be included in glibc. Maybe also for other important symbols that
are often implemented via macros (though I can't think of one quickly, at
least not in the core functionality).

Using the code above as a C stub is not an option though, since that would add a
dependency (of FPC) on the entire C build system. A dependency that doesn't
exist now.
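For illustration only, a getter/setter pair of the kind proposed in this comment might look like the following in C. The names get_errno_value and set_errno_value are hypothetical; no such routines exist in glibc, and this is a sketch of the idea rather than a proposed interface.

```c
#include <errno.h>

/* Hypothetical getter: returns the calling thread's current errno value. */
int get_errno_value(void)
{
    return errno;
}

/* Hypothetical setter: stores a new value in the calling thread's errno. */
void set_errno_value(int value)
{
    errno = value;
}
```

Unlike the pointer-returning accessor, this pair never exposes the address of errno, so a caller in another language only needs to pass and receive plain integers.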
Comment 9 Jakub Jelinek 2003-10-02 07:09:12 EDT
That would be a C dependency only at FPC build time, not when you actually use FPC.
Is FPC not written in C, at least partly?
Such accessors don't belong in glibc; glibc should not be bloated.
They can certainly live in a separate library, though, and such a library can
certainly be shipped with FPC.
Comment 10 Marco van de Voort 2003-10-02 09:14:58 EDT
FPC is written in itself (though Delphi and Kylix can probably compile it, but
that is proprietary and moreover not easy, since the makefiles don't help with that).

FPC has its own (syscall-based) runtime. Release versions are static, and
therefore not distribution-dependent. (We have one Linux release, though vendors
usually adapt it to their own FS hierarchy and package format.)
A Red Hat-only solution therefore is not an option, since this so-called minor
quirk would force us to start making separate releases for each and every Linux
distro.

glibc is only needed when interfacing to C libs (libx11, gtk, qt etc.; there are
several tens of interface units) and the few more OS-related calls in libc, not
for the core system.

Using a separate C library _with FPC_ keeps the same problem (requiring
distribution-specific builds), so a solution must be vendor-supplied and
preferably universal across the Linux distros. Otherwise there comes a build
dependency on the entire C build structure, one that doesn't exist now, only to use
e.g. gtk.

IMHO the usage of libc as a central operating system library is wrong as long as
it is specified at the C level, not as a universal application interface at the
binary level. I can't be sure that a symbol that is "in libc" is actually
accessible if I link to libc; it might be some daft macro.

And libc's API specification (the libc headers) is impossible to parse/convert
automatically, leaving you with tedious manual conversion (again for each
version). (QT does this better: they have a header conversion app, and keep their
master headers in a relatively abstract format. It would be nice if this were
universal.)

However, most symbols at the C level usually map to the binary interface 1:1, so
all this has been relatively problem-free.

RedHat breaks this now, for a major symbol, and this is only the second time FPC
has had a problem like this since FPC became available on Linux (in '95).

And the first time was a very major change (the conversion to glibc, which broke
the startup files).


Comment 11 Jakub Jelinek 2003-10-02 09:25:21 EDT
Can you explain why it needs to be distribution specific in the errno/h_errno
cases? The errno/h_errno macro definitions have not changed in glibc since mid-1996
(i.e. all glibc 2.1.0+ systems will have errno defined to *__errno_location()
and h_errno to *__h_errno_location()).
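As a hedged illustration of the definitions mentioned above, the following C sketch checks that errno and h_errno resolve to the objects returned by __errno_location() and __h_errno_location(). It assumes glibc-style headers that declare both functions; the two helper names are hypothetical.

```c
#include <errno.h>
#include <netdb.h>

/* Hypothetical helper: 1 if errno is the object at *__errno_location(). */
int errno_uses_location_function(void)
{
    return &errno == __errno_location();
}

/* Hypothetical helper: 1 if h_errno is the object at *__h_errno_location(). */
int h_errno_uses_location_function(void)
{
    return &h_errno == __h_errno_location();
}
```

A runtime written in another language could, under the same assumption, call __errno_location() and __h_errno_location() directly instead of linking against data symbols named errno and h_errno.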
Comment 12 Gurusamy Sarathy 2003-10-02 11:35:44 EDT
Can we please not confuse the _res issue with the errno issue?

This bug is about it not being possible to produce a single
binary that works across RedHat 6.x through 9 if the program
uses the _res symbol (see the simple 3-line test case in
comment #2 which proves Jakub's suggested "solution" isn't one).
It would be nice to know if RedHat intends to fix this in
glibc or not.

Thank you.
Comment 13 Jakub Jelinek 2003-10-02 11:42:07 EDT
_res is fixed in glibc-2.3.2-64 (ATM glibc-2.3.2-97), available in RHEL3/Fedora Core Test 2 etc.
When a glibc errata for RHL9 is made, this will surely be one of the things fixed
there.
Comment 14 Marco van de Voort 2003-10-03 07:23:02 EDT
Jakub: Something changed for the worse in the glibc packaged with RH9; I
can't tell from here whether RH or glibc did this. This is simply a fact, since
FPC doesn't work anymore.

If the replacement code is universal, and goes back a couple of versions, that
means I at least keep compatibility within Linux versions (but maybe still lose
Linux and *BSD doing this in the same way).

However I set up a RH9 install myself, and will do some testing with
alternatives first.

Comment 15 Need Real Name 2003-10-31 10:40:57 EST
I'm just a user and don't pretend to understand all the issues, but wanted to
add a thank you for the changed glibc version. I had upgraded my RH7.3 glibc
to 2.3.1 so that I could get a visual debugger to work but then found that my
printer was printing an extra page, with the "incorrectly built binary"
message.  This occurred for lpr, applications such as OpenOffice, etc.  Since
there was no indication of the name of the program actually causing the problem, I
could not see an easy fix.  But a Google search found this bugzilla so I
upgraded again to glibc 2.3.2-74 and the problem disappeared, saving me
considerable frustration.
Comment 16 Ulrich Drepper 2003-11-04 16:50:07 EST
Should be fixed in the RHL9 errata.  Test version is available at

  ftp://people.redhat.com/jakub/glibc/errata/2.3.2-27.9.4/

Try it and let us know.
Comment 17 Birol Aktas 2003-11-06 08:49:34 EST
I have installed the above test version of the RHL9 errata on my
system.  All the binaries that came on an application CD and
previously failed to work now seem to be working just fine.

Thanks.
Comment 18 Birol Aktas 2003-11-06 09:24:51 EST
I have installed the above test version of the RHL9 errata on my
system.  All the binaries that came on an application CD and
previously failed to work now seem to be working just fine.

Thanks.
Comment 19 Ulrich Drepper 2003-11-20 12:30:39 EST
Closing as fixed in current version.
Comment 20 Freddy Boisseau 2003-12-11 09:25:55 EST
I need this same fix for the Enterprise Version 3.