Applix 4.4.1 (announcing itself as "4.41 (1021.221.13)") segfaults on startup with the glibc-2.2 errata applied. The entire result of running applix is:
>>>
    [sct@spock] ~ $ /opt/applix/applix
    [sct@spock] ~ $ axmain: signal Segmentation fault
    axnet error, axmain already started. Try Again.
    axmain: signal Segmentation fault
>>>
glibc-2.1.94-3 is fine.
Additional info, in case it helps to track down an obvious error. The failure is always in the same place: doing a stat. The ltrace shows
    [pid 4803] fopen("/net/opt/applix/axdata/fontmetri"..., "r") = 0x084c3938
    [pid 4803] fileno(0x084c3938) = 11
    [pid 4803] _fxstat(1, 11, 0xbfffd17c, 0x082e0030, 21 <unfinished ...>
    [pid 4803] --- SIGSEGV (Segmentation fault) ---
and the corresponding strace gives
    [pid 4852] open("/net/opt/applix/axdata/fontmetrics/gallium/fs/aplxfont/fonts.dir", O_RDONLY) = 11
    [pid 4852] --- SIGSEGV (Segmentation fault) ---
so the _fxstat is failing before it gets as far as the system call. Using glibc-2.1.94-3, the ltrace shows
    [pid 4833] fopen("/net/opt/applix/axdata/fontmetri"..., "r") = 0x084c3938
    [pid 4833] fileno(0x084c3938) = 11
    [pid 4833] _fxstat(1, 11, 0xbfffd17c, 0x082e0030, 21) = 0
at the same location --- the arguments are precisely the same, but with glibc-2.2 the call fails.
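For anyone wanting to reproduce the call sequence outside Applix, here is a minimal sketch of the same three steps the ltrace shows (fopen, fileno, then a stat on the descriptor). The function name stat_open_stream is made up for illustration, and this is only a guess at the shape of the caller, not Applix's actual code:

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper mirroring the traced sequence:
 * fopen(), fileno(), then fstat() on the resulting descriptor.
 * Returns 0 on success and -1 on any failure. */
int stat_open_stream(const char *path, struct stat *out)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;                      /* file could not be opened */

    int fd = fileno(fp);
    int rc = (fd >= 0) ? fstat(fd, out) : -1;

    fclose(fp);                         /* always release the stream */
    return rc;
}
```

Called on any readable file it should return 0; in the failing case above, the equivalent call never gets as far as returning.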
Can you please run gdb on the generated core? It looks like __fxstat with vers == _STAT_VER_KERNEL, which does just:
    if (vers == _STAT_VER_KERNEL)
      return INLINE_SYSCALL (fstat, 2, fd, CHECK_1 ((struct kernel_stat *) buf));
so I'm interested in whether you could find the crashing $pc and disassemble that routine, to make sure we're looking at the same code.
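To make the version check concrete, here is a rough sketch of a wrapper that dispatches on the version argument. The names sketch_fxstat and SKETCH_STAT_VER_KERNEL are invented for this example; the value 1 is taken from the first argument in the ltrace output, and the body forwards to plain fstat() rather than the raw syscall, so it is a simplification and not the glibc source:

```c
#include <errno.h>
#include <sys/stat.h>

/* Hypothetical constant; 1 is the first argument seen in the ltrace. */
enum { SKETCH_STAT_VER_KERNEL = 1 };

/* Sketch of a versioned stat wrapper: accept the one version this
 * example knows about and reject everything else with EINVAL.
 * The real wrapper issues the syscall directly and handles more
 * versions; this only shows the dispatch. */
int sketch_fxstat(int vers, int fd, struct stat *buf)
{
    if (vers == SKETCH_STAT_VER_KERNEL)
        return fstat(fd, buf);          /* forwards to the normal fstat() */

    errno = EINVAL;                     /* unknown version requested */
    return -1;
}
```

The point is only that nothing in a wrapper of this shape can fault before the system call, which is why the crashing $pc is the interesting thing to look at.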
There's no core file --- axmain has set up a sig11 handler. However, sending the process a SIGSTOP lets me attach gdb to it, and I can get as far as the segv that way. I see:
>>>>
    (gdb) c
    Continuing.
    Program received signal SIGSEGV, Segmentation fault.
    0x0 in ?? ()
    (gdb) bt
    #0 0x0 in ?? ()
    #1 0x83082f4 in fstat ()
    #2 0x8307510 in FontFileReadDirectory ()
    #3 0x82e0046 in FontFileInitFPE ()
    [ lots more application stack frames ]
>>>>
which is calling fstat from deep in font server country. Dumping the args list for frame 1 on the stack shows the args (0x0000000b, 0x08307510), so we look like we are in the right territory. The fstat disassembly begins
    (gdb) disas 0x83082f4
    Dump of assembler code for function fstat:
    0x83082e0 <fstat>: push %ebp
    0x83082e1 <fstat+1>: mov %esp,%ebp
    0x83082e3 <fstat+3>: push %esi
    0x83082e4 <fstat+4>: push %ebx
    0x83082e5 <fstat+5>: mov 0x8(%ebp),%ebx
    0x83082e8 <fstat+8>: mov 0xc(%ebp),%esi
    0x83082eb <fstat+11>: push %esi
    0x83082ec <fstat+12>: push %ebx
    0x83082ed <fstat+13>: push $0x1
    0x83082ef <fstat+15>: call 0x80918b4 <_fxstat>
    0x83082f4 <fstat+20>: add $0xc,%esp
so we're definitely seeing the _fxstat entry point here. _fxstat itself is
    (gdb) disas 0x80918b4
    Dump of assembler code for function _fxstat:
    0x80918b4 <_fxstat>: jmp *0x8424b08
    0x80918ba <_fxstat+6>: push $0x8b8
    0x80918bf <_fxstat+11>: jmp 0x8090734 <_init+52>
but
    (gdb) x 0x8424b08
    0x8424b08 <__DTOR_END__+1132>: 0x00000000
    (gdb)
so no wonder we have jumped into hyperspace. The memory around 0x8424b08 all looks valid, so this isn't a complete memory wipe we're looking at. Running from scratch with a breakpoint at main shows that the jump vector at 0x8424b08 points to <_fxstat+6> initially, so it looks like we're doing a fixup to NULL at some point.
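As an illustration of what the disassembly suggests, here is a toy model of a single jump-table slot that a resolver fills in. Every name in it (stat_fn, resolve_slot, call_through_slot) is hypothetical. The real jump stub has no NULL check at all, which is exactly why the process lands at address 0; the guard below exists only so the sketch can report the condition instead of crashing:

```c
#include <stddef.h>

/* Hypothetical stand-in for one jump-table slot: a pointer that the
 * loader is expected to overwrite with the real function's address. */
typedef int (*stat_fn)(int fd);

/* Toy target function: succeeds for any non-negative descriptor. */
static int real_stat(int fd) { return fd >= 0 ? 0 : -1; }

/* Hypothetical resolver: yields the target when the symbol lookup
 * succeeds, and NULL when the lookup finds nothing to bind to. */
stat_fn resolve_slot(int lookup_succeeds)
{
    return lookup_succeeds ? real_stat : NULL;
}

/* Call through the slot, refusing to jump to address 0.
 * Returns -2 to signal an unresolved slot. */
int call_through_slot(stat_fn slot, int fd)
{
    if (slot == NULL)
        return -2;
    return slot(fd);
}
```

With a successful lookup the call goes through as normal; with a failed lookup the slot holds NULL, which matches the 0x00000000 seen at 0x8424b08.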
Btw, according to the ltrace, this is the process's first invocation of _fxstat, although there were several prior successful _xstat calls.
_fxstat (and likewise _xstat and _lxstat) has not been exported from glibc since glibc 2.1, so I wonder why it worked with earlier glibcs. Applix is a glibc 2.0 application, right? Even then, it is pretty strange, because /usr/lib/libc_nonshared.a in RH5.2 referenced __fxstat (that symbol is exported from glibc), so probably Applix just used its own magic fstat or whatever (because libc_nonshared used vers 3 (aka _STAT_VER_LINUX) while Applix uses vers 1 (aka _STAT_VER_KERNEL)). Can you perhaps run it with both glibc 2.2 and glibc-2.1.94-3, with LD_DEBUG=all set in the environment?
glibc-2.1.94-3:
    05099: symbol=_fxstat; lookup in file=/net/opt/applix/axdata/axmain
    05099: symbol=_fxstat; lookup in file=/lib/libNoVersion.so.1
    05099: binding file /net/opt/applix/axdata/axmain to /lib/libNoVersion.so.1: normal symbol `_fxstat'
glibc-2.2-5:
    05071: symbol=_fxstat; lookup in file=/net/opt/applix/axdata/axmain
    05071: symbol=_fxstat; lookup in file=/usr/X11R6.local/lib/libX11.so.6
    05071: symbol=_fxstat; lookup in file=/lib/libdl.so.2
    05071: symbol=_fxstat; lookup in file=/lib/libcrypt.so.1
    05071: symbol=_fxstat; lookup in file=/usr/lib/libstdc++.so.2.8
    05071: symbol=_fxstat; lookup in file=/lib/libm.so.6
    05071: symbol=_fxstat; lookup in file=/lib/libc.so.6
    05071: symbol=_fxstat; lookup in file=/lib/ld-linux.so.2
    <then continues with next symbol --- no bind happens>
So we are finding _fxstat in /lib/libNoVersion.so.1 in glibc-2.1.94, but we aren't even looking in that library with glibc-2.2. _fxstat _is_ in the libNoVersion for glibc-2.2-5, but ld.so isn't looking there. I can attach the full debug file from 2.2 if you want: it's only 140k compressed.
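The difference between the two logs comes down to which objects are in the search scope. A small model of that, with invented types and names (struct object, lookup_symbol) and no claim to resemble ld.so's real data structures, might look like this:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one loaded object and the symbols it exports. */
struct object {
    const char *name;
    const char *const *symbols;         /* NULL-terminated list */
};

/* Walk the search scope in order and return the name of the first
 * object exporting `symbol`, or NULL when no object in scope has it. */
const char *lookup_symbol(const struct object *scope, size_t n,
                          const char *symbol)
{
    for (size_t i = 0; i < n; i++)
        for (const char *const *s = scope[i].symbols; *s != NULL; s++)
            if (strcmp(*s, symbol) == 0)
                return scope[i].name;
    return NULL;                        /* nothing to bind to */
}
```

If the scope contains an entry for libNoVersion.so.1 that exports _fxstat, the lookup binds there, as in the 2.1.94 log. If that entry is missing from the scope, the walk runs off the end and returns NULL, as in the 2.2 log, even though the library on disk still has the symbol.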
I'll look at the libNoVersion loading code over the weekend. Thanks for tracing this down.
Do you want the LD_DEBUG log?
No, I think I've nailed it down (the _dl_map_object interface is changing all the time); we will see after the glibc-2.2-6 build (after I nail down one more issue).
ETA?
Most probably I will push it through the build system tomorrow. I spent some time today fixing other glibc issues, and with the exception of one unreproducible report all are fixed.
Fixed in glibc-2.2-9 errata.