Bug 86432

Summary: Update to glibc-2.3.2-4.80 breaks netscape communicator
Product: [Retired] Red Hat Linux
Reporter: Abramo Bagnara <abramo.bagnara>
Component: glibc
Assignee: Jakub Jelinek <jakub>
Status: CLOSED WONTFIX
QA Contact: Brian Brock <bbrock>
Severity: high
Priority: medium
Version: 8.0
CC: fweimer, mitr
Target Milestone: ---   
Target Release: ---   
Hardware: athlon   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2003-11-05 19:40:23 UTC
Attachments:
gdb traceback, instruction decode, reg value, NULL pointer reference (flags: none)

Description Abramo Bagnara 2003-03-21 21:27:40 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.3) Gecko/20030314

Description of problem:
After upgrading to glibc-2.3.2-4.80, netscape gives a Bus error at startup.


Version-Release number of selected component (if applicable):
glibc-2.3.2-4.80

How reproducible:
Always

Steps to Reproduce:
1. type netscape and press enter ;-)


Actual Results:  Bus error

Expected Results:  Netscape Communicator window appears

Additional info:

A gdb session reveals that the segmentation fault occurs in
__pthread_cleanup_upto:

$ gdb /usr/lib/netscape/netscape-communicator 
GNU gdb Red Hat Linux (5.2.1-4)
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux"...
(no debugging symbols found)...
(gdb) r
Starting program: /usr/lib/netscape/netscape-communicator 
(no debugging symbols found)...(no debugging symbols found)...
(no debugging symbols found)...(no debugging symbols found)...
(no debugging symbols found)...(no debugging symbols found)...
(no debugging symbols found)...(no debugging symbols found)...
(no debugging symbols found)...[New Thread 16384 (LWP 1388)]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 16384 (LWP 1388)]
0x407f492f in __pthread_cleanup_upto () from /lib/libpthread.so.0
(gdb) c
Continuing.

Program received signal SIGBUS, Bus error.
0x40239f81 in kill () from /lib/libc.so.6

The running kernel is vanilla 2.4.20, but netscape gives the same error with
kernel 2.4.18-27.8.0 from Red Hat updates.

Comment 1 Jakub Jelinek 2003-03-26 17:50:58 UTC
Can you please try ftp://people.redhat.com/jakub/glibc/errata/8.0/ ?

Comment 2 Abramo Bagnara 2003-03-26 21:49:34 UTC
I get exactly the same results after installing the RPMs from
ftp://people.redhat.com/jakub/glibc/errata/8.0/ as you suggested.

I've also tried rebooting the machine after this test upgrade, but nothing has
changed.

Comment 3 Jakub Jelinek 2003-03-29 08:49:14 UTC
One more attempt: ftp://people.redhat.com/jakub/glibc/errata/8.0/*4.80.3*
Apparently more changes were needed for kernels that provide a broken
AT_PLATFORM aux vector element.

Comment 4 Abramo Bagnara 2003-03-29 22:50:05 UTC
Using 4.80.3 as you suggest I get the following results:

$ /usr/bin/netscape
Bus error

_but_

$ /usr/lib/netscape/netscape-communicator 

works.

I've traced the culprit to:

$ LD_ASSUME_KERNEL=2.2.5 /usr/lib/netscape/netscape-communicator 
Bus error

Does this mean this is a bug in the netscape wrapper script?

Comment 5 Ulrich Drepper 2003-11-05 19:40:23 UTC
The latter command line makes the runtime use a glibc with an older
ABI.  The difference is in how thread stacks are handled.  It seems
the netscape version relies on something weird in the memory layout. 
This certainly isn't guaranteed.

Given how bad the netscape 4 code was (judging from the first
mozilla code), I am not at all surprised.  If the code without
LD_ASSUME_KERNEL works, fine, use it.  It's not worth spending time on
the other case.  Use mozilla, which has much better code and actually works.

I'm closing the bug.  I don't think there is anything we should do. 
Reopen if you disagree and have a proposal for how to go forward.

Comment 6 Anil Chandiramani 2004-07-29 22:00:24 UTC
Created attachment 102306
gdb traceback, instruction decode, reg value, NULL pointer reference

This problem is reproducible with pthreaded code.
It involves attempting a longjmp from a signal handler after SIGSEGV.
It seems to depend on where the setjmp() was done, i.e., not every SIGSEGV
handler triggers it.
A core file is available.