Bug 23538

Summary: ../sysdeps/generic/strcpy.c missing - address book core dumps - latest RHN update
Product: [Retired] Red Hat Linux
Component: netscape
Version: 7.0
Hardware: i386
OS: Linux
Status: CLOSED RAWHIDE
Severity: high
Priority: high
Reporter: Bryce Nesbitt <bryce>
Assignee: Bill Nottingham <notting>
QA Contact: David Lawrence <dkl>
CC: rvokal, skokoska, tbs
Doc Type: Bug Fix
Last Closed: 2001-01-12 13:56:33 UTC

Description Bryce Nesbitt 2001-01-07 15:29:14 UTC
1> Add something to the address book.
2> Start a new message.
3> Click on "address book"
4> Double click on an address book entry (to add the address to the email).
5> Crash!

[bryce@hardhat bin]$ gdb /usr/lib/netscape/netscape-communicator 
GNU gdb 5.0
Copyright 2000 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux"...
(no debugging symbols found)...
(gdb) run
Starting program: /usr/lib/netscape/netscape-communicator 

Program received signal SIGSEGV, Segmentation fault.
strcpy (dest=0x8d64d00 "", src=0x0) at ../sysdeps/generic/strcpy.c:39
39	../sysdeps/generic/strcpy.c: No such file or directory.
(gdb) Quit
(gdb) quit
The program is running.  Exit anyway? (y or n) y

[bryce@hardhat bin]$ rpm -q netscape-communicator
netscape-communicator-4.76-1

[bryce@hardhat bin]$ rpm -q glibc
glibc-2.1.92-14

[bryce@hardhat bin]$ rpm -q libgtop
libgtop-1.0.9-2

Comment 1 Bryce Nesbitt 2001-01-07 15:41:53 UTC
Same results after forcing RHN Update to install the glibc it had been skipping:

[bryce@hardhat ~]$ rpm -q glibc
glibc-2.2-9


Comment 2 Bill Nottingham 2001-01-07 17:51:37 UTC

*** This bug has been marked as a duplicate of 10433 ***

Comment 3 Bryce Nesbitt 2001-01-08 02:08:22 UTC
Nope, this is not the same. That other gaggle of horrible bugs is bad enough
(you must be sick of them), but this one is different.  The same "fix" does not
work.  With all locale lines taken out the behavior is identical.  Note the
exception:

Program received signal SIGSEGV, Segmentation fault.
strcpy (dest=0x8d64d00 "", src=0x0) at ../sysdeps/generic/strcpy.c:39
39	../sysdeps/generic/strcpy.c: No such file or directory.
(gdb) run

Comment 4 Bryce Nesbitt 2001-01-08 02:19:00 UTC
Here is a stack backtrace.  Need a core file?  Need any other gdb output?  Need
a copy of IE 5.0? (oops, sorry)...

(gdb) bt
#0  strcpy (dest=0x8d64d00 "", src=0x0) at ../sysdeps/generic/strcpy.c:39
#1  0x836b1cf in XFE_AddresseeView::getColumnText ()
#2  0x8360a28 in XFE_Outliner::contentCellDraw ()
#3  0x8360e85 in XFE_Outliner::celldraw ()
#4  0x8361922 in XFE_Outliner::celldrawCallback ()
#5  0x40035778 in XtCallCallbackList () from /usr/X11R6/lib/libXt.so.6
#6  0x8391c55 in XmLFolderSetActiveTab ()
#7  0x400413e5 in _XtEventInitialize () from /usr/X11R6/lib/libXt.so.6
#8  0x40041147 in XtDispatchEventToWidget () from /usr/X11R6/lib/libXt.so.6
#9  0x40040cf0 in XtDispatchEventToWidget () from /usr/X11R6/lib/libXt.so.6
#10 0x40041820 in _XtOnGrabList () from /usr/X11R6/lib/libXt.so.6
#11 0x40041b99 in XtDispatchEvent () from /usr/X11R6/lib/libXt.so.6
#12 0x4004ca3e in XtAppProcessEvent () from /usr/X11R6/lib/libXt.so.6
#13 0x82bd5ad in fe_EventLoop ()
#14 0x82bffc5 in main ()
#15 0x40217fd1 in __libc_start_main (main=0x82be7a4 <main>, argc=1, 
    ubp_av=0xbffff9c4, init=0x827f548 <_init>, fini=0x894914c <_fini>, 
    rtld_fini=0x4000e254 <_dl_fini>, stack_end=0xbffff9bc)
    at ../sysdeps/generic/libc-start.c:118

Interesting, interesting.  Note that I'm running accelerated X 6.0 from
http://www.xigraphics.com.

Comment 5 Bill Nottingham 2001-01-08 02:26:40 UTC
As far as I can tell, it's the same bug; it's from the same cause, and
it's the same fault (netscape is trying to run strcpy on a NULL pointer.)

This is a bug in the netscape program itself; we don't have the source
to it. If the previous workarounds don't work for you, the best
I can suggest is that you report this to netscape yourself (We've
reported #10433 to them before.)

http://help.netscape.com/forms/bug-client.html


Comment 6 Bryce Nesbitt 2001-01-08 02:37:24 UTC
Netscape has never cared about bug reports.  Ever.

Is the workaround more complex than removing the locale lines from
.netscape/*.js?  It is not documented well.  From a user perspective, my RH
update has broken a rather important feature.

I see the null pointer in the copy, but why the  "../sysdeps/generic/strcpy.c
missing" message?

Note that the same setup under XF86 does not crash (it does not work either, but
it does not crash).

I've put support in the loop on this one also, just in case.

Comment 7 Bryce Nesbitt 2001-01-08 02:47:08 UTC
Sorry, you're right.  This is the same damn bug.  Note that it has been reported
at least a dozen times.

Just deleting the lines, as suggested, won't do it.  Netscape rewrites them.
To "fix" this bug, edit the locale lines in all *.js files in your .netscape
directory, changing the locale from en_US to en:

< user_pref("ldap_2.servers.pab.locale", "en_US");
> user_pref("ldap_2.servers.pab.locale", "en");

< user_pref("ldap_2.servers.pab.locale", "en_US");
> user_pref("ldap_2.servers.pab.locale", "en");

As a packager, redhat should certainly be doing this for us poor user slobs who
just want a stable machine.

Comment 8 Bill Nottingham 2001-01-08 02:48:16 UTC
glibc has debugging symbols in it that associate parts of the functions with
locations in source files; the references to those files are to wherever
they were when glibc was built. If you have the glibc sources in that
location, gdb will show you the corresponding line in the source; if you
don't, you'll get that message. It's not a fatal error.

Removing the 'pab.locale' line from preferences.js is the best
workaround; you can also try running netscape with LANG set to
just 'en' or 'C'. The latter would disable all support for
alternate locales entirely, which is why it's not the recommended solution.
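
The LANG workaround can be tried for a single run without touching the login environment. A minimal sketch, assuming a POSIX shell; `run_with_lang` is a hypothetical helper name and not part of the netscape wrapper. Note that if LC_ALL is set, it takes precedence over LANG, so this only has an effect when LC_ALL is empty (as it is in the `locale` output later in this report).

```shell
# Hypothetical helper: run a command with LANG overridden for that one
# process only. The calling shell's own LANG is left unchanged.
run_with_lang() {
    lang=$1
    shift
    env LANG="$lang" "$@"
}

# Example usage (assumes netscape is on the PATH):
# run_with_lang en netscape
# run_with_lang C netscape
```

Since `env` is an external command, the same `env LANG=C netscape` form also works when typed directly at a tcsh prompt.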

FWIW, I've run netscape-4.76 here with glibc-2.2-9, and I can't
reproduce this particular crash.

Comment 9 Bill Nottingham 2001-01-08 02:48:58 UTC
Also, we can't really change the defaults that the netscape binary
writes for locale support. No source == bad.

Comment 10 Bill Nottingham 2001-01-08 02:51:51 UTC
I also don't see netscape-4.76 writing those lines for me here. Earlier
versions did (and would re-write them each time the program started), but I've
never had netscape-4.76 re-add the line.

Comment 11 Bryce Nesbitt 2001-01-08 03:12:01 UTC
I can reproduce the crash every time under XiGraphics X, and under XF86 I always
get subtle bad behavior. I definitely can reproduce the lines getting added back
(as have others in the record).  This seems like a real one.

If I were RedHat, given the magnitude of the problem, given the number of times
the report has been made, given RedHat's willingness to patch the world for
other issues, I'd fix this one.

The netscape wrapper scripts can easily scan for this problem and offer the user
a chance to fix it.  Or fix it automatically.  This is not outside of
patchable!!!!!!!!!

And, without question, at the very very very least, this needs to be an alert and
something a user would run into with the RHN update agent.
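
A rough sketch of the kind of scan a wrapper script could do, assuming a POSIX shell. The function names are hypothetical, and the pref name is taken from the lines quoted in Comment 7. It only detects and reports; it does not prompt and does not modify anything.

```shell
# Hypothetical check: succeed if the given prefs file contains a
# ldap_2.servers.pab.locale line.
needs_locale_fix() {
    prefs=$1
    [ -f "$prefs" ] && grep -q 'ldap_2\.servers\.pab\.locale' "$prefs"
}

# Hypothetical scan: print the path of every *.js file in a directory
# that contains the locale line.
scan_prefs_dir() {
    dir=$1
    for f in "$dir"/*.js; do
        if needs_locale_fix "$f"; then
            echo "$f"
        fi
    done
}

# Example usage (the directory is the one named in this report):
# scan_prefs_dir "$HOME/.netscape"
```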

Comment 12 Bill Nottingham 2001-01-08 03:17:54 UTC
The workarounds like patching the users own preferences file, aside from
being ugly, break the locale support in interesting ways. For example, it
will start looking for the translation files in different places, requiring
the addition of various symlinks in the package, which can break upgrades.
Since the netscape wrapper runs from various scripts and is called by
other programs, making it interactive is *not* a viable solution.


Comment 13 Bill Nottingham 2001-01-08 03:23:35 UTC
Just to double-check; what do you get if you run 'set' before
you run netscape?

Comment 14 Bryce Nesbitt 2001-01-08 13:29:41 UTC
Yeah, but keep in mind right now users of your product are faced with a broken
address book in the only package considered important enough to be on the
toolbar by default.  This is bad.

What do you mean by "set"?:
-----------------------------------------------------------------
[bryce@hardhat ~/.netscape]$ /usr/bin/locale
LANG=en_US
LC_CTYPE="en_US"
LC_NUMERIC="en_US"
LC_TIME="en_US"
LC_COLLATE="en_US"
LC_MONETARY="en_US"
LC_MESSAGES="en_US"
LC_PAPER="en_US"
LC_NAME="en_US"
LC_ADDRESS="en_US"
LC_TELEPHONE="en_US"
LC_MEASUREMENT="en_US"
LC_IDENTIFICATION="en_US"
LC_ALL=

[bryce@hardhat bin]$ setenv LANG en
[bryce@hardhat bin]$ locale
LANG=en
LC_CTYPE="en"
LC_NUMERIC="en"
LC_TIME="en"
LC_COLLATE="en"
LC_MONETARY="en"
LC_MESSAGES="en"
LC_PAPER="en"
LC_NAME="en"
LC_ADDRESS="en"
LC_TELEPHONE="en"
LC_MEASUREMENT="en"
LC_IDENTIFICATION="en"
LC_ALL=

Comment 15 Bryce Nesbitt 2001-01-08 14:19:19 UTC
What about this?  It uses the same technique ALREADY in the wrapper.  And it has
a bonus: Netscape tends to corrupt the .js files, and this gives the user a
backup of the (probably) virgin copies.


pref_locale_cpck=$prefdir/nswrapper.hack_locale
liprefs=$prefdir/liprefs.js


# Sigh.  Netscrape address book craps out with a locale of en_US:
#
#       strcpy (dest=0x8d64d00 "", src=0x0) at ../sysdeps/generic/strcpy.c:39
#        #1  0x836b1cf in XFE_AddresseeView::getColumnText ()
#        #2  0x8360a28 in XFE_Outliner::contentCellDraw ()
#        #3  0x8360e85 in XFE_Outliner::celldraw ()
#
# Fix this once for the user.
# Hack the user's preference files to change the locale.  Do this just once.
# ($prefdir and $pref are assumed to be set earlier in the wrapper.)
#
if [ ! -f "$pref_locale_cpck" ]; then
        for f in "$pref" "$liprefs"; do
                # Skip any preference file that does not exist yet.
                [ -f "$f" ] || continue
                echo "Modifying $f to fix address book crash"
                # Keep a backup, then rewrite the locale value via a temp file.
                cp "$f" "$f.bak"
                sed -e 's/\.pab\.locale", "en_US"/.pab.locale", "en"/' "$f.bak" > "$f.tmp"
                mv -f "$f.tmp" "$f"
        done
        # Marker file so the hack runs only once per user.
        touch "$pref_locale_cpck"
fi

Comment 16 Bill Nottingham 2001-01-08 15:06:41 UTC
Well...

a) that only fixes the English case
b) as I said before, modifying the locale specified breaks the locale support

Meanwhile, so far this is the *only* report of netscape continuing to add the
locale lines after they've been removed. That's why I'm trying to figure out
what's different about your environment, because this does *NOT* occur for the
vast majority of users, and we can not reproduce your problem here.

To put it simply, we are not going to add a hack workaround that breaks other
support to fix a problem that we can't even reproduce here.

By 'set', I mean run 'set' at the command line to dump the environment.

Comment 17 Bryce Nesbitt 2001-01-08 16:21:35 UTC
Given that the bug database alone has over a dozen reports of the same
problem, this is a real issue, not something that happens to just me.   Where is
the solution to any of the variations of this problem?  How has the problem been
fixed for anyone?

An obscure comment in a closed BugZilla report is hardly a complete "fix".  It
is certainly not a fix for such a frequently reported and serious user-level
issue.

It is true that I fix only the English case...
	why is this not a problem with other distributions of Linux?
	is it actually broken for other locales?
	what's wrong with a locale of "en" vs "en_US"?
	why is this a problem with "en_US" locale, not locale "en"?  Is "en_US" broken?
	what does netscape say?

I'd heard that RedHat 7.0 is more unstable than previous versions; this is
probably one of the big ones (because it is such a user thing, and so visible).

Comment 18 Bryce Nesbitt 2001-01-08 16:23:25 UTC
My environment is a hardly-customized RedHat 7.0.  I unfortunately had to
upgrade kernels to 2.2.18 (but can test with the original if you so desire):

[bryce@hardhat ~]$ set
COLORS	/etc/DIR_COLORS
_	

addsuffix	
argv	()
cwd	/home/bryce
dirstack	/home/bryce
echo_style	both
edit	
file	/home/bryce/.i18n
gid	501
group	bryce
history	100
home	/home/bryce
owd	
path	(/home/bryce/bin /usr/kerberos/bin /usr/kerberos/bin /usr/local/bin /bin
/usr/bin /usr/X11R6/bin)
prompt	[%n@%m %c]$ 
prompt2	%R? 
prompt3	CORRECT>%R (y|n|e|a)? 
shell	/bin/tcsh
shlvl	4
sourced	1
status	0
tcsh	6.09.00
term	xterm
tty	pts/0
uid	501
user	bryce
version	tcsh 6.09.00 (Astron) 1999-08-16 (i386-intel-linux) options
8b,nls,dl,al,kan,rh,color,dsp

Comment 19 Bill Nottingham 2001-01-08 16:27:51 UTC
A dozen other people have reported the original bug.

*None* have reported that it persists after applying the fix to their
preferences file, except for you. *That* is why I'm saying this appears to happen
in just your case.

As for your other comments:

- *as stated before*, setting the locale to just the two-letter locale variant
breaks locale support, because it starts looking for translation files in
different directories
- the problem occurs with en_US because something in the netscape source code
is broken. Without the source I can't really tell you what.
- netscape has not said anything on the matter
- this is not a 7.0 problem. This problem started around Netscape 4.72 or so.
  Why it did not happen until 7.0 in your case, I couldn't tell you.



Comment 20 Bill Nottingham 2001-01-08 16:28:17 UTC
Oh, tcsh. 'setenv' in that case - sorry about the confusion.

Comment 21 Bryce Nesbitt 2001-01-08 17:24:20 UTC
[bryce@hardhat ~/.ssh]$ setenv
PWD=/home/bryce/.ssh
VENDOR=intel
HOSTNAME=hardhat
PVM_RSH=/usr/bin/rsh
QTDIR=/usr/lib/qt-2.2.0
LESSOPEN=|/usr/bin/lesspipe.sh %s
KDEDIR=/usr
USER=bryce
LS_COLORS=no=00:fi=00:di=01;34:ln=01;36:pi=40;33:so=01;35:bd=40;33;01:cd=40;33;01:or=01;05;37;41:mi=01;05;37;41:ex=01;32:*.cmd=01;32:*.exe=01;32:*.com=01;32:*.btm=01;32:*.bat=01;32:*.sh=01;32:*.csh=01;32:*.tar=01;31:*.tgz=01;31:*.arj=01;31:*.taz=01;31:*.lzh=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.gz=01;31:*.bz2=01;31:*.bz=01;31:*.tz=01;31:*.rpm=01;31:*.cpio=01;31:*.jpg=01;35:*.gif=01;35:*.bmp=01;35:*.xbm=01;35:*.xpm=01;35:*.png=01;35:*.tif=01;35:
MACHTYPE=i386
MAIL=/var/mail/bryce
CVS_RSH=ssh
LANG=en_US
HOST=hardhat
DISPLAY=:0
LOGNAME=bryce
SHLVL=4
GROUP=bryce
SESSION_MANAGER=local/hardhat:/tmp/.ICE-unix/3762
SHELL=/bin/tcsh
HOSTTYPE=i386-linux
OSTYPE=linux
PVM_ROOT=/usr/share/pvm3
HOME=/home/bryce
SSH_ASKPASS=/usr/libexec/openssh/gnome-ssh-askpass
PATH=/home/bryce/bin:/usr/local/Acrobat4/bin:/usr/kerberos/bin:/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin
_=/usr/bin/gnome-terminal
TERM=xterm
COLORTERM=gnome-terminal
WINDOWID=67109234


Comment 22 Bryce Nesbitt 2001-01-09 14:11:14 UTC
I understand that *I* am the only person reporting a persistent version of the
bug.

But the non-persistent version of the bug, as far as I can tell, remains
unsolved.  A user experiencing the bug is expected to go to RedHat's bugzilla,
and search for CLOSED bugs relating to the topic, then wade through long
descriptions to locate a hack fix that solves the total crash but breaks other
things.

Other distributions seem to solve the problem, somehow, perhaps by mucking with
or using different locale definitions.

Comment 23 Need Real Name 2001-01-09 17:58:38 UTC
I am experiencing a very similar bug using RedHat 7.0 and Netscape 4.76.   I
cannot get the address book to work.  Selecting an address (or other option in
the address book) will routinely close (crash) the program.  I cannot double
click an address to open a compose window, and the key bindings (like alt-d for
delete) do not work in the address book.

Comment 24 Bill Nottingham 2001-01-09 18:03:38 UTC
Apply the fix listed in bug #10433.

Comment 25 Bryce Nesbitt 2001-01-12 13:56:29 UTC
Sadly, the fix in 10433 does not fully solve the address book problem.  Try
creating a group, then adding members.  You get:

Program received signal SIGSEGV, Segmentation fault.
0x85c25ee in CNeoPersist::referTo ()
(gdb) bt
#0  0x85c25ee in CNeoPersist::referTo ()
#1  0x85bd6e7 in CNeoNode::findObject ()
#2  0x85d485a in CNeoClass::FindObject ()
#3  0x85d230c in CNeoClass::DoUntilClass ()
#4  0x85d43f1 in CNeoClass::FindObject ()
#5  0x85b245d in CNeoDatabase::findObject ()
#6  0x85c9a5b in ab_NeoDbRef::FindEntryByEmailOrName ()
#7  0x85cbdee in ab_NeoRowContent::FindRow ()
#8  0x85a0f43 in ab_Row::FindRowUid ()
#9  0x85aa67e in AB_Table_FindRowUid ()
#10 0x858bebc in AB_ContainerInfo::ConfirmToReplaceEntry ()
#11 0x858bd17 in AB_ContainerInfo::AddEntry ()
#12 0x8598dbb in AB_PersonPane::CommitChanges ()
#13 0x8586b02 in AB_CommitChanges ()
#14 0x8371e3f in XFE_ABNameFolderDlg::apply ()
#15 0x834a7a1 in XFE_XmLFolderDialog::ok ()
#16 0x8368184 in XFE_ViewDialog::ok_cb ()
#17 0x40035778 in XtCallCallbackList () from /usr/X11R6/lib/libXt.so.6
#18 0x88d6f23 in _XmSelectionBoxNoGeoRequest ()
#19 0x40035778 in XtCallCallbackList () from /usr/X11R6/lib/libXt.so.6
#20 0x88ac497 in _XmClearBCompatibility ()
#21 0x88ae968 in XmCreatePushButtonGadget ()
#22 0x88ac25b in _XmClearBCompatibility ()
#23 0x888f162 in _XmDispatchGadgetInput ()
#24 0x8919eba in _XmGadgetActivate ()
#25 0x400649b5 in _XtMatchAtom () from /usr/X11R6/lib/libXt.so.6
#26 0x40065347 in _XtMatchAtom () from /usr/X11R6/lib/libXt.so.6
#27 0x400653fe in _XtTranslateEvent () from /usr/X11R6/lib/libXt.so.6
#28 0x40040ecb in XtDispatchEventToWidget () from /usr/X11R6/lib/libXt.so.6
#29 0x40041944 in _XtOnGrabList () from /usr/X11R6/lib/libXt.so.6
#30 0x40041b99 in XtDispatchEvent () from /usr/X11R6/lib/libXt.so.6
#31 0x4004ca3e in XtAppProcessEvent () from /usr/X11R6/lib/libXt.so.6
#32 0x82bd5ad in fe_EventLoop ()
#33 0x82bffc5 in main ()
#34 0x40217fd1 in __libc_start_main (main=0x82be7a4 <main>, argc=1, 
    ubp_av=0xbffff9d4, init=0x827f548 <_init>, fini=0x894914c <_fini>, 
    rtld_fini=0x4000e254 <_dl_fini>, stack_end=0xbffff9cc)
    at ../sysdeps/generic/libc-start.c:118

And address book entries come and go.  Netscape == crap.

Comment 26 Bill Nottingham 2001-01-19 21:52:24 UTC
After some more testing, I *still* can't reproduce how you're managing
to get the data to reappear.

However, while changing the locale as you suggest does break things,
nothing in particular breaks if you remove the locale line entirely.

netscape-4.76-5 will do so.
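
For anyone applying this by hand in the meantime, here is a minimal sketch of removing the locale line, assuming a POSIX shell. This is not the change shipped in netscape-4.76-5 (its implementation is not shown in this report); `strip_pab_locale` is a hypothetical name, and the pref name comes from the lines quoted in Comment 7.

```shell
# Hypothetical helper: delete every ldap_2.servers.pab.locale line from one
# preferences file, keeping a .bak copy of the original.
strip_pab_locale() {
    prefs=$1
    # Nothing to do if the file does not exist.
    [ -f "$prefs" ] || return 0
    cp "$prefs" "$prefs.bak" || return 1
    # Write the filtered copy to a temp file, then swap it into place.
    sed -e '/ldap_2\.servers\.pab\.locale/d' "$prefs.bak" > "$prefs.tmp" &&
        mv -f "$prefs.tmp" "$prefs"
}

# Example usage (paths are assumptions based on the files named in this report):
# strip_pab_locale "$HOME/.netscape/preferences.js"
# strip_pab_locale "$HOME/.netscape/liprefs.js"
```

It is probably safest to run this while netscape is not running, since the program may rewrite its preferences on exit.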

Comment 27 Bryce Nesbitt 2001-01-22 17:28:23 UTC
Thanks for making a fix!

Given the number of times I've seen this reported (not just here) and the number
of emails I got from other people based on my bug report (several) this will
help a lot.

I've been running with a deleted locale line for a while now, without a repeat
of the bug.  There are other address book bugs, of course, but deleting the
locale lines seems to be a solution for the major crashing problem.