232247 – unixODBC2.2.8-2.3.0.2 has regression

Keywords:
Status:	CLOSED NOTABUG
Alias:	None
Product:	Red Hat Enterprise Linux 3
Classification:	Red Hat
Component:	unixODBC
Sub Component:
Version:	3.8
Hardware:	All
OS:	Linux
Priority:	medium
Severity:	urgent
Target Milestone:	---
Assignee:	Tom Lane
QA Contact:	David Lawrence
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2007-03-14 15:41 UTC by Rajesh John Almeida
Modified:	2013-07-03 03:12 UTC
CC List:	5 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2007-04-02 20:26:41 UTC
Target Upstream Version:
Embargoed:

Attachments
repro file (1.28 MB, application/octet-stream) 2007-03-14 15:41 UTC, Rajesh John Almeida

Description Rajesh John Almeida 2007-03-14 15:41:35 UTC

Description of problem:
The Sybase ODBC driver crashes when moved to RHEL3.x with
unixODBC2.2.8-2.3.0.2.
With RHEL2.1 and unixODBC 2.0.7 it works,
and when we move to RHEL4.x with unixODBC 2.2.11 it works, so something
regressed between RHEL2.1 and RHEL4.x for unixODBC.
We cannot move to RHEL4.x, where it works, as it's not an option for some of
our clients.
Attached is a tar file for the repro. You will need to have Sybase ASE to run 
the repro.

Version-Release number of selected component (if applicable):

RHEL3.x unixODBC2.2.8-2.3.0.2

How reproducible:
ODBC Basic Sample

Purpose
-------
This directory holds a sample ODBC application that illustrates
how to connect to a datasource and execute a SQL statement.
Source code for the application is in simple.cpp.

ODBC Driver Manager location
----------------------------
Linux / MacOSX
--------------
The sample code assumes ODBC Driver Manager libraries to be installed
in /usr/lib directory. If they are installed in a different directory,
edit "build" and "makefile" files to correct the location.

Procedure
---------
Linux / MacOSX
--------------

To compile the application:
# ./build sample

To run the application:
# ./build run

To delete all generated files:
# ./build clean



----------------
Before running the sample code, edit the /etc/odbcinst.ini file to add
the driver (you need su permission to edit that file):
[Adaptive Server Enterprise]
Description		= Sybase ODBC Driver
Driver		= /the path you put the driver so/libsybdrvodb.so
FileUsage		= 1

You also need to add a sampledsn entry in $HOME/.odbc.ini
For example:

[sampledsn]
Driver          = Adaptive Server Enterprise
UserID          = sa
Server          = qablade2 
Port            = 6001
Database        = pubs2
Password        = 
UseCursor       = 1

Steps to Reproduce:
See "How reproducible" above.

Actual results:
The sample application crashes.

Expected results:
Should not get a crash.

Additional info:

Comment 1 Rajesh John Almeida 2007-03-14 15:41:36 UTC

Created attachment 150054
repro file

Comment 2 Tom Lane 2007-03-14 15:58:42 UTC

Sorry, but I don't have Sybase.  Please provide a reproducer that does not
involve any proprietary software, and I'll be glad to look into it.

Comment 3 Barry Marson 2007-03-14 18:02:35 UTC

Red Hat QA has access to Sybase as well as having the Sybase engineer come on
site in Westford.  We can set up the reproducer and give you access and
instructions to demonstrate the problem.

Barry

Comment 7 Tom Lane 2007-03-20 21:14:01 UTC

Yipes, you put a root password into a publicly readable bugzilla entry?

Please change it forthwith, preferably to something less guessable,
and then send me the password in private mail.

Comment 8 Tom Lane 2007-03-21 05:24:11 UTC

The crash appears to be happening in an atexit callback routine:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1076472512 (LWP 18428)]
0x403007d2 in ?? ()
(gdb) bt
#0  0x403007d2 in ?? ()
#1  0x401888f3 in exit () from /lib/tls/libc.so.6
#2  0x401737fc in __libc_start_main () from /lib/tls/libc.so.6
#3  0x08048a09 in _start ()

Now, there are no atexit callbacks in unixODBC, nor in the "simple.cpp" source code you provided.
A string search suggests that libsybdrvodb.so contains atexit calls.  So at this point my position is
that you have the source code needed to debug the problem, and I don't ...

Comment 9 Rajesh John Almeida 2007-03-22 03:58:03 UTC

yongzhi
I am looking at it now. But there should be something in unixODBC related to
the segmentation fault (maybe indirectly causing the issue). Our driver works
fine with RHEL 4.0, and our unit tests work well with our driver if they call
the ODBC driver directly, but if the calls go through unixODBC (the driver
manager), the same segmentation fault happens. Could you research what happens
in unixODBC after our driver's SQLDisconnect returns?

Comment 10 Rajesh John Almeida 2007-03-23 19:26:41 UTC

yongzhi
I ran the test with LD_DEBUG set to symbols; exit() was called from unixODBC,
and it was called after shmdt.
How did I get this?
First, I ran the test with our driver built on RHEL 3.0 and got output like this:
     20636:     symbol=uodbc_close_stats;  lookup in file=./simple
     20636:     symbol=uodbc_close_stats;  lookup in file=/usr/lib/libodbc.so.1
     20636:     symbol=shmdt;  lookup in file=./simple
     20636:     symbol=shmdt;  lookup in file=/usr/lib/libodbc.so.1
     20636:     symbol=shmdt;  lookup in file=/usr/lib/libstdc++.so.5
     20636:     symbol=shmdt;  lookup in file=/lib/tls/libm.so.6
     20636:     symbol=shmdt;  lookup in file=/lib/libgcc_s.so.1
     20636:     symbol=shmdt;  lookup in file=/lib/tls/libc.so.6
Segmentation fault (core dumped) 

Then I ran the test on the same machine with the driver built on RHEL 2 (this
driver works fine):
17642:  symbol=shmdt;  lookup in file=./simple
17642:  symbol=shmdt;  lookup in file=/usr/lib/libodbc.so.1
17642:  symbol=shmdt;  lookup in file=/usr/lib/libstdc++-libc6.2-2.so.3
17642:  symbol=shmdt;  lookup in file=/lib/i686/libm.so.6
17642:  symbol=shmdt;  lookup in file=/lib/i686/libc.so.6
17642:  symbol=exit;  lookup in file=./simple
17642:  symbol=exit;  lookup in file=/usr/lib/libodbc.so.1
17642:  symbol=exit;  lookup in file=/usr/lib/libstdc++-libc6.2-2.so.3
17642:  symbol=exit;  lookup in file=/lib/i686/libm.so.6
17642:  symbol=exit;  lookup in file=/lib/i686/libc.so.6
17642:  symbol=__deregister_frame_info;  lookup in file=./simple
17642:  symbol=__deregister_frame_info;  lookup in file=/usr/lib/libodbc.so.1
17642:  symbol=__deregister_frame_info;  lookup in file=/usr/lib/libstdc++-libc6.2-2.so.3
17642:
17642:  calling fini: /usr/lib/libodbc.so.1

Because the driver manager is the same, it should behave consistently. So
I think exit() is what caused the segmentation fault.
What's more, in both cases libsybdrvodb.so has already reached "calling fini"
before shmdt is reached.

Comment 11 Rajesh John Almeida 2007-03-23 19:41:34 UTC

yongzhi
I should correct my words: exit() was not called from libsybdrvodb.so; it
happens between
"calling fini libsybdrvodb.so" and "calling fini: /usr/lib/libodbc.so".

Comment 12 Tom Lane 2007-03-23 20:38:27 UTC

OK, so reading between the lines I guess you are saying that (a) the sybase
driver relies on some shared memory, and (b) the crash is happening because
it tries to touch the shared memory after it's already been shmdt()'d?

The stack trace I showed indicates that the test program isn't calling exit()
explicitly at all, but rather that is happening implicitly after return from
main().  So I think that the problem must be one of atexit callbacks happening
in a different order than you are expecting.  The man page says that atexit
callbacks are supposed to happen in reverse order of registration, so either
that's not happening (in which case this is a glibc bug) or there is some
difference in the order in which the callbacks get set up.  So I recommend
tracing the startup part of the test to see what order things happen in.
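
The ordering rule cited from the man page can be checked in isolation. The following is a minimal sketch, not code from unixODBC or the Sybase driver; the function and callback names are hypothetical. It registers two callbacks in a child process and reports, through a pipe, the order in which they ran.

```c
/* Assumption: a POSIX system; this macro only matters under strict -std modes. */
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int report_fd = -1;  /* write end of the pipe, used by the callbacks */

/* Writes one marker byte to the pipe; bails out hard if the write fails. */
static void report(char c)
{
    if (write(report_fd, &c, 1) != 1)
        _exit(1);
}

static void first_registered(void)  { report('A'); }
static void second_registered(void) { report('B'); }

/* Runs a child that registers two atexit callbacks and exits normally.
 * On return, `out` holds the markers in the order the callbacks ran.
 * Returns 0 on success, -1 on error. */
int observe_atexit_order(char *out, size_t out_size)
{
    int fds[2];
    if (out_size < 3 || pipe(fds) != 0)
        return -1;

    fflush(NULL);  /* avoid duplicated stdio output in the child */
    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);
        report_fd = fds[1];
        atexit(first_registered);
        atexit(second_registered);
        exit(0);  /* callbacks run here, in reverse order of registration */
    }

    close(fds[1]);
    size_t len = 0;
    ssize_t n;
    while (len < out_size - 1 &&
           (n = read(fds[0], out + len, out_size - 1 - len)) > 0)
        len += (size_t)n;
    out[len] = '\0';
    close(fds[0]);

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return 0;
}
```

On a conforming C library, calling `observe_atexit_order(buf, sizeof buf)` should leave "BA" in `buf`, because the callback registered second runs first. Applying the same idea to the real test would mean logging each registration at startup and comparing that order against the order at exit.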

Comment 13 Rajesh John Almeida 2007-03-26 23:15:57 UTC

yongzhi
After doing more research, I think I know what the issue is, but
I need your fix. When the size of our libsybdrvodb.so is greater than or equal
to 3173295, the segmentation fault happens; but if libsybdrvodb.so is
less than or equal to 3108985, the segmentation fault is gone. So it crashes
at some magic number between those two sizes.
It may be a Red Hat 3.0 issue or a unixODBC issue; could you have a look at it?

Comment 14 Tom Lane 2007-03-27 04:05:47 UTC

That seems moderately unlikely --- I am not aware of anything that would depend directly on the size of a 
.so file.  What exactly did you change to cause the change in .so file size?

Comment 15 Rajesh John Almeida 2007-03-27 15:11:15 UTC

I made changes in the ODBC driver so that it really does nothing in connect
and disconnect. That lets me randomly remove any objects from a static library
(say foo.a) which our driver links to at build time. Note that the changed
driver uses nothing from foo.a in the test (the test only does connect and
disconnect). That way, I can get different sizes of .so. When the .so size
is small enough, the segmentation fault is gone.
SQLConnect and SQLDisconnect do nothing but let unixODBC load and unload
the ODBC driver .so, and the only difference in each test is the size. The
segmentation fault is exactly the same one as with the real functional driver.

Comment 16 Tom Lane 2007-03-27 15:32:54 UTC

Hmmm ... maybe the size of the .so affects the memory layout, specifically the
address at which the shmem segment gets mapped?  Not sure how that would
translate into a problem for you, but something to think about.
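
One way to see where a System V segment is attached, and what touching it after shmdt() does, is the following sketch. It assumes a Linux system with System V shared memory enabled; `touch_after_detach` is a hypothetical name, and nothing here is taken from the Sybase driver or unixODBC. The access after detaching is deliberately invalid, so it is done in a child process.

```c
/* Assumption: Linux with System V IPC; the macro only matters under strict -std modes. */
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <stdio.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Attaches a private segment in a child, detaches it, then touches the
 * old address. Returns the signal that killed the child, 0 if the child
 * survived the access, or -1 if shared memory setup failed. */
int touch_after_detach(void)
{
    int id = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);
    if (id < 0)
        return -1;

    fflush(NULL);  /* avoid duplicated stdio output in the child */
    pid_t pid = fork();
    if (pid < 0) {
        shmctl(id, IPC_RMID, NULL);
        return -1;
    }
    if (pid == 0) {
        void *addr = shmat(id, NULL, 0);  /* kernel picks the address */
        if (addr == (void *)-1)
            _exit(2);
        volatile char *p = addr;
        p[0] = 'x';                       /* valid: segment is attached */
        fprintf(stderr, "segment attached at %p\n", addr);
        if (shmdt(addr) != 0)
            _exit(2);
        p[0] = 'y';                       /* invalid: address is no longer mapped */
        _exit(0);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        shmctl(id, IPC_RMID, NULL);
        return -1;
    }
    shmctl(id, IPC_RMID, NULL);           /* remove the segment */
    if (WIFSIGNALED(status))
        return WTERMSIG(status);
    if (WIFEXITED(status) && WEXITSTATUS(status) == 2)
        return -1;
    return 0;
}
```

On Linux the child is normally killed by SIGSEGV. The printed address can be compared across runs with differently sized libraries to see whether the mapping location moves.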

Comment 17 Rajesh John Almeida 2007-03-27 23:02:36 UTC

yongzhi
I ran a test which just calls dlopen("libsybdrvodb.so") and then dlclose.
There is a segmentation fault at exit() if libsybdrvodb.so was built on RHEL3.0.

I also noticed that:
unixODBC2.2.8 (the one in RHEL3.0) calls dlclose to unload libsybdrvodb.so, so
the segmentation fault happens in exit() (I know that from checking bindings);
unixODBC2.2.11 (the one in RHEL4.0) somehow does not call dlclose, so the repro
code passes on a RHEL4.0 machine.

There may be two ways to solve the problem:
1. Some Linux expert tells us how to fix our driver so that dlclose no longer
causes the segmentation fault (the same source code built on RHEL2 does not
have the dlclose problem),
or
2. Fix unixODBC2.2.8 to do something similar to 2.2.11.
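
The dlopen/dlclose-only test described above could look roughly like this. It is a minimal sketch, assuming a POSIX system with <dlfcn.h>; `load_and_unload` is a hypothetical helper name, and this is not the reporter's actual test program.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Loads the shared object at `path` and unloads it again.
 * Returns 0 on success, -1 if either dlopen or dlclose fails. */
int load_and_unload(const char *path)
{
    void *handle = dlopen(path, RTLD_NOW);  /* runs the library's init code */
    if (handle == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return -1;
    }
    if (dlclose(handle) != 0) {             /* may run the library's fini code */
        fprintf(stderr, "dlclose failed: %s\n", dlerror());
        return -1;
    }
    return 0;
}
```

On older glibc releases the program needs to be linked with -ldl. Since the reported crash occurred during exit(), a harness would call `load_and_unload("libsybdrvodb.so")` from main and then simply return, so that any remaining exit-time handlers run after the library has been unloaded.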

Comment 18 Tom Lane 2007-03-28 00:41:01 UTC

So what's your code doing at dlopen and dlclose times?  (These will call _init
and _fini functions, or constructor/destructor routines, if you have them...)
This sounds to me like nothing so much as a bug in the _fini function ---
perhaps depending on a variable that isn't really initialized, but happens to
have the right value in the RHEL2 environment?
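
To illustrate the kind of _fini bug suggested here: with GCC or Clang, constructor and destructor routines can be declared through function attributes, and a destructor that guards on explicitly initialized state avoids relying on whatever value memory happens to hold. This is a sketch with hypothetical names, not the Sybase driver's code; the thread does not say which build flag fixed the driver.

```c
#include <stdio.h>

/* Explicitly initialized, so the destructor never reads an indeterminate value. */
static int driver_ready = 0;

/* Runs when the object is loaded (at program start, or at dlopen time
 * for a shared library). Assumption: GCC/Clang attribute syntax. */
__attribute__((constructor))
static void driver_init(void)
{
    driver_ready = 1;
}

/* Runs when the object is unloaded (at dlclose or at process exit). */
__attribute__((destructor))
static void driver_fini(void)
{
    if (!driver_ready)
        return;          /* nothing was set up, so nothing to tear down */
    driver_ready = 0;    /* release resources here in a real library */
}

/* Lets callers check whether the constructor has run. */
int driver_is_ready(void)
{
    return driver_ready;
}
```

By the time main runs, `driver_is_ready()` should return 1. A destructor written without such a guard may appear to work on one platform and crash on another, which matches the pattern reported between the RHEL2 and RHEL3 builds, though the thread does not confirm that this was the cause.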

Comment 19 Rajesh John Almeida 2007-04-02 19:07:48 UTC

yongzhi
We changed a build flag to fix our driver. Now the issue is solved.

Comment 20 Russell Doty 2007-04-02 20:26:41 UTC

Problem solved by Sybase. Closing as "Notabug" for Red Hat.
