Bug 113968

Summary: glibc 2.3.2, /lib64/tls library bug in pthread_rwlock_init
Product: Red Hat Enterprise Linux 3
Reporter: Jani Tolonen <jani>
Component: glibc
Assignee: Jakub Jelinek <jakub>
Status: CLOSED CURRENTRELEASE
QA Contact: Brian Brock <bbrock>
Severity: medium
Docs Contact:
Priority: medium
Version: 3.0
CC: cperry, tgl
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2004-08-26 05:34:58 UTC

Description Jani Tolonen 2004-01-20 20:04:51 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.1)
Gecko/20020903

Description of problem:
We observed a problem when running MySQL 4.0.17 on RH AS 3.0:
INSERT statements locked up on a very small table.

Platform CPU info:
vendor_id       : AuthenticAMD
model name      : AMD Opteron(tm) Processor 244
cpu MHz         : 1792.138
cache size      : 1024 KB

kernel info:
2.4.21-4.0.1.EL #1 Thu Oct 23 01:24:43 EDT 2003 x86_64 GNU/Linux

During debugging we found that the problem occurred in MySQL's
my_rwlock_init() code, which resolves to the /lib64/tls/ library
because the binary is dynamically linked. It seemed that the lock
argument sometimes held uninitialized values when it was passed to
the locking function. As a result the lock was ignored and the code
that followed was executed when it should not have been.

We found that simply moving /lib64/tls to /lib64/tls.disabled
fixed the problem. The MySQL server then used the corresponding
shared libraries from elsewhere on the system.



Version-Release number of selected component (if applicable):
glibc-2.3.2-95.6

How reproducible:
Always

Steps to Reproduce:
1. Install MySQL 4.0.17 (or higher). It must be a 64-bit build for
   Linux on AMD64.
2. Create a table with a key on a field (not a primary key):
   CREATE TABLE t1 (i int, key (i));
3. Run simultaneous inserts into the key field above:
   INSERT INTO t1 VALUES (1);
   INSERT INTO t1 VALUES (2); etc.
   The inserts must be issued from separate threads.

Actual Results:  If two or more inserts run close enough to each
other, 'mysqladmin processlist' shows the first thread in state
'update' while the others are locked. The threads stay in this
state forever.

Expected Results:  Insert statements should have gone through.

Additional info:

The exact place in the mysql code where the problem occurs
is at myisam/mi_open.c, around line 434, which looks like this:
for (i=0; i<keys; i++)
      VOID(my_rwlock_init(&share->key_root_lock[i], NULL));

my_rwlock_init is defined as
#define my_rwlock_init(A,B) pthread_rwlock_init((A),(B))
so it is a direct call to the system function, which is provided by
/lib64/tls/libpthread-0.60.so

We suspect that the bug is somewhere in this library.
Disabling the library is enough to make things work.
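
A standalone check of this suspicion, independent of MySQL, might look
like the sketch below. It fills the lock storage with garbage bytes
before calling pthread_rwlock_init (to mimic uninitialized memory such
as share->key_root_lock), then has several threads take and release the
write locks while counting. The names check_rwlock_init, worker,
NUM_THREADS and ITERATIONS are made up for illustration. That stale
memory contents are what triggers the failure is an assumption, not
something established in this report.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

#define NUM_THREADS 4      /* illustrative values, not from the report */
#define ITERATIONS  10000

struct shared {
    pthread_rwlock_t *locks;  /* one lock per key, like share->key_root_lock */
    long *counters;           /* counters[i] is protected by locks[i] */
    int keys;
};

/* Each worker repeatedly takes a write lock and bumps the matching counter. */
static void *worker(void *arg)
{
    struct shared *s = arg;
    for (int n = 0; n < ITERATIONS; n++) {
        int i = n % s->keys;
        if (pthread_rwlock_wrlock(&s->locks[i]) != 0)
            return s;                 /* non-NULL signals an error */
        s->counters[i]++;
        if (pthread_rwlock_unlock(&s->locks[i]) != 0)
            return s;
    }
    return NULL;
}

/* Returns 0 if every increment was counted, 1 if updates were lost or a
   lock call failed, -1 on setup failure. A hang is the other failure mode. */
int check_rwlock_init(int keys)
{
    struct shared s;
    pthread_t tid[NUM_THREADS];
    int started = 0, bad = 0;
    long total = 0;

    if (keys <= 0)
        return -1;
    s.keys = keys;
    s.locks = malloc((size_t)keys * sizeof *s.locks);
    s.counters = calloc((size_t)keys, sizeof *s.counters);
    if (s.locks == NULL || s.counters == NULL) {
        free(s.locks);
        free(s.counters);
        return -1;
    }

    /* Fill the lock storage with garbage to mimic uninitialized memory. */
    memset(s.locks, 0xA5, (size_t)keys * sizeof *s.locks);
    for (int i = 0; i < keys; i++)
        if (pthread_rwlock_init(&s.locks[i], NULL) != 0) {
            free(s.locks);
            free(s.counters);
            return -1;
        }

    /* 'started' ends up as the number of threads actually created. */
    for (; started < NUM_THREADS; started++)
        if (pthread_create(&tid[started], NULL, worker, &s) != 0) {
            bad = 1;
            break;
        }
    for (int t = 0; t < started; t++) {
        void *res;
        if (pthread_join(tid[t], &res) != 0 || res != NULL)
            bad = 1;
    }

    for (int i = 0; i < keys; i++) {
        total += s.counters[i];
        pthread_rwlock_destroy(&s.locks[i]);
    }
    free(s.locks);
    free(s.counters);
    if (total != (long)NUM_THREADS * ITERATIONS)
        bad = 1;
    return bad;
}
```

Called as check_rwlock_init(8) from a small main() and built with
-pthread, this should return 0 with a healthy libpthread. Whether it
would fail or hang on the affected system has not been verified here.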

Regards,
Jani

Comment 1 Rackspace Bugzilla 2004-01-23 21:50:53 UTC
This issue has also been seen with the MySQL 3 RPM packages released by
Red Hat in the Extras channel for AS 3 AMD64. A resolution on
this would be real nice! :)

Comment 2 Patrick Macdonald 2004-01-28 21:14:34 UTC
This has nothing to do with Red Hat Application Server.  I'm sending it
over to RHEL for further investigation.

Comment 3 Patrick Macdonald 2004-01-28 21:16:46 UTC
Tom... please take a look at this.  It may very well be a glibc bug,
but try to reproduce first.

Comment 4 Tom Lane 2004-01-30 00:53:06 UTC
I was not able to reproduce this on an AMD64 4-way box running RHEL3,
using MySQL 4.0.17 built from source.  I made a file containing 10000
insert commands and fed it into two mysql processes simultaneously
in two shell windows.  The performance was pretty abysmal (a single
mysql could run the file in about a second, but two parallel instances
would take 13 to 15 seconds to finish), but I observed no lockup in
several dozen trials.  Can anyone suggest what I must do differently
to see the failure?

Note that if the problem occurs only with MySQL-supplied binaries, and
not in a source build, I'd be inclined to blame the variant version of
glibc that they use in their binary packages.

Comment 6 Tom Lane 2004-02-06 21:54:37 UTC
I've continued to try to reproduce this, without success.  I've tried
MySQL 4.0.17 built from MySQL AB's SRPM, and I've tried the
mysql-server-3.23.58-1 RPM from the RHN Extras channel as suggested by
rackspace.  No luck.  I am using a more recent kernel:
2.4.21-9.ELsmp #1 SMP Thu Jan 8 16:52:31 EST 2004 x86_64 x86_64 x86_64
GNU/Linux
but other than that I think I am testing the same software and
hardware as the complainant.

The only other possibility that comes to mind is that my testing
methodology might be wrong.  As mentioned above, I'm feeding script
files containing many INSERT commands to mysql clients running in
parallel.  I tried 2,3,4,5 clients (this is a 4-way machine btw). 
Doesn't lock up.

Unless someone can tell me how to reproduce this, I'm going to have to
close it WORKSFORME.

Comment 7 Jani Tolonen 2004-02-10 10:09:14 UTC
I will try to get a backtrace of where MySQL hangs.
We were testing this on a Red Hat machine with MySQL compiled
there, so it's definitely a Red Hat problem.

However, I understand that you need more information about
the problem before you can check the code or debug it.

I will discuss with our contact whether I can log in to the machine
and get more information about the bug. I'd appreciate it if
you could hold this report in status 'NEEDINFO' for now; I will
get back to you asap.

Regards,
Jani

Comment 8 Tom Lane 2004-03-16 19:03:11 UTC
I'm going to set this in state NEEDINFO until I hear more.  Jani, if
you file followup info please remember to change the state back to
ASSIGNED.

Comment 9 Tom Lane 2004-03-18 17:05:09 UTC
Rackspace sent me login info and reproduction instructions for their
own x86_64 machine running Taroon Update 1.  I have confirmed that
following their recipe it is possible to get mysql to freeze up with a
stack trace pointing to pthread_rwlock_wrlock.  Obviously I can't put
the login details in this bugzilla entry, but will send them by
private mail to anyone who needs to look into this.  It does seem to
be a pthread issue.
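
For anyone building a test harness around this, a thread blocked
indefinitely in pthread_rwlock_wrlock can be turned into a reportable
error by using the POSIX timed variant instead. The following is only a
sketch: wrlock_with_timeout, try_lock, demo_stuck_writer and the
one-second timeout are hypothetical, and nothing here reflects how MySQL
or the Rackspace recipe actually takes the lock.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <stdint.h>
#include <time.h>

/* Try to take the write lock, giving up after 'seconds'.
   Returns 0 on success, ETIMEDOUT if the lock could not be acquired
   in time, or another error number on failure. */
int wrlock_with_timeout(pthread_rwlock_t *lock, int seconds)
{
    struct timespec deadline;

    /* pthread_rwlock_timedwrlock takes an absolute CLOCK_REALTIME time. */
    if (clock_gettime(CLOCK_REALTIME, &deadline) != 0)
        return errno;
    deadline.tv_sec += seconds;
    return pthread_rwlock_timedwrlock(lock, &deadline);
}

/* Second thread: attempts the lock and passes the result back
   through its exit value. */
static void *try_lock(void *arg)
{
    return (void *)(intptr_t)wrlock_with_timeout(arg, 1);
}

/* Holds the write lock in the calling thread so that a second thread
   cannot get it. Returns the second thread's result, or -1 if the
   setup itself failed. */
int demo_stuck_writer(void)
{
    pthread_rwlock_t lock;
    pthread_t tid;
    void *res = NULL;
    int rc = -1;

    if (pthread_rwlock_init(&lock, NULL) != 0)
        return -1;
    if (pthread_rwlock_wrlock(&lock) == 0) {
        if (pthread_create(&tid, NULL, try_lock, &lock) == 0 &&
            pthread_join(tid, &res) == 0)
            rc = (int)(intptr_t)res;
        pthread_rwlock_unlock(&lock);
    }
    pthread_rwlock_destroy(&lock);
    return rc;
}
```

With a working libpthread, demo_stuck_writer() is expected to return
ETIMEDOUT, because the lock is genuinely held. In a real test the same
wrapper would flag a lock that never becomes available instead of
freezing the process.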

Comment 10 Tom Lane 2004-03-18 17:15:32 UTC
I am relabeling this as a glibc issue.  I'm not sure who bugzilla will
reassign it to (maybe Jakub?) but whoever wants to look at it please
contact me by mail for info about reproducing the problem.

Comment 11 Jakub Jelinek 2004-03-18 17:21:40 UTC
This looks very much like:
http://sources.redhat.com/ml/libc-hacker/2004-02/msg00019.html
which is fixed in glibc-2.3.2-95.12 and later (U2 beta
includes glibc-2.3.2-95.17).

Comment 12 Tom Lane 2004-03-18 17:39:59 UTC
Ah, very interesting.  [looks...]  The Rackspace machine that shows
the problem is running glibc-2.3.2-95.6.  I will ask them to update
and see if the problem goes away.  Thanks.

Comment 13 Jakub Jelinek 2004-03-19 16:30:57 UTC
ftp://people.redhat.com/jakub/glibc/2.3.2-95.17/

Comment 14 Ulrich Drepper 2004-08-26 05:19:42 UTC
Ping!?  I want to close this bug if there is nothing more to report.

Comment 15 Tom Lane 2004-08-26 05:34:58 UTC
I've not heard any more complaints from Rackspace, so let's assume the issue is gone.
They can reopen if not...