Red Hat Bugzilla – Full Text Bug Listing
|Summary:||glibc 2.3.2, /lib64/tls library bug in pthread_rwlock_init|
|Product:||Red Hat Enterprise Linux 3||Reporter:||Jani Tolonen <jani>|
|Component:||glibc||Assignee:||Jakub Jelinek <jakub>|
|Status:||CLOSED CURRENTRELEASE||QA Contact:||Brian Brock <bbrock>|
|Fixed In Version:||||Doc Type:||Bug Fix|
|Doc Text:||||Story Points:||---|
|Last Closed:||2004-08-26 01:34:58 EDT||Type:||---|
Description Jani Tolonen 2004-01-20 15:04:51 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.1) Gecko/20020903

Description of problem:
We observed a problem when running a MySQL 4.0.17 application on RH AS 3.0: insert statements locked up on a very small table.

Platform CPU info:
vendor_id : AuthenticAMD
model name : AMD Opteron(tm) Processor 244
cpu MHz : 1792.138
cache size : 1024 KB

kernel info:
2.4.21-4.0.1.EL #1 Thu Oct 23 01:24:43 EDT 2003 x86_64 GNU/Linux

During debugging we found out that the problem occurred with the MySQL my_rwlock_init() code, which was used from the /lib64/tls/ library, as the binary was dynamic. It seemed that sometimes the argument had uninitialized values when it was passed to the locking function. This caused the lock to be ignored and further code was executed, which was wrong.

We found that just moving /lib64/tls to /lib64/tls.disabled fixed the problem. The MySQL server was then using the corresponding shared libraries from elsewhere on the system.

Version-Release number of selected component (if applicable):
glibc-2.3.2-95.6

How reproducible:
Always

Steps to Reproduce:
1. Install MySQL 4.0.17 (or higher); it must be a 64-bit version, made for Linux AMD 64-bit.
2. Create a table with a key field in it, not a primary key.
   CREATE TABLE t1 (i int, key (i));
3. Run simultaneous inserts to the table, into the above key field.
   INSERT INTO t1 VALUES (1);
   INSERT INTO t1 VALUES (2);
   etc. Inserts must be done via separate threads.

Actual Results:
If two or more inserts have been run close enough to each other, one can notice the problem with 'mysqladmin processlist', which says that the first thread is in state 'update' while the others are locked. The threads will stay in this state forever.

Expected Results:
Insert statements should have gone through.
Additional info:
The exact place in the mysql code where the problem occurs is at myisam/mi_open.c, around line 434, which looks like this:

    for (i=0; i<keys; i++)
      VOID(my_rwlock_init(&share->key_root_lock[i], NULL));

my_rwlock_init is defined as

    #define my_rwlock_init(A,B) pthread_rwlock_init((A),(B))

which is a system function call, which is defined in and used from /lib64/tls/libpthread-0.60.so

We suspect that the bug is somewhere in this library. Just by disabling this library things will work.

Regards, Jani
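For anyone trying to narrow this down outside of MySQL, the initialization loop quoted above can be approximated in a few lines of C. This is only a sketch under assumptions, not the MySQL code: the names init_key_locks, check_key_locks, destroy_key_locks and NUM_KEYS are hypothetical. Unlike the VOID(...) wrapper, it keeps the return value of pthread_rwlock_init and then checks that each freshly initialized lock can be taken for writing without blocking.

```c
/* Ask for the POSIX rwlock API even under strict -std=c99/-std=c11. */
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NUM_KEYS 4 /* hypothetical; stands in for the table's key count */

/* Initialize n locks with default attributes. Returns 0 on success,
   otherwise the error code from pthread_rwlock_init; on failure the
   locks that were already initialized are destroyed again. */
int init_key_locks(pthread_rwlock_t *locks, size_t n)
{
    assert(locks != NULL);
    for (size_t i = 0; i < n; i++) {
        int rc = pthread_rwlock_init(&locks[i], NULL);
        if (rc != 0) {
            while (i > 0)
                pthread_rwlock_destroy(&locks[--i]);
            return rc;
        }
    }
    return 0;
}

/* A freshly initialized lock has no holders, so a non-blocking write
   lock attempt should succeed. Returns 0 if that holds for all locks,
   otherwise the first error code seen. */
int check_key_locks(pthread_rwlock_t *locks, size_t n)
{
    assert(locks != NULL);
    for (size_t i = 0; i < n; i++) {
        int rc = pthread_rwlock_trywrlock(&locks[i]);
        if (rc != 0)
            return rc;
        rc = pthread_rwlock_unlock(&locks[i]);
        if (rc != 0)
            return rc;
    }
    return 0;
}

/* Release all n locks. */
void destroy_key_locks(pthread_rwlock_t *locks, size_t n)
{
    assert(locks != NULL);
    for (size_t i = 0; i < n; i++)
        pthread_rwlock_destroy(&locks[i]);
}
```

Calling init_key_locks and then check_key_locks on an array of NUM_KEYS locks from main (built with -pthread) should return 0 from both on a working libpthread. A nonzero result would point at initialization rather than at later lock traffic.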
Comment 1 Rackspace Bugzilla 2004-01-23 16:50:53 EST
This issue has also been seen with the MySQL 3 rpm packages released by Red Hat in the Extras channel for AS 3 AMD64. A resolution on this would be real nice! :)
Comment 2 Patrick Macdonald 2004-01-28 16:14:34 EST
This has nothing to do with Red Hat Application Server. I'm sending it over to RHEL for further investigation.
Comment 3 Patrick Macdonald 2004-01-28 16:16:46 EST
Tom... please take a look at this. It may very well be a glibc bug, but try to reproduce first.
Comment 4 Tom Lane 2004-01-29 19:53:06 EST
I was not able to reproduce this on an AMD64 4-way box running RHEL3, using MySQL 4.0.17 built from source. I made a file containing 10000 insert commands and fed it into two mysql processes simultaneously in two shell windows. The performance was pretty abysmal (a single mysql could run the file in about a second, but two parallel instances would take 13 to 15 seconds to finish), but I observed no lockup in several dozen trials. Can anyone suggest what I must do differently to see the failure? Note that if the problem occurs only with MySQL-supplied binaries, and not in a source build, I'd be inclined to blame the variant version of glibc that they use in their binary packages.
Comment 6 Tom Lane 2004-02-06 16:54:37 EST
I've continued to try to reproduce this, without success. I've tried MySQL 4.0.17 built from MySQL AB's SRPM, and I've tried the mysql-server-3.23.58-1 RPM from the RHN Extras channel as suggested by rackspace. No luck. I am using a more recent kernel:

2.4.21-9.ELsmp #1 SMP Thu Jan 8 16:52:31 EST 2004 x86_64 x86_64 x86_64 GNU/Linux

but other than that I think I am testing the same software and hardware as the complainant.

The only other possibility that comes to mind is that my testing methodology might be wrong. As mentioned above, I'm feeding script files containing many INSERT commands to mysql clients running in parallel. I tried 2,3,4,5 clients (this is a 4-way machine btw). Doesn't lock up.

Unless someone can tell me how to reproduce this, I'm going to have to close it WORKSFORME.
Comment 7 Jani Tolonen 2004-02-10 05:09:14 EST
I will try to get a backtrace of where MySQL hangs. We were testing this on a RedHat machine with MySQL compiled there, so it's definitely a RedHat problem. However, I understand that you need more information about the problem before you can check the code or debug it. I will discuss with our contact whether I can log in to the machine and get more information about the bug. I'd appreciate it if you could hold this report in status 'NEEDINFO' for now; I will get back to you asap. Regards, Jani
Comment 8 Tom Lane 2004-03-16 14:03:11 EST
I'm going to set this in state NEEDINFO until I hear more. Jani, if you file followup info please remember to change the state back to ASSIGNED.
Comment 9 Tom Lane 2004-03-18 12:05:09 EST
Rackspace sent me login info and reproduction instructions for their own x86_64 machine running Taroon Update 1. I have confirmed that following their recipe it is possible to get mysql to freeze up with a stack trace pointing to pthread_rwlock_wrlock. Obviously I can't put the login details in this bugzilla entry, but will send them by private mail to anyone who needs to look into this. It does seem to be a pthread issue.
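A MySQL-independent check of the write-lock path can be sketched as below. This is an illustration under assumptions, not the Rackspace recipe: run_writers, writer and MAX_WRITERS are hypothetical names. Several threads take the same rwlock for writing and bump a shared counter. If pthread_rwlock_wrlock misbehaved, the run would be expected either to hang (as in the reported stack trace) or to end with a wrong count.

```c
/* Ask for the POSIX rwlock API even under strict -std=c99/-std=c11. */
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_WRITERS 16 /* hypothetical upper bound on writer threads */

static pthread_rwlock_t counter_lock;
static long counter;

/* Each writer increments the shared counter *rounds times, always
   under the write lock. Returns NULL on success, non-NULL on error. */
static void *writer(void *arg)
{
    int rounds = *(const int *)arg;
    for (int i = 0; i < rounds; i++) {
        if (pthread_rwlock_wrlock(&counter_lock) != 0)
            return (void *)1;
        counter++;
        if (pthread_rwlock_unlock(&counter_lock) != 0)
            return (void *)1;
    }
    return NULL;
}

/* Run nthreads writers concurrently. Returns the final counter value,
   or -1 if any pthread call failed. */
long run_writers(int nthreads, int rounds)
{
    pthread_t threads[MAX_WRITERS];
    int started = 0;
    int failed = 0;

    assert(nthreads > 0 && nthreads <= MAX_WRITERS);
    counter = 0;
    if (pthread_rwlock_init(&counter_lock, NULL) != 0)
        return -1;

    for (; started < nthreads; started++) {
        if (pthread_create(&threads[started], NULL, writer, &rounds) != 0) {
            failed = 1;
            break;
        }
    }
    /* Join only the threads that were actually created. */
    for (int i = 0; i < started; i++) {
        void *res = NULL;
        if (pthread_join(threads[i], &res) != 0 || res != NULL)
            failed = 1;
    }
    pthread_rwlock_destroy(&counter_lock);
    return failed ? -1 : counter;
}
```

With a correct lock, run_writers(4, 10000) should return 40000. Build with -pthread and, to mirror the report, compare runs with and without /lib64/tls on the library path.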
Comment 10 Tom Lane 2004-03-18 12:15:32 EST
I am relabeling this as a glibc issue. I'm not sure who bugzilla will reassign it to (maybe Jakub?) but whoever wants to look at it please contact me by mail for info about reproducing the problem.
Comment 11 Jakub Jelinek 2004-03-18 12:21:40 EST
This looks very much like: http://sources.redhat.com/ml/libc-hacker/2004-02/msg00019.html which is fixed in glibc-2.3.2-95.12 and later (U2 beta includes glibc-2.3.2-95.17).
Comment 12 Tom Lane 2004-03-18 12:39:59 EST
Ah, very interesting. [looks...] The Rackspace machine that shows the problem is running glibc-2.3.2-95.6. I will ask them to update and see if the problem goes away. Thanks.
Comment 14 Ulrich Drepper 2004-08-26 01:19:42 EDT
Ping!? I want to close this bug if there is nothing more to report.
Comment 15 Tom Lane 2004-08-26 01:34:58 EDT
I've not heard any more complaints from Rackspace, so let's assume the issue is gone. They can reopen if not...