Description of problem:
pthread_cond_broadcast can call lll_unlock() twice, which corrupts the shared data. For example, it can cause pthread_cond_broadcast to wait forever.

Version-Release number of selected component (if applicable):
glibc 2.18 and 2.19 are affected. glibc 2.17 is fine, so Fedora 19 is not affected. Fedora 20 has this issue, and Fedora 21 most likely will too. The upcoming glibc 2.20 contains a fix for this issue.

How reproducible:
Occasionally. It happens as a result of a race condition between the waiters and the signaler.

Steps to Reproduce:
- Several waiters wait in pthread_cond_wait, each having acquired the mutex first.
- One signaler calls pthread_cond_broadcast while holding the mutex.

Actual results:
The thread that invoked pthread_cond_broadcast occasionally hangs. When I attach gdb, the hanging thread is waiting in lll_lock_wait() even though there are no concurrent threads.

Expected results:
pthread_cond_broadcast should return immediately.

Additional info:
The bug was already fixed in glibc's source code in April 2014:
https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=8f630cca5c36941db1cb48726016bbed80ec1041
glibc's bugzilla ticket:
https://sourceware.org/bugzilla/show_bug.cgi?id=17013
This is a backport request because Fedora will not move to glibc 2.20 until much later.
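The reproduction steps above can be sketched in C roughly as follows. This only illustrates the described pattern (several waiters in pthread_cond_wait, one signaler broadcasting with the mutex held); it is not the reporter's test case and is not known to trigger the hang reliably. The names round_state, waiter and broadcast_rounds, and the generation counter that keeps the rounds in step, are assumptions made for this sketch.

```c
#include <pthread.h>

#define MAX_WAITERS 64  /* arbitrary cap chosen for the sketch */

/* Hypothetical bookkeeping shared by the signaler and the waiters. */
struct round_state {
    pthread_mutex_t mutex;
    pthread_cond_t wake;         /* waiters block here */
    pthread_cond_t all_waiting;  /* signaler blocks here until every waiter is parked */
    int n_waiters;
    int rounds;
    int waiting;     /* waiters parked for the current round */
    int generation;  /* bumped once per broadcast */
    int woken;       /* total wake-ups observed */
};

static void *waiter(void *arg)
{
    struct round_state *s = arg;

    pthread_mutex_lock(&s->mutex);
    for (int r = 0; r < s->rounds; r++) {
        int seen = s->generation;
        if (++s->waiting == s->n_waiters)
            pthread_cond_signal(&s->all_waiting);
        /* The loop guards against spurious wake-ups. */
        while (s->generation == seen)
            pthread_cond_wait(&s->wake, &s->mutex);
        s->woken++;
    }
    pthread_mutex_unlock(&s->mutex);
    return NULL;
}

/* Runs `rounds` broadcasts against `n_waiters` waiters.
   Returns the number of wake-ups seen, or -1 on bad arguments
   or if a thread could not be created. */
int broadcast_rounds(int n_waiters, int rounds)
{
    struct round_state s = { .n_waiters = n_waiters, .rounds = rounds };
    pthread_t tids[MAX_WAITERS];
    int created = 0;

    if (n_waiters < 1 || n_waiters > MAX_WAITERS || rounds < 0)
        return -1;

    pthread_mutex_init(&s.mutex, NULL);
    pthread_cond_init(&s.wake, NULL);
    pthread_cond_init(&s.all_waiting, NULL);

    /* Hold the mutex while starting threads so no waiter runs early. */
    pthread_mutex_lock(&s.mutex);
    while (created < n_waiters &&
           pthread_create(&tids[created], NULL, waiter, &s) == 0)
        created++;
    if (created < n_waiters)
        s.rounds = 0;  /* creation failed: started waiters exit at once */

    for (int r = 0; r < s.rounds; r++) {
        while (s.waiting < n_waiters)
            pthread_cond_wait(&s.all_waiting, &s.mutex);
        s.waiting = 0;
        s.generation++;
        /* Broadcast with the mutex held, as in the report. */
        pthread_cond_broadcast(&s.wake);
    }
    pthread_mutex_unlock(&s.mutex);

    for (int i = 0; i < created; i++)
        pthread_join(tids[i], NULL);

    pthread_cond_destroy(&s.all_waiting);
    pthread_cond_destroy(&s.wake);
    pthread_mutex_destroy(&s.mutex);

    return created == n_waiters ? s.woken : -1;
}
```

Called from main and built with gcc -pthread, broadcast_rounds(4, 100000) should return 400000 on a fixed glibc. On an affected build the call might never return, in which case attaching gdb should show where the signaler is stuck.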
I've pushed the patch, so the next update will have this fix.
Created attachment 943335 [details]
syslog excerpt

Could you tell me if that is what I am seeing, or do I need to open a new ticket?

After a system update on 9/30, conky started crashing occasionally; at the same time gnome-shell and gnome-terminal behave erratically, and firefox takes too long to close tabs and refuses to quit.

These are the packages that were updated:

Sep 30 09:45:28 Updated: glibc-common-2.18-16.fc20.x86_64
Sep 30 09:45:32 Updated: glibc-2.18-16.fc20.x86_64
Sep 30 09:45:33 Updated: 2:libwbclient-4.1.12-4.fc20.x86_64
Sep 30 09:45:34 Updated: 2:samba-libs-4.1.12-4.fc20.x86_64
Sep 30 09:45:35 Updated: 2:samba-common-4.1.12-4.fc20.x86_64
Sep 30 09:45:38 Updated: 1:xscreensaver-base-5.30-4.fc20.x86_64
Sep 30 09:45:39 Updated: gnutls-3.1.26-2.fc20.x86_64
Sep 30 09:45:39 Updated: 1:xscreensaver-extras-base-5.30-4.fc20.x86_64
Sep 30 09:45:40 Updated: gnutls-dane-3.1.26-2.fc20.x86_64
Sep 30 09:45:40 Updated: 1:xscreensaver-gl-base-5.30-4.fc20.x86_64
Sep 30 09:45:41 Updated: 2:libsmbclient-4.1.12-4.fc20.x86_64
Sep 30 09:45:42 Updated: 2:samba-winbind-modules-4.1.12-4.fc20.x86_64
Sep 30 09:45:43 Updated: 2:samba-winbind-4.1.12-4.fc20.x86_64
Sep 30 09:45:43 Updated: ibus-anthy-python-1.5.6-1.fc20.noarch
Sep 30 09:45:57 Updated: ibus-anthy-1.5.6-1.fc20.x86_64
Sep 30 09:45:59 Updated: glibc-headers-2.18-16.fc20.x86_64
Sep 30 09:46:00 Updated: glibc-devel-2.18-16.fc20.x86_64
Sep 30 09:46:00 Updated: 2:samba-winbind-clients-4.1.12-4.fc20.x86_64
Sep 30 09:46:02 Updated: 2:samba-client-4.1.12-4.fc20.x86_64
Sep 30 09:46:04 Updated: 1:xscreensaver-gl-extras-5.30-4.fc20.x86_64
Sep 30 09:46:04 Updated: gnutls-utils-3.1.26-2.fc20.x86_64
Sep 30 09:46:06 Updated: 1:xscreensaver-extras-5.30-4.fc20.x86_64
Sep 30 09:46:06 Updated: libcmis-0.4.1-5.fc20.x86_64
Sep 30 09:46:07 Updated: libseccomp-2.1.1-0.fc20.x86_64
Sep 30 09:46:08 Updated: liblangtag-0.5.4-4.fc20.x86_64
Sep 30 09:46:08 Updated: perl-Data-Dumper-2.154-1.fc20.x86_64
Sep 30 09:46:09 Updated: automake-1.13.4-6.fc20.noarch
Sep 30 09:46:11 Updated: glibc-2.18-16.fc20.i686
Sep 30 09:46:12 Updated: gnutls-3.1.26-2.fc20.i686
Sep 30 09:49:36 Updated: gnome-chemistry-utils-0.14.9-2.fc20.x86_64
Sep 30 10:17:21 Updated: goffice-0.10.18-1.fc20.x86_64
Sep 30 10:17:23 Updated: ca-certificates-2014.2.1-1.1.fc20.noarch
Thanks for backporting it, Siddhesh.

Alex, I'm not the expert on this, so this is just my guess from your syslog. The first seemingly related log entries look like fork failures with "Resource temporarily unavailable". If I were you, I'd suspect the values of ulimit -u and -e, as well as running out of stack memory (isn't "ulimit -s" too big, aren't there too many processes/threads, etc.).

I skimmed the rest of the syslog. I'd say there is too little information to figure out whether this is caused by this particular bug in glibc. Usually you have to attach a debugger unless the application happens to write super-detailed info about the cause of its errors to syslog (which doesn't seem to be the case this time).
I wouldn't mind launching these programs with strace or gdb, but I can't figure out a way to reproduce the problem. The first time it happened I was reading a long article in firefox and didn't notice things had gone haywire until I was done; the second time I was waiting for an emerge to complete on a gentoo system via ssh, so I detached and rebooted. Some other times I wasn't even sitting in front of my monitor.

These are the limits for my user:

$ ulimit -s
8192
$ ulimit -u
1024
$ ulimit -e
0

I have run the system for hours, sometimes with many processes running concurrently (virtual machines, LAMP, ncurses based stuff, etc.), with no hitch. On a couple of other systems, which are 32-bit though, I haven't noticed similar crashes after the glibc update.

I guess I'll have to wait for Siddhesh's patch to get packaged and see how that goes. Thank you very much for the input.
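For reference, the same three limits can be read from inside a process with getrlimit(), which can help when checking what a misbehaving program actually inherited. The sketch below assumes Linux with glibc (RLIMIT_NICE is Linux-specific); the helper names format_limit, show_limit and show_thread_limits are made up for the example.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Format a soft limit roughly the way `ulimit` prints it. `unit` divides
   the raw value (1024 for the stack size, which ulimit reports in KiB). */
void format_limit(rlim_t value, rlim_t unit, char *buf, size_t len)
{
    if (value == RLIM_INFINITY)
        snprintf(buf, len, "unlimited");
    else
        snprintf(buf, len, "%llu", (unsigned long long)(value / unit));
}

/* Print one soft limit; returns 0 on success, -1 if getrlimit() fails. */
int show_limit(const char *label, int resource, rlim_t unit)
{
    struct rlimit rl;
    char buf[32];

    if (getrlimit(resource, &rl) != 0)
        return -1;
    format_limit(rl.rlim_cur, unit, buf, sizeof buf);
    printf("%-30s %s\n", label, buf);
    return 0;
}

/* Print the limits discussed in this thread; returns 0 if all were read. */
int show_thread_limits(void)
{
    int rc = 0;

    rc |= show_limit("stack size, KiB (ulimit -s)", RLIMIT_STACK, 1024);
    rc |= show_limit("max user processes (ulimit -u)", RLIMIT_NPROC, 1);
    rc |= show_limit("scheduling priority (ulimit -e)", RLIMIT_NICE, 1);
    return rc;
}
```

Calling show_thread_limits() from main prints one line per limit. These are the soft limits of the calling process, which may differ from those of the shell where ulimit was run.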
(In reply to Alexander Ploumistos from comment #2)
> After a system update on 9/30 conky started crashing occasionaly and at the
> same time gnome-shell and gnome-terminal behave erratically and firefox
> takes too long to close tabs and refuses to quit.

If this is the update that started causing problems, then it is not this bug. This bug has been present since the F-20 release.
After posting here, I have been working on this computer for 12+ hours every day and I have not had this issue again. I've been monitoring my logs and there has been no message like those I posted. None of the updates I have installed since then seems even remotely relevant. I haven't been work with virt-manager much, but I can't see how virt-manager could cause conky to crash or mess with my terminal. Anyway, I'll keep paying attention, and if I manage to gather something meaningful, I'll file a fresh bug report. Thank you both for your time.
Oops, "I haven't been *working*[..]".
This message is a reminder that Fedora 20 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 20. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '20'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 20 reaches end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
Fedora 20 changed to end-of-life (EOL) status on 2015-06-23. Fedora 20 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug. Thank you for reporting this bug and we are sorry it could not be fixed.