This bug has been migrated to another issue tracking site. It has been closed here and may no longer be being monitored.

If you would like to get updates for this issue, or to participate in it, you may do so at Red Hat Issue Tracker.
Bug 1979990 - glibc: pthread_cond_wait missed wakeup (swbz#25847)
Summary: glibc: pthread_cond_wait missed wakeup (swbz#25847)
Keywords:
Status: CLOSED MIGRATED
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: glibc
Version: 8.4
Hardware: Unspecified
OS: Linux
Priority: high
Severity: urgent
Target Milestone: beta
Target Release: ---
Assignee: Carlos O'Donell
QA Contact: qe-baseos-tools-bugs
URL:
Whiteboard:
Depends On: 1889892
Blocks:
 
Reported: 2021-07-07 14:44 UTC by Andrew Mike
Modified: 2023-09-23 22:23 UTC
CC List: 16 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1889892
Environment:
Last Closed: 2023-09-23 22:23:38 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Links
System: Red Hat Issue Tracker
ID: RHEL-8381
Private: 0
Priority: None
Status: Migrated
Summary: None
Last Updated: 2023-09-23 22:23:49 UTC

Description Andrew Mike 2021-07-07 14:44:13 UTC
+++ This bug was initially created as a clone of Bug #1889892 +++

Description of problem:

This bug was submitted by Qin Li to glibc bugzilla earlier this year, with a one-line patch, though it hasn't been merged into glibc yet:

https://sourceware.org/bugzilla/show_bug.cgi?id=25847

Version-Release number of selected component: glibc-2.27 onwards

How reproducible: reliably; try the repro from the sourceware URL above

Actual results: deadlocks after 30-120 minutes on a 4-core Fedora 32 box

Expected results: should never deadlock

Additional info:

This bug in pthread conditions will deadlock the OCaml runtime, as well as Python and .NET applications.
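
For readers unfamiliar with the pattern involved, the following is a minimal sketch of the standard condition-variable idiom that such runtimes rely on. It is not the upstream reproducer and will not trigger the bug by itself; the `struct flag` type and the `flag_*` and `demo` names are hypothetical, introduced only for illustration. The point is that the code is correct by POSIX rules (predicate re-checked in a loop, state changed under the mutex), so a missed wakeup inside `pthread_cond_wait` would leave the waiter blocked even though the application did nothing wrong.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical one-shot flag guarded by a mutex and a condition variable. */
struct flag {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            set;
};

void flag_init(struct flag *f) {
    assert(f != NULL);
    pthread_mutex_init(&f->lock, NULL);
    pthread_cond_init(&f->cond, NULL);
    f->set = false;
}

/* Waiter: re-check the predicate in a loop, since wakeups may be spurious. */
void flag_wait(struct flag *f) {
    pthread_mutex_lock(&f->lock);
    while (!f->set)
        pthread_cond_wait(&f->cond, &f->lock);
    pthread_mutex_unlock(&f->lock);
}

/* Signaller: change the predicate while holding the lock, then signal. */
void flag_raise(struct flag *f) {
    pthread_mutex_lock(&f->lock);
    f->set = true;
    pthread_cond_signal(&f->cond);
    pthread_mutex_unlock(&f->lock);
}

static void *waiter_main(void *arg) {
    flag_wait(arg);
    return NULL;
}

/* Starts one waiter, raises the flag, and joins the waiter.
 * Returns 0 when the waiter was created and joined successfully. */
int demo(void) {
    struct flag f;
    pthread_t t;
    int rc;

    flag_init(&f);
    if (pthread_create(&t, NULL, waiter_main, &f) != 0)
        return -1;
    flag_raise(&f);
    rc = pthread_join(t, NULL);
    pthread_cond_destroy(&f.cond);
    pthread_mutex_destroy(&f.lock);
    return rc;
}
```

Calling `demo()` from a program linked with pthread support should return 0 on a healthy system. The reported deadlock needs many waiters and signallers running for a long time, which is what the upstream repro does.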

The bug was introduced in glibc 2.27 and is still present in glibc 2.31.
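
As a rough aid for checking whether a system falls in the reported range, the sketch below compares the running glibc version against 2.27 using `gnu_get_libc_version()` from `<gnu/libc-version.h>`. The helper names `glibc_at_least` and `running_glibc_at_least_2_27` are hypothetical. Note the assumption: the version string says nothing about whether a distribution has patched its glibc package, so a positive result only means the system is worth a closer look.

```c
#include <assert.h>
#include <gnu/libc-version.h>
#include <stddef.h>
#include <stdio.h>

/* Compare a "major.minor" version string against a wanted version.
 * Returns 1 if version >= want, 0 if it is older, -1 if unparsable. */
int glibc_at_least(const char *version, int want_major, int want_minor) {
    int major = 0, minor = 0;

    assert(version != NULL);
    if (sscanf(version, "%d.%d", &major, &minor) != 2)
        return -1;
    if (major != want_major)
        return major > want_major;
    return minor >= want_minor;
}

/* Checks the glibc the process is actually running against. */
int running_glibc_at_least_2_27(void) {
    return glibc_at_least(gnu_get_libc_version(), 2, 27);
}
```

A caller could print a warning when `running_glibc_at_least_2_27()` returns 1, and then consult the distribution's changelog for the glibc package to see whether a fix for swbz#25847 has been applied.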

I confirm that the repro from the URL above deadlocks on Fedora 32. It takes about 30-180 minutes on a 4-core server.

I further confirm that the one-line fix to glibc at the URL above applies cleanly to Fedora 32's glibc source rpm, and that the patched glibc does not deadlock after running the repro for more than 30 hours.

Please kindly consider merging the one-line fix into Fedora glibc.

More background about this bug, for the sake of future internet searchers:
* https://discuss.ocaml.org/t/is-there-a-known-recent-linux-locking-bug-that-affects-the-ocaml-runtime

--- Additional comment from Michael Bacarella on 2020-10-20 20:34:52 UTC ---

will deadlock

--- Additional comment from Michael Bacarella on 2020-10-20 20:35:47 UTC ---



--- Additional comment from Carlos O'Donell on 2020-10-27 13:21:56 UTC ---

We are looking to fix this for Fedora and Red Hat Enterprise Linux 8 as this has impact to users on both platforms.

--- Additional comment from Török Edwin on 2020-11-01 17:59:31 UTC ---

Small modification to upstream testcase that abort()s when the loop is stuck for several iterations.
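
The attached testcase itself is not reproduced on this page. As an illustration of the general idea only, a stall detector of this kind might look like the sketch below: a monitor periodically samples a progress counter and calls abort() once the counter has not moved for a set number of consecutive checks, so that a core dump is produced at the moment of the hang. The `struct watchdog` type and the `watchdog_*` names are hypothetical and are not taken from the attachment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stall detector: remembers the last progress value seen
 * and how many consecutive checks observed no change. */
struct watchdog {
    unsigned long last_seen;
    unsigned      stalled;
    unsigned      limit;
};

void watchdog_init(struct watchdog *w, unsigned limit) {
    assert(w != NULL);
    w->last_seen = 0;
    w->stalled = 0;
    w->limit = limit;
}

/* Returns 1 once `progress` has been unchanged for `limit` consecutive
 * checks, otherwise 0. Any change in `progress` resets the count. */
int watchdog_check(struct watchdog *w, unsigned long progress) {
    assert(w != NULL);
    if (progress != w->last_seen) {
        w->last_seen = progress;
        w->stalled = 0;
        return 0;
    }
    w->stalled++;
    return w->stalled >= w->limit;
}

/* Variant for use in a test loop: abort() leaves a core dump to inspect. */
void watchdog_check_or_abort(struct watchdog *w, unsigned long progress) {
    if (watchdog_check(w, progress))
        abort();
}
```

In a test harness, a monitor thread would call `watchdog_check_or_abort` at a fixed interval with the current iteration count of the worker loop. The interval and limit need to be generous enough that a merely slow iteration is not mistaken for a deadlock.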

--- Additional comment from Carlos O'Donell on 2020-11-10 14:25:18 UTC ---

Delaying the review of this until the end of November when we have more time to review upstream patches.

--- Additional comment from Fedora Program Management on 2021-04-29 17:06:51 UTC ---

This message is a reminder that Fedora 32 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora 32 on 2021-05-25.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
Fedora 'version' of '32'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 32 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

--- Additional comment from Carlos O'Donell on 2021-04-29 20:14:38 UTC ---

Still a bug, and still in Rawhide.

Comment 1 Carlos O'Donell 2021-07-09 13:36:04 UTC
The goal is to review and improve the situation with pthread condvar wakeup in upstream glibc 2.34 (releasing August 2021).

Any fixes that go upstream can then be considered for inclusion into RHEL8 from that point onwards.

I'm working upstream with a colleague at IBM to review the correctness of the fixes being proposed.

Comment 7 RHEL Program Management 2023-09-23 22:22:24 UTC
Issue migration from Bugzilla to Jira is in process at this time. This will be the last message in Jira copied from the Bugzilla bug.

Comment 8 RHEL Program Management 2023-09-23 22:23:38 UTC
This BZ has been automatically migrated to the issues.redhat.com Red Hat Issue Tracker. All future work related to this report will be managed there.

Due to differences in account names between systems, some fields were not replicated. Be sure to add yourself to the Jira issue's "Watchers" field to continue receiving updates, and add others to the "Need Info From" field to continue requesting information.

To find the migrated issue, look in the "Links" section for a direct link to the new issue location. The issue key will have an icon of 2 footprints next to it, and begin with "RHEL-" followed by an integer.  You can also find this issue by visiting https://issues.redhat.com/issues/?jql= and searching the "Bugzilla Bug" field for this BZ's number, e.g. a search like:

"Bugzilla Bug" = 1234567

In the event you have trouble locating or viewing this issue, you can file an issue by sending mail to rh-issues@redhat.com. You can also visit https://access.redhat.com/articles/7032570 for general account information.

