Red Hat Bugzilla – Bug 468438
List corruption in cmirror causes machine lock-up or cmirror processing stoppage
Last modified: 2010-01-11 21:08:44 EST
Commit e07369b28d7a569e742d80152ef10c9d42bc2650 introduced a bug in which a tfr struct could be added to one queue (cluster_queue) before being removed from another (x->delay_queue).
This causes a variety of issues, including:
- machine hang (if clogd is in real time scheduling mode)
- LVM/dmsetup command hangs
- sync stoppage
... and any number of other things that can result from a corrupted list or lost requests.
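To illustrate why adding a node to a second queue before unlinking it from the first is so damaging, here is a minimal sketch using a kernel-style circular doubly linked list. It is not the clogd code: struct request stands in for the tfr struct, and list_count, requeue_buggy and requeue_fixed are hypothetical helpers invented for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list (illustrative, not clogd's own list code). */
struct list_head {
    struct list_head *next, *prev;
};

static void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

static void list_add_tail(struct list_head *node, struct list_head *head)
{
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

static void list_del(struct list_head *node)
{
    node->prev->next = node->next;
    node->next->prev = node->prev;
    node->next = node;
    node->prev = node;
}

/*
 * Walk forward from head and count the entries. Returns -1 if the walk
 * has not come back to head after 'limit' steps, i.e. the list no longer
 * closes on its own head (an unbounded walk would loop forever).
 */
static int list_count(const struct list_head *head, int limit)
{
    const struct list_head *p;
    int n = 0;

    for (p = head->next; p != head; p = p->next)
        if (++n > limit)
            return -1;
    return n;
}

/* Hypothetical stand-in for the tfr struct mentioned in the report. */
struct request {
    struct list_head list;
    int id;
};

/* Buggy pattern: the request is still linked on its old queue. */
static void requeue_buggy(struct request *r, struct list_head *dst)
{
    list_add_tail(&r->list, dst);
}

/* Safe pattern: unlink from the old queue first, then add to the new one. */
static void requeue_fixed(struct request *r, struct list_head *dst)
{
    list_del(&r->list);
    list_add_tail(&r->list, dst);
}
```

With two requests on a delay queue, calling requeue_buggy on the first one overwrites its next and prev pointers while the delay queue head still points at it. A forward walk of the delay queue then bounces between that request and the cluster queue head without ever returning, and the second request is no longer reachable from the front. This matches both failure modes described here: an infinite loop and a lost request. With requeue_fixed, each queue ends up with exactly one entry.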
Author: Jonathan Brassow <email@example.com>
Date: Fri Oct 24 13:42:06 2008 -0500
clogd: Fix for bug 468438 - list corruption
'commit e07369b28d7a569e742d80152ef10c9d42bc2650' introduced the
concept of a delay queue to hold requests while membership changes
occurred. Sometimes, a request would be added to the delay_queue
/and/ the cluster_queue, resulting in list corruption. Depending
on how the list was corrupted, infinite loops could occur, or
requests could simply be lost.
An advisory has been issued which should address the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.