Bug 197396

Summary: dmeventd threads ignoring pthread_cancel causing cluster mirror recovery commands to fail
Product: Red Hat Enterprise Linux 4
Component: device-mapper
Version: 4.0
Hardware: All
OS: Linux
Status: CLOSED ERRATA
Severity: high
Priority: high
Reporter: Jonathan Earl Brassow <jbrassow>
Assignee: Jonathan Earl Brassow <jbrassow>
CC: agk, christophe.varoqui, egoggin, jlaska, kanderso, lmb, mbroz, rkenna, tranlan
Fixed In Version: RHBA-2006-0434
Doc Type: Bug Fix
Last Closed: 2006-08-10 21:26:31 UTC
Bug Blocks: 181411
Attachments:
  Patch to handle threads that ignore pthread_cancel (flags: none)

Description Jonathan Earl Brassow 2006-06-30 20:34:50 UTC
Mirroring uses a process called 'dmeventd' to monitor devices for failures. 
When dmeventd receives a request to monitor a device, it creates a new thread to
watch it.  Also, when a device no longer needs monitoring, dmeventd will reap
the thread.

Monitoring/unmonitoring events happen as a natural course of recovery while the
device is being reconfigured.

In the case of cluster mirrors, if a log device were to fail, dmeventd would
try to reduce the mirror from one with a disk log to one without.  This means
the device must be unmonitored and then remonitored when the new device is
ready.  Because the monitoring thread ignored pthread_cancel, dmeventd would
get stuck waiting for that thread to be reaped and hang the recovery command.
The overall command would then time out, leaving the bad mirror in place.
Further, since the recovery command is hung on a particular node, all lvm
commands on that node hang waiting for a lock and subsequently fail.
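
For illustration only, here is a minimal sketch of the failure mode and one
possible way around it.  It is not the attached patch and does not use
dmeventd's real data structures: struct monitor, monitor_start(),
monitor_reap() and the choice of SIGUSR1 are all hypothetical.  The worker
disables cancellation to stand in for a thread that ignores pthread_cancel;
the reaper therefore also sets a stop flag and sends a signal to interrupt the
blocking wait before it joins the thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>
#include <stdatomic.h>
#include <string.h>
#include <time.h>

/* Hypothetical per-device monitor state; not dmeventd's real structure. */
struct monitor {
    pthread_t tid;
    atomic_int stop;    /* set by the reaper to request shutdown */
    atomic_int exited;  /* set by the worker just before it returns */
};

/* Empty handler: its only job is to interrupt a blocking call with EINTR. */
static void wake_handler(int sig)
{
    (void)sig;
}

static void *monitor_thread(void *arg)
{
    struct monitor *m = arg;
    int old;

    /* Assumption for the demo: simulate a thread on which pthread_cancel()
     * has no effect by disabling cancellation outright. */
    pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &old);

    while (!atomic_load(&m->stop)) {
        /* Stand-in for blocking on a device event. */
        struct timespec ts = { 0, 100 * 1000 * 1000 };
        nanosleep(&ts, NULL);  /* returns early when a signal arrives */
    }

    atomic_store(&m->exited, 1);
    return NULL;
}

/* Start watching a device: one thread per monitored device. */
static int monitor_start(struct monitor *m)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = wake_handler;  /* no SA_RESTART, so waits are interrupted */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    atomic_init(&m->stop, 0);
    atomic_init(&m->exited, 0);
    return pthread_create(&m->tid, NULL, monitor_thread, m);
}

/* Stop watching a device.  pthread_cancel() followed directly by
 * pthread_join() would block forever against the worker above, so the
 * reaper also sets the stop flag and wakes the thread with a signal. */
static int monitor_reap(struct monitor *m)
{
    pthread_cancel(m->tid);         /* ignored by this worker */
    atomic_store(&m->stop, 1);      /* cooperative shutdown request */
    pthread_kill(m->tid, SIGUSR1);  /* break out of the blocking wait */
    return pthread_join(m->tid, NULL);
}
```

A caller would pair monitor_start() with monitor_reap() around a
reconfiguration.  Because the worker rechecks the stop flag after every wait,
the join is bounded by one wait interval even if the signal arrives just
before the thread goes back to sleep.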

Comment 1 Jonathan Earl Brassow 2006-06-30 20:34:51 UTC
Created attachment 131831 [details]
Patch to handle threads that ignore pthread_cancel

Comment 5 Alasdair Kergon 2006-07-05 17:57:14 UTC
included in 1.02.07-3.0

Comment 9 Red Hat Bugzilla 2006-08-10 21:26:31 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2006-0434.html