Description of problem:
I have a cluster running cluster mirroring. If I drive the I/O load high enough
and fail the primary side of the mirror, the kernel generates so many messages
that the machine cannot complete other critical tasks (like heartbeating for the
cluster).
The messages printed are:
device-mapper: incrementing error_count on 253:3
This message is printed by fail_mirror() in drivers/md/dm-raid1.c. We already
get messages from the device subsystem (e.g. scsi0 (0:0): rejecting I/O to
offline device), so the above is really unnecessary. (RHEL 5 has already pulled this
The system becomes so busy processing this useless message that cluster members
start to be removed. Once this happens, CLVM commands cannot continue,
resulting in a hung recovery process. The mirror never gets recovered, and LVM
Version-Release number of selected component (if applicable):
How reproducible:
Always (with high enough load).
Steps to Reproduce:
1. Create cluster mirror, put FS on it.
2. Fail primary leg of the mirror
The kernel patch is a one-line fix that removes an unnecessary message.
Created attachment 146217
Patch to remove print statement
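The attachment itself is not reproduced in this report. For readers without access to it, the following is a minimal user-space sketch of the idea, not the actual kernel code: the struct and both function names are hypothetical stand-ins, and only the message text is taken from the report above. It assumes the error path does nothing more than bump a counter and log, and shows that dropping the print leaves the error accounting unchanged while removing the per-I/O log line.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical model of per-mirror state; not the kernel's struct mirror. */
struct mirror_model {
    const char *dev;           /* device number as printed, e.g. "253:3" */
    unsigned long error_count; /* failed I/Os seen on this mirror leg */
    unsigned long log_lines;   /* log lines emitted (tracked for illustration) */
};

/* Before the patch (modelled): every failed I/O also emits one log line. */
static void fail_mirror_noisy(struct mirror_model *m, FILE *log)
{
    assert(m != NULL && log != NULL);
    m->error_count++;
    fprintf(log, "device-mapper: incrementing error_count on %s\n", m->dev);
    m->log_lines++;
}

/* After the patch (modelled): the error is still counted, nothing is logged. */
static void fail_mirror_quiet(struct mirror_model *m)
{
    assert(m != NULL);
    m->error_count++;
}
```

Under heavy I/O against a failed leg, the first variant emits one line per failed request, while the second emits none and relies on the device subsystem's own messages to report the failure.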
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release. Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products. This request is not yet committed for inclusion in an Update
committed in stream U5 build 45. A test kernel with this patch is available from
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.