Bug 223865

Summary: Unneeded message sprays to console causing nodes to leave cluster
Product: Red Hat Enterprise Linux 4
Reporter: Jonathan Earl Brassow <jbrassow>
Component: kernel
Assignee: Jonathan Earl Brassow <jbrassow>
Status: CLOSED ERRATA
QA Contact: Brian Brock <bbrock>
Severity: medium
Docs Contact:
Priority: medium
Version: 4.0
CC: agk, jbaron, kanderso, rkenna
Target Milestone: ---
Target Release: ---
Hardware: All
OS: Linux
Fixed In Version: RHBA-2007-0304
Doc Type: Bug Fix
Last Closed: 2007-05-08 04:44:06 UTC
Attachments:
  Description: Patch to remove print statement
  Flags: none

Description Jonathan Earl Brassow 2007-01-22 20:05:24 UTC
Description of problem:
I have a cluster running cluster mirroring.  If I raise the I/O load high enough
and fail the primary side of the mirror, the kernel generates so many messages
that the machine cannot complete other critical tasks (such as heartbeating for
the cluster).

The messages printed are:
device-mapper: incrementing error_count on 253:3
...
This message is printed from drivers/md/dm-raid1.c:fail_mirror().  We already
get messages from the device subsystem (e.g. scsi0 (0:0): rejecting I/O to
offline device), so the message above is unnecessary.  (RHEL 5 has already
removed this message.)

The system becomes so busy printing this message that cluster members start to
be removed.  Once this happens, CLVM commands cannot continue, which leaves the
recovery process hung.  The mirror never gets recovered, and LVM commands stop.

Version-Release number of selected component (if applicable):
kernel-2.6.9-42.EL

How reproducible:
Always (with high enough load).

Steps to Reproduce:
1. Create a cluster mirror and put a file system on it.
2. Run a heavy I/O load against the file system.
3. Fail the primary leg of the mirror.

Additional info:
The kernel patch is a one-line fix that removes an unnecessary message.
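
The effect of the change can be sketched in userspace C.  This is only an
illustrative model, not the code in drivers/md/dm-raid1.c: struct mirror_model,
fail_mirror_noisy(), fail_mirror_quiet() and the log_lines counter are
hypothetical names, and fprintf() stands in for the kernel's console logging.
The sketch assumes the only difference between the two variants is the message
itself, with the error counter still being incremented.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for one mirror leg (not the kernel struct). */
struct mirror_model {
	const char *dev_name;      /* device label as shown in the log, e.g. "253:3" */
	unsigned long error_count; /* number of failed I/Os recorded for this leg */
};

/* Number of log lines emitted so far, so the two variants can be compared. */
static unsigned long log_lines;

/* Model of the old behavior: one console line for every failed I/O. */
static void fail_mirror_noisy(struct mirror_model *m)
{
	assert(m != NULL);
	fprintf(stderr, "device-mapper: incrementing error_count on %s\n",
		m->dev_name);
	log_lines++;
	m->error_count++;
}

/* Model of the patched behavior: same bookkeeping, no message. */
static void fail_mirror_quiet(struct mirror_model *m)
{
	assert(m != NULL);
	m->error_count++;
}
```

Calling each function once per failed I/O shows the difference: in the noisy
variant the number of console lines grows with the I/O rate, while in the quiet
variant the failures are still counted but nothing is printed.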

Comment 1 Jonathan Earl Brassow 2007-01-22 20:06:49 UTC
Created attachment 146217
Patch to remove print statement

Comment 4 RHEL Program Management 2007-01-30 17:05:26 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 6 Jason Baron 2007-02-01 19:35:49 UTC
Committed in stream U5, build 45. A test kernel with this patch is available
from http://people.redhat.com/~jbaron/rhel4/


Comment 9 Red Hat Bugzilla 2007-05-08 04:44:06 UTC
An advisory has been issued which should address the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2007-0304.html