Bug 223865

Summary:

Unneeded message sprays to console causing nodes to leave cluster

Product:

Red Hat Enterprise Linux 4

Reporter:

Jonathan Earl Brassow <jbrassow>

Component:

kernel

Assignee:

Jonathan Earl Brassow <jbrassow>

Status:

CLOSED ERRATA

QA Contact:

Brian Brock <bbrock>

Severity:

medium

Docs Contact:

Priority:

medium

Version:

4.0

CC:

agk, jbaron, kanderso, rkenna

Target Milestone:

---

Target Release:

---

Hardware:

All

OS:

Linux

Whiteboard:

Fixed In Version:

RHBA-2007-0304

Doc Type:

Bug Fix

Last Closed:

2007-05-08 04:44:06 UTC

Attachments:

Description	Flags
Patch to remove print statement	none

Description Jonathan Earl Brassow 2007-01-22 20:05:24 UTC

Description of problem:
I have a cluster running cluster mirroring.  If I raise the I/O load high
enough and fail the primary side of the mirror, the kernel generates so many
messages that the machine cannot complete other critical tasks (such as
heartbeating for the cluster).

The messages printed are:
device-mapper: incrementing error_count on 253:3
...
This message is printed by fail_mirror() in drivers/md/dm-raid1.c.  We already
get messages from the device subsystem (e.g. "scsi0 (0:0): rejecting I/O to
offline device"), so the message above is unnecessary.  (RHEL 5 has already
removed it.)

The system becomes so busy printing this message that cluster members start to
be removed.  Once this happens, CLVM commands cannot continue, which results in
a hung recovery process.  The mirror is never recovered, and LVM commands stop.

Version-Release number of selected component (if applicable):
kernel-2.6.9-42.EL

How reproducible:
Always (with high enough load).

Steps to Reproduce:
1. Create a cluster mirror and put a file system on it.
2. Generate a heavy I/O load on the file system.
3. Fail the primary leg of the mirror.

Additional info:
The kernel patch is a one-line fix that removes the unnecessary message.
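
For readers without the attachment, the shape of the change can be sketched in
user-space C.  This is an illustration only, not the attached patch and not the
code in drivers/md/dm-raid1.c: struct mirror_sketch, fail_mirror_noisy() and
fail_mirror_quiet() are hypothetical names, and the FILE stream stands in for
the kernel console.  It shows only that dropping the print leaves the error
accounting untouched while removing the per-I/O console output.

```c
#include <stdio.h>

/* Hypothetical stand-in for one mirror leg; NOT the kernel's struct mirror. */
struct mirror_sketch {
    unsigned int error_count;  /* failed I/Os seen on this leg */
    unsigned int major;        /* device numbers, e.g. 253:3 */
    unsigned int minor;
};

/*
 * Behaviour described in this report: every failed I/O bumps the counter
 * and also writes one line to the console.  Returns the number of
 * characters written (or a negative value on output error).
 */
static int fail_mirror_noisy(struct mirror_sketch *m, FILE *console)
{
    m->error_count++;
    return fprintf(console,
                   "device-mapper: incrementing error_count on %u:%u\n",
                   m->major, m->minor);
}

/*
 * Behaviour after removing the print: the counter is still updated, but
 * nothing is written, so console traffic no longer grows with the number
 * of failed I/Os.  Returns 0 (characters written).
 */
static int fail_mirror_quiet(struct mirror_sketch *m)
{
    m->error_count++;
    return 0;
}
```

Calling each variant once per failed I/O gives the same error_count in both
cases; only the noisy variant's output grows with the I/O load.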

Comment 1 Jonathan Earl Brassow 2007-01-22 20:06:49 UTC

Created attachment 146217
Patch to remove print statement

Comment 4 RHEL Program Management 2007-01-30 17:05:26 UTC

This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 6 Jason Baron 2007-02-01 19:35:49 UTC

Committed in stream U5 build 45. A test kernel with this patch is available from
http://people.redhat.com/~jbaron/rhel4/

Comment 9 Red Hat Bugzilla 2007-05-08 04:44:06 UTC

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2007-0304.html