Bug 590763

Summary:

PG_error bit is never cleared, even when a fresh I/O to the page succeeds

Product:

Red Hat Enterprise Linux 5

Reporter:

Jeremy West <jwest>

Component:

kernel

Assignee:

Rik van Riel <riel>

Status:

CLOSED ERRATA

QA Contact:

Barry Donahue <bdonahue>

Severity:

high

Priority:

urgent

Version:

5.6

CC:

bdonahue, bmarzins, coughlan, dhoward, djeffery, jmoyer, jpirko, jwest, lwang, msnitzer, plyons, riel, tao

Keywords:

Reopened, ZStream

Target Release:

---

Hardware:

All

OS:

Linux

Doc Type:

Bug Fix

Doc Text:

Input/output errors can result from temporary failures, such as multipath errors or the loss of network contact with an iSCSI server. In these cases, the virtual memory subsystem retries the readpage() function on the affected page. However, the do_generic_file_read() function did not clear the PG_error flag before retrying, so the system could not use the data in the page cache page even when a subsequent readpage() call succeeded. With this update, do_generic_file_read() clears PG_error before retrying, so the page cache page becomes usable again once a read succeeds after a transient input/output error.

Clone Of:

481371

Environment:

Last Closed:

2011-01-13 21:31:15 UTC

Bug Depends On:

481371

Bug Blocks:

591848, 596334, 599739

Attachments:

Description	Flags
clear PG_error before resubmitting readpage	none

Description Jeremy West 2010-05-10 16:07:30 UTC

+++ This bug was initially created as a clone of Bug #481371 +++

Description of problem:

The problem is that, once the PG_error bit is set for a page cache page, a regular read on the device file will always and forever see an error.  I proposed this patch upstream:
  http://lkml.org/lkml/2009/1/23/288

A simple way around the problem is to mmap the device and read from the
locations that are giving I/O errors (but that's hardly acceptable!).
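
For illustration only, the mmap workaround could look roughly like the sketch below. The helper name read_via_mmap() is hypothetical and not taken from any patch in this bug; it simply maps the page-aligned range that contains the requested bytes and copies them out. Note that if the underlying I/O still fails, touching the mapping raises SIGBUS rather than returning an error, which is one more reason this is not a real fix.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the bug report): copy len bytes starting
 * at offset from path into out, going through an mmap()ed view instead of
 * read(). Returns 0 on success or an errno value on failure.
 * The caller must keep offset + len within the file or device size;
 * touching pages past the end, or pages whose I/O fails, raises SIGBUS.
 */
static int read_via_mmap(const char *path, off_t offset, size_t len,
                         unsigned char *out)
{
    assert(path != NULL && out != NULL && len > 0 && offset >= 0);

    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0)
        return EINVAL;

    /* mmap() offsets must be page aligned, so round down and skip ahead. */
    off_t aligned = offset - (offset % page_size);
    size_t skew = (size_t)(offset - aligned);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return errno;

    int result = 0;
    void *map = mmap(NULL, len + skew, PROT_READ, MAP_SHARED, fd, aligned);
    if (map == MAP_FAILED) {
        result = errno;
    } else {
        memcpy(out, (unsigned char *)map + skew, len);
        munmap(map, len + skew);
    }

    close(fd);
    return result;
}
```

A caller would pass the device path and the byte offset of a sector that previously returned an error, for example read_via_mmap(path, 512, 512, buf), and check for a zero return.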


Version-Release number of selected component (if applicable):
2.4.9-78.30.EL

How reproducible:
100%

Steps to Reproduce:
1. Incur an I/O error by, for example, failing all paths to a device and trying to read from it.
2. Restore the device to working order.
3. Try to read the failed sectors again.
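
A minimal sketch of how steps 1 and 3 might be checked from user space, assuming an ordinary buffered pread() on the device file. The helper name try_buffered_read() is hypothetical; the device path and sector offset are whatever the tester chooses. On an affected kernel the call would be expected to keep returning EIO after the device is restored.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the bug report): read len bytes at offset
 * from path through the normal buffered read path, so the page cache is
 * involved. Returns 0 if the read completed (or hit end of file) and the
 * errno value, such as EIO, if it failed.
 */
static int try_buffered_read(const char *path, off_t offset, size_t len)
{
    assert(path != NULL && len > 0);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return errno;

    char *buf = malloc(len);
    if (buf == NULL) {
        close(fd);
        return ENOMEM;
    }

    int result = 0;
    size_t done = 0;
    while (done < len) {
        ssize_t n = pread(fd, buf + done, len - done, offset + (off_t)done);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted, retry the same range */
            result = errno;     /* EIO is the failure this bug describes */
            break;
        }
        if (n == 0)
            break;              /* end of file or device */
        done += (size_t)n;
    }

    free(buf);
    close(fd);
    return result;
}
```

Calling it once while the paths are failed and once after they are restored, with the same offset, shows whether the stale error persists.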
  
Actual results:
EIO

Expected results:
Read succeeds

Additional info:
See also bug 454872 for one instance where this was seen.

--- Additional comment from jmoyer on 2009-01-23 16:33:18 EST ---

Created an attachment (id=329884)
Clear PG_error before issuing a readpage

This should fix the problem.

--- Additional comment from bmarzins on 2009-01-23 17:46:42 EST ---

I tested both RHEL5 and upstream. This bug also exists in both of them.

--- Additional comment from pm-rhel on 2009-06-05 09:47:00 EDT ---

This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 2 Jeremy West 2010-05-10 16:09:15 UTC

*** Bug 563343 has been marked as a duplicate of this bug. ***

Comment 5 Rik van Riel 2010-05-27 15:49:11 UTC

Created attachment 417293
clear PG_error before resubmitting readpage

Comment 8 Jarod Wilson 2010-06-14 18:23:30 UTC

The patch is included in kernel-2.6.18-203.el5.
You can download this test kernel from http://people.redhat.com/jwilson/el5

Detailed testing feedback is always welcomed.

Comment 10 Douglas Silas 2010-06-28 20:47:40 UTC

Technical note added. If any revisions are required, please edit the "Technical Notes" field
accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
Input/output errors can result from temporary failures, such as multipath errors or the loss of network contact with an iSCSI server. In these cases, the virtual memory subsystem retries the readpage() function on the affected page. However, the do_generic_file_read() function did not clear the PG_error flag before retrying, so the system could not use the data in the page cache page even when a subsequent readpage() call succeeded. With this update, do_generic_file_read() clears PG_error before retrying, so the page cache page becomes usable again once a read succeeds after a transient input/output error.
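
The behaviour described in this note can be illustrated with a small user-space model. This is a sketch only and is not the kernel patch: the model_* names and flag values are invented for the example, and the model assumes that a successful read marks the page up to date only when no error flag is left over from an earlier attempt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Invented flag values for the model; these are not the kernel's bits. */
enum {
    MODEL_PG_ERROR    = 1u << 0,
    MODEL_PG_UPTODATE = 1u << 1
};

struct model_page {
    unsigned flags;
};

struct model_device {
    bool paths_up;  /* false while all paths to the device have failed */
};

/*
 * Stand-in for readpage(). On failure it sets the error flag. On success
 * it marks the page up to date only if no error flag is present (an
 * assumption of this model).
 */
static void model_readpage(const struct model_device *dev,
                           struct model_page *page)
{
    if (!dev->paths_up)
        page->flags |= MODEL_PG_ERROR;
    else if (!(page->flags & MODEL_PG_ERROR))
        page->flags |= MODEL_PG_UPTODATE;
}

/*
 * Stand-in for the buffered read path. With clear_error_first == false it
 * models the old behaviour; with true it models clearing the error flag
 * before resubmitting the read. Returns 0 on success, -1 for an I/O error.
 */
static int model_file_read(const struct model_device *dev,
                           struct model_page *page, bool clear_error_first)
{
    assert(dev != NULL && page != NULL);

    if (!(page->flags & MODEL_PG_UPTODATE)) {
        if (clear_error_first)
            page->flags &= ~(unsigned)MODEL_PG_ERROR;
        model_readpage(dev, page);
    }
    return (page->flags & MODEL_PG_UPTODATE) ? 0 : -1;
}
```

In this model, a page that failed once while paths_up was false keeps failing with clear_error_first set to false even after paths_up becomes true, whereas the same sequence with clear_error_first set to true succeeds on the retry.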

Comment 15 errata-xmlrpc 2011-01-13 21:31:15 UTC

An advisory has been issued which should help resolve the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0017.html