Bug 590763 - PG_error bit is never cleared, even when a fresh I/O to the page succeeds
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.6
Hardware: All
OS: Linux
Priority: urgent
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Rik van Riel
QA Contact: Barry Donahue
Keywords: Reopened, ZStream
Duplicates: 563343
Depends On: 481371
Blocks: 591848 596334 599739
Reported: 2010-05-10 12:07 EDT by Jeremy West
Modified: 2013-01-10 21:58 EST
CC List: 13 users

Doc Type: Bug Fix
Doc Text:
Input/output errors can occur due to temporary failures, such as multipath errors or losing network contact with an iSCSI server. In these cases, virtual memory attempts to retry the readpage() function on the memory page. However, the do_generic_file_read() function did not clear PG_error, which resulted in the system being unable to use the data in the page cache page, even if subsequent readpage() calls succeeded. With this update, the do_generic_file_read() function properly clears PG_error so that the page cache can be utilized in the case of input/output errors.
Clone Of: 481371
Last Closed: 2011-01-13 16:31:15 EST

Attachments
clear PG_error before resubmitting readpage (1.30 KB, patch)
2010-05-27 11:49 EDT, Rik van Riel


External Trackers
Tracker: Red Hat Product Errata
ID: RHSA-2011:0017
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: Red Hat Enterprise Linux 5.6 kernel security and bug fix update
Last Updated: 2011-01-13 05:37:42 EST

Description Jeremy West 2010-05-10 12:07:30 EDT
+++ This bug was initially created as a clone of Bug #481371 +++

Description of problem:

Once the PG_error bit is set on a page cache page, every later regular read
of the device file that touches that page fails with an error, even after the
device has recovered.  I proposed this patch upstream:
  http://lkml.org/lkml/2009/1/23/288

A simple way around the problem is to mmap the device and read from the
locations that are giving I/O errors (but that's hardly acceptable!).
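
For reference, the mmap workaround described above might look roughly like the
sketch below. The helper name read_via_mmap is hypothetical, and the path,
offset, and length are whatever the caller supplies. It only illustrates the
access pattern and is not meant as a proper fix.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy `len` bytes starting at byte `offset` of `path`
 * into `buf` by mapping the file instead of calling read().
 * Returns 0 on success or an errno value on failure.
 */
int read_via_mmap(const char *path, off_t offset, void *buf, size_t len)
{
    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0 || len == 0 || offset < 0)
        return EINVAL;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return errno;

    /* mmap offsets must be page aligned, so map from the enclosing page. */
    off_t aligned = offset - (offset % page_size);
    size_t skew = (size_t)(offset - aligned);
    size_t map_len = skew + len;

    void *map = mmap(NULL, map_len, PROT_READ, MAP_SHARED, fd, aligned);
    if (map == MAP_FAILED) {
        int err = errno;
        close(fd);
        return err;
    }

    /*
     * Touching the mapping faults the pages in. If the I/O still fails,
     * the process receives SIGBUS here instead of getting EIO back.
     */
    memcpy(buf, (const char *)map + skew, len);

    munmap(map, map_len);
    close(fd);
    return 0;
}
```

A caller would pass the device node and the byte offset of a sector that
read() currently rejects. The requested range must lie within the device,
because touching mapped pages that lie wholly past its end also raises SIGBUS.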


Version-Release number of selected component (if applicable):
2.4.9-78.30.EL

How reproducible:
100%

Steps to Reproduce:
1. Incur an I/O error by, for example, failing all paths to a device and trying to read from it.
2. Restore the device to working order.
3. Try to read the failed sectors.
  
Actual results:
EIO

Expected results:
Read succeeds
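
Step 3 can be checked with a small buffered read such as the sketch below.
The helper name try_read is hypothetical. The file is deliberately opened
without O_DIRECT so that the read goes through the page cache, which is where
the stale PG_error bit lives.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper for step 3: read `len` bytes at `offset` through the
 * ordinary buffered read path. Returns 0 if the full read succeeds, the
 * errno value if the read fails (EIO in the case described in this bug),
 * or -1 on a short read.
 */
int try_read(const char *path, off_t offset, void *buf, size_t len)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return errno;

    ssize_t n = pread(fd, buf, len, offset);
    int err = (n < 0) ? errno : 0;  /* save errno before close() can change it */
    close(fd);

    if (n < 0)
        return err;
    return ((size_t)n == len) ? 0 : -1;
}
```

Run it once while the paths are failed and again after the device is restored,
using the same offset both times. On an affected kernel the second call is
expected to keep returning EIO.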

Additional info:
See also bug 454872 for one instance where this was seen.

--- Additional comment from jmoyer@redhat.com on 2009-01-23 16:33:18 EST ---

Created an attachment (id=329884)
Clear PG_error before issuing a readpage

This should fix the problem.

--- Additional comment from bmarzins@redhat.com on 2009-01-23 17:46:42 EST ---

I tested both RHEL5 and upstream. This bug exists in both of them.

--- Additional comment from pm-rhel@redhat.com on 2009-06-05 09:47:00 EDT ---

This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.
Comment 2 Jeremy West 2010-05-10 12:09:15 EDT
*** Bug 563343 has been marked as a duplicate of this bug. ***
Comment 5 Rik van Riel 2010-05-27 11:49:11 EDT
Created attachment 417293
clear PG_error before resubmitting readpage
Comment 8 Jarod Wilson 2010-06-14 14:23:30 EDT
in kernel-2.6.18-203.el5
You can download this test kernel from http://people.redhat.com/jwilson/el5

Detailed testing feedback is always welcomed.
Comment 10 Douglas Silas 2010-06-28 16:47:40 EDT
Technical note added. If any revisions are required, please edit the "Technical Notes" field
accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
Input/output errors can occur due to temporary failures, such as multipath errors or losing network contact with an iSCSI server. In these cases, virtual memory attempts to retry the readpage() function on the memory page. However, the do_generic_file_read() function did not clear PG_error, which resulted in the system being unable to use the data in the page cache page, even if subsequent readpage() calls succeeded. With this update, the do_generic_file_read() function properly clears PG_error so that the page cache can be utilized in the case of input/output errors.
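
The behaviour described in this note can be illustrated with a small userspace
model. This is a toy sketch, not kernel code: the flag values, struct toy_page,
toy_readpage(), and toy_file_read() are all made up for illustration. The model
assumes that I/O completion only marks a page up to date when no error flag is
present.

```c
#include <stdbool.h>

/* Toy stand-ins for the kernel page flags; not the real definitions. */
#define TOY_PG_UPTODATE (1u << 0)
#define TOY_PG_ERROR    (1u << 1)

struct toy_page {
    unsigned int flags;
};

/*
 * Simulated readpage(). A failed read sets the error flag. A successful
 * read marks the page up to date only if no error flag is present
 * (assumption of this model), so a stale flag blocks the update.
 */
void toy_readpage(struct toy_page *page, bool device_ok)
{
    if (!device_ok)
        page->flags |= TOY_PG_ERROR;
    else if (!(page->flags & TOY_PG_ERROR))
        page->flags |= TOY_PG_UPTODATE;
}

/*
 * Simulated file read of one page. Returns 0 on success and -1 to stand
 * in for -EIO. `clear_error_first` selects the patched behaviour.
 */
int toy_file_read(struct toy_page *page, bool device_ok, bool clear_error_first)
{
    if (page->flags & TOY_PG_UPTODATE)
        return 0;                       /* cached data is usable */

    if (clear_error_first)
        page->flags &= ~TOY_PG_ERROR;   /* the step the fix adds */

    toy_readpage(page, device_ok);
    return (page->flags & TOY_PG_UPTODATE) ? 0 : -1;
}
```

In this model, one read against a failed device followed by one against a
recovered device fails both times when clear_error_first is false. When it is
true, the second read succeeds.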
Comment 15 errata-xmlrpc 2011-01-13 16:31:15 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0017.html
