Bug 125178
Summary: | LTC9119-Random page cache corruption when audit is enabled in rhel 3 kernels | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 3 | Reporter: | Peter Martuccelli <peterm> |
Component: | kernel | Assignee: | Peter Martuccelli <peterm> |
Status: | CLOSED ERRATA | QA Contact: | |
Severity: | high | Docs Contact: | |
Priority: | medium | ||
Version: | 3.0 | CC: | bugproxy, ccb, fenlason, khake, kweidner, petrides, riel, tao |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2004-09-02 04:31:45 UTC | Type: | --- |
Description
Peter Martuccelli
2004-06-03 13:28:16 UTC
Subject: [Bug 9119] New: - RH125178-Random page cache corruption when audit is enabled in rhel 3 kernels
Importance: normal
References: <9119.bugzilla.com>
In-Reply-To: <9119.bugzilla.com>
X-Bugzilla-Reason: Reporter
X-Bugzilla-Family: Distro Service
Message-Id: <20040603215825.4048593B67.ibm.com>
Date: Thu, 3 Jun 2004 17:58:25 -0400 (EDT)

Do not reply to this note. It was sent by a machine. Instead append your comments to the bug at the URL below.
https://bugzilla.linux.ibm.com/show_bug.cgi?id=9119

Summary: RH125178-Random page cache corruption when audit is enabled in rhel 3 kernels
Vendor: Red Hat Linux Version: RHEL3 U2
Platform: xSeries
Architecture: All
Submitting Project: Bluefortress
Customer Priority: --
Owning Team: LTC
OSC Acceptance: N/S
Customer Status: N/S
Required Date: 0000-00-00 00:00:00
Target Date: 2000-00-00 00:00:00
Make External: ---
Status: OPEN
Technical Severity: high
Engineer Priority: P2
Component: Kernel
Owner: khoa.com
SubmittedBy: gjlynx.com
QAContact: khoa.com

Opened by Peter Martuccelli (peterm) on 2004-06-03 09:28

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4.2) Gecko/20040301

Description of problem:
Random page cache entries will have a '/' character written to them on page aligned boundaries. Suspected kernel routine is do_realpath in drivers/audit/args.c. Example of corruption when a kernel compilation is used to reproduce the issue follows.
Corrupted file: drivers/char/epca.c

Code snippet from pristine code base:

    pc_callout.name = "cud";
    pc_callout.major = DIGICU_MAJOR;    <== this line is correct
    pc_callout.minor_start = 0;
    pc_callout.init_termios.c_cflag = B9600 | CS8 | CREAD | CLOCAL | HUPCL;
    pc_callout.subtype = SERIAL_TYPE_CALLOUT;

    [peterm@redrum char]$ md5sum epca.c
    b9f50dcc5528a9fe3db522d7c942024c  epca.c

Code snippet from corrupted file:

    pc_callout.name = "cud";
    pc_callout.major = DIGICU/MAJOR;    <== this line has been modified
    pc_callout.minor_start = 0;
    pc_callout.init_termios.c_cflag = B9600 | CS8 | CREAD | CLOCAL | HUPCL;
    pc_callout.subtype = SERIAL_TYPE_CALLOUT;

MD5 sum before reboot: File has been modified.

    [peterm@redrum char]$ md5sum epca.c
    76fab5e6a01648ffd808fd0cde8418b9  epca.c

The page is not marked dirty and is never flushed, so it remains corrupted for the life of the boot. Following a reboot the file reverts back to its original state.

MD5 sum after reboot: File reverts back to pristine version.

    [peterm@redrum char]$ md5sum epca.c
    b9f50dcc5528a9fe3db522d7c942024c  epca.c

Version-Release number of selected component (if applicable):
kernel-2.4.21-15

How reproducible:
Always

Steps to Reproduce:
1. Enable auditing.
2. Recompile the kernel, make modules, etc.
3. Look for compilation errors; investigate compilation failures looking for a '/' inserted into the source code, header file, etc.
4. Reboot and edit the corrupted file a second time; the '/' is no longer present.
5. Depending on the corruption to the page cache entries, you may experience an oops.

Additional info:

------- Additional Comment #1 From Klaus Weidner (klaus) on 2004-06-03 12:28 -------
Note that the addresses where the corruption occurs appear to be the *last* addresses in a page. Probably the code in drivers/audit/args.c:do_realpath is walking backwards too far and inserting a slash before the start of the string, in a memory page that doesn't belong to it.
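Comment #1 suggests that the path is assembled from the end of a buffer toward its start and that the slash lands one byte too early. The do_realpath source is not quoted in this report, so the following is only a minimal C sketch of that class of bug and the check that prevents it. prepend_component and its parameters are hypothetical names, not LAuS code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not the LAuS source): prepend "/<name>" to a path
 * that is being built right-to-left inside buf.
 * *pos is the index of the first byte of the path assembled so far.
 * Returns 0 on success, -1 if the component plus its slash does not fit.
 */
static int prepend_component(char *buf, size_t *pos, const char *name)
{
    assert(buf != NULL && pos != NULL && name != NULL);

    size_t len = strlen(name);

    /*
     * len bytes are needed for the name and 1 more for the leading '/'.
     * Checking only for len would let the '/' be stored at buf[-1],
     * i.e. the last byte of whatever memory precedes the buffer.
     */
    if (*pos < len + 1)
        return -1;

    *pos -= len;
    memcpy(buf + *pos, name, len);   /* copy the component itself */
    *pos -= 1;
    buf[*pos] = '/';                 /* separator goes in front of it */
    return 0;
}
```

If buf starts at a page boundary, an unchecked store to buf[-1] would hit the last byte of the preceding page, which would match the pattern described in the comment above.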
I had three engineers running the patch today with no problems reported. I am moving ahead with getting the patch applied to the RHEL3 kernel.

Thanks Peter. Can you tell me if the other observed issue was tested with this patch (and possibly resolved)? It was the issue of filesystem corruption: after reboot, modified files lose the modification.

----- Additional Comments From khoa.com 2004-06-04 19:22 -------
Since the fix is already available and will be accepted by Red Hat, I'd like to move this bug into FixedAwaitingTest state. Thanks.

----- Additional Comments From khake.com 2004-06-07 18:27 -------
Peter: When can you make this available for the team to use as part of the test case runs? Thanks.

A fix for this problem has just been committed to the RHEL3 U3 patch pool this evening (in kernel version 2.4.21-15.11.EL). Users running LAuS to audit system calls without this fix are at risk of incurring data corruption. You only incur persistent data corruption when the flawed code writes to a dirty page.

An errata has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.
http://rhn.redhat.com/errata/RHBA-2004-433.html

----- Additional Comments From markwiz.com 2004-09-15 11:25 EDT -------
IBM - RHEL3 U3 is available and this bug should be fixed. Please test and post results.
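For retesting, one way to look for this specific signature is to compare a known-good copy of a file with the copy read back under audit load, and count differing bytes that sit at the last offset of a page. The following is a sketch in C, assuming 4 KiB pages and that the caller has already read both copies into memory. The struct and function names are illustrative and do not refer to an existing tool.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4 KiB pages, as on the i686 system in the original report. */
#define ASSUMED_PAGE_SIZE 4096u

struct diff_report {
    size_t total;         /* bytes that differ between the two copies */
    size_t at_page_end;   /* differing bytes at the last offset of a page */
    size_t slash_at_end;  /* of those, bytes that now hold '/' */
};

/* Compare two equal-length buffers and classify every differing byte. */
static struct diff_report scan_for_page_end_slashes(const unsigned char *pristine,
                                                    const unsigned char *suspect,
                                                    size_t len)
{
    struct diff_report r = {0, 0, 0};

    assert(pristine != NULL && suspect != NULL);

    for (size_t off = 0; off < len; off++) {
        if (pristine[off] == suspect[off])
            continue;
        r.total++;
        /* File offset N lives at byte N % page size of its cached page. */
        if (off % ASSUMED_PAGE_SIZE == ASSUMED_PAGE_SIZE - 1) {
            r.at_page_end++;
            if (suspect[off] == '/')
                r.slash_at_end++;
        }
    }
    return r;
}
```

A non-zero slash_at_end would be consistent with the corruption reported here; differences counted only in total point to something else.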