Bug 1437706

Summary: glibc corrupting open mode - misc/mntent_r.c:__setmntent()
Product: Red Hat Enterprise Linux 6 Reporter: Manjunath Patil <manjunath.b.patil>
Component: glibc Assignee: glibc team <glibc-bugzilla>
Status: CLOSED DUPLICATE QA Contact: qe-baseos-tools-bugs
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 6.9 CC: ashankar, codonell, deepak.patel, fweimer, isaac.chen, manjunath.b.patil, mnewsome, pfrankli
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-03-31 01:22:47 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Manjunath Patil 2017-03-31 00:40:22 UTC
Description of problem:
The glibc-rh1012343.patch, which was included in glibc-2.12-1.208.el6,
introduced a possible corruption window. The patch made the following
change:

   char newmode[modelen + 2];
-  memcpy (mempcpy (newmode, mode, modelen), "c", 2);
+  memcpy (mempcpy (newmode, mode, modelen), "ce", 2);
   FILE *result = fopen (file, newmode);

Here the patch appended a second character to newmode without increasing
the size of the array or the length passed to memcpy. The array is
modelen + 2 bytes and only the two bytes 'c' and 'e' are copied, so
newmode has no null character at the end. fopen expects newmode to be a
string and could therefore end up reading more characters than it should.
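
As a rough illustration of the missing terminator (a sketch only, not glibc code): the hypothetical helper seen_mode_length below uses one larger array to stand in for newmode plus whatever bytes happen to follow it in memory, so the effect can be observed without reading out of bounds. The '+' filler is an arbitrary assumption about those neighbouring bytes.

```c
#define _GNU_SOURCE   /* mempcpy is a GNU extension */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of glibc.  The first strlen(mode) + 2
   bytes of region play the role of newmode; the remaining bytes stand
   in for adjacent memory.  copy_len is the length passed to memcpy
   (2 in the RHEL patch, 3 in mainline).  Returns the string length a
   reader such as fopen would see.  */
static size_t
seen_mode_length (char *region, size_t region_size,
                  const char *mode, size_t copy_len)
{
  size_t modelen = strlen (mode);

  assert (copy_len <= sizeof "ce");
  assert (region_size >= modelen + sizeof "ce");

  /* Assumed leftover bytes after newmode: arbitrary filler, then a
     null byte so that the scan stays inside region.  */
  memset (region, '+', region_size - 1);
  region[region_size - 1] = '\0';

  /* Same construction as in __setmntent.  */
  memcpy (mempcpy (region, mode, modelen), "ce", copy_len);

  return strlen (region);
}
```

With mode "r" and an 8-byte region, a copy length of 2 leaves the region holding "rce++++", while a copy length of 3 leaves it holding "rce".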

Version-Release number of selected component (if applicable):
glibc-2.12-1.208.el6

How reproducible:
Intermittent. Starting an Oracle database and then doing a shutdown abort, repeated in a loop more than 100 times with NFS storage, triggers this issue.

A simple test-case to reproduce this issue is on the way.

Steps to Reproduce:
1. Configure an Oracle database with NFS storage
2. Start the database
3. Shutdown abort the database [non-graceful termination]
4. Repeat steps 2 and 3 in a loop about 100 times
5. You will see the database start failing with an EACCES error
[EACCES was due to corruption of the open mode. In this case, the open mode "r" (in the source file) was converted to "rw" (in the strace output).]

Additional info:
This issue is fixed in mainline:
[... glibc.git.mainline]# git blame misc/mntent_r.c | grep newmode
312be3f9 (Ulrich Drepper        2011-11-15 04:24:42 -0500  41)   char newmode[modelen + 3];
312be3f9 (Ulrich Drepper        2011-11-15 04:24:42 -0500  42)   memcpy (mempcpy (newmode, mode, modelen), "ce", 3);
ee8449f7 (Ulrich Drepper        2003-09-04 08:27:37 +0000  43)   FILE *result = fopen (file, newmode);

[... glibc.git.mainline]# git log -n 1 -p 312be3f9 | head
commit 312be3f9f5eab1643d7dcc7728c76d413d4f2640
Author: Ulrich Drepper <drepper>
Date:   Tue Nov 15 04:24:42 2011 -0500

    Clean up internal fopen uses
   
    No need to ever not use c and e.

[... glibc.git.mainline]# git log -n 1 -p 312be3f9 | grep newmode
-  char newmode[modelen + 2];
-  memcpy (mempcpy (newmode, mode, modelen), "c", 2);
+  char newmode[modelen + 3];
+  memcpy (mempcpy (newmode, mode, modelen), "ce", 3);
   FILE *result = fopen (file, newmode);

Once the size of newmode is increased to modelen + 3 (and the memcpy length to 3), we could no longer reproduce the issue with the database test case.
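
The sizing used in the mainline change can be sketched as a standalone helper. build_mode is a hypothetical name used for illustration, not a glibc function, and it assumes the caller supplies the output buffer instead of using a variable-length array.

```c
#define _GNU_SOURCE   /* mempcpy is a GNU extension */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of glibc: writes mode followed by "ce"
   into out as a null-terminated string.  Returns the number of bytes
   written including the terminator, or 0 if out is too small.  */
static size_t
build_mode (char *out, size_t out_size, const char *mode)
{
  assert (out != NULL && mode != NULL);

  size_t modelen = strlen (mode);
  /* modelen bytes for mode, 2 for "ce", 1 for the null terminator.  */
  size_t needed = modelen + 3;

  if (out_size < needed)
    return 0;

  /* sizeof "ce" is 3: 'c', 'e' and the terminating '\0'.  */
  memcpy (mempcpy (out, mode, modelen), "ce", sizeof "ce");
  return needed;
}
```

Passing sizeof "ce" instead of a literal 3 keeps the copy length tied to the suffix, so adding another flag character later cannot silently drop the terminator.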

Comment 2 Carlos O'Donell 2017-03-31 01:22:47 UTC
(In reply to Manjunath Patil from comment #0)
> Once newmode size is increased to 3, we couldn't reproduce the issue with
> database test case.

Manjunath,

Thank you very much for reporting this issue.

In bug 1437147 we acknowledge Oracle as the first reporter via their blog post:
https://bugzilla.redhat.com/show_bug.cgi?id=1437147#c2

I'm going to close this bug as a duplicate of the RHEL 6.9 bug 1437618 (RHEL 6.10 bug 1437147), which has already been fixed and will be released shortly in our RHEL 6.9 batch update.

Please note that our own internal audit revealed that 3 functions were not yet correctly covered for cancellation, and they have been fixed in the upcoming update along with the fix in setmntent() for the missing null terminator.

Cheers,
Carlos.

*** This bug has been marked as a duplicate of bug 1437618 ***