Bug 124624 - mmap use causes kernel panic
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Version: 3.0
Hardware: i686 Linux
Priority: medium
Severity: high
Assigned To: Larry Woodman
Duplicates: 124626
Reported: 2004-05-27 20:47 EDT by ara howard
Modified: 2007-11-30 17:07 EST (History)
Doc Type: Bug Fix
Last Closed: 2004-09-02 00:31:42 EDT
Description ara howard 2004-05-27 20:47:05 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4.2)
Gecko/20040415

Description of problem:
programs which mmap a large file in small chunks (using offset and
length) with read/write access cause a kernel panic

Version-Release number of selected component (if applicable):
2.4.21-15.EL

How reproducible:
Always

Steps to Reproduce:
1. create a large file
  
  [ahoward@harp ahoward]$ dd if=/dev/zero of=1gb bs=8192 count=131072
  131072+0 records in
  131072+0 records out

2. compile this program

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <sys/mman.h>
#include <fcntl.h>


#define TILE_SIZE 1048576

/*
 *
 * ~ > gcc filemap_bug.c -o filemap_bug
 * ~ > filemap_bug big_file
 *
 */

int
main (int argc, char **argv)
{

  int ret;
  char *path;
  struct stat buf;
  off_t size, offset, length, tn;
  int fd;
  void *mem;
  unsigned char *start;
  unsigned char *byte;
  int i;

  if (argc < 2)
    {
      fprintf (stderr, "usage: %s huge_input_file\n", argv[0]);
      return (EXIT_FAILURE);
    }

  path = *(argv + 1);
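  ret = stat (path, &buf);
  if (ret != 0)
    {
      perror ("stat");
      return (EXIT_FAILURE);
    }
  size = buf.st_size;
  fd = open (path, O_RDWR);
  if (fd < 0)
    {
      perror ("open");
      return (EXIT_FAILURE);
    }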

  for (offset = 0, tn = 0; offset < size; offset += TILE_SIZE, tn++)
    {
      length = size - offset;
      length = length > TILE_SIZE ? TILE_SIZE : length;
      /* off_t is not int; cast explicitly so the format matches */
      fprintf (stdout, "<%s>[%ld,%ld] - tile_number <%ld>\n", path,
               (long) offset, (long) length, (long) tn);

      mem = mmap (NULL, length, PROT_READ | PROT_WRITE, MAP_SHARED,
                  fd, offset);
      if (mem == MAP_FAILED)
        {
          perror ("mmap");
          close (fd);
          return (EXIT_FAILURE);
        }
      start = (unsigned char *)mem;
      madvise (start, length, MADV_SEQUENTIAL);
      for (byte = start; byte - start < length; byte++)
        {
          *byte = 42; 
        }
      msync (start, length, MS_SYNC); 
      munmap (start, length);
    }

  close (fd);
  return (EXIT_SUCCESS);
}


3. run the program on created file

  ./a.out 1gb

4. watch the kernel panic (for me around tile 140)


    

Actual Results:  kernel panic

Expected Results:  every byte of the input file == 42

Additional info:
Comment 1 Rik van Riel 2004-05-27 21:42:43 EDT
*** Bug 124626 has been marked as a duplicate of this bug. ***
Comment 2 Rik van Riel 2004-05-27 21:44:50 EDT
Ara, what exactly is the error message you get from the kernel ?

If it contains a null pointer dereference in page_referenced(), a
patch for that got applied to the RHEL code base recently...
Comment 3 ara howard 2004-05-28 08:55:47 EDT
we see something like

  filemap.c:2371 bad pmd c.............

and i __think__ we also saw a screen full of stuff which contained

 ...
 page_referenced()
 ...

but the console server was flaky at that time...

we have done this 4 times and seen the 'bad pmd' error each time.

cheers.
Comment 4 Larry Woodman 2004-06-01 16:08:00 EDT
OK, I can reproduce the problem locally so I'll work on fixing it.

Larry
Comment 5 ara howard 2004-06-01 16:14:51 EDT
great! - please let me know if i can do anything from this end.  we've
got about 160 licensed enterprise boxes here that we use for processing
HUGE files so lack of a working mmap is a real show stopper.
Comment 6 Larry Woodman 2004-06-02 21:29:31 EDT
OK, I think it's fixed.  Please try out this kernel and let me know how
it goes:

http://people.redhat.com/~lwoodman/.RHEL3/


Larry
Comment 7 Larry Woodman 2004-06-08 10:55:22 EDT
Ara, any news on whether this kernel fixes your problems?

Larry
Comment 8 ara howard 2004-06-08 11:20:39 EDT
yes!  sorry i've not gotten back to you - crazy week.  the patch
worked beautifully.  all i've got left is to try it on an smp machine.
i will try to get to that today and get back to you.  thanks very
much for the prompt - and correct! - patch.  any idea what the release
schedule for these things normally is?  our sysads typically only run
'official' kernels... ;-(

cheers.

-a
Comment 9 Larry Woodman 2004-06-08 11:26:52 EDT
It will be included in RHEL3-U3 and that has a mid-August release
date target.

Larry
Comment 10 Ernie Petrides 2004-06-09 00:22:13 EDT
Larry's fix for this problem has just been committed to the RHEL3 U3
patch pool this evening (in kernel version 2.4.21-15.8.EL).
Comment 11 John Flanagan 2004-09-02 00:31:42 EDT
An errata has been issued which should help the problem 
described in this bug report. This report is therefore being 
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files, 
please follow the link below. You may reopen this bug report 
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2004-433.html
