Bug 1181649

Summary: makedumpfile: User process data pages are not excluded appropriately.
Product: Red Hat Enterprise Linux 7 Reporter: Tetsuo Handa <penguin-kernel>
Component: kexec-tools Assignee: kdump team <kdump-team-bugs>
Status: CLOSED CURRENTRELEASE QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 7.2 CC: anderson, bhe, fernando, mhuang, penguin-kernel
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-03-30 08:05:13 UTC Type: Bug
Regression: --- Mount Type: ---

Description Tetsuo Handa 2015-01-13 14:29:22 UTC
Description of problem:

The size of vmcore significantly differs depending on whether user process
data pages are filled with 0 or not. Since the makedumpfile command's -d 31
option excludes both "Pages filled with zero" and "User process data pages",
the size of vmcore should not depend on whether "User process data pages"
are filled with 0 or not.

In RHEL 5, the size of vmcore did not depend on whether "User process data
pages" are filled with 0 or not. Starting from RHEL 6, the size of vmcore
depends on whether "User process data pages" are filled with 0 or not.
Therefore, I suspect that the makedumpfile command is failing to exclude user
process data pages appropriately.

If my suspicion is correct, please fix this problem, because I expect the
-d 31 option to significantly reduce the size of vmcore and to prune
sensitive data from user process data pages.
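
For reference, -d 31 is the sum of makedumpfile's five dump_level bits, which
correspond to the categories shown in its statistics output:

   1  pages filled with zero
   2  cache pages (non-private)
   4  cache pages + private
   8  user process data pages
  16  free pages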


Version-Release number of selected component (if applicable):

kernel-3.10.0-123.13.2.el7 + kexec-tools-2.0.4-32.el7_0.5
kernel-2.6.32-504.3.3.el6 + kexec-tools-2.0.0-280.el6


How reproducible:

Always


Steps to Reproduce:

1. Compile the memory-filling program shown below.

   # gcc -Wall -O3 -o memfill memfill.c

---------- memfill.c start ----------
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
  /* argv[1] = number of bytes to allocate, argv[2] = byte value to fill with */
  char *buf = malloc(atoi(argv[1]));
  if (buf) {
    memset(buf, atoi(argv[2]), atoi(argv[1]));
    /*
     * Emit 'c'; when stdout is redirected to /proc/sysrq-trigger this
     * crashes the kernel (sysrq-c) while the filled buffer is still
     * mapped, so kdump captures it.
     */
    putchar('c');
    fflush(stdout);
  }
  return 0;
}
---------- memfill.c end ----------

2. Edit /etc/kdump.conf to disable the compression option and
   enable report messages.

   core_collector makedumpfile --message-level 16 -d 31

3. Restart the kdump service.

4. Run the following commands; a kdump will be taken if you
   pass an appropriate size to the memfill command.

   # swapoff -a
   # echo 3 > /proc/sys/vm/drop_caches
   # ./memfill 1610612736 0 > /proc/sysrq-trigger

   The first argument of the memfill command is the number of bytes to
   fill and the second argument is the byte value to fill them with.
   The above is an example for a system with 2048 MB of RAM; change the
   first argument depending on the amount of available memory.
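
   (For reference, 1610612736 bytes = 1.5 GiB = 0x60000 4-KiB pages, so the
   "User process data pages" count in the statistics should grow by roughly
   0x60000 pages while memfill is holding the buffer.)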

   Rerun the same commands, with the same size and a different byte.

   # swapoff -a
   # echo 3 > /proc/sys/vm/drop_caches
   # ./memfill 1610612736 1 > /proc/sysrq-trigger

5. Check makedumpfile's messages and the size of vmcore file.


Actual results:

The size of vmcore in the filled-with-0 case is much smaller than
in the filled-with-1 case.

vmcore for filled with 0 case =   130,381,552 bytes
vmcore for filled with 1 case = 1,728,108,672 bytes

---------- messages for filled with 0 case ----------
kdump: saving vmcore
STEP [Excluding unnecessary pages] : 0.017042 seconds
STEP [Excluding unnecessary pages] : 0.011780 seconds
STEP [Copying data               ] : 1.856561 seconds

Original pages  : 0x00000000000711bf
  Excluded pages   : 0x0000000000068d16
    Pages filled with zero  : 0x00000000000615e2
    Cache pages             : 0x00000000000008b6
    Cache pages + private   : 0x0000000000000000
    User process data pages : 0x00000000000030c4
    Free pages              : 0x0000000000003dba
    Hwpoison pages          : 0x0000000000000000
  Remaining pages  : 0x00000000000084a9
  (The number of pages is reduced to 7%.)
Memory Hole     : 0x000000000000ee41
--------------------------------------------------
Total pages     : 0x0000000000080000

kdump: saving vmcore complete
---------- messages for filled with 0 case ----------

---------- messages for filled with 1 case ----------
kdump: saving vmcore
STEP [Excluding unnecessary pages] : 0.014904 seconds
STEP [Excluding unnecessary pages] : 0.013190 seconds
STEP [Copying data               ] : 20.808219 seconds

Original pages  : 0x00000000000711bf
  Excluded pages   : 0x0000000000009959
    Pages filled with zero  : 0x0000000000001d8b
    Cache pages             : 0x0000000000000b35
    Cache pages + private   : 0x0000000000000000
    User process data pages : 0x000000000000360d
    Free pages              : 0x0000000000003a8c
    Hwpoison pages          : 0x0000000000000000
  Remaining pages  : 0x0000000000067866
  (The number of pages is reduced to 91%.)
Memory Hole     : 0x000000000000ee41
--------------------------------------------------
Total pages     : 0x0000000000080000

kdump: saving vmcore complete
---------- messages for filled with 1 case ----------


Expected results:

The size of vmcore in the filled-with-0 case should be nearly equal to
that in the filled-with-1 case.

Comment 2 Dave Anderson 2015-01-14 22:05:58 UTC
I don't maintain makedumpfile,  but I'm a member of the "kdump team"
and noticed this BZ go by.  I do own/maintain the crash utility, and
since your results were certainly surprising, I thought I'd just
try to reproduce it in a current RHEL7 environment.

But running on a 3.10.0-221.el7 kernel with kexec-tools-2.0.7-14.el7,
I could not reproduce this.  

I tested it with the same sequence of commands shown in your description,
and the resultant dumpfile sizes are essentially the same, and the 
excluded page type counts are all in the same ballpark.  I also tried 
it with lzo compression, which is typically much better than using the
traditional zlib compression, and we use it by default now.
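
For reference, lzo is selected by passing -l to makedumpfile, e.g. a
kdump.conf line like the following (the other options here just mirror the
-d 31 / --message-level 16 settings used earlier in this report, and are an
assumption about the exact configuration used for these runs):

  core_collector makedumpfile -l --message-level 16 -d 31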

Here are my results:

Using zlib compression:

memfill 0:
  
  kdump: dump target is /dev/mapper/rhel_dell--per410--01-root
  kdump: saving to /sysroot//var/crash/127.0.0.1-2015.01.14-11:37:50/
  kdump: saving vmcore-dmesg.txt
  kdump: saving vmcore-dmesg.txt complete
  kdump: saving vmcore
  STEP [Excluding unnecessary pages] : 0.117448 seconds
  STEP [Excluding unnecessary pages] : 0.117090 seconds
  STEP [Copying data               ] : 4.732822 seconds
  STEP [Copying data               ] : 0.135796 seconds
  
  Original pages  : 0x00000000003f6511
    Excluded pages   : 0x00000000003d9b0a
      Pages filled with zero  : 0x000000000000b0ae
      Cache pages             : 0x0000000000001e8c
      Cache pages + private   : 0x0000000000000001
      User process data pages : 0x000000000006749e
      Free pages              : 0x0000000000365731
      Hwpoison pages          : 0x0000000000000000
    Remaining pages  : 0x000000000001ca07
    (The number of pages is reduced to 2%.)
  Memory Hole     : 0x0000000000039aef
  --------------------------------------------------
  Total pages     : 0x0000000000430000
  
  kdump: saving vmcore complete


memfill 1:

  kdump: dump target is /dev/mapper/rhel_dell--per410--01-root
  kdump: saving to /sysroot//var/crash/127.0.0.1-2015.01.14-15:36:13/
  kdump: saving vmcore-dmesg.txt
  kdump: saving vmcore-dmesg.txt complete
  kdump: saving vmcore
  STEP [Excluding unnecessary pages] : 0.114320 seconds
  STEP [Excluding unnecessary pages] : 0.113896 seconds
  STEP [Copying data               ] : 4.505355 seconds
  STEP [Copying data               ] : 0.103653 seconds
  
  Original pages  : 0x00000000003f6511
    Excluded pages   : 0x00000000003db0d8
      Pages filled with zero  : 0x000000000000b725
      Cache pages             : 0x0000000000001e34
      Cache pages + private   : 0x0000000000000001
      User process data pages : 0x0000000000066e41
      Free pages              : 0x0000000000366d3d
      Hwpoison pages          : 0x0000000000000000
    Remaining pages  : 0x000000000001b439
    (The number of pages is reduced to 2%.)
  Memory Hole     : 0x0000000000039aef
  --------------------------------------------------
  Total pages     : 0x0000000000430000
  
  kdump: saving vmcore complete

And the two dumpfile sizes from above are relatively close:
  
  $ du -sh 127.0.0.1-2015.01.14-11:37:50/vmcore
  444M	127.0.0.1-2015.01.14-11:37:50/vmcore
  # du -sh 127.0.0.1-2015.01.14-15:36:13/vmcore
  422M	127.0.0.1-2015.01.14-15:36:13/vmcore
  # 
  
Using lzo compression (makedumpfile -l), the compression is
much more effective, but the results are still the same:

memfill 0:
  
  kdump: dump target is /dev/mapper/rhel_dell--per410--01-root
  kdump: saving to /sysroot//var/crash/127.0.0.1-2015.01.14-15:51:59/
  kdump: saving vmcore-dmesg.txt
  kdump: saving vmcore-dmesg.txt complete
  kdump: saving vmcore
  STEP [Excluding unnecessary pages] : 0.115406 seconds
  STEP [Excluding unnecessary pages] : 0.115366 seconds
  STEP [Copying data               ] : 1.130029 seconds
  STEP [Copying data               ] : 0.034387 seconds
  
  Original pages  : 0x00000000003f6511
    Excluded pages   : 0x00000000003da16b
      Pages filled with zero  : 0x000000000000af6c
      Cache pages             : 0x0000000000001e3a
      Cache pages + private   : 0x0000000000000001
      User process data pages : 0x00000000000670c7
      Free pages              : 0x00000000003662fd
      Hwpoison pages          : 0x0000000000000000
    Remaining pages  : 0x000000000001c3a6
    (The number of pages is reduced to 2%.)
  Memory Hole     : 0x0000000000039aef
  --------------------------------------------------
  Total pages     : 0x0000000000430000
  
  kdump: saving vmcore complete
  

memfill 1:

  kdump: dump target is /dev/mapper/rhel_dell--per410--01-root
  kdump: saving to /sysroot//var/crash/127.0.0.1-2015.01.14-15:57:57/
  kdump: saving vmcore-dmesg.txt
  kdump: saving vmcore-dmesg.txt complete
  kdump: saving vmcore
  STEP [Excluding unnecessary pages] : 0.114234 seconds
  STEP [Excluding unnecessary pages] : 0.114295 seconds
  STEP [Copying data               ] : 1.061543 seconds
  STEP [Copying data               ] : 0.036045 seconds
  
  Original pages  : 0x00000000003f6511
    Excluded pages   : 0x00000000003daf44
      Pages filled with zero  : 0x000000000000b6f7
      Cache pages             : 0x0000000000001e2d
      Cache pages + private   : 0x0000000000000042
      User process data pages : 0x0000000000067295
      Free pages              : 0x0000000000366749
      Hwpoison pages          : 0x0000000000000000
    Remaining pages  : 0x000000000001b5cd
    (The number of pages is reduced to 2%.)
  Memory Hole     : 0x0000000000039aef
  --------------------------------------------------
  Total pages     : 0x0000000000430000
  
  kdump: saving vmcore complete
  
Note that the dumpfile size is reduced to about a quarter of the size
of the zlib dumpfiles:
  
  # du -sh 127.0.0.1-2015.01.14-15:51:59/vmcore 127.0.0.1-2015.01.14-15:57:57/vmcore
  98M	127.0.0.1-2015.01.14-15:51:59/vmcore
  95M	127.0.0.1-2015.01.14-15:57:57/vmcore
  #
  

I checked the makedumpfile sources, and I see that user-space,
page cache, and free pages are checked for and filtered first.
Only then are zero-filled pages checked.  So the zero-fill check
should never even "see" the memfill 0 or 1 memory pages, because
they would be recognized as user-space pages first.
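
As a rough illustration of that ordering, the per-pfn logic looks roughly
like the sketch below; the helper names are simplified stand-ins, not the
exact makedumpfile identifiers:

  for (pfn = start_pfn; pfn < end_pfn; pfn++) {
          flags   = read_page_flags(pfn);     /* page.flags   */
          mapping = read_page_mapping(pfn);   /* page.mapping */

          if ((dump_level & DL_EXCLUDE_FREE) && is_free_page(flags))
                  pfn_free++;                 /* free pages        */
          else if ((dump_level & DL_EXCLUDE_CACHE) && is_cache_page(flags, mapping))
                  pfn_cache++;                /* page cache        */
          else if ((dump_level & DL_EXCLUDE_USER_DATA) && isAnon(mapping))
                  pfn_user++;                 /* user process data */
  }

  /*
   * Pages filled with zero are only checked afterwards, while the
   * remaining (non-excluded) pages are being copied into the dumpfile.
   */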

Furthermore, given that the memfill program fills 1610612736 bytes 
(0x60000 pages) with 0/1, the dumpfile statistics should show at 
least that many user process pages.  And in my tests above, the 
user process page counts were as expected, regardless of whether they
were filled with 0 or 1:

      User process data pages : 0x000000000006749e
      User process data pages : 0x0000000000066e41
      User process data pages : 0x00000000000670c7
      User process data pages : 0x0000000000067295

The strange thing about your results for memfill 0 is that
you see 0x615e2 zero-filled pages, and only 0x30c4 user
space pages, which doesn't make sense:
  
> ---------- messages for filled with 0 case ----------
> kdump: saving vmcore
> STEP [Excluding unnecessary pages] : 0.017042 seconds
> STEP [Excluding unnecessary pages] : 0.011780 seconds
> STEP [Copying data               ] : 1.856561 seconds
> 
> Original pages  : 0x00000000000711bf
>   Excluded pages   : 0x0000000000068d16
>     Pages filled with zero  : 0x00000000000615e2
>     Cache pages             : 0x00000000000008b6
>     Cache pages + private   : 0x0000000000000000
>     User process data pages : 0x00000000000030c4
>     Free pages              : 0x0000000000003dba
>     Hwpoison pages          : 0x0000000000000000
>   Remaining pages  : 0x00000000000084a9
>   (The number of pages is reduced to 7%.)
> Memory Hole     : 0x000000000000ee41

It almost looks like the user-space pages were not recognized
as such, and were subsequently recognized as zero-filled pages?

The same thing goes for your memfill 1 case, because again, 
why is the user space page count so low?:

> ---------- messages for filled with 1 case ----------
> kdump: saving vmcore
> STEP [Excluding unnecessary pages] : 0.014904 seconds
> STEP [Excluding unnecessary pages] : 0.013190 seconds
> STEP [Copying data               ] : 20.808219 seconds
> 
> Original pages  : 0x00000000000711bf
>   Excluded pages   : 0x0000000000009959
>     Pages filled with zero  : 0x0000000000001d8b
>     Cache pages             : 0x0000000000000b35
>     Cache pages + private   : 0x0000000000000000
>     User process data pages : 0x000000000000360d
>     Free pages              : 0x0000000000003a8c
>     Hwpoison pages          : 0x0000000000000000
>   Remaining pages  : 0x0000000000067866
>   (The number of pages is reduced to 91%.)
> Memory Hole     : 0x000000000000ee41
> --------------------------------------------------
> Total pages     : 0x0000000000080000
> 
> kdump: saving vmcore complete
 

So it appears that all of your memfill pages were captured in
both dumpfiles instead of being filtered as user-space pages.
And given that, it makes sense that the "memfill 1" dump
would be much larger, because the 0x615e2 zero-fill
pages all share/point to a single zero-filled dumpfile page,
whereas each of the "memfill 1" pages gets its
own compressed page in the dumpfile.
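
As a rough sanity check of those numbers: the two vmcore sizes reported above
differ by 1,728,108,672 - 130,381,552, roughly 1.6 GB, and the difference in
the zero-filled page counts, 0x615e2 - 0x1d8b = 0x5f857 pages, is about
391,000 pages x 4 KiB, also roughly 1.6 GB, so the unfiltered memfill pages
account for essentially the whole size difference.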

Comment 3 Tetsuo Handa 2015-01-15 02:42:45 UTC
Thank you for testing. I confirmed that updating the kexec-tools package
to 2.0.7-13.el7 fixes this problem in RHEL 7.

Note that the results below used the default configuration (i.e. the size
difference is smaller than in #1 because the compression option is enabled).

kexec-tools-2.0.4-32.el7_0.5.x86_64.rpm
33720   /var/crash/127.0.0.1-2015.01.15-10:49:03 (filled with 0 case)
48812   /var/crash/127.0.0.1-2015.01.15-10:50:21 (filled with 1 case)

kexec-tools-2.0.7-13.el7.x86_64.rpm
25288   /var/crash/127.0.0.1-2015.01.15-10:58:48 (filled with 0 case)
25056   /var/crash/127.0.0.1-2015.01.15-10:59:45 (filled with 1 case)

kexec-tools-2.0.0-280.el6.x86_64.rpm
24096   /var/crash/127.0.0.1-2015-01-15-11:32:25 (filled with 0 case)
38196   /var/crash/127.0.0.1-2015-01-15-11:33:23 (filled with 1 case)

Therefore, I'd like to wait for an updated kexec-tools package in RHEL 6.
(Would you change the product selection from RHEL 7 to RHEL 6?)

Comment 4 Dave Anderson 2015-01-15 19:47:34 UTC
> Therefore, I'd like to wait for updated kexec-tools package in RHEL 6.
> (Would you change product selection from RHEL 7 to RHEL 6?)

As I mentioned in comment #1, I am not the makedumpfile maintainer
and therefore I prefer not to get in the way.  I also prefer not to get
involved in bugzilla flag/version modifications.  I would suggest
closing this bugzilla and opening a new RHEL6 bugzilla, including
the updated details that you see on a RHEL6 system.

That being said, I also tried this test on a RHEL6 machine running 
2.6.32-504.el6 along with kexec-tools-2.0.0-280.el6, and I see something
similar:

  66M	/var/crash/127.0.0.1-2015-01-15-11:11:36/vmcore  (memfill 0)
  81M	/var/crash/127.0.0.1-2015-01-15-11:29:50/vmcore  (memfill 1)

The size difference is fairly trivial, but the "memfill 1" dumpfile is 
almost 25% larger, which is kind of interesting.

However, what is more interesting is that in RHEL6, there is the
possibility of user pages getting written to the dumpfile when transparent
hugepages are enabled.  When that happens, I suspect that the user-page
filtering does not work because the page flags used to identify
user pages are only set on the first "head" page, and not on
any of the "tail" pages.

For example, using the memfill program, I can read many of the
user pages in the "malloc" region.  Here 7f10ee800000 and 7f10ee801000
are in that region -- I cannot read the first one, but I can read
the second and subsequent pages in the transparent hugepage:
  
  crash> rd 7f10ee800000
  rd: page excluded: user virtual address: 7f10ee800000  type: "64-bit UVADDR"
  crash> rd 7f10ee801000
      7f10ee801000:  0101010101010101                    ........
  crash> rd 7f10ee802000
      7f10ee802000:  0101010101010101                    ........
  crash> rd 7f10ee803000
      7f10ee803000:  0101010101010101                    ........
  crash>

Makedumpfile recognizes that 7f10ee800000 is a user page by the page flags;
it also has the "head" flag set, because it is the first 4K page
in a 2MB transparent hugepage:
  
  crash> vtop 7f10ee800000
  VIRTUAL     PHYSICAL        
  7f10ee800000  115000000       
  
     PML: 11e1eb7f0 => 11ee1e067
     PUD: 11ee1e218 => 11e7b2067
     PMD: 11e7b2ba0 => 80000001150000e7
    PAGE: 115000000  (2MB)
  
        PTE         PHYSICAL   FLAGS
  80000001150000e7  115000000  (PRESENT|RW|USER|ACCESSED|DIRTY|PSE|NX)
  
        VMA           START       END     FLAGS FILE
  ffff88011db84250 7f10ee6fa000 7f114e6fb000 100073 
  
        PAGE        PHYSICAL      MAPPING       INDEX CNT FLAGS
  ffffea0003c98000 115000000 ffff88011e118fd9 7f10ee800  1 c0000000104068 uptodate,lru,active,head,swapbacked
  crash>

But the subsequent page(s) in the hugepage only have the "tail" flag set,
so the RHEL6 version of makedumpfile (makedumpfile-1.3.5 based) apparently
didn't recognize them as user-space pages:

  crash> vtop 7f10ee801000
  VIRTUAL     PHYSICAL        
  7f10ee801000  115001000       
  
     PML: 11e1eb7f0 => 11ee1e067
     PUD: 11ee1e218 => 11e7b2067
     PMD: 11e7b2ba0 => 80000001150000e7
    PAGE: 115000000  (2MB)
  
        PTE         PHYSICAL   FLAGS
  80000001150000e7  115000000  (PRESENT|RW|USER|ACCESSED|DIRTY|PSE|NX)
  
        VMA           START       END     FLAGS FILE
  ffff88011db84250 7f10ee6fa000 7f114e6fb000 100073 
  
        PAGE        PHYSICAL      MAPPING       INDEX CNT FLAGS
  ffffea0003c98038 115001000                0 7f33ae502  0 c0000000008000 tail
  crash> 

Now, I see the same kernel page-flag behavior in RHEL7, so it would seem
that the version of makedumpfile used in RHEL7 (makedumpfile-1.5.6 based)
has support for recognizing "tail" pages of transparent hugepages.

But a quick check of the sources shows that RHEL6 seems to have
head/tail recognition code in place as well, so I don't know exactly
why it's not recognizing the tail pages.  Maybe there was a patch
to that code area that was only put in RHEL7?  I have no idea.
  
In any case, I leave that to the makedumpfile maintainer to check...

Comment 5 Dave Anderson 2015-01-15 20:53:14 UTC
Correction: regarding my speculation about page.flags, as it
turns out, they are not even used for filtering user pages.

In both the RHEL6 and RHEL7 versions of makedumpfile, the user
page check does not look at page.flags, but rather at the
contents of the page.mapping address:

                /*
                 * Exclude the data page of the user process.
                 */
                else if ((info->dump_level & DL_EXCLUDE_USER_DATA)
                    && isAnon(mapping)) {
                        if (clear_bit_on_2nd_bitmap_for_kernel(pfn, cycle))
                                pfn_user++;
                }

And in both RHEL6 and RHEL7, isAnon() is identical:
  
  static inline int
  isAnon(unsigned long mapping)
  {
          return ((unsigned long)mapping & PAGE_MAPPING_ANON) != 0;
  }

where PAGE_MAPPING_ANON is the same as in the kernel:

  #define PAGE_MAPPING_ANON       (1)

where the 1-bit in the page.mapping address is "borrowed" for use
as a flag.

So looking at the RHEL6 examples in my last comment, you can see
the 1-bit set in the "MAPPING" address of the first 4K page of the
transparent hugepage:

       PAGE        PHYSICAL      MAPPING       INDEX CNT FLAGS
  ffffea0003c98000 115000000 ffff88011e118fd9 7f10ee800  1 c0000000104068 uptodate,lru,active,head,swapbacked 

But the second and subsequent 4K pages in the hugepage have NULL
page.mapping pointers:

        PAGE        PHYSICAL      MAPPING       INDEX CNT FLAGS
  ffffea0003c98038 115001000                0 7f33ae502  0 c0000000008000 tail
  
So at first glance, that would seem to be the reason that the second
and subsequent user pages were not filtered.
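
Concretely: with mapping == 0, the isAnon() check above evaluates
(0 & PAGE_MAPPING_ANON) != 0 to false, so clear_bit_on_2nd_bitmap_for_kernel()
is never called for those tail pages.  They can only be excluded later by the
zero-page check, which is why the reporter's memfill 0 and memfill 1 dumps
differ so much in size.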

However, in RHEL7, the page.mapping values are similar in nature;
for example, here are 7fe54f000000 (head) and 7fe54f001000 (tail):

  crash> vtop 7fe54f000000
  VIRTUAL     PHYSICAL        
  7fe54f000000  3f2800000       
  
     PML: 42293d7f8 => 41f9a0067
     PUD: 41f9a0ca8 => 42288b067
     PMD: 42288b3c0 => 80000003f28000e7
    PAGE: 3f2800000  (2MB)
  
        PTE         PHYSICAL   FLAGS
  80000003f28000e7  3f2800000  (PRESENT|RW|USER|ACCESSED|DIRTY|PSE|NX)
  
        VMA           START       END     FLAGS FILE
  ffff88041f9646c0 7fe52a1e5000 7fe58a1e6000 100073 
  
        PAGE        PHYSICAL      MAPPING       INDEX CNT FLAGS
  ffffea000fca0000 3f2800000 ffff88041f8f4f01 7fe54f000  1 2fffff00084068 uptodate,lru,active,head,swapbacked
  crash> vtop 7fe54f001000
  VIRTUAL     PHYSICAL        
  7fe54f001000  3f2801000       
  
     PML: 42293d7f8 => 41f9a0067
     PUD: 41f9a0ca8 => 42288b067
     PMD: 42288b3c0 => 80000003f28000e7
    PAGE: 3f2800000  (2MB)
  
        PTE         PHYSICAL   FLAGS
  80000003f28000e7  3f2800000  (PRESENT|RW|USER|ACCESSED|DIRTY|PSE|NX)
  
        VMA           START       END     FLAGS FILE
  ffff88041f9646c0 7fe52a1e5000 7fe58a1e6000 100073 
  
        PAGE        PHYSICAL      MAPPING       INDEX CNT FLAGS
  ffffea000fca0040 3f2801000                0        0  0 2fffff00008000 tail
  crash> 
  
But the second and subsequent pages get filtered in RHEL7:

  crash> rd 7fe54f000000
  rd: page excluded: user virtual address: 7fe54f000000  type: "64-bit UVADDR"
  crash> rd 7fe54f001000
  rd: page excluded: user virtual address: 7fe54f001000  type: "64-bit UVADDR"
  crash> rd 7fe54f002000
  rd: page excluded: user virtual address: 7fe54f002000  type: "64-bit UVADDR"
  crash> rd 7fe54f003000
  rd: page excluded: user virtual address: 7fe54f003000  type: "64-bit UVADDR"
  crash> 

So I don't even understand how it works in RHEL7?   ;-(

Comment 6 Dave Anderson 2015-01-16 14:50:22 UTC
> So I don't even understand how it works in RHEL7?   ;-(

As it turns out, it does not work in RHEL7 either...

For a simplified test, I did a kdump with core_collector set to "cp" to
create a copy of /proc/vmcore.  Then I tested makedumpfile with the
vmcore copy.  And as it turns out, it is possible that user pages
may not get filtered when the user data area is assigned a transparent
hugepage.  Here's a RHEL7 example:
  
  crash> sys | grep RELEASE
       RELEASE: 3.10.0-221.el7.x86_64
  crash> rd -u 7fcc14c00000
  rd: page excluded: user virtual address: 7fcc14c00000  type: "64-bit UVADDR"
  crash> rd -u 7fcc14c01000
      7fcc14c01000:  0101010101010101                    ........
  crash> vtop 7fcc14c00000
  VIRTUAL     PHYSICAL        
  7fcc14c00000  40fc00000       
  
     PML: 41d9ee7f8 => 4128b4067
     PUD: 4128b4980 => 423ac0067
     PMD: 423ac0530 => 800000040fc000e7
    PAGE: 40fc00000  (2MB)
  
        PTE         PHYSICAL   FLAGS
  800000040fc000e7  40fc00000  (PRESENT|RW|USER|ACCESSED|DIRTY|PSE|NX)
  
        VMA           START       END     FLAGS FILE
  ffff880412ae9e60 7fcc14b2a000 7fcc74b2b000 100073 
  
        PAGE        PHYSICAL      MAPPING       INDEX CNT FLAGS
  ffffea00103f0000 40fc00000 ffff8800cab0d581 7fcc14c00  1 2fffff00084068 uptodate,lru,active,head,swapbacked
  crash> vtop 7fcc14c01000
  VIRTUAL     PHYSICAL        
  7fcc14c01000  40fc01000       
  
     PML: 41d9ee7f8 => 4128b4067
     PUD: 4128b4980 => 423ac0067
     PMD: 423ac0530 => 800000040fc000e7
    PAGE: 40fc00000  (2MB)
  
        PTE         PHYSICAL   FLAGS
  800000040fc000e7  40fc00000  (PRESENT|RW|USER|ACCESSED|DIRTY|PSE|NX)
  
        VMA           START       END     FLAGS FILE
  ffff880412ae9e60 7fcc14b2a000 7fcc74b2b000 100073 
  
        PAGE        PHYSICAL      MAPPING       INDEX CNT FLAGS
  ffffea00103f0040 40fc01000                0        2  0 2fffff00008000 tail
  crash> 


In theory, this issue has been addressed in the upstream version of
makedumpfile:

 commit e8b4f93b3260defe86f5e13ca7536c07f2e32914
 Author: Atsushi Kumagai <kumagai-atsushi.nec.co.jp>
 Date:   Thu Aug 21 08:55:54 2014 +0900

    [PATCH v4] Exclude unnecessary hugepages.

    There are 2 types of hugepages in the kernel, the both should be
    excluded as user pages.
   
    1. Transparent huge pages (THP)
    All the pages are anonymous pages (at least for now), so we should
    just get how many pages are in the corresponding hugepage.
    It can be gotten from the page->lru.prev of the second page in the
    hugepage.
   
    2. Hugetlbfs pages
    The pages aren't anonymous pages but kind of user pages, we should
    exclude also these pages in any way.
    Luckily, it's possible to detect these pages by looking the
    page->lru.next of the second page in the hugepage. This idea came
    from the kernel's PageHuge().
    The number of pages can be gotten in the same way as THP.

    Changelog:
    v4:
      - Cleaned up according to Petr's and Baoquan's comments.
    v3:
      - Cleaned up according to Petr's comments.
      - Fix misdetection of hugetlb pages.
    v2:
      - Rebased to "Generic multi-page exclusion".

    Signed-off-by: Atsushi Kumagai <kumagai-atsushi.nec.co.jp>
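
A rough sketch of the approach described in that commit message (the member
and helper names here are illustrative, not the exact makedumpfile-1.5.7
code):

   /*
    * On the head page of a compound (huge) page, look at the second page:
    * its lru.prev holds the compound order (i.e. how many 4K pages the
    * hugepage spans), and its lru.next identifies hugetlbfs pages, as in
    * the kernel's PageHuge().  The whole 2^order range can then be
    * excluded at once, head and tail pages alike.
    */
   if (is_compound_head(flags)) {
           second   = page + SIZE(page);
           order    = read_member(second, "lru.prev");
           dtor     = read_member(second, "lru.next");
           nr_pages = 1UL << order;

           if (isAnon(mapping)                      /* THP            */
               || dtor == SYMBOL(free_huge_page))   /* hugetlbfs page */
                   exclude_range(pfn, pfn + nr_pages);
   }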

Comment 7 Dave Anderson 2015-01-16 15:18:42 UTC
> In theory, this issue has been addressed in the upstream version of
> makedumpfile:
>
> commit e8b4f93b3260defe86f5e13ca7536c07f2e32914
> Author: Atsushi Kumagai <kumagai-atsushi.nec.co.jp>
> Date:   Thu Aug 21 08:55:54 2014 +0900

FYI, I built and tested the upstream version of makedumpfile 
from git://git.code.sf.net/p/makedumpfile/code, and verified
that all 4k pages in a transparent hugepage are filtered.

Comment 8 Dave Anderson 2015-01-16 16:08:24 UTC
And the git commit above is included in the makedumpfile-1.5.7 release
at http://sourceforge.net/projects/makedumpfile/files/makedumpfile/1.5.7

Comment 9 Dave Anderson 2015-01-20 02:46:49 UTC
The most recent RHEL7 version of kexec-tools is kexec-tools-2.0.7-15.el7, which
was built on 1/13/15 and has been updated to include makedumpfile-1.5.7.  So
this issue will be fixed in the RHEL7.1 kexec-tools errata.

Comment 10 Tetsuo Handa 2015-01-20 04:56:27 UTC
(In reply to Dave Anderson from comment #9)
> The most recent RHEL7 version of kexec-tools is kexec-tools-2.0.7-15.el7,
> which
> was built on 1/13/15, has been updated to include makedumpfile-1.5.7.  So
> this issue will be fixed in the RHEL7.1 kexec-tools errata.

I see. This issue will be fixed in the RHEL7.1 GA release.
Thank you for your time.

  kexec-tools-2.0.4-32.el7_0.5.x86_64.rpm includes
  makedumpfile: version 1.5.4 (released on 3 Jul 2013)

  kexec-tools-2.0.7-13.el7.x86_64.rpm includes
  makedumpfile: version 1.5.7 (released on 18 Sep 2014)

Now, I'd like to wait for a fix for RHEL 6. Should I open a new entry?

Comment 11 Dave Anderson 2015-01-20 19:49:25 UTC
> Now, I'd like to wait for a fix for RHEL 6. Should I open a new entry?

Yes.

Comment 12 Minfei Huang 2015-01-21 03:29:52 UTC
(In reply to Dave Anderson from comment #11)
> > Now, I'd like to wait for a fix for RHEL 6. Should I open a new entry?
> 
> Yes.

Hi, everyone.

This issue is the same as bz1068674, and we plan to solve it in RHEL 6.7.

Comment 14 Baoquan He 2015-03-30 08:05:13 UTC
This is a hugepage filtering bug and it has been fixed in RHEL 7.1, so I am closing this as CURRENTRELEASE. Please add a comment or reopen it if there is any concern.