Bug 611398

Summary: makedumpfile: reset_bitmap_of_free_pages: The free list is broken
Product: Red Hat Enterprise Linux 6
Component: kexec-tools
Version: 6.0
Hardware: i686
OS: Linux
Status: CLOSED DUPLICATE
Severity: high
Priority: high
Target Milestone: rc
Reporter: Qian Cai <qcai>
Assignee: Cong Wang <amwang>
QA Contact: Han Pingtian <phan>
CC: nhorman, phan, rkhan
Doc Type: Bug Fix
Last Closed: 2010-07-14 14:07:20 UTC

Description Qian Cai 2010-07-05 06:15:16 UTC
Description of problem:
makedumpfile is unable to exclude free pages from the kdump vmcore of a RHEL6 i686 guest (running on an x86_64 host).

# makedumpfile -D -d 16 vmcore vmcore.16
LOAD (0)
  phys_start : 1000
  phys_end   : 9f400
  virt_start : c0001000
  virt_end   : c009f400
LOAD (1)
  phys_start : 100000
  phys_end   : 2000000
  virt_start : c0100000
  virt_end   : c2000000
LOAD (2)
  phys_start : a000000
  phys_end   : 38000000
  virt_start : ca000000
  virt_end   : f8000000
LOAD (3)
  phys_start : 38000000
  phys_end   : 3fffb000
  virt_start : ffffffff
  virt_end   : 107ffafff
Linux kdump
page_size    : 4096

max_mapnr    : 3fffb

PAE          : ON
kernel_start : c0000000
vmalloc_start: 1

num of NODEs : 1


Memory type  : SPARSEMEM

mem_map (0)
  mem_map    : c17d0000
  pfn_start  : 0
  pfn_end    : 20000
mem_map (1)
  mem_map    : c1bd0000
  pfn_start  : 20000
  pfn_end    : 3fffb
Excluding unnecessary pages        : [100 %] reset_bitmap_of_free_pages: The free list is broken.
create_2nd_bitmap: Can't exclude unnecessary pages.

makedumpfile Failed.
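
The numbers in the debug output above are consistent with each other, which suggests the ELF headers and page size were read correctly before the free-list walk failed. A minimal sketch of that cross-check, assuming max_mapnr is simply the highest phys_end converted to a page frame number (the helper names here are illustrative, not makedumpfile functions):

```python
# Values copied from the `makedumpfile -D` output of the guest vmcore.
PAGE_SIZE = 4096             # page_size
HIGHEST_PHYS_END = 0x3fffb000  # phys_end of LOAD (3)
SECTION_PAGES = 0x20000      # pfn_end - pfn_start of mem_map (0)


def paddr_to_pfn(paddr, page_size=PAGE_SIZE):
    """Convert a physical address to a page frame number (hypothetical helper)."""
    return paddr // page_size


def section_size_mib(section_pages=SECTION_PAGES, page_size=PAGE_SIZE):
    """Size in MiB of the memory covered by one full mem_map section."""
    return section_pages * page_size // (1024 * 1024)


max_mapnr = paddr_to_pfn(HIGHEST_PHYS_END)
print(hex(max_mapnr))       # matches the reported max_mapnr (3fffb)
print(section_size_mib())   # each full SPARSEMEM section covers 512 MiB
```

The last section, mem_map (1), ends at pfn 3fffb rather than 40000 because physical memory ends before the section boundary.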

Version-Release number of selected component (if applicable):
both host and guest:
kernel-2.6.32-42.el6
kexec-tools-2.0.0-96.el6

How reproducible:
always

Comment 1 Qian Cai 2010-07-05 06:25:38 UTC
x86_64 has no such problem.

Comment 3 Qian Cai 2010-07-05 06:28:09 UTC
host:
qemu-kvm-0.12.1.2-2.90.el6.x86_64

Both host and guest were using the RHEL6.0-20100701.3 tree, if that matters.

Comment 4 Han Pingtian 2010-07-05 07:54:04 UTC
I have reproduced this problem on bare metal with kernel 2.6.32-42.el6.i686 and kexec-tools 2.0.0-94.el6.i686:

# makedumpfile -D  -d 16 vmcore vmcore.small
LOAD (0)
  phys_start : 10000
  phys_end   : 9d800
  virt_start : c0010000
  virt_end   : c009d800
LOAD (1)
  phys_start : 100000
  phys_end   : 2000000
  virt_start : c0100000
  virt_end   : c2000000
LOAD (2)
  phys_start : a000000
  phys_end   : 38000000
  virt_start : ca000000
  virt_end   : f8000000
LOAD (3)
  phys_start : 38000000
  phys_end   : bf30b000
  virt_start : ffffffffffffffff
  virt_end   : 8730afff
LOAD (4)
  phys_start : bf3dd000
  phys_end   : bf3de000
  virt_start : ffffffffffffffff
  virt_end   : fff
LOAD (5)
  phys_start : bf657000
  phys_end   : bf800000
  virt_start : ffffffffffffffff
  virt_end   : 1a8fff
LOAD (6)
  phys_start : 100000000
  phys_end   : 640000000
  virt_start : ffffffffffffffff
  virt_end   : 53fffffff
Linux kdump
page_size    : 4096

max_mapnr    : 640000

PAE          : ON
kernel_start : c0000000
vmalloc_start: 1

num of NODEs : 1


Memory type  : SPARSEMEM

mem_map (0)
  mem_map    : c17cb000
  pfn_start  : 0
  pfn_end    : 20000
mem_map (1)
  mem_map    : c1bcb000
  pfn_start  : 20000
  pfn_end    : 40000
mem_map (2)
  mem_map    : ca000000
  pfn_start  : 40000
  pfn_end    : 60000
mem_map (3)
  mem_map    : ca400000
  pfn_start  : 60000
  pfn_end    : 80000
mem_map (4)
  mem_map    : ca800000
  pfn_start  : 80000
  pfn_end    : a0000
mem_map (5)
  mem_map    : cac00000
  pfn_start  : a0000
  pfn_end    : c0000
mem_map (6)
  mem_map    : 0
  pfn_start  : c0000
  pfn_end    : e0000
mem_map (7)
  mem_map    : 0
  pfn_start  : e0000
  pfn_end    : 100000
mem_map (8)
  mem_map    : cb000000
  pfn_start  : 100000
  pfn_end    : 120000
mem_map (9)
  mem_map    : cb400000
  pfn_start  : 120000
  pfn_end    : 140000
mem_map (10)
  mem_map    : cb800000
  pfn_start  : 140000
  pfn_end    : 160000
mem_map (11)
  mem_map    : cbc00000
  pfn_start  : 160000
  pfn_end    : 180000
mem_map (12)
  mem_map    : cc000000
  pfn_start  : 180000
  pfn_end    : 1a0000
mem_map (13)
  mem_map    : cc400000
  pfn_start  : 1a0000
  pfn_end    : 1c0000
mem_map (14)
  mem_map    : cc800000
  pfn_start  : 1c0000
  pfn_end    : 1e0000
mem_map (15)
  mem_map    : ccc00000
  pfn_start  : 1e0000
  pfn_end    : 200000
mem_map (16)
  mem_map    : cd000000
  pfn_start  : 200000
  pfn_end    : 220000
mem_map (17)
  mem_map    : cd400000
  pfn_start  : 220000
  pfn_end    : 240000
mem_map (18)
  mem_map    : cd800000
  pfn_start  : 240000
  pfn_end    : 260000
mem_map (19)
  mem_map    : cdc00000
  pfn_start  : 260000
  pfn_end    : 280000
mem_map (20)
  mem_map    : ce000000
  pfn_start  : 280000
  pfn_end    : 2a0000
mem_map (21)
  mem_map    : ce400000
  pfn_start  : 2a0000
  pfn_end    : 2c0000
mem_map (22)
  mem_map    : ce800000
  pfn_start  : 2c0000
  pfn_end    : 2e0000
mem_map (23)
  mem_map    : cec00000
  pfn_start  : 2e0000
  pfn_end    : 300000
mem_map (24)
  mem_map    : cf000000
  pfn_start  : 300000
  pfn_end    : 320000
mem_map (25)
  mem_map    : cf400000
  pfn_start  : 320000
  pfn_end    : 340000
mem_map (26)
  mem_map    : cf800000
  pfn_start  : 340000
  pfn_end    : 360000
mem_map (27)
  mem_map    : cfc00000
  pfn_start  : 360000
  pfn_end    : 380000
mem_map (28)
  mem_map    : d0000000
  pfn_start  : 380000
  pfn_end    : 3a0000
mem_map (29)
  mem_map    : d0400000
  pfn_start  : 3a0000
  pfn_end    : 3c0000
mem_map (30)
  mem_map    : d0800000
  pfn_start  : 3c0000
  pfn_end    : 3e0000
mem_map (31)
  mem_map    : d0c00000
  pfn_start  : 3e0000
  pfn_end    : 400000
mem_map (32)
  mem_map    : 0
  pfn_start  : 400000
  pfn_end    : 420000
mem_map (33)
  mem_map    : 0
  pfn_start  : 420000
  pfn_end    : 440000
mem_map (34)
  mem_map    : 0
  pfn_start  : 440000
  pfn_end    : 460000
mem_map (35)
  mem_map    : 0
  pfn_start  : 460000
  pfn_end    : 480000
mem_map (36)
  mem_map    : 0
  pfn_start  : 480000
  pfn_end    : 4a0000
mem_map (37)
  mem_map    : 0
  pfn_start  : 4a0000
  pfn_end    : 4c0000
mem_map (38)
  mem_map    : 0
  pfn_start  : 4c0000
  pfn_end    : 4e0000
mem_map (39)
  mem_map    : 0
  pfn_start  : 4e0000
  pfn_end    : 500000
mem_map (40)
  mem_map    : 0
  pfn_start  : 500000
  pfn_end    : 520000
mem_map (41)
  mem_map    : 0
  pfn_start  : 520000
  pfn_end    : 540000
mem_map (42)
  mem_map    : 0
  pfn_start  : 540000
  pfn_end    : 560000
mem_map (43)
  mem_map    : 0
  pfn_start  : 560000
  pfn_end    : 580000
mem_map (44)
  mem_map    : 0
  pfn_start  : 580000
  pfn_end    : 5a0000
mem_map (45)
  mem_map    : 0
  pfn_start  : 5a0000
  pfn_end    : 5c0000
mem_map (46)
  mem_map    : 0
  pfn_start  : 5c0000
  pfn_end    : 5e0000
mem_map (47)
  mem_map    : 0
  pfn_start  : 5e0000
  pfn_end    : 600000
mem_map (48)
  mem_map    : 0
  pfn_start  : 600000
  pfn_end    : 620000
mem_map (49)
  mem_map    : 0
  pfn_start  : 620000
  pfn_end    : 640000
Excluding unnecessary pages        : [100 %] reset_bitmap_of_free_pages: The free list is broken.
create_2nd_bitmap: Can't exclude unnecessary pages.

makedumpfile Failed.
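
The LOAD entries in this output fall into two groups: segments below physical address 38000000 have virt_start = phys_start + kernel_start, while the segments above it report an all-ones virt_start. The sketch below checks that arithmetic against the values printed above. It assumes, without confirmation from this report, that the all-ones value is a "no direct kernel mapping" marker and that virt_end is computed as virt_start plus the segment length with 64-bit wraparound; the function names are hypothetical.

```python
KERNEL_START = 0xc0000000     # kernel_start from the output
NO_VADDR = 0xffffffffffffffff  # assumed "no virtual address" marker

# (phys_start, phys_end, virt_start, virt_end) copied from LOAD (0)..(6).
SEGMENTS = [
    (0x10000, 0x9d800, 0xc0010000, 0xc009d800),
    (0x100000, 0x2000000, 0xc0100000, 0xc2000000),
    (0xa000000, 0x38000000, 0xca000000, 0xf8000000),
    (0x38000000, 0xbf30b000, NO_VADDR, 0x8730afff),
    (0xbf3dd000, 0xbf3de000, NO_VADDR, 0xfff),
    (0xbf657000, 0xbf800000, NO_VADDR, 0x1a8fff),
    (0x100000000, 0x640000000, NO_VADDR, 0x53fffffff),
]


def is_direct_mapped(phys_start, virt_start):
    """True if the segment sits at the fixed kernel_start offset."""
    return virt_start == phys_start + KERNEL_START


def expected_virt_end(phys_start, phys_end, virt_start, bits=64):
    """virt_start plus segment length, truncated to `bits` bits."""
    return (virt_start + (phys_end - phys_start)) & ((1 << bits) - 1)


for phys_start, phys_end, virt_start, virt_end in SEGMENTS:
    mapped = is_direct_mapped(phys_start, virt_start)
    consistent = expected_virt_end(phys_start, phys_end, virt_start) == virt_end
    print(hex(phys_start), "direct-mapped" if mapped else "unmapped", consistent)
```

Under these assumptions, the odd-looking virt_end values of LOAD (3) through LOAD (6) are a by-product of the all-ones virt_start wrapping around, not independent corruption.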

Comment 5 Han Pingtian 2010-07-05 08:00:13 UTC
However, 'makedumpfile -d 1' and 'makedumpfile -d 3' succeed. I will try other dump levels.

Comment 6 Han Pingtian 2010-07-05 08:27:35 UTC
I can reproduce this problem on ppc64:

[root@ibm-js22-07 127.0.0.1-2010-07-05-03:03:00]# makedumpfile -d 16 vmcore vmcore.16 
Excluding unnecessary pages        : [100 %] exclude_free_page: Can't get necessary symbols for excluding free pages.
create_2nd_bitmap: Can't exclude unnecessary pages.

makedumpfile Failed.

Comment 7 Qian Cai 2010-07-05 08:33:26 UTC
(In reply to comment #6)
> I can reproduce this problem on ppc64:
> 
> [root@ibm-js22-07 127.0.0.1-2010-07-05-03:03:00]# makedumpfile -d 16 vmcore
> vmcore.16 
> Excluding unnecessary pages        : [100 %] exclude_free_page: Can't get
> necessary symbols for excluding free pages.
> create_2nd_bitmap: Can't exclude unnecessary pages.
> 
> makedumpfile Failed.    

This looks like a separate issue. Perhaps we need to forward-port another patch from RHEL5 (bug 465396).

Comment 8 Han Pingtian 2010-07-05 08:52:09 UTC
It seems this problem doesn't exist on x86_64.

Comment 9 Han Pingtian 2010-07-05 09:43:38 UTC
makedumpfile works just fine with '1', '2', '4' and '8', but fails with '16' and '31'.
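
This pattern matches how the -d dump level is documented in the makedumpfile man page: the level is a bitmask, and bit 16 is the one that requests free-page exclusion. A small sketch decoding the levels tried in this report (the helper names are illustrative, not part of makedumpfile):

```python
# Dump level bits as documented in the makedumpfile man page.
DUMP_LEVEL_BITS = {
    1: "zero pages",
    2: "cache pages without private data",
    4: "cache pages with private data",
    8: "user process data pages",
    16: "free pages",
}


def excluded_page_types(level):
    """Return the page types a given -d level asks makedumpfile to exclude."""
    if not 0 <= level <= 31:
        raise ValueError("dump level must be between 0 and 31")
    return [name for bit, name in DUMP_LEVEL_BITS.items() if level & bit]


def excludes_free_pages(level):
    """True if the level includes the free-page bit (16)."""
    return bool(level & 16)


# Levels tried in comments 5 and 9.
for level in (1, 2, 3, 4, 8, 16, 31):
    print(level, excludes_free_pages(level), excluded_page_types(level))
```

Only levels 16 and 31 include the free-page bit, which is consistent with the failure coming from reset_bitmap_of_free_pages and with the other levels succeeding.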

Comment 10 Cong Wang 2010-07-13 10:25:52 UTC
I tried the latest kexec-tools, -112, and can no longer see this bug. I did see that -99 has this problem, so it is probably Neil's patch for bug 611654 that fixes it.

Please verify.

Comment 11 Han Pingtian 2010-07-14 09:34:28 UTC
Confirmed that -115.el6 no longer has this problem.

Comment 12 Neil Horman 2010-07-14 14:07:20 UTC

*** This bug has been marked as a duplicate of bug 611654 ***