Bug 770000 - KVM kdump out of puff!
Summary: KVM kdump out of puff!
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kexec-tools
Version: 6.2
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Dave Young
QA Contact: Guangze Bai
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2011-12-22 23:17 UTC by Kevin W. Rudd
Modified: 2015-02-08 21:37 UTC (History)

Fixed In Version: kexec-tools-2.0.0-248.el6
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-02-21 07:47:27 UTC
Target Upstream Version:


Attachments
Full console output from failed kdump (8.75 KB, application/x-bzip2)
2011-12-22 23:17 UTC, Kevin W. Rudd
sosreport from test guest (583.88 KB, application/x-xz)
2011-12-22 23:20 UTC, Kevin W. Rudd
exclude virtio_balloon module (521 bytes, patch)
2012-08-14 06:58 UTC, Dave Young
function cleanup (1.19 KB, patch)
2012-08-16 01:53 UTC, Dave Young
exclude virtio_balloon (1.26 KB, patch)
2012-08-16 01:53 UTC, Dave Young


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2013:0281 normal SHIPPED_LIVE kexec-tools bug fix and enhancement update 2013-02-20 20:37:00 UTC

Description Kevin W. Rudd 2011-12-22 23:17:37 UTC
Created attachment 549252
Full console output from failed kdump

Virtio_balloon huffs and puffs, but can't seem to blow that kdump down.

Kdump fails on a KVM guest if balloon memory is involved.

Setting up a KVM test guest with 3G of memory (both Current and Max)
works just fine.  But in my test, if Max is increased to 4G while
Current stays at 3G, kdump fails with OOM errors:

...
Loading snd-seq.ko module
virtio_balloon virtio2: Out of puff! Can't get 256 pages
insmod invoked oom-killer: gfp_mask=0xd0, order=0, oom_adj=0,
oom_score_adj=0
insmod cpuset=/ mems_allowed=0
Pid: 231, comm: insmod Not tainted 2.6.32-220.2.1.el6.x86_64 #1
Call Trace:
 [<ffffffff810c2ad1>] ? cpuset_print_task_mems_allowed+0x91/0xb0
 [<ffffffff81113850>] ? dump_header+0x90/0x1b0
 [<ffffffff8120d79c>] ? security_real_capable_noaudit+0x3c/0x70
 [<ffffffff81113cda>] ? oom_kill_process+0x8a/0x2c0
 [<ffffffff81113c11>] ? select_bad_process+0xe1/0x120
 [<ffffffff81114130>] ? out_of_memory+0x220/0x3c0
 [<ffffffff81123e4e>] ? __alloc_pages_nodemask+0x89e/0x940
 [<ffffffff8115dbe2>] ? kmem_getpages+0x62/0x170
 [<ffffffff8115e7fa>] ? fallback_alloc+0x1ba/0x270
 [<ffffffff8115e24f>] ? cache_grow+0x2cf/0x320
 [<ffffffff8115e579>] ? ____cache_alloc_node+0x99/0x160
 [<ffffffff8115f35b>] ? kmem_cache_alloc+0x11b/0x190
 [<ffffffff811ec6f2>] ? sysfs_new_dirent+0x42/0x130
 [<ffffffff811eb85c>] ? sysfs_add_file_mode+0x3c/0xb0
 [<ffffffff811eeb71>] ? internal_create_group+0xc1/0x1a0
 [<ffffffff811eec83>] ? sysfs_create_group+0x13/0x20
 [<ffffffff8108da83>] ? module_param_sysfs_setup+0x93/0xc0
 [<ffffffff810ad04d>] ? mod_sysfs_setup+0x5d/0xd0
 [<ffffffff810aef5a>] ? load_module+0x187a/0x1ca0
 [<ffffffff810abde0>] ? setup_modinfo_srcversion+0x0/0x30
 [<ffffffff810af3fb>] ? sys_init_module+0x7b/0x250
 [<ffffffff8100b0f2>] ? system_call_fastpath+0x16/0x1b
Mem-Info:
Node 0 DMA per-cpu:
CPU    0: hi:    0, btch:   1 usd:   0
Node 0 DMA32 per-cpu:
CPU    0: hi:   42, btch:   7 usd:  11
active_anon:36 inactive_anon:4 isolated_anon:0
 active_file:0 inactive_file:0 isolated_file:0
 unevictable:3066 dirty:0 writeback:0 unstable:0
 free:454 slab_reclaimable:1000 slab_unreclaimable:4249
 mapped:50 shmem:0 pagetables:8 bounce:0
Node 0 DMA free:388kB min:4kB low:4kB high:4kB active_anon:0kB
inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB
isolated(anon):0kB isolated(file):0kB present:400kB mlocked:0kB
dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB
slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB
bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
lowmem_reserve[]: 0 126 126 126
Node 0 DMA32 free:1428kB min:1428kB low:1784kB high:2140kB
active_anon:144kB inactive_anon:16kB active_file:0kB inactive_file:0kB
unevictable:12264kB isolated(anon):0kB isolated(file):0kB
present:129192kB mlocked:0kB dirty:0kB writeback:0kB mapped:200kB
shmem:0kB slab_reclaimable:4000kB slab_unreclaimable:16996kB
kernel_stack:352kB pagetables:32kB unstable:0kB bounce:0kB
writeback_tmp:0kB pages_scanned:3002 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
Node 0 DMA: 2*4kB 1*8kB 1*16kB 1*32kB 1*64kB 0*128kB 1*256kB 0*512kB
0*1024kB 0*2048kB 0*4096kB = 384kB
Node 0 DMA32: 1*4kB 0*8kB 1*16kB 0*32kB 0*64kB 1*128kB 1*256kB 0*512kB
1*1024kB 0*2048kB 0*4096kB = 1428kB
3066 total pagecache pages
0 pages in swap cache
Swap cache stats: add 0, delete 0, find 0/0
Free swap  = 0kB
Total swap = 0kB
45306 pages RAM
16550 pages reserved
74 pages shared
25312 pages non-shared
[ pid ]   uid  tgid total_vm      rss cpu oom_adj oom_score_adj name
[  231]     0   231      309       62   0       0             0 insmod
Out of memory: Kill process 231 (insmod) score 1 or sacrifice child
Killed process 231, UID 0, (insmod) total-vm:1236kB, anon-rss:132kB,
file-rss:116kB
virtio_balloon virtio2: Out of puff! Can't get 256 pages
virtio_balloon virtio2: Out of puff! Can't get 256 pages
virtio_balloon virtio2: Out of puff! Can't get 256 pages
KILL
...

virsh dominfo shows the state before and after kdump starts:

Before:

virsh # dominfo RHEL6
Id:             26
Name:           RHEL6
UUID:           6b9dc03a-2a7b-f93c-5a2b-42f5254e1958
OS Type:        hvm
State:          running
CPU(s):         1
CPU time:       14.8s
Max memory:     4194304 kB
Used memory:    3145728 kB
Persistent:     yes
Autostart:      disable
Managed save:   no

After:

virsh # dominfo RHEL6
Id:             25
Name:           RHEL6
UUID:           6b9dc03a-2a7b-f93c-5a2b-42f5254e1958
OS Type:        hvm
State:          running
CPU(s):         1
CPU time:       513.2s
Max memory:     4194304 kB
Used memory:    4119008 kB
Persistent:     yes
Autostart:      disable
Managed save:   no

It's as if the balloon driver is almost able to get there (but falls a
bit short).  If you can get to virsh quickly enough and bump the memory
to the Max with setmem, the kdump process will actually complete
(depending on which modules got the axe earlier in the process).
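
For reference, the gap between the two values in the "After" dominfo output can be worked out with plain shell arithmetic. This is only a sketch using the figures quoted above; the variable names are mine, and the 4 kB page size is an assumption (the usual base page size on x86_64), not something shown in the log.

```shell
#!/bin/bash
# Figures copied from the "After" virsh dominfo output above (kB).
max_kb=4194304
used_after_kb=4119008

# Assumption: 4 kB pages, the usual x86_64 base page size.
page_kb=4

# Difference between Max memory and Used memory once kdump has started.
gap_kb=$(( max_kb - used_after_kb ))

# Size of a single "Can't get 256 pages" balloon request.
request_kb=$(( 256 * page_kb ))

echo "gap between Max and Used: ${gap_kb} kB"
echo "one balloon request:      ${request_kb} kB"

# The manual rescue described above would be run on the host, for example:
#   virsh setmem RHEL6 4194304
# (not executed here, since it needs a libvirt host with this guest defined)
```

With the numbers above, the gap is 75296 kB and each failed balloon request is 1024 kB.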

Comment 1 Kevin W. Rudd 2011-12-22 23:20:45 UTC
Created attachment 549253
sosreport from test guest

$ cat sosreport-kvmtest-20111222142325-2c5c.tar.xz.md5
19c054d1e718cfece8aeb3d9a5432c5c

Comment 3 Curtis Taylor 2011-12-22 23:48:12 UTC
Try adding:

    blacklist virtio_balloon

to /etc/kdump.conf, then run "service kdump restart" to rebuild the kdump initrd.
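
A minimal sketch of applying that workaround without adding the line twice. The add_blacklist helper is hypothetical (it is not part of kexec-tools), and the demonstration runs against a scratch file rather than the live /etc/kdump.conf.

```shell
#!/bin/bash
# add_blacklist is a hypothetical helper, not something shipped by kexec-tools.
# It appends "blacklist <module>" to the given config file unless that exact
# line is already present.
add_blacklist() {
    local conf=$1 module=$2
    # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
    if ! grep -qxF "blacklist ${module}" "$conf" 2>/dev/null; then
        echo "blacklist ${module}" >> "$conf"
    fi
}

# Demonstrate on a scratch file instead of the real /etc/kdump.conf.
conf=$(mktemp)
add_blacklist "$conf" virtio_balloon
add_blacklist "$conf" virtio_balloon   # second call changes nothing
cat "$conf"
rm -f "$conf"
```

On a real guest this would be pointed at /etc/kdump.conf as root, followed by "service kdump restart" as described above.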

Comment 4 Kevin W. Rudd 2011-12-23 00:12:22 UTC
Thanks.  That workaround works fine in my test case.

Comment 6 Ademar Reis 2012-04-30 18:26:28 UTC
There's a workaround and this is not critical, so moving to 6.4. Maybe /etc/kdump.conf should blacklist virtio_balloon by default; I'm not sure.

Comment 7 RHEL Product and Program Management 2012-07-10 08:08:29 UTC
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.

Comment 8 RHEL Product and Program Management 2012-07-10 23:31:55 UTC
This request was erroneously removed from consideration in Red Hat Enterprise Linux 6.4, which is currently under development.  This request will be evaluated for inclusion in Red Hat Enterprise Linux 6.4.

Comment 13 Dave Young 2012-08-14 06:58:36 UTC
Created attachment 604172
exclude virtio_balloon module

Hi, 

Could you try the attached patch and see if it resolves the problem?

Comment 15 Kevin W. Rudd 2012-08-15 17:20:34 UTC
The patch works fine with one minor edit:

virtio-balloon needs to be virtio_balloon

For the base mkdumprd in RHEL6.3, this is what I ended up testing:

# diff -au mkdumprd.orig mkdumprd
--- mkdumprd.orig       2012-08-15 09:40:44.009504958 -0700
+++ mkdumprd    2012-08-15 10:19:28.203734588 -0700
@@ -2577,7 +2577,7 @@
     _base_name=$(basename $MODULE)
     if [[ "$_base_name" =~ "snd" ]] || [[ "$_base_name" =~ "soundcore" ]] ||
     [[ "$_base_name" =~ "cfg80211" ]] || [[ "$_base_name" =~ "mac80211" ]] ||
-    [[ "$_base_name" =~ "iwl" ]]; then
+    [[ "$_base_name" =~ "iwl" ]] || [[ "$_base_name" =~ "virtio_balloon" ]]; then
         MODULES=${MODULES/$MODULE/}
         continue
     fi
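
To show why the underscore matters, here is a stand-alone sketch of the same filtering idea. It is not the actual mkdumprd function: the module paths are made up for illustration, filter_modules is a hypothetical name, and only two of the patterns from the diff are used.

```shell
#!/bin/bash
# Made-up module list for illustration; the paths are not from a real system.
MODULES="/lib/modules/test/virtio_blk.ko /lib/modules/test/virtio_balloon.ko /lib/modules/test/snd-seq.ko /lib/modules/test/ext4.ko"

# filter_modules is a hypothetical name, not the function in mkdumprd.
# It drops every module whose file name contains one of the listed strings.
filter_modules() {
    local module base
    # $MODULES is expanded once here, so editing it inside the loop is safe.
    for module in $MODULES; do
        base=$(basename "$module")
        # A quoted right-hand side of =~ is matched as a literal substring,
        # so "virtio-balloon" would never match virtio_balloon.ko.
        if [[ "$base" =~ "snd" ]] || [[ "$base" =~ "virtio_balloon" ]]; then
            MODULES=${MODULES/$module/}
        fi
    done
}

filter_modules
# Unquoted on purpose: word splitting collapses the gaps left by removed entries.
echo $MODULES
```

Only virtio_blk.ko and ext4.ko remain in the list afterwards. Note that the match is a substring test, so any module whose file name contains "snd" is dropped as well.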

Comment 16 Dave Young 2012-08-16 01:53:04 UTC
Created attachment 604740
function cleanup

Comment 17 Dave Young 2012-08-16 01:53:55 UTC
Created attachment 604741
exclude virtio_balloon

Comment 18 Dave Young 2012-08-16 01:59:25 UTC
(In reply to comment #15)
> The patch works fine with one minor edit:
> 
> virtio-balloon needs to be virtio_balloon
> 
> For the base mkdumprd in RHEL6.3, this is what I ended up testing:
> 
> # diff -au mkdumprd.orig mkdumprd
> --- mkdumprd.orig       2012-08-15 09:40:44.009504958 -0700
> +++ mkdumprd    2012-08-15 10:19:28.203734588 -0700
> @@ -2577,7 +2577,7 @@
>      _base_name=$(basename $MODULE)
>      if [[ "$_base_name" =~ "snd" ]] || [[ "$_base_name" =~ "soundcore" ]] ||
>      [[ "$_base_name" =~ "cfg80211" ]] || [[ "$_base_name" =~ "mac80211" ]] ||
> -    [[ "$_base_name" =~ "iwl" ]]; then
> +    [[ "$_base_name" =~ "iwl" ]] || [[ "$_base_name" =~ "virtio_balloon" ]]; then
>          MODULES=${MODULES/$MODULE/}
>          continue
>      fi

Hi, thanks for the reminder. I have cleaned up the function a bit and changed the name to virtio_balloon. I have verified in a KVM guest that the virtio_balloon .ko is excluded.

It would be good if you could retest it. Thanks again.

Comment 19 Kevin W. Rudd 2012-08-20 21:17:56 UTC
The latest modifications work fine.  Thanks.

Comment 23 errata-xmlrpc 2013-02-21 07:47:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-0281.html