Bug 1130660 - Host installation fails if kdump integration is selected and no crashkernel memory is reserved
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-host-deploy
Classification: oVirt
Component: Plugins.kdump
Version: master
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 1.3.0
Assignee: Martin Perina
QA Contact: Pavel Stehlik
URL:
Whiteboard: infra
Depends On:
Blocks: 1079821
Reported: 2014-08-15 20:10 UTC by Adam Litke
Modified: 2016-02-10 19:32 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-09-03 14:32:28 UTC
oVirt Team: Infra


Attachments
host deploy log (236.52 KB, text/x-log)
2014-08-15 20:10 UTC, Adam Litke

Description Adam Litke 2014-08-15 20:10:31 UTC
Created attachment 927267
host deploy log

Description of problem:

When installing a new host, if Kdump integration is selected but the host has not been configured with kdump reserved memory, host installation fails because the kdump service will not start.


Version-Release number of selected component (if applicable):

ovirt-engine-3.3_beta1-5029-geddafc8

How reproducible:

Always

Steps to Reproduce:
1. On the Hosts page, click New
2. In the Power Management section, ensure Kdump Integration is checked
3. Choose other settings as normal
4. Click OK

Actual results:

"Host installation failed. Command returned failure code 1 during SSH session" appears in webadmin Events.  

The host deploy log shows: 
2014-08-15 16:01:10 DEBUG otopi.plugins.otopi.services.systemd plugin.execute:446 execute-output: ('/bin/systemctl', 'start', 'kdump.service') stderr:
Job for kdump.service failed. See 'systemctl status kdump.service' and 'journalctl -xn' for details.

systemctl status kdump.service shows:
Aug 15 15:16:39 picket-fence.alitke.net kdumpctl[14084]: No memory reserved for crash kernel.
Aug 15 15:16:39 picket-fence.alitke.net kdumpctl[14084]: Starting kdump: [FAILED]
Aug 15 15:16:39 picket-fence.alitke.net systemd[1]: kdump.service: main process exited, code=exited, status=1/FAILURE
Aug 15 15:16:39 picket-fence.alitke.net systemd[1]: Failed to start Crash recovery kernel arming.
Aug 15 15:16:39 picket-fence.alitke.net systemd[1]: Unit kdump.service entered failed state.


Expected results:
According to http://www.ovirt.org/Fence_kdump#Testing_scenarios installation should succeed but a warning should appear in the webadmin.
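The expected behaviour described above (deployment succeeds, kdump is skipped, a warning is raised) can be illustrated with a minimal Python sketch. This is not the ovirt-host-deploy implementation: the function name plan_kdump_step and the warning text are hypothetical, and the only input is the kernel command line as a string.

```python
from typing import Optional, Tuple

# Hypothetical warning text, chosen for illustration only.
NO_CRASHKERNEL_WARNING = (
    "kdump integration requested, but no crashkernel memory is reserved; "
    "skipping kdump configuration"
)


def plan_kdump_step(cmdline: str) -> Tuple[bool, Optional[str]]:
    """Decide whether to start kdump and which warning, if any, to report.

    Returns (start_kdump, warning). The deployment itself is never failed
    here; a missing reservation only produces a warning.
    """
    prefix = "crashkernel="
    has_reservation = any(
        token.startswith(prefix) and len(token) > len(prefix)
        for token in cmdline.split()
    )
    if has_reservation:
        return True, None
    return False, NO_CRASHKERNEL_WARNING
```

With a command line that lacks a crashkernel option, such as the one on the affected host, this sketch returns (False, <warning>) rather than attempting to start kdump.service and failing.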

Additional info:

Comment 1 Martin Perina 2014-08-19 13:21:31 UTC
Hi Adam,

I'm unable to reproduce this error with either ovirt-engine-3.5.0-0.0.master.20140804172041.git23b558e (the beta1 RPM is no longer available in the repo) or the latest master, on RHEL6 or F20 hosts. I'm pretty sure we tested this, so it's quite strange that it happened on your setup. Do you still have the setup? If so, could you please post:

1) Content of /proc/cmdline file

2) Result of the following commands
     touch /etc/kdump.conf
     kdumpctl restart

3) Version of ovirt-host-deploy package


Thanks

Comment 2 Adam Litke 2014-08-25 15:47:38 UTC
This reproduces all of the time for me.  Here is the requested info:

1) cat /proc/cmdline 
BOOT_IMAGE=/vmlinuz-3.15.10-200.fc20.x86_64 root=/dev/mapper/fedora_lager-root ro rd.lvm.lv=fedora_lager/root vconsole.font=latarcyrheb-sun16 rd.lvm.lv=fedora_lager/swap rhgb quiet

2)
sudo touch /etc/kdump.conf ; echo $?
0
$ sudo kdumpctl restart
Memory for crashkernel is not reserved
Please reserve memory by passing "crashkernel=X@Y" parameter to the kernel
kexec: failed to unloaded kdump kernel
Stopping kdump: [FAILED]
No kdump initial ramdisk found.
Rebuilding /boot/initramfs-3.15.10-200.fc20.x86_64kdump.img
No memory reserved for crash kernel.
Starting kdump: [FAILED]

3) Not sure.  I am using ovirt-engine-3.5 branch with a development installation.

Comment 3 Martin Perina 2014-08-25 18:01:00 UTC
(In reply to Adam Litke from comment #2)
> This reproduces all of the time for me.  Here is the requested info:
> 
> 1) cat /proc/cmdline 
> BOOT_IMAGE=/vmlinuz-3.15.10-200.fc20.x86_64
> root=/dev/mapper/fedora_lager-root ro rd.lvm.lv=fedora_lager/root
> vconsole.font=latarcyrheb-sun16 rd.lvm.lv=fedora_lager/swap rhgb quiet
> 
> 2)
> sudo touch /etc/kdump.conf ; echo $?
> 0
> $ sudo kdumpctl restart
> Memory for crashkernel is not reserved
> Please reserve memory by passing "crashkernel=X@Y" parameter to the kernel
> kexec: failed to unloaded kdump kernel
> Stopping kdump: [FAILED]
> No kdump initial ramdisk found.
> Rebuilding /boot/initramfs-3.15.10-200.fc20.x86_64kdump.img
> No memory reserved for crash kernel.
> Starting kdump: [FAILED]

This is correct because the crashkernel option is not present in /proc/cmdline.

> 
> 3) Not sure.  I am using ovirt-engine-3.5 branch with a development
> installation.

An engine built in a development environment uses the ovirt-host-deploy-* and otopi-* RPM packages when deploying a host (the host part of the "host deploy" process is not contained in the engine git repo, but in the ovirt-host-deploy-* RPM packages).
Please execute:
  rpm -q ovirt-host-deploy

To be 100% sure, you can check whether the method _crashkernel_param_present() exists in /usr/share/ovirt-host-deploy/plugins/ovirt-host-deploy/kdump/packages.py. This method is responsible for testing for the presence of the crashkernel option.
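For readers unfamiliar with this kind of check, here is a minimal Python sketch of testing a kernel command line for a crashkernel option. It is an illustration only and is not taken from packages.py: the names crashkernel_param_present and read_cmdline, and the regular expression, are assumptions.

```python
import re

# Hypothetical pattern: a crashkernel=<value> token at the start of the
# command line or preceded by whitespace, with a non-empty value.
_CRASHKERNEL_RE = re.compile(r"(?:^|\s)crashkernel=\S+")


def crashkernel_param_present(cmdline: str) -> bool:
    """Return True if the command line contains crashkernel=<value>."""
    return _CRASHKERNEL_RE.search(cmdline) is not None


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the running kernel's command line (Linux only)."""
    with open(path, encoding="utf-8") as f:
        return f.read().strip()
```

On a Linux host, crashkernel_param_present(read_cmdline()) gives the answer for the running kernel. Applied to the command line posted in comment 2, it returns False, which matches the "No memory reserved for crash kernel." message from kdumpctl.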

Thanks

Comment 4 Adam Litke 2014-08-25 18:35:45 UTC
ovirt-host-deploy-1.3.0-0.0.master.fc20.noarch

Comment 5 Martin Perina 2014-08-25 18:51:32 UTC
(In reply to Adam Litke from comment #4)
> ovirt-host-deploy-1.3.0-0.0.master.fc20.noarch

This is most probably a manual build from git. Could you please install the latest ovirt-host-deploy and ovirt-host-deploy-java from the official oVirt 3.5 repo [1] and try to deploy the host again? If it fails for you even with the newest RPM, could you please also post the host deploy log (the latest file from ${DEVENV_HOME}/var/log/ovirt-engine/host-deploy/; the exact name is mentioned in the Events tab in the UI)?

Thanks a lot

[1] http://resources.ovirt.org/pub/ovirt-3.5-pre/rpm/fc20

Comment 6 Adam Litke 2014-09-03 14:32:28 UTC
Yes, it seems to be fixed upstream in ovirt-host-deploy.  Closing.

