
Bug 1130660

Summary: Host installation fails if kdump integration is selected and no crashkernel memory is reserved
Product: [oVirt] ovirt-host-deploy
Reporter: Adam Litke <alitke>
Component: Plugins.kdump
Assignee: Martin Perina <mperina>
Status: CLOSED CURRENTRELEASE
QA Contact: Pavel Stehlik <pstehlik>
Severity: medium
Docs Contact:
Priority: unspecified
Version: master
CC: alitke, alonbl, bazulay, bugs, dougsland, ecohen, gklein, iheim, mperina, yeylon
Target Milestone: ---
Keywords: Triaged
Target Release: 1.3.0
Hardware: Unspecified
OS: Unspecified
Whiteboard: infra
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-09-03 14:32:28 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Infra
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1079821
Attachments:
Description        Flags
host deploy log    none

Description Adam Litke 2014-08-15 20:10:31 UTC
Created attachment 927267
host deploy log

Description of problem:

When installing a new host, if Kdump integration is selected but the host has not been configured with kdump reserved memory, host installation fails because the kdump service will not start.


Version-Release number of selected component (if applicable):

ovirt-engine-3.3_beta1-5029-geddafc8

How reproducible:

Always

Steps to Reproduce:
1. On the Hosts page, select New
2. In the Power Management section, ensure Kdump Integration is checked
3. Choose other settings as normal
4. Click OK

Actual results:

"Host installation failed. Command returned failure code 1 during SSH session" appears in webadmin Events.  

The host deploy log shows: 
2014-08-15 16:01:10 DEBUG otopi.plugins.otopi.services.systemd plugin.execute:446 execute-output: ('/bin/systemctl', 'start', 'kdump.service') stderr:
Job for kdump.service failed. See 'systemctl status kdump.service' and 'journalctl -xn' for details.

systemctl status kdump.service shows:
Aug 15 15:16:39 picket-fence.alitke.net kdumpctl[14084]: No memory reserved for crash kernel.
Aug 15 15:16:39 picket-fence.alitke.net kdumpctl[14084]: Starting kdump: [FAILED]
Aug 15 15:16:39 picket-fence.alitke.net systemd[1]: kdump.service: main process exited, code=exited, status=1/FAILURE
Aug 15 15:16:39 picket-fence.alitke.net systemd[1]: Failed to start Crash recovery kernel arming.
Aug 15 15:16:39 picket-fence.alitke.net systemd[1]: Unit kdump.service entered failed state.


Expected results:
According to http://www.ovirt.org/Fence_kdump#Testing_scenarios, installation should succeed, but a warning should appear in the webadmin.
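
The expected behaviour can be sketched roughly as follows. This is only an illustration and not the ovirt-host-deploy code: the function names crash_memory_reserved() and should_start_kdump() are hypothetical, and the sketch assumes the kernel exposes the reserved crash memory size (in bytes) in /sys/kernel/kexec_crash_size. The idea is to warn and skip starting kdump.service instead of letting the failed start abort the whole deployment.

```python
import logging
from pathlib import Path

# Assumption: the running kernel reports the reserved crashkernel size here.
KEXEC_CRASH_SIZE = "/sys/kernel/kexec_crash_size"


def crash_memory_reserved(sysfs_file=KEXEC_CRASH_SIZE):
    """Return True if the kernel reports a non-zero crashkernel reservation."""
    try:
        return int(Path(sysfs_file).read_text().strip()) > 0
    except (OSError, ValueError):
        # Missing or unreadable file: treat it as "nothing reserved".
        return False


def should_start_kdump(reserved, log=logging.getLogger(__name__)):
    """Hypothetical helper: decide whether starting kdump.service makes sense.

    Logs a warning instead of failing when no memory is reserved.
    """
    if not reserved:
        log.warning(
            "No memory reserved for crash kernel; "
            "skipping kdump start, deployment continues"
        )
        return False
    return True
```

A caller would use it as should_start_kdump(crash_memory_reserved()) and only issue the service start when the result is True.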

Additional info:

Comment 1 Martin Perina 2014-08-19 13:21:31 UTC
Hi Adam,

I'm unable to reproduce this error with either ovirt-engine-3.5.0-0.0.master.20140804172041.git23b558e (the beta1 RPM is no longer available in the repo) or the latest master, on RHEL6 or F20 hosts. I'm pretty sure we tested this, so it's quite strange that it happened on your setup. Do you still have the setup? If so, could you please post:

1) Contents of the /proc/cmdline file

2) Output of the following commands:
     touch /etc/kdump.conf
     kdumpctl restart

3) Version of the ovirt-host-deploy package


Thanks

Comment 2 Adam Litke 2014-08-25 15:47:38 UTC
This reproduces all of the time for me.  Here is the requested info:

1) cat /proc/cmdline 
BOOT_IMAGE=/vmlinuz-3.15.10-200.fc20.x86_64 root=/dev/mapper/fedora_lager-root ro rd.lvm.lv=fedora_lager/root vconsole.font=latarcyrheb-sun16 rd.lvm.lv=fedora_lager/swap rhgb quiet

2)
sudo touch /etc/kdump.conf ; echo $?
0
$ sudo kdumpctl restart
Memory for crashkernel is not reserved
Please reserve memory by passing "crashkernel=X@Y" parameter to the kernel
kexec: failed to unloaded kdump kernel
Stopping kdump: [FAILED]
No kdump initial ramdisk found.
Rebuilding /boot/initramfs-3.15.10-200.fc20.x86_64kdump.img
No memory reserved for crash kernel.
Starting kdump: [FAILED]

3) Not sure.  I am using ovirt-engine-3.5 branch with a development installation.

Comment 3 Martin Perina 2014-08-25 18:01:00 UTC
(In reply to Adam Litke from comment #2)
> This reproduces all of the time for me.  Here is the requested info:
> 
> 1) cat /proc/cmdline 
> BOOT_IMAGE=/vmlinuz-3.15.10-200.fc20.x86_64
> root=/dev/mapper/fedora_lager-root ro rd.lvm.lv=fedora_lager/root
> vconsole.font=latarcyrheb-sun16 rd.lvm.lv=fedora_lager/swap rhgb quiet
> 
> 2)
> sudo touch /etc/kdump.conf ; echo $?
> 0
> $ sudo kdumpctl restart
> Memory for crashkernel is not reserved
> Please reserve memory by passing "crashkernel=X@Y" parameter to the kernel
> kexec: failed to unloaded kdump kernel
> Stopping kdump: [FAILED]
> No kdump initial ramdisk found.
> Rebuilding /boot/initramfs-3.15.10-200.fc20.x86_64kdump.img
> No memory reserved for crash kernel.
> Starting kdump: [FAILED]

This is expected, because the crashkernel option is not present in /proc/cmdline.

> 
> 3) Not sure.  I am using ovirt-engine-3.5 branch with a development
> installation.

An engine built in a development environment uses the ovirt-host-deploy-* and otopi-* RPM packages when deploying a host (the host part of the "host deploy" process is not contained in the engine git repo, but in the ovirt-host-deploy-* RPM packages).
Please execute:
  rpm -q ovirt-host-deploy

To be 100% sure, you can check whether the method _crashkernel_param_present() exists in /usr/share/ovirt-host-deploy/plugins/ovirt-host-deploy/kdump/packages.py. This method is responsible for testing for the presence of the crashkernel option.
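
For illustration only, a check of this kind could look roughly like the sketch below. It is not the actual body of _crashkernel_param_present(); the function names and the exact matching rule (any non-empty crashkernel=... token) are assumptions.

```python
def crashkernel_param_present(cmdline):
    """Return True if the kernel command line has a non-empty crashkernel= option."""
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == "crashkernel" and sep and value:
            return True
    return False


def read_cmdline(path="/proc/cmdline"):
    """Read the kernel command line of the running system."""
    with open(path, encoding="ascii", errors="replace") as f:
        return f.read()
```

With the command line posted in comment 2, which has no crashkernel option, crashkernel_param_present() returns False; on a host it would be called as crashkernel_param_present(read_cmdline()).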

Thanks

Comment 4 Adam Litke 2014-08-25 18:35:45 UTC
ovirt-host-deploy-1.3.0-0.0.master.fc20.noarch

Comment 5 Martin Perina 2014-08-25 18:51:32 UTC
(In reply to Adam Litke from comment #4)
> ovirt-host-deploy-1.3.0-0.0.master.fc20.noarch

This is most probably a manual build from git. Could you please install the latest ovirt-host-deploy and ovirt-host-deploy-java from the official oVirt 3.5 repo [1] and try to deploy the host again? If it still fails with the newest RPMs, could you please also post the host deploy log (the latest file in ${DEVENV_HOME}/var/log/ovirt-engine/host-deploy/; the exact name is mentioned in the Events tab in the UI)?

Thanks a lot

[1] http://resources.ovirt.org/pub/ovirt-3.5-pre/rpm/fc20

Comment 6 Adam Litke 2014-09-03 14:32:28 UTC
Yes, it seems to be fixed upstream in ovirt-host-deploy.  Closing.