Created attachment 927267
host deploy log

Description of problem:
When installing a new host, if Kdump integration is selected but the host has not been configured with kdump reserved memory, host installation fails because the kdump service will not start.

Version-Release number of selected component (if applicable):
ovirt-engine-3.3_beta1-5029-geddafc8

How reproducible:
Always

Steps to Reproduce:
1. On the hosts page, select new
2. In the Power Management section, Ensure Kdump Integration is checked
3. Choose other settings as normal
4. Click OK

Actual results:
"Host installation failed. Command returned failure code 1 during SSH session" appears in webadmin Events.

The host deploy log shows:
2014-08-15 16:01:10 DEBUG otopi.plugins.otopi.services.systemd plugin.execute:446 execute-output: ('/bin/systemctl', 'start', 'kdump.service') stderr:
Job for kdump.service failed. See 'systemctl status kdump.service' and 'journalctl -xn' for details.

systemctl status kdump.service shows:
Aug 15 15:16:39 picket-fence.alitke.net kdumpctl[14084]: No memory reserved for crash kernel.
Aug 15 15:16:39 picket-fence.alitke.net kdumpctl[14084]: Starting kdump: [FAILED]
Aug 15 15:16:39 picket-fence.alitke.net systemd[1]: kdump.service: main process exited, code=exited, status=1/FAILURE
Aug 15 15:16:39 picket-fence.alitke.net systemd[1]: Failed to start Crash recovery kernel arming.
Aug 15 15:16:39 picket-fence.alitke.net systemd[1]: Unit kdump.service entered failed state.

Expected results:
According to http://www.ovirt.org/Fence_kdump#Testing_scenarios installation should succeed but a warning should appear in the webadmin.

Additional info:
Hi Adam,

I'm unable to reproduce this error, either with ovirt-engine-3.5.0-0.0.master.20140804172041.git23b558e (the beta1 RPM is no longer available in the repo) or with latest master, on RHEL6 or F20 hosts. I'm pretty sure we tested this, so it's quite strange that it happened on your setup. Do you still have the setup? If so, could you please post:

1) Content of the /proc/cmdline file
2) Result of the following commands:
   touch /etc/kdump.conf
   kdumpctl restart
3) Version of the ovirt-host-deploy package

Thanks
This reproduces all of the time for me. Here is the requested info:

1) cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-3.15.10-200.fc20.x86_64 root=/dev/mapper/fedora_lager-root ro rd.lvm.lv=fedora_lager/root vconsole.font=latarcyrheb-sun16 rd.lvm.lv=fedora_lager/swap rhgb quiet

2)
sudo touch /etc/kdump.conf ; echo $?
0
$ sudo kdumpctl restart
Memory for crashkernel is not reserved
Please reserve memory by passing "crashkernel=X@Y" parameter to the kernel
kexec: failed to unloaded kdump kernel
Stopping kdump: [FAILED]
No kdump initial ramdisk found.
Rebuilding /boot/initramfs-3.15.10-200.fc20.x86_64kdump.img
No memory reserved for crash kernel.
Starting kdump: [FAILED]

3) Not sure. I am using ovirt-engine-3.5 branch with a development installation.
(In reply to Adam Litke from comment #2)
> This reproduces all of the time for me. Here is the requested info:
>
> 1) cat /proc/cmdline
> BOOT_IMAGE=/vmlinuz-3.15.10-200.fc20.x86_64
> root=/dev/mapper/fedora_lager-root ro rd.lvm.lv=fedora_lager/root
> vconsole.font=latarcyrheb-sun16 rd.lvm.lv=fedora_lager/swap rhgb quiet
>
> 2)
> sudo touch /etc/kdump.conf ; echo $?
> 0
> $ sudo kdumpctl restart
> Memory for crashkernel is not reserved
> Please reserve memory by passing "crashkernel=X@Y" parameter to the kernel
> kexec: failed to unloaded kdump kernel
> Stopping kdump: [FAILED]
> No kdump initial ramdisk found.
> Rebuilding /boot/initramfs-3.15.10-200.fc20.x86_64kdump.img
> No memory reserved for crash kernel.
> Starting kdump: [FAILED]

This output is expected, because the crashkernel option is not present in /proc/cmdline.

>
> 3) Not sure. I am using ovirt-engine-3.5 branch with a development
> installation.

An engine built in a development environment still uses the ovirt-host-deploy-* and otopi-* RPM packages when deploying a host (the host part of the "host deploy" process is not contained in the engine git repo, but in the ovirt-host-deploy-* RPM packages). Please execute:

rpm -q ovirt-host-deploy

To be 100% sure, you can check whether the method _crashkernel_param_present() exists in /usr/share/ovirt-host-deploy/plugins/ovirt-host-deploy/kdump/packages.py. This method is responsible for testing the presence of the crashkernel option.

Thanks
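For illustration, a check for the crashkernel option on the kernel command line could look roughly like the sketch below. This is not the actual _crashkernel_param_present() code from ovirt-host-deploy; the names crashkernel_param_present and read_cmdline are hypothetical, and treating an empty value ("crashkernel=") as "not present" is an assumption.

```python
def crashkernel_param_present(cmdline: str) -> bool:
    """Return True if a kernel command line contains a crashkernel=<value> option.

    Hypothetical helper for illustration only; it is not taken from
    ovirt-host-deploy.
    """
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        # Assumption: an option with an empty value does not count as present.
        if key == "crashkernel" and sep and value:
            return True
    return False


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the command line of the running kernel (Linux only)."""
    with open(path) as f:
        return f.read()


if __name__ == "__main__":
    # Command line taken from comment #2: no crashkernel option is set.
    reported = (
        "BOOT_IMAGE=/vmlinuz-3.15.10-200.fc20.x86_64 "
        "root=/dev/mapper/fedora_lager-root ro rd.lvm.lv=fedora_lager/root "
        "vconsole.font=latarcyrheb-sun16 rd.lvm.lv=fedora_lager/swap rhgb quiet"
    )
    print(crashkernel_param_present(reported))
```

With a check like this, a deploy step could skip starting kdump.service and report a warning when no memory is reserved, which matches the expected result described in the original report.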
ovirt-host-deploy-1.3.0-0.0.master.fc20.noarch
(In reply to Adam Litke from comment #4)
> ovirt-host-deploy-1.3.0-0.0.master.fc20.noarch

This is most probably a manual build from git. Could you please install the latest ovirt-host-deploy and ovirt-host-deploy-java from the official oVirt 3.5 repo [1] and try to deploy the host again?

If it fails for you even with the newest RPM, could you please also post the host deploy log? It is the latest file in ${DEVENV_HOME}/var/log/ovirt-engine/host-deploy/ (the exact name is mentioned in the Events tab in the UI).

Thanks a lot

[1] http://resources.ovirt.org/pub/ovirt-3.5-pre/rpm/fc20
Yes, it seems to be fixed upstream in ovirt-host-deploy. Closing.