Created attachment 1628960 [details]
Ovirt Host Tab for one of the hosts

Description of problem:
The Hosts' tab in the admin portal shows that kdump status is "Disabled" while all hosts have kdump enabled and loaded successfully.

Version-Release number of selected component (if applicable):
ovirt-engine-4.3.6.7-1.el7.noarch
ovirt-engine-webadmin-portal-4.3.6.7-1.el7.noarch

How reproducible:
Always - all hosts (3) are reported as kdump 'Disabled'

Steps to Reproduce:
1. Install engine (v4.2.7)
2. Keep upgrading till 4.3.6

Actual results:
Hosts' "Kdump Status:" is wrong
~~~
[root@ovirt3 ~]# systemctl status kdump
● kdump.service - Crash recovery kernel arming
   Loaded: loaded (/usr/lib/systemd/system/kdump.service; enabled; vendor preset: enabled)
   Active: active (exited) since вт 2019-10-08 16:07:14 EEST; 2 weeks 2 days ago
  Process: 2394 ExecStart=/usr/bin/kdumpctl start (code=exited, status=0/SUCCESS)
 Main PID: 2394 (code=exited, status=0/SUCCESS)
    Tasks: 0
   CGroup: /system.slice/kdump.service

окт 08 16:07:13 ovirt3.localdomain dracut[3620]: drwxr-xr-x 2 root root 0 Oct 8 16:06 usr/share/zoneinfo/Europe
окт 08 16:07:13 ovirt3.localdomain dracut[3620]: -rw-r--r-- 1 root root 2104 Sep 26 14:34 usr/share/zoneinfo/Europe/Sofia
окт 08 16:07:13 ovirt3.localdomain dracut[3620]: drwxr-xr-x 2 root root 0 Oct 8 16:06 var
окт 08 16:07:13 ovirt3.localdomain dracut[3620]: lrwxrwxrwx 1 root root 11 Oct 8 16:06 var/lock -> ../run/lock
окт 08 16:07:13 ovirt3.localdomain dracut[3620]: lrwxrwxrwx 1 root root 6 Oct 8 16:06 var/run -> ../run
окт 08 16:07:13 ovirt3.localdomain dracut[3620]: ========================================================================
окт 08 16:07:13 ovirt3.localdomain dracut[3620]: *** Creating initramfs image file '/boot/initramfs-3.10.0-1062.1.2.el7.x86_64kdump.img' done ***
окт 08 16:07:14 ovirt3.localdomain kdumpctl[2394]: kexec: loaded kdump kernel
окт 08 16:07:14 ovirt3.localdomain kdumpctl[2394]: Starting kdump: [OK]
окт 08 16:07:14 ovirt3.localdomain systemd[1]: Started Crash recovery kernel arming.
~~~
[root@ovirt2 watchdog0]# systemctl status kdump.service
● kdump.service - Crash recovery kernel arming
   Loaded: loaded (/usr/lib/systemd/system/kdump.service; enabled; vendor preset: enabled)
   Active: active (exited) since пт 2019-10-25 00:11:12 EEST; 7min ago
  Process: 25334 ExecStop=/usr/bin/kdumpctl stop (code=exited, status=0/SUCCESS)
  Process: 25343 ExecStart=/usr/bin/kdumpctl start (code=exited, status=0/SUCCESS)
 Main PID: 25343 (code=exited, status=0/SUCCESS)

окт 25 00:11:10 ovirt2.localdomain systemd[1]: Starting Crash recovery kernel arming...
окт 25 00:11:11 ovirt2.localdomain kdumpctl[25343]: /usr/bin/kdumpctl: line 571: /sys/class/watchdog/watchdog0/device/modalias: No such file or directory
окт 25 00:11:12 ovirt2.localdomain kdumpctl[25343]: kexec: loaded kdump kernel
окт 25 00:11:12 ovirt2.localdomain kdumpctl[25343]: Starting kdump: [OK]
окт 25 00:11:12 ovirt2.localdomain systemd[1]: Started Crash recovery kernel arming.
~~~
[root@ovirt1 ~]# systemctl status kdump
● kdump.service - Crash recovery kernel arming
   Loaded: loaded (/usr/lib/systemd/system/kdump.service; enabled; vendor preset: enabled)
   Active: active (exited) since чт 2019-10-24 21:08:15 EEST; 3h 11min ago
  Process: 3529 ExecStart=/usr/bin/kdumpctl start (code=exited, status=0/SUCCESS)
 Main PID: 3529 (code=exited, status=0/SUCCESS)
    Tasks: 0
   CGroup: /system.slice/kdump.service

окт 24 21:08:09 ovirt1.localdomain systemd[1]: Starting Crash recovery kernel arming...
окт 24 21:08:14 ovirt1.localdomain kdumpctl[3529]: /usr/bin/kdumpctl: line 571: /sys/class/watchdog/watchdog0/device/modalias: No such file or directory
окт 24 21:08:15 ovirt1.localdomain kdumpctl[3529]: kexec: loaded kdump kernel
окт 24 21:08:15 ovirt1.localdomain kdumpctl[3529]: Starting kdump: [OK]
окт 24 21:08:15 ovirt1.localdomain systemd[1]: Started Crash recovery kernel arming.
~~~

Expected results:
The Host's "Kdump Status:" should indicate "Enabled".

Additional info: Screenshot attached
(In reply to Strahil Nikolov from comment #0)
> Created attachment 1628960 [details]
> Ovirt Host Tab for one of the hosts
>
> Description of problem:
> The Hosts' tab in the admin portal shows that kdump status is "Disabled"
> while all hosts have kdump enabled and loaded successfully.

What do you mean by the Hosts tab? Do you mean that if you go from the Hosts view to the detail of a host, the General tab there shows "Kdump status: Disabled"? If so, have you enabled Kdump integration in the Power Management tab of the host detail?

Kdump integration in oVirt is not only about having the kdump service running on the host; kdump can also be configured to send messages to the oVirt engine while the host is dumping. For more details, please take a look at the presentation below:

Video: https://www.youtube.com/watch?v=RAGV_za_Qvw
Slides: http://www.slideshare.net/MartinPeina/integrating-kdump-into-ovirt
My power management is not configured although IPMI is available, because the IPMI interfaces are in a separate VLAN (simulating a prod setup). If kdump integration requires power management to be enabled, then the status should be 'Not Configured' or something like that. I will check if I can work around this.
The main reason for integrating Kdump into oVirt is to prevent fencing (a power management restart) of the host while the host is dumping. If fencing is not prevented, the crash dump is not saved and you will lose diagnostic data. That's why Kdump integration is a part of Power Management.

I also don't understand the issue with the separate VLAN, because having a separate physical network for power management is the preferred solution. Let me explain how fencing works in oVirt:

1. Let's have host1 and host2 in the same cluster, and host1 becomes non-responsive (which means that the oVirt engine cannot communicate with it).
2. The engine picks some host which is Up (the engine can communicate with this host); in this scenario it will be host2.
3. The selected host (host2) is used as the fence proxy, meaning that the engine tells VDSM on host2 to execute a fence agent on host2 to perform the power management operation for host1.

This means that the engine doesn't need a direct connection to the power management devices, but each host in the cluster needs to be able to access the power management interfaces of every other host.
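The fencing flow described above can be sketched roughly as follows. This is only an illustrative Python model, not oVirt engine code: the `Host` class, its `dumping` flag, and the functions `pick_fence_proxy` and `fence_decision` are hypothetical names, and the real engine most likely applies more criteria than "first Up host" when choosing a proxy.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Host:
    name: str
    up: bool                # True if the engine can communicate with the host
    dumping: bool = False   # hypothetical flag: host reported a kdump in progress


def pick_fence_proxy(hosts: List[Host], target: Host) -> Optional[Host]:
    """Return the first Up host other than the target, or None if there is none."""
    for host in hosts:
        if host.name != target.name and host.up:
            return host
    return None


def fence_decision(hosts: List[Host], target: Host) -> str:
    """Describe what this simplified model would do for the target host."""
    if target.up:
        return "no action: host is responsive"
    if target.dumping:
        # Kdump integration: do not restart a host that is saving a crash dump.
        return "skip fencing: host is writing a crash dump"
    proxy = pick_fence_proxy(hosts, target)
    if proxy is None:
        return "cannot fence: no proxy host available"
    # The proxy, not the engine, must reach the target's power management interface.
    return f"fence {target.name} via proxy {proxy.name}"


if __name__ == "__main__":
    cluster = [Host("host1", up=False), Host("host2", up=True)]
    print(fence_decision(cluster, cluster[0]))
```

In this model the engine never contacts the power management device itself, which matches the point that only the hosts need network access to each other's power management interfaces.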
In our environment, we have a separate network for Remote Management (IPMI), and no host has, or ever will have, access to remote management. The systems can be accessed over IPMI only via a special 'Bastion' Windows (actually Citrix) host. So fencing cannot be set up in prod, and thus no fencing is set up for the lab either (the same security requirements apply).
So in that case you cannot use power management, and because of that you don't need to configure Kdump integration. Just please be aware that without power management you will not be able to use HA VMs.

So apart from the issue that your current setup is not compatible with the oVirt requirements for power management, are there any items which you think should be fixed?
The message 'Disabled' is misleading. Maybe a new 'Not Configured' status would be more suitable.
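A minimal sketch of that suggestion, assuming the label is derived from two booleans. The function `kdump_label` and its parameters are hypothetical and do not reflect how the oVirt web admin actually computes the field.

```python
def kdump_label(power_management_enabled: bool, kdump_integration_detected: bool) -> str:
    """Return a display label for the host's kdump integration state (hypothetical)."""
    if not power_management_enabled:
        # Kdump integration is part of Power Management, so without it
        # the integration is not set up at all rather than "Disabled".
        return "Not Configured"
    return "Enabled" if kdump_integration_detected else "Disabled"


if __name__ == "__main__":
    print(kdump_label(False, False))
```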
OK, so let's rename 'Kdump Status' to 'Kdump Integration Status' in the Host detail view.
This bugzilla is included in oVirt 4.5.0 release, published on April 20th 2022. Since the problem described in this bug report should be resolved in oVirt 4.5.0 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.