Bug 1765339 - Kdump Status in Hosts' section is wrong (Disabled)
Summary: Kdump Status in Hosts' section is wrong (Disabled)
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: Frontend.WebAdmin
Version: 4.3.6.7
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: ovirt-4.5.0
Target Release: 4.5.0
Assignee: Eli Mesika
QA Contact: Pavol Brilla
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-10-24 21:22 UTC by Strahil Nikolov
Modified: 2022-04-28 09:26 UTC (History)

Fixed In Version: ovirt-engine-4.5.0
Clone Of:
Environment:
Last Closed: 2022-04-28 09:26:34 UTC
oVirt Team: Infra
Embargoed:
mperina: ovirt-4.5?


Attachments
Ovirt Host Tab for one of the hosts (79.20 KB, image/png)
2019-10-24 21:22 UTC, Strahil Nikolov


Links
System ID Private Priority Status Summary Last Updated
Github oVirt ovirt-engine pull 121 0 None open rename 'Kdump Status' to 'Kdump Integration Status' in Host detail view. 2022-03-04 00:33:03 UTC
Red Hat Issue Tracker RHV-37753 0 None None None 2022-03-03 08:55:51 UTC

Description Strahil Nikolov 2019-10-24 21:22:55 UTC
Created attachment 1628960 [details]
Ovirt Host Tab for one of the hosts

Description of problem:
The Hosts' tab in the admin portal shows that kdump status is "Disabled" while all hosts have kdump enabled and loaded successfully.

Version-Release number of selected component (if applicable):
ovirt-engine-4.3.6.7-1.el7.noarch
ovirt-engine-webadmin-portal-4.3.6.7-1.el7.noarch


How reproducible:
Always - all hosts (3) are reported as kdump 'Disabled'

Steps to Reproduce:
1. Install engine (v4.2.7)
2. Keep upgrading until 4.3.6


Actual results:
Hosts' "Kdump Status:" is wrong

~~~
[root@ovirt3 ~]# systemctl status kdump
● kdump.service - Crash recovery kernel arming
   Loaded: loaded (/usr/lib/systemd/system/kdump.service; enabled; vendor preset: enabled)
   Active: active (exited) since вт 2019-10-08 16:07:14 EEST; 2 weeks 2 days ago
  Process: 2394 ExecStart=/usr/bin/kdumpctl start (code=exited, status=0/SUCCESS)
 Main PID: 2394 (code=exited, status=0/SUCCESS)
    Tasks: 0
   CGroup: /system.slice/kdump.service

окт 08 16:07:13 ovirt3.localdomain dracut[3620]: drwxr-xr-x   2 root     root            0 Oct  8 16:06 usr/share/zoneinfo/Europe
окт 08 16:07:13 ovirt3.localdomain dracut[3620]: -rw-r--r--   1 root     root         2104 Sep 26 14:34 usr/share/zoneinfo/Europe/Sofia
окт 08 16:07:13 ovirt3.localdomain dracut[3620]: drwxr-xr-x   2 root     root            0 Oct  8 16:06 var
окт 08 16:07:13 ovirt3.localdomain dracut[3620]: lrwxrwxrwx   1 root     root           11 Oct  8 16:06 var/lock -> ../run/lock
окт 08 16:07:13 ovirt3.localdomain dracut[3620]: lrwxrwxrwx   1 root     root            6 Oct  8 16:06 var/run -> ../run
окт 08 16:07:13 ovirt3.localdomain dracut[3620]: ========================================================================
окт 08 16:07:13 ovirt3.localdomain dracut[3620]: *** Creating initramfs image file '/boot/initramfs-3.10.0-1062.1.2.el7.x86_64kdump.img' done ***
окт 08 16:07:14 ovirt3.localdomain kdumpctl[2394]: kexec: loaded kdump kernel
окт 08 16:07:14 ovirt3.localdomain kdumpctl[2394]: Starting kdump: [OK]
окт 08 16:07:14 ovirt3.localdomain systemd[1]: Started Crash recovery kernel arming.
~~~
[root@ovirt2 watchdog0]# systemctl status kdump.service 
● kdump.service - Crash recovery kernel arming
   Loaded: loaded (/usr/lib/systemd/system/kdump.service; enabled; vendor preset: enabled)
   Active: active (exited) since пт 2019-10-25 00:11:12 EEST; 7min ago
  Process: 25334 ExecStop=/usr/bin/kdumpctl stop (code=exited, status=0/SUCCESS)
  Process: 25343 ExecStart=/usr/bin/kdumpctl start (code=exited, status=0/SUCCESS)
 Main PID: 25343 (code=exited, status=0/SUCCESS)

окт 25 00:11:10 ovirt2.localdomain systemd[1]: Starting Crash recovery kernel arming...
окт 25 00:11:11 ovirt2.localdomain kdumpctl[25343]: /usr/bin/kdumpctl: line 571: /sys/class/watchdog/watchdog0/device/modalias: No such file or directory
окт 25 00:11:12 ovirt2.localdomain kdumpctl[25343]: kexec: loaded kdump kernel
окт 25 00:11:12 ovirt2.localdomain kdumpctl[25343]: Starting kdump: [OK]
окт 25 00:11:12 ovirt2.localdomain systemd[1]: Started Crash recovery kernel arming.

~~~
[root@ovirt1 ~]# systemctl status kdump
● kdump.service - Crash recovery kernel arming
   Loaded: loaded (/usr/lib/systemd/system/kdump.service; enabled; vendor preset: enabled)
   Active: active (exited) since чт 2019-10-24 21:08:15 EEST; 3h 11min ago
  Process: 3529 ExecStart=/usr/bin/kdumpctl start (code=exited, status=0/SUCCESS)
 Main PID: 3529 (code=exited, status=0/SUCCESS)
    Tasks: 0
   CGroup: /system.slice/kdump.service

окт 24 21:08:09 ovirt1.localdomain systemd[1]: Starting Crash recovery kernel arming...
окт 24 21:08:14 ovirt1.localdomain kdumpctl[3529]: /usr/bin/kdumpctl: line 571: /sys/class/watchdog/watchdog0/device/modalias: No such file or directory
окт 24 21:08:15 ovirt1.localdomain kdumpctl[3529]: kexec: loaded kdump kernel
окт 24 21:08:15 ovirt1.localdomain kdumpctl[3529]: Starting kdump: [OK]
окт 24 21:08:15 ovirt1.localdomain systemd[1]: Started Crash recovery kernel arming.

~~~

Expected results:
The Host's "Kdump Status:" should indicate "Enabled".

Additional info:
Screenshot attached

Comment 1 Martin Perina 2019-10-31 08:50:51 UTC
(In reply to Strahil Nikolov from comment #0)
> Created attachment 1628960 [details]
> Ovirt Host Tab for one of the hosts
> 
> Description of problem:
> The Hosts' tab in the admin portal shows that kdump status is "Disabled"
> while all hosts have kdump enabled and loaded successfully.

What do you mean by the Hosts tab? Do you mean that if you go from the Hosts view to the detail of a host, the General tab shows "Kdump status: Disabled"? If so, have you enabled Kdump integration in the Power Management tab of the Host detail? Kdump integration in oVirt is not only about having the kdump service running on the host; kdump can also be configured to send messages to the oVirt engine while the host is dumping. For more details, please take a look at the presentation below:

Video: https://www.youtube.com/watch?v=RAGV_za_Qvw
Slides: http://www.slideshare.net/MartinPeina/integrating-kdump-into-ovirt

Comment 2 Strahil Nikolov 2019-10-31 11:46:10 UTC
My power management is not configured, even though IPMI is available, because the IPMI interfaces are in a separate VLAN (simulating a prod setup).
If kdump integration requires power management to be enabled, then the status should be 'Not Configured' or something similar.

I will check whether I can work around this.

Comment 3 Martin Perina 2019-10-31 12:13:53 UTC
The main reason for integrating Kdump into oVirt is to prevent fencing (a power management restart) of the host while it is dumping. If fencing is not prevented, the crash dump is not saved and you will lose diagnostic data. That's why Kdump integration is a part of Power Management.

I also don't understand the issue with the separate VLAN, because having a separate physical network for power management is the preferred solution. Let me explain how fencing works in oVirt:

1. Let's say host1 and host2 are in the same cluster, and host1 becomes non-responsive (which means that the oVirt engine cannot communicate with it)
2. The engine picks some host which is Up (the engine can communicate with this host); in this scenario it will be host2
3. The selected host (host2) will be used as the fence proxy, meaning that the engine tells VDSM on host2 to execute the fence agent on host2 to perform the power management operation for host1

The above means that the engine doesn't need a direct connection to the power management devices, but each host in the cluster needs to be able to access the power management interfaces of every other host.
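
The three steps above can be sketched roughly as follows. This is only an illustration of the proxy selection idea, not the engine's actual code: the `Host` class, its fields, and `pick_fence_proxy` are hypothetical names made up for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Host:
    # Hypothetical model of a cluster host, for illustration only.
    name: str
    status: str                        # "Up" or "NonResponsive" in this sketch
    can_reach_pm_network: bool = True  # can it reach other hosts' PM interfaces?


def pick_fence_proxy(cluster: List[Host], target: Host) -> Optional[Host]:
    """Return the first Up host, other than the target, that can reach the
    power management network; return None if there is no such host."""
    for host in cluster:
        if host is target:
            continue  # a non-responsive host cannot fence itself
        if host.status == "Up" and host.can_reach_pm_network:
            return host
    return None


host1 = Host("host1", "NonResponsive")
host2 = Host("host2", "Up")

proxy = pick_fence_proxy([host1, host2], host1)
print(proxy.name if proxy else "no fence proxy available")  # prints: host2
```

If no host in the cluster can reach the power management interfaces, the function returns None, which corresponds to a setup where fencing cannot be performed at all.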

Comment 4 Strahil Nikolov 2019-10-31 12:20:13 UTC
In our environment, we have a separate network for Remote Management (IPMI), and no host has, or ever will have, access to it.
The systems can be accessed over IPMI via a special 'Bastion' Windows (actually Citrix) host.

So fencing cannot be set up in prod, and thus no fencing is set up for the lab (the same security requirements apply).

Comment 5 Martin Perina 2019-10-31 13:26:21 UTC
So in that case you cannot use power management, and because of that you don't need to configure Kdump integration. Just please be aware that without power management you will not be able to use HA VMs.

So apart from the issue that your current setup is not compatible with oVirt requirements for power management, are there any items which you think should be fixed?

Comment 6 Strahil Nikolov 2019-10-31 14:20:17 UTC
The message 'Disabled' is misleading.


Maybe a new 'Not Configured' value would be more suitable.
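
As a rough sketch of this suggestion: the function below is hypothetical and is not how ovirt-engine computes the field. The function name and both parameters are assumptions made for illustration, reflecting only the idea that the label could distinguish "power management is not set up" from "Kdump integration is switched off".

```python
def kdump_integration_label(pm_enabled: bool, kdump_integration_enabled: bool) -> str:
    """Hypothetical mapping from host settings to the label shown in the UI."""
    if not pm_enabled:
        # Kdump integration is part of Power Management, so without it the
        # integration cannot be configured at all.
        return "Not Configured"
    return "Enabled" if kdump_integration_enabled else "Disabled"


# A host without power management, like the ones in this report:
print(kdump_integration_label(pm_enabled=False, kdump_integration_enabled=False))
# prints: Not Configured
```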

Comment 7 Martin Perina 2019-11-07 09:06:01 UTC
OK, so let's rename 'Kdump Status' to 'Kdump Integration Status' in the Host detail view.

Comment 9 Sandro Bonazzola 2022-04-28 09:26:34 UTC
This bugzilla is included in the oVirt 4.5.0 release, published on April 20th 2022.

Since the problem described in this bug report should be resolved in the oVirt 4.5.0 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

