Bug 2137346 - RFE: support 'TCO' watchdog built-in to Q35 machine
Summary: RFE: support 'TCO' watchdog built-in to Q35 machine
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: libvirt
Version: 9.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Martin Kletzander
QA Contact: Lili Zhu
URL:
Whiteboard:
Depends On: 2080207
Blocks:
 
Reported: 2022-10-24 14:40 UTC by Daniel Berrangé
Modified: 2023-11-07 09:36 UTC (History)

Fixed In Version: libvirt-9.1.0-1.el9
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-11-07 08:30:47 UTC
Type: Feature Request
Target Upstream Version: 9.1.0
Embargoed:


Links
System                  ID               Private  Priority  Status  Summary  Last Updated
Red Hat Issue Tracker   LIBVIRTAT-14594  0        None      None    None     2023-07-26 08:26:55 UTC
Red Hat Issue Tracker   RHELPLAN-137435  0        None      None    None     2022-10-24 14:52:59 UTC
Red Hat Product Errata  RHSA-2023:6409   0        None      None    None     2023-11-07 08:31:29 UTC

Description Daniel Berrangé 2022-10-24 14:40:17 UTC
Description of problem:
The Q35 machine type chipset comes with unconditional support for a 'TCO' watchdog.

Linux guests automatically detect the TCO watchdog and load the 'iTCO_wdt' kmod to enable it.

There are two problems with this:

 * Since it is a built-in device, there's no <watchdog> element needed to enable it, and thus also no way to set the watchdog action.

 * Even if configured, a weird decision by QEMU causes the watchdog action to never be triggered unless '-global ICH9-LPC.noreboot=off' is set.


This suggests we want

      <watchdog model='tco' action='poweroff'/>

to result in setting  -watchdog-action  and the ICH9-LPC.noreboot flag.
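
To illustrate the mapping proposed above, here is a minimal sketch in Python. It is not libvirt's implementation: the helper name `watchdog_qemu_args` is hypothetical, and passing the XML `action` value straight through to `-watchdog-action` is an assumption (some action names may need translating). Only the `<watchdog>` element, the `-watchdog-action` option and the `ICH9-LPC.noreboot=off` global come from this report.

```python
import xml.etree.ElementTree as ET
from typing import List


def watchdog_qemu_args(watchdog_xml: str) -> List[str]:
    """Return the QEMU arguments suggested for a <watchdog model='tco'> element.

    Hypothetical helper for illustration only; it returns an empty list
    for anything that is not a TCO watchdog element.
    """
    elem = ET.fromstring(watchdog_xml)
    if elem.tag != "watchdog" or elem.get("model") != "tco":
        return []
    # Assumption: fall back to 'reset' when no action attribute is given.
    action = elem.get("action", "reset")
    return [
        "-global", "ICH9-LPC.noreboot=off",  # let the watchdog action fire
        "-watchdog-action", action,          # assumed 1:1 with the XML value
    ]


if __name__ == "__main__":
    print(watchdog_qemu_args("<watchdog model='tco' action='poweroff'/>"))
```

The device itself needs no extra `-device` argument in this sketch, since the watchdog is built in to the Q35 chipset.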

Oh, and the watchdog is currently broken (see bug 2080207), so beware that testing this won't work yet. That is rather unfortunate, given that all guest OSes are being given this watchdog unconditionally with no indication that it is broken.

Comment 2 Daniel Berrangé 2023-01-12 15:55:45 UTC
(In reply to Daniel Berrangé from comment #0)
>  * Even if configured, a weird decision by QEMU causes the watchdog action
> to never be triggered unless '-global ICH9-LPC.noreboot=off' is set.

In QEMU 8.0 git this is now changed.

  commit a6b6414f0cf04636dc3d0c21ea4a2f19b7629c93
  Author: Daniel P. Berrangé <berrange>
  Date:   Fri Dec 16 07:57:48 2022 -0500

    hw/isa: enable TCO watchdog reboot pin strap by default


IOW, with -8.0 machine type versions or later, there will be no need for the ICH9-LPC.noreboot=off flag, since 'noreboot' will default to off (i.e. the watchdog reboot is enabled by default). It is harmless to still set it explicitly, though, if we want compat with old QEMU.

It is also still required to set a -watchdog-action, if the user wants something other than the default QEMU behaviour of 'reset'
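
As a rough sketch of the compat decision described above, the Python below decides whether the explicit flag is still needed for a given machine type. The helper name `needs_explicit_noreboot` is hypothetical, and the sketch assumes upstream-style versioned names of the form `pc-q35-X.Y`; anything else (the bare `q35` alias, downstream names) is conservatively treated as needing the flag, since setting it explicitly is harmless.

```python
import re


def needs_explicit_noreboot(machine: str) -> bool:
    """Return True if '-global ICH9-LPC.noreboot=off' should be passed.

    Hypothetical helper: only upstream-style 'pc-q35-X.Y' names are
    parsed; any other name falls back to True (always set the flag).
    """
    match = re.fullmatch(r"pc-q35-(\d+)\.(\d+)", machine)
    if match is None:
        return True
    version = (int(match.group(1)), int(match.group(2)))
    # Machine type versions 8.0 and later already default to noreboot=off.
    return version < (8, 0)


if __name__ == "__main__":
    for name in ("pc-q35-7.2", "pc-q35-8.0", "q35"):
        print(name, needs_explicit_noreboot(name))
```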

IOW, if libvirt does NOTHING, then with new machine types the watchdog will work OOTB with q35, and result in guest resets.

We should still express the watchdog in the XML, though.

> This suggests we want
> 
>       <watchdog model='tco' action='poweroff'/>
> 
> to result in setting  -watchdog-action  and the ICH9-LPC.noreboot flag.
> 
> Oh, and the watchdog is currently broken - see bug 2080207 - so if trying to
> test this beware it won't work yet, which is rather unfortunate given that
> all guest OS are being given this watchdog unconditionally with no info that
> it is broken.

This turned out to be a Linux kernel regression, which didn't affect RHEL-9 or earlier kernels, only Fedora 36/37.

Comment 3 Martin Kletzander 2023-01-31 11:40:06 UTC
Fixed upstream with c5340d5420012412ea298f0102cc7f113e87d89b..2dde3840b1d50e79f6b8161820fff9fe62f613a9

Comment 5 Lili Zhu 2023-03-20 03:09:07 UTC
Submitted a PR:
https://github.com/autotest/tp-libvirt/pull/4811

This PR contains some basic tests of the TCO watchdog; the tests passed.

Comment 10 errata-xmlrpc 2023-11-07 08:30:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: libvirt security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:6409

