Bug 2168008

Summary: libvirt watchdog action=dump should not resume the vm after coredump the guest
Product: Red Hat Enterprise Linux 9 Reporter: yalzhang <yalzhang>
Component: libvirtAssignee: Virtualization Maintenance <virt-maint>
libvirt sub component: General QA Contact: Lili Zhu <lizhu>
Status: CLOSED MIGRATED Docs Contact:
Severity: unspecified    
Priority: unspecified CC: fjin, hutao, lmen, mkletzan, mprivozn, rjones, virt-maint
Version: 9.2Keywords: MigratedToJIRA, Triaged
Target Milestone: rcFlags: pm-rhel: mirror+
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2023-07-07 21:23:07 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description yalzhang@redhat.com 2023-02-08 02:22:25 UTC
Description of problem:
For libvirt watchdog action=dump, when it is triggered, the guest will pause and libvirt coredumps the guest, which is fine.  However libvirt then resumes the guest which is definitely not fine (it should restart it).

Version-Release number of selected component (if applicable):
libvirt-9.0.0-3.el9.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Start vm with watchdog device:

# virsh dumpxml rhel --xpath //watchdog 
<watchdog model="i6300esb" action="dump">
  <address type="pci" domain="0x0000" bus="0x10" slot="0x01" function="0x0"/>
</watchdog>

# virsh start rhel 
Domain 'rhel' started

2. Trigger the watchdog in the vm and wait for 30s, vm will pause and generate the dump file, then libvirt will resume the vm:
#  echo 1 > /dev/watchdog
[   71.910336] watchdog: watchdog0: watchdog did not stop!
(check the status on vm, after 30s, it will pause for several seconds, then resume. No reset.)

Watch the domstate on the host, after 30s, the vm will be pause, and then resume to running status:
# watch virsh domstate rhel

Check there is coredump file created on the host:
# ll /var/lib/libvirt/qemu/dump
total 2088772
-rw-------. 1 root root 2138900771 Feb  7 21:06 3-rhel-2023-02-07-21:06:48

Actual results:
For libvirt watchdog action=dump, when it is triggered, the guest will pause and libvirt coredumps the guest, which is fine.  However libvirt then resumes the guest which is definitely not fine (it should restart it).

Expected results:
After coredump, libvirt should reset the vm

Additional info:

Comment 1 Richard W.M. Jones 2023-02-08 09:47:18 UTC
The issue here is that the watchdog software inside the guest and
the watchdog emulation done by qemu is not expecting a watchdog
that fires and then resumes the guest.  Real watchdog hardware
would never behave like this - it's always expected that if the
watchdog fires, the machine is reset, reboots, and watchdog initialization
is done over again.

An effect of this is that after the watchdog has fired once, it
will never fire again (making the watchdog ineffective until
a human intervenes and reboots the machine).

This would be, of course, a considerable change in the behaviour
of action=dump, but I think we have to accept that we got this
wrong initially.

I think we just need to change this code so instead of starting
up the vCPUs, it kills the domain:

https://gitlab.com/libvirt/libvirt/-/blob/5155ab4b2a704285505dfea6ffee8b980fdaa29e/src/qemu/qemu_driver.c#L3459

Comment 2 Michal Privoznik 2023-02-08 14:16:00 UTC
It has been like this for ages. Does this mean that it never worked properly or something else changed?

Comment 3 Martin Kletzander 2023-02-08 15:49:57 UTC
So whatever happens after the dump is not described anywhere. Anyone might expect the domain to get reset, destroyed, paused etc.  The way this works has not changed since its inception in commit e19cdbfcf16dfec7308f8926b1660533c10bcc7d.

Since this is not a qemu specificality, the dump is translated to pause by libvirt, we could theoretically add other actions, for example "dump+reset", "dump+pause", "dump+destroy", but I would rather not change the default, although we could document the current (maybe soon to be legacy) behaviour.

We could use this BZ for documenting the behaviour of the option as it is now and if the other ones are wanted we can get another BZ for those.  Would that be fine with you?

Comment 4 Richard W.M. Jones 2023-02-08 17:18:59 UTC
Adding Hu Tao who was the original author of this commit:

https://gitlab.com/libvirt/libvirt/-/commit/e19cdbfcf16dfec7308f8926b1660533c10bcc7d

Comment 5 Richard W.M. Jones 2023-02-08 17:21:34 UTC
(In reply to Michal Privoznik from comment #2)
> It has been like this for ages. Does this mean that it never worked properly
> or something else changed?

I mean, it works to some extent, it just screws up the watchdog
state after the first time it fires.

(In reply to Martin Kletzander from comment #3)
> Since this is not a qemu specificality, the dump is translated to pause by
> libvirt, we could theoretically add other actions, for example "dump+reset",
> "dump+pause", "dump+destroy", but I would rather not change the default,
> although we could document the current (maybe soon to be legacy) behaviour.
> 
> We could use this BZ for documenting the behaviour of the option as it is
> now and if the other ones are wanted we can get another BZ for those.  Would
> that be fine with you?

dump+pause isn't really a good idea, but I guess so.  I wonder if Hu Tao will
have an opinion about this.  Maybe Fujitsu depend on a particular behaviour.

Comment 6 yalzhang@redhat.com 2023-02-09 05:12:52 UTC
(In reply to Martin Kletzander from comment #3)
> So whatever happens after the dump is not described anywhere. Anyone might
> expect the domain to get reset, destroyed, paused etc.  The way this works
> has not changed since its inception in commit
> e19cdbfcf16dfec7308f8926b1660533c10bcc7d.
> 
> Since this is not a qemu specificality, the dump is translated to pause by
> libvirt, we could theoretically add other actions, for example "dump+reset",
> "dump+pause", "dump+destroy", but I would rather not change the default,
> although we could document the current (maybe soon to be legacy) behaviour.
> 
> We could use this BZ for documenting the behaviour of the option as it is
> now and if the other ones are wanted we can get another BZ for those.  Would
> that be fine with you?

It's fine for me. I'm expecting more input and discussion and finally we can get a solution fine with everyone.