Bug 1612943

Summary: Wrong RESUME event after I/O error
Product: Red Hat Enterprise Linux 7
Reporter: Milan Zamazal <mzamazal>
Component: libvirt
Assignee: Jiri Denemark <jdenemar>
Status: CLOSED ERRATA
QA Contact: Yanqiu Zhang <yanqzhan>
Severity: high
Priority: high
Version: 7.5
CC: dyuan, fjin, jdenemar, jherrman, jsuchane, lmen, mtessun, rbarry, tburke, xuzhang, yafu, yanqzhan
Target Milestone: rc
Keywords: Reopened, ZStream
Hardware: Unspecified
OS: Unspecified
Fixed In Version: libvirt-4.5.0-11.el7
Doc Type: Bug Fix
Doc Text:
Previously, when a virtual machine (VM) was paused due to an I/O error and the error was not fixed, resuming the VM caused it to become paused again, and only a generic unpause event was logged. This update ensures that suspend events generated by the libvirt service properly indicate I/O errors. As a result, attempting to resume a VM with an unfixed I/O error now logs the correct event type.
Clones: 1634758, 1634759 (view as bug list)
Last Closed: 2019-08-06 13:13:56 UTC
Type: Bug
Bug Blocks: 1526025, 1634758, 1634759, 1651787    
Attachments:
  libvirt log
  Vdsm log
  logs_for_comment3

Description Milan Zamazal 2018-08-06 15:07:56 UTC
Created attachment 1473647 [details]
libvirt log

Description of problem:

When a VM is paused due to an I/O error and a manual attempt is made to resume it while the storage is still broken, libvirt sends an extra resume event, confusing the client. For instance, in reaction to the QEMU event sequence

  BLOCK_IO_ERROR ... storage broken
  STOP ... VM stopped
  RESUME ... manual RESUME attempt
  BLOCK_IO_ERROR ... no luck, storage still broken
  ... more BLOCK_IO_ERRORs
  STOP ... VM stopped again

libvirt may send the following VM lifecycle event sequence to Vdsm:

  Suspended detail 2 ... OK
  Resumed detail 0 ... OK
  Suspended detail 2 ... OK
  Resumed detail 0 ... ??
  Suspended detail 0 ... confuses the client due to previous resume and detail 0

The client is thus misled into thinking that the VM has been paused regularly rather than due to an I/O error.
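
For illustration, such a lifecycle event stream can be watched live with virsh (the domain name is a placeholder):

# virsh event <domain> --event lifecycle --loop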

Version-Release number of selected component (if applicable):

libvirt-3.9.0-14.el7_5.6.x86_64
qemu-kvm-rhev-2.10.0-21.el7_5.5.x86_64

How reproducible:

100% on certain setups

Steps to Reproduce:
1. Run a VM with an iSCSI drive.
2. Block connection to the storage host on the VM host using iptables with -j DROP.
3. Try to resume the VM.
4. Check the produced events.
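
As a concrete sketch of the steps above (the domain name, the storage portal address 192.0.2.10, and the default iSCSI port 3260 are placeholders/assumptions):

# virsh start <domain>
# iptables -I OUTPUT -d 192.0.2.10 -p tcp --dport 3260 -j DROP
  ... wait until "virsh domstate <domain> --reason" reports: paused (I/O error)
# virsh resume <domain>
# virsh event <domain> --event lifecycle --loop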

See the attached logs (manual resume attempt is around 2018-08-05 07:04:58.760+0000) and https://bugzilla.redhat.com/show_bug.cgi?id=1526025#c52 for more details.

Actual results:

Sequence of libvirt lifecycle events as described above.

Expected results:

The second RESUME event shouldn't be sent at all, and the last STOP event should preferably be sent with detail=2.

Additional info:

A very similar problem already occurred and was fixed at the QEMU level, see https://bugzilla.redhat.com/1582122 and https://bugzilla.redhat.com/1569614.

Comment 2 Milan Zamazal 2018-08-06 15:08:46 UTC
Created attachment 1473648 [details]
Vdsm log

Comment 3 Yanqiu Zhang 2018-08-08 11:52:08 UTC
Reproduced with:
libvirt-3.9.0-14.el7_5.7.x86_64
qemu-kvm-rhev-2.10.0-21.el7_5.5.x86_64


Steps:
1. Run a VM with a second disk attached.
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw'/>
      <source file='/var/lib/libvirt/images/mypool/disk1.img'/>
      <target dev='vdb' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
    </disk>
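
The backing file referenced above can be created with, for example (the path matches the XML; the size is arbitrary):

# qemu-img create -f raw /var/lib/libvirt/images/mypool/disk1.img 100M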

# virsh start V-7.5
Domain V-7.5 started

2. Cause an I/O error by overfilling the disk, as described in https://bugzilla.redhat.com/show_bug.cgi?id=1566153#c7

3. Wait until the VM gets paused. The guest OS can no longer be interacted with; the status can be checked with:

# virsh -r domstate V-7.5 --reason
paused (I/O error)

4. Try to manually resume the VM.
# virsh resume V-7.5
Domain V-7.5 resumed

# virsh -r domstate V-7.5 --reason
running (unpaused)

5. Check the events and libvirtd.log
# cat virsh-event.log |grep lifecycle
event 'lifecycle' for domain V-7.5: Suspended I/O Error
event 'lifecycle' for domain V-7.5: Resumed Unpaused
event 'lifecycle' for domain V-7.5: Suspended I/O Error
event 'lifecycle' for domain V-7.5: Resumed Unpaused

The lifecycle events in order are "3 2", "4 0", "3 2", "4 0" in libvirtd.log:
'3' and '4' are virDomainEventType:
    VIR_DOMAIN_EVENT_SUSPENDED = 3,
    VIR_DOMAIN_EVENT_RESUMED = 4,
'2' and '0' are the details on the cause of a lifecycle event:
    VIR_DOMAIN_EVENT_SUSPENDED_IOERROR = 2,  /* Suspended due to a disk I/O error */
    VIR_DOMAIN_EVENT_RESUMED_UNPAUSED = 0,   /* Normal resume due to admin unpause */
# cat libvirtd.log-20180808 |grep "Relaying domain lifecycle event"|uniq
2018-08-07 12:54:01.349+0000: 1753: debug : remoteRelayDomainEventLifecycle:312 : Relaying domain lifecycle event 3 2, callback 0 legacy 0
...
2018-08-07 12:55:33.883+0000: 1753: debug : remoteRelayDomainEventLifecycle:312 : Relaying domain lifecycle event 4 0, callback 0 legacy 0
...
2018-08-07 12:55:33.888+0000: 1753: debug : remoteRelayDomainEventLifecycle:312 : Relaying domain lifecycle event 3 2, callback 0 legacy 0
...
2018-08-07 12:55:34.046+0000: 1753: debug : remoteRelayDomainEventLifecycle:312 : Relaying domain lifecycle event 4 0, callback 0 legacy 0
...
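
For convenience, the relayed "type detail" pairs can be pulled out of such a log with, for example:

# sed -n 's/.*Relaying domain lifecycle event \([0-9]*\) \([0-9]*\),.*/\1 \2/p' libvirtd.log-20180808
3 2
4 0
3 2
4 0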


Actual results:
1. In step 4, the guest is resumed to running status even though the I/O error still exists, and the guest OS is still inaccessible.
2. In step 5, an extra 'resume' occurred in the lifecycle events, with reason "admin unpause".


Additional info:
The qemu events in order are as expected: "BLOCK_IO_ERROR" -> "STOP" -> "RESUME" -> "BLOCK_IO_ERROR" -> "STOP":
# cat /var/log/libvirt/libvirtd.log |grep  QEMU_MONITOR_RECV_EVENT
2018-08-07 12:54:01.345+0000: 1753: info : qemuMonitorJSONIOProcessLine:208 : QEMU_MONITOR_RECV_EVENT: mon=0x7f1b74027d50 event={"timestamp": {"seconds": 1533646441, "microseconds": 345352}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive-virtio-disk1", "nospace": true, "__com.redhat_reason": "enospc", "node-name": "#block329", "reason": "No space left on device", "operation": "write", "action": "stop"}}
...
2018-08-07 12:54:01.481+0000: 1753: info : qemuMonitorJSONIOProcessLine:208 : QEMU_MONITOR_RECV_EVENT: mon=0x7f1b74027d50 event={"timestamp": {"seconds": 1533646441, "microseconds": 356659}, "event": "STOP"}
2018-08-07 12:55:33.882+0000: 1753: info : qemuMonitorJSONIOProcessLine:208 : QEMU_MONITOR_RECV_EVENT: mon=0x7f1b74027d50 event={"timestamp": {"seconds": 1533646533, "microseconds": 882140}, "event": "RESUME"}
2018-08-07 12:55:33.884+0000: 1753: info : qemuMonitorJSONIOProcessLine:208 : QEMU_MONITOR_RECV_EVENT: mon=0x7f1b74027d50 event={"timestamp": {"seconds": 1533646533, "microseconds": 883390}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive-virtio-disk1", "nospace": true, "__com.redhat_reason": "enospc", "node-name": "#block329", "reason": "No space left on device", "operation": "write", "action": "stop"}}
...
2018-08-07 12:55:34.033+0000: 1753: info : qemuMonitorJSONIOProcessLine:208 : QEMU_MONITOR_RECV_EVENT: mon=0x7f1b74027d50 event={"timestamp": {"seconds": 1533646533, "microseconds": 906884}, "event": "STOP"}

Comment 4 Yanqiu Zhang 2018-08-08 12:21:24 UTC
Created attachment 1474307 [details]
logs_for_comment3

Comment 5 Jiri Denemark 2018-08-13 11:17:24 UTC

*** This bug has been marked as a duplicate of bug 1582122 ***

Comment 6 Jiri Denemark 2018-08-13 11:19:27 UTC
Oops.

Comment 12 Ryan Barry 2018-09-12 12:54:25 UTC
Totally ok with me

Comment 13 Jiri Denemark 2018-09-12 12:58:06 UTC
Patches sent upstream for review: https://www.redhat.com/archives/libvir-list/2018-September/msg00548.html

Comment 14 Jiri Denemark 2018-10-01 14:31:56 UTC
This is now fixed upstream by

commit 55af06187c48a01192764d8638b85739b0178fe0
Refs: v4.7.0-298-g55af06187c
Author:     Jiri Denemark <jdenemar>
AuthorDate: Mon Sep 10 15:10:54 2018 +0200
Commit:     Jiri Denemark <jdenemar>
CommitDate: Tue Sep 25 20:42:25 2018 +0200

    qemu: Properly report VIR_DOMAIN_EVENT_RESUMED_FROM_SNAPSHOT

    VIR_DOMAIN_EVENT_RESUMED_FROM_SNAPSHOT was defined but not used anywhere
    in our event generation code. This fixes qemuDomainRevertToSnapshot to
    properly report why the domain was resumed.

    Signed-off-by: Jiri Denemark <jdenemar>
    Reviewed-by: John Ferlan <jferlan>

commit 54b5b0ac3945ad5417b67bec8443cf6e7b3d482b
Refs: v4.7.0-299-g54b5b0ac39
Author:     Jiri Denemark <jdenemar>
AuthorDate: Tue Sep 11 19:26:07 2018 +0200
Commit:     Jiri Denemark <jdenemar>
CommitDate: Tue Sep 25 20:42:25 2018 +0200

    qemu: Report more appropriate running reasons

    This patch replaces some rather generic VIR_DOMAIN_RUNNING_UNPAUSED
    reasons when changing domain state to running with more specific ones.
    All of them are done when libvirtd reconnects to an existing domain
    after being restarted and sees an unfinished migration or save.

    Signed-off-by: Jiri Denemark <jdenemar>
    Reviewed-by: John Ferlan <jferlan>

commit 5dab984ed0cd0332e59d719420ab2f9d009b952f
Refs: v4.7.0-300-g5dab984ed0
Author:     Jiri Denemark <jdenemar>
AuthorDate: Mon Sep 10 19:41:53 2018 +0200
Commit:     Jiri Denemark <jdenemar>
CommitDate: Tue Sep 25 20:42:25 2018 +0200

    qemu: Pass running reason to RESUME event handler

    Whenever we get the RESUME event from QEMU, we change the state of the
    affected domain to VIR_DOMAIN_RUNNING with VIR_DOMAIN_RUNNING_UNPAUSED
    reason. This is fine if the domain is resumed unexpectedly, but when we
    sent "cont" to QEMU we usually have a better reason for the state
    change. The better reason is used in qemuProcessStartCPUs which also
    sets the domain state to running if qemuMonitorStartCPUs reports
    success. Thus we may end up with two state updates in a row, but the
    final reason is correct.

    This patch is a preparation for dropping the state change done in
    qemuMonitorStartCPUs for which we need to pass the actual running reason
    to the RESUME event handler and use it there instead of
    VIR_DOMAIN_RUNNING_UNPAUSED.

    Signed-off-by: Jiri Denemark <jdenemar>
    Reviewed-by: John Ferlan <jferlan>

commit 8ae9b49f5a4a02f57a1dfa20d4fe04c3d40a4665
Refs: v4.7.0-301-g8ae9b49f5a
Author:     Jiri Denemark <jdenemar>
AuthorDate: Tue Sep 11 15:13:08 2018 +0200
Commit:     Jiri Denemark <jdenemar>
CommitDate: Tue Sep 25 20:42:25 2018 +0200

    qemu: Map running reason to resume event detail

    Thanks to the previous commit the RESUME event handler knows what reason
    should be used when changing the domain state to VIR_DOMAIN_RUNNING, but
    the emitted VIR_DOMAIN_EVENT_RESUMED event still uses a generic
    VIR_DOMAIN_EVENT_RESUMED_UNPAUSED detail. Luckily, the event detail can
    be easily deduced from the running reason, which saves us from having to
    pass one more value to the handler.

    Signed-off-by: Jiri Denemark <jdenemar>
    Reviewed-by: John Ferlan <jferlan>

commit e6d77a75c4bf0c017d62b717b75e4bb6aa7a456b
Refs: v4.7.0-302-ge6d77a75c4
Author:     Jiri Denemark <jdenemar>
AuthorDate: Wed Sep 12 14:34:33 2018 +0200
Commit:     Jiri Denemark <jdenemar>
CommitDate: Tue Sep 25 20:42:25 2018 +0200

    qemu: Avoid duplicate resume events and state changes

    The only place where VIR_DOMAIN_EVENT_RESUMED should be generated is the
    RESUME event handler to make sure we don't generate duplicate events or
    state changes. In the worst case the duplication can revert or cover
    changes done by other event handlers.

    For example, after QEMU sent RESUME, BLOCK_IO_ERROR, and STOP events
    we could happily mark the domain as running and report
    VIR_DOMAIN_EVENT_RESUMED to registered clients.

    https://bugzilla.redhat.com/show_bug.cgi?id=1612943

    Signed-off-by: Jiri Denemark <jdenemar>
    Reviewed-by: John Ferlan <jferlan>

Comment 16 Jiri Denemark 2018-10-26 16:14:59 UTC
The "qemu: Avoid duplicate resume events and state changes" patch causes a
regression with fake reboot (virsh reboot --mode acpi) and automatic restart
of the domain when it crashes (<on_crash>restart</on_crash>). The domain is
not switched back to running once it's restarted and the resume event is not
emitted by libvirt.
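
A quick sketch of the fake-reboot regression check (the domain name is a placeholder):

# virsh reboot <domain> --mode acpi
# virsh domstate <domain> --reason
  ... with the regression, the state never switches back to "running"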

Comment 17 Jiri Denemark 2018-11-07 14:24:21 UTC
The patch fixing the regression from comment 16 was sent upstream for review:
https://www.redhat.com/archives/libvir-list/2018-November/msg00250.html

Comment 18 Jiri Denemark 2018-11-08 08:49:45 UTC
The regression is now fixed upstream by

commit e47949357ba268e7e8c3adea7c262b84fa002302
Refs: v4.9.0-21-ge47949357b
Author:     Jiri Denemark <jdenemar>
AuthorDate: Wed Nov 7 14:34:52 2018 +0100
Commit:     Jiri Denemark <jdenemar>
CommitDate: Thu Nov 8 09:08:58 2018 +0100

    qemu: Don't ignore resume events

    Since commit v4.7.0-302-ge6d77a75c4 processing RESUME event is mandatory
    for updating domain state. But the event handler explicitly ignored this
    event in some cases. Thus the state would be wrong after a fake reboot
    or when a domain was rebooted after it crashed.

    BTW, the code to ignore RESUME event after SHUTDOWN didn't make sense
    even before making RESUME event mandatory. Most likely it was there as a
    result of careless copy&paste from qemuProcessHandleStop.

    The corresponding debug message was clarified since the original state
    does not have to be "paused" only and while we have a "resumed" event,
    the state is called "running".

    https://bugzilla.redhat.com/show_bug.cgi?id=1612943

    Signed-off-by: Jiri Denemark <jdenemar>

Comment 20 Yanqiu Zhang 2019-04-12 10:30:38 UTC
Verified this bug on rhel7.7 with:
libvirt-4.5.0-12.el7.x86_64
qemu-kvm-rhev-2.12.0-26.el7.x86_64
kernel-3.10.0-1034/-1036.el7.x86_64

Steps and results are the same as in Bug 1634758.
Mark as verified.

Comment 22 errata-xmlrpc 2019-08-06 13:13:56 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:2294