Bug 1252203

Summary: managedsave/start causes IRQ and task blocked for more than 120 seconds errors
Product: [Community] Virtualization Tools
Reporter: jamespharvey20
Component: libvirt
Assignee: Libvirt Maintainers <libvirt-maint>
Status: CLOSED DUPLICATE
QA Contact:
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: unspecified
CC: crobinso, rbalakri
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-04-10 21:48:06 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description jamespharvey20 2015-08-11 01:00:08 UTC
If I manually delete the Q35 USB controllers, I can use
managedsave/start, but afterwards I get tty errors that did not occur
before the managedsave/start.

Using libvirt 1.2.18 (-1 Arch) and QEMU git-master (2.4.0.r40384.2d69736).

Installed using Q35 chipset.

I'm running QEMU git, which allows SCSI controller migration, so I can
attempt doing this.

I started my guest, and waited for 15 minutes.  No post-booting
tty1/dmesg messages.

After a managedsave/start, I start getting tty1/dmesg errors.  I can
also no longer ping the guest system.

=====
$ dmesg|grep "IRQ 21"
[    1.141040] ACPI: PCI Interrupt Link [GSIF] enabled at IRQ 21
$ ls -lA /proc/irq/21
total 0
-r--r--r-- 1 root root 0 Aug 10 19:42 affinity_hint
-r--r--r-- 1 root root 0 Aug 10 19:42 node
dr-xr-xr-x 2 root root 0 Aug 10 19:42 qxl
-rw-r--r-- 1 root root 0 Aug 10 19:42 smp_affinity
-rw-r--r-- 1 root root 0 Aug 10 19:42 smp_affinity_list
-r--r--r-- 1 root root 0 Aug 10 19:42 spurious
dr-xr-xr-x 2 root root 0 Aug 10 19:42 virtio2
$ ls -lA /proc/irq/21/qxl
total 0
$ ls -lA /proc/irq/21/virtio2
total 0
=====
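The listing above suggests that IRQ 21 is shared between the qxl and virtio2 devices, since each shows up as a subdirectory of /proc/irq/21. A minimal sketch of the same check in Python follows, assuming the usual Linux layout where every registered handler name appears as a directory under /proc/irq/<irq>/. The function name `irq_actions` and the `proc_irq` parameter are illustrative, not part of any existing tool.

```python
import os


def irq_actions(irq, proc_irq="/proc/irq"):
    """List the handler names registered on an IRQ line.

    Assumption: each registered handler appears as a subdirectory of
    <proc_irq>/<irq>/ (as qxl and virtio2 do in the listing above).
    More than one entry means the line is shared.
    """
    irq_dir = os.path.join(proc_irq, str(irq))
    # Regular files such as smp_affinity or spurious are skipped;
    # only directories are treated as handler names.
    return sorted(
        entry.name for entry in os.scandir(irq_dir) if entry.is_dir()
    )
```

Run on the guest before and after the managedsave/start, `irq_actions(21)` would show whether the set of devices on the line has changed.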
{{{ on host, everything else is on guest }}}
virsh # managedsave <vmname>

Domain servo state saved by libvirt

virsh # start <vmname>
Domain <vmname> started
=====
[ 1117.083236] irq 21: nobody cared (try booting with the "irqpoll" option)
[ 1117.083236] handlers:
[ 1117.083236] [<ffffffffa00cfc60>] qxl_irq_handler [qxl]
[ 1117.083236] [<ffffffffa00f2530>] vp_interrupt [virtio_pci]
[ 1117.083236] Disabling IRQ #21
=====
{{{ bit later, doing nothing, just waiting }}}
[ 1440.223239] INFO: task vballoon:147 blocked for more than 120 seconds.
[ 1440.223409]       Not tainted 4.1.4-1-ARCH #1
[ 1440.223556] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1440.223805] INFO: task btrfs-transacti:242 blocked for more than 120 seconds.
[ 1440.223947]       Not tainted 4.1.4-1-ARCH #1
[ 1440.224069] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1440.224422] INFO: task systemd-journal:280 blocked for more than 120 seconds.
[ 1440.224575]       Not tainted 4.1.4-1-ARCH #1
[ 1440.224710] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
{{{ these message lines show up very slowly, as if the VM console
video is running at 1% speed }}}
=====
{{{ bit later, doing nothing, just waiting }}}
[ 1560.223227] {{{ repeats the above 9 lines:
[vballoon:147/btrfs-transacti:242/systemd-journal:280] blocked for more
than 120 seconds }}}
=====
{{{ bit later, doing nothing, just waiting }}}
[ 1680.223217] {{{ repeats the above 9 lines:
[vballoon:147/btrfs-transacti:242/systemd-journal:280] blocked for more
than 120 seconds }}}
[ 1688.885605] systemd[1]: systemd-journald.service: Watchdog timeout (limit 1min)!
=====
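
To compare the guest log before and after a managedsave/start, the two kinds of message quoted above (the "nobody cared" IRQ report and the hung-task warnings) can be pulled out of dmesg output with a short script. This is only a sketch: the regular expressions are modelled on the lines in this report, and `summarize_dmesg` is a hypothetical helper name.

```python
import re

# Patterns modelled on the guest log lines quoted in this report.
NOBODY_CARED = re.compile(r"irq (\d+): nobody cared")
HUNG_TASK = re.compile(
    r"INFO: task (\S+):(\d+) blocked for more than (\d+) seconds"
)
TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]")


def summarize_dmesg(text):
    """Return (irqs, hung_tasks) found in dmesg-style text.

    irqs: sorted list of IRQ numbers reported as "nobody cared".
    hung_tasks: dict mapping "name:pid" to the list of timestamps
    (seconds since boot, or None if the line has no timestamp) at
    which a hung-task warning for that task appeared.
    """
    irqs = set()
    hung = {}
    for line in text.splitlines():
        match = NOBODY_CARED.search(line)
        if match:
            irqs.add(int(match.group(1)))
            continue
        match = HUNG_TASK.search(line)
        if match:
            stamp = TIMESTAMP.match(line)
            when = float(stamp.group(1)) if stamp else None
            key = f"{match.group(1)}:{match.group(2)}"
            hung.setdefault(key, []).append(when)
    return sorted(irqs), hung
```

Passing the guest's dmesg text to `summarize_dmesg` gives the affected IRQ numbers and, per task, the times at which it was reported as blocked, which makes the 120-second repetition easy to see.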

Comment 1 Cole Robinson 2016-04-10 21:48:06 UTC
Duping to your other bug: even though the errors are different, this is still unlikely to be a libvirt problem and is more likely a general issue with qemu q35 migration.

*** This bug has been marked as a duplicate of bug 1252202 ***