Bug 1713009

Summary: [QEMU] Back-n-forth live migration of a paused VM results in QEMU crash with: "bdrv_inactivate_recurse: Assertion `!(bs->open_flags & BDRV_O_INACTIVE)' failed."

Product: Red Hat Enterprise Linux 8
Component: qemu-kvm
qemu-kvm sub component: Live Migration
Version: 8.1
Status: CLOSED WONTFIX
Severity: unspecified
Priority: unspecified
Keywords: Reopened
Reporter: Archit Modi <amodi>
Assignee: Kevin Wolf <kwolf>
QA Contact: nlevinki <nlevinki>
CC: armbru, dasmith, dgilbert, eglynn, ehabkost, fjin, jhakimra, kchamart, kkiwi, kwolf, lyarwood, mzamazal, nova-maint, pbabbar, rbalakri, rbryant, sbauza, sgordon, virt-maint, vromanso
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Doc Type: If docs needed, set a value
Story Points: ---
Clone Of: 1713008
Last Closed: 2021-08-03 07:27:06 UTC
Bug Depends On: 1713008, 1717175
Bug Blocks: 1887391
Description (Archit Modi, 2019-05-22 16:54:44 UTC)
Reopening this as I found a reproducer with gdb logs.

Created attachment 1634995 [details]: gdb-compute-1
Created attachment 1634996 [details]: gdb-compute-0

(overcloud) [stack@undercloud-0 tempest-dir]$ tempest run --regex tempest.api.compute.admin.test_live_migration.LiveMigrationTest.test_live_block_migration_paused

{0} tempest.api.compute.admin.test_live_migration.LiveMigrationTest.test_live_block_migration_paused [43.720125s] ... FAILED

Captured traceback:
~~~~~~~~~~~~~~~~~~~
b'Traceback (most recent call last):'
b'  File "/usr/lib/python3.6/site-packages/tempest/api/compute/admin/test_live_migration.py", line 135, in test_live_block_migration_paused'
b"    self._test_live_migration(state='PAUSED')"
b'  File "/usr/lib/python3.6/site-packages/tempest/api/compute/admin/test_live_migration.py", line 125, in _test_live_migration'
b'    self._live_migrate(server_id, source_host, state, volume_backed)'
b'  File "/usr/lib/python3.6/site-packages/tempest/api/compute/admin/test_live_migration.py", line 77, in _live_migrate'
b'    waiters.wait_for_server_status(self.servers_client, server_id, state)'
b'  File "/usr/lib/python3.6/site-packages/tempest/common/waiters.py", line 78, in wait_for_server_status'
b'    raise exceptions.BuildErrorException(server_id=server_id)'
b'tempest.exceptions.BuildErrorException: Server f5fc8e99-f177-46c4-93a7-d3d48d162eaf failed to build and is in ERROR status'
b''

...OUTPUT OMITTED...

tearDownClass (tempest.api.compute.admin.test_live_migration.LiveMigrationTest)
-------------------------------------------------------------------------------

Captured traceback:
~~~~~~~~~~~~~~~~~~~
b'Traceback (most recent call last):'
b'  File "/usr/lib/python3.6/site-packages/tempest/test.py", line 225, in tearDownClass'
b'    six.reraise(etype, value, trace)'
b'  File "/usr/lib/python3.6/site-packages/six.py", line 675, in reraise'
b'    raise value'
b'  File "/usr/lib/python3.6/site-packages/tempest/test.py", line 197, in tearDownClass'
b'    teardown()'
b'  File "/usr/lib/python3.6/site-packages/tempest/test.py", line 568, in resource_cleanup'
b'    raise testtools.MultipleExceptions(*cleanup_errors)'
b"testtools.runtest.MultipleExceptions: (<class 'tempest.lib.exceptions.DeleteErrorException'>, Resource %(resource_id)s failed to delete and is in ERROR status"
b'Details: Server f5fc8e99-f177-46c4-93a7-d3d48d162eaf failed to delete and is in ERROR status, <traceback object at 0x7f6287965bc8>)'
b''

======
Totals
======
Ran: 2 tests in 51.7801 sec.
 - Passed: 0
 - Skipped: 0
 - Expected Fail: 0
 - Unexpected Success: 0
 - Failed: 2
Sum of execute time for each test: 43.7201 sec.

==============
Worker Balance
==============
 - Worker 0 (2 tests) => 0:00:51.780051

Created attachment 1634999 [details]: sos-compute-1
Created attachment 1635002 [details]: sos-compute-0
Created attachment 1635434 [details]: new-gdb-compute-1
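(Aside: outside of tempest, the failing scenario boils down to pausing an instance and live-migrating it back and forth without ever resuming it. A rough manual equivalent is sketched below; the server name and target hosts are placeholders, and the classic nova CLI is used only because the tempest test drives the same compute API.)

nova pause test-vm
nova live-migration test-vm compute-1.redhat.local
# wait for the first migration to complete, then migrate again without unpausing;
# the second migration is the one that trips the assertion
nova live-migration test-vm compute-0.redhat.local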
(In reply to Archit Modi from comment #19)
> Created attachment 1635434 [details]
> new-gdb-compute-1

The backtrace seems to be from a live QEMU process that is working as expected, not from a QEMU crash. How exactly is it related to this bug report?

(In reply to Eduardo Habkost from comment #20)
> (In reply to Archit Modi from comment #19)
> > Created attachment 1635434 [details]
> > new-gdb-compute-1
>
> The backtrace seems to be from a live QEMU process that is working as
> expected, not from a QEMU crash. How exactly is it related to this bug
> report?

I am not sure which log represents a QEMU process/crash. This tempest test has been failing not just on RHOS 15 but on other RHOS versions too (13, 14, 15, 16), and I want it fixed; the bug report doesn't just mention the crash, it also covers the tempest test failing.

(In reply to Eduardo Habkost from comment #20)
> (In reply to Archit Modi from comment #19)
> > Created attachment 1635434 [details]
> > new-gdb-compute-1
>
> The backtrace seems to be from a live QEMU process that is working as
> expected, not from a QEMU crash. How exactly is it related to this bug
> report?

Also, from the libvirt logs, I can see the reason was that it crashed:

char device redirected to /dev/pts/2 (label charserial0)
2019-11-12T17:02:12.583660Z qemu-kvm: -device cirrus-vga,id=video0,bus=pci.0,addr=0x2: warning: 'cirrus-vga' is deprecated, please use a different VGA card instead
2019-11-12 17:02:21.146+0000: initiating migration
qemu-kvm: block.c:5445: bdrv_inactivate_recurse: Assertion `!(bs->open_flags & BDRV_O_INACTIVE)' failed.
2019-11-12 17:02:21.756+0000: shutting down, reason=crashed

After updating QEMU to qemu-kvm-rhev-2.12.0-20.el7 (or higher), did you make sure to restart libvirtd?

systemctl stop libvirtd
systemctl start libvirtd

Related: to confirm your running Nova instance is indeed using the updated QEMU version, please post the QEMU log for that instance. It should be located on the Compute host: /var/log/libvirt/qemu/instance-YYYYYYYY.log.
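(Aside: a quick way to confirm which QEMU build a running instance is actually using, sketched below; the instance name is a placeholder and the second command assumes a single qemu-kvm process of interest on the host.)

# libvirt records the QEMU version in the per-instance log at domain start-up
grep 'qemu version' /var/log/libvirt/qemu/instance-00000097.log
# or ask which package owns the binary behind the running process
rpm -qf "$(readlink -f /proc/$(pgrep -o qemu-kvm)/exe)"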
I restarted libvirtd and could still reproduce the issue (latest-gdb-trace.txt attached).

[root@shark20 ~]# systemctl stop libvirtd
[root@shark20 ~]# systemctl start libvirtd
[root@shark20 ~]# systemctl status libvirtd
● libvirtd.service - Virtualization daemon
   Loaded: loaded (/usr/lib/systemd/system/libvirtd.service; enabled; vendor preset: enabled)
   Active: active (running) since Wed 2019-11-13 13:00:04 UTC; 1s ago
     Docs: man:libvirtd(8)
           https://libvirt.org
 Main PID: 38569 (libvirtd)
    Tasks: 24 (limit: 32768)
   CGroup: /system.slice/libvirtd.service
           ├─  705 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/libexec/libvirt_leaseshelper
           ├─  706 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/libexec/libvirt_leaseshelper
           ├─22029 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/external.conf --leasefile-ro --dhcp-script=/usr/libexec/libvirt_leaseshelper
           ├─22030 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/external.conf --leasefile-ro --dhcp-script=/usr/libexec/libvirt_leaseshelper
           ├─22246 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/management.conf --leasefile-ro --dhcp-script=/usr/libexec/libvirt_leaseshelper
           ├─22247 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/management.conf --leasefile-ro --dhcp-script=/usr/libexec/libvirt_leaseshelper
           ├─22423 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/data.conf --leasefile-ro --dhcp-script=/usr/libexec/libvirt_leaseshelper
           └─38569 /usr/sbin/libvirtd

Nov 13 13:00:05 shark20.mobius.lab.eng.rdu2.redhat.com dnsmasq[22423]: read /etc/hosts - 4 addresses
Nov 13 13:00:05 shark20.mobius.lab.eng.rdu2.redhat.com dnsmasq[22246]: read /etc/hosts - 4 addresses
Nov 13 13:00:05 shark20.mobius.lab.eng.rdu2.redhat.com dnsmasq[22246]: read /var/lib/libvirt/dnsmasq/management.addnhosts - 0 addresses
Nov 13 13:00:05 shark20.mobius.lab.eng.rdu2.redhat.com dnsmasq[22423]: read /var/lib/libvirt/dnsmasq/data.addnhosts - 0 addresses
Nov 13 13:00:05 shark20.mobius.lab.eng.rdu2.redhat.com dnsmasq[705]: read /var/lib/libvirt/dnsmasq/default.addnhosts - 0 addresses
Nov 13 13:00:05 shark20.mobius.lab.eng.rdu2.redhat.com dnsmasq[22029]: read /var/lib/libvirt/dnsmasq/external.addnhosts - 0 addresses
Nov 13 13:00:05 shark20.mobius.lab.eng.rdu2.redhat.com dnsmasq-dhcp[705]: read /var/lib/libvirt/dnsmasq/default.hostsfile
Nov 13 13:00:05 shark20.mobius.lab.eng.rdu2.redhat.com dnsmasq-dhcp[22029]: read /var/lib/libvirt/dnsmasq/external.hostsfile
Nov 13 13:00:05 shark20.mobius.lab.eng.rdu2.redhat.com dnsmasq-dhcp[22029]: not giving name undercloud-0 to the DHCP lease of 10.0.0.25 because the name exists in /etc/hosts with address 172.16.0.34
Nov 13 13:00:05 shark20.mobius.lab.eng.rdu2.redhat.com dnsmasq-dhcp[22246]: read /var/lib/libvirt/dnsmasq/management.hostsfile

[root@compute-1 qemu]# less instance-00000097.log
2019-11-13 13:02:10.347+0000: starting up libvirt version: 5.6.0, package: 6.module+el8.1.0+4244+9aa4e6bb (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2019-09-19-17:11:14, ), qemu version: 4.1.0qemu-kvm-4.1.0-13.module+el8.1.0+4313+ef76ec61, kernel: 4.18.0-147.el8.x86_64, hostname: compute-1.redhat.local
LC_ALL=C \
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin \
HOME=/var/lib/libvirt/qemu/domain-110-instance-00000097 \
XDG_DATA_HOME=/var/lib/libvirt/qemu/domain-110-instance-00000097/.local/share \
XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain-110-instance-00000097/.cache \
XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain-110-instance-00000097/.config \
QEMU_AUDIO_DRV=none \
/usr/libexec/qemu-kvm \
-name guest=instance-00000097,debug-threads=on \
-S \
-object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-110-instance-00000097/master-key.aes \
-machine pc-i440fx-rhel7.6.0,accel=kvm,usb=off,dump-guest-core=off \
-cpu Broadwell-IBRS,vme=on,ss=on,f16c=on,rdrand=on,hypervisor=on,arat=on,tsc-adjust=on,umip=on,arch-capabilities=on,xsaveopt=on,pdpe1gb=on,abm=on,skip-l1dfl-vmentry=on,spec-ctrl=off \
-m 64 \
-overcommit mem-lock=off \
-smp 1,sockets=1,cores=1,threads=1 \
-uuid dafe75d9-c1d3-4820-88f5-a440f246f425 \
-smbios 'type=1,manufacturer=Red Hat,product=OpenStack Compute,version=19.0.3-0.20190814170534.a8e19af.el8ost,serial=dafe75d9-c1d3-4820-88f5-a440f246f425,uuid=dafe75d9-c1d3-4820-88f5-a440f246f425,family=Virtual Machine' \
-no-user-config \
-nodefaults \
-chardev socket,id=charmonitor,fd=69,server,nowait \
-mon chardev=charmonitor,id=monitor,mode=control \
-rtc base=utc,driftfix=slew \
-global kvm-pit.lost_tick_policy=delay \
-no-hpet \
-no-shutdown \
-boot strict=on \
-device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \
-drive file=/var/lib/nova/instances/dafe75d9-c1d3-4820-88f5-a440f246f425/disk,format=qcow2,if=none,id=drive-virtio-disk0,cache=none \
-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1,write-cache=on \
-netdev tap,fd=72,id=hostnet0,vhost=on,vhostfd=73 \
-device virtio-net-pci,rx_queue_size=512,host_mtu=1442,netdev=hostnet0,id=net0,mac=fa:16:3e:cb:7c:3f,bus=pci.0,addr=0x3 \
-add-fd set=3,fd=75 \
-chardev pty,id=charserial0,logfile=/dev/fdset/3,logappend=on \
-device isa-serial,chardev=charserial0,id=serial0 \
-device usb-tablet,id=input0,bus=usb.0,port=1 \
-vnc 172.17.1.89:0 \
-device cirrus-vga,id=video0,bus=pci.0,addr=0x2 \
-incoming defer \
-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 \
-object rng-random,id=objrng0,filename=/dev/urandom \
-device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.0,addr=0x6 \
-sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny \
-msg timestamp=on
char device redirected to /dev/pts/2 (label charserial0)
2019-11-13T13:02:10.554791Z qemu-kvm: -device cirrus-vga,id=video0,bus=pci.0,addr=0x2: warning: 'cirrus-vga' is deprecated, please use a different VGA card instead
2019-11-13 13:02:23.097+0000: initiating migration
qemu-kvm: block.c:5445: bdrv_inactivate_recurse: Assertion `!(bs->open_flags & BDRV_O_INACTIVE)' failed.
2019-11-13 13:02:23.777+0000: shutting down, reason=crashed

Created attachment 1635763 [details]: latest-gdb-trace
(In reply to Archit Modi from comment #25)
> Created attachment 1635763 [details]
> latest-gdb-trace

I still don't understand what the gdb traces you are attaching are supposed to mean. They just show a QEMU process that is working as expected, and contain no information about the crash being reported. How exactly are you generating those gdb traces?

(In reply to Eduardo Habkost from comment #26)
> (In reply to Archit Modi from comment #25)
> > Created attachment 1635763 [details]
> > latest-gdb-trace
>
> I still don't understand what the gdb traces you are attaching are supposed
> to mean. They just show a QEMU process that is working as expected, and
> contain no information about the crash being reported. How exactly are you
> generating those gdb traces?

I was following Kashyap's request to gather these traces:

(0) Install the debuginfo packages for QEMU on RHEL-8 (this requires you have the debuginfo RPM repositories enabled!)

    $> dnf debuginfo-install qemu-kvm

(1) Launch QEMU with GDB:

    $> gdb /usr/libexec/qemu-kvm

(2) In the GDB shell, set the logging:

    (gdb) set logging on
    (gdb) run

[Reproduce the problem]

(3) *Now* in the GDB shell do:

    (gdb) thread apply all bt full
    (gdb) quit

(In reply to Archit Modi from comment #27)
[...]
> [Reproduce the problem]
>
> (3) *Now* in the GDB shell do:
>
> (gdb) thread apply all bt full
> (gdb) quit

This won't work unless you are able to reproduce the problem inside the QEMU process you are running. This doesn't seem to be the case, as you are running qemu-kvm without any arguments.

If you can't attach gdb to the running QEMU process before it crashes, you will need a core dump from the crashed process. Is ABRT running on the host? If ABRT is capturing the crash, you can use ABRT to get the core dump and/or backtrace.

(In reply to Archit Modi from comment #24)
[...]
> [root@compute-1 qemu]# less instance-00000097.log
> 2019-11-13 13:02:10.347+0000: starting up libvirt version: 5.6.0, package:
> 6.module+el8.1.0+4244+9aa4e6bb (Red Hat, Inc.
> <http://bugzilla.redhat.com/bugzilla>, 2019-09-19-17:11:14, ), qemu version:
> 4.1.0qemu-kvm-4.1.0-13.module+el8.1.0+4313+ef76ec61, kernel:
> 4.18.0-147.el8.x86_64, hostname: compute-1.redhat.local
[...]
> 2019-11-13 13:02:23.097+0000: initiating migration
> qemu-kvm: block.c:5445: bdrv_inactivate_recurse: Assertion `!(bs->open_flags
> & BDRV_O_INACTIVE)' failed.
> 2019-11-13 13:02:23.777+0000: shutting down, reason=crashed

qemu-kvm-4.1.0-13.module+el8.1.0+4313+ef76ec61 is supposed to contain the fix for bug 1655963, commit 9e37271f50ec ("block: Don't inactivate children before parents"). It may be a completely different issue with a similar crash.

Kevin, can I reassign this to you?

(In reply to Eduardo Habkost from comment #31)
> qemu-kvm-4.1.0-13.module+el8.1.0+4313+ef76ec61 is supposed to contain the
> fix for bug 1655963, commit 9e37271f50ec ("block: Don't inactivate children
> before parents"). It may be a completely different issue with a similar
> crash.

Yes, it's a different assertion in a different place that is failing.

Essentially, we're trying to inactivate a block node that is already inactivated. Two cases where I could see this happening are during incoming migration (i.e. we would be trying to migrate away from a host that hadn't even completely received the VM state for a previous, still ongoing migration; it looks like we have checks to prevent this) or after a failed migration (i.e. we would be retrying migration without having restarted the guest).

Where in the logs do I have to look to see the history of the QEMU process in question, to see if anything unusual happened before this? I suppose the QMP traffic is not logged, as usual?

And of course, a backtrace of the crashed QEMU would still be useful.

> Kevin, can I reassign this to you?

Let's first figure out some more details before moving this to qemu-kvm, we don't have much information yet. Maybe Kashyap can ping me on IRC tomorrow (he doesn't seem to be available at the moment).

(In reply to Kevin Wolf from comment #32)
> Yes, it's a different assertion in a different place that is failing.
>
> Essentially, we're trying to inactivate a block node that is already
> inactivated. Two cases where I could see this happening are during incoming
> migration (i.e. we would be trying to migrate away from a host that hadn't
> even completely received the VM state for a previous, still ongoing
> migration; it looks like we have checks to prevent this) or after a failed
> migration (i.e. we would be retrying migration without having restarted the
> guest).
>
> Where in the logs do I have to look to see the history of the QEMU process
> in question, to see if anything unusual happened before this? I suppose the
> QMP traffic is not logged, as usual?

The libvirt logs above (comment #24) don't seem to include QMP traffic, unfortunately.

> And of course, a backtrace of the crashed QEMU would still be useful.

Kashyap is working on that, I hope we get a backtrace tomorrow.

> > Kevin, can I reassign this to you?
>
> Let's first figure out some more details before moving this to qemu-kvm, we
> don't have much information yet. Maybe Kashyap can ping me on IRC tomorrow
> (he doesn't seem to be available at the moment).

Even if there's a bug somewhere else triggering this, I consider any assertion failure in QEMU a QEMU bug. But we can wait until we have more details. Thanks!

Okay ... after much fiddling around, got the core (thanks to an idea from a colleague, Patrick Uiterwijk -- had to `pgrep` until the QEMU process existed, and then `gdb attach` to get this):

------------------------------------------------------------------------
Thread 7 "live_migration" received signal SIGABRT, Aborted.
[Switching to Thread 0x7f24699f5700 (LWP 256858)]
0x00007f2470dc38df in raise () from target:/lib64/libc.so.6
(gdb) gcore
warning: target file /proc/256592/cmdline contained unexpected null characters
Saved corefile core.256592
(gdb) bt
#0  0x00007f2470dc38df in raise () from target:/lib64/libc.so.6
#1  0x00007f2470dadcf5 in abort () from target:/lib64/libc.so.6
#2  0x00007f2470dadbc9 in __assert_fail_base.cold.0 () from target:/lib64/libc.so.6
#3  0x00007f2470dbbe96 in __assert_fail () from target:/lib64/libc.so.6
#4  0x00005596dd8828a9 in bdrv_inactivate_recurse (bs=0x5596de337610) at block.c:5445
#5  0x00005596dd883f0f in bdrv_inactivate_all () at block.c:5513
#6  0x00005596dd81276c in qemu_savevm_state_complete_precopy (f=0x5596ded95d80, iterable_only=<optimized out>, inactivate_disks=<optimized out>) at migration/savevm.c:1332
#7  0x00005596dd80ef56 in migration_thread (opaque=0x5596de308450) at migration/migration.c:2775
#8  0x00005596dd94d5e4 in qemu_thread_start (args=0x5596de34aa20) at util/qemu-thread-posix.c:502
#9  0x00007f24711572de in start_thread () from target:/lib64/libpthread.so.0
#10 0x00007f2470e88133 in clone () from target:/lib64/libc.so.6
(gdb) frame 4
#4  0x00005596dd8828a9 in bdrv_inactivate_recurse (bs=0x5596de337610) at block.c:5445
5445    block.c: No such file or directory.
(gdb) info locals
child = <optimized out>
parent = <optimized out>
tighten_restrictions = false
perm = <optimized out>
shared_perm = <optimized out>
ret = <optimized out>
__PRETTY_FUNCTION__ = "bdrv_inactivate_recurse"
(gdb)
------------------------------------------------------------------------

I have the core file with me, so further analysis is possible.

Created attachment 1636144 [details]: QMP traffic from compute-1
Created attachment 1636155 [details]: QMP traffic from compute-0
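(Aside: the procedure that finally produced a usable backtrace, condensed into a repeatable sketch; the guest name is a placeholder and the polling loop plus gdb options are just one possible way to script what was done by hand above.)

# poll until the destination QEMU for the instance exists, then attach before it aborts
until pid=$(pgrep -f guest=instance-00000097); do sleep 0.1; done
gdb -p "$pid" -ex 'set logging on' -ex 'handle SIGABRT stop' -ex continue
# once gdb stops on SIGABRT:
#   (gdb) gcore                     # save a core for later analysis
#   (gdb) thread apply all bt full  # full backtrace of every thread
# the saved core can be reopened later with: gdb /usr/libexec/qemu-kvm core.<pid>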
So it turns out this is actually another instance of a long-standing problem with disk images and migration: the migration subsystem doesn't clearly define the ownership transfer of resources like disk images, and also doesn't systematically forbid accessing images that the current QEMU process doesn't own. The block layer catches such accesses every now and then, and they are usually addressed with a one-off fix rather than in a systematic way. Maybe now is finally the time for a real fix.

What happens, in short: compute-1 is started with incoming migration and -S. The -S option means that after migration completion, the VM isn't started yet and ownership of the disk images isn't taken yet. Both of these happen only when you issue a 'cont' command. However, in this instance, contrary to the expectation of the interface, no 'cont' command was issued; instead, 'migrate' was called again to migrate away without ever having taken control of the images. This finally led to an attempt to inactivate already inactive images, which caused the assertion failure.

We should probably forbid most monitor commands (including 'migrate') in this state where the incoming migration is almost, but not really, completed and we still lack the ownership of some resources. We could also probably use a QMP command to take ownership of all resources (and therefore complete the incoming migration) without resuming the VM yet.

Related upstream mailing threads from January:

* 'Re: [Qemu-devel] [Qemu-block] Aborts in iotest 169'
  https://lists.gnu.org/archive/html/qemu-devel/2019-01/msg06136.html
* 'Re: [Qemu-devel] 3.1: second invocation of migrate crashes qemu'
  https://lists.gnu.org/archive/html/qemu-devel/2019-01/msg02780.html

If we want to go for the full solution this time rather than another local one-off hack to fix one specific instance, this will probably be more of a migration, runstate and monitor subsystem thing than a block layer one.

Thanks for the excellent summary of the cause and the useful references, Kevin. Some addenda on the test setup RHOS QE was using (but didn't mention anywhere in the bug):

- The setup actually has _shared_ storage (NFS-based), but the test, which live-migrates a paused instance and its storage, makes more sense in a *non-shared* storage setup:

---
[root@compute-1 qemu]# mount | grep nova
192.168.24.1:/nfsfileshare on /var/lib/nova/instances type nfs4 (rw,relatime,context=system_u:object_r:nfs_t:s0,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=192.168.24.26,local_lock=none,addr=192.168.24.1,_netdev)
---
[root@compute-0 ~]# mount | grep nova
192.168.24.1:/nfsfileshare on /var/lib/nova/instances type nfs4 (rw,relatime,context=system_u:object_r:nfs_t:s0,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=192.168.24.37,local_lock=none,addr=192.168.24.1,_netdev)
---

- The QMP traffic (from compute-1) confirms the live migration being done does _not_ include "block migration" (as evidenced by "blk":false):

{"execute":"migrate","arguments":{"detach":true,"blk":false,"inc":false,"uri":"fd:migrate"},"id":"libvirt-296"}

I'll let Dave and other migration folks respond on the migration state machine. Once the dust settles, I can probably create a better bug / RFE.

Created attachment 1636431 [details]: QEMU command-line of the affected instance from compute-0
Created attachment 1636435 [details]: QEMU command-line of the affected instance from compute-1
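(Aside: to make the sequence Kevin describes concrete, the same failure mode can be sketched with a bare QEMU outside of OpenStack; the disk path and ports below are placeholders, and the exact QMP traffic libvirt generates differs, but the shape is the same.)

# destination host: start QEMU paused, with deferred incoming migration and QMP on stdio
/usr/libexec/qemu-kvm -S -incoming defer -qmp stdio \
    -drive file=/tmp/test.qcow2,format=qcow2,if=virtio
# on that QMP monitor:
#   {"execute":"qmp_capabilities"}
#   {"execute":"migrate-incoming","arguments":{"uri":"tcp:0:4444"}}
#   ... the source migrates in; the incoming migration completes, but no "cont" is issued,
#       so the destination never activates (takes ownership of) its disk images ...
#   {"execute":"migrate","arguments":{"uri":"tcp:other-host:4444"}}
# the completion phase of this second, outgoing migration calls bdrv_inactivate_all()
# on images that were never activated, which is the assertion seen in the backtrace above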
Kevin: Thanks for the analysis. OK, so yes: migrating after an incoming migration without a 'cont'. Adding an explicit ownership transfer in this case could help, basically a 'cont' without starting the CPUs. However, to be able to implement that we need non-asserting ownership calls from the block layer; for example, if we add a 'cont-light' and someone issues it twice, we'll probably get a similar assert. We need ways for the block layer to let us take/release the disks without it asserting and killing the process; just fail the call.

Yes, that makes sense. In fact, the call that you need here is about activating images (bdrv_invalidate_cache_all), which can already be called multiple times without crashing.

We probably need to also remove the assertion from bdrv_inactivate_all(), but for a different reason: if one of the two operations fails, you end up with a state where some images are active and others are inactive. In such states, you should be able to retry in either direction. We should never try to inactivate images when all of them are already inactivated (it shows that we didn't properly keep track of the state we're in), but that restricted condition seems hard to assert.

(In reply to Kevin Wolf from comment #44)
> Yes, that makes sense. In fact, the call that you need here is about
> activating images (bdrv_invalidate_cache_all), which can already be called
> multiple times without crashing.
>
> We probably need to also remove the assertion from bdrv_inactivate_all(),
> but for a different reason: if one of the two operations fails, you end up
> with a state where some images are active and others are inactive. In such
> states, you should be able to retry in either direction. We should never try
> to inactivate images when all of them are already inactivated (it shows that
> we didn't properly keep track of the state we're in), but that restricted
> condition seems hard to assert.

The other side-effect of dropping that assertion, though, is that bugs like these turn into a failed migration rather than a dead VM, which is much more pleasant, especially given that even with the command we're suggesting, when something forgets to use it the same problem happens.

Can we also have one for telling us what the current state is? (Or is that in one of the query-block's already?)

As for the commands, how about:

block-claim claim:bool

where passing true takes ownership and passing false drops ownership. (HMP equivalents: block_claim -c / -r for claim/release)

Dave

QEMU has been recently split into sub-components and, as a one-time operation to avoid breakage of tools, we are setting the QEMU sub-component of this BZ to "General". Please review and change the sub-component if necessary the next time you review this BZ. Thanks.

*** Bug 1831103 has been marked as a duplicate of this bug. ***

Throwing this over to kwolf; once you have an interface that doesn't assert, please throw the BZ back to me and then I can wire up the migration to fail nicely and provide an explicit ownership hand-off. (RHV hit this one again as a side effect of bz 1819721)

*** Bug 1870383 has been marked as a duplicate of this bug. ***

After evaluating this issue, there are no plans to address it further or fix it in an upcoming release. Therefore, it is being closed. If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.
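(Aside: the interface Dave sketches above was never implemented, since the bug was closed WONTFIX. Purely as a hypothetical illustration of that proposal, the wire format might have looked like the following; the command name and argument come from the comment above, everything else is guesswork.)

# destination has finished receiving the migration but the guest is still paused:
#   {"execute":"block-claim","arguments":{"claim":true}}    # take ownership of the disk images
# before migrating away again without ever resuming the guest:
#   {"execute":"block-claim","arguments":{"claim":false}}   # release ownership cleanly
# with such a command, a 'migrate' issued in the unowned state could fail with an error
# instead of tripping the assertion and killing the VM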