Bug 1713009 - [QEMU] Back-n-forth live migration of a paused VM results in QEMU crash with: "bdrv_inactivate_recurse: Assertion `!(bs->open_flags & BDRV_O_INACTIVE)' failed."
Summary: [QEMU] Back-n-forth live migration of a paused VM results in QEMU crash with:...
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: qemu-kvm
Version: 8.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Kevin Wolf
QA Contact: nlevinki
URL:
Whiteboard:
Duplicates: 1831103 1870383 (view as bug list)
Depends On: 1713008 1717175
Blocks: 1887391
 
Reported: 2019-05-22 16:54 UTC by Archit Modi
Modified: 2023-03-14 19:40 UTC (History)
CC List: 20 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1713008
Environment:
Last Closed: 2021-08-03 07:27:06 UTC
Type: ---
Target Upstream Version:
Embargoed:


Attachments
  tempest-run.log (513.13 KB, application/x-xz), 2019-05-22 16:54 UTC, Archit Modi
  gdb-compute-1 (13.23 KB, text/plain), 2019-11-11 20:39 UTC, Archit Modi
  gdb-compute-0 (9.17 KB, text/plain), 2019-11-11 20:40 UTC, Archit Modi
  sos-compute-1 (18.11 MB, application/x-xz), 2019-11-11 20:44 UTC, Archit Modi
  sos-compute-0 (17.92 MB, application/x-xz), 2019-11-11 20:45 UTC, Archit Modi
  new-gdb-compute-1 (11.48 KB, text/plain), 2019-11-12 17:07 UTC, Archit Modi
  latest-gdb-trace (11.34 KB, text/plain), 2019-11-13 13:08 UTC, Archit Modi
  QMP traffic from compute-1 (154.29 KB, text/plain), 2019-11-14 14:18 UTC, Kashyap Chamarthy
  QMP traffic from compute-0 (255.59 KB, text/plain), 2019-11-14 14:44 UTC, Kashyap Chamarthy
  QEMU command-line of the affected instance from compute-0 (6.93 KB, text/plain), 2019-11-15 10:46 UTC, Kashyap Chamarthy
  QEMU command-line of the affected instance from compute-1 (3.31 KB, text/plain), 2019-11-15 11:11 UTC, Kashyap Chamarthy

Description Archit Modi 2019-05-22 16:54:44 UTC
Created attachment 1572104 [details]
tempest-run.log

+++ This bug was initially created as a clone of Bug #1713008 +++

Description of problem: Deploy RHOS 14 with the topology below and run the following tempest test:
tempest.api.compute.admin.test_live_migration.LiveMigrationTest.test_live_block_migration_paused


Version-Release number of selected component (if applicable):
core_puddle: 2019-05-15.1

Topology: 1 Controller, 2 Compute, 3 Ceph

Steps to Reproduce:
1. Deploy RHOS 14 with mentioned topology
2. Run tempest test:
tempest.api.compute.admin.test_live_migration.LiveMigrationTest.test_live_block_migration_paused

Actual results:
Response - Headers: {'status': '200', u'content-length': '1691', 'content-location': 'http://10.0.0.104:8774/v2.1/servers/099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f', u'x-compute-request-id': 'req-06ff23fb-f74b-4ad0-b88a-565302d10e19', u'vary': 'OpenStack-API-Version,X-OpenStack-Nova-API-Version,Accept-Encoding', u'server': 'Apache', u'openstack-api-version': 'compute 2.1', u'connection': 'close', u'x-openstack-nova-api-version': '2.1', u'date': 'Wed, 22 May 2019 09:19:14 GMT', u'content-type': 'application/json', u'x-openstack-request-id': 'req-06ff23fb-f74b-4ad0-b88a-565302d10e19'}
        Body: {"server": {"OS-EXT-STS:task_state": null, "addresses": {"tempest-LiveMigrationTest-848502904-network": [{"OS-EXT-IPS-MAC:mac_addr": "fa:16:3e:93:df:8b", "version": 4, "addr": "10.100.0.12", "OS-EXT-IPS:type": "fixed"}]}, "links": [{"href": "http://10.0.0.104:8774/v2.1/servers/099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f", "rel": "self"}, {"href": "http://10.0.0.104:8774/servers/099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f", "rel": "bookmark"}], "image": {"id": "898431fb-7e30-49cc-97d9-25c5d7c4fe6f", "links": [{"href": "http://10.0.0.104:8774/images/898431fb-7e30-49cc-97d9-25c5d7c4fe6f", "rel": "bookmark"}]}, "OS-EXT-STS:vm_state": "paused", "OS-EXT-SRV-ATTR:instance_name": "instance-00000016", "OS-SRV-USG:launched_at": "2019-05-22T09:18:52.000000", "flavor": {"id": "2141d048-a7db-4230-8c3b-a68551ac2cca", "links": [{"href": "http://10.0.0.104:8774/flavors/2141d048-a7db-4230-8c3b-a68551ac2cca", "rel": "bookmark"}]}, "id": "099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f", "security_groups": [{"name": "default"}], "user_id": "c6c28eddd60d4ff38a83d52a2b34fecf", "OS-DCF:diskConfig": "MANUAL", "accessIPv4": "", "accessIPv6": "", "OS-EXT-STS:power_state": 3, "OS-EXT-AZ:availability_zone": "nova", "config_drive": "", "status": "PAUSED", "updated": "2019-05-22T09:19:13Z", "hostId": "ba6037e255568dae0d1032f97b5a0148f0f8f8c5923780dd62e7e834", "OS-EXT-SRV-ATTR:host": "compute-1.localdomain", "OS-SRV-USG:terminated_at": null, "key_name": null, "OS-EXT-SRV-ATTR:hypervisor_hostname": "compute-1.localdomain", "name": "tempest-LiveMigrationTest-server-818390104", "created": "2019-05-22T09:18:36Z", "tenant_id": "62a1524e90ce4a999f8f960f26967621", "os-extended-volumes:volumes_attached": [], "metadata": {}}}
2019-05-22 05:19:14,864 469238 INFO     [tempest.api.compute.admin.test_live_migration] Live migrate back to source compute-0.localdomain
2019-05-22 05:19:20,374 469238 INFO     [tempest.lib.common.rest_client] Request (LiveMigrationTest:test_live_block_migration_paused): 202 POST http://10.0.0.104:8774/v2.1/servers/099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f/action 5.510s
2019-05-22 05:19:20,375 469238 DEBUG    [tempest.lib.common.rest_client] Request - Headers: {'Content-Type': 'application/json', 'Accept': 'application/json', 'X-Auth-Token': '<omitted>'}
        Body: {"os-migrateLive": {"disk_over_commit": false, "block_migration": false, "host": "compute-0.localdomain"}}
    Response - Headers: {'status': '202', u'content-length': '0', 'content-location': 'http://10.0.0.104:8774/v2.1/servers/099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f/action', u'x-compute-request-id': 'req-d53e900d-16eb-45c7-ae93-57dd5dd81139', u'vary': 'OpenStack-API-Version,X-OpenStack-Nova-API-Version', u'server': 'Apache', u'openstack-api-version': 'compute 2.1', u'connection': 'close', u'x-openstack-nova-api-version': '2.1', u'date': 'Wed, 22 May 2019 09:19:14 GMT', u'content-type': 'application/json', u'x-openstack-request-id': 'req-d53e900d-16eb-45c7-ae93-57dd5dd81139'}
        Body: 
2019-05-22 05:19:21,031 469238 INFO     [tempest.lib.common.rest_client] Request (LiveMigrationTest:test_live_block_migration_paused): 200 GET http://10.0.0.104:8774/v2.1/servers/099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f 0.655s
2019-05-22 05:19:21,031 469238 DEBUG    [tempest.lib.common.rest_client] Request - Headers: {'Content-Type': 'application/json', 'Accept': 'application/json', 'X-Auth-Token': '<omitted>'}
        Body: None
    Response - Headers: {'status': '200', u'content-length': '1549', 'content-location': 'http://10.0.0.104:8774/v2.1/servers/099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f', u'x-compute-request-id': 'req-69cbb142-9f96-4568-ac9c-e6816deee537', u'vary': 'OpenStack-API-Version,X-OpenStack-Nova-API-Version,Accept-Encoding', u'server': 'Apache', u'openstack-api-version': 'compute 2.1', u'connection': 'close', u'x-openstack-nova-api-version': '2.1', u'date': 'Wed, 22 May 2019 09:19:20 GMT', u'content-type': 'application/json', u'x-openstack-request-id': 'req-69cbb142-9f96-4568-ac9c-e6816deee537'}
        Body: {"server": {"OS-EXT-STS:task_state": "migrating", "addresses": {"tempest-LiveMigrationTest-848502904-network": [{"OS-EXT-IPS-MAC:mac_addr": "fa:16:3e:93:df:8b", "version": 4, "addr": "10.100.0.12", "OS-EXT-IPS:type": "fixed"}]}, "links": [{"href": "http://10.0.0.104:8774/v2.1/servers/099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f", "rel": "self"}, {"href": "http://10.0.0.104:8774/servers/099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f", "rel": "bookmark"}], "image": {"id": "898431fb-7e30-49cc-97d9-25c5d7c4fe6f", "links": [{"href": "http://10.0.0.104:8774/images/898431fb-7e30-49cc-97d9-25c5d7c4fe6f", "rel": "bookmark"}]}, "OS-EXT-STS:vm_state": "paused", "OS-SRV-USG:launched_at": "2019-05-22T09:18:52.000000", "flavor": {"id": "2141d048-a7db-4230-8c3b-a68551ac2cca", "links": [{"href": "http://10.0.0.104:8774/flavors/2141d048-a7db-4230-8c3b-a68551ac2cca", "rel": "bookmark"}]}, "id": "099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f", "security_groups": [{"name": "default"}], "user_id": "c6c28eddd60d4ff38a83d52a2b34fecf", "OS-DCF:diskConfig": "MANUAL", "accessIPv4": "", "accessIPv6": "", "progress": 0, "OS-EXT-STS:power_state": 3, "OS-EXT-AZ:availability_zone": "nova", "config_drive": "", "status": "MIGRATING", "updated": "2019-05-22T09:19:15Z", "hostId": "ba6037e255568dae0d1032f97b5a0148f0f8f8c5923780dd62e7e834", "OS-SRV-USG:terminated_at": null, "key_name": null, "name": "tempest-LiveMigrationTest-server-818390104", "created": "2019-05-22T09:18:36Z", "tenant_id": "62a1524e90ce4a999f8f960f26967621", "os-extended-volumes:volumes_attached": [], "metadata": {}}}
2019-05-22 05:19:22,362 469238 INFO     [tempest.lib.common.rest_client] Request (LiveMigrationTest:test_live_block_migration_paused): 200 GET http://10.0.0.104:8774/v2.1/servers/099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f 0.320s
2019-05-22 05:19:22,363 469238 DEBUG    [tempest.lib.common.rest_client] Request - Headers: {'Content-Type': 'application/json', 'Accept': 'application/json', 'X-Auth-Token': '<omitted>'}
        Body: None
    Response - Headers: {'status': '200', u'content-length': '1549', 'content-location': 'http://10.0.0.104:8774/v2.1/servers/099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f', u'x-compute-request-id': 'req-c9e25042-9b84-4ddf-b7fd-702f8ade1475', u'vary': 'OpenStack-API-Version,X-OpenStack-Nova-API-Version,Accept-Encoding', u'server': 'Apache', u'openstack-api-version': 'compute 2.1', u'connection': 'close', u'x-openstack-nova-api-version': '2.1', u'date': 'Wed, 22 May 2019 09:19:22 GMT', u'content-type': 'application/json', u'x-openstack-request-id': 'req-c9e25042-9b84-4ddf-b7fd-702f8ade1475'}
        Body: {"server": {"OS-EXT-STS:task_state": "migrating", "addresses": {"tempest-LiveMigrationTest-848502904-network": [{"OS-EXT-IPS-MAC:mac_addr": "fa:16:3e:93:df:8b", "version": 4, "addr": "10.100.0.12", "OS-EXT-IPS:type": "fixed"}]}, "links": [{"href": "http://10.0.0.104:8774/v2.1/servers/099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f", "rel": "self"}, {"href": "http://10.0.0.104:8774/servers/099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f", "rel": "bookmark"}], "image": {"id": "898431fb-7e30-49cc-97d9-25c5d7c4fe6f", "links": [{"href": "http://10.0.0.104:8774/images/898431fb-7e30-49cc-97d9-25c5d7c4fe6f", "rel": "bookmark"}]}, "OS-EXT-STS:vm_state": "paused", "OS-SRV-USG:launched_at": "2019-05-22T09:18:52.000000", "flavor": {"id": "2141d048-a7db-4230-8c3b-a68551ac2cca", "links": [{"href": "http://10.0.0.104:8774/flavors/2141d048-a7db-4230-8c3b-a68551ac2cca", "rel": "bookmark"}]}, "id": "099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f", "security_groups": [{"name": "default"}], "user_id": "c6c28eddd60d4ff38a83d52a2b34fecf", "OS-DCF:diskConfig": "MANUAL", "accessIPv4": "", "accessIPv6": "", "progress": 0, "OS-EXT-STS:power_state": 3, "OS-EXT-AZ:availability_zone": "nova", "config_drive": "", "status": "MIGRATING", "updated": "2019-05-22T09:19:15Z", "hostId": "ba6037e255568dae0d1032f97b5a0148f0f8f8c5923780dd62e7e834", "OS-SRV-USG:terminated_at": null, "key_name": null, "name": "tempest-LiveMigrationTest-server-818390104", "created": "2019-05-22T09:18:36Z", "tenant_id": "62a1524e90ce4a999f8f960f26967621", "os-extended-volumes:volumes_attached": [], "metadata": {}}}
2019-05-22 05:19:23,859 469238 INFO     [tempest.lib.common.rest_client] Request (LiveMigrationTest:test_live_block_migration_paused): 200 GET http://10.0.0.104:8774/v2.1/servers/099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f 0.487s
2019-05-22 05:19:23,860 469238 DEBUG    [tempest.lib.common.rest_client] Request - Headers: {'Content-Type': 'application/json', 'Accept': 'application/json', 'X-Auth-Token': '<omitted>'}
        Body: None
    Response - Headers: {'status': '200', u'content-length': '1549', 'content-location': 'http://10.0.0.104:8774/v2.1/servers/099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f', u'x-compute-request-id': 'req-f432dae7-9a39-4d73-9793-de599fae4f56', u'vary': 'OpenStack-API-Version,X-OpenStack-Nova-API-Version,Accept-Encoding', u'server': 'Apache', u'openstack-api-version': 'compute 2.1', u'connection': 'close', u'x-openstack-nova-api-version': '2.1', u'date': 'Wed, 22 May 2019 09:19:23 GMT', u'content-type': 'application/json', u'x-openstack-request-id': 'req-f432dae7-9a39-4d73-9793-de599fae4f56'}
        Body: {"server": {"OS-EXT-STS:task_state": "migrating", "addresses": {"tempest-LiveMigrationTest-848502904-network": [{"OS-EXT-IPS-MAC:mac_addr": "fa:16:3e:93:df:8b", "version": 4, "addr": "10.100.0.12", "OS-EXT-IPS:type": "fixed"}]}, "links": [{"href": "http://10.0.0.104:8774/v2.1/servers/099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f", "rel": "self"}, {"href": "http://10.0.0.104:8774/servers/099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f", "rel": "bookmark"}], "image": {"id": "898431fb-7e30-49cc-97d9-25c5d7c4fe6f", "links": [{"href": "http://10.0.0.104:8774/images/898431fb-7e30-49cc-97d9-25c5d7c4fe6f", "rel": "bookmark"}]}, "OS-EXT-STS:vm_state": "paused", "OS-SRV-USG:launched_at": "2019-05-22T09:18:52.000000", "flavor": {"id": "2141d048-a7db-4230-8c3b-a68551ac2cca", "links": [{"href": "http://10.0.0.104:8774/flavors/2141d048-a7db-4230-8c3b-a68551ac2cca", "rel": "bookmark"}]}, "id": "099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f", "security_groups": [{"name": "default"}], "user_id": "c6c28eddd60d4ff38a83d52a2b34fecf", "OS-DCF:diskConfig": "MANUAL", "accessIPv4": "", "accessIPv6": "", "progress": 0, "OS-EXT-STS:power_state": 3, "OS-EXT-AZ:availability_zone": "nova", "config_drive": "", "status": "MIGRATING", "updated": "2019-05-22T09:19:15Z", "hostId": "ba6037e255568dae0d1032f97b5a0148f0f8f8c5923780dd62e7e834", "OS-SRV-USG:terminated_at": null, "key_name": null, "name": "tempest-LiveMigrationTest-server-818390104", "created": "2019-05-22T09:18:36Z", "tenant_id": "62a1524e90ce4a999f8f960f26967621", "os-extended-volumes:volumes_attached": [], "metadata": {}}}
2019-05-22 05:19:25,613 469238 INFO     [tempest.lib.common.rest_client] Request (LiveMigrationTest:test_live_block_migration_paused): 200 GET http://10.0.0.104:8774/v2.1/servers/099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f 0.743s
2019-05-22 05:19:25,613 469238 DEBUG    [tempest.lib.common.rest_client] Request - Headers: {'Content-Type': 'application/json', 'Accept': 'application/json', 'X-Auth-Token': '<omitted>'}
        Body: None
    Response - Headers: {'status': '200', u'content-length': '1549', 'content-location': 'http://10.0.0.104:8774/v2.1/servers/099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f', u'x-compute-request-id': 'req-2816d9ba-8164-49f8-9bd8-dfdb99f7e793', u'vary': 'OpenStack-API-Version,X-OpenStack-Nova-API-Version,Accept-Encoding', u'server': 'Apache', u'openstack-api-version': 'compute 2.1', u'connection': 'close', u'x-openstack-nova-api-version': '2.1', u'date': 'Wed, 22 May 2019 09:19:24 GMT', u'content-type': 'application/json', u'x-openstack-request-id': 'req-2816d9ba-8164-49f8-9bd8-dfdb99f7e793'}
        Body: {"server": {"OS-EXT-STS:task_state": "migrating", "addresses": {"tempest-LiveMigrationTest-848502904-network": [{"OS-EXT-IPS-MAC:mac_addr": "fa:16:3e:93:df:8b", "version": 4, "addr": "10.100.0.12", "OS-EXT-IPS:type": "fixed"}]}, "links": [{"href": "http://10.0.0.104:8774/v2.1/servers/099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f", "rel": "self"}, {"href": "http://10.0.0.104:8774/servers/099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f", "rel": "bookmark"}], "image": {"id": "898431fb-7e30-49cc-97d9-25c5d7c4fe6f", "links": [{"href": "http://10.0.0.104:8774/images/898431fb-7e30-49cc-97d9-25c5d7c4fe6f", "rel": "bookmark"}]}, "OS-EXT-STS:vm_state": "paused", "OS-SRV-USG:launched_at": "2019-05-22T09:18:52.000000", "flavor": {"id": "2141d048-a7db-4230-8c3b-a68551ac2cca", "links": [{"href": "http://10.0.0.104:8774/flavors/2141d048-a7db-4230-8c3b-a68551ac2cca", "rel": "bookmark"}]}, "id": "099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f", "security_groups": [{"name": "default"}], "user_id": "c6c28eddd60d4ff38a83d52a2b34fecf", "OS-DCF:diskConfig": "MANUAL", "accessIPv4": "", "accessIPv6": "", "progress": 0, "OS-EXT-STS:power_state": 3, "OS-EXT-AZ:availability_zone": "nova", "config_drive": "", "status": "MIGRATING", "updated": "2019-05-22T09:19:15Z", "hostId": "ba6037e255568dae0d1032f97b5a0148f0f8f8c5923780dd62e7e834", "OS-SRV-USG:terminated_at": null, "key_name": null, "name": "tempest-LiveMigrationTest-server-818390104", "created": "2019-05-22T09:18:36Z", "tenant_id": "62a1524e90ce4a999f8f960f26967621", "os-extended-volumes:volumes_attached": [], "metadata": {}}}
2019-05-22 05:19:27,021 469238 INFO     [tempest.lib.common.rest_client] Request (LiveMigrationTest:test_live_block_migration_paused): 200 GET http://10.0.0.104:8774/v2.1/servers/099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f 0.398s
2019-05-22 05:19:27,021 469238 DEBUG    [tempest.lib.common.rest_client] Request - Headers: {'Content-Type': 'application/json', 'Accept': 'application/json', 'X-Auth-Token': '<omitted>'}
        Body: None
    Response - Headers: {'status': '200', u'content-length': '1549', 'content-location': 'http://10.0.0.104:8774/v2.1/servers/099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f', u'x-compute-request-id': 'req-b08518e4-ba0f-4e70-a7b6-6b471140acdb', u'vary': 'OpenStack-API-Version,X-OpenStack-Nova-API-Version,Accept-Encoding', u'server': 'Apache', u'openstack-api-version': 'compute 2.1', u'connection': 'close', u'x-openstack-nova-api-version': '2.1', u'date': 'Wed, 22 May 2019 09:19:26 GMT', u'content-type': 'application/json', u'x-openstack-request-id': 'req-b08518e4-ba0f-4e70-a7b6-6b471140acdb'}
        Body: {"server": {"OS-EXT-STS:task_state": "migrating", "addresses": {"tempest-LiveMigrationTest-848502904-network": [{"OS-EXT-IPS-MAC:mac_addr": "fa:16:3e:93:df:8b", "version": 4, "addr": "10.100.0.12", "OS-EXT-IPS:type": "fixed"}]}, "links": [{"href": "http://10.0.0.104:8774/v2.1/servers/099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f", "rel": "self"}, {"href": "http://10.0.0.104:8774/servers/099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f", "rel": "bookmark"}], "image": {"id": "898431fb-7e30-49cc-97d9-25c5d7c4fe6f", "links": [{"href": "http://10.0.0.104:8774/images/898431fb-7e30-49cc-97d9-25c5d7c4fe6f", "rel": "bookmark"}]}, "OS-EXT-STS:vm_state": "paused", "OS-SRV-USG:launched_at": "2019-05-22T09:18:52.000000", "flavor": {"id": "2141d048-a7db-4230-8c3b-a68551ac2cca", "links": [{"href": "http://10.0.0.104:8774/flavors/2141d048-a7db-4230-8c3b-a68551ac2cca", "rel": "bookmark"}]}, "id": "099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f", "security_groups": [{"name": "default"}], "user_id": "c6c28eddd60d4ff38a83d52a2b34fecf", "OS-DCF:diskConfig": "MANUAL", "accessIPv4": "", "accessIPv6": "", "progress": 0, "OS-EXT-STS:power_state": 3, "OS-EXT-AZ:availability_zone": "nova", "config_drive": "", "status": "MIGRATING", "updated": "2019-05-22T09:19:26Z", "hostId": "ba6037e255568dae0d1032f97b5a0148f0f8f8c5923780dd62e7e834", "OS-SRV-USG:terminated_at": null, "key_name": null, "name": "tempest-LiveMigrationTest-server-818390104", "created": "2019-05-22T09:18:36Z", "tenant_id": "62a1524e90ce4a999f8f960f26967621", "os-extended-volumes:volumes_attached": [], "metadata": {}}}
2019-05-22 05:19:28,494 469238 INFO     [tempest.lib.common.rest_client] Request (LiveMigrationTest:test_live_block_migration_paused): 200 GET http://10.0.0.104:8774/v2.1/servers/099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f 0.463s
2019-05-22 05:19:28,495 469238 DEBUG    [tempest.lib.common.rest_client] Request - Headers: {'Content-Type': 'application/json', 'Accept': 'application/json', 'X-Auth-Token': '<omitted>'}
        Body: None
    Response - Headers: {'status': '200', u'content-length': '1549', 'content-location': 'http://10.0.0.104:8774/v2.1/servers/099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f', u'x-compute-request-id': 'req-bdc67261-806a-4599-b96c-4e7b681942d2', u'vary': 'OpenStack-API-Version,X-OpenStack-Nova-API-Version,Accept-Encoding', u'server': 'Apache', u'openstack-api-version': 'compute 2.1', u'connection': 'close', u'x-openstack-nova-api-version': '2.1', u'date': 'Wed, 22 May 2019 09:19:28 GMT', u'content-type': 'application/json', u'x-openstack-request-id': 'req-bdc67261-806a-4599-b96c-4e7b681942d2'}
        Body: {"server": {"OS-EXT-STS:task_state": "migrating", "addresses": {"tempest-LiveMigrationTest-848502904-network": [{"OS-EXT-IPS-MAC:mac_addr": "fa:16:3e:93:df:8b", "version": 4, "addr": "10.100.0.12", "OS-EXT-IPS:type": "fixed"}]}, "links": [{"href": "http://10.0.0.104:8774/v2.1/servers/099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f", "rel": "self"}, {"href": "http://10.0.0.104:8774/servers/099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f", "rel": "bookmark"}], "image": {"id": "898431fb-7e30-49cc-97d9-25c5d7c4fe6f", "links": [{"href": "http://10.0.0.104:8774/images/898431fb-7e30-49cc-97d9-25c5d7c4fe6f", "rel": "bookmark"}]}, "OS-EXT-STS:vm_state": "paused", "OS-SRV-USG:launched_at": "2019-05-22T09:18:52.000000", "flavor": {"id": "2141d048-a7db-4230-8c3b-a68551ac2cca", "links": [{"href": "http://10.0.0.104:8774/flavors/2141d048-a7db-4230-8c3b-a68551ac2cca", "rel": "bookmark"}]}, "id": "099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f", "security_groups": [{"name": "default"}], "user_id": "c6c28eddd60d4ff38a83d52a2b34fecf", "OS-DCF:diskConfig": "MANUAL", "accessIPv4": "", "accessIPv6": "", "progress": 0, "OS-EXT-STS:power_state": 3, "OS-EXT-AZ:availability_zone": "nova", "config_drive": "", "status": "MIGRATING", "updated": "2019-05-22T09:19:26Z", "hostId": "ba6037e255568dae0d1032f97b5a0148f0f8f8c5923780dd62e7e834", "OS-SRV-USG:terminated_at": null, "key_name": null, "name": "tempest-LiveMigrationTest-server-818390104", "created": "2019-05-22T09:18:36Z", "tenant_id": "62a1524e90ce4a999f8f960f26967621", "os-extended-volumes:volumes_attached": [], "metadata": {}}}
2019-05-22 05:19:29,965 469238 INFO     [tempest.lib.common.rest_client] Request (LiveMigrationTest:test_live_block_migration_paused): 200 GET http://10.0.0.104:8774/v2.1/servers/099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f 0.460s
2019-05-22 05:19:29,966 469238 DEBUG    [tempest.lib.common.rest_client] Request - Headers: {'Content-Type': 'application/json', 'Accept': 'application/json', 'X-Auth-Token': '<omitted>'}
        Body: None
    Response - Headers: {'status': '200', u'content-length': '1549', 'content-location': 'http://10.0.0.104:8774/v2.1/servers/099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f', u'x-compute-request-id': 'req-94435664-4c39-4c02-a087-d163578113aa', u'vary': 'OpenStack-API-Version,X-OpenStack-Nova-API-Version,Accept-Encoding', u'server': 'Apache', u'openstack-api-version': 'compute 2.1', u'connection': 'close', u'x-openstack-nova-api-version': '2.1', u'date': 'Wed, 22 May 2019 09:19:29 GMT', u'content-type': 'application/json', u'x-openstack-request-id': 'req-94435664-4c39-4c02-a087-d163578113aa'}
        Body: {"server": {"OS-EXT-STS:task_state": "migrating", "addresses": {"tempest-LiveMigrationTest-848502904-network": [{"OS-EXT-IPS-MAC:mac_addr": "fa:16:3e:93:df:8b", "version": 4, "addr": "10.100.0.12", "OS-EXT-IPS:type": "fixed"}]}, "links": [{"href": "http://10.0.0.104:8774/v2.1/servers/099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f", "rel": "self"}, {"href": "http://10.0.0.104:8774/servers/099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f", "rel": "bookmark"}], "image": {"id": "898431fb-7e30-49cc-97d9-25c5d7c4fe6f", "links": [{"href": "http://10.0.0.104:8774/images/898431fb-7e30-49cc-97d9-25c5d7c4fe6f", "rel": "bookmark"}]}, "OS-EXT-STS:vm_state": "paused", "OS-SRV-USG:launched_at": "2019-05-22T09:18:52.000000", "flavor": {"id": "2141d048-a7db-4230-8c3b-a68551ac2cca", "links": [{"href": "http://10.0.0.104:8774/flavors/2141d048-a7db-4230-8c3b-a68551ac2cca", "rel": "bookmark"}]}, "id": "099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f", "security_groups": [{"name": "default"}], "user_id": "c6c28eddd60d4ff38a83d52a2b34fecf", "OS-DCF:diskConfig": "MANUAL", "accessIPv4": "", "accessIPv6": "", "progress": 0, "OS-EXT-STS:power_state": 3, "OS-EXT-AZ:availability_zone": "nova", "config_drive": "", "status": "MIGRATING", "updated": "2019-05-22T09:19:26Z", "hostId": "ba6037e255568dae0d1032f97b5a0148f0f8f8c5923780dd62e7e834", "OS-SRV-USG:terminated_at": null, "key_name": null, "name": "tempest-LiveMigrationTest-server-818390104", "created": "2019-05-22T09:18:36Z", "tenant_id": "62a1524e90ce4a999f8f960f26967621", "os-extended-volumes:volumes_attached": [], "metadata": {}}}
2019-05-22 05:19:31,915 469238 INFO     [tempest.lib.common.rest_client] Request (LiveMigrationTest:test_live_block_migration_paused): 200 GET http://10.0.0.104:8774/v2.1/servers/099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f 0.939s
2019-05-22 05:19:31,916 469238 DEBUG    [tempest.lib.common.rest_client] Request - Headers: {'Content-Type': 'application/json', 'Accept': 'application/json', 'X-Auth-Token': '<omitted>'}
        Body: None
    Response - Headers: {'status': '200', u'content-length': '1549', 'content-location': 'http://10.0.0.104:8774/v2.1/servers/099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f', u'x-compute-request-id': 'req-027a54e0-026f-422c-94d4-74e2af518391', u'vary': 'OpenStack-API-Version,X-OpenStack-Nova-API-Version,Accept-Encoding', u'server': 'Apache', u'openstack-api-version': 'compute 2.1', u'connection': 'close', u'x-openstack-nova-api-version': '2.1', u'date': 'Wed, 22 May 2019 09:19:30 GMT', u'content-type': 'application/json', u'x-openstack-request-id': 'req-027a54e0-026f-422c-94d4-74e2af518391'}
        Body: {"server": {"OS-EXT-STS:task_state": "migrating", "addresses": {"tempest-LiveMigrationTest-848502904-network": [{"OS-EXT-IPS-MAC:mac_addr": "fa:16:3e:93:df:8b", "version": 4, "addr": "10.100.0.12", "OS-EXT-IPS:type": "fixed"}]}, "links": [{"href": "http://10.0.0.104:8774/v2.1/servers/099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f", "rel": "self"}, {"href": "http://10.0.0.104:8774/servers/099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f", "rel": "bookmark"}], "image": {"id": "898431fb-7e30-49cc-97d9-25c5d7c4fe6f", "links": [{"href": "http://10.0.0.104:8774/images/898431fb-7e30-49cc-97d9-25c5d7c4fe6f", "rel": "bookmark"}]}, "OS-EXT-STS:vm_state": "paused", "OS-SRV-USG:launched_at": "2019-05-22T09:18:52.000000", "flavor": {"id": "2141d048-a7db-4230-8c3b-a68551ac2cca", "links": [{"href": "http://10.0.0.104:8774/flavors/2141d048-a7db-4230-8c3b-a68551ac2cca", "rel": "bookmark"}]}, "id": "099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f", "security_groups": [{"name": "default"}], "user_id": "c6c28eddd60d4ff38a83d52a2b34fecf", "OS-DCF:diskConfig": "MANUAL", "accessIPv4": "", "accessIPv6": "", "progress": 0, "OS-EXT-STS:power_state": 3, "OS-EXT-AZ:availability_zone": "nova", "config_drive": "", "status": "MIGRATING", "updated": "2019-05-22T09:19:26Z", "hostId": "ba6037e255568dae0d1032f97b5a0148f0f8f8c5923780dd62e7e834", "OS-SRV-USG:terminated_at": null, "key_name": null, "name": "tempest-LiveMigrationTest-server-818390104", "created": "2019-05-22T09:18:36Z", "tenant_id": "62a1524e90ce4a999f8f960f26967621", "os-extended-volumes:volumes_attached": [], "metadata": {}}}
2019-05-22 05:19:33,565 469238 INFO     [tempest.lib.common.rest_client] Request (LiveMigrationTest:test_live_block_migration_paused): 200 GET http://10.0.0.104:8774/v2.1/servers/099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f 0.640s
2019-05-22 05:19:33,566 469238 DEBUG    [tempest.lib.common.rest_client] Request - Headers: {'Content-Type': 'application/json', 'Accept': 'application/json', 'X-Auth-Token': '<omitted>'}
        Body: None
    Response - Headers: {'status': '200', u'content-length': '1549', 'content-location': 'http://10.0.0.104:8774/v2.1/servers/099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f', u'x-compute-request-id': 'req-c521cf4d-fbe5-4c97-8e9d-df54f1bb73e9', u'vary': 'OpenStack-API-Version,X-OpenStack-Nova-API-Version,Accept-Encoding', u'server': 'Apache', u'openstack-api-version': 'compute 2.1', u'connection': 'close', u'x-openstack-nova-api-version': '2.1', u'date': 'Wed, 22 May 2019 09:19:32 GMT', u'content-type': 'application/json', u'x-openstack-request-id': 'req-c521cf4d-fbe5-4c97-8e9d-df54f1bb73e9'}
        Body: {"server": {"OS-EXT-STS:task_state": "migrating", "addresses": {"tempest-LiveMigrationTest-848502904-network": [{"OS-EXT-IPS-MAC:mac_addr": "fa:16:3e:93:df:8b", "version": 4, "addr": "10.100.0.12", "OS-EXT-IPS:type": "fixed"}]}, "links": [{"href": "http://10.0.0.104:8774/v2.1/servers/099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f", "rel": "self"}, {"href": "http://10.0.0.104:8774/servers/099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f", "rel": "bookmark"}], "image": {"id": "898431fb-7e30-49cc-97d9-25c5d7c4fe6f", "links": [{"href": "http://10.0.0.104:8774/images/898431fb-7e30-49cc-97d9-25c5d7c4fe6f", "rel": "bookmark"}]}, "OS-EXT-STS:vm_state": "paused", "OS-SRV-USG:launched_at": "2019-05-22T09:18:52.000000", "flavor": {"id": "2141d048-a7db-4230-8c3b-a68551ac2cca", "links": [{"href": "http://10.0.0.104:8774/flavors/2141d048-a7db-4230-8c3b-a68551ac2cca", "rel": "bookmark"}]}, "id": "099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f", "security_groups": [{"name": "default"}], "user_id": "c6c28eddd60d4ff38a83d52a2b34fecf", "OS-DCF:diskConfig": "MANUAL", "accessIPv4": "", "accessIPv6": "", "progress": 0, "OS-EXT-STS:power_state": 3, "OS-EXT-AZ:availability_zone": "nova", "config_drive": "", "status": "MIGRATING", "updated": "2019-05-22T09:19:26Z", "hostId": "ba6037e255568dae0d1032f97b5a0148f0f8f8c5923780dd62e7e834", "OS-SRV-USG:terminated_at": null, "key_name": null, "name": "tempest-LiveMigrationTest-server-818390104", "created": "2019-05-22T09:18:36Z", "tenant_id": "62a1524e90ce4a999f8f960f26967621", "os-extended-volumes:volumes_attached": [], "metadata": {}}}
2019-05-22 05:19:34,925 469238 INFO     [tempest.lib.common.rest_client] Request (LiveMigrationTest:test_live_block_migration_paused): 200 GET http://10.0.0.104:8774/v2.1/servers/099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f 0.342s
2019-05-22 05:19:34,926 469238 DEBUG    [tempest.lib.common.rest_client] Request - Headers: {'Content-Type': 'application/json', 'Accept': 'application/json', 'X-Auth-Token': '<omitted>'}
        Body: None
    Response - Headers: {'status': '200', u'content-length': '1522', 'content-location': 'http://10.0.0.104:8774/v2.1/servers/099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f', u'x-compute-request-id': 'req-a5eabee0-fba4-444a-9a24-349436ade795', u'vary': 'OpenStack-API-Version,X-OpenStack-Nova-API-Version,Accept-Encoding', u'server': 'Apache', u'openstack-api-version': 'compute 2.1', u'connection': 'close', u'x-openstack-nova-api-version': '2.1', u'date': 'Wed, 22 May 2019 09:19:34 GMT', u'content-type': 'application/json', u'x-openstack-request-id': 'req-a5eabee0-fba4-444a-9a24-349436ade795'}
        Body: {"server": {"OS-EXT-STS:task_state": null, "addresses": {"tempest-LiveMigrationTest-848502904-network": [{"OS-EXT-IPS-MAC:mac_addr": "fa:16:3e:93:df:8b", "version": 4, "addr": "10.100.0.12", "OS-EXT-IPS:type": "fixed"}]}, "links": [{"href": "http://10.0.0.104:8774/v2.1/servers/099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f", "rel": "self"}, {"href": "http://10.0.0.104:8774/servers/099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f", "rel": "bookmark"}], "image": {"id": "898431fb-7e30-49cc-97d9-25c5d7c4fe6f", "links": [{"href": "http://10.0.0.104:8774/images/898431fb-7e30-49cc-97d9-25c5d7c4fe6f", "rel": "bookmark"}]}, "OS-EXT-STS:vm_state": "error", "OS-SRV-USG:launched_at": "2019-05-22T09:18:52.000000", "flavor": {"id": "2141d048-a7db-4230-8c3b-a68551ac2cca", "links": [{"href": "http://10.0.0.104:8774/flavors/2141d048-a7db-4230-8c3b-a68551ac2cca", "rel": "bookmark"}]}, "id": "099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f", "security_groups": [{"name": "default"}], "user_id": "c6c28eddd60d4ff38a83d52a2b34fecf", "OS-DCF:diskConfig": "MANUAL", "accessIPv4": "", "accessIPv6": "", "OS-EXT-STS:power_state": 0, "OS-EXT-AZ:availability_zone": "nova", "config_drive": "", "status": "ERROR", "updated": "2019-05-22T09:19:34Z", "hostId": "9c716cb3080e86b2bb14390316f6f0e7614102c98c41a37e52b968dd", "OS-SRV-USG:terminated_at": null, "key_name": null, "name": "tempest-LiveMigrationTest-server-818390104", "created": "2019-05-22T09:18:36Z", "tenant_id": "62a1524e90ce4a999f8f960f26967621", "os-extended-volumes:volumes_attached": [], "metadata": {}}}
2019-05-22 05:19:34,937 469238 INFO     [tempest.common.waiters] State transition "MIGRATING/migrating" ==> "ERROR/None" after 13 second wait

Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/tempest/api/compute/admin/test_live_migration.py", line 134, in test_live_block_migration_paused
    self._test_live_migration(state='PAUSED')
  File "/usr/lib/python2.7/site-packages/tempest/api/compute/admin/test_live_migration.py", line 124, in _test_live_migration
    self._live_migrate(server_id, source_host, state, volume_backed)
  File "/usr/lib/python2.7/site-packages/tempest/api/compute/admin/test_live_migration.py", line 81, in _live_migrate
    waiters.wait_for_server_status(self.servers_client, server_id, state)
  File "/usr/lib/python2.7/site-packages/tempest/common/waiters.py", line 78, in wait_for_server_status
    raise exceptions.BuildErrorException(server_id=server_id)
tempest.exceptions.BuildErrorException: Server 099b8a1e-5e34-4ac9-8d0d-73f8e5ae845f failed to build and is in ERROR status

Expected results:
Test passed

Comment 8 Archit Modi 2019-11-11 20:33:37 UTC
Reopening this as I found a reproducer with gdb logs.

Comment 9 Archit Modi 2019-11-11 20:39:44 UTC
Created attachment 1634995 [details]
gdb-compute-1

Comment 10 Archit Modi 2019-11-11 20:40:10 UTC
Created attachment 1634996 [details]
gdb-compute-0

Comment 11 Archit Modi 2019-11-11 20:40:24 UTC
(overcloud) [stack@undercloud-0 tempest-dir]$ tempest run --regex tempest.api.compute.admin.test_live_migration.LiveMigrationTest.test_live_block_migration_paused                                    
{0} tempest.api.compute.admin.test_live_migration.LiveMigrationTest.test_live_block_migration_paused [43.720125s] ... FAILED                                                                          
                                                                                                                                                                                                      
Captured traceback:                                                                                                                                                                                   
~~~~~~~~~~~~~~~~~~~                                                                                                                                                                                   
    b'Traceback (most recent call last):'                                                                                                                                                             
    b'  File "/usr/lib/python3.6/site-packages/tempest/api/compute/admin/test_live_migration.py", line 135, in test_live_block_migration_paused'                                                      
    b"    self._test_live_migration(state='PAUSED')"                                                                                                                                                  
    b'  File "/usr/lib/python3.6/site-packages/tempest/api/compute/admin/test_live_migration.py", line 125, in _test_live_migration'                                                                  
    b'    self._live_migrate(server_id, source_host, state, volume_backed)'                                                                                                                           
    b'  File "/usr/lib/python3.6/site-packages/tempest/api/compute/admin/test_live_migration.py", line 77, in _live_migrate'                                                                          
    b'    waiters.wait_for_server_status(self.servers_client, server_id, state)'                                                                                                                      
    b'  File "/usr/lib/python3.6/site-packages/tempest/common/waiters.py", line 78, in wait_for_server_status'                                                                                        
    b'    raise exceptions.BuildErrorException(server_id=server_id)'                                                                                                                                  
    b'tempest.exceptions.BuildErrorException: Server f5fc8e99-f177-46c4-93a7-d3d48d162eaf failed to build and is in ERROR status'
    b''

...OUTPUT OMITTED...

tearDownClass (tempest.api.compute.admin.test_live_migration.LiveMigrationTest)
-------------------------------------------------------------------------------

Captured traceback:
~~~~~~~~~~~~~~~~~~~
    b'Traceback (most recent call last):'
    b'  File "/usr/lib/python3.6/site-packages/tempest/test.py", line 225, in tearDownClass'
    b'    six.reraise(etype, value, trace)'
    b'  File "/usr/lib/python3.6/site-packages/six.py", line 675, in reraise'
    b'    raise value'
    b'  File "/usr/lib/python3.6/site-packages/tempest/test.py", line 197, in tearDownClass'
    b'    teardown()'
    b'  File "/usr/lib/python3.6/site-packages/tempest/test.py", line 568, in resource_cleanup'
    b'    raise testtools.MultipleExceptions(*cleanup_errors)'
    b"testtools.runtest.MultipleExceptions: (<class 'tempest.lib.exceptions.DeleteErrorException'>, Resource %(resource_id)s failed to delete and is in ERROR status"                                
    b'Details: Server f5fc8e99-f177-46c4-93a7-d3d48d162eaf failed to delete and is in ERROR status, <traceback object at 0x7f6287965bc8>)'                                                           
    b''

======
Totals
======
Ran: 2 tests in 51.7801 sec.
 - Passed: 0
 - Skipped: 0
 - Expected Fail: 0
 - Unexpected Success: 0
 - Failed: 2
Sum of execute time for each test: 43.7201 sec.

==============
Worker Balance
==============
 - Worker 0 (2 tests) => 0:00:51.780051

Comment 12 Archit Modi 2019-11-11 20:44:02 UTC
Created attachment 1634999 [details]
sos-compute-1

Comment 13 Archit Modi 2019-11-11 20:45:11 UTC
Created attachment 1635002 [details]
sos-compute-0

Comment 19 Archit Modi 2019-11-12 17:07:55 UTC
Created attachment 1635434 [details]
new-gdb-compute-1

Comment 20 Eduardo Habkost 2019-11-12 19:32:47 UTC
(In reply to Archit Modi from comment #19)
> Created attachment 1635434 [details]
> new-gdb-compute-1

The backtrace seems to be from a live QEMU process that is working as expected, not from a QEMU crash.  How exactly is it related to this bug report?

Comment 21 Archit Modi 2019-11-12 23:38:09 UTC
(In reply to Eduardo Habkost from comment #20)
> (In reply to Archit Modi from comment #19)
> > Created attachment 1635434 [details]
> > new-gdb-compute-1
> 
> The backtrace seems to be from a live QEMU process that is working as
> expected, not from a QEMU crash.  How exactly is it related to this bug
> report?

I am not sure which log represents a QEMU process vs. a QEMU crash. This tempest test has been failing not just on RHOS 15 but on other RHOS versions too (13, 14, 15, 16), and I want it fixed. The bug report doesn't just mention the crash; it is also about the tempest test failing.

Comment 22 Archit Modi 2019-11-12 23:48:39 UTC
(In reply to Eduardo Habkost from comment #20)
> (In reply to Archit Modi from comment #19)
> > Created attachment 1635434 [details]
> > new-gdb-compute-1
> 
> The backtrace seems to be from a live QEMU process that is working as
> expected, not from a QEMU crash.  How exactly is it related to this bug
> report?

Also, from the libvirt logs, I can see the reason it crashed:

...
char device redirected to /dev/pts/2 (label charserial0)
2019-11-12T17:02:12.583660Z qemu-kvm: -device cirrus-vga,id=video0,bus=pci.0,addr=0x2: warning: 'cirrus-vga' is deprecated, please use a different VGA card instead
2019-11-12 17:02:21.146+0000: initiating migration
qemu-kvm: block.c:5445: bdrv_inactivate_recurse: Assertion `!(bs->open_flags & BDRV_O_INACTIVE)' failed.
2019-11-12 17:02:21.756+0000: shutting down, reason=crashed

Comment 23 Kashyap Chamarthy 2019-11-13 09:04:19 UTC
After updating QEMU to qemu-kvm-rhev-2.12.0-20.el7 (or higher), did you make sure to restart libvirtd?

    systemctl stop libvirtd
    systemctl start libvirtd

Related: To confirm your running Nova instance is indeed using the updated QEMU version, please post the QEMU log for it. It should be located on the Compute host: /var/log/libvirt/qemu/instance-YYYYYYYY.log.
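
A quick way to check this (illustrative command, not taken from the report; the first "starting up" line of each instance log records the libvirt and QEMU versions):

    grep 'qemu version' /var/log/libvirt/qemu/instance-*.log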

Comment 24 Archit Modi 2019-11-13 13:07:34 UTC
I restarted libvirtd and could still reproduce the issue (latest-gdb-trace.txt attached).

[root@shark20 ~]# systemctl stop libvirtd
[root@shark20 ~]# systemctl start libvirtd
[root@shark20 ~]# systemctl status libvirtd
● libvirtd.service - Virtualization daemon
   Loaded: loaded (/usr/lib/systemd/system/libvirtd.service; enabled; vendor preset: enabled)
   Active: active (running) since Wed 2019-11-13 13:00:04 UTC; 1s ago
     Docs: man:libvirtd(8)
           https://libvirt.org
 Main PID: 38569 (libvirtd)
    Tasks: 24 (limit: 32768)
   CGroup: /system.slice/libvirtd.service
           ├─  705 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/libexec/libvirt_leaseshelper
           ├─  706 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/libexec/libvirt_leaseshelper
           ├─22029 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/external.conf --leasefile-ro --dhcp-script=/usr/libexec/libvirt_leaseshelper
           ├─22030 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/external.conf --leasefile-ro --dhcp-script=/usr/libexec/libvirt_leaseshelper
           ├─22246 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/management.conf --leasefile-ro --dhcp-script=/usr/libexec/libvirt_leaseshelper
           ├─22247 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/management.conf --leasefile-ro --dhcp-script=/usr/libexec/libvirt_leaseshelper
           ├─22423 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/data.conf --leasefile-ro --dhcp-script=/usr/libexec/libvirt_leaseshelper
           └─38569 /usr/sbin/libvirtd

Nov 13 13:00:05 shark20.mobius.lab.eng.rdu2.redhat.com dnsmasq[22423]: read /etc/hosts - 4 addresses
Nov 13 13:00:05 shark20.mobius.lab.eng.rdu2.redhat.com dnsmasq[22246]: read /etc/hosts - 4 addresses
Nov 13 13:00:05 shark20.mobius.lab.eng.rdu2.redhat.com dnsmasq[22246]: read /var/lib/libvirt/dnsmasq/management.addnhosts - 0 addresses
Nov 13 13:00:05 shark20.mobius.lab.eng.rdu2.redhat.com dnsmasq[22423]: read /var/lib/libvirt/dnsmasq/data.addnhosts - 0 addresses
Nov 13 13:00:05 shark20.mobius.lab.eng.rdu2.redhat.com dnsmasq[705]: read /var/lib/libvirt/dnsmasq/default.addnhosts - 0 addresses
Nov 13 13:00:05 shark20.mobius.lab.eng.rdu2.redhat.com dnsmasq[22029]: read /var/lib/libvirt/dnsmasq/external.addnhosts - 0 addresses
Nov 13 13:00:05 shark20.mobius.lab.eng.rdu2.redhat.com dnsmasq-dhcp[705]: read /var/lib/libvirt/dnsmasq/default.hostsfile
Nov 13 13:00:05 shark20.mobius.lab.eng.rdu2.redhat.com dnsmasq-dhcp[22029]: read /var/lib/libvirt/dnsmasq/external.hostsfile
Nov 13 13:00:05 shark20.mobius.lab.eng.rdu2.redhat.com dnsmasq-dhcp[22029]: not giving name undercloud-0 to the DHCP lease of 10.0.0.25 because the name exists in /etc/hosts with address 172.16.0.34
Nov 13 13:00:05 shark20.mobius.lab.eng.rdu2.redhat.com dnsmasq-dhcp[22246]: read /var/lib/libvirt/dnsmasq/management.hostsfile

[root@compute-1 qemu]# less instance-00000097.log
2019-11-13 13:02:10.347+0000: starting up libvirt version: 5.6.0, package: 6.module+el8.1.0+4244+9aa4e6bb (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2019-09-19-17:11:14, ), qemu version: 4.1.0qemu-kvm-4.1.0-13.module+el8.1.0+4313+ef76ec61, kernel: 4.18.0-147.el8.x86_64, hostname: compute-1.redhat.local
LC_ALL=C \
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin \
HOME=/var/lib/libvirt/qemu/domain-110-instance-00000097 \
XDG_DATA_HOME=/var/lib/libvirt/qemu/domain-110-instance-00000097/.local/share \
XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain-110-instance-00000097/.cache \
XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain-110-instance-00000097/.config \
QEMU_AUDIO_DRV=none \
/usr/libexec/qemu-kvm \
-name guest=instance-00000097,debug-threads=on \
-S \
-object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-110-instance-00000097/master-key.aes \
-machine pc-i440fx-rhel7.6.0,accel=kvm,usb=off,dump-guest-core=off \
-cpu Broadwell-IBRS,vme=on,ss=on,f16c=on,rdrand=on,hypervisor=on,arat=on,tsc-adjust=on,umip=on,arch-capabilities=on,xsaveopt=on,pdpe1gb=on,abm=on,skip-l1dfl-vmentry=on,spec-ctrl=off \
-m 64 \
-overcommit mem-lock=off \
-smp 1,sockets=1,cores=1,threads=1 \
-uuid dafe75d9-c1d3-4820-88f5-a440f246f425 \
-smbios 'type=1,manufacturer=Red Hat,product=OpenStack Compute,version=19.0.3-0.20190814170534.a8e19af.el8ost,serial=dafe75d9-c1d3-4820-88f5-a440f246f425,uuid=dafe75d9-c1d3-4820-88f5-a440f246f425,family=Virtual Machine' \
-no-user-config \
-nodefaults \
-chardev socket,id=charmonitor,fd=69,server,nowait \
-mon chardev=charmonitor,id=monitor,mode=control \
-rtc base=utc,driftfix=slew \
-global kvm-pit.lost_tick_policy=delay \
-no-hpet \
-no-shutdown \
-boot strict=on \
-device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \
-drive file=/var/lib/nova/instances/dafe75d9-c1d3-4820-88f5-a440f246f425/disk,format=qcow2,if=none,id=drive-virtio-disk0,cache=none \
-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1,write-cache=on \
-netdev tap,fd=72,id=hostnet0,vhost=on,vhostfd=73 \
-device virtio-net-pci,rx_queue_size=512,host_mtu=1442,netdev=hostnet0,id=net0,mac=fa:16:3e:cb:7c:3f,bus=pci.0,addr=0x3 \
-add-fd set=3,fd=75 \
-chardev pty,id=charserial0,logfile=/dev/fdset/3,logappend=on \
-device isa-serial,chardev=charserial0,id=serial0 \
-device usb-tablet,id=input0,bus=usb.0,port=1 \
-vnc 172.17.1.89:0 \
-device cirrus-vga,id=video0,bus=pci.0,addr=0x2 \
-incoming defer \
-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 \
-object rng-random,id=objrng0,filename=/dev/urandom \
-device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.0,addr=0x6 \
-sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny \
-msg timestamp=on
char device redirected to /dev/pts/2 (label charserial0)
2019-11-13T13:02:10.554791Z qemu-kvm: -device cirrus-vga,id=video0,bus=pci.0,addr=0x2: warning: 'cirrus-vga' is deprecated, please use a different VGA card instead
2019-11-13 13:02:23.097+0000: initiating migration
qemu-kvm: block.c:5445: bdrv_inactivate_recurse: Assertion `!(bs->open_flags & BDRV_O_INACTIVE)' failed.
2019-11-13 13:02:23.777+0000: shutting down, reason=crashed

Comment 25 Archit Modi 2019-11-13 13:08:04 UTC
Created attachment 1635763 [details]
latest-gdb-trace

Comment 26 Eduardo Habkost 2019-11-13 13:44:58 UTC
(In reply to Archit Modi from comment #25)
> Created attachment 1635763 [details]
> latest-gdb-trace

I still don't understand what the gdb traces you are attaching are supposed to mean.  They just show a QEMU process that is working as expected, and contain no information about the crash being reported.  How exactly are you generating those gdb traces?

Comment 27 Archit Modi 2019-11-13 13:47:57 UTC
(In reply to Eduardo Habkost from comment #26)
> (In reply to Archit Modi from comment #25)
> > Created attachment 1635763 [details]
> > latest-gdb-trace
> 
> I still don't understand what the gdb traces you are attaching are supposed
> to mean.  They just show a QEMU process that is working as expected, and
> contain no information about the crash being reported.  How exactly are you
> generating those gdb traces?

I was following Kashyap's request to gather these traces.

(0) Install the debuginfo packages for QEMU on RHEL-8 (this requires you
    have the debuginfo RPM repositories enabled!)

      $> dnf debuginfo-install qemu-kvm

(1) Launch QEMU with GDB:

      $> gdb /usr/libexec/qemu-kvm

(2) In the GDB shell, set the logging:

      (gdb) set logging on
      (gdb) run

[Reproduce the problem]

(3) *Now* in the GDB shell do:

      (gdb) thread apply all bt full
      (gdb) quit

Comment 29 Eduardo Habkost 2019-11-13 13:56:57 UTC
(In reply to Archit Modi from comment #27)
[...]
> [Reproduce the problem]
> 
> (3) *Now* in the GDB shell do:
> 
>       (gdb) thread apply all bt full
>       (gdb) quit


This won't work unless you are able to reproduce the problem inside the QEMU process you are running.  This doesn't seem to be the case, as you are running qemu-kvm without any arguments.

If you can't attach gdb to the running QEMU process before it crashes, you will need a core dump from the crashed process.  Is ABRT running on the host?  If ABRT is capturing the crash, you can use ABRT to get the core dump and/or backtrace.
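
If attaching gdb in time is not practical, the core-dump route could look something like this (commands assumed, adjust to what is actually running on the host):

    # list crashes ABRT has captured, then inspect the qemu-kvm entry
    abrt-cli list
    # or, on hosts using systemd-coredump instead of ABRT:
    coredumpctl list /usr/libexec/qemu-kvm
    coredumpctl gdb /usr/libexec/qemu-kvm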

Comment 31 Eduardo Habkost 2019-11-13 14:09:50 UTC
(In reply to Archit Modi from comment #24)
[...]
> [root@compute-1 qemu]# less instance-00000097.log
> 2019-11-13 13:02:10.347+0000: starting up libvirt version: 5.6.0, package:
> 6.module+el8.1.0+4244+9aa4e6bb (Red Hat, Inc.
> <http://bugzilla.redhat.com/bugzilla>, 2019-09-19-17:11:14, ), qemu version:
> 4.1.0qemu-kvm-4.1.0-13.module+el8.1.0+4313+ef76ec61, kernel:
> 4.18.0-147.el8.x86_64, hostname: compute-1.redhat.local
[...]
> 2019-11-13 13:02:23.097+0000: initiating migration
> qemu-kvm: block.c:5445: bdrv_inactivate_recurse: Assertion `!(bs->open_flags
> & BDRV_O_INACTIVE)' failed.
> 2019-11-13 13:02:23.777+0000: shutting down, reason=crashed

qemu-kvm-4.1.0-13.module+el8.1.0+4313+ef76ec61 is supposed to contain the fix for bug 1655963, commit 9e37271f50ec ("block: Don't inactivate children before parents").  It may be a completely different issue with a similar crash.  Kevin, can I reassign this to you?

Comment 32 Kevin Wolf 2019-11-13 16:41:57 UTC
(In reply to Eduardo Habkost from comment #31)
> qemu-kvm-4.1.0-13.module+el8.1.0+4313+ef76ec61 is supposed to contain the
> fix for bug 1655963, commit 9e37271f50ec ("block: Don't inactivate children
> before parents").  It may be a completely different issue with a similar
> crash.

Yes, it's a different assertion in a different place that is failing.

Essentially, we're trying to inactivate a block node that is already inactivated. Two cases where I could see this happening is during incoming migration (i.e. we would be trying to migrate away from a host that hadn't even completely received the VM state for a previous, still ongoing migration - it looks like we have checks to prevent this) or after a failed migration (i.e. we would be retrying migration without having restarted the guest).

Where in the logs do I have to look to see the history of the QEMU process in question to see if anything unusual happened before this? I suppose the QMP traffic is not logged, as usual?

And of course, a backtrace of the crashed QEMU would still be useful.

> Kevin, can I reassign this to you?

Let's first figure out some more details before moving this to qemu-kvm, we don't have much information yet. Maybe Kashyap can ping me on IRC tomorrow (he doesn't seem to be available at the moment).
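
For reference, QMP traffic can usually be made visible by enabling libvirt's monitor debug logging; a minimal sketch, assuming default file locations (configuration not taken from this report):

    # enable QMP monitor logging in libvirtd, then restart it
    echo 'log_filters="1:qemu_monitor"' >> /etc/libvirt/libvirtd.conf
    echo 'log_outputs="1:file:/var/log/libvirt/libvirtd.log"' >> /etc/libvirt/libvirtd.conf
    systemctl restart libvirtd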

Comment 33 Eduardo Habkost 2019-11-13 16:55:24 UTC
(In reply to Kevin Wolf from comment #32)
> (In reply to Eduardo Habkost from comment #31)
> > qemu-kvm-4.1.0-13.module+el8.1.0+4313+ef76ec61 is supposed to contain the
> > fix for bug 1655963, commit 9e37271f50ec ("block: Don't inactivate children
> > before parents").  It may be a completely different issue with a similar
> > crash.
> 
> Yes, it's a different assertion in a different place that is failing.
> 
> Essentially, we're trying to inactivate a block node that is already
> inactivated. Two cases where I could see this happening is during incoming
> migration (i.e. we would be trying to migrate away from a host that hadn't
> even completely received the VM state for a previous, still ongoing
> migration - it looks like we have checks to prevent this) or after a failed
> migration (i.e. we would be retrying migration without having restarted the
> guest).
> 
> Where in the logs do I have to look to see the history of the QEMU process
> in question to see if anything unusual happened before this? I suppose the
> QMP traffic is not logged, as usual?


The libvirt logs above (comment #24) don't seem to include QMP traffic, unfortunately.

> 
> And of course, a backtrace of the crashed QEMU would still be useful.

Kashyap is working on that, I hope we get a backtrace tomorrow.


> 
> > Kevin, can I reassign this to you?
> 
> Let's first figure out some more details before moving this to qemu-kvm, we
> don't have much information yet. Maybe Kashyap can ping me on IRC tomorrow
> (he doesn't seem to be available at the moment).

Even if there's a bug somewhere else triggering this, I consider any assertion failure in QEMU a QEMU bug.  But we can wait until we have more details.  Thanks!

Comment 34 Kashyap Chamarthy 2019-11-13 18:53:52 UTC
Okay ... after much fiddling around, I got the core (thanks to an idea from a colleague, Patrick Uiterwijk): I had to `pgrep` until the QEMU process existed, then `gdb attach` to get this:

------------------------------------------------------------------------
Thread 7 "live_migration" received signal SIGABRT, Aborted.
[Switching to Thread 0x7f24699f5700 (LWP 256858)]
0x00007f2470dc38df in raise () from target:/lib64/libc.so.6
(gdb) gcore
warning: target file /proc/256592/cmdline contained unexpected null characters
Saved corefile core.256592
(gdb) bt
#0  0x00007f2470dc38df in raise () from target:/lib64/libc.so.6
#1  0x00007f2470dadcf5 in abort () from target:/lib64/libc.so.6
#2  0x00007f2470dadbc9 in __assert_fail_base.cold.0 () from target:/lib64/libc.so.6
#3  0x00007f2470dbbe96 in __assert_fail () from target:/lib64/libc.so.6
#4  0x00005596dd8828a9 in bdrv_inactivate_recurse (bs=0x5596de337610) at block.c:5445
#5  0x00005596dd883f0f in bdrv_inactivate_all () at block.c:5513
#6  0x00005596dd81276c in qemu_savevm_state_complete_precopy (f=0x5596ded95d80, iterable_only=<optimized out>, inactivate_disks=<optimized out>) at migration/savevm.c:1332
#7  0x00005596dd80ef56 in migration_thread (opaque=0x5596de308450) at migration/migration.c:2775
#8  0x00005596dd94d5e4 in qemu_thread_start (args=0x5596de34aa20) at util/qemu-thread-posix.c:502
#9  0x00007f24711572de in start_thread () from target:/lib64/libpthread.so.0
#10 0x00007f2470e88133 in clone () from target:/lib64/libc.so.6
(gdb) 
(gdb) frame 4
#4  0x00005596dd8828a9 in bdrv_inactivate_recurse (bs=0x5596de337610) at block.c:5445
5445    block.c: No such file or directory.
(gdb) info locals
child = <optimized out>
parent = <optimized out>
tighten_restrictions = false
perm = <optimized out>
shared_perm = <optimized out>
ret = <optimized out>
__PRETTY_FUNCTION__ = "bdrv_inactivate_recurse"
(gdb)
------------------------------------------------------------------------

I have the core file with me; so further analysis is possible.
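
As a script, the attach-in-time trick described above would look roughly like this (exact commands assumed; pgrep returns non-zero until a match exists, so the loop spins until the short-lived destination QEMU appears):

    # poll for the destination QEMU process, then attach immediately so
    # gdb catches the SIGABRT in-process
    until PID=$(pgrep -nf /usr/libexec/qemu-kvm); do sleep 0.1; done
    gdb -p "$PID" -ex continue
    # after SIGABRT is caught in gdb:
    #   (gdb) gcore                    # save a core for offline analysis
    #   (gdb) thread apply all bt full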

Comment 35 Kashyap Chamarthy 2019-11-14 14:18:20 UTC
Created attachment 1636144 [details]
QMP traffic from compute-1

Comment 36 Kashyap Chamarthy 2019-11-14 14:44:32 UTC
Created attachment 1636155 [details]
QMP traffic from compute-0

Comment 37 Kevin Wolf 2019-11-14 15:49:01 UTC
So it turns out this is actually another instance of a long-standing problem with disk images and migration: the migration subsystem doesn't clearly define the ownership transfer of resources like disk images, and also doesn't systematically forbid accessing images that the current QEMU process doesn't own. The block layer catches such accesses every now and then, and they are usually addressed with a one-off fix rather than in a systematic way. Maybe now is finally the time for a real fix.

What happens, in short: compute-1 is started with incoming migration and -S. The -S option means that after migration completes, the VM isn't started yet and ownership of the disk images isn't taken yet. Both of these happen only when you issue a 'cont' command. However, in this instance, contrary to the expectation of the interface, no 'cont' command was issued; instead, 'migrate' was called again to migrate away without ever having taken control of the images. This finally led to an attempt to inactivate already inactive images, which caused the assertion failure.

We should probably forbid most monitor commands (including 'migrate') in this state where the incoming migration is almost, but not quite, completed and we still lack ownership of some resources. We could probably also use a QMP command that takes ownership of all resources (and therefore completes the incoming migration) without resuming the VM yet.
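
To make the first idea concrete, a guard along these lines in the 'migrate' command handler could reject the problematic sequence up front (a hypothetical sketch; 'images_owned' is an illustrative placeholder, not an existing QEMU variable):

    ---
    /* Hypothetical sketch of the proposed guard in the 'migrate'
     * command handler; 'images_owned' is an illustrative placeholder
     * for "has this process taken ownership of the disk images yet". */
    if (runstate_check(RUN_STATE_INMIGRATE) || !images_owned) {
        error_setg(errp, "Cannot start an outgoing migration before "
                   "ownership of the disk images has been taken; "
                   "issue 'cont' first");
        return;
    }
    ---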

Related upstream mailing threads from January:

* 'Re: [Qemu-devel] [Qemu-block] Aborts in iotest 169'
   https://lists.gnu.org/archive/html/qemu-devel/2019-01/msg06136.html

* 'Re: [Qemu-devel] 3.1: second invocation of migrate crashes qemu'
   https://lists.gnu.org/archive/html/qemu-devel/2019-01/msg02780.html

If we want to go for the full solution this time rather than another local one-off hack to fix one specific instance, this will probably be more of a migration, runstate and monitor subsystem thing than a block layer one.

Comment 38 Kashyap Chamarthy 2019-11-14 17:41:38 UTC
Thanks for the excellent summary of the cause and the useful references,
Kevin.

An addendum on the test setup RHOS QE was using (not mentioned
anywhere in the bug so far):

- The setup actually has _shared_ storage (NFS-based); but the test,
  which is live-migrating a paused instance and its storage, makes
  more sense in a *non-shared* storage setup:

    ---
    [root@compute-1 qemu]# mount | grep nova
    192.168.24.1:/nfsfileshare on /var/lib/nova/instances type nfs4 (rw,relatime,context=system_u:object_r:nfs_t:s0,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=192.168.24.26,local_lock=none,addr=192.168.24.1,_netdev)
    ---
    [root@compute-0 ~]# mount | grep nova
    192.168.24.1:/nfsfileshare on /var/lib/nova/instances type nfs4 (rw,relatime,context=system_u:object_r:nfs_t:s0,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=192.168.24.37,local_lock=none,addr=192.168.24.1,_netdev)
    ---

- The QMP traffic (from compute-1) confirms that the live migration
  being performed does _not_ include "block migration" (as evidenced by
  "blk":false):

    {"execute":"migrate","arguments":{"detach":true,"blk":false,"inc":false,"uri":"fd:migrate"},"id":"libvirt-296"}

            - - -

I'll let Dave and other migration folks respond on the migration state
machine.  Once the dust settles, I can probably file a better bug /
RFE.

Comment 40 Kashyap Chamarthy 2019-11-15 10:46:06 UTC
Created attachment 1636431 [details]
QEMU command-line of the affected instance from compute-0

Comment 41 Kashyap Chamarthy 2019-11-15 11:11:11 UTC
Created attachment 1636435 [details]
QEMU command-line of the affected instance from compute-1

Comment 43 Dr. David Alan Gilbert 2019-11-20 10:30:42 UTC
Kevin: Thanks for the analysis; OK, so yes: migrating after an incoming migration, without a 'cont'.
Adding an explicit ownership transfer in this case could help - basically a 'cont' without starting the CPUs.
However, to be able to implement that we need non-asserting ownership calls from the block layer; for example,
if we add a 'cont-light' and someone issues it twice, we'll probably get a similar assert. We need to have
ways for the block layer to let us take/release the disks without it asserting and killing the process --
it should just fail the call.
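
Concretely, a non-asserting variant might look like this (a hypothetical sketch; the function name is illustrative):

    ---
    /* Hypothetical sketch of a non-asserting variant: report failure
     * to the caller instead of aborting the whole process when the
     * image is already inactive. The function name is illustrative. */
    static int bdrv_try_inactivate(BlockDriverState *bs, Error **errp)
    {
        if (bs->open_flags & BDRV_O_INACTIVE) {
            error_setg(errp, "Image is already inactive");
            return -EINVAL;   /* just fail the call */
        }
        /* ... proceed with inactivation as today ... */
        return 0;
    }
    ---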

Comment 44 Kevin Wolf 2019-11-20 10:42:46 UTC
Yes, that makes sense. In fact, the call that you need here is about activating images (bdrv_invalidate_cache_all), which can already be called multiple times without crashing.

We probably need to remove the assertion from bdrv_inactivate_all() as well, but for a different reason: if either operation fails, you end up in a state where some images are active and others are inactive. In such a state, you should be able to retry in either direction. We should never try to inactivate images when all of them are already inactive (that would show we didn't properly keep track of the state we're in), but that narrower condition seems hard to assert.
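
For comparison, activation is already tolerant of repeated calls, roughly like this (a simplified sketch, not verbatim upstream code):

    ---
    /* Simplified sketch: re-activating an already-active node is a
     * no-op, which is why bdrv_invalidate_cache_all() can safely be
     * called more than once. */
    static void bdrv_invalidate_cache(BlockDriverState *bs, Error **errp)
    {
        if (!(bs->open_flags & BDRV_O_INACTIVE)) {
            return;   /* already active: nothing to do */
        }
        /* ... re-read image metadata, reacquire file locks ... */
        bs->open_flags &= ~BDRV_O_INACTIVE;
    }
    ---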

Comment 45 Dr. David Alan Gilbert 2019-11-20 19:59:19 UTC
(In reply to Kevin Wolf from comment #44)
> Yes, that makes sense. In fact, the call that you need here is about
> activating images (bdrv_invalidate_cache_all), which can already be called
> multiple times without crashing.
> 
> We probably need to also remove the assertion from bdrv_inactivate_all(),
> but for a different reason: If one of both operations fails, you end up with
> a state where some images are active and others are inactive. In such
> states, you should be able to retry in either direction. We should never try
> to inactivate images when all of them are already inactivated (it shows that
> we didn't properly keep track of the state we're in), but that restricted
> condition seems hard to assert.

The other side-effect of removing that assertion, though, is that bugs like this
turn into a failed migration rather than a dead VM, which is much more pleasant --
especially since, even with the command we're suggesting, the same problem happens
whenever something forgets to use it.

Can we also have a command that tells us what the current state is? (Or is that
already covered by one of the query-block commands?)

As for the commands, how about:

block-claim  claim:bool   

where passing true takes ownership, passing false drops ownership.
(HMP equivalents block_claim -c / -r   for claim/release)

Dave
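
A handler for such a command might look roughly like this ('block-claim' was never implemented upstream, so the whole sketch is hypothetical; bdrv_invalidate_cache_all() and bdrv_inactivate_all() are the existing activation/inactivation entry points):

    ---
    /* Hypothetical QMP handler for the proposed 'block-claim' command;
     * the command never landed upstream, so this only illustrates the
     * suggested semantics. */
    void qmp_block_claim(bool claim, Error **errp)
    {
        if (claim) {
            /* Take ownership: activate all images (safe to repeat). */
            bdrv_invalidate_cache_all(errp);
        } else {
            /* Drop ownership: inactivate all images. */
            if (bdrv_inactivate_all() < 0) {
                error_setg(errp, "Failed to inactivate block devices");
            }
        }
    }
    ---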

Comment 47 Ademar Reis 2020-02-05 22:58:11 UTC
QEMU has recently been split into sub-components, and as a one-time operation to avoid breaking tools, we are setting the QEMU sub-component of this BZ to "General". Please review and change the sub-component if necessary the next time you review this BZ. Thanks.

Comment 48 Dr. David Alan Gilbert 2020-05-07 15:17:52 UTC
*** Bug 1831103 has been marked as a duplicate of this bug. ***

Comment 49 Dr. David Alan Gilbert 2020-06-29 09:45:51 UTC
Throwing this over to kwolf; once you have an interface that doesn't assert, please throw the bz back to me, and then
I can wire up the migration to fail nicely and provide an explicit ownership hand-off.

(RHV hit this one again as a side effect of bz 1819721)

Comment 50 Lee Yarwood 2020-08-21 16:26:32 UTC
*** Bug 1870383 has been marked as a duplicate of this bug. ***

Comment 54 RHEL Program Management 2021-02-01 07:40:59 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

Comment 58 RHEL Program Management 2021-08-03 07:27:06 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

