Description of problem:

Unexpected "Instance not resizing, skipping migration" is output by resource_tracker:

2016-12-02 20:31:16.514 4580 WARNING nova.compute.resource_tracker [req-c376a2a1-b82b-4741-bde0-585758d7de42 - - - - -] [instance: 60dd704d-c5e8-41c6-843e-4d7e5e4d3cf0] Instance not resizing, skipping migration.

# mysql -u root -e "select * from migrations" nova
+---------------------+---------------------+------------+-----+---------------------------+---------------------------+------------+-----------+--------------------------------------+----------------------+----------------------+---------------------------+---------------------------+---------+----------------+--------+
| created_at          | updated_at          | deleted_at | id  | source_compute            | dest_compute              | dest_host  | status    | instance_uuid                        | old_instance_type_id | new_instance_type_id | source_node               | dest_node                 | deleted | migration_type | hidden |
+---------------------+---------------------+------------+-----+---------------------------+---------------------------+------------+-----------+--------------------------------------+----------------------+----------------------+---------------------------+---------------------------+---------+----------------+--------+
| 2016-11-30 09:42:18 | 2016-11-30 09:46:53 | NULL       |  20 | adc-r3-az2-compute002.bky | adc-r3-az2-compute014.bky | 10.1.17.54 | completed | 60dd704d-c5e8-41c6-843e-4d7e5e4d3cf0 | NULL                 | NULL                 | adc-r3-az2-compute002.bky | adc-r3-az2-compute014.bky |       0 | evacuation     |      0 |
+---------------------+---------------------+------------+-----+---------------------------+---------------------------+------------+-----------+--------------------------------------+----------------------+----------------------+---------------------------+---------------------------+---------+----------------+--------+

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
I can see the status of this migration record is "completed" while the type is "evacuation".

compute/resource_tracker.py:

    for migration in migrations:
        uuid = migration.instance_uuid

        try:
            if uuid not in instances:
                instances[uuid] = migration.instance
        except exception.InstanceNotFound as e:
            # migration referencing deleted instance
            LOG.debug('Migration instance not found: %s', e)
            continue

        # skip migration if instance isn't in a resize state:
        if not _instance_in_resize_state(instances[uuid]):
            LOG.warning(_LW("Instance not resizing, skipping migration."),
                        instance_uuid=uuid)
            continue

    migrations = objects.MigrationList.get_in_progress_by_host_and_node(
        context, self.host, self.nodename)

The "migrations" list comes from a SQL query, in db/sqlalchemy/api.py:

    def migration_get_in_progress_by_host_and_node(context, host, node):

        return model_query(context, models.Migration).\
                filter(or_(and_(models.Migration.source_compute == host,
                                models.Migration.source_node == node),
                           and_(models.Migration.dest_compute == host,
                                models.Migration.dest_node == node))).\
                filter(~models.Migration.status.in_(['accepted', 'confirmed',
                                                     'reverted', 'error',
                                                     'failed'])).\
                options(joinedload_all('instance.system_metadata')).\
                all()

Here comes the problem: the status filter does not exclude 'completed', so the completed evacuation record for instance 60dd704d-c5e8-41c6-843e-4d7e5e4d3cf0 is still returned, and the warning keeps being logged. So I am wondering whether we could safely hack the DB to remove those entries?
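On the question of hacking the DB, here is a minimal sketch of one possible stopgap, not an official procedure. Since model_query() skips soft-deleted rows by default, soft-deleting the stale record should hide it from the in-progress query. This assumes nova's soft-delete convention of setting the deleted column to the row's id; take a DB backup and test on a non-production copy first.

    -- Hypothetical cleanup: soft-delete the stale evacuation record so
    -- migration_get_in_progress_by_host_and_node() no longer returns it.
    UPDATE migrations
       SET deleted = id,
           deleted_at = NOW(),
           updated_at = NOW()
     WHERE instance_uuid = '60dd704d-c5e8-41c6-843e-4d7e5e4d3cf0'
       AND migration_type = 'evacuation'
       AND status = 'completed';

The later comments show the real fix lands in the query itself (adding 'completed' to the excluded statuses), so a DB edit like this would only be a workaround for already-affected records.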
How to reproduce the issue:

1. Create instances like below.

[root@hp-dl160g8-7 nova(keystone_admin)]# nova list --field host,status,name
+--------------------------------------+-------------------------------------+--------+------------+
| ID                                   | Host                                | Status | Name       |
+--------------------------------------+-------------------------------------+--------+------------+
| d46f455b-8b6c-4b3f-9c5d-69fb446f028b | hp-dl160g8-7.gsslab.pek2.redhat.com | ACTIVE | instance-1 |
| 4c1036e2-4efd-4f43-9701-35c5f1351cc6 | hp-dl160g8-8.gsslab.pek2.redhat.com | ACTIVE | instance-2 |
+--------------------------------------+-------------------------------------+--------+------------+

2. Stop nova-compute on hp-dl160g8-8.gsslab.pek2.redhat.com.

3. Wait for nova hypervisor-list to show it is down.

[root@hp-dl160g8-7 nova(keystone_admin)]# nova hypervisor-list
+----+-------------------------------------+-------+---------+
| ID | Hypervisor hostname                 | State | Status  |
+----+-------------------------------------+-------+---------+
| 1  | hp-dl160g8-7.gsslab.pek2.redhat.com | up    | enabled |
| 2  | hp-dl160g8-8.gsslab.pek2.redhat.com | down  | enabled |
+----+-------------------------------------+-------+---------+

4. Evacuate instance-2.

[root@hp-dl160g8-7 nova(keystone_admin)]# nova evacuate instance-2
+-----------+--------------+
| Property  | Value        |
+-----------+--------------+
| adminPass | pWU6AdpFmX7d |
+-----------+--------------+

5. Confirm instance-2 becomes active.

[root@hp-dl160g8-7 nova(keystone_admin)]# nova list
+--------------------------------------+------------+--------+------------+-------------+-----------------------+
| ID                                   | Name       | Status | Task State | Power State | Networks              |
+--------------------------------------+------------+--------+------------+-------------+-----------------------+
| d46f455b-8b6c-4b3f-9c5d-69fb446f028b | instance-1 | ACTIVE | -          | Running     | private-1=192.168.1.2 |
| 4c1036e2-4efd-4f43-9701-35c5f1351cc6 | instance-2 | ACTIVE | -          | Running     | private-1=192.168.1.4 |
+--------------------------------------+------------+--------+------------+-------------+-----------------------+

6. Check the log.

[root@hp-dl160g8-7 nova(keystone_admin)]# tailf /var/log/nova/nova-compute.log | grep d46f455b-8b6c-4b3f-9c5d-69fb446f028b
2016-12-12 09:24:23.013 16316 WARNING nova.compute.resource_tracker [req-17679e9c-86e4-4a01-84cb-9d0fb97e7ccf - - - - -] [instance: d46f455b-8b6c-4b3f-9c5d-69fb446f028b] Instance not resizing, skipping migration.
2016-12-12 09:25:22.765 16316 WARNING nova.compute.resource_tracker [req-17679e9c-86e4-4a01-84cb-9d0fb97e7ccf - - - - -] [instance: d46f455b-8b6c-4b3f-9c5d-69fb446f028b] Instance not resizing, skipping migration.
2016-12-12 09:26:25.077 16316 WARNING nova.compute.resource_tracker [req-17679e9c-86e4-4a01-84cb-9d0fb97e7ccf - - - - -] [instance: d46f455b-8b6c-4b3f-9c5d-69fb446f028b] Instance not resizing, skipping migration.
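To see the record that drives the warning, the migrations table can also be inspected directly after step 6. A minimal sketch, assuming the row looks like the one in the description above (the status is 'done' while the source host is still down and 'completed' once it comes back up; see the backport discussion below):

    -- Hypothetical check of the evacuation record left behind for instance-2.
    SELECT id, instance_uuid, source_compute, dest_compute,
           status, migration_type, deleted
      FROM migrations
     WHERE instance_uuid = '4c1036e2-4efd-4f43-9701-35c5f1351cc6';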
(In reply to Chen from comment #2)

Though we have many fixes around migration, I also found that we added the completed status to the filter here. Egohan, can we take a look at this?

Diff: filter on OSP8 vs OSP9

[root@el72-osp8-all-virbr1 ~]# rpm -qf /usr/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py
python-nova-12.0.4-4.el7ost.noarch

[root@el72-osp9-all-virbr1 ~]# rpm -qf /usr/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py
python-nova-13.1.1-2.el7ost.noarch

# diff -up osp8/api.py /usr/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py
...
+@pick_context_manager_reader
 def migration_get_in_progress_by_host_and_node(context, host, node):

     return model_query(context, models.Migration).\
@@ -4484,16 +4771,34 @@ def migration_get_in_progress_by_host_an
                             models.Migration.dest_node == node))).\
             filter(~models.Migration.status.in_(['accepted', 'confirmed',
                                                  'reverted', 'error',
-                                                 'failed'])).\
+                                                 'failed', 'completed'])).\
             options(joinedload_all('instance.system_metadata')).\
             all()


+@pick_context_manager_reader
+def migration_get_in_progress_by_instance(context, instance_uuid,
+                                          migration_type=None):
+    # TODO(Shaohe Feng) we should share the in-progress list.
+    # TODO(Shaohe Feng) will also summarize all status to a new
+    # MigrationStatus class.
+    query = model_query(context, models.Migration).\
+            filter_by(instance_uuid=instance_uuid).\
+            filter(models.Migration.status.in_(['queued', 'preparing',
+                                                'running',
+                                                'post-migrating']))
+    if migration_type:
+        query = query.filter(models.Migration.migration_type == migration_type)
+
+    return query.all()
+
+
+@pick_context_manager_reader
...
(In reply to Masaki Furuta from comment #3)
> (In reply to Chen from comment #2)
>
> Though we have many fixes around migration, I also found that we added the
> completed status to the filter here.
>
> Egohan, can we take a look at this?

I think this is the upstream fix that added the completed status:

- "Migration instance not found" is logged repeatedly to nova-compute.log after an instance was deleted
  https://code.launchpad.net/bugs/1583720
  https://git.openstack.org/cgit/openstack/nova/commit/?id=e5b8060c08bb972a1960575749f240d7160bc114
I was assigned to backport patch [1] to OSP8 to resolve this issue. After creating a test package for OSP8 with the patch [1] changes, I found that it solves this issue only partially. For example:

i)   Say you evacuate an instance from host 'A' (which is down) to host 'B'.
ii)  After the evacuation completes, the migration status is marked as "done" in the migrations table, and it stays in that status as long as host 'A' is down.
iii) When host 'A' is started and nova-compute is up and running on it, the "_destroy_evacuated_instances" method is called by init_host, which changes the migration status from "done" to "completed".

Since patch [1] only adds the "completed" status to the filtering criterion, we keep getting "Instance not resizing, skipping migration." warning messages in the nova-compute logs until step (iii) is performed. As long as the host from which the instance was evacuated is down, the migration status is never set to 'completed'.

I have submitted a review for OSP8 [2].

[1] https://review.openstack.org/#/q/I7aafab9abdbfafe9479846f06068ba8a963d290a,n,z
[2] https://code.engineering.redhat.com/gerrit/#/c/95329/
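A quick way to spot the records that still slip past the 'completed'-only filter is to look for evacuation rows stuck in 'done'. A minimal sketch, assuming the status lifecycle described above:

    -- Evacuation records still in 'done' (source host not yet restarted);
    -- with patch [1] alone these keep triggering the warning, since only
    -- 'completed' was added to the excluded statuses.
    SELECT id, instance_uuid, source_compute, dest_compute, status, updated_at
      FROM migrations
     WHERE migration_type = 'evacuation'
       AND status = 'done'
       AND deleted = 0;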
Resource tracker outputs the warning message, but after starting the compute node again the messages stop.

********* VERSION *********

1:12.0.6-9.el7ost

****** LOGS ******

# nova list --field host,status,name
+--------------------------------------+------------------------------+--------+------+
| ID                                   | Host                         | Status | Name |
+--------------------------------------+------------------------------+--------+------+
| ea6af1b0-f0ed-4a7a-8569-c07f18ea54be | compute-0                    | ACTIVE | vm2  |
| 274b50e1-4a31-45a9-b7c5-917992c39eca | compute-1                    | ACTIVE | vm1  |
+--------------------------------------+------------------------------+--------+------+

# nova hypervisor-list
+----+------------------------------+-------+---------+
| ID | Hypervisor hostname          | State | Status  |
+----+------------------------------+-------+---------+
| 1  | compute-1                    | down  | enabled |
| 2  | compute-0                    | up    | enabled |
+----+------------------------------+-------+---------+

# nova evacuate vm1
+-----------+--------------+
| Property  | Value        |
+-----------+--------------+
| adminPass | EcNjjnfK7VzS |
+-----------+--------------+

# nova list --field host,status,name
+--------------------------------------+------------------------------+--------+------+
| ID                                   | Host                         | Status | Name |
+--------------------------------------+------------------------------+--------+------+
| 274b50e1-4a31-45a9-b7c5-917992c39eca | compute-0                    | ACTIVE | vm1  |
| ea6af1b0-f0ed-4a7a-8569-c07f18ea54be | compute-0                    | ACTIVE | vm2  |
+--------------------------------------+------------------------------+--------+------+

# tail -f /var/log/nova/nova-compute.log | grep 274b50e1-4a31-45a9-b7c5-917992c39eca
2017-03-06 17:13:50.521 26086 INFO nova.compute.manager [req-c4ed3458-103a-42a6-9a07-6ac52266310e - - - - -] [instance: 274b50e1-4a31-45a9-b7c5-917992c39eca] VM Resumed (Lifecycle Event)
2017-03-06 17:13:50.558 26086 INFO nova.virt.libvirt.driver [-] [instance: 274b50e1-4a31-45a9-b7c5-917992c39eca] Instance spawned successfully.
2017-03-06 17:13:50.679 26086 INFO nova.compute.manager [req-c4ed3458-103a-42a6-9a07-6ac52266310e - - - - -] [instance: 274b50e1-4a31-45a9-b7c5-917992c39eca] During the sync_power process the instance has moved from host compute-1 to host compute-0
2017-03-06 17:13:50.680 26086 INFO nova.compute.manager [req-c4ed3458-103a-42a6-9a07-6ac52266310e - - - - -] [instance: 274b50e1-4a31-45a9-b7c5-917992c39eca] VM Started (Lifecycle Event)
2017-03-06 17:13:50.948 26086 INFO nova.compute.manager [req-c4ed3458-103a-42a6-9a07-6ac52266310e - - - - -] [instance: 274b50e1-4a31-45a9-b7c5-917992c39eca] During the sync_power process the instance has moved from host compute-1 to host compute-0
2017-03-06 17:14:05.196 26086 WARNING nova.compute.resource_tracker [req-6c717967-4c30-4d25-839c-670ebedc9749 - - - - -] [instance: 274b50e1-4a31-45a9-b7c5-917992c39eca] Instance not resizing, skipping migration.
2017-03-06 17:15:07.190 26086 WARNING nova.compute.resource_tracker [req-6c717967-4c30-4d25-839c-670ebedc9749 - - - - -] [instance: 274b50e1-4a31-45a9-b7c5-917992c39eca] Instance not resizing, skipping migration.
2017-03-06 17:16:09.185 26086 WARNING nova.compute.resource_tracker [req-6c717967-4c30-4d25-839c-670ebedc9749 - - - - -] [instance: 274b50e1-4a31-45a9-b7c5-917992c39eca] Instance not resizing, skipping migration.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2017-0465.html