Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1592123

Summary: New VM creation fails - logged message is not informative/detailed enough
Product: Red Hat OpenStack
Reporter: Arkady Shtempler <ashtempl>
Component: openstack-nova
Assignee: OSP DFG:Compute <osp-dfg-compute>
Status: CLOSED WONTFIX
QA Contact: OSP DFG:Compute <osp-dfg-compute>
Severity: unspecified
Priority: unspecified
Version: 13.0 (Queens)
CC: berrange, dasmith, eglynn, jhakimra, kchamart, mbooth, sbauza, sferdjao, sgordon, srevivo, vromanso
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Doc Type: If docs needed, set a value
Last Closed: 2018-06-21 15:41:05 UTC
Type: Bug
Attachments:
Nova logs of both Compute and Controller are inside the attached file

Description Arkady Shtempler 2018-06-17 13:43:24 UTC
Created attachment 1452372 [details]
Nova logs of both Compute and Controller are inside the attached file

Setup:
OSP 13 is up and running (Undercloud + Overcloud)


Scenario:
1. SSH to your OSP host and run the following:
2. ssh undercloud-0 (SSH to the undercloud)
3. su - stack (switch to the stack user)
4. . stackrc (source the stack credentials)
5. openstack server list (list the nodes and save the compute node's IP)
6. ssh heat-admin@<Nova_IP> (SSH to the compute node as the heat-admin user)
7. free -m (check free memory on the compute node)

Now let’s create a new VM that will request more RAM than the available free memory on the Nova compute node.

1. openstack flavor create --ram <2*NovaFreeMemory> --disk 10  --vcpus 1 --public <flavor_name> (Create flavor)
2. openstack image create --disk-format qcow2 --file <*.qcow2 Image>  --public <image_name> (Create image, download Cirros if you don’t have one)
3. openstack server create --flavor <flavor_name> --image <image_name> --nic net-id=<netID> <VM_Name> (Create VM)
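The flavor size in step 1 above (2*NovaFreeMemory) comes straight from the `free -m` output gathered earlier. A minimal sketch of that arithmetic, assuming the standard procps `free -m` column layout (the function name and sample numbers are illustrative, not part of this bug report):

```python
# Sketch: derive the oversized flavor RAM (2x free memory) from `free -m` output.
# The helper name and the sample figures below are hypothetical.

def oversized_ram_mb(free_output: str) -> int:
    """Return twice the 'free' column of the Mem: row from `free -m` output."""
    for line in free_output.splitlines():
        if line.startswith("Mem:"):
            # Columns: total, used, free, shared, buff/cache, available
            fields = line.split()
            free_mb = int(fields[3])
            return 2 * free_mb
    raise ValueError("no 'Mem:' row found in free -m output")

sample = """\
              total        used        free      shared  buff/cache   available
Mem:          15885        9821        2210         305        3853        5385
Swap:          8191         120        8071
"""
print(oversized_ram_mb(sample))  # prints 4420
```

Passing the result as `--ram` to `openstack flavor create` guarantees the flavor cannot fit in the compute node's free memory.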


Actual Result
VM creation fails, as expected, and the server's Status shows ERROR:
(overcloud) [stack@undercloud-0 ~]$ openstack server list
+--------------------------------------+------+---------+--------------------+-------+--------------+
| ID                                   | Name | Status  | Networks           | Image | Flavor       |
+--------------------------------------+------+---------+--------------------+-------+--------------+
| 9cfb4914-c573-4b4d-88d1-e238dbfc531e | vm1  | ERROR   |                    | rhel  | ark_flavor   |
+--------------------------------------+------+---------+--------------------+-------+--------------+


The problem is that when you try to find the reason for this failure in the Nova logs, there is nothing about "not enough memory"; the only ERROR I found is in nova-conductor.log:

2018-06-17 09:52:37.109 24 ERROR nova.conductor.manager [req-3a553b29-e33e-4e4a-abc6-9a3e0ccdf7d4 b4332e1592ab480f96fc87e5af797895 f3f03848a45746c7bcbe95b625d7e1d8 - default default] Failed to schedule instances: NoValidHost_Remote: No valid host was found. 
Traceback (most recent call last):

  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 226, in inner
    return func(*args, **kwargs)

  File "/usr/lib/python2.7/site-packages/nova/scheduler/manager.py", line 139, in select_destinations
    raise exception.NoValidHost(reason="")

NoValidHost: No valid host was found.
2018-06-17 09:52:37.109 24 ERROR nova.conductor.manager Traceback (most recent call last):
2018-06-17 09:52:37.109 24 ERROR nova.conductor.manager   File "/usr/lib/python2.7/site-packages/nova/conductor/manager.py", line 1116, in schedule_and_build_instances
2018-06-17 09:52:37.109 24 ERROR nova.conductor.manager     instance_uuids, return_alternates=True)
2018-06-17 09:52:37.109 24 ERROR nova.conductor.manager   File "/usr/lib/python2.7/site-packages/nova/conductor/manager.py", line 716, in _schedule_instances
2018-06-17 09:52:37.109 24 ERROR nova.conductor.manager     return_alternates=return_alternates)
2018-06-17 09:52:37.109 24 ERROR nova.conductor.manager   File "/usr/lib/python2.7/site-packages/nova/scheduler/utils.py", line 726, in wrapped
2018-06-17 09:52:37.109 24 ERROR nova.conductor.manager     return func(*args, **kwargs)
2018-06-17 09:52:37.109 24 ERROR nova.conductor.manager   File "/usr/lib/python2.7/site-packages/nova/scheduler/client/__init__.py", line 53, in select_destinations
2018-06-17 09:52:37.109 24 ERROR nova.conductor.manager     instance_uuids, return_objects, return_alternates)
2018-06-17 09:52:37.109 24 ERROR nova.conductor.manager   File "/usr/lib/python2.7/site-packages/nova/scheduler/client/__init__.py", line 37, in __run_method
2018-06-17 09:52:37.109 24 ERROR nova.conductor.manager     return getattr(self.instance, __name)(*args, **kwargs)
2018-06-17 09:52:37.109 24 ERROR nova.conductor.manager   File "/usr/lib/python2.7/site-packages/nova/scheduler/client/query.py", line 42, in select_destinations
2018-06-17 09:52:37.109 24 ERROR nova.conductor.manager     instance_uuids, return_objects, return_alternates)
2018-06-17 09:52:37.109 24 ERROR nova.conductor.manager   File "/usr/lib/python2.7/site-packages/nova/scheduler/rpcapi.py", line 158, in select_destinations
2018-06-17 09:52:37.109 24 ERROR nova.conductor.manager     return cctxt.call(ctxt, 'select_destinations', **msg_args)
2018-06-17 09:52:37.109 24 ERROR nova.conductor.manager   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/client.py", line 174, in call
2018-06-17 09:52:37.109 24 ERROR nova.conductor.manager     retry=self.retry)
2018-06-17 09:52:37.109 24 ERROR nova.conductor.manager   File "/usr/lib/python2.7/site-packages/oslo_messaging/transport.py", line 131, in _send
2018-06-17 09:52:37.109 24 ERROR nova.conductor.manager     timeout=timeout, retry=retry)
2018-06-17 09:52:37.109 24 ERROR nova.conductor.manager   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 559, in send
2018-06-17 09:52:37.109 24 ERROR nova.conductor.manager     retry=retry)
2018-06-17 09:52:37.109 24 ERROR nova.conductor.manager   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 550, in _send
2018-06-17 09:52:37.109 24 ERROR nova.conductor.manager     raise result
2018-06-17 09:52:37.109 24 ERROR nova.conductor.manager NoValidHost_Remote: No valid host was found. 
2018-06-17 09:52:37.109 24 ERROR nova.conductor.manager Traceback (most recent call last):
2018-06-17 09:52:37.109 24 ERROR nova.conductor.manager 
2018-06-17 09:52:37.109 24 ERROR nova.conductor.manager   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 226, in inner
2018-06-17 09:52:37.109 24 ERROR nova.conductor.manager     return func(*args, **kwargs)
2018-06-17 09:52:37.109 24 ERROR nova.conductor.manager 
2018-06-17 09:52:37.109 24 ERROR nova.conductor.manager   File "/usr/lib/python2.7/site-packages/nova/scheduler/manager.py", line 139, in select_destinations
2018-06-17 09:52:37.109 24 ERROR nova.conductor.manager     raise exception.NoValidHost(reason="")
2018-06-17 09:52:37.109 24 ERROR nova.conductor.manager 
2018-06-17 09:52:37.109 24 ERROR nova.conductor.manager NoValidHost: No valid host was found. 
2018-06-17 09:52:37.109 24 ERROR nova.conductor.manager 
2018-06-17 09:52:37.109 24 ERROR nova.conductor.manager
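The one actionable line in the dump above is the "Failed to schedule instances" message. A small sketch of how a log-triage script might pull the exception name out of such a line (the regex is an assumption based on the log format shown here, not an official Nova tool):

```python
# Sketch: extract the scheduling exception from a nova-conductor ERROR line.
# The parsing pattern is a hypothetical triage aid, inferred from the log above.
import re

LOG_LINE = ("2018-06-17 09:52:37.109 24 ERROR nova.conductor.manager "
            "[req-3a553b29-e33e-4e4a-abc6-9a3e0ccdf7d4 "
            "b4332e1592ab480f96fc87e5af797895 f3f03848a45746c7bcbe95b625d7e1d8 "
            "- default default] Failed to schedule instances: "
            "NoValidHost_Remote: No valid host was found.")

def scheduling_exception(line: str):
    """Return (exception_name, message) from a 'Failed to schedule' line, else None."""
    m = re.search(r"Failed to schedule instances: (\w+): (.*)", line)
    return (m.group(1), m.group(2)) if m else None

print(scheduling_exception(LOG_LINE))
```

As the extracted pair shows, the log records only the generic NoValidHost exception, not why each host was rejected, which is exactly the complaint of this bug.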

Comment 1 Matthew Booth 2018-06-21 15:41:05 UTC
This has been discussed multiple times before. The problem is that the issue really is that there is 'no valid host': there may be multiple reasons per compute host why the instance was not scheduled there, so 'not enough memory' is an edge case at best. We may have been out of memory on compute A, out of disk on compute B, out of CPU on compute C, missing the requested GPU type on compute D, and so on.
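The situation described in the comment above can be illustrated with a toy scheduler that records a distinct rejection reason per host. This is a hypothetical model, not Nova's actual filter scheduler; hosts, capacities, and reason strings are invented for illustration:

```python
# Sketch: each compute host may reject a request for a different reason, so a
# single "not enough memory" log message would be misleading. All names and
# numbers below are hypothetical, not Nova code.

def select_host(request, hosts):
    """Return (chosen_host, per-host rejection reasons)."""
    reasons = {}
    for name, caps in hosts.items():
        if caps["ram_mb"] < request["ram_mb"]:
            reasons[name] = "not enough RAM"
        elif caps["disk_gb"] < request["disk_gb"]:
            reasons[name] = "not enough disk"
        elif caps["vcpus"] < request["vcpus"]:
            reasons[name] = "not enough vCPUs"
        else:
            return name, reasons
    return None, reasons  # NoValidHost: every host rejected, each for its own reason

request = {"ram_mb": 8192, "disk_gb": 10, "vcpus": 4}
hosts = {
    "compute-a": {"ram_mb": 4096, "disk_gb": 100, "vcpus": 16},  # short on RAM
    "compute-b": {"ram_mb": 16384, "disk_gb": 5, "vcpus": 16},   # short on disk
    "compute-c": {"ram_mb": 16384, "disk_gb": 100, "vcpus": 2},  # short on vCPUs
}
chosen, reasons = select_host(request, hosts)
print(chosen, reasons)
```

With three hosts failing for three different reasons, the only honest one-line summary is "no valid host was found" — reporting any single resource shortage would describe at most one host.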