
Bug 891051

Summary: [nova][python-novaclient] improve the error info of the nova show command to be more informative as to why the instance wasn't created/is in error state.
Product: Red Hat OpenStack
Reporter: Nir Magnezi <nmagnezi>
Component: openstack-nova
Assignee: Russell Bryant <rbryant>
Status: CLOSED NOTABUG
QA Contact: Gabriel Szasz <gszasz>
Severity: low
Docs Contact:
Priority: high
Version: 2.0 (Folsom)
CC: ajeain, apevec, jkt, jpichon, jruzicka, ndipanov, oblaut, sgordon
Target Milestone: ---
Keywords: Triaged
Target Release: 6.0 (Juno)
Hardware: Unspecified
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Clones: 1072194 (view as bug list)
Environment:
Last Closed: 2014-07-09 11:01:58 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Bug Depends On:    
Bug Blocks: 895586, 1072194    
Attachments:
nova scheduler log (Flags: none)

Description Nir Magnezi 2013-01-01 07:31:47 EST
Description of problem:
=======================
Instance creation failed, but the reason for the error is not returned. The failure was caused by insufficient host resources.


Version-Release number of selected component (if applicable):
=============================================================
Folsom
# rpm -qa | grep nova
python-nova-2012.2.1-2.el6ost.noarch
openstack-nova-2012.2.1-2.el6ost.noarch
openstack-nova-api-2012.2.1-2.el6ost.noarch
openstack-nova-scheduler-2012.2.1-2.el6ost.noarch
openstack-nova-console-2012.2.1-2.el6ost.noarch
openstack-nova-network-2012.2.1-2.el6ost.noarch
openstack-nova-common-2012.2.1-2.el6ost.noarch
openstack-nova-compute-2012.2.1-2.el6ost.noarch
openstack-nova-objectstore-2012.2.1-2.el6ost.noarch
openstack-nova-novncproxy-0.4-2.el6.noarch
openstack-nova-cert-2012.2.1-2.el6ost.noarch
python-novaclient-2.10.0-1.el6ost.noarch


How reproducible:
=================
100%


Steps to Reproduce:
===================
1. Create a few large instances to consume the host's resources until one of them fails with an error.

For example:
# nova boot --flavor m1.small --image f0ca2452-5b0b-41f9-ad03-0ad4aecd27c6  test111
# nova boot --flavor m1.xlarge --image f0ca2452-5b0b-41f9-ad03-0ad4aecd27c6  test222
# nova boot --flavor m1.xlarge --image f0ca2452-5b0b-41f9-ad03-0ad4aecd27c6  test333

2. Check instances status:

# nova list


Actual results:
===============
1. Instance test333 failed with an error, but the reason is displayed neither via the CLI nor via Horizon.

+-------------------------------------+---------------------------------------------------------------------------------+
| Property                            | Value                                                                           |
+-------------------------------------+---------------------------------------------------------------------------------+
| OS-DCF:diskConfig                   | MANUAL                                                                          |
| OS-EXT-SRV-ATTR:host                | None                                                                            |
| OS-EXT-SRV-ATTR:hypervisor_hostname | None                                                                            |
| OS-EXT-SRV-ATTR:instance_name       | instance-00000011                                                               |
| OS-EXT-STS:power_state              | 0                                                                               |
| OS-EXT-STS:task_state               | None                                                                            |
| OS-EXT-STS:vm_state                 | error                                                                           |
| accessIPv4                          |                                                                                 |
| accessIPv6                          |                                                                                 |
| adminPass                           | z5f32mBNGKjR                                                                    |
| config_drive                        |                                                                                 |
| created                             | 2013-01-01T10:37:28Z                                                            |
| fault                               | {u'message': u'NoValidHost', u'code': 500, u'created': u'2013-01-01T10:37:21Z'} |
| flavor                              | m1.xlarge                                                                       |
| hostId                              |                                                                                 |
| id                                  | 9c130d5e-ed87-40b2-b0de-9ced0c37a952                                            |
| image                               | Fedora17                                                                        |
| key_name                            | None                                                                            |
| metadata                            | {}                                                                              |
| name                                | test333                                                                         |
| security_groups                     | [{u'name': u'default'}]                                                         |
| status                              | ERROR                                                                           |
| tenant_id                           | 60f660d0412d4968a961d01a95c3bf75                                                |
| updated                             | 2013-01-01T10:37:21Z                                                            |
| user_id                             | 795aa8daf1a24719a56487c19648b8d2                                                |
+-------------------------------------+---------------------------------------------------------------------------------+

2. Instance statuses:

+--------------------------------------+---------+--------+------------------------------------+
| ID                                   | Name    | Status | Networks                           |
+--------------------------------------+---------+--------+------------------------------------+
| 191dee86-1064-4fa3-aad7-1ced892d87a8 | test111 | ACTIVE | novanetwork=192.168.32.5, 10.3.4.1 |
| dde4264b-f70e-460f-a737-59973f6fd212 | test222 | ACTIVE | novanetwork=192.168.32.2           |
| 9c130d5e-ed87-40b2-b0de-9ced0c37a952 | test333 | ERROR  |                                    |
+--------------------------------------+---------+--------+------------------------------------+

nova scheduler log:

host 'blond-vdsf.qa.lab.tlv.redhat.com': free_ram_mb:-3415 free_disk_mb:-163840 does not have 16384 MB usable ram, it only has 3581.5 MB usable ram. host_passes /usr/lib/python2.6/site-packages/nova/scheduler/filters/ram_filter.py:48
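
For context, the check behind that log line amounts to the following (a simplified sketch of the RamFilter logic, assuming the Folsom-era behavior where usable RAM is the oversubscription limit minus the RAM already consumed; ram_allocation_ratio defaults to 1.5):

# Simplified sketch of the RamFilter host_passes() check, not the exact nova code.
# free_ram_mb and total_usable_ram_mb come from the tracked host state;
# ram_allocation_ratio is the RAM oversubscription factor (1.5 by default).
def host_passes(free_ram_mb, total_usable_ram_mb, requested_ram_mb,
                ram_allocation_ratio=1.5):
    memory_mb_limit = total_usable_ram_mb * ram_allocation_ratio
    used_ram_mb = total_usable_ram_mb - free_ram_mb
    usable_ram_mb = memory_mb_limit - used_ram_mb
    # A host is rejected when the flavor asks for more RAM than is usable,
    # which is what produces the "does not have ... MB usable ram" message above.
    return usable_ram_mb >= requested_ram_mb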


Expected results:
=================
When instance creation fails, the reason for the error should be returned by the API.
Comment 1 Nir Magnezi 2013-01-01 07:36:18 EST
Created attachment 670949 [details]
nova scheduler log
Comment 3 Nikola Dipanov 2013-01-04 15:01:00 EST
The reason is not immediately returned, but as you show above, 'nova show' does give you a NoValidHost in the fault field, so it is not completely uninformative (although I agree that it could be improved).

What I think we should do here is have a bug to improve the output of 'nova show' when it comes to error states, so I will rename this bug and push it to 3.0.
Comment 4 Russell Bryant 2013-05-07 11:16:44 EDT
The instance-actions feature in Grizzly is the answer for this.  It keeps track of the history of an instance and lets you retrieve it via the API.
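
For illustration, a minimal sketch of pulling that history programmatically with python-novaclient (>= 2.13.0); the credentials, auth endpoint, and server name below are placeholder assumptions, not values from this bug:

# Minimal sketch: listing the instance-action history for a server with
# python-novaclient (assumes the Grizzly-era v1_1 client; all IDs are placeholders).
from novaclient.v1_1 import client

nova = client.Client("admin", "password", "admin",
                     "http://controller:5000/v2.0/")
server = nova.servers.find(name="test333")

for action in nova.instance_action.list(server):
    # Each action (create, reboot, delete, ...) is keyed by its request id;
    # the detailed record carries the per-event results and tracebacks.
    detail = nova.instance_action.get(server, action.request_id)
    print(action.action, action.request_id, detail.events)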
Comment 6 Dave Allan 2013-06-03 11:56:37 EDT
*** Bug 895588 has been marked as a duplicate of this bug. ***
Comment 7 Jakub Ruzicka 2013-06-03 12:47:25 EDT
The current novaclient, 2.13.0, supports instance-actions, but they don't provide anything useful for the reporter's scenario. The `nova show` fault field has a better description, though.

$ nova show instance
...
| fault | {u'message': u'NoValidHost', u'code': 500, u'details': u'No valid host was found.', u'created': u'2013-06-03T16:03:09Z'} |
...


$ nova instance-action-list instance
+--------+------------------------------------------+---------+
| Action | Request_ID                               | Message |
+--------+------------------------------------------+---------+
| create | req-8ee995d0-96ec-4b91-b621-f069ecc36929 | None    |
+--------+------------------------------------------+---------+


$ nova instance-action instance req-8ee995d0-96ec-4b91-b621-f069ecc36929
+---------------+--------------------------------------------------+
| Property      | Value                                            |
+---------------+--------------------------------------------------+
| instance_uuid | 17b7b84f-38a5-4012-8334-51bd0e804417             |
| user_id       | 3e32631677c049918d730ef28774852e                 |
| start_time    | 2013-06-03T16:03:08.000000                       |
| request_id    | req-8ee995d0-96ec-4b91-b621-f069ecc36929         |
| action        | create                                           |
| message       | None                                             |
| project_id    | 3741434e387b4b3e9ab8641a34447795                 |
| events        | [{u'event': u'schedule',                         |
|               |   u'finish_time': u'2013-06-03T16:03:09.000000', |
|               |   u'result': u'Success',                         |
|               |   u'start_time': u'2013-06-03T16:03:08.000000',  |
|               |   u'traceback': None}]                           |
+---------------+--------------------------------------------------+
Comment 8 Russell Bryant 2013-06-03 15:56:16 EDT
Ok, the data from the instance fault should definitely be included in instance actions.  That's certainly the expectation here.  We even have a blueprint to completely remove the instance fault table because we expect instance actions to have all of that info (and more).  I need to follow up with an upstream bug to resolve this.
Comment 10 Russell Bryant 2014-07-09 11:01:58 EDT
I think this is something that should just be tracked upstream if needed. Bugs with instance actions should be filed in Launchpad. There's also a new "tasks API" being worked on that will provide another type of detail via the nova API that would be useful here.