Bug 891051 - [nova][python-novaclient] improve the error info of the nova show command to be more informative as to why the instance wasn't created/is in error state.
Status: CLOSED NOTABUG
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-nova
Version: 2.0 (Folsom)
Hardware: Unspecified
OS: Linux
Priority: high
Severity: low
Target Milestone: ---
Target Release: 6.0 (Juno)
Assigned To: Russell Bryant
QA Contact: Gabriel Szasz
Keywords: Triaged
Duplicates: 895588
Depends On:
Blocks: 895586 1072194
 
Reported: 2013-01-01 07:31 EST by Nir Magnezi
Modified: 2016-04-26 09:57 EDT
CC List: 8 users

Fixed In Version:
Doc Type: Bug Fix
Clones: 1072194
Last Closed: 2014-07-09 11:01:58 EDT
Type: Bug


Attachments
nova scheduler log (8.29 KB, text/x-log)
2013-01-01 07:36 EST, Nir Magnezi
Description Nir Magnezi 2013-01-01 07:31:47 EST
Description of problem:
=======================
Instance creation fails and the reason for the error is not returned.
The error is caused by insufficient host resources.


Version-Release number of selected component (if applicable):
=============================================================
Folsom
# rpm -qa | grep nova
python-nova-2012.2.1-2.el6ost.noarch
openstack-nova-2012.2.1-2.el6ost.noarch
openstack-nova-api-2012.2.1-2.el6ost.noarch
openstack-nova-scheduler-2012.2.1-2.el6ost.noarch
openstack-nova-console-2012.2.1-2.el6ost.noarch
openstack-nova-network-2012.2.1-2.el6ost.noarch
openstack-nova-common-2012.2.1-2.el6ost.noarch
openstack-nova-compute-2012.2.1-2.el6ost.noarch
openstack-nova-objectstore-2012.2.1-2.el6ost.noarch
openstack-nova-novncproxy-0.4-2.el6.noarch
openstack-nova-cert-2012.2.1-2.el6ost.noarch
python-novaclient-2.10.0-1.el6ost.noarch


How reproducible:
=================
100%


Steps to Reproduce:
===================
1. Create a few large instances to consume the host resources, until one of the instances fails with an error.

For example:
# nova boot --flavor m1.small --image f0ca2452-5b0b-41f9-ad03-0ad4aecd27c6  test111
# nova boot --flavor m1.xlarge --image f0ca2452-5b0b-41f9-ad03-0ad4aecd27c6  test222
# nova boot --flavor m1.xlarge --image f0ca2452-5b0b-41f9-ad03-0ad4aecd27c6  test333

2. Check instances status:

# nova list


Actual results:
===============
1. Instance test333 failed with an error, but the reason is not displayed via the CLI nor via Horizon.

+-------------------------------------+---------------------------------------------------------------------------------+
| Property                            | Value                                                                           |
+-------------------------------------+---------------------------------------------------------------------------------+
| OS-DCF:diskConfig                   | MANUAL                                                                          |
| OS-EXT-SRV-ATTR:host                | None                                                                            |
| OS-EXT-SRV-ATTR:hypervisor_hostname | None                                                                            |
| OS-EXT-SRV-ATTR:instance_name       | instance-00000011                                                               |
| OS-EXT-STS:power_state              | 0                                                                               |
| OS-EXT-STS:task_state               | None                                                                            |
| OS-EXT-STS:vm_state                 | error                                                                           |
| accessIPv4                          |                                                                                 |
| accessIPv6                          |                                                                                 |
| adminPass                           | z5f32mBNGKjR                                                                    |
| config_drive                        |                                                                                 |
| created                             | 2013-01-01T10:37:28Z                                                            |
| fault                               | {u'message': u'NoValidHost', u'code': 500, u'created': u'2013-01-01T10:37:21Z'} |
| flavor                              | m1.xlarge                                                                       |
| hostId                              |                                                                                 |
| id                                  | 9c130d5e-ed87-40b2-b0de-9ced0c37a952                                            |
| image                               | Fedora17                                                                        |
| key_name                            | None                                                                            |
| metadata                            | {}                                                                              |
| name                                | test333                                                                         |
| security_groups                     | [{u'name': u'default'}]                                                         |
| status                              | ERROR                                                                           |
| tenant_id                           | 60f660d0412d4968a961d01a95c3bf75                                                |
| updated                             | 2013-01-01T10:37:21Z                                                            |
| user_id                             | 795aa8daf1a24719a56487c19648b8d2                                                |
+-------------------------------------+---------------------------------------------------------------------------------+

2. Instance statuses:

+--------------------------------------+---------+--------+------------------------------------+
| ID                                   | Name    | Status | Networks                           |
+--------------------------------------+---------+--------+------------------------------------+
| 191dee86-1064-4fa3-aad7-1ced892d87a8 | test111 | ACTIVE | novanetwork=192.168.32.5, 10.3.4.1 |
| dde4264b-f70e-460f-a737-59973f6fd212 | test222 | ACTIVE | novanetwork=192.168.32.2           |
| 9c130d5e-ed87-40b2-b0de-9ced0c37a952 | test333 | ERROR  |                                    |
+--------------------------------------+---------+--------+------------------------------------+

nova scheduler log:

host 'blond-vdsf.qa.lab.tlv.redhat.com': free_ram_mb:-3415 free_disk_mb:-163840 does not have 16384 MB usable ram, it only has 3581.5 MB usable ram. host_passes /usr/lib/python2.6/site-packages/nova/scheduler/filters/ram_filter.py:48
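For reference, the rejection above comes from the scheduler's RamFilter. The following is a simplified sketch of the check behind that log line, modelled on nova/scheduler/filters/ram_filter.py; the real filter reads the values from host_state and the flavor, and the allocation ratio comes from the ram_allocation_ratio config option, so treat the flat arguments and numbers here as illustrative only.

# Simplified sketch of the RamFilter check that produced the log line above.
# The real filter takes host_state and filter_properties; the flat arguments
# here are an illustration, not the exact Folsom code.
def host_passes(free_ram_mb, total_usable_ram_mb, requested_ram_mb,
                ram_allocation_ratio=1.5):
    # free_ram_mb goes negative once the host is overcommitted.
    oversubscribed_ram_mb = total_usable_ram_mb * ram_allocation_ratio
    used_ram_mb = total_usable_ram_mb - free_ram_mb
    usable_ram_mb = oversubscribed_ram_mb - used_ram_mb
    # When usable_ram_mb < requested_ram_mb the filter logs
    # "... does not have X MB usable ram, it only has Y MB usable ram"
    # and rejects the host.
    return usable_ram_mb >= requested_ram_mb

# Illustrative numbers: an m1.xlarge asks for 16384 MB, which an
# already-overcommitted host cannot provide, so no host passes and the
# instance ends up in ERROR with NoValidHost.
print(host_passes(free_ram_mb=-3415, total_usable_ram_mb=16384,
                  requested_ram_mb=16384))   # False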


Expected results:
=================
When instance creation fails, the reason for the error should be returned by the API.
Comment 1 Nir Magnezi 2013-01-01 07:36:18 EST
Created attachment 670949 [details]
nova scheduler log
Comment 3 Nikola Dipanov 2013-01-04 15:01:00 EST
The reason is not immediately returned, but as you show above, 'nova show' does give you a NoValidHost in the fault field, so it is not completely uninformative (although I agree that it could be improved).

What I think we should do here is use this bug to track improving the output of 'nova show' when it comes to error states, so I will rename this bug and push it to 3.0.
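For anyone scripting around this in the meantime, the fault dict that 'nova show' prints can also be read through python-novaclient. This is a minimal sketch assuming the Folsom-era v1_1 client; the credentials, endpoint, and server UUID are placeholders.

# Hedged sketch: read the fault field that 'nova show' prints, via
# python-novaclient (v1_1 client). Credentials, endpoint, and server
# UUID below are placeholders.
from novaclient.v1_1 import client

nova = client.Client('admin', 'password', 'tenant',
                     'http://keystone.example.com:5000/v2.0/')

server = nova.servers.get('9c130d5e-ed87-40b2-b0de-9ced0c37a952')
if server.status == 'ERROR':
    # The compute API returns a 'fault' dict for errored instances;
    # novaclient exposes it as a plain attribute when present.
    fault = getattr(server, 'fault', {})
    print('%s (code %s): %s' % (fault.get('message'),
                                fault.get('code'),
                                fault.get('details', 'no details')))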
Comment 4 Russell Bryant 2013-05-07 11:16:44 EDT
The instance-actions feature in Grizzly is the answer for this.  It keeps track of the history of an instance and lets you retrieve it via the API.
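For illustration, the instance-actions history mentioned here can be fetched programmatically as well as via the CLI. This is a rough sketch against python-novaclient 2.13.0 or later; the instance_action manager name and the placeholders below are assumptions and may need adjusting.

# Hedged sketch of reading instance actions through python-novaclient
# (2.13.0 or later). Credentials and endpoint are placeholders; the
# instance_action manager name is assumed from the Grizzly-era client.
from novaclient.v1_1 import client

nova = client.Client('admin', 'password', 'tenant',
                     'http://keystone.example.com:5000/v2.0/')

server = nova.servers.find(name='test333')
for action in nova.instance_action.list(server):
    # Each action (e.g. 'create') carries a request_id; fetching it by id
    # returns the per-event history, including any traceback.
    detail = nova.instance_action.get(server, action.request_id)
    print('%s %s %s' % (action.action, action.request_id, detail.events))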
Comment 6 Dave Allan 2013-06-03 11:56:37 EDT
*** Bug 895588 has been marked as a duplicate of this bug. ***
Comment 7 Jakub Ruzicka 2013-06-03 12:47:25 EDT
The current novaclient 2.13.0 supports instance-actions, but they don't provide anything useful for the reporter's scenario. The `nova show` fault field has a better description, though.

$ nova show instance
...
| fault | {u'message': u'NoValidHost', u'code': 500, u'details': u'No valid host was found.', u'created': u'2013-06-03T16:03:09Z'} |
...


$ nova instance-action-list instance
+--------+------------------------------------------+---------+
| Action | Request_ID                               | Message |
+--------+------------------------------------------+---------+
| create | req-8ee995d0-96ec-4b91-b621-f069ecc36929 | None    |
+--------+------------------------------------------+---------+


$ nova instance-action instance req-8ee995d0-96ec-4b91-b621-f069ecc36929
+---------------+--------------------------------------------------+
| Property      | Value                                            |
+---------------+--------------------------------------------------+
| instance_uuid | 17b7b84f-38a5-4012-8334-51bd0e804417             |
| user_id       | 3e32631677c049918d730ef28774852e                 |
| start_time    | 2013-06-03T16:03:08.000000                       |
| request_id    | req-8ee995d0-96ec-4b91-b621-f069ecc36929         |
| action        | create                                           |
| message       | None                                             |
| project_id    | 3741434e387b4b3e9ab8641a34447795                 |
| events        | [{u'event': u'schedule',                         |
|               |   u'finish_time': u'2013-06-03T16:03:09.000000', |
|               |   u'result': u'Success',                         |
|               |   u'start_time': u'2013-06-03T16:03:08.000000',  |
|               |   u'traceback': None}]                           |
+---------------+--------------------------------------------------+
Comment 8 Russell Bryant 2013-06-03 15:56:16 EDT
Ok, the data from the instance fault should definitely be included in instance actions.  That's certainly the expectation here.  We even have a blueprint to completely remove the instance fault table because we expect instance actions to have all of that info (and more).  I need to follow up with an upstream bug to resolve this.
Comment 10 Russell Bryant 2014-07-09 11:01:58 EDT
I think this is something that should just be tracked upstream if needed.  Bugs with instance actions should be filed in Launchpad.  There's also a new "tasks API" being worked on that will provide another type of detail via the nova API that would be useful here.
