557125 – OVIRT35 - [RFE] internationalize exitMessage; use meaningful exitCode

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	oVirt
Classification:	Retired
Component:	vdsm
Sub Component:
Version:	unspecified
Hardware:	All
OS:	Linux
Priority:	low
Severity:	medium
Target Milestone:	---
Target Release:	3.5.0
Assignee:	Francesco Romani
QA Contact:	Lukas Svaty
Docs Contact:
URL:
Whiteboard:	virt
Depends On:
Blocks:

Reported:	2010-01-20 14:19 UTC by Dan Kenigsberg
Modified:	2016-02-10 19:51 UTC
CC List:	11 users
Fixed In Version:	ovirt-3.5.0-beta2
Clone Of:
Environment:
Last Closed:	2014-10-17 12:33:39 UTC
oVirt Team:	Virt
Embargoed:
Dependent Products:

Links
System	ID	Priority	Status	Summary	Last Updated
oVirt gerrit	22631	None	MERGED	stats: report detailed VM down status	Never
oVirt gerrit	31354	master	MERGED	vm: simplify the shutdown exit reason	Never
oVirt gerrit	31901	ovirt-3.5	MERGED	vm: do not use _dom for powerdown	Never
oVirt gerrit	31902	ovirt-3.5	MERGED	vm: simplify the shutdown exit reason	Never

Description Dan Kenigsberg 2010-01-20 14:19:41 UTC

Description of problem:
When a VM goes down it leaves two fields behind - exitMessage and exitCode - to be collected by RHEV-M and reported to its admin.

On failure (exitCode == 1), the exitMessage is passed verbatim to the VM log.

Instead, I suggest having multiple meaningful exitCode values, to be translated into exit text by RHEV-M according to locale. A similar method is used for command results.

Livnat suggests that it is too late to fix it in rhev-2.2, so I'm opening this for 6.0-3.0

Comment 2 RHEL Program Management 2010-05-26 15:23:13 UTC

This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux major release.  Product Management has requested further
review of this request by Red Hat Engineering, for potential inclusion in a Red
Hat Enterprise Linux Major release.  This request is not yet committed for
inclusion.

Comment 3 RHEL Program Management 2010-05-27 18:00:17 UTC

This feature request did not get resolved in time for Feature Freeze
for the current Red Hat Enterprise Linux release and has now been
denied. You may re-open your request by requesting your support
representative to propose it for the next release.

Comment 5 Itamar Heim 2013-01-30 22:50:44 UTC

Closing old bugs. If this issue is still relevant/important in current version, please re-open the bug.

Comment 7 Lukas Svaty 2014-07-08 13:28:56 UTC

An oVirt wiki page would be useful here, to specify where the user can see the messages and where to find the exitReason (vdsm log / vm_dynamic in database).

The test plan will be edited after further specification from DEV:
https://tcms.engineering.redhat.com/plan/14582/rhevm-compute-virt-internationalize-exitmessage-use-meaningful-exitcode#reviewcases

Comment 8 Michal Skrivanek 2014-07-08 13:48:42 UTC

Francesco, would you please add some description here? It may be worth adding a Doc Text.

Comment 9 Francesco Romani 2014-07-10 06:59:26 UTC

What was added here is actually the foundation for the exitMessage internationalization: the 'use meaningful exitCode' part.

This feature affects the end of life of a VM: when a VM goes Down for whatever reason.

Before this patch, VDSM used to report, along with 'status=Down', one very terse exit code:
* SUCCESS for planned, expected VM down, e.g. after a user-initiated shutdown
* ERROR for unplanned, unexpected VM down, e.g. if the QEMU process died

and an opaque 'exitMessage', an English free-form text description of the event.

After this patch, VDSM also reports an exit reason code, 'exitReason', which, akin to the UNIX 'errno' variable, describes why a VM has gone down.

This field is observable:
* in VDSM output, when it reports a VM down
* in the Engine's DB, in the vm_dynamic table

The known values as of oVirt 3.5 are:
code    summary                 description
0       success                 The VM has exited gracefully
1       generic error           Unspecified error code
2       lost qemu connection    The VM has lost the connection with QEMU
3       libvirt start failed    The VM failed to start through libvirt
4       migration succeeded     The VM was successfully migrated and now runs on the destination host
5       save state succeeded    The VM state was successfully saved
6       admin shutdown          The VM was shut down by the admin from the engine UI
7       user shutdown           The VM was shut down by a user from within the guest
8       migration failed        The VM failed to migrate and did not move from the source host
9       libvirt domain missing  Failed to find the libvirt domain for the VM

More codes will be added in the future.
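
For readers who want to handle these codes in a script, the table can be written down as an enumeration. This is only a sketch: the numeric values and their meanings come from the table, the class name reuses the 'VmExitReason' schema key, and the member names and the describe_exit_reason helper are made up for illustration and are not VDSM identifiers.

```python
from enum import IntEnum


class VmExitReason(IntEnum):
    """Exit reason codes from the oVirt 3.5 table.

    Member names are illustrative; only the numeric values and their
    meanings are taken from the table.
    """
    SUCCESS = 0
    GENERIC_ERROR = 1
    LOST_QEMU_CONNECTION = 2
    LIBVIRT_START_FAILED = 3
    MIGRATION_SUCCEEDED = 4
    SAVE_STATE_SUCCEEDED = 5
    ADMIN_SHUTDOWN = 6
    USER_SHUTDOWN = 7
    MIGRATION_FAILED = 8
    LIBVIRT_DOMAIN_MISSING = 9


def describe_exit_reason(code):
    """Return a readable summary for a code, tolerating unknown values."""
    try:
        return VmExitReason(code).name.lower().replace("_", " ")
    except ValueError:
        # More codes may be added later, so unknown values must not fail.
        return "unknown exit reason ({})".format(code)
```

Unknown values are tolerated on purpose, since the list of codes is expected to grow.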

On top of that, we can build the 'internationalization of the exitMessage' part,
by mapping those exitReason values to localized strings to be shown in the engine UI.
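
A minimal sketch of such a mapping, assuming a plain per-locale dictionary rather than whatever mechanism the engine actually uses. MESSAGES, FALLBACK_LOCALE and localized_exit_message are hypothetical names, and the Italian strings are illustrative translations written for this example, not taken from the engine.

```python
# Hypothetical catalog: locale -> {exitReason code -> text}.
# English texts follow the exitReason table; the Italian ones are
# illustrative only.
MESSAGES = {
    "en": {
        0: "The VM has exited gracefully",
        6: "The VM was shut down by the admin from the engine UI",
        7: "The VM was shut down by a user from within the guest",
    },
    "it": {
        0: "La VM è terminata correttamente",
        6: "La VM è stata spenta dall'amministratore",
    },
}

FALLBACK_LOCALE = "en"


def localized_exit_message(exit_reason, locale):
    """Look up the text for exit_reason, falling back to English."""
    for candidate in (locale, FALLBACK_LOCALE):
        text = MESSAGES.get(candidate, {}).get(exit_reason)
        if text is not None:
            return text
    # Neither the requested locale nor the fallback knows this code.
    return "VM is down (exitReason={})".format(exit_reason)
```

The point of the design is that the host only reports a number, and the choice of language stays entirely on the side that renders the message.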

It is worth mentioning that the engine can (and should) use this exitReason information for its own decisions about the VM lifecycle, to achieve
a more detailed understanding of what happened to a VM.

Comment 10 Francesco Romani 2014-07-10 07:00:27 UTC

Sorry, forgot to mention that the most up-to-date source for the exitReason values is always the VDSM API schema.

In a VDSM (>= 4.15.0) source tree:

vdsm/rpc/vdsmapi-schema.json

and look up the 'VmExitReason' key.

Comment 11 Lukas Svaty 2014-07-30 14:44:04 UTC

Tested this on a basic scenario:
1. power off from webadmin - vmID - cde10514-6396-43ee-bc72-41827c146120
2. shutdown from webadmin - vmID - 19bb1992-47fc-4ce1-8669-307784ab1e95
3. shutdown from VM - vmID - c8490d6e-e326-4d68-aaae-f0472f8641c2
4. reboot from VM - vmID - 4d7aa507-1b32-4618-a5a2-884500dbbbc1

1. No log line reports exitMessage or exitReason.
This should be reported.

2. exitReason: 7, but the shutdown was initiated by admin so should be 6
Thread-199::DEBUG::2014-07-30 16:31:54,556::BindingXMLRPC::1134::vds::(wrapper) return vmGetStats with {'status': {'message': 'Done', 'code': 0}, 'statsList': [{'status': 'Down', 'exitMessage': 'User shut down from within the guest', 'vmId': '19bb1992-47fc-4ce1-8669-307784ab1e95', 'exitReason': 7, 'timeOffset': '7200', 'exitCode': 0}]}

3. correctly reported
Thread-199::DEBUG::2014-07-30 16:31:45,269::BindingXMLRPC::1134::vds::(wrapper) return vmGetStats with {'status': {'message': 'Done', 'code': 0}, 'statsList': [{'status': 'Down', 'exitMessage': 'User shut down from within the guest', 'vmId': 'c8490d6e-e326-4d68-aaae-f0472f8641c2', 'exitReason': 7, 'timeOffset': '0', 'exitCode': 0}]}

4. Nothing in vdsm log with exitReason - this is correct

Because of test cases 1 and 2, moving back to ASSIGNED.
Please add an exitReason for power off, and report a different exitReason when the shutdown is initiated by the admin from WAP than when it is initiated by a user from within the VM.
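
When checking log lines like the vmGetStats ones above, the returned dict can be parsed instead of read by eye. The following is a rough sketch, assuming the payload after "return vmGetStats with" has been copied out of the log as a string; down_vm_exit_info is a hypothetical helper, not part of VDSM.

```python
import ast

# Payload copied from the vmGetStats log line of test case 3.
LOG_PAYLOAD = (
    "{'status': {'message': 'Done', 'code': 0}, 'statsList': [{'status': 'Down', "
    "'exitMessage': 'User shut down from within the guest', "
    "'vmId': 'c8490d6e-e326-4d68-aaae-f0472f8641c2', 'exitReason': 7, "
    "'timeOffset': '0', 'exitCode': 0}]}"
)


def down_vm_exit_info(payload):
    """Return (vmId, exitCode, exitReason) tuples for VMs reported Down."""
    # literal_eval only accepts Python literals, so it is safe on log text.
    response = ast.literal_eval(payload)
    return [
        (vm["vmId"], vm.get("exitCode"), vm.get("exitReason"))
        for vm in response.get("statsList", [])
        if vm.get("status") == "Down"
    ]


print(down_vm_exit_info(LOG_PAYLOAD))
```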

Comment 12 Francesco Romani 2014-08-07 08:16:35 UTC

(In reply to Lukas Svaty from comment #11)
> tested this on basic scenario
> 1. power off from webadmin - vmID - cde10514-6396-43ee-bc72-41827c146120
> 2. shutdown from webadmin - vmID - 19bb1992-47fc-4ce1-8669-307784ab1e95
> 3. shutdown from VM - vm ID -c8490d6e-e326-4d68-aaae-f0472f8641c2
> 4. reboot from VM - vmID - 4d7aa507-1b32-4618-a5a2-884500dbbbc1
> 
> 1. no log reporting exitMessage or exitReason
> this should be reported

It is reported, although in a different way :)
This flow is different, since Engine sends a synchronous VM.destroy() verb to VDSM:

Thread-13::DEBUG::2014-08-07 09:30:24,930::vm::4592::vm.Vm::(deleteVm) vmId=`56d1c657-dd76-4609-a207-c050699be5be`::Total desktops after destroy of 56d1c657-dd76-4609-a207-c050699be5be is 0
libvirtEventLoop::DEBUG::2014-08-07 09:30:24,930::vm::2397::vm.Vm::(setDownStatus) vmId=`56d1c657-dd76-4609-a207-c050699be5be`::Changed state to Down: Admin shut down from the engine (code=6)
Thread-13::DEBUG::2014-08-07 09:30:24,932::BindingXMLRPC::1145::vds::(wrapper) return vmDestroy with {'status': {'message': 'Machine destroyed', 'code': 0}}

After that, Engine doesn't collect the exitReason, but this may be considered an Engine bug (anyway, the exitReason is implicit if the call succeeds).

In the other flows, the engine sends an async VM.shutdown request and polls until the VM is detected as 'Down'; then the exitReason is collected from these stats.
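
The polling flow can be sketched roughly as follows. This is not the Engine's code: get_stats is a hypothetical callable that returns one VM stats dict shaped like the statsList entries in the logs, and the timeout and interval values are arbitrary.

```python
import time


def wait_for_exit_reason(get_stats, vm_id, timeout=60.0, interval=2.0,
                         sleep=time.sleep, clock=time.monotonic):
    """Poll get_stats(vm_id) until the VM is Down; return its exitReason.

    get_stats is a hypothetical callable returning a dict with at least
    a 'status' key. Returns None if the VM is not Down before the timeout.
    """
    deadline = clock() + timeout
    while True:
        stats = get_stats(vm_id)
        if stats.get("status") == "Down":
            # The exit reason only becomes available once the VM is Down.
            return stats.get("exitReason")
        if clock() >= deadline:
            return None
        sleep(interval)
```

The sleep and clock parameters exist only so the loop can be exercised without real waiting.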

> 2. exitReason: 7, but the shutdown was initiated by admin so should be 6
> Thread-199::DEBUG::2014-07-30
> 16:31:54,556::BindingXMLRPC::1134::vds::(wrapper) return vmGetStats with
> {'status': {'message': 'Done', 'code': 0}, 'statsList': [{'status': 'Down',
> 'exitMessage': 'User shut down from within the guest', 'vmId':
> '19bb1992-47fc-4ce1-8669-307784ab1e95', 'exitReason': 7, 'timeOffset':
> '7200', 'exitCode': 0}]}

Still investigating

Comment 13 Francesco Romani 2014-08-11 15:52:45 UTC

(In reply to Francesco Romani from comment #12)
> (In reply to Lukas Svaty from comment #11)

> > 2. exitReason: 7, but the shutdown was initiated by admin so should be 6
> > Thread-199::DEBUG::2014-07-30
> > 16:31:54,556::BindingXMLRPC::1134::vds::(wrapper) return vmGetStats with
> > {'status': {'message': 'Done', 'code': 0}, 'statsList': [{'status': 'Down',
> > 'exitMessage': 'User shut down from within the guest', 'vmId':
> > '19bb1992-47fc-4ce1-8669-307784ab1e95', 'exitReason': 7, 'timeOffset':
> > '7200', 'exitCode': 0}]}
> 
> Still investigating

I am having a hard time reproducing this, but this patch could be beneficial anyway:

http://gerrit.ovirt.org/#/c/31354/

Comment 14 Michal Skrivanek 2014-08-22 09:31:36 UTC

The fix is close and is expected to get in soon. Keeping the bug open.

Comment 15 Sandro Bonazzola 2014-10-17 12:33:39 UTC

oVirt 3.5 has been released and should include the fix for this issue.
