Description of problem: When a VM goes down, it leaves two fields behind, exitMessage and exitCode, to be collected by RHEV-M and reported to its admin. On failure (exitCode == 1), the exitMessage is passed verbatim to the VM log. Instead, I suggest having multiple, meaningful exitCode values, to be translated into exit text by RHEV-M according to locale. A similar method is used for command results. Livnat suggests that it is too late to fix this in rhev-2.2, so I'm opening this for 6.0-3.0.
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux major release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Major release. This request is not yet committed for inclusion.
This feature request did not get resolved in time for Feature Freeze for the current Red Hat Enterprise Linux release and has now been denied. You may re-open your request by requesting your support representative to propose it for the next release.
Closing old bugs. If this issue is still relevant/important in current version, please re-open the bug.
An oVirt wiki page would be useful for this, to specify where the user can see the messages and where to find the exitReason (vdsm log / vm_dynamic in the database). The test plan will be edited after further specification from DEV: https://tcms.engineering.redhat.com/plan/14582/rhevm-compute-virt-internationalize-exitmessage-use-meaningful-exitcode#reviewcases
Francesco, would you please add some description here? It may be worth adding a Doc Text
What was added here is actually the foundation for the exitMessage internationalization: the 'use meaningful exitCode' part. This feature affects the end of life of a VM, that is, when a VM goes Down for whatever reason.

Before this patch, VDSM used to report, along with 'status=Down', one very terse exit code:
* SUCCESS for a planned, expected VM down, e.g. after a user-initiated shutdown
* ERROR for an unplanned, unexpected VM down, e.g. if the QEMU process died
and an opaque 'exitMessage', an English free-form text description of the event.

After this patch, VDSM also reports an exit reason code, 'exitReason', which, akin to the UNIX 'errno' variable, describes why a VM has gone down. This field is observable:
* in VDSM output, when it reports a VM down
* in the Engine's DB, in the vmDynamic table

Known values as of oVirt 3.5:

code  summary                 description
0     success                 The VM has exited gracefully
1     generic error           Unspecified error code
2     lost qemu connection    The VM has lost the connection with QEMU
3     libvirt start failed    The VM failed to start through libvirt
4     migration succeeded     The VM was successfully migrated and now runs on the destination host
5     save state succeeded    The VM state was successfully saved
6     admin shutdown          The VM was shut down by the admin from the engine UI
7     user shutdown           The VM was shut down by a user from within the guest
8     migration failed        The VM failed to migrate, and did not move from the source host
9     libvirt domain missing  Failed to find the libvirt domain for the VM

More codes will be added in the future.

On top of that, we can build the 'internationalization of the exitMessage' part by mapping those exitReason values to localized strings to be shown in the Engine UI. It is worth mentioning that Engine can (and should) use this exitReason information for its own decisions about the VM lifecycle, to achieve a more detailed understanding of what happened to a VM.
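To make the mapping idea concrete, here is a minimal sketch in Python of how a consumer might turn an exitReason code into a localized string. The numeric values and English descriptions come from the table above; the enum member names, the MESSAGES catalog, the describe_exit() helper and the single Italian entry are all hypothetical and only for illustration. This is not Engine or VDSM code, and the VDSM API schema remains the authoritative list of codes.

```python
from enum import IntEnum


class VmExitReason(IntEnum):
    # Values as listed for oVirt 3.5; member names are assumptions.
    SUCCESS = 0
    GENERIC_ERROR = 1
    LOST_QEMU_CONNECTION = 2
    LIBVIRT_START_FAILED = 3
    MIGRATION_SUCCEEDED = 4
    SAVE_STATE_SUCCEEDED = 5
    ADMIN_SHUTDOWN = 6
    USER_SHUTDOWN = 7
    MIGRATION_FAILED = 8
    LIBVIRT_DOMAIN_MISSING = 9


# Hypothetical message catalog: complete for "en", partial for "it".
MESSAGES = {
    "en": {
        VmExitReason.SUCCESS: "The VM has exited gracefully",
        VmExitReason.GENERIC_ERROR: "Unspecified error code",
        VmExitReason.LOST_QEMU_CONNECTION:
            "The VM has lost the connection with QEMU",
        VmExitReason.LIBVIRT_START_FAILED:
            "The VM failed to start through libvirt",
        VmExitReason.MIGRATION_SUCCEEDED:
            "The VM was successfully migrated and now runs on the "
            "destination host",
        VmExitReason.SAVE_STATE_SUCCEEDED:
            "The VM state was successfully saved",
        VmExitReason.ADMIN_SHUTDOWN:
            "The VM was shut down by the admin from the engine UI",
        VmExitReason.USER_SHUTDOWN:
            "The VM was shut down by a user from within the guest",
        VmExitReason.MIGRATION_FAILED:
            "The VM failed to migrate, and did not move from the source host",
        VmExitReason.LIBVIRT_DOMAIN_MISSING:
            "Failed to find the libvirt domain for the VM",
    },
    "it": {
        # Illustrative translation only.
        VmExitReason.ADMIN_SHUTDOWN:
            "La VM è stata spenta dall'amministratore",
    },
}


def describe_exit(reason_code, locale="en"):
    """Map a numeric exitReason to text, falling back to English."""
    try:
        reason = VmExitReason(reason_code)
    except ValueError:
        # Newer VDSM versions may report codes this table does not know.
        return "Unknown exit reason (%d)" % reason_code
    catalog = MESSAGES.get(locale, MESSAGES["en"])
    return catalog.get(reason, MESSAGES["en"][reason])
```

Falling back to English for a missing translation, and to a generic text for an unknown code, keeps the lookup safe when more codes are added later.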
Sorry, forgot to mention that the most up-to-date source for the exitReason values is always the VDSM API schema. In a VDSM (>= 4.15.0) source tree, open vdsm/rpc/vdsmapi-schema.json and look up the 'VmExitReason' key.
Tested this on basic scenarios:
1. power off from webadmin - vmID - cde10514-6396-43ee-bc72-41827c146120
2. shutdown from webadmin - vmID - 19bb1992-47fc-4ce1-8669-307784ab1e95
3. shutdown from VM - vmID - c8490d6e-e326-4d68-aaae-f0472f8641c2
4. reboot from VM - vmID - 4d7aa507-1b32-4618-a5a2-884500dbbbc1

Results:
1. No log reports exitMessage or exitReason; this should be reported.
2. exitReason: 7, but the shutdown was initiated by the admin, so it should be 6:
Thread-199::DEBUG::2014-07-30 16:31:54,556::BindingXMLRPC::1134::vds::(wrapper) return vmGetStats with {'status': {'message': 'Done', 'code': 0}, 'statsList': [{'status': 'Down', 'exitMessage': 'User shut down from within the guest', 'vmId': '19bb1992-47fc-4ce1-8669-307784ab1e95', 'exitReason': 7, 'timeOffset': '7200', 'exitCode': 0}]}
3. Correctly reported:
Thread-199::DEBUG::2014-07-30 16:31:45,269::BindingXMLRPC::1134::vds::(wrapper) return vmGetStats with {'status': {'message': 'Done', 'code': 0}, 'statsList': [{'status': 'Down', 'exitMessage': 'User shut down from within the guest', 'vmId': 'c8490d6e-e326-4d68-aaae-f0472f8641c2', 'exitReason': 7, 'timeOffset': '0', 'exitCode': 0}]}
4. Nothing in the vdsm log with exitReason; this is correct.

Because of test cases 1 and 2, moving back to ASSIGNED. Please add an exitReason for power off, and report a different exitReason when the shutdown is initiated by the admin from WAP than when it is initiated by the user from within the VM.
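For anyone repeating this verification, a small Python sketch like the following could pull the exit fields out of vdsm.log lines such as the ones quoted above. It assumes the vmGetStats reply is logged as a complete Python dict literal on a single line, as in these samples; the MARKER constant and the exit_info_from_log_line() helper are hypothetical names, not part of VDSM.

```python
import ast

# Text that precedes the reply dict in the log lines quoted above.
MARKER = "return vmGetStats with "


def exit_info_from_log_line(line):
    """Return (vmId, exitCode, exitReason) tuples for Down VMs in one line.

    Lines that do not contain a vmGetStats reply yield an empty list.
    """
    _, found, payload = line.partition(MARKER)
    if not found:
        return []
    # literal_eval only parses Python literals; it does not execute code.
    reply = ast.literal_eval(payload.strip())
    return [
        (vm["vmId"], vm.get("exitCode"), vm.get("exitReason"))
        for vm in reply.get("statsList", [])
        if vm.get("status") == "Down"
    ]
```

Applied to the log line from test case 2, this would return the vmId 19bb1992-47fc-4ce1-8669-307784ab1e95 with exitCode 0 and exitReason 7. A truncated or wrapped log line would make ast.literal_eval raise an error, so real use would need error handling around it.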
(In reply to Lukas Svaty from comment #11)
> tested this on basic scenario
> 1. power off from webadmin - vmID - cde10514-6396-43ee-bc72-41827c146120
> 2. shutdown from webadmin - vmID - 19bb1992-47fc-4ce1-8669-307784ab1e95
> 3. shutdown from VM - vm ID -c8490d6e-e326-4d68-aaae-f0472f8641c2
> 4. reboot from VM - vmID - 4d7aa507-1b32-4618-a5a2-884500dbbbc1
>
> 1. no log reporting exitMessage or exitReason
> this should be reported

It is reported, although in a different way :) This flow is different, since Engine sends a synchronous VM.destroy() verb to VDSM:

Thread-13::DEBUG::2014-08-07 09:30:24,930::vm::4592::vm.Vm::(deleteVm) vmId=`56d1c657-dd76-4609-a207-c050699be5be`::Total desktops after destroy of 56d1c657-dd76-4609-a207-c050699be5be is 0
libvirtEventLoop::DEBUG::2014-08-07 09:30:24,930::vm::2397::vm.Vm::(setDownStatus) vmId=`56d1c657-dd76-4609-a207-c050699be5be`::Changed state to Down: Admin shut down from the engine (code=6)
Thread-13::DEBUG::2014-08-07 09:30:24,932::BindingXMLRPC::1145::vds::(wrapper) return vmDestroy with {'status': {'message': 'Machine destroyed', 'code': 0}}

After that, Engine doesn't collect the exitReason, but this may be considered an Engine bug (anyway, the exitReason is implicit if the call succeeds). In the other flows, Engine sends an async VM.shutdown request and polls until the VM is detected as 'Down'; the exitReason is then collected from these stats.

> 2. exitReason: 7, but the shutdown was iniciated by admin so should be 6
> Thread-199::DEBUG::2014-07-30
> 16:31:54,556::BindingXMLRPC::1134::vds::(wrapper) return vmGetStats with
> {'status': {'message': 'Done', 'code': 0}, 'statsList': [{'status': 'Down',
> 'exitMessage': 'User shut down from within the guest', 'vmId':
> '19bb1992-47fc-4ce1-8669-307784ab1e95', 'exitReason': 7, 'timeOffset':
> '7200', 'exitCode': 0}]}

Still investigating.
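The asynchronous flow described in this comment (request a shutdown, then poll the stats until the VM is 'Down' and read the exitReason from them) can be sketched roughly as below. This only illustrates the pattern and is not how Engine is implemented: get_stats is a hypothetical callable standing in for whatever returns one VM's stats dict, and the timeout and interval values are arbitrary.

```python
import time


def wait_for_exit_reason(get_stats, vm_id, timeout=60.0, interval=1.0,
                         sleep=time.sleep, clock=time.monotonic):
    """Poll get_stats(vm_id) until status is 'Down'; return its exitReason.

    get_stats is a hypothetical callable returning a dict shaped like one
    entry of 'statsList' in the vmGetStats replies quoted in this bug.
    Returns None if the stats report 'Down' without an exitReason.
    """
    deadline = clock() + timeout
    while True:
        stats = get_stats(vm_id)
        if stats.get("status") == "Down":
            return stats.get("exitReason")
        if clock() >= deadline:
            raise TimeoutError("VM %s did not go Down in time" % vm_id)
        sleep(interval)
```

In the synchronous VM.destroy() flow there is no such polling step, which matches the observation that the exitReason is not collected there.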
(In reply to Francesco Romani from comment #12)
> (In reply to Lukas Svaty from comment #11)
> > 2. exitReason: 7, but the shutdown was iniciated by admin so should be 6
> > Thread-199::DEBUG::2014-07-30
> > 16:31:54,556::BindingXMLRPC::1134::vds::(wrapper) return vmGetStats with
> > {'status': {'message': 'Done', 'code': 0}, 'statsList': [{'status': 'Down',
> > 'exitMessage': 'User shut down from within the guest', 'vmId':
> > '19bb1992-47fc-4ce1-8669-307784ab1e95', 'exitReason': 7, 'timeOffset':
> > '7200', 'exitCode': 0}]}
>
> Still investigating

Having a hard time reproducing this, but this patch could be beneficial anyway: http://gerrit.ovirt.org/#/c/31354/
The fix is close and is expected to get in soon. Keeping the bug open.
oVirt 3.5 has been released and should include the fix for this issue.