Bug 1043278

Summary: [RFE] after_hook failure (call to VDSM succeeded) should be reported also to the engine audit log
Product: Red Hat Enterprise Virtualization Manager Reporter: Raz Tamir <ratamir>
Component: vdsmAssignee: Dima Kuznetsov <dkuznets>
Status: CLOSED WONTFIX QA Contact:
Severity: medium Docs Contact:
Priority: unspecified    
Version: 3.3.0CC: aberezin, acanan, bazulay, danken, dkuznets, emesika, iheim, lpeer, oourfali, pstehlik, ratamir, ybronhei, yeylon
Target Milestone: ---Keywords: FutureFeature, Improvement, Triaged
Target Release: 3.5.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: infra
Fixed In Version: Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-04-16 20:57:34 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Infra RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Attachments:
Description Flags
vdsm log none

Description Raz Tamir 2013-12-15 16:02:44 UTC
Created attachment 836978 [details]
vdsm log

Description of problem:
in case after hook fails, engine report task finish successfully although the plug completed successfully but the after hook doesn't


Version-Release number of selected component (if applicable):
is27

How reproducible:
100%

Steps to Reproduce:
1. add JPG file to /usr/libexec/vdsm/hooks/after_disk_hotplug/
2. chmod 775 the JPG
3. hotplug DISK

Actual results:
plug completed successfully and after hook fails with no alert about it

Expected results:
task should completed successfully and alert about hook failure

vdsm.log attached - look for 2013-12-15 17:29:06 for more details

Comment 1 Yaniv Bronhaim 2013-12-16 23:03:20 UTC
what alert do we expect to see? in engine's side or in vdsm.log? any RFE about it?

Comment 2 Arthur Berezin 2013-12-21 13:43:17 UTC
Engine side should alert "failed to execute hook /path/to/hook on host {hostanme}" 

We should open another RFE for vdsm to execute only executables (i.e - do not try to execute JPG/PNG files).

Comment 3 Barak 2013-12-31 10:42:07 UTC
This RFE is involves both vdsm and engine side.

at least 2 separate bugs should be opened:


on VDSM - should include the additional information on the hook failure in the optional part of the vdsm success response.


However this is not sufficient to get it into the audit log.
So additonal bug should be opened on engine.

Comment 4 Dima Kuznetsov 2014-03-19 13:51:58 UTC
On VDSM side, for most commands that result in hooks being run we can add a field in the response with all the return codes (or at least the non-zero return codes), for example:

{'status': ..., 'hookErrors': { 'before_event_a': {'/path/to/hook/1': returnCode1 [, ...]} [, ...]}}

And then process on the engine side accordingly.

However, I've noticed that some commands can result in hooks being run *after* response is sent to the engine. One such example (to my best understanding) is cont (vmCont) that runs the after hooks only once libvirt has started the VM. In these cases we cannot report hook failures in the above fashion. Is there any way these cases can be reported to the engine?

Comment 5 Dan Kenigsberg 2014-03-19 21:52:15 UTC
Is there a real-life use case for this RFE? Something that does not involve a malevolent admin trying to execute jpegs?

I think it is wrong for Engine to care about which hook failed, or whether it was a hook that failed an operation. The hooks are intended as a lightweight framework of extending Vdsm without Engine caring about the implementation or knowing that a hook was involved.

Lacking such a use case, Please CLOSE|NOTABUG.

Comment 6 Barak 2014-03-20 13:07:34 UTC
The use case is clear.

Admin installed a post hook, configured it to not fail the operation , but still want to have indication the hook failed.

This is within the hook generic API, and if we don't have it there is no way Admins can tel it had ever happened.

Assuming that putting a hook in place matters to the admin , than this is a valid use case.

Comment 7 Barak 2014-03-20 13:08:13 UTC
Arthur ?

Comment 8 Dan Kenigsberg 2014-03-20 14:27:35 UTC
I am not convinced that such an admin exits in real life.

Comment 9 Arthur Berezin 2014-03-20 18:38:05 UTC
This bug is about error handling, in real life the admin would probably copy directory's content into hooks path without noticing non executables files were copied as well, VDSM should not execute such files. 

There's value to users in reporting back hooks execution failures in a centralized place(Engine > UI > Events)

Comment 10 Dan Kenigsberg 2014-03-21 09:56:12 UTC
Hooks are Vdsm code, written by the user. They should be used with care. In particular, a prudent admin should install them via rpm, as some hook require a specific vdsm version. Certainly not by blindly copying directories. If you do this, having a fake executable there is the least of your worries. This amounts to copying random code into /usr/share/vdsm, which would never end well.

Comment 11 Barak 2014-03-23 17:56:03 UTC
(In reply to Dan Kenigsberg from comment #10)
> Hooks are Vdsm code, written by the user. They should be used with care. In
> particular, a prudent admin should install them via rpm, as some hook
> require a specific vdsm version. Certainly not by blindly copying
> directories. If you do this, having a fake executable there is the least of
> your worries. This amounts to copying random code into /usr/share/vdsm,
> which would never end well.


Danken there is no way we can handle all fat fingers scenarios,
This is about reporting something is wrong ... otherwise the failure of the post hook will go unnoticed.

This is a valid request, the only problem is that it needs to change the API in a backwards compatible way (which is always a challenge).

Comment 12 Dan Kenigsberg 2014-03-24 10:24:44 UTC
An admin that tries to execute a jpeg deserves every punishment he gets.

This request is "valid", but it's not "real". It is based on a creating QE, not on real users or customers. IMO we have much more interesting APIs to define.

Comment 13 Barak 2014-03-24 14:22:18 UTC
I assume there are other scenarios that this may happen as well,
Not sure whether we have these handled:

- not having enough permissions to handle the hook (I think this one is handled)
  please correct me if I'm wrong
- the script exits with error
  * the script exits with error code
  * the script has a syntax error (it may be the same error as above)

Comment 14 Dan Kenigsberg 2014-04-16 11:35:21 UTC
I agree that there are cases where collecting the failure modes of hooks in Engine can help the informed administrator. But I think that the failures are so vast, and unrelated to the code of ovirt and vdsm. Just like we do not have a special way to report that a python module failed to be imported into Vdsm since it lacks the right permissions, we should not worry about the same kind of failures with hooks.

IMO this RFE came to life due to a good quality engineering work - not based on what customers really need. If you believe that user should know which hook ran, which failed, and when, I think we should define a feature from the top down, and not based on a degenerate corner case (after_hook failing since it's a jpeg).

Comment 15 Barak 2014-04-16 20:57:34 UTC
Based on comment #14  moving to CLOSE WONTFIX