Hide Forgot
Created attachment 836978 [details] vdsm log Description of problem: in case after hook fails, engine report task finish successfully although the plug completed successfully but the after hook doesn't Version-Release number of selected component (if applicable): is27 How reproducible: 100% Steps to Reproduce: 1. add JPG file to /usr/libexec/vdsm/hooks/after_disk_hotplug/ 2. chmod 775 the JPG 3. hotplug DISK Actual results: plug completed successfully and after hook fails with no alert about it Expected results: task should completed successfully and alert about hook failure vdsm.log attached - look for 2013-12-15 17:29:06 for more details
what alert do we expect to see? in engine's side or in vdsm.log? any RFE about it?
Engine side should alert "failed to execute hook /path/to/hook on host {hostanme}" We should open another RFE for vdsm to execute only executables (i.e - do not try to execute JPG/PNG files).
This RFE is involves both vdsm and engine side. at least 2 separate bugs should be opened: on VDSM - should include the additional information on the hook failure in the optional part of the vdsm success response. However this is not sufficient to get it into the audit log. So additonal bug should be opened on engine.
On VDSM side, for most commands that result in hooks being run we can add a field in the response with all the return codes (or at least the non-zero return codes), for example: {'status': ..., 'hookErrors': { 'before_event_a': {'/path/to/hook/1': returnCode1 [, ...]} [, ...]}} And then process on the engine side accordingly. However, I've noticed that some commands can result in hooks being run *after* response is sent to the engine. One such example (to my best understanding) is cont (vmCont) that runs the after hooks only once libvirt has started the VM. In these cases we cannot report hook failures in the above fashion. Is there any way these cases can be reported to the engine?
Is there a real-life use case for this RFE? Something that does not involve a malevolent admin trying to execute jpegs? I think it is wrong for Engine to care about which hook failed, or whether it was a hook that failed an operation. The hooks are intended as a lightweight framework of extending Vdsm without Engine caring about the implementation or knowing that a hook was involved. Lacking such a use case, Please CLOSE|NOTABUG.
The use case is clear. Admin installed a post hook, configured it to not fail the operation , but still want to have indication the hook failed. This is within the hook generic API, and if we don't have it there is no way Admins can tel it had ever happened. Assuming that putting a hook in place matters to the admin , than this is a valid use case.
Arthur ?
I am not convinced that such an admin exits in real life.
This bug is about error handling, in real life the admin would probably copy directory's content into hooks path without noticing non executables files were copied as well, VDSM should not execute such files. There's value to users in reporting back hooks execution failures in a centralized place(Engine > UI > Events)
Hooks are Vdsm code, written by the user. They should be used with care. In particular, a prudent admin should install them via rpm, as some hook require a specific vdsm version. Certainly not by blindly copying directories. If you do this, having a fake executable there is the least of your worries. This amounts to copying random code into /usr/share/vdsm, which would never end well.
(In reply to Dan Kenigsberg from comment #10) > Hooks are Vdsm code, written by the user. They should be used with care. In > particular, a prudent admin should install them via rpm, as some hook > require a specific vdsm version. Certainly not by blindly copying > directories. If you do this, having a fake executable there is the least of > your worries. This amounts to copying random code into /usr/share/vdsm, > which would never end well. Danken there is no way we can handle all fat fingers scenarios, This is about reporting something is wrong ... otherwise the failure of the post hook will go unnoticed. This is a valid request, the only problem is that it needs to change the API in a backwards compatible way (which is always a challenge).
An admin that tries to execute a jpeg deserves every punishment he gets. This request is "valid", but it's not "real". It is based on a creating QE, not on real users or customers. IMO we have much more interesting APIs to define.
I assume there are other scenarios that this may happen as well, Not sure whether we have these handled: - not having enough permissions to handle the hook (I think this one is handled) please correct me if I'm wrong - the script exits with error * the script exits with error code * the script has a syntax error (it may be the same error as above)
I agree that there are cases where collecting the failure modes of hooks in Engine can help the informed administrator. But I think that the failures are so vast, and unrelated to the code of ovirt and vdsm. Just like we do not have a special way to report that a python module failed to be imported into Vdsm since it lacks the right permissions, we should not worry about the same kind of failures with hooks. IMO this RFE came to life due to a good quality engineering work - not based on what customers really need. If you believe that user should know which hook ran, which failed, and when, I think we should define a feature from the top down, and not based on a degenerate corner case (after_hook failing since it's a jpeg).
Based on comment #14 moving to CLOSE WONTFIX