Bug 1543103

Summary: Call vdsm 'after_vm_pause' hooks when the VM has been paused because of an I/O error
Product: Red Hat Enterprise Virtualization Manager Reporter: Miguel Martin <mmartinv>
Component: vdsm Assignee: Miguel Martin <mmartinv>
Status: CLOSED ERRATA QA Contact: Polina <pagranat>
Severity: high Docs Contact:
Priority: high    
Version: unspecified CC: bugs, lsurette, lveyde, michal.skrivanek, mkalinin, mmartinv, pagranat, ratamir, srevivo, trichard, ycui, ykaul, ylavi
Target Milestone: ovirt-4.2.2 Flags: lsvaty: testing_plan_complete-
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: vdsm v4.20.18 Doc Type: Bug Fix
Doc Text:
Previously, the after_vm_pause VDSM hook was not executed after I/O errors. This has now been fixed.
Story Points: ---
Clone Of:
: 1546967 Environment:
Last Closed: 2018-05-15 17:54:02 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Virt RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1540548, 1546967    
Attachments:
Description Flags
after_vm_pause_hook.png none
vdsm_engine.tar.gz none

Description Miguel Martin 2018-02-07 17:56:42 UTC
Description of problem:

vdsm does not call 'after_vm_pause' hooks when the VM has been paused because of an I/O error.

How reproducible:
Always

Steps to Reproduce:
1. Install a vdsm hook in the "/usr/libexec/vdsm/hooks/after_vm_pause" directory
2. Force a VM to be paused because of an I/O error

Actual results:
The hook in the "/usr/libexec/vdsm/hooks/after_vm_pause" directory is not executed

Expected results:
All hooks in the "/usr/libexec/vdsm/hooks/after_vm_pause" directory are executed
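
For reproduction, a hook can be as small as the sketch below. It assumes a Python interpreter exists at the path on the #! line and that the system temporary directory is writable by (and visible to) the user the hook runs as. The marker file name and the function name are hypothetical and do not come from vdsm.

```python
#!/usr/bin/python
# Hypothetical test hook: appends one line per invocation to a marker file.
import os
import sys
import tempfile
import time

# Assumption: the temp directory is writable by the user running the hook.
MARKER = os.path.join(tempfile.gettempdir(), "after_vm_pause_hook.log")


def record_invocation(path=MARKER):
    """Append a timestamped line so each hook run leaves visible evidence."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    with open(path, "a") as f:
        f.write("%s hook ran as uid %d\n" % (stamp, os.getuid()))
    return path


if __name__ == "__main__":
    try:
        record_invocation()
    except (IOError, OSError) as e:
        # Report on stderr and exit non-zero so the failure is not silent.
        sys.stderr.write("hook failed: %s\n" % e)
        sys.exit(1)
```

The file has to be executable (for example chmod 755) for the #! line to take effect.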

Comment 4 RHV bug bot 2018-02-16 16:25:03 UTC
WARN: Bug status (ON_QA) wasn't changed but the following should be fixed:

[Found non-acked flags: '{'rhevm-4.2-ga': '?'}', ]

For more info please contact: rhv-devops

Comment 6 Polina 2018-02-21 08:54:56 UTC
Hi Miguel,

I tested this with vdsm-4.20.18-1.el7ev.x86_64 and rhvm-4.2.2-0.1.el7.noarch (RHEL 7.5),
and the problem does not appear to be resolved.

Could you please review my steps and tell me whether I am missing something in the reproduction?

1. Put a Python or shell script (I tried both) in /usr/libexec/vdsm/hooks/after_vm_pause on the host where the VM is running.
Check that the script appears under Host Hooks in the Admin Portal (see the attached after_vm_pause_hook.png).

2. Block the iSCSI storage on the same host, which causes the VM to pause with an I/O error (I use an HA VM with no lease), then unblock it, which causes the VM to resume.

Result: the script placed in /usr/libexec/vdsm/hooks/after_vm_pause is not executed.

Comment 7 Polina 2018-02-21 08:56:36 UTC
Created attachment 1398569
after_vm_pause_hook.png

Comment 8 Michal Skrivanek 2018-02-23 08:00:12 UTC
can you attach vdsm.log please?

Comment 9 Polina 2018-02-25 07:32:22 UTC
Please see the vdsm and engine logs attached in vdsm_engine.tar.gz.

In vdsm.log, please look starting from:

2018-02-25 09:19:56,634+0200 INFO  (jsonrpc/6) [jsonrpc.JsonRpcServer] RPC call Host.getAllVmStats succeeded in 0.01 seconds (__init__:573)


in engine.log starting from:

2018-02-25 09:19:03,277+02 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ForkJoinPool-1-worker-5) [] EVENT_ID: USER_RUN_VM(32), VM golden_env_mixed_virtio_2_0 started on Host host_mixed_1

2018-02-25 09:23:40,047+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ForkJoinPool-1-worker-7) [] EVENT_ID: VM_PAUSED_EIO(145), VM golden_env_mixed_virtio_2_0 has been paused due to storage I/O problem.

Comment 10 Polina 2018-02-25 07:33:13 UTC
Created attachment 1400533
vdsm_engine.tar.gz

Comment 11 Michal Skrivanek 2018-02-26 13:51:22 UTC
in the log you can see it tried to execute it

2018-02-25 09:23:39,844+0200 INFO  (libvirt/events) [root] /usr/libexec/vdsm/hooks/after_vm_pause/test_hook.py: rc=2 err=/usr/libexec/vdsm/hooks/after_vm_pause/test_hook.py: line 1: import: command not found

I guess you missed the #! to run the right interpreter
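
The "import: command not found" message is consistent with the Python file being interpreted as a shell script, which is what usually happens when a script has no #! line. A minimal sketch of a check for that follows; the function name is hypothetical and this is not part of vdsm.

```python
def has_shebang(path):
    """Return True if the file starts with the two bytes '#!'."""
    with open(path, "rb") as f:
        return f.read(2) == b"#!"
```

For example, has_shebang("/usr/libexec/vdsm/hooks/after_vm_pause/test_hook.py") would be expected to return False for a script that begins directly with an import statement.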

Comment 12 Polina 2018-02-27 10:12:35 UTC
Yes, you are right. I had only checked that the script runs when executed manually and didn't add the #!/usr/bin/python2 line. It is added now, but the script is still not executed on I/O pause/resume. The error in the vdsm log:

2018-02-27 10:30:11,084+0200 DEBUG (event/26) [root] FINISH thread <Thread(event/26, started daemon 139786127886080)> (concurrent:195)
2018-02-27 10:30:11,111+0200 DEBUG (libvirt/events) [root] FAILED: <err> = 'taskset: failed to execute /usr/libexec/vdsm/hooks/after_vm_pause/test_hook.py: No such file or directory\n'; <rc> = 1 (commands:86)
2018-02-27 10:30:11,113+0200 INFO  (libvirt/events) [root] /usr/libexec/vdsm/hooks/after_vm_pause/test_hook.py: rc=1 err=taskset: failed to execute /usr/libexec/vdsm/hooks/after_vm_pause/test_hook.py: No such file or directory
 (hooks:110)
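
To pick hook results out of a long vdsm.log, a small parser along these lines may help. The pattern is inferred only from the hook result lines quoted in this thread, so treat the format as an assumption rather than something taken from the vdsm source; the names are hypothetical.

```python
import re

# Assumed format: "... [root] <hook path>: rc=<n> err=<text> (hooks:<line>)"
HOOK_RESULT = re.compile(
    r"\[root\] (?P<hook>/\S+): rc=(?P<rc>\d+) err=(?P<err>.*)")
# Optional trailing source-location marker such as "(hooks:110)".
TRAILER = re.compile(r"\s*\(hooks:\d+\)\s*$")


def parse_hook_result(line):
    """Return (hook_path, rc, err) for a hook result line, else None."""
    m = HOOK_RESULT.search(line)
    if m is None:
        return None
    err = TRAILER.sub("", m.group("err")).strip()
    return m.group("hook"), int(m.group("rc")), err
```

A non-zero rc, or a non-empty err, points at a hook that was started but did not run cleanly.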

Comment 13 Michal Skrivanek 2018-02-27 10:29:39 UTC
well, it's still not executable. Either way, there is an attempt to run it so the bug is verified:)
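
A "No such file or directory" error for a file that does exist commonly means the interpreter named on the #! line could not be found (a missing binary, or a stray carriage return at the end of that line). The sketch below checks for that as well as for a missing execute bit. It is only an illustration with a hypothetical function name, and the thread does not confirm which of these applied here.

```python
import os


def exec_problems(path):
    """Return a list of reasons why executing 'path' directly may fail."""
    if not os.path.isfile(path):
        return ["file does not exist"]
    problems = []
    if not os.access(path, os.X_OK):
        problems.append("file is not executable by the current user")
    with open(path, "rb") as f:
        first = f.readline()
    if not first.startswith(b"#!"):
        problems.append("no #! line")
        return problems
    if first.endswith(b"\r\n"):
        problems.append("CRLF line ending on the #! line")
    parts = first[2:].split()
    if not parts:
        problems.append("empty #! line")
    else:
        interp = parts[0].decode("utf-8", "replace")
        if not os.path.exists(interp):
            problems.append("interpreter %s does not exist" % interp)
    return problems
```

An empty list means none of these particular checks found a problem; it does not guarantee that the hook will run.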

Comment 14 Polina 2018-02-28 08:01:14 UTC
(In reply to Michal Skrivanek from comment #13)
> well, it's still not executable. Either way, there is an attempt to run it
> so the bug is verified:)

just to confirm:
I've put a simple Python script at
/usr/libexec/vdsm/hooks/after_vm_pause/50_create_file (chmod 755), which runs fine when executed manually (the script is below).
vdsm recognizes the created hook (2018-02-28 09:49:46,511+0200 INFO  (libvirt/events) [root] /usr/libexec/vdsm/hooks/after_vm_pause/50_create_file: rc=0 err= (hooks:110)), but the script itself is not executed.


#!/usr/bin/python

import os

d = os.path.dirname(__file__) # directory of script
p = r'{}/results/hook_test'.format(d) # path to be created

try:
    os.makedirs(p)
except OSError:
    pass

Please confirm whether this is enough to verify this bug.
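
One possible explanation, not confirmed anywhere in this thread, is that the hook runs as a different user than the manual test and cannot write to the hooks directory. The script above catches every OSError, so a permission failure would still end with rc=0 and no visible result. The sketch below is a variant that ignores only the "already exists" case and reports anything else; the function name is hypothetical.

```python
import errno
import os
import sys


def make_results_dir(script_path, err=sys.stderr):
    """Create results/hook_test next to the script; return an exit code."""
    d = os.path.dirname(os.path.abspath(script_path))
    p = os.path.join(d, "results", "hook_test")
    try:
        os.makedirs(p)
    except OSError as e:
        # An existing directory is fine; anything else is a real failure.
        if e.errno == errno.EEXIST and os.path.isdir(p):
            return 0
        err.write("cannot create %s: %s\n" % (p, e))
        return 1
    return 0
```

In a hook script, the last line would be sys.exit(make_results_dir(__file__)), so that a failure appears as a non-zero rc and an err message in vdsm.log.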

Comment 15 Michal Skrivanek 2018-02-28 11:52:47 UTC
Dunno, you could probably have just copied existing code. Maybe the file extension or imports are not correct, but that's really not the point of this verification; the hook mechanism is the same for all the hooks.

I think you've verified this bug well enough.

Comment 20 errata-xmlrpc 2018-05-15 17:54:02 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:1489

Comment 21 Franta Kust 2019-05-16 13:06:50 UTC
BZ<2>Jira Resync

Comment 22 Daniel Gur 2019-08-28 13:13:41 UTC
sync2jira

Comment 23 Daniel Gur 2019-08-28 13:17:55 UTC
sync2jira