Bug 1965979 - After vdsm stop hooks are not executed due to soft fencing
Summary: After vdsm stop hooks are not executed due to soft fencing
Keywords:
Status: CLOSED DEFERRED
Alias: None
Product: vdsm
Classification: oVirt
Component: Core
Version: 4.40.60.7
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Marcin Sobczyk
QA Contact: Guilherme Santos
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-05-31 10:00 UTC by Petr Matyáš
Modified: 2022-02-01 11:05 UTC (History)

Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-02-01 11:05:45 UTC
oVirt Team: Infra
Embargoed:
lsvaty: blocker-



Description Petr Matyáš 2021-05-31 10:00:39 UTC
Description of problem:
With some trivial hooks in the after_vdsm_stop folder, stopping the vdsm service does not execute the hooks, even though they are executable, because the host is automatically activated right afterwards by soft fencing.
If I put the host into maintenance, the hooks are executed.

Version-Release number of selected component (if applicable):
vdsm-4.40.60.7-1.el8ev.x86_64

How reproducible:
always

Steps to Reproduce:
1. echo 'touch /tmp/after' >/usr/libexec/vdsm/hooks/after_vdsm_stop/touching
2. chmod +x /usr/libexec/vdsm/hooks/after_vdsm_stop/touching
3. systemctl stop vdsmd
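
The steps above can be wrapped in a small helper so that the result is easy to check after each stop. This is only a sketch: the function names install_marker_hook and hook_ran are made up for illustration and are not part of vdsm, and unlike the original steps the generated hook gets an explicit shebang line (an assumption, to rule out the interpreter as a variable).

```shell
#!/bin/sh
# Sketch only: install a marker hook and report whether it ran.
# HOOK_DIR and MARKER default to the paths used in the steps above;
# both can be overridden from the environment.
HOOK_DIR="${HOOK_DIR:-/usr/libexec/vdsm/hooks/after_vdsm_stop}"
MARKER="${MARKER:-/tmp/after}"

install_marker_hook() {
    # Write the hook (with a shebang, which the original steps omit),
    # make it executable, and clear any marker left from an earlier run.
    printf '#!/bin/sh\ntouch %s\n' "$MARKER" > "$HOOK_DIR/touching"
    chmod +x "$HOOK_DIR/touching"
    rm -f "$MARKER"
}

hook_ran() {
    # Succeeds only if the marker file exists.
    [ -e "$MARKER" ]
}
```

Possible usage on a host, as root: run install_marker_hook, then systemctl stop vdsmd, then `hook_ran && echo executed || echo "not executed"`.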

Actual results:
If the host is up in RHV, the hooks are not executed.

Expected results:
after_vdsm_stop hooks should always be executed.

Additional info:

Comment 2 Petr Matyáš 2021-05-31 10:13:37 UTC
Scripts in before_vdsm_start are executed even with this soft fencing, so scripts in after_vdsm_stop should be executed in the same way.

Comment 3 Martin Perina 2021-06-01 06:35:29 UTC
(In reply to Petr Matyáš from comment #0)
> Description of problem:
> Having some trivial hooks in after_vdsm_stop folder and stopping the vdsm
> service is not executing the hooks even though they are executable due to
> host being automatically activated right after with soft fencing.
> If I put to host to maintenance the hooks are executed.
> 
> Version-Release number of selected component (if applicable):
> vdsm-4.40.60.7-1.el8ev.x86_64
> 
> How reproducible:
> always
> 
> Steps to Reproduce:
> 1. echo 'touch /tmp/after' >/usr/libexec/vdsm/hooks/after_vdsm_stop/touching
> 2. chmod +x /usr/libexec/vdsm/hooks/after_vdsm_stop/touching
> 3. systemctl stop vdsmd

Hmm, how could those hooks be loaded into VDSM when VDSM is stopped right after their creation? Shouldn't the correct flow contain an additional VDSM restart step?

1. echo 'touch /tmp/after' >/usr/libexec/vdsm/hooks/after_vdsm_stop/touching
2. chmod +x /usr/libexec/vdsm/hooks/after_vdsm_stop/touching
3. systemctl restart vdsmd
4. systemctl stop vdsmd

Comment 4 Petr Matyáš 2021-06-01 06:56:41 UTC
If you cared to open the log you would see the hook was loaded on multiple occasions. I tried stopping vdsm multiple times (without changing the script), which caused a restart in most cases, with different settings in RHV for the host.

Comment 7 Marcin Sobczyk 2021-06-17 09:57:05 UTC
This is effectively happening because of [1]. There was a bug in systemd [2] that could cause real trouble [3],
and an unfortunate side effect of the fix was that 'after_vdsm_stop' hooks now work less reliably. Since the bug in systemd
has been fixed, I'm discussing with the vdsm maintainers the possibility of reverting [1]. If we decide to do so, I have posted
a patch for OST to check whether 'after_vdsm_stop' hooks work reliably [4].

[1] https://github.com/oVirt/vdsm/commit/f13aa4fe12602777938bf5d36b977ad19053f745
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1761260
[3] https://bugzilla.redhat.com/show_bug.cgi?id=1759388
[4] https://gerrit.ovirt.org/#/c/ovirt-system-tests/+/115299/

Comment 8 Martin Perina 2022-02-01 11:05:45 UTC
Implementing this requires very complicated changes and tests; due to resource limitations, closing as deferred.

