Bug 1423657

Summary: Add value to engine-config to set timeout after successful fence start
Product: [oVirt] ovirt-engine
Reporter: Petr Matyáš <pmatyas>
Component: BLL.Infra
Assignee: Ondra Machacek <omachace>
Status: CLOSED CURRENTRELEASE
QA Contact: Petr Matyáš <pmatyas>
Severity: high
Docs Contact:
Priority: unspecified
Version: 4.1.1
CC: bugs, lsvaty, lveyde, mgoldboi, mperina, pstehlik
Target Milestone: ovirt-4.1.2
Flags: rule-engine: ovirt-4.1+, rule-engine: ovirt-4.2+, rule-engine: blocker+, mgoldboi: planning_ack+, mperina: devel_ack+, lsvaty: testing_ack+
Target Release: 4.1.2
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
When a power management start or restart action is executed, we switch the host to the REBOOT state and wait for the number of seconds defined in the 'ServerRebootTimeout' engine-config property. After that timeout we switch the host to the NON_RESPONSIVE state, so that host monitoring can handle the host.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-05-23 08:13:28 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Infra
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Petr Matyáš 2017-02-17 14:15:06 UTC
Description of problem:
Currently the fencing mechanism waits about 2 minutes before it tries to fence the host again, but my host takes about 3-4 minutes to boot. As a result the host is fenced again and again.
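
To illustrate the timing problem, here is a minimal Python sketch. It is not engine code: the function name, the attempt cap, and the assumption that every fence restarts the boot from the beginning are all hypothetical, and the numbers are just the rough figures from this report.

```python
def fence_attempts(boot_seconds, retry_after_seconds, max_attempts=5):
    """Return how many fence operations happen before the host is left alone
    long enough to finish booting (capped at max_attempts).

    Assumption: every fence restarts the boot from the beginning.
    """
    for attempt in range(1, max_attempts + 1):
        if retry_after_seconds >= boot_seconds:
            return attempt  # host came up before the next fence was due
    return max_attempts  # host never got enough time; fencing kept repeating


# Host needs ~3.5 minutes to boot, fencing retries after ~2 minutes.
print(fence_attempts(boot_seconds=210, retry_after_seconds=120))
# Same host, but the engine waits 5 minutes before acting again.
print(fence_attempts(boot_seconds=210, retry_after_seconds=300))
```

With a 2 minute retry the loop only stops at the cap, while a wait longer than the boot time needs a single fence.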

Version-Release number of selected component (if applicable):
4.1.1-1

How reproducible:
always

Steps to Reproduce:
1. have a host that takes more than 2 minutes to boot
2. fence the host

Actual results:
repeated fencing

Expected results:
one successful fence

Additional info:

Comment 1 Martin Perina 2017-02-17 14:40:03 UTC
It makes sense to wait a bit after a successful power management start operation in the fencing flow before we allow host monitoring to try to contact the host. We already use ServerRebootTimeout (5 minutes by default) during the install host flow when a restart of the host is required.

So I'd use the same config value, ServerRebootTimeout, inside the power management start flow:

1. Set host status to Reboot
2. Execute power management start
3. If the start was successful, wait until the ServerRebootTimeout interval passes
4. Set host status to Maintenance or NonResponsive (depending on how StartVdsCommand was invoked)
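
The four steps above could be sketched roughly as follows. This is only an illustration in Python, not the ovirt-engine implementation: `start_host`, `power_on`, the dict-based host, and the `final_status` parameter are hypothetical names. The caller passes the final status because this comment does not say which invocation of StartVdsCommand maps to which state. Setting the final status even when the start fails is also an assumption.

```python
import time
from enum import Enum


class HostStatus(Enum):
    REBOOT = "Reboot"
    MAINTENANCE = "Maintenance"
    NON_RESPONSIVE = "NonResponsive"


# 5 minutes, the default mentioned above; the real value comes from engine-config.
SERVER_REBOOT_TIMEOUT_SECONDS = 300


def start_host(host, power_on, final_status,
               timeout=SERVER_REBOOT_TIMEOUT_SECONDS, sleep=time.sleep):
    """Hypothetical sketch of the proposed power management start flow."""
    # Step 1: mark the host as rebooting so monitoring leaves it alone.
    host["status"] = HostStatus.REBOOT
    # Step 2: execute the power management start (hypothetical callable).
    started = power_on(host)
    # Step 3: only wait when the start actually succeeded.
    if started:
        sleep(timeout)
    # Step 4: hand the host over in the state chosen by the caller.
    host["status"] = final_status
    return started
```

The `sleep` parameter is injectable only so the flow can be exercised without actually waiting five minutes.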

Comment 2 Petr Matyáš 2017-04-26 15:14:15 UTC
Verified on 4.1.2-1