Bug 1169007
| Summary: | [RFE] New fence agent fence_beaker | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Jan Pokorný [poki] <jpokorny> |
| Component: | fence-agents | Assignee: | Marek Grac <mgrac> |
| Status: | CLOSED WONTFIX | QA Contact: | cluster-qe <cluster-qe> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 7.0 | CC: | cluster-maint, dcallagh, ebaak, jkortus, mgrac, mspqa-list, ncoghlan |
| Target Milestone: | pre-dev-freeze | Keywords: | FutureFeature |
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | Enhancement |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2016-10-10 14:17:02 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description
Jan Pokorný [poki]
2014-11-28 20:39:35 UTC
There is already the rhts-power command which a task can use to power on/off/reboot another system in the recipe set:
https://beaker-project.org/docs/user-guide/task-environment.html#rhts-power

But I guess when you talk about "fencing" you mean just isolating from the network temporarily -- not rebooting the entire system?

Dan, thanks for pointing me to rhts-power. One question though, does it
take effect instantaneously, without any "gracefulness"? If so, it would
be a good fit.
> But I guess when you talk about "fencing" you mean just isolating from
> the network temporarily -- not rebooting the entire system?
Basically fencing for me is either hard (power), IO (network and/or
storage), or combined incl. suicides (cf. fence_sanlock). And I was
looking primarily at the first one; I am not sure how much trouble
network isolation would be, nor do I need it ATM.
So it looks like the hypothetical fence_beaker could utilize rhts-power right
away, which is perfect. Are there any limitations with rhts-power that
should be considered? I suppose one cannot accidentally kill a machine
completely unrelated to the job at hand.
(In reply to Jan Pokorný from comment #2)
> Dan, thanks for pointing me to rhts-power. One question though, does it
> take effect instantaneously, without any "gracefulness"? If so, it would
> be a good fit.

It's not immediate, because the LC only polls for power commands periodically. rhts-power queues the power command on the server, but it doesn't start until the LC picks it up. The polling period is 20 seconds so you can expect the power action to start <= 20 seconds after rhts-power returns.

The power command itself can also take some time -- it depends on many things, like how that particular system is power-controlled and how fast its management controller is (if it has one). S/390 VMs in particular take a while to power cycle.

One other strange thing I have noticed recently is that sometimes the power off commands *are* graceful. That is, the system seems to get a normal ACPI shutdown signal and systemd cleanly stops services. Normally the Beaker power commands do not have this effect. I suspect that some of the BMCs are trying to be nice to the system by shutting it down cleanly when told to power off. I haven't yet had time to investigate when or why this is happening or whether it's correct or desirable.

> I suppose one cannot accidentally kill a machine
> completely unrelated to the job at hand.

One can, and so care should be taken not to do that. We have an old open RFE for authenticating lab controller API requests: bug 843687.

@Dan: If any 'standard' fence agent works with 'graceful' power off then it is an error and we can fix it. In some cases, using a fence agent is slow because we do not trust those devices (like IPMI) too much, so we do: power off, wait until it is really off, power on, wait until it is really on. This is slower than a normal reboot, but our approach allows us to verify whether fencing happened or not, which is usually not possible with a reboot.

Okay, thanks for the info Marek.
I will keep an eye out for any systems which are doing a "graceful" shutdown. Many Beaker systems use the ipmitool power script, which calls "ipmitool power off" directly rather than using the fence_* scripts.
https://git.beaker-project.org/cgit/beaker/tree/LabController/src/bkr/labcontroller/power-scripts/ipmitool

So as not to let this whole thing go stale for too long: it seems that in principle nothing blocks creating the proposed fence-beaker agent (for usage from machines under Beaker's control, presumably in multihost tasks). So Marek, if you agree, could you please reassign this bug under fence-agents as a request for fence-beaker? If there is not enough throughput, I could take it on myself; it seems quite a simple task. The only things not doable at this point seem to be the "list", "monitor", and "status" commands.
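
For illustration, the "power off, wait until it is really off, power on, wait until it is really on" sequence Marek describes earlier in the thread could be sketched roughly as below. This is only a minimal Python sketch under stated assumptions, not the fence-agents implementation: `verified_reboot`, `wait_for_state`, `FenceTimeout`, and `FakeDevice` are hypothetical names, and the `set_power` / `get_status` callables stand in for whatever the real power device offers.

```python
import time
from typing import Callable


class FenceTimeout(Exception):
    """Raised when the device does not reach the requested state in time."""


def wait_for_state(get_status: Callable[[], str], wanted: str,
                   timeout: float, interval: float,
                   sleep: Callable[[float], None] = time.sleep) -> None:
    """Poll get_status() until it returns `wanted` or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        if get_status() == wanted:
            return
        if time.monotonic() >= deadline:
            raise FenceTimeout(f"device did not reach state {wanted!r}")
        sleep(interval)


def verified_reboot(set_power: Callable[[str], None],
                    get_status: Callable[[], str],
                    timeout: float = 60.0, interval: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep) -> None:
    """Power off, confirm the off state, power on, confirm the on state."""
    set_power("off")
    wait_for_state(get_status, "off", timeout, interval, sleep)
    set_power("on")
    wait_for_state(get_status, "on", timeout, interval, sleep)


class FakeDevice:
    """Hypothetical in-memory power controller that reacts after a few polls."""

    def __init__(self, lag_polls: int = 2) -> None:
        self.state = "on"
        self.log: list[str] = []
        self._pending: str | None = None
        self._lag = lag_polls
        self._countdown = 0

    def set_power(self, action: str) -> None:
        # Record the request; the state only changes after `lag_polls` polls.
        self.log.append(action)
        self._pending = action
        self._countdown = self._lag

    def get_status(self) -> str:
        if self._pending is not None:
            if self._countdown == 0:
                self.state = self._pending
                self._pending = None
            else:
                self._countdown -= 1
        return self.state


if __name__ == "__main__":
    dev = FakeDevice(lag_polls=2)
    verified_reboot(dev.set_power, dev.get_status, timeout=5.0, interval=0.1)
    print(dev.log, dev.state)
```

Note that this pattern depends on a working status query. Since "status" is listed above as not doable at this point, a fence_beaker built on rhts-power would presumably not be able to verify the result this way, and any timeout would also have to allow for the LC polling delay of up to 20 seconds mentioned earlier.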