Bug 1020858 - [RFE] rescue mode for hosted engine
Status: CLOSED ERRATA
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-hosted-engine-setup
Version: 3.3.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: ---
Target Release: 3.3.0
Assigned To: Sandro Bonazzola
QA Contact: Leonid Natapov
Whiteboard: integration
Keywords: FutureFeature, Improvement, Triaged
Depends On:
Blocks:
Reported: 2013-10-18 07:45 EDT by Pablo Iranzo Gómez
Modified: 2018-03-12 11:42 EDT

See Also:
Fixed In Version: ovirt-hosted-engine-setup-1.0.0-0.7.beta2.el6ev
Doc Type: Enhancement
Doc Text:
Feature: Hosted engine maintenance mode
Reason: Provide a method to stop hosted engine automation in order to maintain the hosted engine VM or the hosts on which it runs.
Result (if any): The hosted-engine tool has 2 new options added for this feature:
  --set-maintenance=<local|global|none>
    This sets the maintenance mode of the hosted engine agent:
    - local: shut the engine VM down on the local host (if running), and lower the host's HA score to 0. This allows for maintenance of the local host, including putting the host into maintenance mode in the engine.
    - global: pause all activity of all HA agents, leaving the engine VM running on its current host. The VM can now be started, shut down, and/or changed as desired without the HA agent taking any action to ensure it is up and running.
    - none: un-set maintenance mode and resume normal operation on HA agents.
    Note that this option affects only the scope of maintenance in force at the time; that is, turning off local maintenance will turn it off only on the local host, while turning off global maintenance will turn it off for all hosts of the cluster.
  --vm-start-paused
    This option will create the VM in paused state, allowing the VM's domain XML to be changed before the VM runs. One useful application of this is to add/remove devices, particularly boot media.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-01-21 11:54:28 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Infra
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
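
For illustration only, the three --set-maintenance modes listed in the Doc Text could be wrapped in a small helper that validates the mode before building the command line. This is a minimal sketch, not part of the merged patches: the function name maintenance_argv is hypothetical, nothing is executed, and the flag spelling is copied from the Doc Text. Shipped releases may spell the option differently (for example, a separate --mode argument), so check hosted-engine --help on the host before relying on it.

```python
# Modes listed in the Doc Text for --set-maintenance.
MAINTENANCE_MODES = ("local", "global", "none")


def maintenance_argv(mode):
    """Build (but do not run) the argv that sets the maintenance mode.

    Hypothetical helper: the flag spelling follows the Doc Text of this
    bug and may differ in the release installed on a given host.
    """
    if mode not in MAINTENANCE_MODES:
        raise ValueError(
            "mode must be one of %s, got %r"
            % (", ".join(MAINTENANCE_MODES), mode)
        )
    return ["hosted-engine", "--set-maintenance=" + mode]


if __name__ == "__main__":
    # Dry run: print each command line instead of executing it.
    for mode in MAINTENANCE_MODES:
        print(" ".join(maintenance_argv(mode)))
```

Rejecting anything outside local/global/none mirrors the move in the comments below from a True/False flag to the three named modes.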


Attachments


External Trackers
Tracker                            ID       Priority  Status  Summary  Last Updated
Red Hat Knowledge Base (Solution)  3116881  None      None    None     2018-03-12 11:42 EDT
oVirt gerrit                       20443    None      None    None     Never
oVirt gerrit                       20460    None      None    None     Never
oVirt gerrit                       20612    None      None    None     Never
oVirt gerrit                       20656    None      None    None     Never

Description Pablo Iranzo Gómez 2013-10-18 07:45:32 EDT
Description of problem:

I did an upgrade of the hosted-engine VM to EL 6.5 beta (as I did with hypervisors), and after that, the VM no longer booted.

I was able to use the --console to see the kernel panic but I was not able to reboot it using keystrokes.

I finally needed to shut down both hypervisors to ensure that the VM stayed shut down, then use qemu from my desktop to boot the hard drive, use grub in the VM to select the working entry, and do the 'troubleshooting' from there.

Hosted engine should have some sort of 'emergency' or 'rescue' mode for the engine that allows starting the VM paused, perhaps waiting for a key press (even a custom ISO that just reboots until you press a key in the BIOS or select another boot entry), or that even lets you attach a custom ISO (or the RHEL one) to do the rescue itself.

During that period, no hypervisor should try to start it until the rescue is marked as finished.


Tested on IS19
Comment 1 Itamar Heim 2013-10-20 12:30:09 EDT
greg - we discussed some central "maint" flag which would cause HA activity to stop?
Comment 2 Greg Padgett 2013-10-20 16:23:52 EDT
(In reply to Itamar Heim from comment #1)
> greg - we discussed some central "maint" flag which would cause HA activity
> to stop?

Yes, bug 1015724 covers this part of the RFE. There's a patch sent for it as well.

The remainder of the RFE seems to fall on the setup/CLI side of things, so I'm adding Sandro as cc for this.
Comment 3 Sandro Bonazzola 2013-10-21 02:38:09 EDT
So the request is to add --set-maintenance=True/False and --vm-start-paused to the hosted-engine tool?
Comment 4 Pablo Iranzo Gómez 2013-10-21 03:52:29 EDT
Sandro, (In reply to Sandro Bonazzola from comment #3)
> So the request is for adding --set-maintenance=True/False and
> --vm-start-paused to hosted-engine tool?

I think that both of them could be helpful, adding the ability to start it with an ISO would be useful too, just to use it as a 'rescue' boot.

Thanks,
Pablo
Comment 5 Greg Padgett 2013-10-21 09:33:55 EDT
(In reply to Sandro Bonazzola from comment #3)
> So the request is for adding --set-maintenance=True/False and

Probably global/local/none options, see agent api patch at http://gerrit.ovirt.org/#/c/20278/
Comment 6 Pablo Iranzo Gómez 2013-10-21 09:47:32 EDT
(In reply to Greg Padgett from comment #5)
> (In reply to Sandro Bonazzola from comment #3)
> > So the request is for adding --set-maintenance=True/False and
> 
> Probably global/local/none options, see agent api patch at
> http://gerrit.ovirt.org/#/c/20278/

Looks good to me. That flag can also be used by yum or third-party tools to avoid messing up the system ;-)
Comment 7 Sandro Bonazzola 2013-10-23 10:23:35 EDT
(In reply to Pablo Iranzo Gómez from comment #4)
> Sandro, (In reply to Sandro Bonazzola from comment #3)
> > So the request is for adding --set-maintenance=True/False and
> > --vm-start-paused to hosted-engine tool?
> 
> I think that both of them could be helpful, adding the ability to start it
> with an ISO would be useful too, just to use it as a 'rescue' boot.

Once you start with --vm-start-paused, you can use vdsClient hotplugDisk to attach a CD-ROM image. I would like to avoid replicating vdsClient commands in the hosted-engine command unless it is really needed.

> 
> Thanks,
> Pablo
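
The rescue flow discussed in this thread can be sketched as an ordered, dry-run plan. This is one reading of the comments and the Doc Text, not a documented procedure: the function names rescue_plan and format_plan are hypothetical, the --set-maintenance spelling follows the Doc Text, and steps for which this bug gives no exact command line (including the vdsClient hotplugDisk arguments) are deliberately left as manual notes rather than guessed.

```python
def rescue_plan():
    """Return the rescue steps as (argv_or_None, note) pairs, in order.

    argv is None where the thread describes a step without giving a
    command line; those steps are left to the operator.
    """
    return [
        (["hosted-engine", "--set-maintenance=global"],
         "pause all HA agents so no host restarts the engine VM"),
        (None,
         "make sure the engine VM is not running on any host"),
        (["hosted-engine", "--vm-start-paused"],
         "create the engine VM in paused state"),
        (None,
         "attach rescue media (comment 7 suggests vdsClient hotplugDisk; "
         "its arguments are not given in this bug)"),
        (None,
         "resume the VM and repair the guest, e.g. through the console"),
        (["hosted-engine", "--set-maintenance=none"],
         "leave maintenance and resume normal HA operation"),
    ]


def format_plan(plan):
    """Render the plan as numbered lines; nothing is executed."""
    lines = []
    for number, (argv, note) in enumerate(plan, start=1):
        command = " ".join(argv) if argv is not None else "(manual step)"
        lines.append("%d. %s  # %s" % (number, command, note))
    return lines


if __name__ == "__main__":
    for line in format_plan(rescue_plan()):
        print(line)
```

Keeping global maintenance in force for the whole sequence reflects the request in the description that no hypervisor should try to start the VM until the rescue is marked as finished.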
Comment 9 Sandro Bonazzola 2013-10-29 10:35:52 EDT
Patches merged on upstream master and 1.0 branches.
Comment 14 errata-xmlrpc 2014-01-21 11:54:28 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-0083.html
