Bug 625362
Summary: | libvirt-guests should start and shut down guests in parallel | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 6 | Reporter: | Dan Kenigsberg <danken> |
Component: | libvirt | Assignee: | Peter Krempa <pkrempa> |
Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs> |
Severity: | low | Docs Contact: | |
Priority: | high | ||
Version: | 6.0 | CC: | atodorov, cpelland, dallan, dyuan, eblake, mzhan, rwu, xen-maint |
Target Milestone: | rc | ||
Target Release: | --- | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | libvirt-0.9.10-4.el6 | Doc Type: | Bug Fix |
Doc Text: |
Cause:
The libvirt-guests script executed operations on guests serially.
Consequence:
On machines with many guests, the shutdown procedure took a long time because each guest had to wait for the others to shut down. The procedure was also inefficient because it did not make full use of the available host resources.
Fix:
The libvirt-guests init script was changed to allow parallel operations on domains, which shortens the time needed to shut down the host.
Result:
Guests start and shut down in parallel and use the host system's resources more efficiently. The shutdown time of the host decreases in most cases.
|
Last Closed: | 2012-06-20 06:24:48 UTC | Type: | --- |
Description
Dan Kenigsberg
2010-08-19 07:50:40 UTC
Sorry, no need to stress over this one (from the RHEV perspective). We are disabling libvirt-guests on our nodes anyway.

*** Bug 729114 has been marked as a duplicate of this bug. ***

I see. So BOOT_TIMEOUT will have no effect when the script runs in parallel mode, yes?

BOOT_TIMEOUT (in the script it's actually called START_DELAY) configures whether the machines are started in parallel (START_DELAY=0) or whether the script waits the specified amount of time for each guest. Unfortunately, there's no way to reliably wait for and detect a full guest boot-up, so we have to use a timeout. I'll add documentation to the START_DELAY variable about the serial/parallel behavior of the script depending on the configuration.

With this, the user may specify the startup and shutdown behavior of the libvirt-guests script independently (e.g. parallel startup and serial shutdown).

Please help to check scenario 2, thanks.

####Scenario 1#### okay.
ON_SHUTDOWN=shutdown
PARALLEL_SHUTDOWN=3
SHUTDOWN_TIMEOUT=300
# service libvirt-guests stop
Running guests on default URI: rhel58, rhel62, rhel62-1, vr-guest_managedsave
Shutting down guests on default URI...
Starting shutdown on guest: rhel58
Starting shutdown on guest: rhel62
Starting shutdown on guest: rhel62-1
Shutdown of guest rhel62 complete.
Starting shutdown on guest: vr-guest_managedsave
Shutdown of guest rhel62-1 complete.
Shutdown of guest rhel58 complete.
Shutdown of guest vr-guest_managedsave complete.

####Scenario 2#### misses the 4th guest.
ON_SHUTDOWN=shutdown
PARALLEL_SHUTDOWN=3
SHUTDOWN_TIMEOUT=1
# service libvirt-guests stop
Running guests on default URI: rhel58, rhel62, rhel62-1, vr-guest_managedsave
Shutting down guests on default URI...
Starting shutdown on guest: rhel58
Starting shutdown on guest: rhel62
Starting shutdown on guest: rhel62-1
Timeout expired while shutting down domains
but the UUID is recorded in libvirt-guests.
# cat /var/lib/libvirt/libvirt-guests
default 83e69755-f692-6413-0c90-2213eddbbbde 6a6839c3-8b51-125e-262f-f2d384367c49 05d9a9f8-3def-491c-e649-87718ea2d98a 3862afa0-3ff8-80d1-51f2-cff6ec3880a6

####Scenario 3####
ON_SHUTDOWN=shutdown
PARALLEL_SHUTDOWN=0
SHUTDOWN_TIMEOUT=300
# service libvirt-guests stop
Running guests on default URI: rhel58, rhel62, rhel62-1, vr-guest_managedsave
Shutting down guests on default URI...
Shutting down rhel58: done
Shutting down rhel62: done
Shutting down rhel62-1: done
Shutting down vr-guest_managedsave: done

####Scenario 4####
ON_SHUTDOWN=shutdown
PARALLEL_SHUTDOWN=0
SHUTDOWN_TIMEOUT=1
# service libvirt-guests stop
Running guests on default URI: rhel58, rhel62, rhel62-1, vr-guest_managedsave
Shutting down guests on default URI...
Shutting down rhel58: failed to shutdown in time
Shutting down rhel62: failed to shutdown in time
Shutting down rhel62-1: failed to shutdown in time
Shutting down vr-guest_managedsave: failed to shutdown in time

In scenario 2 the timeout expires while the first three machines are still shutting down. Because you requested that only 3 machines be shut down at a time, the fourth was never attempted: the timeout had already expired. With parallel shutdown, the shutdown timeout is applied to the attempt to shut down all machines on a single URI. This is documented in the sysconfig file above the SHUTDOWN_TIMEOUT variable:
# Number of seconds we're willing to wait for a guest to shut down. If parallel
# shutdown is enabled, this timeout applies as a timeout for shutting down all
# guests on a single URI defined in the variable URIS.

Thanks Peter, move to VERIFIED according to comment 17 and comment 18.

Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.
New Contents:
Cause: The libvirt-guests script executed operations on guests serially.
Consequence: On machines with many guests, the shutdown procedure took a long time because each guest had to wait for the others to shut down. The procedure was also inefficient because it did not make full use of the available host resources.
Fix: The libvirt-guests init script was changed to allow parallel operations on domains, which shortens the time needed to shut down the host.
Result: Guests start and shut down in parallel and use the host system's resources more efficiently. The shutdown time of the host decreases in most cases.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
http://rhn.redhat.com/errata/RHSA-2012-0748.html
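
To illustrate why scenario 2 never reaches the fourth guest, here is a small simulation of the behavior described in the comments. It is a sketch, not the libvirt-guests script: it makes no virsh calls, it counts seconds with a counter instead of a real clock, and the function name `simulate_parallel_shutdown` and the per-guest durations are made up for the example. Only the message texts and the meaning of the two limits (at most PARALLEL_SHUTDOWN guests in flight, one SHUTDOWN_TIMEOUT for the whole URI) come from this report.

```shell
#!/bin/bash
# Simulation only (hypothetical helper, not part of libvirt-guests).
# Usage: simulate_parallel_shutdown LIMIT TIMEOUT NAME:SECONDS...
#   LIMIT   plays the role of PARALLEL_SHUTDOWN
#   TIMEOUT plays the role of SHUTDOWN_TIMEOUT, shared by all guests
#   SECONDS is an invented shutdown duration for each simulated guest
simulate_parallel_shutdown() {
    local limit=$1 timeout=$2
    shift 2
    local pending=("$@")    # guests not yet asked to shut down
    local names=() left=()  # guests in flight and the seconds each still needs
    local elapsed=0 i

    while [ "${#pending[@]}" -gt 0 ] || [ "${#names[@]}" -gt 0 ]; do
        # Fill free slots so that at most LIMIT guests shut down at once.
        while [ "${#pending[@]}" -gt 0 ] && [ "${#names[@]}" -lt "$limit" ]; do
            names+=("${pending[0]%%:*}")
            left+=("${pending[0]##*:}")
            echo "Starting shutdown on guest: ${pending[0]%%:*}"
            pending=("${pending[@]:1}")
        done

        # The timeout covers the whole batch, not each guest separately.
        if [ "$elapsed" -ge "$timeout" ]; then
            echo "Timeout expired while shutting down domains"
            return 1
        fi
        elapsed=$((elapsed + 1))

        # Advance the simulated clock by one second and report finished guests.
        local next_names=() next_left=()
        for i in "${!names[@]}"; do
            if [ "${left[$i]}" -le 1 ]; then
                echo "Shutdown of guest ${names[$i]} complete."
            else
                next_names+=("${names[$i]}")
                next_left+=("$((left[$i] - 1))")
            fi
        done
        names=("${next_names[@]}")
        left=("${next_left[@]}")
    done
    return 0
}

# Settings as in scenario 1 (limit 3, timeout 300): all four guests finish.
simulate_parallel_shutdown 3 300 rhel58:4 rhel62:2 rhel62-1:3 vr-guest_managedsave:2

# Settings as in scenario 2 (limit 3, timeout 1): the fourth guest is never started.
simulate_parallel_shutdown 3 1 rhel58:5 rhel62:5 rhel62-1:5 vr-guest_managedsave:5 \
    || echo "(simulation) gave up with exit status $?"
```

With a timeout of 1 the shared budget is spent while the first three guests still occupy all three slots, so no slot frees up for vr-guest_managedsave before the function gives up. Raising the timeout or the limit lets the fourth guest start.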