Bug 1119699 - ovirt-ha-agent dead but subsys locked
Summary: ovirt-ha-agent dead but subsys locked
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-hosted-engine-ha
Version: 3.4.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: 3.4.2
Assignee: Jiri Moskovcak
QA Contact: Nikolai Sednev
URL:
Whiteboard: sla
Depends On: 1097767
Blocks: 1123858
Reported: 2014-07-15 10:12 UTC by rhev-integ
Modified: 2016-02-10 20:18 UTC (History)

Fixed In Version:
Doc Type: Rebase: Bug Fixes Only
Doc Text:
Rebase package(s) to version: 1.1.5
Highlights and important bug fixes: The rebase only drops patches that were previously carried in the RPM and are now included in the source tarball. Other bugs addressed in the rebase will be attached to the errata.
About this bug (leaving to Jiri to complete): Previously, ovirt-ha-agent did not always wait long enough before attempting to connect to storage, which resulted in a failure to connect. Now the wait time is configurable, so the agent waits long enough, and retries if necessary, to connect to storage successfully.
Clone Of: 1097767
Environment:
Last Closed: 2014-09-04 12:47:32 UTC
oVirt Team: SLA




Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2014:1155 normal SHIPPED_LIVE ovirt-hosted-engine-ha bug fix and enhancement update 2014-09-04 16:47:05 UTC
oVirt gerrit 28047 None None None Never
oVirt gerrit 29937 ovirt-hosted-engine-ha-1.1 MERGED try harder when initializing Never

Comment 2 Nikolai Sednev 2014-08-12 18:04:16 UTC
Verified on these components:

2 Hosts with:
Linux version 2.6.32-431.23.3.el6.x86_64 (mockbuild@x86-027.build.eng.bos.redhat.com) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-4) (GCC) ) #1 SMP Wed Jul 16 06:12:23 EDT 2014

vdsm-4.14.13-1.el6ev.x86_64
qemu-kvm-rhev-0.12.1.2-2.415.el6_5.14.x86_64
ovirt-hosted-engine-setup-1.1.5-1.el6ev.noarch
ovirt-host-deploy-1.3.0-0.0.master.20140629072144.gitdc1f589.el6.noarch
libvirt-0.10.2-29.el6_5.10.x86_64
sanlock-2.8-1.el6.x86_64
ovirt-hosted-engine-ha-1.1.5-1.el6ev.noarch
qemu-kvm-rhev-tools-0.12.1.2-2.415.el6_5.14.x86_64
ovirt-host-deploy-java-1.3.0-0.0.master.20140629072144.gitdc1f589.el6.noarch

Engine av11:
Linux version 2.6.32-431.23.3.el6.x86_64 (mockbuild@x86-027.build.eng.bos.redhat.com) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-4) (GCC) ) #1 SMP Wed Jul 16 06:12:23 EDT 2014

rhevm-3.4.2-0.1.el6ev.noarch

Comment 4 Jiri Moskovcak 2014-09-01 08:56:09 UTC
Cause: The agent doesn't wait long enough for vdsm to connect the storage.
Consequence: The agent tries to access the storage before it's ready.
Fix: Wait longer and retry a few times (this is configurable in case some systems need a different grace time).
Result: The agent waits long enough and the storage is successfully connected.

(Note: yes, this is similar to bug 1119702, but the fix was in a different place.)
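
For readers who want a concrete picture of the "wait longer and retry a few times" approach, here is a minimal sketch. It is not the code merged in the gerrit change. The names connect_with_retry, connect, retries and wait_seconds are hypothetical and do not correspond to the agent's real functions or configuration keys.

```python
import time


def connect_with_retry(connect, retries=3, wait_seconds=5.0, sleep=time.sleep):
    """Call connect() until it succeeds or the attempts run out.

    All names here are illustrative assumptions, not the agent's API:
    connect       -- callable that raises if storage is not ready yet
    retries       -- total number of attempts (configurable)
    wait_seconds  -- grace time between attempts (configurable)
    sleep         -- injectable so the wait can be skipped in tests
    """
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            return connect()
        except Exception as exc:  # a real agent would catch a narrower error
            last_error = exc
            if attempt < retries:
                sleep(wait_seconds)  # give vdsm time to connect the storage
    raise RuntimeError(
        "storage not connected after %d attempts" % retries
    ) from last_error
```

Making the attempt count and the grace time parameters is what lets slower systems be given more time without a code change, as the Fix line describes.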

Comment 6 errata-xmlrpc 2014-09-04 12:47:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-1155.html

