Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1109544

Summary: host fail to become "UP", from maintenance, following VMs migration.
Product: Red Hat Enterprise Virtualization Manager
Reporter: Ilanit Stein <istein>
Component: ovirt-engine
Assignee: Nobody <nobody>
Status: CLOSED NOTABUG
QA Contact: Ilanit Stein <istein>
Severity: high
Docs Contact:
Priority: unspecified
Version: 3.4.0
CC: acathrow, gklein, iheim, lpeer, oourfali, Rhev-m-bugs, yeylon
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard: virt
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-06-17 10:43:50 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
host1 event log
engine log
host2 vdsm log
host 2 libvirt log
host1 vdsm log
host1 libvirt log

Description Ilanit Stein 2014-06-15 07:02:16 UTC
Description of problem:
host1 (rhev-h) runs 5 VMs. Move host1 to maintenance; all VMs migrate to host2 (rhel6.5). Activating host1 then fails to bring it "UP":

Failed to access the storage domain (iscsi), cannot access the storage pool,
and a single event: "Available memory of host1 [0 MB] is under defined threshold [1024 MB]".
Eventually, after ~1 hour, the host was removed and reinstalled; after a network error and recovery from crash, it came up shortly after installation.
 

Version-Release number of selected component (if applicable):
engine: av9.4
host1: rhev-h 6.5 (20140603.2.el6ev)
host2: rhel6.5
vdsm, libvirt on both hosts:
vdsm-4.14.7-3.el6ev.x86_64
libvirt-0.10.2-29.el6_5.8.x86_64

How reproducible:
Did not try to reproduce

Expected results:
host1 should have become "UP". There should have been no error accessing the storage domain, and no warning that the host's available memory is 0.

Additional info:
Each VM has 1024 MB of memory and is installed with rhel6.5 + guest agent.

There is an automated test that passes: 2 rhel hosts with 5 VMs running. Put host1 into maintenance; after migration, activate host1 again and move host2 to maintenance.

Comment 1 Ilanit Stein 2014-06-15 07:06:22 UTC
Created attachment 908886 [details]
host1 event log

Comment 2 Ilanit Stein 2014-06-15 07:19:58 UTC
Created attachment 908887 [details]
engine log

Comment 3 Ilanit Stein 2014-06-15 07:39:59 UTC
Created attachment 908889 [details]
host2 vdsm log

Comment 4 Ilanit Stein 2014-06-15 07:40:33 UTC
Created attachment 908890 [details]
host 2 libvirt log

Comment 5 Ilanit Stein 2014-06-15 07:48:25 UTC
Created attachment 908891 [details]
host1 vdsm log

Comment 6 Ilanit Stein 2014-06-15 07:48:57 UTC
Created attachment 908892 [details]
host1 libvirt log

Comment 8 Ilanit Stein 2014-06-15 08:55:56 UTC
The problem did not reproduce:

Ran 5 VMs on host1 and
moved host1 to maintenance.
An error event appeared: "Failed to switch to maintenance",
but right after this error, events for all VM migrations to host2 completed,
and host1 entered maintenance.
Activating host1 afterwards worked fine.

Comment 9 Ilanit Stein 2014-06-17 10:43:50 UTC
Investigation by the QE storage team showed that the failure to activate the host occurred because the hosts contained many stale storage connections, which made connecting to the storage domain very slow: more than 3 minutes, which is the default timeout configured in /etc/multipath.conf.
The host's connection to storage eventually succeeded, but since the timeout had expired, the engine treated it as a failure.
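For context, timeouts of this kind are tuned in /etc/multipath.conf on the host. The comment does not name the exact parameter that carried the ~3-minute default, so the fragment below is only an illustrative sketch using standard multipath options with hypothetical values, not the configuration from this environment:

```
# /etc/multipath.conf -- illustrative fragment only.
# The bug report does not specify which setting held the ~3 min default;
# parameter names are standard multipath options, values are hypothetical.
defaults {
    # how often (seconds) paths are checked
    polling_interval    5
    # fail I/O immediately when all paths are down instead of queueing
    no_path_retry       fail
    # seconds before a failed path's device is removed from the system
    dev_loss_tmo        180
}
```

Stale iSCSI sessions left over from old storage connections would make each of these path checks slower, which is consistent with the long connect time described above.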

As this is not a bug, but a matter of configuration and a "slow" host, I am closing the bug.