Bug 876572 - [rhevm] host remains in 'UP' state although vdsm is not functional (getVdsCaps doesn't return) as engine use getVdsStats
Keywords:
Status: CLOSED DEFERRED
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.1.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Ayal Baron
QA Contact: vvyazmin@redhat.com
URL:
Whiteboard: infra
Depends On:
Blocks:
Reported: 2012-11-14 13:48 UTC by vvyazmin@redhat.com
Modified: 2016-02-10 19:40 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-11-18 15:14:07 UTC
oVirt Team: Infra
Target Upstream Version:
Embargoed:


Attachments
## Logs vdsm, rhevm (1.18 MB, application/x-gzip)
2012-11-14 13:48 UTC, vvyazmin@redhat.com

Description vvyazmin@redhat.com 2012-11-14 13:48:19 UTC
Created attachment 644870
## Logs vdsm, rhevm

Description of problem: VDSM reports the wrong state to the engine. The host stays in the “UP” state although libvirt is not responding: while libvirt is deadlocked, the host remains “UP”.

Version-Release number of selected component (if applicable):
RHEVM 3.1 - SI24.1

RHEVM: rhevm-3.1.0-28.el6ev.noarch
VDSM: vdsm-4.9.6-42.0.el6_3.x86_64
LIBVIRT: libvirt-0.9.10-21.el6_3.5.x86_64
QEMU & KVM: qemu-kvm-rhev-0.12.1.2-2.295.el6_3.5.x86_64
SANLOCK: sanlock-2.3-4.el6_3.x86_64

How reproducible:
100%

Steps to Reproduce:
Put libvirt into a deadlock on the HSM server
(attach gdb to the libvirt process)
  
Actual results:
Libvirt enters a deadlock.
VDSM fails to respond to “vdsClient -s 0 getVdsCaps”.
The engine sends the “vdsClient -s 0 getVdsStats” command, gets a response, and concludes that there is no problem on the host.
VDSM reports the status “UP”.

Expected results:
The engine should send the “vdsClient -s 0 getVdsCaps” command and, if it gets no response, move the host to the “Non Responsive” state.
In our case, only the “vdsClient -s 0 getVdsStats” command is sent, and it does get a response.
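The expected behaviour can be illustrated with a rough sketch. This is not the engine's actual monitoring code (the engine does not shell out to vdsClient); it only shows the idea of treating the host as “Non Responsive” when either verb fails to answer. The helper names probe_verb and classify_host and the 30-second timeout are assumptions made for illustration.

```python
import subprocess
from typing import Callable


def probe_verb(verb: str, timeout_s: float = 30.0) -> bool:
    """Return True if 'vdsClient -s 0 <verb>' answers successfully in time.

    Hypothetical helper: the timeout value is an assumption, not a
    documented engine or VDSM setting.
    """
    try:
        result = subprocess.run(
            ["vdsClient", "-s", "0", verb],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, OSError):
        # Hung call (e.g. libvirt deadlock) or vdsClient not available.
        return False
    return result.returncode == 0


def classify_host(probe: Callable[[str], bool] = probe_verb) -> str:
    """Classify the host using both verbs instead of getVdsStats alone."""
    for verb in ("getVdsStats", "getVdsCaps"):
        if not probe(verb):
            return "Non Responsive"
    return "Up"
```

With a probe that answers getVdsStats but hangs on getVdsCaps, as in this report, classify_host returns “Non Responsive” rather than “Up”.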

Additional info:
Real life scenario:
1. Create iSCSI DC with 2 hosts
2. Create VM with multiple disks on multiple storage domains
3. Run VM on HSM
4. Install OS (RHEL 6.3)
5. Install RHEV Agent (Guest Agent)
6. Create a snapshot
7. Snapshot --> Preview
8. Snapshot --> Commit
9. Snapshot --> Delete snapshot 
10. Power on the VM; the OS gets stuck on boot
11. Power off fails
12. Powering on newly created VMs in the same DC fails

Comment 2 Barak 2012-11-18 15:14:07 UTC
After discussion with Yaniv.K & Miki, we concluded that there appears to be
only one scenario in which libvirt goes into a deadlock.
If more issues come up, we'll reopen this BZ.
For now we CLOSE DEFERRED.

