Bug 869258 - [Engine] - error message "VdcBLLException: null" when engine tries to run VM and operation is failing due to network timeout exception
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.1.0
Hardware: Unspecified OS: Unspecified
Priority: unspecified Severity: high
Target Milestone: ---
Target Release: 3.2.0
Assigned To: Michal Skrivanek
QA Contact: Pavel Stehlik
Whiteboard: virt
Keywords:
Depends On:
Blocks: 915537
Reported: 2012-10-23 08:23 EDT by David Botzer
Modified: 2014-01-13 19:04 EST

See Also:
Fixed In Version: sf2
Doc Type: Bug Fix
Type: Bug
Attachments
engine (219.15 KB, application/x-gzip)
2012-10-23 08:23 EDT, David Botzer

Description David Botzer 2012-10-23 08:23:28 EDT
Created attachment 632063
engine

Description of problem:
The engine logs the error message "VdcBLLException: null" when libvirt and vdsm are deadlocked.

Version-Release number of selected component (if applicable):
3.1/si21

How reproducible:
always

Steps to Reproduce:
1. Install rhevm, dwh, and reports
2. Create a setup: DC, host, storage (iSCSI), VMs
3. Change the engine and VDS date to 31/12/12
  
Actual results:
All VMs show "Not Responding"
Host CPU jumps to 99%
libvirt deadlocks while trying to destroy a qemu process whose monitor socket it cannot communicate with

Expected results:
The engine log should report the underlying exception instead of "VdcBLLException: null".

Additional info:
logs
Comment 1 David Botzer 2012-10-23 08:35:47 EDT
The problem is that when the engine tries to connect to vdsm while vdsm is deadlocked, the connection times out.
The engine also writes an exception with a null message to the log, which it should not do.

error message "VdcBLLException: null" when engine tries to run VM and operation is failing due to network timeout exception
Comment 2 Itamar Heim 2012-10-24 01:08:52 EDT
I'd also rename VdcBllException...
Comment 3 Andrew Cathrow 2012-12-03 06:36:08 EST
Is there a BZ for the deadlock? That is more important than the error message, which we can clean up in 3.2.
Comment 4 Haim 2012-12-04 02:37:41 EST
(In reply to comment #3)
> Is there a BZ for the deadlock which is more important than the error
> message that we can clean up in 3.2

The deadlock originated in libvirt and is already fixed in libvirt-0.9.10-21.el6_3.6.x86_64.
Comment 5 Roy Golan 2012-12-06 10:27:07 EST
Better error reporting was merged [1], so now you should see something like:

2012-12-06 17:23:14,424 ERROR [org.ovirt.engine.core.bll.StopVmCommand] (pool-10-thread-48) [7c2d1e0a] Command org.ovirt.engine.core.bll.StopVmCommand throw Vdc Bll exception. With error message VdcBLLException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException: java.util.concurrent.TimeoutException

[1] http://gerrit.ovirt.org/gitweb?p=ovirt-engine.git;a=commit;h=6b7ae33a8ef9682adf00bf2495487d3617ffc99b
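
For readers unfamiliar with where a "null" message can come from, the sketch below shows standard Java behavior only. It does not reproduce the linked commit: DemoBllException and describe are hypothetical names, not engine classes. An exception built without a message or cause returns null from getMessage(), while one built from a cause takes the cause's toString() as its message.

```java
import java.util.concurrent.TimeoutException;

public class NullMessageDemo {
    // Hypothetical stand-in for an engine exception type; NOT the real VdcBLLException.
    static class DemoBllException extends RuntimeException {
        DemoBllException() { super(); }                     // no message, no cause
        DemoBllException(Throwable cause) { super(cause); } // message becomes cause.toString()
    }

    // Formats an exception roughly as "<Type>: <message>" (assumed format, for illustration).
    static String describe(Throwable t) {
        return t.getClass().getSimpleName() + ": " + t.getMessage();
    }

    public static void main(String[] args) {
        Throwable timeout = new TimeoutException();
        // Without a cause, getMessage() is null, so the line ends in "null".
        System.out.println(describe(new DemoBllException()));
        // With the cause passed along, the underlying exception name shows up.
        System.out.println(describe(new DemoBllException(timeout)));
    }
}
```

The first line printed ends in "null" and the second names java.util.concurrent.TimeoutException, which mirrors the difference between the reported message and the one quoted above.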
Comment 7 Roy Golan 2012-12-30 07:49:57 EST
To test the change, block TCP port 54321 on a host and check that the underlying exception is now printed.

host shell:
iptables -I INPUT --proto tcp --dport 54321 -j REJECT

engine:
2012-12-30 14:37:15,282 WARN  [org.ovirt.engine.core.vdsbroker.VdsManager] (QuartzScheduler_Worker-13) ResourceManager::refreshVdsRunTimeInfo::Failed to refresh VDS , vds = ab416c3a-4374-43b1-961b-008897d74b87 : suz, VDS Network Error, continuing.
java.net.NoRouteToHostException: No route to host
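
To check from the engine machine that the port is really blocked before looking at the engine log, a small probe like the one below may help. It is only a sketch using the standard java.net.Socket API: PortProbe and probe are hypothetical names, the host argument is whatever host you blocked, and the exact exception reported depends on how the connection fails.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortProbe {
    // Tries a TCP connect and returns "reachable" or the exception's toString().
    static String probe(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return "reachable";
        } catch (IOException e) {
            // Examples: ConnectException, NoRouteToHostException, SocketTimeoutException
            return e.toString();
        }
    }

    public static void main(String[] args) {
        // Host defaults to localhost; pass the blocked host as the first argument.
        String host = args.length > 0 ? args[0] : "localhost";
        System.out.println(probe(host, 54321, 3000));
    }
}
```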
Comment 9 Libor Spevak 2013-02-24 09:37:29 EST
Already solved and verified.
Comment 11 Itamar Heim 2013-06-11 04:55:04 EDT
3.2 has been released