Bug 869258
Summary: | [Engine] - error message "VdcBLLException: null" when engine tries to run VM and operation is failing due to network timeout exception | |
---|---|---|---
Product: | Red Hat Enterprise Virtualization Manager | Reporter: | David Botzer <dbotzer>
Component: | ovirt-engine | Assignee: | Michal Skrivanek <michal.skrivanek>
Status: | CLOSED CURRENTRELEASE | QA Contact: | Pavel Stehlik <pstehlik>
Severity: | high | Docs Contact: |
Priority: | unspecified | |
Version: | 3.1.0 | CC: | acathrow, dyasny, hateya, iheim, lpeer, ofrenkel, pstehlik, rgolan, Rhev-m-bugs, sgrinber, yeylon, ykaul
Target Milestone: | --- | |
Target Release: | 3.2.0 | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | virt | |
Fixed In Version: | sf2 | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 915537 | |
Attachments: |
The problem is that when the engine tries to connect to vdsm while vdsm is in deadlock, there is a timeout. But the engine also writes an exception with a null message to the log, which it shouldn't.

error message "VdcBLLException: null" when engine tries to run VM and operation is failing due to network timeout exception

I'd also rename VdcBllException...

Is there a BZ for the deadlock, which is more important than the error message that we can clean up in 3.2?

(In reply to comment #3)
> Is there a BZ for the deadlock which is more important than the error
> message that we can clean up in 3.2

The deadlock originated in libvirt and is already solved in libvirt-0.9.10-21.el6_3.6.x86_64.

Better error reporting was merged [1], so now you should see something like:

2012-12-06 17:23:14,424 ERROR [org.ovirt.engine.core.bll.StopVmCommand] (pool-10-thread-48) [7c2d1e0a] Command org.ovirt.engine.core.bll.StopVmCommand throw Vdc Bll exception. With error message VdcBLLException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException: java.util.concurrent.TimeoutException

[1] http://gerrit.ovirt.org/gitweb?p=ovirt-engine.git;a=commit;h=6b7ae33a8ef9682adf00bf2495487d3617ffc99b

To test the change, you can block port 54321 on a host and see that the underlying exception is now printed.

host shell:
iptables -I INPUT --proto tcp --dport 54321 -j REJECT

engine:
2012-12-30 14:37:15,282 WARN [org.ovirt.engine.core.vdsbroker.VdsManager] (QuartzScheduler_Worker-13) ResourceManager::refreshVdsRunTimeInfo::Failed to refresh VDS , vds = ab416c3a-4374-43b1-961b-008897d74b87 : suz, VDS Network Error, continuing. java.net.NoRouteToHostException: No route to host

Already solved and verified.

3.2 has been released
Created attachment 632063 [details]
engine

Description of problem:
error message "VdcBLLException: null" when libvirt & vdsm are in deadlock

Version-Release number of selected component (if applicable):
3.1/si21

How reproducible:
always

Steps to Reproduce:
1. Installed rhevm+dwh+reports
2. Created setup - DC, Host, Storage (iscsi), VMs
3. Changed engine & vds date to 31/12/12

Actual results:
All VMs show "Not Responding"
Host CPU jumps to 99%
libvirt in deadlock while trying to destroy qemu process while it can't communicate with its monitor socket

Expected results:
should give such exceptions in engine log

Additional info:
logs
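
For readers wondering how a log line can end in "VdcBLLException: null": a java.util.concurrent.TimeoutException created without a message returns null from getMessage(), and a wrapper that passes that message along prints "null" when it is concatenated into a log string. The following is only a minimal sketch of that effect and of one possible way to fall back to the cause. The class SketchBllException and both describe methods are hypothetical names made up for this illustration; they are not the actual ovirt-engine code or the merged patch.

```java
import java.util.concurrent.TimeoutException;

class NullMessageSketch {
    static final String PREFIX = "VdcBLLException: ";

    // Hypothetical stand-in for the engine's exception type (assumption, not the real class).
    static class SketchBllException extends RuntimeException {
        SketchBllException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Naive formatting: a null message is rendered as the literal text "null".
    static String describeNaive(Throwable t) {
        return PREFIX + t.getMessage();
    }

    // Falls back to the cause (or the exception's class name) when there is no message.
    static String describeWithCause(Throwable t) {
        String message = t.getMessage();
        if (message != null) {
            return PREFIX + message;
        }
        Throwable cause = t.getCause();
        return PREFIX + (cause != null ? cause.toString() : t.getClass().getName());
    }

    public static void main(String[] args) {
        // A TimeoutException without a message, wrapped with that same (null) message.
        TimeoutException timeout = new TimeoutException();
        RuntimeException wrapped = new SketchBllException(timeout.getMessage(), timeout);

        System.out.println(describeNaive(wrapped));
        System.out.println(describeWithCause(wrapped));
    }
}
```

Running main prints "VdcBLLException: null" for the naive variant and "VdcBLLException: java.util.concurrent.TimeoutException" for the variant that falls back to the cause, which resembles the shape of the improved log line quoted in the comments.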