Bug 1323969
| Summary: | race on recovery prevents events from being delivered | | |
|---|---|---|---|
| Product: | [oVirt] vdsm | Reporter: | Francesco Romani <fromani> |
| Component: | Core | Assignee: | Piotr Kliczewski <pkliczew> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Pavel Stehlik <pstehlik> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 4.17.30 | CC: | bugs, fromani, mperina, oourfali, pkliczew, sbonazzo |
| Target Milestone: | ovirt-4.0.0-beta | Flags: | oourfali: ovirt-4.0.0?, rule-engine: planning_ack+, mperina: devel_ack+, pstehlik: testing_ack+ |
| Target Release: | 4.17.999 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | Infra | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2016-08-17 14:36:58 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | Infra | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | full vdsm log attempting container recovery (attachment 1143709 [details]) | | |
Description
Francesco Romani
2016-04-05 08:25:01 UTC
Created attachment 1143709 [details]
full vdsm log attempting container recovery
Comment from Oved Ourfali

Francesco, trying to understand the impact: is it only that the event gets lost, or are there additional issues?

Comment from Francesco Romani

(In reply to Oved Ourfali from comment #2)
> Francesco, trying to understand the impact: is it only that the event gets
> lost, or are there additional issues?

Should we face this bug, the recovery of VMs fails altogether:

    Thread-12::INFO::2016-04-05 10:11:22,804::vm::1291::virt.vm::(setDownStatus) vmId=`dd38b452-0a9a-4e9e-a349-2f4ca73cc8c5`::Changed state to Down: 'jsonrpc' (code=1)
    Thread-12::INFO::2016-04-05 10:11:22,805::guestagent::345::virt.vm::(stop) vmId=`dd38b452-0a9a-4e9e-a349-2f4ca73cc8c5`::Stopping connection
    Thread-12::DEBUG::2016-04-05 10:11:22,805::vmchannels::234::vds::(unregister) Delete fileno 24 from listener.
    Thread-12::DEBUG::2016-04-05 10:11:22,806::vmchannels::66::vds::(_unregister_fd) Failed to unregister FD from epoll (ENOENT): 24
    Thread-12::DEBUG::2016-04-05 10:11:22,833::__init__::207::jsonrpc.Notification::(emit) Sending event {"params": {"notify_time": 4376331750, "dd38b452-0a9a-4e9e-a349-2f4ca73cc8c5": {"status": "Down", "timeOffset": "0", "exitReason": 1, "exitMessage": "'jsonrpc'", "exitCode": 1}}, "jsonrpc": "2.0", "method": "|virt|VM_status|dd38b452-0a9a-4e9e-a349-2f4ca73cc8c5"}
    Thread-12::ERROR::2016-04-05 10:11:22,833::utils::374::root::(wrapper) Unhandled exception

and the VMs are thus destroyed by Engine:

    jsonrpc.Executor/7::DEBUG::2016-04-05 10:11:37,813::__init__::511::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'VM.destroy' in bridge with {u'vmID': u'dd38b452-0a9a-4e9e-a349-2f4ca73cc8c5'}
    jsonrpc.Executor/7::DEBUG::2016-04-05 10:11:37,815::API::310::vds::(destroy) About to destroy VM dd38b452-0a9a-4e9e-a349-2f4ca73cc8c5
    jsonrpc.Executor/7::DEBUG::2016-04-05 10:11:37,816::vm::3960::virt.vm::(destroy) vmId=`dd38b452-0a9a-4e9e-a349-2f4ca73cc8c5`::destroy Called
    jsonrpc.Executor/7::INFO::2016-04-05 10:11:37,816::vm::3875::virt.vm::(releaseVm) vmId=`dd38b452-0a9a-4e9e-a349-2f4ca73cc8c5`::Release VM resources
    jsonrpc.Executor/7::WARNING::2016-04-05 10:11:37,816::vm::335::virt.vm::(_set_lastStatus) vmId=`dd38b452-0a9a-4e9e-a349-2f4ca73cc8c5`::trying to set state to Powering down when already Down
    jsonrpc.Executor/7::INFO::2016-04-05 10:11:37,817::guestagent::345::virt.vm::(stop) vmId=`dd38b452-0a9a-4e9e-a349-2f4ca73cc8c5`::Stopping connection
    jsonrpc.Executor/7::INFO::2016-04-05 10:11:37,817::vm::3907::virt.vm::(_destroyVm) vmId=`dd38b452-0a9a-4e9e-a349-2f4ca73cc8c5`::_destroyVmGraceful attempt #0

OTOH I'd like to remark that I have never encountered this misbehaviour without my experimental container patches: I have never seen this happen on Vdsm master.

Closing comment

Closed due to capacity; if it still reproduces, please reopen.
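Editor's note: the log excerpts above are consistent with a VM-status event being emitted during recovery while the JSON-RPC notification transport is not yet available (the exit message is literally 'jsonrpc' and the emit ends in an unhandled exception that kills the recovery thread). The following is a minimal, self-contained sketch of that failure mode and of one defensive pattern (buffer events raised during recovery, flush them once the transport is registered). It is not vdsm code; every name in it (Notifier, BufferingNotifier, PrintClient, emit, register, flush) is hypothetical.

```python
import threading


class Notifier:
    """Stand-in for a JSON-RPC event sender (hypothetical, not vdsm's API)."""

    def __init__(self):
        self._clients = {}      # e.g. {'jsonrpc': <some transport object>}
        self._lock = threading.Lock()

    def register(self, name, client):
        with self._lock:
            self._clients[name] = client

    def emit(self, method, params):
        # Rough illustration of the suspected race: the 'jsonrpc' transport is
        # looked up before recovery has registered it, so this raises
        # KeyError('jsonrpc') and the calling (recovery) thread aborts.
        client = self._clients['jsonrpc']
        client.send(method, params)


class BufferingNotifier(Notifier):
    """Defensive variant: queue events until the transport is registered."""

    def __init__(self):
        super().__init__()
        self._pending = []

    def emit(self, method, params):
        with self._lock:
            if 'jsonrpc' not in self._clients:
                # Recovery still in progress: keep the event instead of
                # raising and aborting the caller.
                self._pending.append((method, params))
                return
        super().emit(method, params)

    def flush(self):
        # To be called once recovery has finished and register() has run.
        with self._lock:
            pending, self._pending = self._pending, []
        for method, params in pending:
            super().emit(method, params)


if __name__ == '__main__':
    class PrintClient:
        def send(self, method, params):
            print('sent', method, params)

    notifier = BufferingNotifier()
    # Event raised while recovery is still running: buffered, not fatal.
    notifier.emit('|virt|VM_status|dd38b452', {'status': 'Down'})
    # Recovery finishes, registers the transport, and the backlog is flushed.
    notifier.register('jsonrpc', PrintClient())
    notifier.flush()
```

Whatever the actual fix in vdsm looks like, the property this sketch aims at is the one the logs show being violated: an event raised during recovery must not propagate an exception into the recovery thread, otherwise the VM is marked Down and Engine proceeds to destroy it.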