Bug 1064471
| Summary: | [vdsm] resuming VM from paused state fails with an AttributeError | | |
|---|---|---|---|
| Product: | [Retired] oVirt | Reporter: | Elad <ebenahar> |
| Component: | vdsm | Assignee: | Nir Soffer <nsoffer> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Aharon Canan <acanan> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 3.4 | CC: | acathrow, amureini, bazulay, ebenahar, gklein, iheim, mgoldboi, michal.skrivanek, nsednev, nsoffer, yeylon |
| Target Milestone: | --- | | |
| Target Release: | 3.4.1 | | |
| Hardware: | x86_64 | | |
| OS: | Unspecified | | |
| Whiteboard: | storage | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2014-05-08 13:37:03 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | Storage | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1036358 | | |
| Attachments: | | | |
Looks like a duplicate of bug 1063336.

Vm._dom is None when storage is not available while vdsm is starting and a vm is running.

(In reply to Elad from comment #0)
> Steps to Reproduce:
> On shared storage data center with an iSCSI domain:
> 1. have a running VM
> 2. block connectivity between host to storage server using iptables, wait for VM to become 'paused'
> 3. resume connectivity to the storage server and wait for the domain to become active
> 4. try to start VM from pause

Why do you start the vm? It should start when the domain becomes valid.

> Actual results:
> Resuming the VM fails with the following message in vdsm.log:
> ...
> AttributeError: 'NoneType' object has no attribute 'XMLDesc'

There is no such error in the attached log. Please reproduce again and provide complete vdsm.log.

Note for reproduction:

This seems to be a duplicate of bug 1063336. In that bug, a vm was running when vdsm was started but storage was not available. So the vm recovery failed, leading to a vm object with _dom == None.

To make sure this is not a duplicate, make sure that storage is up when vdsm starts, and start the vm after vdsm starts.

1. start vdsm
2. ensure that the domain is accessible
3. start a vm and wait until it is up
4. block connectivity to the storage server
5. wait until vm is paused
6. unblock connectivity to storage server
7. wait until vm is unpaused

If you start the vm before storage becomes connected again, the vm is expected to fail - but not with the error in this bug. It takes time to get the connectivity back after blocking the connection.

Created attachment 868350 [details]
vdsm.log

(In reply to Nir Soffer from comment #2)
> (In reply to Elad from comment #0)
> > Steps to Reproduce:
> > On shared storage data center with an iSCSI domain:
> > 1. have a running VM
> > 2. block connectivity between host to storage server using iptables, wait for VM to become 'paused'
> > 3. resume connectivity to the storage server and wait for the domain to become active
> > 4. try to start VM from pause
>
> Why do you start the vm? It should start when the domain becomes valid.

It doesn't resume the VM automatically when the domain becomes active. I waited for ~20 minutes and only then resumed it manually.

> > Actual results:
> > Resuming the VM fails with the following message in vdsm.log:
> > ...
> > AttributeError: 'NoneType' object has no attribute 'XMLDesc'
>
> There is no such error in the attached log.

Uploading the right log.

> Note for reproduction:
>
> This seems to be a duplicate of bug 1063336. In that bug, a vm was running when vdsm was started but storage was not available. So the vm recovery failed, leading to a vm object with _dom == None.
>
> To make sure this is not a duplicate, make sure that storage is up when vdsm starts, and start the vm after vdsm starts.
>
> 1. start vdsm
> 2. ensure that the domain is accessible
> 3. start a vm and wait until it is up
> 4. block connectivity to the storage server
> 5. wait until vm is paused
> 6. unblock connectivity to storage server
> 7. wait until vm is unpaused

This is exactly the scenario, only the VM didn't get unpaused; manual intervention was required.

> If you start the vm before storage becomes connected again, the vm is expected to fail - but not with the error in this bug. It takes time to get the connectivity back after blocking the connection.

This is an automated message.
Re-targeting all non-blocker bugs still open on 3.4.0 to 3.4.1.

Indeed this is probably bug 1063336. But do we know why it wasn't started automatically? Is it a consequence of bug 1063336?

(In reply to Michal Skrivanek from comment #5)
> indeed this is probably bug 1063336. But do we know why it wasn't started automatically? Is it a consequence of bug 1063336?

Well, if the libvirt connection was not created (_dom is None), then how would this vm be started?

If this is the only failure, this bug is going to be addressed by the fix of bug 1063336.

As per Michal's comment, moving this bug to MODIFIED.
Once a solution to bug 1063336 is delivered to QA, /this/ bug should be tested with the new build too.

(In reply to Allon Mureinik from comment #8)
> As per Michal's comment, moving this bug to MODIFIED.
> Once a solution to bug 1063336 is delivered to QA, /this/ bug should be tested with the new build too.

Moving to ON_QA, as bug 1063336 is CLOSED CURRENTRELEASE.

This is an automated message.

oVirt 3.4.1 has been released:
* should fix your issue
* should be available at your local mirror within two days.
If problems still persist, please make note of it in this bug report.
Created attachment 862438 [details]
engine and vdsm logs

Description of problem:
Resuming from paused state fails with AttributeError on vdsm.

Version-Release number of selected component (if applicable):
vdsm-4.14.1-3.el6.x86_64

Steps to Reproduce:
On shared storage data center with an iSCSI domain:
1. have a running VM
2. block connectivity between host to storage server using iptables, wait for VM to become 'paused'
3. resume connectivity to the storage server and wait for the domain to become active
4. try to start VM from pause

Actual results:
Resuming the VM fails with the following message in vdsm.log:

Thread-1768::ERROR::2014-02-12 17:45:13,409::BindingXMLRPC::989::vds::(wrapper) unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/BindingXMLRPC.py", line 973, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/BindingXMLRPC.py", line 209, in vmCont
    return vm.cont()
  File "/usr/share/vdsm/API.py", line 154, in cont
    return v.cont()
  File "/usr/share/vdsm/vm.py", line 2576, in cont
    self._underlyingCont()
  File "/usr/share/vdsm/vm.py", line 3715, in _underlyingCont
    hooks.before_vm_cont(self._dom.XMLDesc(0), self.conf)
AttributeError: 'NoneType' object has no attribute 'XMLDesc'

Additional info:
engine and vdsm logs
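
The traceback ends in `self._dom.XMLDesc(0)` while `self._dom` is None. The following is a minimal sketch of that failure mode only, not vdsm code: `FakeVm`, `FakeDomain`, `NotRecoveredError`, and the method names are hypothetical stand-ins, and the guard shown is just one way such a condition could be reported more clearly. It is not the fix that was applied for bug 1063336.

```python
class FakeDomain:
    """Hypothetical stand-in for a libvirt domain handle."""

    def XMLDesc(self, flags):
        return "<domain/>"


class NotRecoveredError(Exception):
    """Hypothetical error for a VM object that has no domain handle."""


class FakeVm:
    """Hypothetical stand-in for vdsm's Vm; models only the _dom attribute."""

    def __init__(self, dom=None):
        # Assumption: _dom stays None when recovery could not attach
        # to the running domain (as described for bug 1063336).
        self._dom = dom

    def cont_unguarded(self):
        # Same access pattern as the last frame of the traceback.
        return self._dom.XMLDesc(0)

    def cont_guarded(self):
        # Illustrative guard: raise an explicit error instead of
        # letting the attribute lookup on None fail.
        if self._dom is None:
            raise NotRecoveredError("VM has no domain handle")
        return self._dom.XMLDesc(0)


def describe_failure(vm):
    """Return the AttributeError message from the unguarded call, or None."""
    try:
        vm.cont_unguarded()
    except AttributeError as e:
        return str(e)
    return None


if __name__ == "__main__":
    print(describe_failure(FakeVm()))              # _dom is None
    print(describe_failure(FakeVm(FakeDomain())))  # _dom is set
```

With `_dom` unset, the unguarded call fails on the attribute lookup itself, which is why the log shows an AttributeError rather than a storage or libvirt error.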